MCMC for parameters estimation by bayesian approach
International Nuclear Information System (INIS)
Ait Saadi, H.; Ykhlef, F.; Guessoum, A.
2011-01-01
This article discusses the parameter estimation for dynamic system by a Bayesian approach associated with Markov Chain Monte Carlo methods (MCMC). The MCMC methods are powerful for approximating complex integrals, simulating joint distributions, and the estimation of marginal posterior distributions, or posterior means. The MetropolisHastings algorithm has been widely used in Bayesian inference to approximate posterior densities. Calibrating the proposal distribution is one of the main issues of MCMC simulation in order to accelerate the convergence.
Directory of Open Access Journals (Sweden)
Allen Rodrigo
2006-01-01
Full Text Available Using the structured serial coalescent with Bayesian MCMC and serial samples, we estimate population size when some demes are not sampled or are hidden, ie ghost demes. It is found that even with the presence of a ghost deme, accurate inference was possible if the parameters are estimated with the true model. However with an incorrect model, estimates were biased and can be positively misleading. We extend these results to the case where there are sequences from the ghost at the last time sample. This case can arise in HIV patients, when some tissue samples and viral sequences only become available after death. When some sequences from the ghost deme are available at the last sampling time, estimation bias is reduced and accurate estimation of parameters associated with the ghost deme is possible despite sampling bias. Migration rates for this case are also shown to be good estimates when migration values are low.
FPGA Acceleration of the phylogenetic likelihood function for Bayesian MCMC inference methods
Directory of Open Access Journals (Sweden)
Bakos Jason D
2010-04-01
Full Text Available Abstract Background Likelihood (ML-based phylogenetic inference has become a popular method for estimating the evolutionary relationships among species based on genomic sequence data. This method is used in applications such as RAxML, GARLI, MrBayes, PAML, and PAUP. The Phylogenetic Likelihood Function (PLF is an important kernel computation for this method. The PLF consists of a loop with no conditional behavior or dependencies between iterations. As such it contains a high potential for exploiting parallelism using micro-architectural techniques. In this paper, we describe a technique for mapping the PLF and supporting logic onto a Field Programmable Gate Array (FPGA-based co-processor. By leveraging the FPGA's on-chip DSP modules and the high-bandwidth local memory attached to the FPGA, the resultant co-processor can accelerate ML-based methods and outperform state-of-the-art multi-core processors. Results We use the MrBayes 3 tool as a framework for designing our co-processor. For large datasets, we estimate that our accelerated MrBayes, if run on a current-generation FPGA, achieves a 10× speedup relative to software running on a state-of-the-art server-class microprocessor. The FPGA-based implementation achieves its performance by deeply pipelining the likelihood computations, performing multiple floating-point operations in parallel, and through a natural log approximation that is chosen specifically to leverage a deeply pipelined custom architecture. Conclusions Heterogeneous computing, which combines general-purpose processors with special-purpose co-processors such as FPGAs and GPUs, is a promising approach for high-performance phylogeny inference as shown by the growing body of literature in this field. FPGAs in particular are well-suited for this task because of their low power consumption as compared to many-core processors and Graphics Processor Units (GPUs 1.
The neighborhood MCMC sampler for learning Bayesian networks
Alyami, Salem A.; Azad, A. K. M.; Keith, Jonathan M.
2016-07-01
Getting stuck in local maxima is a problem that arises while learning Bayesian networks (BNs) structures. In this paper, we studied a recently proposed Markov chain Monte Carlo (MCMC) sampler, called the Neighbourhood sampler (NS), and examined how efficiently it can sample BNs when local maxima are present. We assume that a posterior distribution f(N,E|D) has been defined, where D represents data relevant to the inference, N and E are the sets of nodes and directed edges, respectively. We illustrate the new approach by sampling from such a distribution, and inferring BNs. The simulations conducted in this paper show that the new learning approach substantially avoids getting stuck in local modes of the distribution, and achieves a more rapid rate of convergence, compared to other common algorithms e.g. the MCMC Metropolis-Hastings sampler.
Bayesian conformational analysis of ring molecules through reversible jump MCMC
DEFF Research Database (Denmark)
Nolsøe, Kim; Kessler, Mathieu; Pérez, José
2005-01-01
In this paper we address the problem of classifying the conformations of mmembered rings using experimental observations obtained by crystal structure analysis. We formulate a model for the data generation mechanism that consists in a multidimensional mixture model. We perform inference for the p...... for the proportions and the components in a Bayesian framework, implementing an MCMC Reversible Jumps Algorithm to obtain samples of the posterior distributions. The method is illustrated on a simulated data set and on real data corresponding to cyclo-octane structures....
Complexity analysis of accelerated MCMC methods for Bayesian inversion
International Nuclear Information System (INIS)
Hoang, Viet Ha; Schwab, Christoph; Stuart, Andrew M
2013-01-01
The Bayesian approach to inverse problems, in which the posterior probability distribution on an unknown field is sampled for the purposes of computing posterior expectations of quantities of interest, is starting to become computationally feasible for partial differential equation (PDE) inverse problems. Balancing the sources of error arising from finite-dimensional approximation of the unknown field, the PDE forward solution map and the sampling of the probability space under the posterior distribution are essential for the design of efficient computational Bayesian methods for PDE inverse problems. We study Bayesian inversion for a model elliptic PDE with an unknown diffusion coefficient. We provide complexity analyses of several Markov chain Monte Carlo (MCMC) methods for the efficient numerical evaluation of expectations under the Bayesian posterior distribution, given data δ. Particular attention is given to bounds on the overall work required to achieve a prescribed error level ε. Specifically, we first bound the computational complexity of ‘plain’ MCMC, based on combining MCMC sampling with linear complexity multi-level solvers for elliptic PDE. Our (new) work versus accuracy bounds show that the complexity of this approach can be quite prohibitive. Two strategies for reducing the computational complexity are then proposed and analyzed: first, a sparse, parametric and deterministic generalized polynomial chaos (gpc) ‘surrogate’ representation of the forward response map of the PDE over the entire parameter space, and, second, a novel multi-level Markov chain Monte Carlo strategy which utilizes sampling from a multi-level discretization of the posterior and the forward PDE. For both of these strategies, we derive asymptotic bounds on work versus accuracy, and hence asymptotic bounds on the computational complexity of the algorithms. In particular, we provide sufficient conditions on the regularity of the unknown coefficients of the PDE and on the
Bayesian inference for Hawkes processes
DEFF Research Database (Denmark)
Rasmussen, Jakob Gulddahl
The Hawkes process is a practically and theoretically important class of point processes, but parameter-estimation for such a process can pose various problems. In this paper we explore and compare two approaches to Bayesian inference. The first approach is based on the so-called conditional...... intensity function, while the second approach is based on an underlying clustering and branching structure in the Hawkes process. For practical use, MCMC (Markov chain Monte Carlo) methods are employed. The two approaches are compared numerically using three examples of the Hawkes process....
Bayesian inference for Hawkes processes
DEFF Research Database (Denmark)
Rasmussen, Jakob Gulddahl
2013-01-01
The Hawkes process is a practically and theoretically important class of point processes, but parameter-estimation for such a process can pose various problems. In this paper we explore and compare two approaches to Bayesian inference. The first approach is based on the so-called conditional...... intensity function, while the second approach is based on an underlying clustering and branching structure in the Hawkes process. For practical use, MCMC (Markov chain Monte Carlo) methods are employed. The two approaches are compared numerically using three examples of the Hawkes process....
Bayesian statistical inference
Directory of Open Access Journals (Sweden)
Bruno De Finetti
2017-04-01
Full Text Available This work was translated into English and published in the volume: Bruno De Finetti, Induction and Probability, Biblioteca di Statistica, eds. P. Monari, D. Cocchi, Clueb, Bologna, 1993.Bayesian statistical Inference is one of the last fundamental philosophical papers in which we can find the essential De Finetti's approach to the statistical inference.
Bayesian inference in probabilistic risk assessment-The current state of the art
International Nuclear Information System (INIS)
Kelly, Dana L.; Smith, Curtis L.
2009-01-01
Markov chain Monte Carlo (MCMC) approaches to sampling directly from the joint posterior distribution of aleatory model parameters have led to tremendous advances in Bayesian inference capability in a wide variety of fields, including probabilistic risk analysis. The advent of freely available software coupled with inexpensive computing power has catalyzed this advance. This paper examines where the risk assessment community is with respect to implementing modern computational-based Bayesian approaches to inference. Through a series of examples in different topical areas, it introduces salient concepts and illustrates the practical application of Bayesian inference via MCMC sampling to a variety of important problems
Inference algorithms and learning theory for Bayesian sparse factor analysis
International Nuclear Information System (INIS)
Rattray, Magnus; Sharp, Kevin; Stegle, Oliver; Winn, John
2009-01-01
Bayesian sparse factor analysis has many applications; for example, it has been applied to the problem of inferring a sparse regulatory network from gene expression data. We describe a number of inference algorithms for Bayesian sparse factor analysis using a slab and spike mixture prior. These include well-established Markov chain Monte Carlo (MCMC) and variational Bayes (VB) algorithms as well as a novel hybrid of VB and Expectation Propagation (EP). For the case of a single latent factor we derive a theory for learning performance using the replica method. We compare the MCMC and VB/EP algorithm results with simulated data to the theoretical prediction. The results for MCMC agree closely with the theory as expected. Results for VB/EP are slightly sub-optimal but show that the new algorithm is effective for sparse inference. In large-scale problems MCMC is infeasible due to computational limitations and the VB/EP algorithm then provides a very useful computationally efficient alternative.
Inference algorithms and learning theory for Bayesian sparse factor analysis
Energy Technology Data Exchange (ETDEWEB)
Rattray, Magnus; Sharp, Kevin [School of Computer Science, University of Manchester, Manchester M13 9PL (United Kingdom); Stegle, Oliver [Max-Planck-Institute for Biological Cybernetics, Tuebingen (Germany); Winn, John, E-mail: magnus.rattray@manchester.ac.u [Microsoft Research Cambridge, Roger Needham Building, Cambridge, CB3 0FB (United Kingdom)
2009-12-01
Bayesian sparse factor analysis has many applications; for example, it has been applied to the problem of inferring a sparse regulatory network from gene expression data. We describe a number of inference algorithms for Bayesian sparse factor analysis using a slab and spike mixture prior. These include well-established Markov chain Monte Carlo (MCMC) and variational Bayes (VB) algorithms as well as a novel hybrid of VB and Expectation Propagation (EP). For the case of a single latent factor we derive a theory for learning performance using the replica method. We compare the MCMC and VB/EP algorithm results with simulated data to the theoretical prediction. The results for MCMC agree closely with the theory as expected. Results for VB/EP are slightly sub-optimal but show that the new algorithm is effective for sparse inference. In large-scale problems MCMC is infeasible due to computational limitations and the VB/EP algorithm then provides a very useful computationally efficient alternative.
Inference in hybrid Bayesian networks
DEFF Research Database (Denmark)
Lanseth, Helge; Nielsen, Thomas Dyhre; Rumí, Rafael
2009-01-01
Since the 1980s, Bayesian Networks (BNs) have become increasingly popular for building statistical models of complex systems. This is particularly true for boolean systems, where BNs often prove to be a more efficient modelling framework than traditional reliability-techniques (like fault trees...... decade's research on inference in hybrid Bayesian networks. The discussions are linked to an example model for estimating human reliability....
Bayesian inference with ecological applications
Link, William A
2009-01-01
This text is written to provide a mathematically sound but accessible and engaging introduction to Bayesian inference specifically for environmental scientists, ecologists and wildlife biologists. It emphasizes the power and usefulness of Bayesian methods in an ecological context. The advent of fast personal computers and easily available software has simplified the use of Bayesian and hierarchical models . One obstacle remains for ecologists and wildlife biologists, namely the near absence of Bayesian texts written specifically for them. The book includes many relevant examples, is supported by software and examples on a companion website and will become an essential grounding in this approach for students and research ecologists. Engagingly written text specifically designed to demystify a complex subject Examples drawn from ecology and wildlife research An essential grounding for graduate and research ecologists in the increasingly prevalent Bayesian approach to inference Companion website with analyt...
Interactive Instruction in Bayesian Inference
DEFF Research Database (Denmark)
Khan, Azam; Breslav, Simon; Hornbæk, Kasper
2018-01-01
An instructional approach is presented to improve human performance in solving Bayesian inference problems. Starting from the original text of the classic Mammography Problem, the textual expression is modified and visualizations are added according to Mayer’s principles of instruction. These pri......An instructional approach is presented to improve human performance in solving Bayesian inference problems. Starting from the original text of the classic Mammography Problem, the textual expression is modified and visualizations are added according to Mayer’s principles of instruction....... These principles concern coherence, personalization, signaling, segmenting, multimedia, spatial contiguity, and pretraining. Principles of self-explanation and interactivity are also applied. Four experiments on the Mammography Problem showed that these principles help participants answer the questions...... that an instructional approach to improving human performance in Bayesian inference is a promising direction....
Bailer-Jones, Coryn A. L.
2017-04-01
Preface; 1. Probability basics; 2. Estimation and uncertainty; 3. Statistical models and inference; 4. Linear models, least squares, and maximum likelihood; 5. Parameter estimation: single parameter; 6. Parameter estimation: multiple parameters; 7. Approximating distributions; 8. Monte Carlo methods for inference; 9. Parameter estimation: Markov chain Monte Carlo; 10. Frequentist hypothesis testing; 11. Model comparison; 12. Dealing with more complicated problems; References; Index.
Probability biases as Bayesian inference
Directory of Open Access Journals (Sweden)
Andre; C. R. Martins
2006-11-01
Full Text Available In this article, I will show how several observed biases in human probabilistic reasoning can be partially explained as good heuristics for making inferences in an environment where probabilities have uncertainties associated to them. Previous results show that the weight functions and the observed violations of coalescing and stochastic dominance can be understood from a Bayesian point of view. We will review those results and see that Bayesian methods should also be used as part of the explanation behind other known biases. That means that, although the observed errors are still errors under the be understood as adaptations to the solution of real life problems. Heuristics that allow fast evaluations and mimic a Bayesian inference would be an evolutionary advantage, since they would give us an efficient way of making decisions. %XX In that sense, it should be no surprise that humans reason with % probability as it has been observed.
Large-Scale Distributed Bayesian Matrix Factorization using Stochastic Gradient MCMC
Ahn, S.; Korattikara, A.; Liu, N.; Rajan, S.; Welling, M.
2015-01-01
Despite having various attractive qualities such as high prediction accuracy and the ability to quantify uncertainty and avoid ovrfitting, Bayesian Matrix Factorization has not been widely adopted because of the prohibitive cost of inference. In this paper, we propose a scalable distributed Bayesian
Inference in hybrid Bayesian networks
International Nuclear Information System (INIS)
Langseth, Helge; Nielsen, Thomas D.; Rumi, Rafael; Salmeron, Antonio
2009-01-01
Since the 1980s, Bayesian networks (BNs) have become increasingly popular for building statistical models of complex systems. This is particularly true for boolean systems, where BNs often prove to be a more efficient modelling framework than traditional reliability techniques (like fault trees and reliability block diagrams). However, limitations in the BNs' calculation engine have prevented BNs from becoming equally popular for domains containing mixtures of both discrete and continuous variables (the so-called hybrid domains). In this paper we focus on these difficulties, and summarize some of the last decade's research on inference in hybrid Bayesian networks. The discussions are linked to an example model for estimating human reliability.
Bayesian inference on proportional elections.
Directory of Open Access Journals (Sweden)
Gabriel Hideki Vatanabe Brunello
Full Text Available Polls for majoritarian voting systems usually show estimates of the percentage of votes for each candidate. However, proportional vote systems do not necessarily guarantee the candidate with the most percentage of votes will be elected. Thus, traditional methods used in majoritarian elections cannot be applied on proportional elections. In this context, the purpose of this paper was to perform a Bayesian inference on proportional elections considering the Brazilian system of seats distribution. More specifically, a methodology to answer the probability that a given party will have representation on the chamber of deputies was developed. Inferences were made on a Bayesian scenario using the Monte Carlo simulation technique, and the developed methodology was applied on data from the Brazilian elections for Members of the Legislative Assembly and Federal Chamber of Deputies in 2010. A performance rate was also presented to evaluate the efficiency of the methodology. Calculations and simulations were carried out using the free R statistical software.
Bayesian methods for hackers probabilistic programming and Bayesian inference
Davidson-Pilon, Cameron
2016-01-01
Bayesian methods of inference are deeply natural and extremely powerful. However, most discussions of Bayesian inference rely on intensely complex mathematical analyses and artificial examples, making it inaccessible to anyone without a strong mathematical background. Now, though, Cameron Davidson-Pilon introduces Bayesian inference from a computational perspective, bridging theory to practice–freeing you to get results using computing power. Bayesian Methods for Hackers illuminates Bayesian inference through probabilistic programming with the powerful PyMC language and the closely related Python tools NumPy, SciPy, and Matplotlib. Using this approach, you can reach effective solutions in small increments, without extensive mathematical intervention. Davidson-Pilon begins by introducing the concepts underlying Bayesian inference, comparing it with other techniques and guiding you through building and training your first Bayesian model. Next, he introduces PyMC through a series of detailed examples a...
DEFF Research Database (Denmark)
Picchini, Umberto; Forman, Julie Lyng
2016-01-01
a nonlinear stochastic differential equation model observed with correlated measurement errors and an application to protein folding modelling. An approximate Bayesian computation (ABC)-MCMC algorithm is suggested to allow inference for model parameters within reasonable time constraints. The ABC algorithm......In recent years, dynamical modelling has been provided with a range of breakthrough methods to perform exact Bayesian inference. However, it is often computationally unfeasible to apply exact statistical methodologies in the context of large data sets and complex models. This paper considers...... applications. A simulation study is conducted to compare our strategy with exact Bayesian inference, the latter resulting two orders of magnitude slower than ABC-MCMC for the considered set-up. Finally, the ABC algorithm is applied to a large size protein data. The suggested methodology is fairly general...
Siripatana, Adil
2017-06-08
Bayesian estimation/inversion is commonly used to quantify and reduce modeling uncertainties in coastal ocean model, especially in the framework of parameter estimation. Based on Bayes rule, the posterior probability distribution function (pdf) of the estimated quantities is obtained conditioned on available data. It can be computed either directly, using a Markov chain Monte Carlo (MCMC) approach, or by sequentially processing the data following a data assimilation approach, which is heavily exploited in large dimensional state estimation problems. The advantage of data assimilation schemes over MCMC-type methods arises from the ability to algorithmically accommodate a large number of uncertain quantities without significant increase in the computational requirements. However, only approximate estimates are generally obtained by this approach due to the restricted Gaussian prior and noise assumptions that are generally imposed in these methods. This contribution aims at evaluating the effectiveness of utilizing an ensemble Kalman-based data assimilation method for parameter estimation of a coastal ocean model against an MCMC polynomial chaos (PC)-based scheme. We focus on quantifying the uncertainties of a coastal ocean ADvanced CIRCulation (ADCIRC) model with respect to the Manning’s n coefficients. Based on a realistic framework of observation system simulation experiments (OSSEs), we apply an ensemble Kalman filter and the MCMC method employing a surrogate of ADCIRC constructed by a non-intrusive PC expansion for evaluating the likelihood, and test both approaches under identical scenarios. We study the sensitivity of the estimated posteriors with respect to the parameters of the inference methods, including ensemble size, inflation factor, and PC order. A full analysis of both methods, in the context of coastal ocean model, suggests that an ensemble Kalman filter with appropriate ensemble size and well-tuned inflation provides reliable mean estimates and
Siripatana, Adil; Mayo, Talea; Sraj, Ihab; Knio, Omar; Dawson, Clint; Le Maitre, Olivier; Hoteit, Ibrahim
2017-08-01
Bayesian estimation/inversion is commonly used to quantify and reduce modeling uncertainties in coastal ocean model, especially in the framework of parameter estimation. Based on Bayes rule, the posterior probability distribution function (pdf) of the estimated quantities is obtained conditioned on available data. It can be computed either directly, using a Markov chain Monte Carlo (MCMC) approach, or by sequentially processing the data following a data assimilation approach, which is heavily exploited in large dimensional state estimation problems. The advantage of data assimilation schemes over MCMC-type methods arises from the ability to algorithmically accommodate a large number of uncertain quantities without significant increase in the computational requirements. However, only approximate estimates are generally obtained by this approach due to the restricted Gaussian prior and noise assumptions that are generally imposed in these methods. This contribution aims at evaluating the effectiveness of utilizing an ensemble Kalman-based data assimilation method for parameter estimation of a coastal ocean model against an MCMC polynomial chaos (PC)-based scheme. We focus on quantifying the uncertainties of a coastal ocean ADvanced CIRCulation (ADCIRC) model with respect to the Manning's n coefficients. Based on a realistic framework of observation system simulation experiments (OSSEs), we apply an ensemble Kalman filter and the MCMC method employing a surrogate of ADCIRC constructed by a non-intrusive PC expansion for evaluating the likelihood, and test both approaches under identical scenarios. We study the sensitivity of the estimated posteriors with respect to the parameters of the inference methods, including ensemble size, inflation factor, and PC order. A full analysis of both methods, in the context of coastal ocean model, suggests that an ensemble Kalman filter with appropriate ensemble size and well-tuned inflation provides reliable mean estimates and
DEFF Research Database (Denmark)
Møller, Jesper
2010-01-01
Chapter 9: This contribution concerns statistical inference for parametric models used in stochastic geometry and based on quick and simple simulation free procedures as well as more comprehensive methods based on a maximum likelihood or Bayesian approach combined with markov chain Monte Carlo...... (MCMC) techniques. Due to space limitations the focus is on spatial point processes....
Nonparametric Bayesian inference in biostatistics
Müller, Peter
2015-01-01
As chapters in this book demonstrate, BNP has important uses in clinical sciences and inference for issues like unknown partitions in genomics. Nonparametric Bayesian approaches (BNP) play an ever expanding role in biostatistical inference from use in proteomics to clinical trials. Many research problems involve an abundance of data and require flexible and complex probability models beyond the traditional parametric approaches. As this book's expert contributors show, BNP approaches can be the answer. Survival Analysis, in particular survival regression, has traditionally used BNP, but BNP's potential is now very broad. This applies to important tasks like arrangement of patients into clinically meaningful subpopulations and segmenting the genome into functionally distinct regions. This book is designed to both review and introduce application areas for BNP. While existing books provide theoretical foundations, this book connects theory to practice through engaging examples and research questions. Chapters c...
A Bayesian MCMC method for point process models with intractable normalising constants
DEFF Research Database (Denmark)
Berthelsen, Kasper Klitgaard; Møller, Jesper
2004-01-01
to simulate from the "unknown distribution", perfect simulation algorithms become useful. We illustrate the method in cases whre the likelihood is given by a Markov point process model. Particularly, we consider semi-parametric Bayesian inference in connection to both inhomogeneous Markov point process models...... and pairwise interaction point processes....
Variations on Bayesian Prediction and Inference
2016-05-09
inference 2.2.1 Background There are a number of statistical inference problems that are not generally formulated via a full probability model...problem of inference about an unknown parameter, the Bayesian approach requires a full probability 1. REPORT DATE (DD-MM-YYYY) 4. TITLE AND...the problem of inference about an unknown parameter, the Bayesian approach requires a full probability model/likelihood which can be an obstacle
Compiling Relational Bayesian Networks for Exact Inference
DEFF Research Database (Denmark)
Jaeger, Manfred; Darwiche, Adnan; Chavira, Mark
2006-01-01
We describe in this paper a system for exact inference with relational Bayesian networks as defined in the publicly available PRIMULA tool. The system is based on compiling propositional instances of relational Bayesian networks into arithmetic circuits and then performing online inference...
Compiling Relational Bayesian Networks for Exact Inference
DEFF Research Database (Denmark)
Jaeger, Manfred; Chavira, Mark; Darwiche, Adnan
2004-01-01
We describe a system for exact inference with relational Bayesian networks as defined in the publicly available \\primula\\ tool. The system is based on compiling propositional instances of relational Bayesian networks into arithmetic circuits and then performing online inference by evaluating...
Fast Bayesian Inference in Dirichlet Process Mixture Models.
Wang, Lianming; Dunson, David B
2011-01-01
There has been increasing interest in applying Bayesian nonparametric methods in large samples and high dimensions. As Markov chain Monte Carlo (MCMC) algorithms are often infeasible, there is a pressing need for much faster algorithms. This article proposes a fast approach for inference in Dirichlet process mixture (DPM) models. Viewing the partitioning of subjects into clusters as a model selection problem, we propose a sequential greedy search algorithm for selecting the partition. Then, when conjugate priors are chosen, the resulting posterior conditionally on the selected partition is available in closed form. This approach allows testing of parametric models versus nonparametric alternatives based on Bayes factors. We evaluate the approach using simulation studies and compare it with four other fast nonparametric methods in the literature. We apply the proposed approach to three datasets including one from a large epidemiologic study. Matlab codes for the simulation and data analyses using the proposed approach are available online in the supplemental materials.
Bayesian Inference Methods for Sparse Channel Estimation
DEFF Research Database (Denmark)
Pedersen, Niels Lovmand
2013-01-01
This thesis deals with sparse Bayesian learning (SBL) with application to radio channel estimation. As opposed to the classical approach for sparse signal representation, we focus on the problem of inferring complex signals. Our investigations within SBL constitute the basis for the development...... of Bayesian inference algorithms for sparse channel estimation. Sparse inference methods aim at finding the sparse representation of a signal given in some overcomplete dictionary of basis vectors. Within this context, one of our main contributions to the field of SBL is a hierarchical representation...... analysis of the complex prior representation, where we show that the ability to induce sparse estimates of a given prior heavily depends on the inference method used and, interestingly, whether real or complex variables are inferred. We also show that the Bayesian estimators derived from the proposed...
Bayesian inference on EMRI signals using low frequency approximations
International Nuclear Information System (INIS)
Ali, Asad; Meyer, Renate; Christensen, Nelson; Röver, Christian
2012-01-01
Extreme mass ratio inspirals (EMRIs) are thought to be one of the most exciting gravitational wave sources to be detected with LISA. Due to their complicated nature and weak amplitudes the detection and parameter estimation of such sources is a challenging task. In this paper we present a statistical methodology based on Bayesian inference in which the estimation of parameters is carried out by advanced Markov chain Monte Carlo (MCMC) algorithms such as parallel tempering MCMC. We analysed high and medium mass EMRI systems that fall well inside the low frequency range of LISA. In the context of the Mock LISA Data Challenges, our investigation and results are also the first instance in which a fully Markovian algorithm is applied for EMRI searches. Results show that our algorithm worked well in recovering EMRI signals from different (simulated) LISA data sets having single and multiple EMRI sources and holds great promise for posterior computation under more realistic conditions. The search and estimation methods presented in this paper are general in their nature, and can be applied in any other scenario such as AdLIGO, AdVIRGO and Einstein Telescope with their respective response functions. (paper)
Bayesian Inference on Gravitational Waves
Directory of Open Access Journals (Sweden)
Asad Ali
2015-12-01
Full Text Available The Bayesian approach is increasingly becoming popular among the astrophysics data analysis communities. However, the Pakistan statistics communities are unaware of this fertile interaction between the two disciplines. Bayesian methods have been in use to address astronomical problems since the very birth of the Bayes probability in eighteenth century. Today the Bayesian methods for the detection and parameter estimation of gravitational waves have solid theoretical grounds with a strong promise for the realistic applications. This article aims to introduce the Pakistan statistics communities to the applications of Bayesian Monte Carlo methods in the analysis of gravitational wave data with an overview of the Bayesian signal detection and estimation methods and demonstration by a couple of simplified examples.
Bayesian inference for hybrid discrete-continuous stochastic kinetic models
International Nuclear Information System (INIS)
Sherlock, Chris; Golightly, Andrew; Gillespie, Colin S
2014-01-01
We consider the problem of efficiently performing simulation and inference for stochastic kinetic models. Whilst it is possible to work directly with the resulting Markov jump process (MJP), computational cost can be prohibitive for networks of realistic size and complexity. In this paper, we consider an inference scheme based on a novel hybrid simulator that classifies reactions as either ‘fast’ or ‘slow’ with fast reactions evolving as a continuous Markov process whilst the remaining slow reaction occurrences are modelled through a MJP with time-dependent hazards. A linear noise approximation (LNA) of fast reaction dynamics is employed and slow reaction events are captured by exploiting the ability to solve the stochastic differential equation driving the LNA. This simulation procedure is used as a proposal mechanism inside a particle MCMC scheme, thus allowing Bayesian inference for the model parameters. We apply the scheme to a simple application and compare the output with an existing hybrid approach and also a scheme for performing inference for the underlying discrete stochastic model. (paper)
Bayesian inference for Markov jump processes with informative observations.
Golightly, Andrew; Wilkinson, Darren J
2015-04-01
In this paper we consider the problem of parameter inference for Markov jump process (MJP) representations of stochastic kinetic models. Since transition probabilities are intractable for most processes of interest yet forward simulation is straightforward, Bayesian inference typically proceeds through computationally intensive methods such as (particle) MCMC. Such methods ostensibly require the ability to simulate trajectories from the conditioned jump process. When observations are highly informative, use of the forward simulator is likely to be inefficient and may even preclude an exact (simulation based) analysis. We therefore propose three methods for improving the efficiency of simulating conditioned jump processes. A conditioned hazard is derived based on an approximation to the jump process, and used to generate end-point conditioned trajectories for use inside an importance sampling algorithm. We also adapt a recently proposed sequential Monte Carlo scheme to our problem. Essentially, trajectories are reweighted at a set of intermediate time points, with more weight assigned to trajectories that are consistent with the next observation. We consider two implementations of this approach, based on two continuous approximations of the MJP. We compare these constructs for a simple tractable jump process before using them to perform inference for a Lotka-Volterra system. The best performing construct is used to infer the parameters governing a simple model of motility regulation in Bacillus subtilis.
Bayesian Inference on the Memory Parameter for Gamma-Modulated Regression Models
Directory of Open Access Journals (Sweden)
Plinio Andrade
2015-09-01
Full Text Available In this work, we propose a Bayesian methodology to make inferences for the memory parameter and other characteristics under non-standard assumptions for a class of stochastic processes. This class generalizes the Gamma-modulated process, with trajectories that exhibit long memory behavior, as well as decreasing variability as time increases. Different values of the memory parameter influence the speed of this decrease, making this heteroscedastic model very flexible. Its properties are used to implement an approximate Bayesian computation and MCMC scheme to obtain posterior estimates. We test and validate our method through simulations and real data from the big earthquake that occurred in 2010 in Chile.
An Intuitive Dashboard for Bayesian Network Inference
International Nuclear Information System (INIS)
Reddy, Vikas; Farr, Anna Charisse; Wu, Paul; Mengersen, Kerrie; Yarlagadda, Prasad K D V
2014-01-01
Current Bayesian network software packages provide good graphical interface for users who design and develop Bayesian networks for various applications. However, the intended end-users of these networks may not necessarily find such an interface appealing and at times it could be overwhelming, particularly when the number of nodes in the network is large. To circumvent this problem, this paper presents an intuitive dashboard, which provides an additional layer of abstraction, enabling the end-users to easily perform inferences over the Bayesian networks. Unlike most software packages, which display the nodes and arcs of the network, the developed tool organises the nodes based on the cause-and-effect relationship, making the user-interaction more intuitive and friendly. In addition to performing various types of inferences, the users can conveniently use the tool to verify the behaviour of the developed Bayesian network. The tool has been developed using QT and SMILE libraries in C++
An Intuitive Dashboard for Bayesian Network Inference
Reddy, Vikas; Charisse Farr, Anna; Wu, Paul; Mengersen, Kerrie; Yarlagadda, Prasad K. D. V.
2014-03-01
Current Bayesian network software packages provide good graphical interface for users who design and develop Bayesian networks for various applications. However, the intended end-users of these networks may not necessarily find such an interface appealing and at times it could be overwhelming, particularly when the number of nodes in the network is large. To circumvent this problem, this paper presents an intuitive dashboard, which provides an additional layer of abstraction, enabling the end-users to easily perform inferences over the Bayesian networks. Unlike most software packages, which display the nodes and arcs of the network, the developed tool organises the nodes based on the cause-and-effect relationship, making the user-interaction more intuitive and friendly. In addition to performing various types of inferences, the users can conveniently use the tool to verify the behaviour of the developed Bayesian network. The tool has been developed using QT and SMILE libraries in C++.
Bayesian inference on genetic merit under uncertain paternity
Directory of Open Access Journals (Sweden)
Tempelman Robert J
2003-09-01
Full Text Available Abstract A hierarchical animal model was developed for inference on genetic merit of livestock with uncertain paternity. Fully conditional posterior distributions for fixed and genetic effects, variance components, sire assignments and their probabilities are derived to facilitate a Bayesian inference strategy using MCMC methods. We compared this model to a model based on the Henderson average numerator relationship (ANRM in a simulation study with 10 replicated datasets generated for each of two traits. Trait 1 had a medium heritability (h2 for each of direct and maternal genetic effects whereas Trait 2 had a high h2 attributable only to direct effects. The average posterior probabilities inferred on the true sire were between 1 and 10% larger than the corresponding priors (the inverse of the number of candidate sires in a mating pasture for Trait 1 and between 4 and 13% larger than the corresponding priors for Trait 2. The predicted additive and maternal genetic effects were very similar using both models; however, model choice criteria (Pseudo Bayes Factor and Deviance Information Criterion decisively favored the proposed hierarchical model over the ANRM model.
A Bayesian Network Schema for Lessening Database Inference
National Research Council Canada - National Science Library
Chang, LiWu; Moskowitz, Ira S
2001-01-01
.... The authors introduce a formal schema for database inference analysis, based upon a Bayesian network structure, which identifies critical parameters involved in the inference problem and represents...
Directory of Open Access Journals (Sweden)
Hea-Jung Kim
2017-06-01
Full Text Available This paper develops Bayesian inference in reliability of a class of scale mixtures of log-normal failure time (SMLNFT models with stochastic (or uncertain constraint in their reliability measures. The class is comprehensive and includes existing failure time (FT models (such as log-normal, log-Cauchy, and log-logistic FT models as well as new models that are robust in terms of heavy-tailed FT observations. Since classical frequency approaches to reliability analysis based on the SMLNFT model with stochastic constraint are intractable, the Bayesian method is pursued utilizing a Markov chain Monte Carlo (MCMC sampling based approach. This paper introduces a two-stage maximum entropy (MaxEnt prior, which elicits a priori uncertain constraint and develops Bayesian hierarchical SMLNFT model by using the prior. The paper also proposes an MCMC method for Bayesian inference in the SMLNFT model reliability and calls attention to properties of the MaxEnt prior that are useful for method development. Finally, two data sets are used to illustrate how the proposed methodology works.
Polynomial Chaos Surrogates for Bayesian Inference
Le Maitre, Olivier
2016-01-06
The Bayesian inference is a popular probabilistic method to solve inverse problems, such as the identification of field parameter in a PDE model. The inference rely on the Bayes rule to update the prior density of the sought field, from observations, and derive its posterior distribution. In most cases the posterior distribution has no explicit form and has to be sampled, for instance using a Markov-Chain Monte Carlo method. In practice the prior field parameter is decomposed and truncated (e.g. by means of Karhunen- Lo´eve decomposition) to recast the inference problem into the inference of a finite number of coordinates. Although proved effective in many situations, the Bayesian inference as sketched above faces several difficulties requiring improvements. First, sampling the posterior can be a extremely costly task as it requires multiple resolutions of the PDE model for different values of the field parameter. Second, when the observations are not very much informative, the inferred parameter field can highly depends on its prior which can be somehow arbitrary. These issues have motivated the introduction of reduced modeling or surrogates for the (approximate) determination of the parametrized PDE solution and hyperparameters in the description of the prior field. Our contribution focuses on recent developments in these two directions: the acceleration of the posterior sampling by means of Polynomial Chaos expansions and the efficient treatment of parametrized covariance functions for the prior field. We also discuss the possibility of making such approach adaptive to further improve its efficiency.
The NIFTY way of Bayesian signal inference
International Nuclear Information System (INIS)
Selig, Marco
2014-01-01
We introduce NIFTY, 'Numerical Information Field Theory', a software package for the development of Bayesian signal inference algorithms that operate independently from any underlying spatial grid and its resolution. A large number of Bayesian and Maximum Entropy methods for 1D signal reconstruction, 2D imaging, as well as 3D tomography, appear formally similar, but one often finds individualized implementations that are neither flexible nor easily transferable. Signal inference in the framework of NIFTY can be done in an abstract way, such that algorithms, prototyped in 1D, can be applied to real world problems in higher-dimensional settings. NIFTY as a versatile library is applicable and already has been applied in 1D, 2D, 3D and spherical settings. A recent application is the D 3 PO algorithm targeting the non-trivial task of denoising, deconvolving, and decomposing photon observations in high energy astronomy
The NIFTy way of Bayesian signal inference
Selig, Marco
2014-12-01
We introduce NIFTy, "Numerical Information Field Theory", a software package for the development of Bayesian signal inference algorithms that operate independently from any underlying spatial grid and its resolution. A large number of Bayesian and Maximum Entropy methods for 1D signal reconstruction, 2D imaging, as well as 3D tomography, appear formally similar, but one often finds individualized implementations that are neither flexible nor easily transferable. Signal inference in the framework of NIFTy can be done in an abstract way, such that algorithms, prototyped in 1D, can be applied to real world problems in higher-dimensional settings. NIFTy as a versatile library is applicable and already has been applied in 1D, 2D, 3D and spherical settings. A recent application is the D3PO algorithm targeting the non-trivial task of denoising, deconvolving, and decomposing photon observations in high energy astronomy.
Bayesianism and inference to the best explanation
Directory of Open Access Journals (Sweden)
Valeriano IRANZO
2008-01-01
Full Text Available Bayesianism and Inference to the best explanation (IBE are two different models of inference. Recently there has been some debate about the possibility of “bayesianizing” IBE. Firstly I explore several alternatives to include explanatory considerations in Bayes’s Theorem. Then I distinguish two different interpretations of prior probabilities: “IBE-Bayesianism” (IBE-Bay and “frequentist-Bayesianism” (Freq-Bay. After detailing the content of the latter, I propose a rule for assessing the priors. I also argue that Freq-Bay: (i endorses a role for explanatory value in the assessment of scientific hypotheses; (ii avoids a purely subjectivist reading of prior probabilities; and (iii fits better than IBE-Bayesianism with two basic facts about science, i.e., the prominent role played by empirical testing and the existence of many scientific theories in the past that failed to fulfil their promises and were subsequently abandoned.
Optimal inference with suboptimal models: Addiction and active Bayesian inference
Schwartenbeck, Philipp; FitzGerald, Thomas H.B.; Mathys, Christoph; Dolan, Ray; Wurst, Friedrich; Kronbichler, Martin; Friston, Karl
2015-01-01
When casting behaviour as active (Bayesian) inference, optimal inference is defined with respect to an agent’s beliefs – based on its generative model of the world. This contrasts with normative accounts of choice behaviour, in which optimal actions are considered in relation to the true structure of the environment – as opposed to the agent’s beliefs about worldly states (or the task). This distinction shifts an understanding of suboptimal or pathological behaviour away from aberrant inference as such, to understanding the prior beliefs of a subject that cause them to behave less ‘optimally’ than our prior beliefs suggest they should behave. Put simply, suboptimal or pathological behaviour does not speak against understanding behaviour in terms of (Bayes optimal) inference, but rather calls for a more refined understanding of the subject’s generative model upon which their (optimal) Bayesian inference is based. Here, we discuss this fundamental distinction and its implications for understanding optimality, bounded rationality and pathological (choice) behaviour. We illustrate our argument using addictive choice behaviour in a recently described ‘limited offer’ task. Our simulations of pathological choices and addictive behaviour also generate some clear hypotheses, which we hope to pursue in ongoing empirical work. PMID:25561321
Using Alien Coins to Test Whether Simple Inference Is Bayesian
Cassey, Peter; Hawkins, Guy E.; Donkin, Chris; Brown, Scott D.
2016-01-01
Reasoning and inference are well-studied aspects of basic cognition that have been explained as statistically optimal Bayesian inference. Using a simplified experimental design, we conducted quantitative comparisons between Bayesian inference and human inference at the level of individuals. In 3 experiments, with more than 13,000 participants, we…
Constrained bayesian inference of project performance models
Sunmola, Funlade
2013-01-01
Project performance models play an important role in the management of project success. When used for monitoring projects, they can offer predictive ability such as indications of possible delivery problems. Approaches for monitoring project performance relies on available project information including restrictions imposed on the project, particularly the constraints of cost, quality, scope and time. We study in this paper a Bayesian inference methodology for project performance modelling in ...
Bayesian structural inference for hidden processes
Strelioff, Christopher C.; Crutchfield, James P.
2014-04-01
We introduce a Bayesian approach to discovering patterns in structurally complex processes. The proposed method of Bayesian structural inference (BSI) relies on a set of candidate unifilar hidden Markov model (uHMM) topologies for inference of process structure from a data series. We employ a recently developed exact enumeration of topological ɛ-machines. (A sequel then removes the topological restriction.) This subset of the uHMM topologies has the added benefit that inferred models are guaranteed to be ɛ-machines, irrespective of estimated transition probabilities. Properties of ɛ-machines and uHMMs allow for the derivation of analytic expressions for estimating transition probabilities, inferring start states, and comparing the posterior probability of candidate model topologies, despite process internal structure being only indirectly present in data. We demonstrate BSI's effectiveness in estimating a process's randomness, as reflected by the Shannon entropy rate, and its structure, as quantified by the statistical complexity. We also compare using the posterior distribution over candidate models and the single, maximum a posteriori model for point estimation and show that the former more accurately reflects uncertainty in estimated values. We apply BSI to in-class examples of finite- and infinite-order Markov processes, as well to an out-of-class, infinite-state hidden process.
Bayesian Estimation and Inference using Stochastic Hardware
Directory of Open Access Journals (Sweden)
Chetan Singh Thakur
2016-03-01
Full Text Available In this paper, we present the implementation of two types of Bayesian inference problems to demonstrate the potential of building probabilistic algorithms in hardware using single set of building blocks with the ability to perform these computations in real time. The first implementation, referred to as the BEAST (Bayesian Estimation and Stochastic Tracker, demonstrates a simple problem where an observer uses an underlying Hidden Markov Model (HMM to track a target in one dimension. In this implementation, sensors make noisy observations of the target position at discrete time steps. The tracker learns the transition model for target movement, and the observation model for the noisy sensors, and uses these to estimate the target position by solving the Bayesian recursive equation online. We show the tracking performance of the system and demonstrate how it can learn the observation model, the transition model, and the external distractor (noise probability interfering with the observations. In the second implementation, referred to as the Bayesian INference in DAG (BIND, we show how inference can be performed in a Directed Acyclic Graph (DAG using stochastic circuits. We show how these building blocks can be easily implemented using simple digital logic gates. An advantage of the stochastic electronic implementation is that it is robust to certain types of noise, which may become an issue in integrated circuit (IC technology with feature sizes in the order of tens of nanometers due to their low noise margin, the effect of high-energy cosmic rays and the low supply voltage. In our framework, the flipping of random individual bits would not affect the system performance because information is encoded in a bit stream.
Bayesian Estimation and Inference Using Stochastic Electronics.
Thakur, Chetan Singh; Afshar, Saeed; Wang, Runchun M; Hamilton, Tara J; Tapson, Jonathan; van Schaik, André
2016-01-01
In this paper, we present the implementation of two types of Bayesian inference problems to demonstrate the potential of building probabilistic algorithms in hardware using single set of building blocks with the ability to perform these computations in real time. The first implementation, referred to as the BEAST (Bayesian Estimation and Stochastic Tracker), demonstrates a simple problem where an observer uses an underlying Hidden Markov Model (HMM) to track a target in one dimension. In this implementation, sensors make noisy observations of the target position at discrete time steps. The tracker learns the transition model for target movement, and the observation model for the noisy sensors, and uses these to estimate the target position by solving the Bayesian recursive equation online. We show the tracking performance of the system and demonstrate how it can learn the observation model, the transition model, and the external distractor (noise) probability interfering with the observations. In the second implementation, referred to as the Bayesian INference in DAG (BIND), we show how inference can be performed in a Directed Acyclic Graph (DAG) using stochastic circuits. We show how these building blocks can be easily implemented using simple digital logic gates. An advantage of the stochastic electronic implementation is that it is robust to certain types of noise, which may become an issue in integrated circuit (IC) technology with feature sizes in the order of tens of nanometers due to their low noise margin, the effect of high-energy cosmic rays and the low supply voltage. In our framework, the flipping of random individual bits would not affect the system performance because information is encoded in a bit stream.
Heuristics as Bayesian inference under extreme priors.
Parpart, Paula; Jones, Matt; Love, Bradley C
2018-05-01
Simple heuristics are often regarded as tractable decision strategies because they ignore a great deal of information in the input data. One puzzle is why heuristics can outperform full-information models, such as linear regression, which make full use of the available information. These "less-is-more" effects, in which a relatively simpler model outperforms a more complex model, are prevalent throughout cognitive science, and are frequently argued to demonstrate an inherent advantage of simplifying computation or ignoring information. In contrast, we show at the computational level (where algorithmic restrictions are set aside) that it is never optimal to discard information. Through a formal Bayesian analysis, we prove that popular heuristics, such as tallying and take-the-best, are formally equivalent to Bayesian inference under the limit of infinitely strong priors. Varying the strength of the prior yields a continuum of Bayesian models with the heuristics at one end and ordinary regression at the other. Critically, intermediate models perform better across all our simulations, suggesting that down-weighting information with the appropriate prior is preferable to entirely ignoring it. Rather than because of their simplicity, our analyses suggest heuristics perform well because they implement strong priors that approximate the actual structure of the environment. We end by considering how new heuristics could be derived by infinitely strengthening the priors of other Bayesian models. These formal results have implications for work in psychology, machine learning and economics. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Pasha, Imad; Kriek, Mariska; Johnson, Benjamin; Conroy, Charlie
2018-01-01
Using a novel, MCMC-driven inference framework, we have modeled the stellar and dust emission of 32 composite spectral energy distributions (SEDs), which span from the near-ultraviolet (NUV) to far infrared (FIR). The composite SEDs were originally constructed in a previous work from the photometric catalogs of the NEWFIRM Medium-Band Survey, in which SEDs of individual galaxies at 0.5 MIPS 24 μm was added for each SED type, and in this work, PACS 100 μm, PACS160 μm, SPIRE 25 μm, and SPIRE 350 μm photometry have been added to extend the range of the composite SEDs into the FIR. We fit the composite SEDs with the Prospector code, which utilizes an MCMC sampling to explore the parameter space for models created by the Flexible Stellar Population Synthesis (FSPS) code, in order to investigate how specific star formation rate (sSFR), dust temperature, and other galaxy properties vary with SED type.This work is also being used to better constrain the SPS models within FSPS.
Bayesian Inference of a Multivariate Regression Model
Directory of Open Access Journals (Sweden)
Marick S. Sinay
2014-01-01
Full Text Available We explore Bayesian inference of a multivariate linear regression model with use of a flexible prior for the covariance structure. The commonly adopted Bayesian setup involves the conjugate prior, multivariate normal distribution for the regression coefficients and inverse Wishart specification for the covariance matrix. Here we depart from this approach and propose a novel Bayesian estimator for the covariance. A multivariate normal prior for the unique elements of the matrix logarithm of the covariance matrix is considered. Such structure allows for a richer class of prior distributions for the covariance, with respect to strength of beliefs in prior location hyperparameters, as well as the added ability, to model potential correlation amongst the covariance structure. The posterior moments of all relevant parameters of interest are calculated based upon numerical results via a Markov chain Monte Carlo procedure. The Metropolis-Hastings-within-Gibbs algorithm is invoked to account for the construction of a proposal density that closely matches the shape of the target posterior distribution. As an application of the proposed technique, we investigate a multiple regression based upon the 1980 High School and Beyond Survey.
Efficient Bayesian inference for ARFIMA processes
Graves, T.; Gramacy, R. B.; Franzke, C. L. E.; Watkins, N. W.
2015-03-01
Many geophysical quantities, like atmospheric temperature, water levels in rivers, and wind speeds, have shown evidence of long-range dependence (LRD). LRD means that these quantities experience non-trivial temporal memory, which potentially enhances their predictability, but also hampers the detection of externally forced trends. Thus, it is important to reliably identify whether or not a system exhibits LRD. In this paper we present a modern and systematic approach to the inference of LRD. Rather than Mandelbrot's fractional Gaussian noise, we use the more flexible Autoregressive Fractional Integrated Moving Average (ARFIMA) model which is widely used in time series analysis, and of increasing interest in climate science. Unlike most previous work on the inference of LRD, which is frequentist in nature, we provide a systematic treatment of Bayesian inference. In particular, we provide a new approximate likelihood for efficient parameter inference, and show how nuisance parameters (e.g. short memory effects) can be integrated over in order to focus on long memory parameters, and hypothesis testing more directly. We illustrate our new methodology on the Nile water level data, with favorable comparison to the standard estimators.
Integrating distributed Bayesian inference and reinforcement learning for sensor management
Grappiolo, C.; Whiteson, S.; Pavlin, G.; Bakker, B.
2009-01-01
This paper introduces a sensor management approach that integrates distributed Bayesian inference (DBI) and reinforcement learning (RL). DBI is implemented using distributed perception networks (DPNs), a multiagent approach to performing efficient inference, while RL is used to automatically
A Fast Iterative Bayesian Inference Algorithm for Sparse Channel Estimation
DEFF Research Database (Denmark)
Pedersen, Niels Lovmand; Manchón, Carles Navarro; Fleury, Bernard Henri
2013-01-01
representation of the Bessel K probability density function; a highly efficient, fast iterative Bayesian inference method is then applied to the proposed model. The resulting estimator outperforms other state-of-the-art Bayesian and non-Bayesian estimators, either by yielding lower mean squared estimation error...
Comparative Study of Inference Methods for Bayesian Nonnegative Matrix Factorisation
DEFF Research Database (Denmark)
Brouwer, Thomas; Frellsen, Jes; Liò, Pietro
2017-01-01
In this paper, we study the trade-offs of different inference approaches for Bayesian matrix factorisation methods, which are commonly used for predicting missing values, and for finding patterns in the data. In particular, we consider Bayesian nonnegative variants of matrix factorisation and tri......-factorisation, and compare non-probabilistic inference, Gibbs sampling, variational Bayesian inference, and a maximum-a-posteriori approach. The variational approach is new for the Bayesian nonnegative models. We compare their convergence, and robustness to noise and sparsity of the data, on both synthetic and real...
Towards Bayesian Inference of the Fast-Ion Distribution Function
DEFF Research Database (Denmark)
Stagner, L.; Heidbrink, W.W.; Salewski, Mirko
2012-01-01
sensitivity of the measurements are incorporated into Bayesian likelihood probabilities, while prior probabilities enforce physical constraints. As an initial step, this poster uses Bayesian statistics to infer the DIII-D electron density profile from multiple diagnostic measurements. Likelihood functions....... However, when theory and experiment disagree (for one or more diagnostics), it is unclear how to proceed. Bayesian statistics provides a framework to infer the DF, quantify errors, and reconcile discrepant diagnostic measurements. Diagnostic errors and ``weight functions" that describe the phase space...
Inference in Complex Systems Using Multi-Phase MCMC Sampling With Gradient Matching Burn-in
Lazarus, Alan; Husmeier, Dirk; Papamarkou, Theodore
2017-01-01
We propose a novel method for parameter inference that builds on the current research in gradient matching surrogate likelihood spaces. Adopting a three phase technique, we demonstrate that it is possible to obtain parameter estimates of limited bias whilst still adopting the paradigm of the computationally cheap surrogate approximation.
Numerical approximations for speeding up mcmc inference in the infinite relational model
DEFF Research Database (Denmark)
Schmidt, Mikkel Nørgaard; Albers, Kristoffer Jon
2015-01-01
The infinite relational model (IRM) is a powerful model for discovering clusters in complex networks; however, the computational speed of Markov chain Monte Carlo inference in the model can be a limiting factor when analyzing large networks. We investigate how using numerical approximations...
Arnst, M.; Abello Álvarez, B.; Ponthot, J.-P.; Boman, R.
2017-11-01
This paper is concerned with the characterization and the propagation of errors associated with data limitations in polynomial-chaos-based stochastic methods for uncertainty quantification. Such an issue can arise in uncertainty quantification when only a limited amount of data is available. When the available information does not suffice to accurately determine the probability distributions that must be assigned to the uncertain variables, the Bayesian method for assigning these probability distributions becomes attractive because it allows the stochastic model to account explicitly for insufficiency of the available information. In previous work, such applications of the Bayesian method had already been implemented by using the Metropolis-Hastings and Gibbs Markov Chain Monte Carlo (MCMC) methods. In this paper, we present an alternative implementation, which uses an alternative MCMC method built around an Itô stochastic differential equation (SDE) that is ergodic for the Bayesian posterior. We draw together from the mathematics literature a number of formal properties of this Itô SDE that lend support to its use in the implementation of the Bayesian method, and we describe its discretization, including the choice of the free parameters, by using the implicit Euler method. We demonstrate the proposed methodology on a problem of uncertainty quantification in a complex nonlinear engineering application relevant to metal forming.
Quantum-Like Representation of Non-Bayesian Inference
Asano, M.; Basieva, I.; Khrennikov, A.; Ohya, M.; Tanaka, Y.
2013-01-01
This research is related to the problem of "irrational decision making or inference" that have been discussed in cognitive psychology. There are some experimental studies, and these statistical data cannot be described by classical probability theory. The process of decision making generating these data cannot be reduced to the classical Bayesian inference. For this problem, a number of quantum-like coginitive models of decision making was proposed. Our previous work represented in a natural way the classical Bayesian inference in the frame work of quantum mechanics. By using this representation, in this paper, we try to discuss the non-Bayesian (irrational) inference that is biased by effects like the quantum interference. Further, we describe "psychological factor" disturbing "rationality" as an "environment" correlating with the "main system" of usual Bayesian inference.
Universal Darwinism As a Process of Bayesian Inference.
Campbell, John O
2016-01-01
Many of the mathematical frameworks describing natural selection are equivalent to Bayes' Theorem, also known as Bayesian updating. By definition, a process of Bayesian Inference is one which involves a Bayesian update, so we may conclude that these frameworks describe natural selection as a process of Bayesian inference. Thus, natural selection serves as a counter example to a widely-held interpretation that restricts Bayesian Inference to human mental processes (including the endeavors of statisticians). As Bayesian inference can always be cast in terms of (variational) free energy minimization, natural selection can be viewed as comprising two components: a generative model of an "experiment" in the external world environment, and the results of that "experiment" or the "surprise" entailed by predicted and actual outcomes of the "experiment." Minimization of free energy implies that the implicit measure of "surprise" experienced serves to update the generative model in a Bayesian manner. This description closely accords with the mechanisms of generalized Darwinian process proposed both by Dawkins, in terms of replicators and vehicles, and Campbell, in terms of inferential systems. Bayesian inference is an algorithm for the accumulation of evidence-based knowledge. This algorithm is now seen to operate over a wide range of evolutionary processes, including natural selection, the evolution of mental models and cultural evolutionary processes, notably including science itself. The variational principle of free energy minimization may thus serve as a unifying mathematical framework for universal Darwinism, the study of evolutionary processes operating throughout nature.
Universal Darwinism as a process of Bayesian inference
Directory of Open Access Journals (Sweden)
John Oberon Campbell
2016-06-01
Full Text Available Many of the mathematical frameworks describing natural selection are equivalent to Bayes’ Theorem, also known as Bayesian updating. By definition, a process of Bayesian Inference is one which involves a Bayesian update, so we may conclude that these frameworks describe natural selection as a process of Bayesian inference. Thus natural selection serves as a counter example to a widely-held interpretation that restricts Bayesian Inference to human mental processes (including the endeavors of statisticians. As Bayesian inference can always be cast in terms of (variational free energy minimization, natural selection can be viewed as comprising two components: a generative model of an ‘experiment’ in the external world environment, and the results of that 'experiment' or the 'surprise' entailed by predicted and actual outcomes of the ‘experiment’. Minimization of free energy implies that the implicit measure of 'surprise' experienced serves to update the generative model in a Bayesian manner. This description closely accords with the mechanisms of generalized Darwinian process proposed both by Dawkins, in terms of replicators and vehicles, and Campbell, in terms of inferential systems. Bayesian inference is an algorithm for the accumulation of evidence-based knowledge. This algorithm is now seen to operate over a wide range of evolutionary processes, including natural selection, the evolution of mental models and cultural evolutionary processes, notably including science itself. The variational principle of free energy minimization may thus serve as a unifying mathematical framework for universal Darwinism, the study of evolutionary processes operating throughout nature.
Anderson, Eric C; Ng, Thomas C
2016-02-01
We develop a computational framework for addressing pedigree inference problems using small numbers (80-400) of single nucleotide polymorphisms (SNPs). Our approach relaxes the assumptions, which are commonly made, that sampling is complete with respect to the pedigree and that there is no genotyping error. It relies on representing the inferred pedigree as a factor graph and invoking the Sum-Product algorithm to compute and store quantities that allow the joint probability of the data to be rapidly computed under a large class of rearrangements of the pedigree structure. This allows efficient MCMC sampling over the space of pedigrees, and, hence, Bayesian inference of pedigree structure. In this paper we restrict ourselves to inference of pedigrees without loops using SNPs assumed to be unlinked. We present the methodology in general for multigenerational inference, and we illustrate the method by applying it to the inference of full sibling groups in a large sample (n=1157) of Chinook salmon typed at 95 SNPs. The results show that our method provides a better point estimate and estimate of uncertainty than the currently best-available maximum-likelihood sibling reconstruction method. Extensions of this work to more complex scenarios are briefly discussed. Published by Elsevier Inc.
Bayesian inference of radiation belt loss timescales.
Camporeale, E.; Chandorkar, M.
2017-12-01
Electron fluxes in the Earth's radiation belts are routinely studied using the classical quasi-linear radial diffusion model. Although this simplified linear equation has proven to be an indispensable tool in understanding the dynamics of the radiation belt, it requires specification of quantities such as the diffusion coefficient and electron loss timescales that are never directly measured. Researchers have so far assumed a-priori parameterisations for radiation belt quantities and derived the best fit using satellite data. The state of the art in this domain lacks a coherent formulation of this problem in a probabilistic framework. We present some recent progress that we have made in performing Bayesian inference of radial diffusion parameters. We achieve this by making extensive use of the theory connecting Gaussian Processes and linear partial differential equations, and performing Markov Chain Monte Carlo sampling of radial diffusion parameters. These results are important for understanding the role and the propagation of uncertainties in radiation belt simulations and, eventually, for providing a probabilistic forecast of energetic electron fluxes in a Space Weather context.
BAYESIAN INFERENCE OF CMB GRAVITATIONAL LENSING
Energy Technology Data Exchange (ETDEWEB)
Anderes, Ethan [Department of Statistics, University of California, Davis, CA 95616 (United States); Wandelt, Benjamin D.; Lavaux, Guilhem [Sorbonne Universités, UPMC Univ Paris 06 and CNRS, UMR7095, Institut d’Astrophysique de Paris, F-75014, Paris (France)
2015-08-01
The Planck satellite, along with several ground-based telescopes, has mapped the cosmic microwave background (CMB) at sufficient resolution and signal-to-noise so as to allow a detection of the subtle distortions due to the gravitational influence of the intervening matter distribution. A natural modeling approach is to write a Bayesian hierarchical model for the lensed CMB in terms of the unlensed CMB and the lensing potential. So far there has been no feasible algorithm for inferring the posterior distribution of the lensing potential from the lensed CMB map. We propose a solution that allows efficient Markov Chain Monte Carlo sampling from the joint posterior of the lensing potential and the unlensed CMB map using the Hamiltonian Monte Carlo technique. The main conceptual step in the solution is a re-parameterization of CMB lensing in terms of the lensed CMB and the “inverse lensing” potential. We demonstrate a fast implementation on simulated data, including noise and a sky cut, that uses a further acceleration based on a very mild approximation of the inverse lensing potential. We find that the resulting Markov Chain has short correlation lengths and excellent convergence properties, making it promising for applications to high-resolution CMB data sets in the future.
Le Maitre, Olivier
2015-01-07
We address model dimensionality reduction in the Bayesian inference of Gaussian fields, considering prior covariance function with unknown hyper-parameters. The Karhunen-Loeve (KL) expansion of a prior Gaussian process is traditionally derived assuming fixed covariance function with pre-assigned hyperparameter values. Thus, the modes strengths of the Karhunen-Loeve expansion inferred using available observations, as well as the resulting inferred process, dependent on the pre-assigned values for the covariance hyper-parameters. Here, we seek to infer the process and its the covariance hyper-parameters in a single Bayesian inference. To this end, the uncertainty in the hyper-parameters is treated by means of a coordinate transformation, leading to a KL-type expansion on a fixed reference basis of spatial modes, but with random coordinates conditioned on the hyper-parameters. A Polynomial Chaos (PC) expansion of the model prediction is also introduced to accelerate the Bayesian inference and the sampling of the posterior distribution with MCMC method. The PC expansion of the model prediction also rely on a coordinates transformation, enabling us to avoid expanding the dependence of the prediction with respect to the covariance hyper-parameters. We demonstrate the efficiency of the proposed method on a transient diffusion equation by inferring spatially-varying log-diffusivity fields from noisy data.
Bayesian inference of chemical kinetic models from proposed reactions
Galagali, Nikhil; Marzouk, Youssef M.
2015-01-01
© 2014 Elsevier Ltd. Bayesian inference provides a natural framework for combining experimental data with prior knowledge to develop chemical kinetic models and quantify the associated uncertainties, not only in parameter values but also in model
Efficient design and inference in distributed Bayesian networks: an overview
de Oude, P.; Groen, F.C.A.; Pavlin, G.; Bezhanishvili, N.; Löbner, S.; Schwabe, K.; Spada, L.
2011-01-01
This paper discusses an approach to distributed Bayesian modeling and inference, which is relevant for an important class of contemporary real world situation assessment applications. By explicitly considering the locality of causal relations, the presented approach (i) supports coherent distributed
Bayesian Information Criterion as an Alternative way of Statistical Inference
Directory of Open Access Journals (Sweden)
Nadejda Yu. Gubanova
2012-05-01
Full Text Available The article treats Bayesian information criterion as an alternative to traditional methods of statistical inference, based on NHST. The comparison of ANOVA and BIC results for psychological experiment is discussed.
On Bayesian Inference under Sampling from Scale Mixtures of Normals
Fernández, C.; Steel, M.F.J.
1996-01-01
This paper considers a Bayesian analysis of the linear regression model under independent sampling from general scale mixtures of Normals.Using a common reference prior, we investigate the validity of Bayesian inference and the existence of posterior moments of the regression and precision
BayesTwin: An R Package for Bayesian Inference of Item-Level Twin Data
Directory of Open Access Journals (Sweden)
Inga Schwabe
2017-11-01
Full Text Available BayesTwin is an open-source R package that serves as a pipeline to the MCMC program JAGS to perform Bayesian inference on genetically-informative hierarchical twin data. Simultaneously to the biometric model, an item response theory (IRT measurement model is estimated, allowing analysis of the raw phenotypic (item-level data. The integration of such a measurement model is important since earlier research has shown that an analysis based on an aggregated measure (e.g., a sum-score based analysis can lead to an underestimation of heritability and the spurious finding of genotype-environment interactions. The package includes all common biometric and IRT models as well as functions that help plot relevant information or determine whether the analysis was performed well. Funding statement: Partly funded by the PROO grant 411-12-623 from the Netherlands Organisation for Scientific Research (NWO.
Sraj, Ihab
2016-08-26
The authors present a polynomial chaos (PC)-based Bayesian inference method for quantifying the uncertainties of the K-profile parameterization (KPP) within the MIT general circulation model (MITgcm) of the tropical Pacific. The inference of the uncertain parameters is based on a Markov chain Monte Carlo (MCMC) scheme that utilizes a newly formulated test statistic taking into account the different components representing the structures of turbulent mixing on both daily and seasonal time scales in addition to the data quality, and filters for the effects of parameter perturbations over those as a result of changes in the wind. To avoid the prohibitive computational cost of integrating the MITgcm model at each MCMC iteration, a surrogate model for the test statistic using the PC method is built. Because of the noise in the model predictions, a basis-pursuit-denoising (BPDN) compressed sensing approach is employed to determine the PC coefficients of a representative surrogate model. The PC surrogate is then used to evaluate the test statistic in the MCMC step for sampling the posterior of the uncertain parameters. Results of the posteriors indicate good agreement with the default values for two parameters of the KPP model, namely the critical bulk and gradient Richardson numbers; while the posteriors of the remaining parameters were barely informative. © 2016 American Meteorological Society.
Dimension-Independent Likelihood-Informed MCMC
Cui, Tiangang; Law, Kody; Marzouk, Youssef
2015-01-01
Many Bayesian inference problems require exploring the posterior distribution of high-dimensional parameters, which in principle can be described as functions. By exploiting low-dimensional structure in the change from prior to posterior [distributions], we introduce a suite of MCMC samplers that can adapt to the complex structure of the posterior distribution, yet are well-defined on function space. Posterior sampling in nonlinear inverse problems arising from various partial di erential equations and also a stochastic differential equation are used to demonstrate the e ciency of these dimension-independent likelihood-informed samplers.
Dimension-Independent Likelihood-Informed MCMC
Cui, Tiangang
2015-01-07
Many Bayesian inference problems require exploring the posterior distribution of high-dimensional parameters, which in principle can be described as functions. By exploiting low-dimensional structure in the change from prior to posterior [distributions], we introduce a suite of MCMC samplers that can adapt to the complex structure of the posterior distribution, yet are well-defined on function space. Posterior sampling in nonlinear inverse problems arising from various partial di erential equations and also a stochastic differential equation are used to demonstrate the e ciency of these dimension-independent likelihood-informed samplers.
International Nuclear Information System (INIS)
Lucka, Felix
2012-01-01
Sparsity has become a key concept for solving of high-dimensional inverse problems using variational regularization techniques. Recently, using similar sparsity-constraints in the Bayesian framework for inverse problems by encoding them in the prior distribution has attracted attention. Important questions about the relation between regularization theory and Bayesian inference still need to be addressed when using sparsity promoting inversion. A practical obstacle for these examinations is the lack of fast posterior sampling algorithms for sparse, high-dimensional Bayesian inversion. Accessing the full range of Bayesian inference methods requires being able to draw samples from the posterior probability distribution in a fast and efficient way. This is usually done using Markov chain Monte Carlo (MCMC) sampling algorithms. In this paper, we develop and examine a new implementation of a single component Gibbs MCMC sampler for sparse priors relying on L1-norms. We demonstrate that the efficiency of our Gibbs sampler increases when the level of sparsity or the dimension of the unknowns is increased. This property is contrary to the properties of the most commonly applied Metropolis–Hastings (MH) sampling schemes. We demonstrate that the efficiency of MH schemes for L1-type priors dramatically decreases when the level of sparsity or the dimension of the unknowns is increased. Practically, Bayesian inversion for L1-type priors using MH samplers is not feasible at all. As this is commonly believed to be an intrinsic feature of MCMC sampling, the performance of our Gibbs sampler also challenges common beliefs about the applicability of sample based Bayesian inference. (paper)
Cortical hierarchies perform Bayesian causal inference in multisensory perception.
Directory of Open Access Journals (Sweden)
Tim Rohe
2015-02-01
Full Text Available To form a veridical percept of the environment, the brain needs to integrate sensory signals from a common source but segregate those from independent sources. Thus, perception inherently relies on solving the "causal inference problem." Behaviorally, humans solve this problem optimally as predicted by Bayesian Causal Inference; yet, the underlying neural mechanisms are unexplored. Combining psychophysics, Bayesian modeling, functional magnetic resonance imaging (fMRI, and multivariate decoding in an audiovisual spatial localization task, we demonstrate that Bayesian Causal Inference is performed by a hierarchy of multisensory processes in the human brain. At the bottom of the hierarchy, in auditory and visual areas, location is represented on the basis that the two signals are generated by independent sources (= segregation. At the next stage, in posterior intraparietal sulcus, location is estimated under the assumption that the two signals are from a common source (= forced fusion. Only at the top of the hierarchy, in anterior intraparietal sulcus, the uncertainty about the causal structure of the world is taken into account and sensory signals are combined as predicted by Bayesian Causal Inference. Characterizing the computational operations of signal interactions reveals the hierarchical nature of multisensory perception in human neocortex. It unravels how the brain accomplishes Bayesian Causal Inference, a statistical computation fundamental for perception and cognition. Our results demonstrate how the brain combines information in the face of uncertainty about the underlying causal structure of the world.
Empirical Bayesian inference and model uncertainty
International Nuclear Information System (INIS)
Poern, K.
1994-01-01
This paper presents a hierarchical or multistage empirical Bayesian approach for the estimation of uncertainty concerning the intensity of a homogeneous Poisson process. A class of contaminated gamma distributions is considered to describe the uncertainty concerning the intensity. These distributions in turn are defined through a set of secondary parameters, the knowledge of which is also described and updated via Bayes formula. This two-stage Bayesian approach is an example where the modeling uncertainty is treated in a comprehensive way. Each contaminated gamma distributions, represented by a point in the 3D space of secondary parameters, can be considered as a specific model of the uncertainty about the Poisson intensity. Then, by the empirical Bayesian method each individual model is assigned a posterior probability
Bayesian Inference and Online Learning in Poisson Neuronal Networks.
Huang, Yanping; Rao, Rajesh P N
2016-08-01
Motivated by the growing evidence for Bayesian computation in the brain, we show how a two-layer recurrent network of Poisson neurons can perform both approximate Bayesian inference and learning for any hidden Markov model. The lower-layer sensory neurons receive noisy measurements of hidden world states. The higher-layer neurons infer a posterior distribution over world states via Bayesian inference from inputs generated by sensory neurons. We demonstrate how such a neuronal network with synaptic plasticity can implement a form of Bayesian inference similar to Monte Carlo methods such as particle filtering. Each spike in a higher-layer neuron represents a sample of a particular hidden world state. The spiking activity across the neural population approximates the posterior distribution over hidden states. In this model, variability in spiking is regarded not as a nuisance but as an integral feature that provides the variability necessary for sampling during inference. We demonstrate how the network can learn the likelihood model, as well as the transition probabilities underlying the dynamics, using a Hebbian learning rule. We present results illustrating the ability of the network to perform inference and learning for arbitrary hidden Markov models.
Progress on Bayesian Inference of the Fast Ion Distribution Function
DEFF Research Database (Denmark)
Stagner, L.; Heidbrink, W.W,; Chen, X.
2013-01-01
. However, when theory and experiment disagree (for one or more diagnostics), it is unclear how to proceed. Bayesian statistics provides a framework to infer the DF, quantify errors, and reconcile discrepant diagnostic measurements. Diagnostic errors and weight functions that describe the phase space...... sensitivity of the measurements are incorporated into Bayesian likelihood probabilities. Prior probabilities describe physical constraints. This poster will show reconstructions of classically described, low-power, MHD-quiescent distribution functions from actual FIDA measurements. A description of the full...
Robust bayesian inference of generalized Pareto distribution ...
African Journals Online (AJOL)
En utilisant une etude exhaustive de Monte Carlo, nous prouvons que, moyennant une fonction perte generalisee adequate, on peut construire un estimateur Bayesien robuste du modele. Key words: Bayesian estimation; Extreme value; Generalized Fisher information; Gener- alized Pareto distribution; Monte Carlo; ...
A Bayesian nonparametric approach to causal inference on quantiles.
Xu, Dandan; Daniels, Michael J; Winterstein, Almut G
2018-02-25
We propose a Bayesian nonparametric approach (BNP) for causal inference on quantiles in the presence of many confounders. In particular, we define relevant causal quantities and specify BNP models to avoid bias from restrictive parametric assumptions. We first use Bayesian additive regression trees (BART) to model the propensity score and then construct the distribution of potential outcomes given the propensity score using a Dirichlet process mixture (DPM) of normals model. We thoroughly evaluate the operating characteristics of our approach and compare it to Bayesian and frequentist competitors. We use our approach to answer an important clinical question involving acute kidney injury using electronic health records. © 2018, The International Biometric Society.
Neuronal integration of dynamic sources: Bayesian learning and Bayesian inference.
Siegelmann, Hava T; Holzman, Lars E
2010-09-01
One of the brain's most basic functions is integrating sensory data from diverse sources. This ability causes us to question whether the neural system is computationally capable of intelligently integrating data, not only when sources have known, fixed relative dependencies but also when it must determine such relative weightings based on dynamic conditions, and then use these learned weightings to accurately infer information about the world. We suggest that the brain is, in fact, fully capable of computing this parallel task in a single network and describe a neural inspired circuit with this property. Our implementation suggests the possibility that evidence learning requires a more complex organization of the network than was previously assumed, where neurons have different specialties, whose emergence brings the desired adaptivity seen in human online inference.
Dimension-independent likelihood-informed MCMC
Cui, Tiangang
2015-10-08
Many Bayesian inference problems require exploring the posterior distribution of high-dimensional parameters that represent the discretization of an underlying function. This work introduces a family of Markov chain Monte Carlo (MCMC) samplers that can adapt to the particular structure of a posterior distribution over functions. Two distinct lines of research intersect in the methods developed here. First, we introduce a general class of operator-weighted proposal distributions that are well defined on function space, such that the performance of the resulting MCMC samplers is independent of the discretization of the function. Second, by exploiting local Hessian information and any associated low-dimensional structure in the change from prior to posterior distributions, we develop an inhomogeneous discretization scheme for the Langevin stochastic differential equation that yields operator-weighted proposals adapted to the non-Gaussian structure of the posterior. The resulting dimension-independent and likelihood-informed (DILI) MCMC samplers may be useful for a large class of high-dimensional problems where the target probability measure has a density with respect to a Gaussian reference measure. Two nonlinear inverse problems are used to demonstrate the efficiency of these DILI samplers: an elliptic PDE coefficient inverse problem and path reconstruction in a conditioned diffusion.
Dimension-independent likelihood-informed MCMC
Cui, Tiangang; Law, Kody; Marzouk, Youssef M.
2015-01-01
Many Bayesian inference problems require exploring the posterior distribution of high-dimensional parameters that represent the discretization of an underlying function. This work introduces a family of Markov chain Monte Carlo (MCMC) samplers that can adapt to the particular structure of a posterior distribution over functions. Two distinct lines of research intersect in the methods developed here. First, we introduce a general class of operator-weighted proposal distributions that are well defined on function space, such that the performance of the resulting MCMC samplers is independent of the discretization of the function. Second, by exploiting local Hessian information and any associated low-dimensional structure in the change from prior to posterior distributions, we develop an inhomogeneous discretization scheme for the Langevin stochastic differential equation that yields operator-weighted proposals adapted to the non-Gaussian structure of the posterior. The resulting dimension-independent and likelihood-informed (DILI) MCMC samplers may be useful for a large class of high-dimensional problems where the target probability measure has a density with respect to a Gaussian reference measure. Two nonlinear inverse problems are used to demonstrate the efficiency of these DILI samplers: an elliptic PDE coefficient inverse problem and path reconstruction in a conditioned diffusion.
Directory of Open Access Journals (Sweden)
Ali Mohammad-Djafari
2015-06-01
Full Text Available The main content of this review article is first to review the main inference tools using Bayes rule, the maximum entropy principle (MEP, information theory, relative entropy and the Kullback–Leibler (KL divergence, Fisher information and its corresponding geometries. For each of these tools, the precise context of their use is described. The second part of the paper is focused on the ways these tools have been used in data, signal and image processing and in the inverse problems, which arise in different physical sciences and engineering applications. A few examples of the applications are described: entropy in independent components analysis (ICA and in blind source separation, Fisher information in data model selection, different maximum entropy-based methods in time series spectral estimation and in linear inverse problems and, finally, the Bayesian inference for general inverse problems. Some original materials concerning the approximate Bayesian computation (ABC and, in particular, the variational Bayesian approximation (VBA methods are also presented. VBA is used for proposing an alternative Bayesian computational tool to the classical Markov chain Monte Carlo (MCMC methods. We will also see that VBA englobes joint maximum a posteriori (MAP, as well as the different expectation-maximization (EM algorithms as particular cases.
Bayesian inference with information content model check for Langevin equations
DEFF Research Database (Denmark)
Krog, Jens F. C.; Lomholt, Michael Andersen
2017-01-01
The Bayesian data analysis framework has been proven to be a systematic and effective method of parameter inference and model selection for stochastic processes. In this work we introduce an information content model check which may serve as a goodness-of-fit, like the chi-square procedure...
Bayesian inference model for fatigue life of laminated composites
DEFF Research Database (Denmark)
Dimitrov, Nikolay Krasimirov; Kiureghian, Armen Der; Berggreen, Christian
2016-01-01
A probabilistic model for estimating the fatigue life of laminated composite plates is developed. The model is based on lamina-level input data, making it possible to predict fatigue properties for a wide range of laminate configurations. Model parameters are estimated by Bayesian inference. The ...
Stan: A Probabilistic Programming Language for Bayesian Inference and Optimization
Gelman, Andrew; Lee, Daniel; Guo, Jiqiang
2015-01-01
Stan is a free and open-source C++ program that performs Bayesian inference or optimization for arbitrary user-specified models and can be called from the command line, R, Python, Matlab, or Julia and has great promise for fitting large and complex statistical models in many areas of application. We discuss Stan from users' and developers'…
Directory of Open Access Journals (Sweden)
E. Laloy
2017-07-01
Full Text Available The rate at which low-lying sandy areas in temperate regions, such as the Campine Plateau (NE Belgium, have been eroding during the Quaternary is a matter of debate. Current knowledge on the average pace of landscape evolution in the Campine area is largely based on geological inferences and modern analogies. We performed a Bayesian inversion of an in situ-produced 10Be concentration depth profile to infer the average long-term erosion rate together with two other parameters: the surface exposure age and the inherited 10Be concentration. Compared to the latest advances in probabilistic inversion of cosmogenic radionuclide (CRN data, our approach has the following two innovative components: it (1 uses Markov chain Monte Carlo (MCMC sampling and (2 accounts (under certain assumptions for the contribution of model errors to posterior uncertainty. To investigate to what extent our approach differs from the state of the art in practice, a comparison against the Bayesian inversion method implemented in the CRONUScalc program is made. Both approaches identify similar maximum a posteriori (MAP parameter values, but posterior parameter and predictive uncertainty derived using the method taken in CRONUScalc is moderately underestimated. A simple way for producing more consistent uncertainty estimates with the CRONUScalc-like method in the presence of model errors is therefore suggested. Our inferred erosion rate of 39 ± 8. 9 mm kyr−1 (1σ is relatively large in comparison with landforms that erode under comparable (paleo-climates elsewhere in the world. We evaluate this value in the light of the erodibility of the substrate and sudden base level lowering during the Middle Pleistocene. A denser sampling scheme of a two-nuclide concentration depth profile would allow for better inferred erosion rate resolution, and including more uncertain parameters in the MCMC inversion.
Laloy, Eric; Beerten, Koen; Vanacker, Veerle; Christl, Marcus; Rogiers, Bart; Wouters, Laurent
2017-07-01
The rate at which low-lying sandy areas in temperate regions, such as the Campine Plateau (NE Belgium), have been eroding during the Quaternary is a matter of debate. Current knowledge on the average pace of landscape evolution in the Campine area is largely based on geological inferences and modern analogies. We performed a Bayesian inversion of an in situ-produced 10Be concentration depth profile to infer the average long-term erosion rate together with two other parameters: the surface exposure age and the inherited 10Be concentration. Compared to the latest advances in probabilistic inversion of cosmogenic radionuclide (CRN) data, our approach has the following two innovative components: it (1) uses Markov chain Monte Carlo (MCMC) sampling and (2) accounts (under certain assumptions) for the contribution of model errors to posterior uncertainty. To investigate to what extent our approach differs from the state of the art in practice, a comparison against the Bayesian inversion method implemented in the CRONUScalc program is made. Both approaches identify similar maximum a posteriori (MAP) parameter values, but posterior parameter and predictive uncertainty derived using the method taken in CRONUScalc is moderately underestimated. A simple way for producing more consistent uncertainty estimates with the CRONUScalc-like method in the presence of model errors is therefore suggested. Our inferred erosion rate of 39 ± 8. 9 mm kyr-1 (1σ) is relatively large in comparison with landforms that erode under comparable (paleo-)climates elsewhere in the world. We evaluate this value in the light of the erodibility of the substrate and sudden base level lowering during the Middle Pleistocene. A denser sampling scheme of a two-nuclide concentration depth profile would allow for better inferred erosion rate resolution, and including more uncertain parameters in the MCMC inversion.
Bayesian inference of substrate properties from film behavior
International Nuclear Information System (INIS)
Aggarwal, R; Demkowicz, M J; Marzouk, Y M
2015-01-01
We demonstrate that by observing the behavior of a film deposited on a substrate, certain features of the substrate may be inferred with quantified uncertainty using Bayesian methods. We carry out this demonstration on an illustrative film/substrate model where the substrate is a Gaussian random field and the film is a two-component mixture that obeys the Cahn–Hilliard equation. We construct a stochastic reduced order model to describe the film/substrate interaction and use it to infer substrate properties from film behavior. This quantitative inference strategy may be adapted to other film/substrate systems. (paper)
label.switching: An R Package for Dealing with the Label Switching Problem in MCMC Outputs
Directory of Open Access Journals (Sweden)
Panagiotis Papastamoulis
2016-02-01
Full Text Available Label switching is a well-known and fundamental problem in Bayesian estimation of mixture or hidden Markov models. In case that the prior distribution of the model parameters is the same for all states, then both the likelihood and posterior distribution are invariant to permutations of the parameters. This property makes Markov chain Monte Carlo (MCMC samples simulated from the posterior distribution non-identifiable. In this paper, the label.switching package is introduced. It contains one probabilistic and seven deterministic relabeling algorithms in order to post-process a given MCMC sample, provided by the user. Each method returns a set of permutations that can be used to reorder the MCMC output. Then, any parametric function of interest can be inferred using the reordered MCMC sample. A set of user-defined permutations is also accepted, allowing the researcher to benchmark new relabeling methods against the available ones.
Variational Bayesian Inference of Line Spectra
DEFF Research Database (Denmark)
Badiu, Mihai Alin; Hansen, Thomas Lundgaard; Fleury, Bernard Henri
2017-01-01
parameters. We propose an accurate representation of the pdfs of the frequencies by mixtures of von Mises pdfs, which yields closed-form expectations. We define the algorithm VALSE in which the estimates of the pdfs and parameters are iteratively updated. VALSE is a gridless, convergent method, does......; and the coefficients are governed by a Bernoulli-Gaussian prior model turning model order selection into binary sequence detection. Unlike earlier works which retain only point estimates of the frequencies, we undertake a more complete Bayesian treatment by estimating the posterior probability density functions (pdfs......) of the frequencies and computing expectations over them. Thus, we additionally capture and operate with the uncertainty of the frequency estimates. Aiming to maximize the model evidence, variational optimization provides analytic approximations of the posterior pdfs and also gives estimates of the additional...
Bayesian inference of chemical kinetic models from proposed reactions
Galagali, Nikhil
2015-02-01
© 2014 Elsevier Ltd. Bayesian inference provides a natural framework for combining experimental data with prior knowledge to develop chemical kinetic models and quantify the associated uncertainties, not only in parameter values but also in model structure. Most existing applications of Bayesian model selection methods to chemical kinetics have been limited to comparisons among a small set of models, however. The significant computational cost of evaluating posterior model probabilities renders traditional Bayesian methods infeasible when the model space becomes large. We present a new framework for tractable Bayesian model inference and uncertainty quantification using a large number of systematically generated model hypotheses. The approach involves imposing point-mass mixture priors over rate constants and exploring the resulting posterior distribution using an adaptive Markov chain Monte Carlo method. The posterior samples are used to identify plausible models, to quantify rate constant uncertainties, and to extract key diagnostic information about model structure-such as the reactions and operating pathways most strongly supported by the data. We provide numerical demonstrations of the proposed framework by inferring kinetic models for catalytic steam and dry reforming of methane using available experimental data.
Kim, Daesang
2016-01-06
A new Bayesian inference method has been developed and applied to Furan shock tube experimental data for efficient statistical inferences of the Arrhenius parameters of two OH radical consumption reactions. The collected experimental data, which consist of time series signals of OH radical concentrations of 14 shock tube experiments, may require several days for MCMC computations even with the support of a fast surrogate of the combustion simulation model, while the new method reduces it to several hours by splitting the process into two steps of MCMC: the first inference of rate constants and the second inference of the Arrhenius parameters. Each step has low dimensional parameter spaces and the second step does not need the executions of the combustion simulation. Furthermore, the new approach has more flexibility in choosing the ranges of the inference parameters, and the higher speed and flexibility enable the more accurate inferences and the analyses of the propagation of errors in the measured temperatures and the alignment of the experimental time to the inference results.
Bayesian Inference of High-Dimensional Dynamical Ocean Models
Lin, J.; Lermusiaux, P. F. J.; Lolla, S. V. T.; Gupta, A.; Haley, P. J., Jr.
2015-12-01
This presentation addresses a holistic set of challenges in high-dimension ocean Bayesian nonlinear estimation: i) predict the probability distribution functions (pdfs) of large nonlinear dynamical systems using stochastic partial differential equations (PDEs); ii) assimilate data using Bayes' law with these pdfs; iii) predict the future data that optimally reduce uncertainties; and (iv) rank the known and learn the new model formulations themselves. Overall, we allow the joint inference of the state, equations, geometry, boundary conditions and initial conditions of dynamical models. Examples are provided for time-dependent fluid and ocean flows, including cavity, double-gyre and Strait flows with jets and eddies. The Bayesian model inference, based on limited observations, is illustrated first by the estimation of obstacle shapes and positions in fluid flows. Next, the Bayesian inference of biogeochemical reaction equations and of their states and parameters is presented, illustrating how PDE-based machine learning can rigorously guide the selection and discovery of complex ecosystem models. Finally, the inference of multiscale bottom gravity current dynamics is illustrated, motivated in part by classic overflows and dense water formation sites and their relevance to climate monitoring and dynamics. This is joint work with our MSEAS group at MIT.
Bayesian inference data evaluation and decisions
Harney, Hanns Ludwig
2016-01-01
This new edition offers a comprehensive introduction to the analysis of data using Bayes rule. It generalizes Gaussian error intervals to situations in which the data follow distributions other than Gaussian. This is particularly useful when the observed parameter is barely above the background or the histogram of multiparametric data contains many empty bins, so that the determination of the validity of a theory cannot be based on the chi-squared-criterion. In addition to the solutions of practical problems, this approach provides an epistemic insight: the logic of quantum mechanics is obtained as the logic of unbiased inference from counting data. New sections feature factorizing parameters, commuting parameters, observables in quantum mechanics, the art of fitting with coherent and with incoherent alternatives and fitting with multinomial distribution. Additional problems and examples help deepen the knowledge. Requiring no knowledge of quantum mechanics, the book is written on introductory level, with man...
Bayesian inference and updating of reliability data
International Nuclear Information System (INIS)
Sabri, Z.A.; Cullingford, M.C.; David, H.T.; Husseiny, A.A.
1980-01-01
A Bayes methodology for inference of reliability values using available but scarce current data is discussed. The method can be used to update failure rates as more information becomes available from field experience, assuming that the performance of a given component (or system) exhibits a nonhomogeneous Poisson process. Bayes' theorem is used to summarize the historical evidence and current component data in the form of a posterior distribution suitable for prediction and for smoothing or interpolation. An example is given. It may be appropriate to apply the methodology developed here to human error data, in which case the exponential model might be used to describe the learning behavior of the operator or maintenance crew personnel
Bayesian electron density inference from JET lithium beam emission spectra using Gaussian processes
Kwak, Sehyun; Svensson, J.; Brix, M.; Ghim, Y.-C.; Contributors, JET
2017-03-01
A Bayesian model to infer edge electron density profiles is developed for the JET lithium beam emission spectroscopy (Li-BES) system, measuring Li I (2p-2s) line radiation using 26 channels with ∼1 cm spatial resolution and 10∼ 20 ms temporal resolution. The density profile is modelled using a Gaussian process prior, and the uncertainty of the density profile is calculated by a Markov Chain Monte Carlo (MCMC) scheme. From the spectra measured by the transmission grating spectrometer, the Li I line intensities are extracted, and modelled as a function of the plasma density by a multi-state model which describes the relevant processes between neutral lithium beam atoms and plasma particles. The spectral model fully takes into account interference filter and instrument effects, that are separately estimated, again using Gaussian processes. The line intensities are inferred based on a spectral model consistent with the measured spectra within their uncertainties, which includes photon statistics and electronic noise. Our newly developed method to infer JET edge electron density profiles has the following advantages in comparison to the conventional method: (i) providing full posterior distributions of edge density profiles, including their associated uncertainties, (ii) the available radial range for density profiles is increased to the full observation range (∼26 cm), (iii) an assumption of monotonic electron density profile is not necessary, (iv) the absolute calibration factor of the diagnostic system is automatically estimated overcoming the limitation of the conventional technique and allowing us to infer the electron density profiles for all pulses without preprocessing the data or an additional boundary condition, and (v) since the full spectrum is modelled, the procedure of modulating the beam to measure the background signal is only necessary for the case of overlapping of the Li I line with impurity lines.
Evolution in Mind: Evolutionary Dynamics, Cognitive Processes, and Bayesian Inference.
Suchow, Jordan W; Bourgin, David D; Griffiths, Thomas L
2017-07-01
Evolutionary theory describes the dynamics of population change in settings affected by reproduction, selection, mutation, and drift. In the context of human cognition, evolutionary theory is most often invoked to explain the origins of capacities such as language, metacognition, and spatial reasoning, framing them as functional adaptations to an ancestral environment. However, evolutionary theory is useful for understanding the mind in a second way: as a mathematical framework for describing evolving populations of thoughts, ideas, and memories within a single mind. In fact, deep correspondences exist between the mathematics of evolution and of learning, with perhaps the deepest being an equivalence between certain evolutionary dynamics and Bayesian inference. This equivalence permits reinterpretation of evolutionary processes as algorithms for Bayesian inference and has relevance for understanding diverse cognitive capacities, including memory and creativity. Copyright © 2017 Elsevier Ltd. All rights reserved.
Application of Bayesian inference to stochastic analytic continuation
International Nuclear Information System (INIS)
Fuchs, S; Pruschke, T; Jarrell, M
2010-01-01
We present an algorithm for the analytic continuation of imaginary-time quantum Monte Carlo data. The algorithm is strictly based on principles of Bayesian statistical inference. It utilizes Monte Carlo simulations to calculate a weighted average of possible energy spectra. We apply the algorithm to imaginary-time quantum Monte Carlo data and compare the resulting energy spectra with those from a standard maximum entropy calculation.
Practical Statistics for LHC Physicists: Bayesian Inference (3/3)
CERN. Geneva
2015-01-01
These lectures cover those principles and practices of statistics that are most relevant for work at the LHC. The first lecture discusses the basic ideas of descriptive statistics, probability and likelihood. The second lecture covers the key ideas in the frequentist approach, including confidence limits, profile likelihoods, p-values, and hypothesis testing. The third lecture covers inference in the Bayesian approach. Throughout, real-world examples will be used to illustrate the practical application of the ideas. No previous knowledge is assumed.
Furtado-Junior, I; Abrunhosa, F A; Holanda, F C A F; Tavares, M C S
2016-06-01
Fishing selectivity of the mangrove crab Ucides cordatus in the north coast of Brazil can be defined as the fisherman's ability to capture and select individuals from a certain size or sex (or a combination of these factors) which suggests an empirical selectivity. Considering this hypothesis, we calculated the selectivity curves for males and females crabs using the logit function of the logistic model in the formulation. The Bayesian inference consisted of obtaining the posterior distribution by applying the Markov chain Monte Carlo (MCMC) method to software R using the OpenBUGS, BRugs, and R2WinBUGS libraries. The estimated results of width average carapace selection for males and females compared with previous studies reporting the average width of the carapace of sexual maturity allow us to confirm the hypothesis that most mature individuals do not suffer from fishing pressure; thus, ensuring their sustainability.
Bayesian inference in processing experimental data: principles and basic applications
International Nuclear Information System (INIS)
D'Agostini, G
2003-01-01
This paper introduces general ideas and some basic methods of the Bayesian probability theory applied to physics measurements. Our aim is to make the reader familiar, through examples rather than rigorous formalism, with concepts such as the following: model comparison (including the automatic Ockham's Razor filter provided by the Bayesian approach); parametric inference; quantification of the uncertainty about the value of physical quantities, also taking into account systematic effects; role of marginalization; posterior characterization; predictive distributions; hierarchical modelling and hyperparameters; Gaussian approximation of the posterior and recovery of conventional methods, especially maximum likelihood and chi-square fits under well-defined conditions; conjugate priors, transformation invariance and maximum entropy motivated priors; and Monte Carlo (MC) estimates of expectation, including a short introduction to Markov Chain MC methods
Fast model updating coupling Bayesian inference and PGD model reduction
Rubio, Paul-Baptiste; Louf, François; Chamoin, Ludovic
2018-04-01
The paper focuses on a coupled Bayesian-Proper Generalized Decomposition (PGD) approach for the real-time identification and updating of numerical models. The purpose is to use the most general case of Bayesian inference theory in order to address inverse problems and to deal with different sources of uncertainties (measurement and model errors, stochastic parameters). In order to do so with a reasonable CPU cost, the idea is to replace the direct model called for Monte-Carlo sampling by a PGD reduced model, and in some cases directly compute the probability density functions from the obtained analytical formulation. This procedure is first applied to a welding control example with the updating of a deterministic parameter. In the second application, the identification of a stochastic parameter is studied through a glued assembly example.
Sparse linear models: Variational approximate inference and Bayesian experimental design
International Nuclear Information System (INIS)
Seeger, Matthias W
2009-01-01
A wide range of problems such as signal reconstruction, denoising, source separation, feature selection, and graphical model search are addressed today by posterior maximization for linear models with sparsity-favouring prior distributions. The Bayesian posterior contains useful information far beyond its mode, which can be used to drive methods for sampling optimization (active learning), feature relevance ranking, or hyperparameter estimation, if only this representation of uncertainty can be approximated in a tractable manner. In this paper, we review recent results for variational sparse inference, and show that they share underlying computational primitives. We discuss how sampling optimization can be implemented as sequential Bayesian experimental design. While there has been tremendous recent activity to develop sparse estimation, little attendance has been given to sparse approximate inference. In this paper, we argue that many problems in practice, such as compressive sensing for real-world image reconstruction, are served much better by proper uncertainty approximations than by ever more aggressive sparse estimation algorithms. Moreover, since some variational inference methods have been given strong convex optimization characterizations recently, theoretical analysis may become possible, promising new insights into nonlinear experimental design.
Sparse linear models: Variational approximate inference and Bayesian experimental design
Energy Technology Data Exchange (ETDEWEB)
Seeger, Matthias W [Saarland University and Max Planck Institute for Informatics, Campus E1.4, 66123 Saarbruecken (Germany)
2009-12-01
A wide range of problems such as signal reconstruction, denoising, source separation, feature selection, and graphical model search are addressed today by posterior maximization for linear models with sparsity-favouring prior distributions. The Bayesian posterior contains useful information far beyond its mode, which can be used to drive methods for sampling optimization (active learning), feature relevance ranking, or hyperparameter estimation, if only this representation of uncertainty can be approximated in a tractable manner. In this paper, we review recent results for variational sparse inference, and show that they share underlying computational primitives. We discuss how sampling optimization can be implemented as sequential Bayesian experimental design. While there has been tremendous recent activity to develop sparse estimation, little attendance has been given to sparse approximate inference. In this paper, we argue that many problems in practice, such as compressive sensing for real-world image reconstruction, are served much better by proper uncertainty approximations than by ever more aggressive sparse estimation algorithms. Moreover, since some variational inference methods have been given strong convex optimization characterizations recently, theoretical analysis may become possible, promising new insights into nonlinear experimental design.
ANUBIS: artificial neuromodulation using a Bayesian inference system.
Smith, Benjamin J H; Saaj, Chakravarthini M; Allouis, Elie
2013-01-01
Gain tuning is a crucial part of controller design and depends not only on an accurate understanding of the system in question, but also on the designer's ability to predict what disturbances and other perturbations the system will encounter throughout its operation. This letter presents ANUBIS (artificial neuromodulation using a Bayesian inference system), a novel biologically inspired technique for automatically tuning controller parameters in real time. ANUBIS is based on the Bayesian brain concept and modifies it by incorporating a model of the neuromodulatory system comprising four artificial neuromodulators. It has been applied to the controller of EchinoBot, a prototype walking rover for Martian exploration. ANUBIS has been implemented at three levels of the controller; gait generation, foot trajectory planning using Bézier curves, and foot trajectory tracking using a terminal sliding mode controller. We compare the results to a similar system that has been tuned using a multilayer perceptron. The use of Bayesian inference means that the system retains mathematical interpretability, unlike other intelligent tuning techniques, which use neural networks, fuzzy logic, or evolutionary algorithms. The simulation results show that ANUBIS provides significant improvements in efficiency and adaptability of the three controller components; it allows the robot to react to obstacles and uncertainties faster than the system tuned with the MLP, while maintaining stability and accuracy. As well as advancing rover autonomy, ANUBIS could also be applied to other situations where operating conditions are likely to change or cannot be accurately modeled in advance, such as process control. In addition, it demonstrates one way in which neuromodulation could fit into the Bayesian brain framework.
GPU Computing in Bayesian Inference of Realized Stochastic Volatility Model
International Nuclear Information System (INIS)
Takaishi, Tetsuya
2015-01-01
The realized stochastic volatility (RSV) model that utilizes the realized volatility as additional information has been proposed to infer volatility of financial time series. We consider the Bayesian inference of the RSV model by the Hybrid Monte Carlo (HMC) algorithm. The HMC algorithm can be parallelized and thus performed on the GPU for speedup. The GPU code is developed with CUDA Fortran. We compare the computational time in performing the HMC algorithm on GPU (GTX 760) and CPU (Intel i7-4770 3.4GHz) and find that the GPU can be up to 17 times faster than the CPU. We also code the program with OpenACC and find that appropriate coding can achieve the similar speedup with CUDA Fortran
Inferring on the Intentions of Others by Hierarchical Bayesian Learning
Diaconescu, Andreea O.; Mathys, Christoph; Weber, Lilian A. E.; Daunizeau, Jean; Kasper, Lars; Lomakina, Ekaterina I.; Fehr, Ernst; Stephan, Klaas E.
2014-01-01
Inferring on others' (potentially time-varying) intentions is a fundamental problem during many social transactions. To investigate the underlying mechanisms, we applied computational modeling to behavioral data from an economic game in which 16 pairs of volunteers (randomly assigned to “player” or “adviser” roles) interacted. The player performed a probabilistic reinforcement learning task, receiving information about a binary lottery from a visual pie chart. The adviser, who received more predictive information, issued an additional recommendation. Critically, the game was structured such that the adviser's incentives to provide helpful or misleading information varied in time. Using a meta-Bayesian modeling framework, we found that the players' behavior was best explained by the deployment of hierarchical learning: they inferred upon the volatility of the advisers' intentions in order to optimize their predictions about the validity of their advice. Beyond learning, volatility estimates also affected the trial-by-trial variability of decisions: participants were more likely to rely on their estimates of advice accuracy for making choices when they believed that the adviser's intentions were presently stable. Finally, our model of the players' inference predicted the players' interpersonal reactivity index (IRI) scores, explicit ratings of the advisers' helpfulness and the advisers' self-reports on their chosen strategy. Overall, our results suggest that humans (i) employ hierarchical generative models to infer on the changing intentions of others, (ii) use volatility estimates to inform decision-making in social interactions, and (iii) integrate estimates of advice accuracy with non-social sources of information. The Bayesian framework presented here can quantify individual differences in these mechanisms from simple behavioral readouts and may prove useful in future clinical studies of maladaptive social cognition. PMID:25187943
Bayesian inference method for stochastic damage accumulation modeling
International Nuclear Information System (INIS)
Jiang, Xiaomo; Yuan, Yong; Liu, Xian
2013-01-01
Damage accumulation based reliability model plays an increasingly important role in successful realization of condition based maintenance for complicated engineering systems. This paper developed a Bayesian framework to establish stochastic damage accumulation model from historical inspection data, considering data uncertainty. Proportional hazards modeling technique is developed to model the nonlinear effect of multiple influencing factors on system reliability. Different from other hazard modeling techniques such as normal linear regression model, the approach does not require any distribution assumption for the hazard model, and can be applied for a wide variety of distribution models. A Bayesian network is created to represent the nonlinear proportional hazards models and to estimate model parameters by Bayesian inference with Markov Chain Monte Carlo simulation. Both qualitative and quantitative approaches are developed to assess the validity of the established damage accumulation model. Anderson–Darling goodness-of-fit test is employed to perform the normality test, and Box–Cox transformation approach is utilized to convert the non-normality data into normal distribution for hypothesis testing in quantitative model validation. The methodology is illustrated with the seepage data collected from real-world subway tunnels.
Bayesian inference from count data using discrete uniform priors.
Directory of Open Access Journals (Sweden)
Federico Comoglio
Full Text Available We consider a set of sample counts obtained by sampling arbitrary fractions of a finite volume containing an homogeneously dispersed population of identical objects. We report a Bayesian derivation of the posterior probability distribution of the population size using a binomial likelihood and non-conjugate, discrete uniform priors under sampling with or without replacement. Our derivation yields a computationally feasible formula that can prove useful in a variety of statistical problems involving absolute quantification under uncertainty. We implemented our algorithm in the R package dupiR and compared it with a previously proposed Bayesian method based on a Gamma prior. As a showcase, we demonstrate that our inference framework can be used to estimate bacterial survival curves from measurements characterized by extremely low or zero counts and rather high sampling fractions. All in all, we provide a versatile, general purpose algorithm to infer population sizes from count data, which can find application in a broad spectrum of biological and physical problems.
Extreme-Scale Bayesian Inference for Uncertainty Quantification of Complex Simulations
Energy Technology Data Exchange (ETDEWEB)
Biros, George [Univ. of Texas, Austin, TX (United States)
2018-01-12
Uncertainty quantification (UQ)—that is, quantifying uncertainties in complex mathematical models and their large-scale computational implementations—is widely viewed as one of the outstanding challenges facing the field of CS&E over the coming decade. The EUREKA project set to address the most difficult class of UQ problems: those for which both the underlying PDE model as well as the uncertain parameters are of extreme scale. In the project we worked on these extreme-scale challenges in the following four areas: 1. Scalable parallel algorithms for sampling and characterizing the posterior distribution that exploit the structure of the underlying PDEs and parameter-to-observable map. These include structure-exploiting versions of the randomized maximum likelihood method, which aims to overcome the intractability of employing conventional MCMC methods for solving extreme-scale Bayesian inversion problems by appealing to and adapting ideas from large-scale PDE-constrained optimization, which have been very successful at exploring high-dimensional spaces. 2. Scalable parallel algorithms for construction of prior and likelihood functions based on learning methods and non-parametric density estimation. Constructing problem-specific priors remains a critical challenge in Bayesian inference, and more so in high dimensions. Another challenge is construction of likelihood functions that capture unmodeled couplings between observations and parameters. We will create parallel algorithms for non-parametric density estimation using high dimensional N-body methods and combine them with supervised learning techniques for the construction of priors and likelihood functions. 3. Bayesian inadequacy models, which augment physics models with stochastic models that represent their imperfections. The success of the Bayesian inference framework depends on the ability to represent the uncertainty due to imperfections of the mathematical model of the phenomena of interest. This is a
Bayesian Inference for Signal-Based Seismic Monitoring
Moore, D.
2015-12-01
Traditional seismic monitoring systems rely on discrete detections produced by station processing software, discarding significant information present in the original recorded signal. SIG-VISA (Signal-based Vertically Integrated Seismic Analysis) is a system for global seismic monitoring through Bayesian inference on seismic signals. By modeling signals directly, our forward model is able to incorporate a rich representation of the physics underlying the signal generation process, including source mechanisms, wave propagation, and station response. This allows inference in the model to recover the qualitative behavior of recent geophysical methods including waveform matching and double-differencing, all as part of a unified Bayesian monitoring system that simultaneously detects and locates events from a global network of stations. We demonstrate recent progress in scaling up SIG-VISA to efficiently process the data stream of global signals recorded by the International Monitoring System (IMS), including comparisons against existing processing methods that show increased sensitivity from our signal-based model and in particular the ability to locate events (including aftershock sequences that can tax analyst processing) precisely from waveform correlation effects. We also provide a Bayesian analysis of an alleged low-magnitude event near the DPRK test site in May 2010 [1] [2], investigating whether such an event could plausibly be detected through automated processing in a signal-based monitoring system. [1] Zhang, Miao and Wen, Lianxing. "Seismological Evidence for a Low-Yield Nuclear Test on 12 May 2010 in North Korea". Seismological Research Letters, January/February 2015. [2] Richards, Paul. "A Seismic Event in North Korea on 12 May 2010". CTBTO SnT 2015 oral presentation, video at https://video-archive.ctbto.org/index.php/kmc/preview/partner_id/103/uiconf_id/4421629/entry_id/0_ymmtpps0/delivery/http
Link, William A.; Eaton, Mitchell J.
2012-01-01
1. Markov chain Monte Carlo (MCMC) is a simulation technique that has revolutionised the analysis of ecological data, allowing the fitting of complex models in a Bayesian framework. Since 2001, there have been nearly 200 papers using MCMC in publications of the Ecological Society of America and the British Ecological Society, including more than 75 in the journal Ecology and 35 in the Journal of Applied Ecology.
vs. a polynomial chaos-based MCMC
Siripatana, Adil
2014-08-01
Bayesian Inference of Manning\\'s n coefficient in a Storm Surge Model Framework: comparison between Kalman lter and polynomial based method Adil Siripatana Conventional coastal ocean models solve the shallow water equations, which describe the conservation of mass and momentum when the horizontal length scale is much greater than the vertical length scale. In this case vertical pressure gradients in the momentum equations are nearly hydrostatic. The outputs of coastal ocean models are thus sensitive to the bottom stress terms de ned through the formulation of Manning\\'s n coefficients. This thesis considers the Bayesian inference problem of the Manning\\'s n coefficient in the context of storm surge based on the coastal ocean ADCIRC model. In the first part of the thesis, we apply an ensemble-based Kalman filter, the singular evolutive interpolated Kalman (SEIK) filter to estimate both a constant Manning\\'s n coefficient and a 2-D parameterized Manning\\'s coefficient on one ideal and one of more realistic domain using observation system simulation experiments (OSSEs). We study the sensitivity of the system to the ensemble size. we also access the benefits from using an in ation factor on the filter performance. To study the limitation of the Guassian restricted assumption on the SEIK lter, 5 we also implemented in the second part of this thesis a Markov Chain Monte Carlo (MCMC) method based on a Generalized Polynomial chaos (gPc) approach for the estimation of the 1-D and 2-D Mannning\\'s n coe cient. The gPc is used to build a surrogate model that imitate the ADCIRC model in order to make the computational cost of implementing the MCMC with the ADCIRC model reasonable. We evaluate the performance of the MCMC-gPc approach and study its robustness to di erent OSSEs scenario. we also compare its estimates with those resulting from SEIK in term of parameter estimates and full distributions. we present a full analysis of the solution of these two methods, of the
Bayesian inference for identifying interaction rules in moving animal groups.
Directory of Open Access Journals (Sweden)
Richard P Mann
Full Text Available The emergence of similar collective patterns from different self-propelled particle models of animal groups points to a restricted set of "universal" classes for these patterns. While universality is interesting, it is often the fine details of animal interactions that are of biological importance. Universality thus presents a challenge to inferring such interactions from macroscopic group dynamics since these can be consistent with many underlying interaction models. We present a Bayesian framework for learning animal interaction rules from fine scale recordings of animal movements in swarms. We apply these techniques to the inverse problem of inferring interaction rules from simulation models, showing that parameters can often be inferred from a small number of observations. Our methodology allows us to quantify our confidence in parameter fitting. For example, we show that attraction and alignment terms can be reliably estimated when animals are milling in a torus shape, while interaction radius cannot be reliably measured in such a situation. We assess the importance of rate of data collection and show how to test different models, such as topological and metric neighbourhood models. Taken together our results both inform the design of experiments on animal interactions and suggest how these data should be best analysed.
Bayesian inference for psychology. Part I : Theoretical advantages and practical ramifications
Wagenmakers, E.-J.; Marsman, M.; Jamil, T.; Ly, A.; Verhagen, J.; Love, J.; Selker, R.; Gronau, Q.F.; Šmíra, M.; Epskamp, S.; Matzke, D.; Rouder, J.N.; Morey, R.D.
2018-01-01
Bayesian parameter estimation and Bayesian hypothesis testing present attractive alternatives to classical inference using confidence intervals and p values. In part I of this series we outline ten prominent advantages of the Bayesian approach. Many of these advantages translate to concrete
Siripatana, Adil; Mayo, Talea; Sraj, Ihab; Knio, Omar; Dawson, Clint; Le Maitre, Olivier; Hoteit, Ibrahim
2017-01-01
an ensemble Kalman-based data assimilation method for parameter estimation of a coastal ocean model against an MCMC polynomial chaos (PC)-based scheme. We focus on quantifying the uncertainties of a coastal ocean ADvanced CIRCulation (ADCIRC) model
Liu, Fang; Eugenio, Evercita C
2018-04-01
Beta regression is an increasingly popular statistical technique in medical research for modeling of outcomes that assume values in (0, 1), such as proportions and patient reported outcomes. When outcomes take values in the intervals [0,1), (0,1], or [0,1], zero-or-one-inflated beta (zoib) regression can be used. We provide a thorough review on beta regression and zoib regression in the modeling, inferential, and computational aspects via the likelihood-based and Bayesian approaches. We demonstrate the statistical and practical importance of correctly modeling the inflation at zero/one rather than ad hoc replacing them with values close to zero/one via simulation studies; the latter approach can lead to biased estimates and invalid inferences. We show via simulation studies that the likelihood-based approach is computationally faster in general than MCMC algorithms used in the Bayesian inferences, but runs the risk of non-convergence, large biases, and sensitivity to starting values in the optimization algorithm especially with clustered/correlated data, data with sparse inflation at zero and one, and data that warrant regularization of the likelihood. The disadvantages of the regular likelihood-based approach make the Bayesian approach an attractive alternative in these cases. Software packages and tools for fitting beta and zoib regressions in both the likelihood-based and Bayesian frameworks are also reviewed.
Bayesian Inference for Linear Parabolic PDEs with Noisy Boundary Conditions
Ruggeri, Fabrizio; Sawlan, Zaid A; Scavino, Marco; Tempone, Raul
2016-01-01
In this work we develop a hierarchical Bayesian setting to infer unknown parameters in initial-boundary value problems (IBVPs) for one-dimensional linear parabolic partial differential equations. Noisy boundary data and known initial condition are assumed. We derive the likelihood function associated with the forward problem, given some measurements of the solution field subject to Gaussian noise. Such function is then analytically marginalized using the linearity of the equation. Gaussian priors have been assumed for the time-dependent Dirichlet boundary values. Our approach is applied to synthetic data for the one-dimensional heat equation model, where the thermal diffusivity is the unknown parameter. We show how to infer the thermal diffusivity parameter when its prior distribution is lognormal or modeled by means of a space-dependent stationary lognormal random field. We use the Laplace method to provide approximated Gaussian posterior distributions for the thermal diffusivity. Expected information gains and predictive posterior densities for observable quantities are numerically estimated for different experimental setups.
Bayesian Inference for Linear Parabolic PDEs with Noisy Boundary Conditions
Ruggeri, Fabrizio
2015-01-07
In this work we develop a hierarchical Bayesian setting to infer unknown parameters in initial-boundary value problems (IBVPs) for one-dimensional linear parabolic partial differential equations. Noisy boundary data and known initial condition are assumed. We derive the likelihood function associated with the forward problem, given some measurements of the solution field subject to Gaussian noise. Such function is then analytically marginalized using the linearity of the equation. Gaussian priors have been assumed for the time-dependent Dirichlet boundary values. Our approach is applied to synthetic data for the one-dimensional heat equation model, where the thermal diffusivity is the unknown parameter. We show how to infer the thermal diffusivity parameter when its prior distribution is lognormal or modeled by means of a space-dependent stationary lognormal random field. We use the Laplace method to provide approximated Gaussian posterior distributions for the thermal diffusivity. Expected information gains and predictive posterior densities for observable quantities are numerically estimated for different experimental setups.
Bayesian Inference for Linear Parabolic PDEs with Noisy Boundary Conditions
Ruggeri, Fabrizio
2016-01-06
In this work we develop a hierarchical Bayesian setting to infer unknown parameters in initial-boundary value problems (IBVPs) for one-dimensional linear parabolic partial differential equations. Noisy boundary data and known initial condition are assumed. We derive the likelihood function associated with the forward problem, given some measurements of the solution field subject to Gaussian noise. Such function is then analytically marginalized using the linearity of the equation. Gaussian priors have been assumed for the time-dependent Dirichlet boundary values. Our approach is applied to synthetic data for the one-dimensional heat equation model, where the thermal diffusivity is the unknown parameter. We show how to infer the thermal diffusivity parameter when its prior distribution is lognormal or modeled by means of a space-dependent stationary lognormal random field. We use the Laplace method to provide approximated Gaussian posterior distributions for the thermal diffusivity. Expected information gains and predictive posterior densities for observable quantities are numerically estimated for different experimental setups.
Bayesian inference for partially identified models exploring the limits of limited data
Gustafson, Paul
2015-01-01
Introduction Identification What Is against Us? What Is for Us? Some Simple Examples of Partially Identified ModelsThe Road Ahead The Structure of Inference in Partially Identified Models Bayesian Inference The Structure of Posterior Distributions in PIMs Computational Strategies Strength of Bayesian Updating, Revisited Posterior MomentsCredible Intervals Evaluating the Worth of Inference Partial Identification versus Model Misspecification The Siren Call of Identification Comp
Rajabi, Mohammad Mahdi; Ataie-Ashtiani, Behzad
2016-05-01
Bayesian inference has traditionally been conceived as the proper framework for the formal incorporation of expert knowledge in parameter estimation of groundwater models. However, conventional Bayesian inference is incapable of taking into account the imprecision essentially embedded in expert provided information. In order to solve this problem, a number of extensions to conventional Bayesian inference have been introduced in recent years. One of these extensions is 'fuzzy Bayesian inference' which is the result of integrating fuzzy techniques into Bayesian statistics. Fuzzy Bayesian inference has a number of desirable features which makes it an attractive approach for incorporating expert knowledge in the parameter estimation process of groundwater models: (1) it is well adapted to the nature of expert provided information, (2) it allows to distinguishably model both uncertainty and imprecision, and (3) it presents a framework for fusing expert provided information regarding the various inputs of the Bayesian inference algorithm. However an important obstacle in employing fuzzy Bayesian inference in groundwater numerical modeling applications is the computational burden, as the required number of numerical model simulations often becomes extremely exhaustive and often computationally infeasible. In this paper, a novel approach of accelerating the fuzzy Bayesian inference algorithm is proposed which is based on using approximate posterior distributions derived from surrogate modeling, as a screening tool in the computations. The proposed approach is first applied to a synthetic test case of seawater intrusion (SWI) in a coastal aquifer. It is shown that for this synthetic test case, the proposed approach decreases the number of required numerical simulations by an order of magnitude. Then the proposed approach is applied to a real-world test case involving three-dimensional numerical modeling of SWI in Kish Island, located in the Persian Gulf. An expert
Bayesian Inference for Functional Dynamics Exploring in fMRI Data
Directory of Open Access Journals (Sweden)
Xuan Guo
2016-01-01
Full Text Available This paper aims to review state-of-the-art Bayesian-inference-based methods applied to functional magnetic resonance imaging (fMRI data. Particularly, we focus on one specific long-standing challenge in the computational modeling of fMRI datasets: how to effectively explore typical functional interactions from fMRI time series and the corresponding boundaries of temporal segments. Bayesian inference is a method of statistical inference which has been shown to be a powerful tool to encode dependence relationships among the variables with uncertainty. Here we provide an introduction to a group of Bayesian-inference-based methods for fMRI data analysis, which were designed to detect magnitude or functional connectivity change points and to infer their functional interaction patterns based on corresponding temporal boundaries. We also provide a comparison of three popular Bayesian models, that is, Bayesian Magnitude Change Point Model (BMCPM, Bayesian Connectivity Change Point Model (BCCPM, and Dynamic Bayesian Variable Partition Model (DBVPM, and give a summary of their applications. We envision that more delicate Bayesian inference models will be emerging and play increasingly important roles in modeling brain functions in the years to come.
Hernández, Mario R.; Francés, Félix
2015-04-01
One phase of the hydrological models implementation process, significantly contributing to the hydrological predictions uncertainty, is the calibration phase in which values of the unknown model parameters are tuned by optimizing an objective function. An unsuitable error model (e.g. Standard Least Squares or SLS) introduces noise into the estimation of the parameters. The main sources of this noise are the input errors and the hydrological model structural deficiencies. Thus, the biased calibrated parameters cause the divergence model phenomenon, where the errors variance of the (spatially and temporally) forecasted flows far exceeds the errors variance in the fitting period, and provoke the loss of part or all of the physical meaning of the modeled processes. In other words, yielding a calibrated hydrological model which works well, but not for the right reasons. Besides, an unsuitable error model yields a non-reliable predictive uncertainty assessment. Hence, with the aim of prevent all these undesirable effects, this research focuses on the Bayesian joint inference (BJI) of both the hydrological and error model parameters, considering a general additive (GA) error model that allows for correlation, non-stationarity (in variance and bias) and non-normality of model residuals. As hydrological model, it has been used a conceptual distributed model called TETIS, with a particular split structure of the effective model parameters. Bayesian inference has been performed with the aid of a Markov Chain Monte Carlo (MCMC) algorithm called Dream-ZS. MCMC algorithm quantifies the uncertainty of the hydrological and error model parameters by getting the joint posterior probability distribution, conditioned on the observed flows. The BJI methodology is a very powerful and reliable tool, but it must be used correctly this is, if non-stationarity in errors variance and bias is modeled, the Total Laws must be taken into account. The results of this research show that the
BayesCLUMPY: BAYESIAN INFERENCE WITH CLUMPY DUSTY TORUS MODELS
International Nuclear Information System (INIS)
Asensio Ramos, A.; Ramos Almeida, C.
2009-01-01
Our aim is to present a fast and general Bayesian inference framework based on the synergy between machine learning techniques and standard sampling methods and apply it to infer the physical properties of clumpy dusty torus using infrared photometric high spatial resolution observations of active galactic nuclei. We make use of the Metropolis-Hastings Markov Chain Monte Carlo algorithm for sampling the posterior distribution function. Such distribution results from combining all a priori knowledge about the parameters of the model and the information introduced by the observations. The main difficulty resides in the fact that the model used to explain the observations is computationally demanding and the sampling is very time consuming. For this reason, we apply a set of artificial neural networks that are used to approximate and interpolate a database of models. As a consequence, models not present in the original database can be computed ensuring continuity. We focus on the application of this solution scheme to the recently developed public database of clumpy dusty torus models. The machine learning scheme used in this paper allows us to generate any model from the database using only a factor of 10 -4 of the original size of the database and a factor of 10 -3 in computing time. The posterior distribution obtained for each model parameter allows us to investigate how the observations constrain the parameters and which ones remain partially or completely undetermined, providing statistically relevant confidence intervals. As an example, the application to the nuclear region of Centaurus A shows that the optical depth of the clouds, the total number of clouds, and the radial extent of the cloud distribution zone are well constrained using only six filters. The code is freely available from the authors.
Bayesian Inference for NASA Probabilistic Risk and Reliability Analysis
Dezfuli, Homayoon; Kelly, Dana; Smith, Curtis; Vedros, Kurt; Galyean, William
2009-01-01
This document, Bayesian Inference for NASA Probabilistic Risk and Reliability Analysis, is intended to provide guidelines for the collection and evaluation of risk and reliability-related data. It is aimed at scientists and engineers familiar with risk and reliability methods and provides a hands-on approach to the investigation and application of a variety of risk and reliability data assessment methods, tools, and techniques. This document provides both: A broad perspective on data analysis collection and evaluation issues. A narrow focus on the methods to implement a comprehensive information repository. The topics addressed herein cover the fundamentals of how data and information are to be used in risk and reliability analysis models and their potential role in decision making. Understanding these topics is essential to attaining a risk informed decision making environment that is being sought by NASA requirements and procedures such as 8000.4 (Agency Risk Management Procedural Requirements), NPR 8705.05 (Probabilistic Risk Assessment Procedures for NASA Programs and Projects), and the System Safety requirements of NPR 8715.3 (NASA General Safety Program Requirements).
Bayesian Inference of Ecological Interactions from Spatial Data
Directory of Open Access Journals (Sweden)
Christopher R. Stephens
2017-11-01
Full Text Available The characterization and quantification of ecological interactions and the construction of species’ distributions and their associated ecological niches are of fundamental theoretical and practical importance. In this paper, we discuss a Bayesian inference framework, which, using spatial data, offers a general formalism within which ecological interactions may be characterized and quantified. Interactions are identified through deviations of the spatial distribution of co-occurrences of spatial variables relative to a benchmark for the non-interacting system and based on a statistical ensemble of spatial cells. The formalism allows for the integration of both biotic and abiotic factors of arbitrary resolution. We concentrate on the conceptual and mathematical underpinnings of the formalism, showing how, using the naive Bayes approximation, it can be used to not only compare and contrast the relative contribution from each variable, but also to construct species’ distributions and ecological niches based on an arbitrary variable type. We also show how non-linear interactions between distinct niche variables can be identified and the degree of confounding between variables accounted for.
Metainference: A Bayesian inference method for heterogeneous systems.
Bonomi, Massimiliano; Camilloni, Carlo; Cavalli, Andrea; Vendruscolo, Michele
2016-01-01
Modeling a complex system is almost invariably a challenging task. The incorporation of experimental observations can be used to improve the quality of a model and thus to obtain better predictions about the behavior of the corresponding system. This approach, however, is affected by a variety of different errors, especially when a system simultaneously populates an ensemble of different states and experimental data are measured as averages over such states. To address this problem, we present a Bayesian inference method, called "metainference," that is able to deal with errors in experimental measurements and with experimental measurements averaged over multiple states. To achieve this goal, metainference models a finite sample of the distribution of models using a replica approach, in the spirit of the replica-averaging modeling based on the maximum entropy principle. To illustrate the method, we present its application to a heterogeneous model system and to the determination of an ensemble of structures corresponding to the thermal fluctuations of a protein molecule. Metainference thus provides an approach to modeling complex systems with heterogeneous components and interconverting between different states by taking into account all possible sources of errors.
Self-Associations Influence Task-Performance through Bayesian Inference.
Bengtsson, Sara L; Penny, Will D
2013-01-01
The way we think about ourselves impacts greatly on our behavior. This paper describes a behavioral study and a computational model that shed new light on this important area. Participants were primed "clever" and "stupid" using a scrambled sentence task, and we measured the effect on response time and error-rate on a rule-association task. First, we observed a confirmation bias effect in that associations to being "stupid" led to a gradual decrease in performance, whereas associations to being "clever" did not. Second, we observed that the activated self-concepts selectively modified attention toward one's performance. There was an early to late double dissociation in RTs in that primed "clever" resulted in RT increase following error responses, whereas primed "stupid" resulted in RT increase following correct responses. We propose a computational model of subjects' behavior based on the logic of the experimental task that involves two processes; memory for rules and the integration of rules with subsequent visual cues. The model incorporates an adaptive decision threshold based on Bayes rule, whereby decision thresholds are increased if integration was inferred to be faulty. Fitting the computational model to experimental data confirmed our hypothesis that priming affects the memory process. This model explains both the confirmation bias and double dissociation effects and demonstrates that Bayesian inferential principles can be used to study the effect of self-concepts on behavior.
Self-associations influence task-performance through Bayesian inference
Directory of Open Access Journals (Sweden)
Sara L Bengtsson
2013-08-01
Full Text Available The way we think about ourselves impacts greatly on our behaviour. This paper describes a behavioural study and a computational model that sheds new light on this important area. Participants were primed 'clever' and 'stupid' using a scrambled sentence task, and we measured the effect on response time and error-rate on a rule-association task. First, we observed a confirmation bias effect in that associations to being 'stupid' led to a gradual decrease in performance, whereas associations to being 'clever' did not. Second, we observed that the activated self-concepts selectively modified attention towards one's performance. There was an early to late double dissociation in RTs in that primed 'clever' resulted in RT increase following error responses, whereas primed 'stupid' resulted in RT increase following correct responses. We propose a computational model of subjects' behaviour based on the logic of the experimental task that involves two processes; memory for rules and the integration of rules with subsequent visual cues. The model also incorporates an adaptive decision threshold based on Bayes rule, whereby decision thresholds are increased if integration was inferred to be faulty. Fitting the computational model to experimental data confirmed our hypothesis that priming affects the memory process. This model explains both the confirmation bias and double dissociation effects and demonstrates that Bayesian inferential principles can be used to study the effect of self-concepts on behaviour.
Serang, Oliver
2014-01-01
Exact Bayesian inference can sometimes be performed efficiently for special cases where a function has commutative and associative symmetry of its inputs (called “causal independence”). For this reason, it is desirable to exploit such symmetry on big data sets. Here we present a method to exploit a general form of this symmetry on probabilistic adder nodes by transforming those probabilistic adder nodes into a probabilistic convolution tree with which dynamic programming computes exact probabilities. A substantial speedup is demonstrated using an illustration example that can arise when identifying splice forms with bottom-up mass spectrometry-based proteomics. On this example, even state-of-the-art exact inference algorithms require a runtime more than exponential in the number of splice forms considered. By using the probabilistic convolution tree, we reduce the runtime to and the space to where is the number of variables joined by an additive or cardinal operator. This approach, which can also be used with junction tree inference, is applicable to graphs with arbitrary dependency on counting variables or cardinalities and can be used on diverse problems and fields like forward error correcting codes, elemental decomposition, and spectral demixing. The approach also trivially generalizes to multiple dimensions. PMID:24626234
Bayesian inference for dynamic transcriptional regulation; the Hes1 system as a case study.
Heron, Elizabeth A; Finkenstädt, Bärbel; Rand, David A
2007-10-01
In this study, we address the problem of estimating the parameters of regulatory networks and provide the first application of Markov chain Monte Carlo (MCMC) methods to experimental data. As a case study, we consider a stochastic model of the Hes1 system expressed in terms of stochastic differential equations (SDEs) to which rigorous likelihood methods of inference can be applied. When fitting continuous-time stochastic models to discretely observed time series the lengths of the sampling intervals are important, and much of our study addresses the problem when the data are sparse. We estimate the parameters of an autoregulatory network providing results both for simulated and real experimental data from the Hes1 system. We develop an estimation algorithm using MCMC techniques which are flexible enough to allow for the imputation of latent data on a finer time scale and the presence of prior information about parameters which may be informed from other experiments as well as additional measurement error.
Mocapy++ - a toolkit for inference and learning in dynamic Bayesian networks
DEFF Research Database (Denmark)
Paluszewski, Martin; Hamelryck, Thomas Wim
2010-01-01
Background Mocapy++ is a toolkit for parameter learning and inference in dynamic Bayesian networks (DBNs). It supports a wide range of DBN architectures and probability distributions, including distributions from directional statistics (the statistics of angles, directions and orientations...
Sythesis of MCMC and Belief Propagation
Energy Technology Data Exchange (ETDEWEB)
Ahn, Sungsoo [Korea Advanced Institute of Science and Technology, Daejeon (South Korea); Chertkov, Michael [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Shin, Jinwoo [Korea Advanced Institute of Science and Technology, Daejeon (South Korea)
2016-05-27
Markov Chain Monte Carlo (MCMC) and Belief Propagation (BP) are the most popular algorithms for computational inference in Graphical Models (GM). In principle, MCMC is an exact probabilistic method which, however, often suffers from exponentially slow mixing. In contrast, BP is a deterministic method, which is typically fast, empirically very successful, however in general lacking control of accuracy over loopy graphs. In this paper, we introduce MCMC algorithms correcting the approximation error of BP, i.e., we provide a way to compensate for BP errors via a consecutive BP-aware MCMC. Our framework is based on the Loop Calculus (LC) approach which allows to express the BP error as a sum of weighted generalized loops. Although the full series is computationally intractable, it is known that a truncated series, summing up all 2-regular loops, is computable in polynomial-time for planar pair-wise binary GMs and it also provides a highly accurate approximation empirically. Motivated by this, we first propose a polynomial-time approximation MCMC scheme for the truncated series of general (non-planar) pair-wise binary models. Our main idea here is to use the Worm algorithm, known to provide fast mixing in other (related) problems, and then design an appropriate rejection scheme to sample 2-regular loops. Furthermore, we also design an efficient rejection-free MCMC scheme for approximating the full series. The main novelty underlying our design is in utilizing the concept of cycle basis, which provides an efficient decomposition of the generalized loops. In essence, the proposed MCMC schemes run on transformed GM built upon the non-trivial BP solution, and our experiments show that this synthesis of BP and MCMC outperforms both direct MCMC and bare BP schemes.
Inconsistency of Bayesian Inference for Misspecified Linear Models, and a Proposal for Repairing It
Grünwald, P.; van Ommen, T.
2017-01-01
We empirically show that Bayesian inference can be inconsistent under misspecification in simple linear regression problems, both in a model averaging/selection and in a Bayesian ridge regression setting. We use the standard linear model, which assumes homoskedasticity, whereas the data are
Inconsistency of Bayesian inference for misspecified linear models, and a proposal for repairing it
P.D. Grünwald (Peter); T. van Ommen (Thijs)
2017-01-01
textabstractWe empirically show that Bayesian inference can be inconsistent under misspecification in simple linear regression problems, both in a model averaging/selection and in a Bayesian ridge regression setting. We use the standard linear model, which assumes homoskedasticity, whereas the data
Elsheikh, Ahmed H.
2014-02-01
An efficient Bayesian calibration method based on the nested sampling (NS) algorithm and non-intrusive polynomial chaos method is presented. Nested sampling is a Bayesian sampling algorithm that builds a discrete representation of the posterior distributions by iteratively re-focusing a set of samples to high likelihood regions. NS allows representing the posterior probability density function (PDF) with a smaller number of samples and reduces the curse of dimensionality effects. The main difficulty of the NS algorithm is in the constrained sampling step which is commonly performed using a random walk Markov Chain Monte-Carlo (MCMC) algorithm. In this work, we perform a two-stage sampling using a polynomial chaos response surface to filter out rejected samples in the Markov Chain Monte-Carlo method. The combined use of nested sampling and the two-stage MCMC based on approximate response surfaces provides significant computational gains in terms of the number of simulation runs. The proposed algorithm is applied for calibration and model selection of subsurface flow models. © 2013.
Bayesian inference of nonlinear unsteady aerodynamics from aeroelastic limit cycle oscillations
Energy Technology Data Exchange (ETDEWEB)
Sandhu, Rimple [Department of Civil and Environmental Engineering, Carleton University, Ottawa, Ontario (Canada); Poirel, Dominique [Department of Mechanical and Aerospace Engineering, Royal Military College of Canada, Kingston, Ontario (Canada); Pettit, Chris [Department of Aerospace Engineering, United States Naval Academy, Annapolis, MD (United States); Khalil, Mohammad [Department of Civil and Environmental Engineering, Carleton University, Ottawa, Ontario (Canada); Sarkar, Abhijit, E-mail: abhijit.sarkar@carleton.ca [Department of Civil and Environmental Engineering, Carleton University, Ottawa, Ontario (Canada)
2016-07-01
A Bayesian model selection and parameter estimation algorithm is applied to investigate the influence of nonlinear and unsteady aerodynamic loads on the limit cycle oscillation (LCO) of a pitching airfoil in the transitional Reynolds number regime. At small angles of attack, laminar boundary layer trailing edge separation causes negative aerodynamic damping leading to the LCO. The fluid–structure interaction of the rigid, but elastically mounted, airfoil and nonlinear unsteady aerodynamics is represented by two coupled nonlinear stochastic ordinary differential equations containing uncertain parameters and model approximation errors. Several plausible aerodynamic models with increasing complexity are proposed to describe the aeroelastic system leading to LCO. The likelihood in the posterior parameter probability density function (pdf) is available semi-analytically using the extended Kalman filter for the state estimation of the coupled nonlinear structural and unsteady aerodynamic model. The posterior parameter pdf is sampled using a parallel and adaptive Markov Chain Monte Carlo (MCMC) algorithm. The posterior probability of each model is estimated using the Chib–Jeliazkov method that directly uses the posterior MCMC samples for evidence (marginal likelihood) computation. The Bayesian algorithm is validated through a numerical study and then applied to model the nonlinear unsteady aerodynamic loads using wind-tunnel test data at various Reynolds numbers.
UNSUPERVISED TRANSIENT LIGHT CURVE ANALYSIS VIA HIERARCHICAL BAYESIAN INFERENCE
Energy Technology Data Exchange (ETDEWEB)
Sanders, N. E.; Soderberg, A. M. [Harvard-Smithsonian Center for Astrophysics, 60 Garden Street, Cambridge, MA 02138 (United States); Betancourt, M., E-mail: nsanders@cfa.harvard.edu [Department of Statistics, University of Warwick, Coventry CV4 7AL (United Kingdom)
2015-02-10
Historically, light curve studies of supernovae (SNe) and other transient classes have focused on individual objects with copious and high signal-to-noise observations. In the nascent era of wide field transient searches, objects with detailed observations are decreasing as a fraction of the overall known SN population, and this strategy sacrifices the majority of the information contained in the data about the underlying population of transients. A population level modeling approach, simultaneously fitting all available observations of objects in a transient sub-class of interest, fully mines the data to infer the properties of the population and avoids certain systematic biases. We present a novel hierarchical Bayesian statistical model for population level modeling of transient light curves, and discuss its implementation using an efficient Hamiltonian Monte Carlo technique. As a test case, we apply this model to the Type IIP SN sample from the Pan-STARRS1 Medium Deep Survey, consisting of 18,837 photometric observations of 76 SNe, corresponding to a joint posterior distribution with 9176 parameters under our model. Our hierarchical model fits provide improved constraints on light curve parameters relevant to the physical properties of their progenitor stars relative to modeling individual light curves alone. Moreover, we directly evaluate the probability for occurrence rates of unseen light curve characteristics from the model hyperparameters, addressing observational biases in survey methodology. We view this modeling framework as an unsupervised machine learning technique with the ability to maximize scientific returns from data to be collected by future wide field transient searches like LSST.
UNSUPERVISED TRANSIENT LIGHT CURVE ANALYSIS VIA HIERARCHICAL BAYESIAN INFERENCE
International Nuclear Information System (INIS)
Sanders, N. E.; Soderberg, A. M.; Betancourt, M.
2015-01-01
Historically, light curve studies of supernovae (SNe) and other transient classes have focused on individual objects with copious and high signal-to-noise observations. In the nascent era of wide field transient searches, objects with detailed observations are decreasing as a fraction of the overall known SN population, and this strategy sacrifices the majority of the information contained in the data about the underlying population of transients. A population level modeling approach, simultaneously fitting all available observations of objects in a transient sub-class of interest, fully mines the data to infer the properties of the population and avoids certain systematic biases. We present a novel hierarchical Bayesian statistical model for population level modeling of transient light curves, and discuss its implementation using an efficient Hamiltonian Monte Carlo technique. As a test case, we apply this model to the Type IIP SN sample from the Pan-STARRS1 Medium Deep Survey, consisting of 18,837 photometric observations of 76 SNe, corresponding to a joint posterior distribution with 9176 parameters under our model. Our hierarchical model fits provide improved constraints on light curve parameters relevant to the physical properties of their progenitor stars relative to modeling individual light curves alone. Moreover, we directly evaluate the probability for occurrence rates of unseen light curve characteristics from the model hyperparameters, addressing observational biases in survey methodology. We view this modeling framework as an unsupervised machine learning technique with the ability to maximize scientific returns from data to be collected by future wide field transient searches like LSST
Bayesian posterior distributions without Markov chains.
Cole, Stephen R; Chu, Haitao; Greenland, Sander; Hamra, Ghassan; Richardson, David B
2012-03-01
Bayesian posterior parameter distributions are often simulated using Markov chain Monte Carlo (MCMC) methods. However, MCMC methods are not always necessary and do not help the uninitiated understand Bayesian inference. As a bridge to understanding Bayesian inference, the authors illustrate a transparent rejection sampling method. In example 1, they illustrate rejection sampling using 36 cases and 198 controls from a case-control study (1976-1983) assessing the relation between residential exposure to magnetic fields and the development of childhood cancer. Results from rejection sampling (odds ratio (OR) = 1.69, 95% posterior interval (PI): 0.57, 5.00) were similar to MCMC results (OR = 1.69, 95% PI: 0.58, 4.95) and approximations from data-augmentation priors (OR = 1.74, 95% PI: 0.60, 5.06). In example 2, the authors apply rejection sampling to a cohort study of 315 human immunodeficiency virus seroconverters (1984-1998) to assess the relation between viral load after infection and 5-year incidence of acquired immunodeficiency syndrome, adjusting for (continuous) age at seroconversion and race. In this more complex example, rejection sampling required a notably longer run time than MCMC sampling but remained feasible and again yielded similar results. The transparency of the proposed approach comes at a price of being less broadly applicable than MCMC.
Norris, Peter M.; da Silva, Arlindo M.
2018-01-01
A method is presented to constrain a statistical model of sub-gridcolumn moisture variability using high-resolution satellite cloud data. The method can be used for large-scale model parameter estimation or cloud data assimilation. The gridcolumn model includes assumed probability density function (PDF) intra-layer horizontal variability and a copula-based inter-layer correlation model. The observables used in the current study are Moderate Resolution Imaging Spectroradiometer (MODIS) cloud-top pressure, brightness temperature and cloud optical thickness, but the method should be extensible to direct cloudy radiance assimilation for a small number of channels. The algorithm is a form of Bayesian inference with a Markov chain Monte Carlo (MCMC) approach to characterizing the posterior distribution. This approach is especially useful in cases where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach is not gradient-based and allows jumps into regions of non-zero cloud probability. The current study uses a skewed-triangle distribution for layer moisture. The article also includes a discussion of the Metropolis and multiple-try Metropolis versions of MCMC. PMID:29618847
Norris, Peter M.; Da Silva, Arlindo M.
2016-01-01
A method is presented to constrain a statistical model of sub-gridcolumn moisture variability using high-resolution satellite cloud data. The method can be used for large-scale model parameter estimation or cloud data assimilation. The gridcolumn model includes assumed probability density function (PDF) intra-layer horizontal variability and a copula-based inter-layer correlation model. The observables used in the current study are Moderate Resolution Imaging Spectroradiometer (MODIS) cloud-top pressure, brightness temperature and cloud optical thickness, but the method should be extensible to direct cloudy radiance assimilation for a small number of channels. The algorithm is a form of Bayesian inference with a Markov chain Monte Carlo (MCMC) approach to characterizing the posterior distribution. This approach is especially useful in cases where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach is not gradient-based and allows jumps into regions of non-zero cloud probability. The current study uses a skewed-triangle distribution for layer moisture. The article also includes a discussion of the Metropolis and multiple-try Metropolis versions of MCMC.
Bayesian inference for psychology. Part II: Example applications with JASP.
Wagenmakers, Eric-Jan; Love, Jonathon; Marsman, Maarten; Jamil, Tahira; Ly, Alexander; Verhagen, Josine; Selker, Ravi; Gronau, Quentin F; Dropmann, Damian; Boutin, Bruno; Meerhoff, Frans; Knight, Patrick; Raj, Akash; van Kesteren, Erik-Jan; van Doorn, Johnny; Šmíra, Martin; Epskamp, Sacha; Etz, Alexander; Matzke, Dora; de Jong, Tim; van den Bergh, Don; Sarafoglou, Alexandra; Steingroever, Helen; Derks, Koen; Rouder, Jeffrey N; Morey, Richard D
2018-02-01
Bayesian hypothesis testing presents an attractive alternative to p value hypothesis testing. Part I of this series outlined several advantages of Bayesian hypothesis testing, including the ability to quantify evidence and the ability to monitor and update this evidence as data come in, without the need to know the intention with which the data were collected. Despite these and other practical advantages, Bayesian hypothesis tests are still reported relatively rarely. An important impediment to the widespread adoption of Bayesian tests is arguably the lack of user-friendly software for the run-of-the-mill statistical problems that confront psychologists for the analysis of almost every experiment: the t-test, ANOVA, correlation, regression, and contingency tables. In Part II of this series we introduce JASP ( http://www.jasp-stats.org ), an open-source, cross-platform, user-friendly graphical software package that allows users to carry out Bayesian hypothesis tests for standard statistical problems. JASP is based in part on the Bayesian analyses implemented in Morey and Rouder's BayesFactor package for R. Armed with JASP, the practical advantages of Bayesian hypothesis testing are only a mouse click away.
Narimani, Zahra; Beigy, Hamid; Ahmad, Ashar; Masoudi-Nejad, Ali; Fröhlich, Holger
2017-01-01
Inferring the structure of molecular networks from time series protein or gene expression data provides valuable information about the complex biological processes of the cell. Causal network structure inference has been approached using different methods in the past. Most causal network inference techniques, such as Dynamic Bayesian Networks and ordinary differential equations, are limited by their computational complexity and thus make large scale inference infeasible. This is specifically true if a Bayesian framework is applied in order to deal with the unavoidable uncertainty about the correct model. We devise a novel Bayesian network reverse engineering approach using ordinary differential equations with the ability to include non-linearity. Besides modeling arbitrary, possibly combinatorial and time dependent perturbations with unknown targets, one of our main contributions is the use of Expectation Propagation, an algorithm for approximate Bayesian inference over large scale network structures in short computation time. We further explore the possibility of integrating prior knowledge into network inference. We evaluate the proposed model on DREAM4 and DREAM8 data and find it competitive against several state-of-the-art existing network inference methods.
Bayesian Estimation of the Logistic Positive Exponent IRT Model
Bolfarine, Heleno; Bazan, Jorge Luis
2010-01-01
A Bayesian inference approach using Markov Chain Monte Carlo (MCMC) is developed for the logistic positive exponent (LPE) model proposed by Samejima and for a new skewed Logistic Item Response Theory (IRT) model, named Reflection LPE model. Both models lead to asymmetric item characteristic curves (ICC) and can be appropriate because a symmetric…
Bayesian Modelling of fMRI Time Series
DEFF Research Database (Denmark)
Højen-Sørensen, Pedro; Hansen, Lars Kai; Rasmussen, Carl Edward
2000-01-01
We present a Hidden Markov Model (HMM) for inferring the hidden psychological state (or neural activity) during single trial fMRI activation experiments with blocked task paradigms. Inference is based on Bayesian methodology, using a combination of analytical and a variety of Markov Chain Monte...... Carlo (MCMC) sampling techniques. The advantage of this method is that detection of short time learning effects between repeated trials is possible since inference is based only on single trial experiments....
Numerical methods for Bayesian inference in the face of aging
International Nuclear Information System (INIS)
Clarotti, C.A.; Villain, B.; Procaccia, H.
1996-01-01
In recent years, much attention has been paid to Bayesian methods for Risk Assessment. Until now, these methods have been studied from a theoretical point of view. Researchers have been mainly interested in: studying the effectiveness of Bayesian methods in handling rare events; debating about the problem of priors and other philosophical issues. An aspect central to the Bayesian approach is numerical computation because any safety/reliability problem, in a Bayesian frame, ends with a problem of numerical integration. This aspect has been neglected until now because most Risk studies assumed the Exponential model as the basic probabilistic model. The existence of conjugate priors makes numerical integration unnecessary in this case. If aging is to be taken into account, no conjugate family is available and the use of numerical integration becomes compulsory. EDF (National Board of Electricity, of France) and ENEA (National Committee for Energy, New Technologies and Environment, of Italy) jointly carried out a research program aimed at developing quadrature methods suitable for Bayesian Interference with underlying Weibull or gamma distributions. The paper will illustrate the main results achieved during the above research program and will discuss, via some sample cases, the performances of the numerical algorithms which on the appearance of stress corrosion cracking in the tubes of Steam Generators of PWR French power plants. (authors)
Learning Bayesian networks for discrete data
Liang, Faming
2009-02-01
Bayesian networks have received much attention in the recent literature. In this article, we propose an approach to learn Bayesian networks using the stochastic approximation Monte Carlo (SAMC) algorithm. Our approach has two nice features. Firstly, it possesses the self-adjusting mechanism and thus avoids essentially the local-trap problem suffered by conventional MCMC simulation-based approaches in learning Bayesian networks. Secondly, it falls into the class of dynamic importance sampling algorithms; the network features can be inferred by dynamically weighted averaging the samples generated in the learning process, and the resulting estimates can have much lower variation than the single model-based estimates. The numerical results indicate that our approach can mix much faster over the space of Bayesian networks than the conventional MCMC simulation-based approaches. © 2008 Elsevier B.V. All rights reserved.
Bayesian inference for disease prevalence using negative binomial group testing
Pritchard, Nicholas A.; Tebbs, Joshua M.
2011-01-01
Group testing, also known as pooled testing, and inverse sampling are both widely used methods of data collection when the goal is to estimate a small proportion. Taking a Bayesian approach, we consider the new problem of estimating disease prevalence from group testing when inverse (negative binomial) sampling is used. Using different distributions to incorporate prior knowledge of disease incidence and different loss functions, we derive closed form expressions for posterior distributions and resulting point and credible interval estimators. We then evaluate our new estimators, on Bayesian and classical grounds, and apply our methods to a West Nile Virus data set. PMID:21259308
Sparse Bayesian Inference and the Temperature Structure of the Solar Corona
Energy Technology Data Exchange (ETDEWEB)
Warren, Harry P. [Space Science Division, Naval Research Laboratory, Washington, DC 20375 (United States); Byers, Jeff M. [Materials Science and Technology Division, Naval Research Laboratory, Washington, DC 20375 (United States); Crump, Nicholas A. [Naval Center for Space Technology, Naval Research Laboratory, Washington, DC 20375 (United States)
2017-02-20
Measuring the temperature structure of the solar atmosphere is critical to understanding how it is heated to high temperatures. Unfortunately, the temperature of the upper atmosphere cannot be observed directly, but must be inferred from spectrally resolved observations of individual emission lines that span a wide range of temperatures. Such observations are “inverted” to determine the distribution of plasma temperatures along the line of sight. This inversion is ill posed and, in the absence of regularization, tends to produce wildly oscillatory solutions. We introduce the application of sparse Bayesian inference to the problem of inferring the temperature structure of the solar corona. Within a Bayesian framework a preference for solutions that utilize a minimum number of basis functions can be encoded into the prior and many ad hoc assumptions can be avoided. We demonstrate the efficacy of the Bayesian approach by considering a test library of 40 assumed temperature distributions.
Sraj, Ihab; Zedler, Sarah E.; Knio, Omar; Jackson, Charles S.; Hoteit, Ibrahim
2016-01-01
The authors present a polynomial chaos (PC)-based Bayesian inference method for quantifying the uncertainties of the K-profile parameterization (KPP) within the MIT general circulation model (MITgcm) of the tropical Pacific. The inference
Using SAS PROC MCMC for Item Response Theory Models
Ames, Allison J.; Samonte, Kelli
2015-01-01
Interest in using Bayesian methods for estimating item response theory models has grown at a remarkable rate in recent years. This attentiveness to Bayesian estimation has also inspired a growth in available software such as WinBUGS, R packages, BMIRT, MPLUS, and SAS PROC MCMC. This article intends to provide an accessible overview of Bayesian…
Nonparametric Bayesian inference for multidimensional compound Poisson processes
Gugushvili, S.; van der Meulen, F.; Spreij, P.
2015-01-01
Given a sample from a discretely observed multidimensional compound Poisson process, we study the problem of nonparametric estimation of its jump size density r0 and intensity λ0. We take a nonparametric Bayesian approach to the problem and determine posterior contraction rates in this context,
Bayesian inference using WBDev: a tutorial for social scientists
Wetzels, R.; Lee, M.D.; Wagenmakers, E.-J.
2010-01-01
Over the last decade, the popularity of Bayesian data analysis in the empirical sciences has greatly increased. This is partly due to the availability of WinBUGS, a free and flexible statistical software package that comes with an array of predefined functions and distributions, allowing users to
Metis: A Pure Metropolis Markov Chain Monte Carlo Bayesian Inference Library
Energy Technology Data Exchange (ETDEWEB)
Bates, Cameron Russell [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Mckigney, Edward Allen [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
2018-01-09
The use of Bayesian inference in data analysis has become the standard for large scienti c experiments [1, 2]. The Monte Carlo Codes Group(XCP-3) at Los Alamos has developed a simple set of algorithms currently implemented in C++ and Python to easily perform at-prior Markov Chain Monte Carlo Bayesian inference with pure Metropolis sampling. These implementations are designed to be user friendly and extensible for customization based on speci c application requirements. This document describes the algorithmic choices made and presents two use cases.
Inference method using bayesian network for diagnosis of pulmonary nodules
International Nuclear Information System (INIS)
Kawagishi, Masami; Iizuka, Yoshio; Yamamoto, Hiroyuki; Yakami, Masahiro; Kubo, Takeshi; Fujimoto, Koji; Togashi, Kaori
2010-01-01
This report describes the improvements of a naive Bayes model that infers the diagnosis of pulmonary nodules in chest CT images based on the findings obtained when a radiologist interprets the CT images. We have previously introduced an inference model using a naive Bayes classifier and have reported its clinical value based on evaluation using clinical data. In the present report, we introduce the following improvements to the original inference model: the selection of findings based on correlations and the generation of a model using only these findings, and the introduction of classifiers that integrate several simple classifiers each of which is specialized for specific diagnosis. These improvements were found to increase the inference accuracy by 10.4% (p<.01) as compared to the original model in 100 cases (222 nodules) based on leave-one-out evaluation. (author)
On a full Bayesian inference for force reconstruction problems
Aucejo, M.; De Smet, O.
2018-05-01
In a previous paper, the authors introduced a flexible methodology for reconstructing mechanical sources in the frequency domain from prior local information on both their nature and location over a linear and time invariant structure. The proposed approach was derived from Bayesian statistics, because of its ability in mathematically accounting for experimenter's prior knowledge. However, since only the Maximum a Posteriori estimate was computed, the posterior uncertainty about the regularized solution given the measured vibration field, the mechanical model and the regularization parameter was not assessed. To answer this legitimate question, this paper fully exploits the Bayesian framework to provide, from a Markov Chain Monte Carlo algorithm, credible intervals and other statistical measures (mean, median, mode) for all the parameters of the force reconstruction problem.
Operational modal analysis modeling, Bayesian inference, uncertainty laws
Au, Siu-Kui
2017-01-01
This book presents operational modal analysis (OMA), employing a coherent and comprehensive Bayesian framework for modal identification and covering stochastic modeling, theoretical formulations, computational algorithms, and practical applications. Mathematical similarities and philosophical differences between Bayesian and classical statistical approaches to system identification are discussed, allowing their mathematical tools to be shared and their results correctly interpreted. Many chapters can be used as lecture notes for the general topic they cover beyond the OMA context. After an introductory chapter (1), Chapters 2–7 present the general theory of stochastic modeling and analysis of ambient vibrations. Readers are first introduced to the spectral analysis of deterministic time series (2) and structural dynamics (3), which do not require the use of probability concepts. The concepts and techniques in these chapters are subsequently extended to a probabilistic context in Chapter 4 (on stochastic pro...
Boos, Moritz; Seer, Caroline; Lange, Florian; Kopp, Bruno
2016-01-01
Cognitive determinants of probabilistic inference were examined using hierarchical Bayesian modeling techniques. A classic urn-ball paradigm served as experimental strategy, involving a factorial two (prior probabilities) by two (likelihoods) design. Five computational models of cognitive processes were compared with the observed behavior. Parameter-free Bayesian posterior probabilities and parameter-free base rate neglect provided inadequate models of probabilistic inference. The introduction of distorted subjective probabilities yielded more robust and generalizable results. A general class of (inverted) S-shaped probability weighting functions had been proposed; however, the possibility of large differences in probability distortions not only across experimental conditions, but also across individuals, seems critical for the model's success. It also seems advantageous to consider individual differences in parameters of probability weighting as being sampled from weakly informative prior distributions of individual parameter values. Thus, the results from hierarchical Bayesian modeling converge with previous results in revealing that probability weighting parameters show considerable task dependency and individual differences. Methodologically, this work exemplifies the usefulness of hierarchical Bayesian modeling techniques for cognitive psychology. Theoretically, human probabilistic inference might be best described as the application of individualized strategic policies for Bayesian belief revision.
Directory of Open Access Journals (Sweden)
Moritz eBoos
2016-05-01
Full Text Available Cognitive determinants of probabilistic inference were examined using hierarchical Bayesian modelling techniques. A classic urn-ball paradigm served as experimental strategy, involving a factorial two (prior probabilities by two (likelihoods design. Five computational models of cognitive processes were compared with the observed behaviour. Parameter-free Bayesian posterior probabilities and parameter-free base rate neglect provided inadequate models of probabilistic inference. The introduction of distorted subjective probabilities yielded more robust and generalizable results. A general class of (inverted S-shaped probability weighting functions had been proposed; however, the possibility of large differences in probability distortions not only across experimental conditions, but also across individuals, seems critical for the model’s success. It also seems advantageous to consider individual differences in parameters of probability weighting as being sampled from weakly informative prior distributions of individual parameter values. Thus, the results from hierarchical Bayesian modelling converge with previous results in revealing that probability weighting parameters show considerable task dependency and individual differences. Methodologically, this work exemplifies the usefulness of hierarchical Bayesian modelling techniques for cognitive psychology. Theoretically, human probabilistic inference might be best described as the application of individualized strategic policies for Bayesian belief revision.
International Nuclear Information System (INIS)
Qin, H.; Zhou, W.; Zhang, S.
2015-01-01
Stochastic process-based models are developed to characterize the generation and growth of metal-loss corrosion defects on oil and gas steel pipelines. The generation of corrosion defects over time is characterized by the non-homogenous Poisson process, and the growth of depths of individual defects is modeled by the non-homogenous gamma process (NHGP). The defect generation and growth models are formulated in a hierarchical Bayesian framework, whereby the parameters of the models are evaluated from the in-line inspection (ILI) data through the Bayesian updating by accounting for the probability of detection (POD) and measurement errors associated with the ILI data. The Markov Chain Monte Carlo (MCMC) simulation in conjunction with the data augmentation (DA) technique is employed to carry out the Bayesian updating. Numerical examples that involve simulated ILI data are used to illustrate and validate the proposed methodology. - Highlights: • Bayesian updating of growth and generation models of defects on energy pipelines. • Non-homogeneous Poisson process for defect generation. • Non-homogeneous gamma process for defect growth. • Updating based on inspection data with detecting and sizing uncertainties. • MCMC in conjunction with data augmentation technique employed for the updating.
Bayesian inference and the analytic continuation of imaginary-time quantum Monte Carlo data
International Nuclear Information System (INIS)
Gubernatis, J.E.; Bonca, J.; Jarrell, M.
1995-01-01
We present brief description of how methods of Bayesian inference are used to obtain real frequency information by the analytic continuation of imaginary-time quantum Monte Carlo data. We present the procedure we used, which is due to R. K. Bryan, and summarize several bottleneck issues
DEFF Research Database (Denmark)
Jacobsen, Christian Robert Dahl; Møller, Jesper
2017-01-01
We introduce new estimation methods for a subclass of the Gaussian scale mixture models for wavelet trees by Wainwright, Simoncelli and Willsky that rely on modern results for composite likelihoods and approximate Bayesian inference. Our methodology is illustrated for denoising and edge detection...
Paudel, Y.; Botzen, W.J.W.; Aerts, J.C.J.H.
2013-01-01
This study applies Bayesian Inference to estimate flood risk for 53 dyke ring areas in the Netherlands, and focuses particularly on the data scarcity and extreme behaviour of catastrophe risk. The probability density curves of flood damage are estimated through Monte Carlo simulations. Based on
Bayesian Inference for Linear Parabolic PDEs with Noisy Boundary Conditions
Ruggeri, Fabrizio; Sawlan, Zaid A; Scavino, Marco; Tempone, Raul
2015-01-01
have been assumed for the time-dependent Dirichlet boundary values. Our approach is applied to synthetic data for the one-dimensional heat equation model, where the thermal diffusivity is the unknown parameter. We show how to infer the thermal
DEFF Research Database (Denmark)
Møller, Jesper
(This text written by Jesper Møller, Aalborg University, is submitted for the collection ‘Stochastic Geometry: Highlights, Interactions and New Perspectives', edited by Wilfrid S. Kendall and Ilya Molchanov, to be published by ClarendonPress, Oxford, and planned to appear as Section 4.1 with the ......(This text written by Jesper Møller, Aalborg University, is submitted for the collection ‘Stochastic Geometry: Highlights, Interactions and New Perspectives', edited by Wilfrid S. Kendall and Ilya Molchanov, to be published by ClarendonPress, Oxford, and planned to appear as Section 4.......1 with the title ‘Inference'.) This contribution concerns statistical inference for parametric models used in stochastic geometry and based on quick and simple simulation free procedures as well as more comprehensive methods using Markov chain Monte Carlo (MCMC) simulations. Due to space limitations the focus...
Bayesian inference and model comparison for metallic fatigue data
Babuška, Ivo
2016-02-23
In this work, we present a statistical treatment of stress-life (S-N) data drawn from a collection of records of fatigue experiments that were performed on 75S-T6 aluminum alloys. Our main objective is to predict the fatigue life of materials by providing a systematic approach to model calibration, model selection and model ranking with reference to S-N data. To this purpose, we consider fatigue-limit models and random fatigue-limit models that are specially designed to allow the treatment of the run-outs (right-censored data). We first fit the models to the data by maximum likelihood methods and estimate the quantiles of the life distribution of the alloy specimen. To assess the robustness of the estimation of the quantile functions, we obtain bootstrap confidence bands by stratified resampling with respect to the cycle ratio. We then compare and rank the models by classical measures of fit based on information criteria. We also consider a Bayesian approach that provides, under the prior distribution of the model parameters selected by the user, their simulation-based posterior distributions. We implement and apply Bayesian model comparison methods, such as Bayes factor ranking and predictive information criteria based on cross-validation techniques under various a priori scenarios.
Bayesian inference and model comparison for metallic fatigue data
Babuška, Ivo; Sawlan, Zaid A; Scavino, Marco; Szabó , Barna; Tempone, Raul
2016-01-01
In this work, we present a statistical treatment of stress-life (S-N) data drawn from a collection of records of fatigue experiments that were performed on 75S-T6 aluminum alloys. Our main objective is to predict the fatigue life of materials by providing a systematic approach to model calibration, model selection and model ranking with reference to S-N data. To this purpose, we consider fatigue-limit models and random fatigue-limit models that are specially designed to allow the treatment of the run-outs (right-censored data). We first fit the models to the data by maximum likelihood methods and estimate the quantiles of the life distribution of the alloy specimen. To assess the robustness of the estimation of the quantile functions, we obtain bootstrap confidence bands by stratified resampling with respect to the cycle ratio. We then compare and rank the models by classical measures of fit based on information criteria. We also consider a Bayesian approach that provides, under the prior distribution of the model parameters selected by the user, their simulation-based posterior distributions. We implement and apply Bayesian model comparison methods, such as Bayes factor ranking and predictive information criteria based on cross-validation techniques under various a priori scenarios.
Rajaona, Harizo; Septier, François; Armand, Patrick; Delignon, Yves; Olry, Christophe; Albergel, Armand; Moussafir, Jacques
2015-12-01
In the eventuality of an accidental or intentional atmospheric release, the reconstruction of the source term using measurements from a set of sensors is an important and challenging inverse problem. A rapid and accurate estimation of the source allows faster and more efficient action for first-response teams, in addition to providing better damage assessment. This paper presents a Bayesian probabilistic approach to estimate the location and the temporal emission profile of a pointwise source. The release rate is evaluated analytically by using a Gaussian assumption on its prior distribution, and is enhanced with a positivity constraint to improve the estimation. The source location is obtained by the means of an advanced iterative Monte-Carlo technique called Adaptive Multiple Importance Sampling (AMIS), which uses a recycling process at each iteration to accelerate its convergence. The proposed methodology is tested using synthetic and real concentration data in the framework of the Fusion Field Trials 2007 (FFT-07) experiment. The quality of the obtained results is comparable to those coming from the Markov Chain Monte Carlo (MCMC) algorithm, a popular Bayesian method used for source estimation. Moreover, the adaptive processing of the AMIS provides a better sampling efficiency by reusing all the generated samples.
Practical Bayesian inference a primer for physical scientists
Bailer-Jones, Coryn A L
2017-01-01
Science is fundamentally about learning from data, and doing so in the presence of uncertainty. This volume is an introduction to the major concepts of probability and statistics, and the computational tools for analysing and interpreting data. It describes the Bayesian approach, and explains how this can be used to fit and compare models in a range of problems. Topics covered include regression, parameter estimation, model assessment, and Monte Carlo methods, as well as widely used classical methods such as regularization and hypothesis testing. The emphasis throughout is on the principles, the unifying probabilistic approach, and showing how the methods can be implemented in practice. R code (with explanations) is included and is available online, so readers can reproduce the plots and results for themselves. Aimed primarily at undergraduate and graduate students, these techniques can be applied to a wide range of data analysis problems beyond the scope of this work.
Directory of Open Access Journals (Sweden)
A. A. Zolotin
2015-07-01
Full Text Available Posteriori inference is one of the three kinds of probabilistic-logic inferences in the probabilistic graphical models theory and the base for processing of knowledge patterns with probabilistic uncertainty using Bayesian networks. The paper deals with a task of local posteriori inference description in algebraic Bayesian networks that represent a class of probabilistic graphical models by means of matrix-vector equations. The latter are essentially based on the use of tensor product of matrices, Kronecker degree and Hadamard product. Matrix equations for calculating posteriori probabilities vectors within posteriori inference in knowledge patterns with quanta propositions are obtained. Similar equations of the same type have already been discussed within the confines of the theory of algebraic Bayesian networks, but they were built only for the case of posteriori inference in the knowledge patterns on the ideals of conjuncts. During synthesis and development of matrix-vector equations on quanta propositions probability vectors, a number of earlier results concerning normalizing factors in posteriori inference and assignment of linear projective operator with a selector vector was adapted. We consider all three types of incoming evidences - deterministic, stochastic and inaccurate - combined with scalar and interval estimation of probability truth of propositional formulas in the knowledge patterns. Linear programming problems are formed. Their solution gives the desired interval values of posterior probabilities in the case of inaccurate evidence or interval estimates in a knowledge pattern. That sort of description of a posteriori inference gives the possibility to extend the set of knowledge pattern types that we can use in the local and global posteriori inference, as well as simplify complex software implementation by use of existing third-party libraries, effectively supporting submission and processing of matrices and vectors when
International Nuclear Information System (INIS)
Von Nessi, G T; Hole, M J
2014-01-01
We present recent results and technical breakthroughs for the Bayesian inference of tokamak equilibria using force-balance as a prior constraint. Issues surrounding model parameter representation and posterior analysis are discussed and addressed. These points motivate the recent advancements embodied in the Bayesian Equilibrium Analysis and Simulation Tool (BEAST) software being presently utilized to study equilibria on the Mega-Ampere Spherical Tokamak (MAST) experiment in the UK (von Nessi et al 2012 J. Phys. A 46 185501). State-of-the-art results of using BEAST to study MAST equilibria are reviewed, with recent code advancements being systematically presented though out the manuscript. (paper)
Bayesian inference and model comparison for metallic fatigue data
Babuska, Ivo
2016-01-06
In this work, we present a statistical treatment of stress-life (S-N) data drawn from a collection of records of fatigue experiments that were performed on 75S-T6 aluminum alloys. Our main objective is to predict the fatigue life of materials by providing a systematic approach to model calibration, model selection and model ranking with reference to S-N data. To this purpose, we consider fatigue-limit models and random fatigue-limit models that are specially designed to allow the treatment of the run-outs (right-censored data). We first fit the models to the data by maximum likelihood methods and estimate the quantiles of the life distribution of the alloy specimen. We then compare and rank the models by classical measures of fit based on information criteria. We also consider a Bayesian approach that provides, under the prior distribution of the model parameters selected by the user, their simulation-based posterior distributions.
Estimating mountain basin-mean precipitation from streamflow using Bayesian inference
Henn, Brian; Clark, Martyn P.; Kavetski, Dmitri; Lundquist, Jessica D.
2015-10-01
Estimating basin-mean precipitation in complex terrain is difficult due to uncertainty in the topographical representativeness of precipitation gauges relative to the basin. To address this issue, we use Bayesian methodology coupled with a multimodel framework to infer basin-mean precipitation from streamflow observations, and we apply this approach to snow-dominated basins in the Sierra Nevada of California. Using streamflow observations, forcing data from lower-elevation stations, the Bayesian Total Error Analysis (BATEA) methodology and the Framework for Understanding Structural Errors (FUSE), we infer basin-mean precipitation, and compare it to basin-mean precipitation estimated using topographically informed interpolation from gauges (PRISM, the Parameter-elevation Regression on Independent Slopes Model). The BATEA-inferred spatial patterns of precipitation show agreement with PRISM in terms of the rank of basins from wet to dry but differ in absolute values. In some of the basins, these differences may reflect biases in PRISM, because some implied PRISM runoff ratios may be inconsistent with the regional climate. We also infer annual time series of basin precipitation using a two-step calibration approach. Assessment of the precision and robustness of the BATEA approach suggests that uncertainty in the BATEA-inferred precipitation is primarily related to uncertainties in hydrologic model structure. Despite these limitations, time series of inferred annual precipitation under different model and parameter assumptions are strongly correlated with one another, suggesting that this approach is capable of resolving year-to-year variability in basin-mean precipitation.
Gunji, Yukio-Pegio; Shinohara, Shuji; Haruna, Taichi; Basios, Vasileios
2017-02-01
To overcome the dualism between mind and matter and to implement consciousness in science, a physical entity has to be embedded with a measurement process. Although quantum mechanics have been regarded as a candidate for implementing consciousness, nature at its macroscopic level is inconsistent with quantum mechanics. We propose a measurement-oriented inference system comprising Bayesian and inverse Bayesian inferences. While Bayesian inference contracts probability space, the newly defined inverse one relaxes the space. These two inferences allow an agent to make a decision corresponding to an immediate change in their environment. They generate a particular pattern of joint probability for data and hypotheses, comprising multiple diagonal and noisy matrices. This is expressed as a nondistributive orthomodular lattice equivalent to quantum logic. We also show that an orthomodular lattice can reveal information generated by inverse syllogism as well as the solutions to the frame and symbol-grounding problems. Our model is the first to connect macroscopic cognitive processes with the mathematical structure of quantum mechanics with no additional assumptions. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
An empirical Bayesian approach for model-based inference of cellular signaling networks
Directory of Open Access Journals (Sweden)
Klinke David J
2009-11-01
Full Text Available Abstract Background A common challenge in systems biology is to infer mechanistic descriptions of biological process given limited observations of a biological system. Mathematical models are frequently used to represent a belief about the causal relationships among proteins within a signaling network. Bayesian methods provide an attractive framework for inferring the validity of those beliefs in the context of the available data. However, efficient sampling of high-dimensional parameter space and appropriate convergence criteria provide barriers for implementing an empirical Bayesian approach. The objective of this study was to apply an Adaptive Markov chain Monte Carlo technique to a typical study of cellular signaling pathways. Results As an illustrative example, a kinetic model for the early signaling events associated with the epidermal growth factor (EGF signaling network was calibrated against dynamic measurements observed in primary rat hepatocytes. A convergence criterion, based upon the Gelman-Rubin potential scale reduction factor, was applied to the model predictions. The posterior distributions of the parameters exhibited complicated structure, including significant covariance between specific parameters and a broad range of variance among the parameters. The model predictions, in contrast, were narrowly distributed and were used to identify areas of agreement among a collection of experimental studies. Conclusion In summary, an empirical Bayesian approach was developed for inferring the confidence that one can place in a particular model that describes signal transduction mechanisms and for inferring inconsistencies in experimental measurements.
Large-Scale Optimization for Bayesian Inference in Complex Systems
Energy Technology Data Exchange (ETDEWEB)
Willcox, Karen [MIT; Marzouk, Youssef [MIT
2013-11-12
The SAGUARO (Scalable Algorithms for Groundwater Uncertainty Analysis and Robust Optimization) Project focused on the development of scalable numerical algorithms for large-scale Bayesian inversion in complex systems that capitalize on advances in large-scale simulation-based optimization and inversion methods. The project was a collaborative effort among MIT, the University of Texas at Austin, Georgia Institute of Technology, and Sandia National Laboratories. The research was directed in three complementary areas: efficient approximations of the Hessian operator, reductions in complexity of forward simulations via stochastic spectral approximations and model reduction, and employing large-scale optimization concepts to accelerate sampling. The MIT--Sandia component of the SAGUARO Project addressed the intractability of conventional sampling methods for large-scale statistical inverse problems by devising reduced-order models that are faithful to the full-order model over a wide range of parameter values; sampling then employs the reduced model rather than the full model, resulting in very large computational savings. Results indicate little effect on the computed posterior distribution. On the other hand, in the Texas--Georgia Tech component of the project, we retain the full-order model, but exploit inverse problem structure (adjoint-based gradients and partial Hessian information of the parameter-to-observation map) to implicitly extract lower dimensional information on the posterior distribution; this greatly speeds up sampling methods, so that fewer sampling points are needed. We can think of these two approaches as ``reduce then sample'' and ``sample then reduce.'' In fact, these two approaches are complementary, and can be used in conjunction with each other. Moreover, they both exploit deterministic inverse problem structure, in the form of adjoint-based gradient and Hessian information of the underlying parameter-to-observation map, to
Bayesian inference of shared recombination hotspots between humans and chimpanzees.
Wang, Ying; Rannala, Bruce
2014-12-01
Recombination generates variation and facilitates evolution. Recombination (or lack thereof) also contributes to human genetic disease. Methods for mapping genes influencing complex genetic diseases via association rely on linkage disequilibrium (LD) in human populations, which is influenced by rates of recombination across the genome. Comparative population genomic analyses of recombination using related primate species can identify factors influencing rates of recombination in humans. Such studies can indicate how variable hotspots for recombination may be both among individuals (or populations) and over evolutionary timescales. Previous studies have suggested that locations of recombination hotspots are not conserved between humans and chimpanzees. We made use of the data sets from recent resequencing projects and applied a Bayesian method for identifying hotspots and estimating recombination rates. We also reanalyzed SNP data sets for regions with known hotspots in humans using samples from the human and chimpanzee. The Bayes factors (BF) of shared recombination hotspots between human and chimpanzee across regions were obtained. Based on the analysis of the aligned regions of human chromosome 21, locations where the two species show evidence of shared recombination hotspots (with high BFs) were identified. Interestingly, previous comparative studies of human and chimpanzee that focused on the known human recombination hotspots within the β-globin and HLA regions did not find overlapping of hotspots. Our results show high BFs of shared hotspots at locations within both regions, and the estimated locations of shared hotspots overlap with the locations of human recombination hotspots obtained from sperm-typing studies. Copyright © 2014 by the Genetics Society of America.
Bayesian analysis of Markov point processes
DEFF Research Database (Denmark)
Berthelsen, Kasper Klitgaard; Møller, Jesper
2006-01-01
Recently Møller, Pettitt, Berthelsen and Reeves introduced a new MCMC methodology for drawing samples from a posterior distribution when the likelihood function is only specified up to a normalising constant. We illustrate the method in the setting of Bayesian inference for Markov point processes...... a partially ordered Markov point process as the auxiliary variable. As the method requires simulation from the "unknown" likelihood, perfect simulation algorithms for spatial point processes become useful....
An application of MCMC simulation in mortality projection for populations with limited data
Directory of Open Access Journals (Sweden)
Jackie Li
2014-01-01
Full Text Available Objective: IIn this paper, we investigate the use of Bayesian modeling and Markov chain Monte Carlo (MCMC simulation, via the software WinBUGS, to project future mortality for populations with limited data. In particular, we adapt some extensions of the Lee-Carter method under the Bayesian framework to allow for situations in which mortality data are scarce. Our approach would be useful for certain developing nations that have not been regularly collecting death counts and population statistics. Inferences of the model estimates and forecasts can readily be drawn from the simulated samples. Information on another population resembling the population under study can be exploited and incorporated into the prior distributions in order to facilitate the construction of probability intervals. The two sets of data can also be modeled in a joint manner. We demonstrate an application of this approach to some data from China and Taiwan.
DEFF Research Database (Denmark)
Heller, Rasmus; Chikhi, Lounes; Siegismund, Hans
2013-01-01
Many coalescent-based methods aiming to infer the demographic history of populations assume a single, isolated and panmictic population (i.e. a Wright-Fisher model). While this assumption may be reasonable under many conditions, several recent studies have shown that the results can be misleading...... when it is violated. Among the most widely applied demographic inference methods are Bayesian skyline plots (BSPs), which are used across a range of biological fields. Violations of the panmixia assumption are to be expected in many biological systems, but the consequences for skyline plot inferences...... the best scheme for inferring demographic change over a typical time scale. Analyses of data from a structured African buffalo population demonstrate how BSP results can be strengthened by simulations. We recommend that sample selection should be carefully considered in relation to population structure...
Detection of multiple damages employing best achievable eigenvectors under Bayesian inference
Prajapat, Kanta; Ray-Chaudhuri, Samit
2018-05-01
A novel approach is presented in this work to localize simultaneously multiple damaged elements in a structure along with the estimation of damage severity for each of the damaged elements. For detection of damaged elements, a best achievable eigenvector based formulation has been derived. To deal with noisy data, Bayesian inference is employed in the formulation wherein the likelihood of the Bayesian algorithm is formed on the basis of errors between the best achievable eigenvectors and the measured modes. In this approach, the most probable damage locations are evaluated under Bayesian inference by generating combinations of various possible damaged elements. Once damage locations are identified, damage severities are estimated using a Bayesian inference Markov chain Monte Carlo simulation. The efficiency of the proposed approach has been demonstrated by carrying out a numerical study involving a 12-story shear building. It has been found from this study that damage scenarios involving as low as 10% loss of stiffness in multiple elements are accurately determined (localized and severities quantified) even when 2% noise contaminated modal data are utilized. Further, this study introduces a term parameter impact (evaluated based on sensitivity of modal parameters towards structural parameters) to decide the suitability of selecting a particular mode, if some idea about the damaged elements are available. It has been demonstrated here that the accuracy and efficiency of the Bayesian quantification algorithm increases if damage localization is carried out a-priori. An experimental study involving a laboratory scale shear building and different stiffness modification scenarios shows that the proposed approach is efficient enough to localize the stories with stiffness modification.
Bayesian Inference using Neural Net Likelihood Models for Protein Secondary Structure Prediction
Directory of Open Access Journals (Sweden)
Seong-Gon Kim
2011-06-01
Full Text Available Several techniques such as Neural Networks, Genetic Algorithms, Decision Trees and other statistical or heuristic methods have been used to approach the complex non-linear task of predicting Alpha-helicies, Beta-sheets and Turns of a proteins secondary structure in the past. This project introduces a new machine learning method by using an offline trained Multilayered Perceptrons (MLP as the likelihood models within a Bayesian Inference framework to predict secondary structures proteins. Varying window sizes are used to extract neighboring amino acid information and passed back and forth between the Neural Net models and the Bayesian Inference process until there is a convergence of the posterior secondary structure probability.
Prudhomme, Serge; Bryant, Corey M.
2015-01-01
Parameter estimation for complex models using Bayesian inference is usually a very costly process as it requires a large number of solves of the forward problem. We show here how the construction of adaptive surrogate models using a posteriori error estimates for quantities of interest can significantly reduce the computational cost in problems of statistical inference. As surrogate models provide only approximations of the true solutions of the forward problem, it is nevertheless necessary to control these errors in order to construct an accurate reduced model with respect to the observables utilized in the identification of the model parameters. Effectiveness of the proposed approach is demonstrated on a numerical example dealing with the Spalart–Allmaras model for the simulation of turbulent channel flows. In particular, we illustrate how Bayesian model selection using the adapted surrogate model in place of solving the coupled nonlinear equations leads to the same quality of results while requiring fewer nonlinear PDE solves.
Prudhomme, Serge
2015-09-17
Parameter estimation for complex models using Bayesian inference is usually a very costly process as it requires a large number of solves of the forward problem. We show here how the construction of adaptive surrogate models using a posteriori error estimates for quantities of interest can significantly reduce the computational cost in problems of statistical inference. As surrogate models provide only approximations of the true solutions of the forward problem, it is nevertheless necessary to control these errors in order to construct an accurate reduced model with respect to the observables utilized in the identification of the model parameters. Effectiveness of the proposed approach is demonstrated on a numerical example dealing with the Spalart–Allmaras model for the simulation of turbulent channel flows. In particular, we illustrate how Bayesian model selection using the adapted surrogate model in place of solving the coupled nonlinear equations leads to the same quality of results while requiring fewer nonlinear PDE solves.
A method for crack sizing using Bayesian inference arising in eddy current testing
International Nuclear Information System (INIS)
Kojima, Fumio; Kikuchi, Mitsuhiro
2008-01-01
This paper is concerned with a sizing methodology of crack using Bayesian inference arising in eddy current testing. There is often uncertainty about data through quantitative measurements of nondestructive testing and this can yield misleading inference of crack sizing at on-site monitoring. In this paper, we propose optimal strategies of measurements in eddy current testing using Bayesian prior-to-posteriori analysis. First our likelihood functional is given by Gaussian distribution with the measurement model based on the hybrid use of finite and boundary element methods. Secondly, given a priori distributions of crack sizing, we propose a method for estimating the region of interest for sizing cracks. Finally an optimal sensing method is demonstrated using our idea. (author)
Directory of Open Access Journals (Sweden)
Benjamin W. Y. Lo
2013-01-01
Full Text Available Objective. The novel clinical prediction approach of Bayesian neural networks with fuzzy logic inferences is created and applied to derive prognostic decision rules in cerebral aneurysmal subarachnoid hemorrhage (aSAH. Methods. The approach of Bayesian neural networks with fuzzy logic inferences was applied to data from five trials of Tirilazad for aneurysmal subarachnoid hemorrhage (3551 patients. Results. Bayesian meta-analyses of observational studies on aSAH prognostic factors gave generalizable posterior distributions of population mean log odd ratios (ORs. Similar trends were noted in Bayesian and linear regression ORs. Significant outcome predictors include normal motor response, cerebral infarction, history of myocardial infarction, cerebral edema, history of diabetes mellitus, fever on day 8, prior subarachnoid hemorrhage, admission angiographic vasospasm, neurological grade, intraventricular hemorrhage, ruptured aneurysm size, history of hypertension, vasospasm day, age and mean arterial pressure. Heteroscedasticity was present in the nontransformed dataset. Artificial neural networks found nonlinear relationships with 11 hidden variables in 1 layer, using the multilayer perceptron model. Fuzzy logic decision rules (centroid defuzzification technique denoted cut-off points for poor prognosis at greater than 2.5 clusters. Discussion. This aSAH prognostic system makes use of existing knowledge, recognizes unknown areas, incorporates one's clinical reasoning, and compensates for uncertainty in prognostication.
2016-03-01
each IDF curve and subsequently used to force a calibrated and validated precipitation - runoff model. Probability-based, risk-informed hydrologic...ERDC/CHL CHETN-X-2 March 2016 Approved for public release; distribution is unlimited. Bayesian Inference of Nonstationary Precipitation Intensity...based means by which to develop local precipitation Intensity-Duration-Frequency (IDF) curves using historical rainfall time series data collected for
Khan, Haseeb A.; Arif, Ibrahim A.; Bahkali, Ali H.; Al Farhan, Ahmad H.; Al Homaidan, Ali A.
2008-01-01
This investigation was aimed to compare the inference of antelope phylogenies resulting from the 16S rRNA, cytochrome-b (cyt-b) and d-loop segments of mitochondrial DNA using three different computational models including Bayesian (BA), maximum parsimony (MP) and unweighted pair group method with arithmetic mean (UPGMA). The respective nucleotide sequences of three Oryx species (Oryx leucoryx, Oryx dammah and Oryx gazella) and an out-group (Addax nasomaculatus) were aligned and subjected to B...
Dorazio, R.M.; Johnson, F.A.
2003-01-01
Bayesian inference and decision theory may be used in the solution of relatively complex problems of natural resource management, owing to recent advances in statistical theory and computing. In particular, Markov chain Monte Carlo algorithms provide a computational framework for fitting models of adequate complexity and for evaluating the expected consequences of alternative management actions. We illustrate these features using an example based on management of waterfowl habitat.
Statistical detection of EEG synchrony using empirical bayesian inference.
Directory of Open Access Journals (Sweden)
Archana K Singh
Full Text Available There is growing interest in understanding how the brain utilizes synchronized oscillatory activity to integrate information across functionally connected regions. Computing phase-locking values (PLV between EEG signals is a popular method for quantifying such synchronizations and elucidating their role in cognitive tasks. However, high-dimensionality in PLV data incurs a serious multiple testing problem. Standard multiple testing methods in neuroimaging research (e.g., false discovery rate, FDR suffer severe loss of power, because they fail to exploit complex dependence structure between hypotheses that vary in spectral, temporal and spatial dimension. Previously, we showed that a hierarchical FDR and optimal discovery procedures could be effectively applied for PLV analysis to provide better power than FDR. In this article, we revisit the multiple comparison problem from a new Empirical Bayes perspective and propose the application of the local FDR method (locFDR; Efron, 2001 for PLV synchrony analysis to compute FDR as a posterior probability that an observed statistic belongs to a null hypothesis. We demonstrate the application of Efron's Empirical Bayes approach for PLV synchrony analysis for the first time. We use simulations to validate the specificity and sensitivity of locFDR and a real EEG dataset from a visual search study for experimental validation. We also compare locFDR with hierarchical FDR and optimal discovery procedures in both simulation and experimental analyses. Our simulation results showed that the locFDR can effectively control false positives without compromising on the power of PLV synchrony inference. Our results from the application locFDR on experiment data detected more significant discoveries than our previously proposed methods whereas the standard FDR method failed to detect any significant discoveries.
Statistical detection of EEG synchrony using empirical bayesian inference.
Singh, Archana K; Asoh, Hideki; Takeda, Yuji; Phillips, Steven
2015-01-01
There is growing interest in understanding how the brain utilizes synchronized oscillatory activity to integrate information across functionally connected regions. Computing phase-locking values (PLV) between EEG signals is a popular method for quantifying such synchronizations and elucidating their role in cognitive tasks. However, high-dimensionality in PLV data incurs a serious multiple testing problem. Standard multiple testing methods in neuroimaging research (e.g., false discovery rate, FDR) suffer severe loss of power, because they fail to exploit complex dependence structure between hypotheses that vary in spectral, temporal and spatial dimension. Previously, we showed that a hierarchical FDR and optimal discovery procedures could be effectively applied for PLV analysis to provide better power than FDR. In this article, we revisit the multiple comparison problem from a new Empirical Bayes perspective and propose the application of the local FDR method (locFDR; Efron, 2001) for PLV synchrony analysis to compute FDR as a posterior probability that an observed statistic belongs to a null hypothesis. We demonstrate the application of Efron's Empirical Bayes approach for PLV synchrony analysis for the first time. We use simulations to validate the specificity and sensitivity of locFDR and a real EEG dataset from a visual search study for experimental validation. We also compare locFDR with hierarchical FDR and optimal discovery procedures in both simulation and experimental analyses. Our simulation results showed that the locFDR can effectively control false positives without compromising on the power of PLV synchrony inference. Our results from the application locFDR on experiment data detected more significant discoveries than our previously proposed methods whereas the standard FDR method failed to detect any significant discoveries.
Constraining East Antarctic mass trends using a Bayesian inference approach
Martin-Español, Alba; Bamber, Jonathan L.
2016-04-01
East Antarctica is an order of magnitude larger than its western neighbour and the Greenland ice sheet. It has the greatest potential to contribute to sea level rise of any source, including non-glacial contributors. It is, however, the most challenging ice mass to constrain because of a range of factors including the relative paucity of in-situ observations and the poor signal to noise ratio of Earth Observation data such as satellite altimetry and gravimetry. A recent study using satellite radar and laser altimetry (Zwally et al. 2015) concluded that the East Antarctic Ice Sheet (EAIS) had been accumulating mass at a rate of 136±28 Gt/yr for the period 2003-08. Here, we use a Bayesian hierarchical model, which has been tested on, and applied to, the whole of Antarctica, to investigate the impact of different assumptions regarding the origin of elevation changes of the EAIS. We combined GRACE, satellite laser and radar altimeter data and GPS measurements to solve simultaneously for surface processes (primarily surface mass balance, SMB), ice dynamics and glacio-isostatic adjustment over the period 2003-13. The hierarchical model partitions mass trends between SMB and ice dynamics based on physical principles and measures of statistical likelihood. Without imposing the division between these processes, the model apportions about a third of the mass trend to ice dynamics, +18 Gt/yr, and two thirds, +39 Gt/yr, to SMB. The total mass trend for that period for the EAIS was 57±20 Gt/yr. Over the period 2003-08, we obtain an ice dynamic trend of 12 Gt/yr and a SMB trend of 15 Gt/yr, with a total mass trend of 27 Gt/yr. We then imposed the condition that the surface mass balance is tightly constrained by the regional climate model RACMO2.3 and allowed height changes due to ice dynamics to occur in areas of low surface velocities (solution that satisfies all the input data, given these constraints. By imposing these conditions, over the period 2003-13 we obtained a mass
Energy Technology Data Exchange (ETDEWEB)
Kang, Seong Keun; Seong, Poong Hyun [KAIST, Daejeon (Korea, Republic of)
2014-08-15
Bayesian methodology has been used widely used in various research fields. It is method of inference using Bayes' rule to update the estimation of probability for the certain hypothesis when additional evidences are acquired. According to the current researches, malfunction of nuclear power plant can be detected by using this Bayesian inference which consistently piles up the newly incoming data and updates its estimation. However, those researches are based on the assumption that people are doing like computer perfectly, which can be criticized and may cause a problem in real world application. Studies in cognitive psychology indicates that when the amount of information becomes larger, people can't save the whole data because people have limited memory capacity which is well known as working memory, and also they have attention problem. The purpose of this paper is to consider the psychological factors and confirm how much this working memory and attention will affect the resulted estimation based on the Bayesian inference. To confirm this, experiment on human is needed, and the tool of experiment is Compact Nuclear Simulator (CNS)
International Nuclear Information System (INIS)
Kang, Seong Keun; Seong, Poong Hyun
2014-01-01
Bayesian methodology has been used widely used in various research fields. It is method of inference using Bayes' rule to update the estimation of probability for the certain hypothesis when additional evidences are acquired. According to the current researches, malfunction of nuclear power plant can be detected by using this Bayesian inference which consistently piles up the newly incoming data and updates its estimation. However, those researches are based on the assumption that people are doing like computer perfectly, which can be criticized and may cause a problem in real world application. Studies in cognitive psychology indicates that when the amount of information becomes larger, people can't save the whole data because people have limited memory capacity which is well known as working memory, and also they have attention problem. The purpose of this paper is to consider the psychological factors and confirm how much this working memory and attention will affect the resulted estimation based on the Bayesian inference. To confirm this, experiment on human is needed, and the tool of experiment is Compact Nuclear Simulator (CNS)
Adaptability and phenotypic stability of common bean genotypes through Bayesian inference.
Corrêa, A M; Teodoro, P E; Gonçalves, M C; Barroso, L M A; Nascimento, M; Santos, A; Torres, F E
2016-04-27
This study used Bayesian inference to investigate the genotype x environment interaction in common bean grown in Mato Grosso do Sul State, and it also evaluated the efficiency of using informative and minimally informative a priori distributions. Six trials were conducted in randomized blocks, and the grain yield of 13 common bean genotypes was assessed. To represent the minimally informative a priori distributions, a probability distribution with high variance was used, and a meta-analysis concept was adopted to represent the informative a priori distributions. Bayes factors were used to conduct comparisons between the a priori distributions. The Bayesian inference was effective for the selection of upright common bean genotypes with high adaptability and phenotypic stability using the Eberhart and Russell method. Bayes factors indicated that the use of informative a priori distributions provided more accurate results than minimally informative a priori distributions. According to Bayesian inference, the EMGOPA-201, BAMBUÍ, CNF 4999, CNF 4129 A 54, and CNFv 8025 genotypes had specific adaptability to favorable environments, while the IAPAR 14 and IAC CARIOCA ETE genotypes had specific adaptability to unfavorable environments.
Sraj, Ihab; Le Maî tre, Olivier P.; Knio, Omar; Hoteit, Ibrahim
2015-01-01
using a coordinate transformation to account for the dependence with respect to the covariance hyper-parameters. Polynomial Chaos expansions are employed for the acceleration of the Bayesian inference using similar coordinate transformations, enabling us
A Bayesian Framework That Integrates Heterogeneous Data for Inferring Gene Regulatory Networks
Energy Technology Data Exchange (ETDEWEB)
Santra, Tapesh, E-mail: tapesh.santra@ucd.ie [Systems Biology Ireland, University College Dublin, Dublin (Ireland)
2014-05-20
Reconstruction of gene regulatory networks (GRNs) from experimental data is a fundamental challenge in systems biology. A number of computational approaches have been developed to infer GRNs from mRNA expression profiles. However, expression profiles alone are proving to be insufficient for inferring GRN topologies with reasonable accuracy. Recently, it has been shown that integration of external data sources (such as gene and protein sequence information, gene ontology data, protein–protein interactions) with mRNA expression profiles may increase the reliability of the inference process. Here, I propose a new approach that incorporates transcription factor binding sites (TFBS) and physical protein interactions (PPI) among transcription factors (TFs) in a Bayesian variable selection (BVS) algorithm which can infer GRNs from mRNA expression profiles subjected to genetic perturbations. Using real experimental data, I show that the integration of TFBS and PPI data with mRNA expression profiles leads to significantly more accurate networks than those inferred from expression profiles alone. Additionally, the performance of the proposed algorithm is compared with a series of least absolute shrinkage and selection operator (LASSO) regression-based network inference methods that can also incorporate prior knowledge in the inference framework. The results of this comparison suggest that BVS can outperform LASSO regression-based method in some circumstances.
A Bayesian Framework That Integrates Heterogeneous Data for Inferring Gene Regulatory Networks
International Nuclear Information System (INIS)
Santra, Tapesh
2014-01-01
Reconstruction of gene regulatory networks (GRNs) from experimental data is a fundamental challenge in systems biology. A number of computational approaches have been developed to infer GRNs from mRNA expression profiles. However, expression profiles alone are proving to be insufficient for inferring GRN topologies with reasonable accuracy. Recently, it has been shown that integration of external data sources (such as gene and protein sequence information, gene ontology data, protein–protein interactions) with mRNA expression profiles may increase the reliability of the inference process. Here, I propose a new approach that incorporates transcription factor binding sites (TFBS) and physical protein interactions (PPI) among transcription factors (TFs) in a Bayesian variable selection (BVS) algorithm which can infer GRNs from mRNA expression profiles subjected to genetic perturbations. Using real experimental data, I show that the integration of TFBS and PPI data with mRNA expression profiles leads to significantly more accurate networks than those inferred from expression profiles alone. Additionally, the performance of the proposed algorithm is compared with a series of least absolute shrinkage and selection operator (LASSO) regression-based network inference methods that can also incorporate prior knowledge in the inference framework. The results of this comparison suggest that BVS can outperform LASSO regression-based method in some circumstances.
Empirical verification for application of Bayesian inference in situation awareness evaluations
International Nuclear Information System (INIS)
Kang, Seongkeun; Kim, Ar Ryum; Seong, Poong Hyun
2017-01-01
Highlights: • Situation awareness (SA) of human operators is significantly important for safe operation in nuclear power plants (NPPs). • SA of human operators was empirically estimated using Bayesian inference. • In this empirical study, the effect of attention and working memory to SA was considered. • Complexcity of the given task and design of human machine interface (HMI) considerably affect SA of human operators. - Abstract: Bayesian methodology has been widely used in various research fields. According to current research, malfunctions of nuclear power plants can be detected using this Bayesian inference, which consistently piles up newly incoming data and updates the estimation. However, these studies have been based on the assumption that people work like computers—perfectly—a supposition that may cause a problem in real world applications. Studies in cognitive psychology indicate that when the amount of information to be processed becomes larger, people cannot save the whole set of data in their heads due to limited attention and limited memory capacity, also known as working memory. The purpose of the current research is to consider how actual human aware the situation contrasts with our expectations, and how such disparity affects the results of conventional Bayesian inference, if at all. We compared situation awareness (SA) of ideal operators with SA of human operators, and for the human operator we used both text-based human machine interface (HMI) and infographic-based HMI to further compare two existing human operators. In addition, two different scenarios were selected how scenario complexity affects SA of human operators. As a results, when a malfunction occurred, the ideal operator found the malfunction nearly 100% probability of the time using Bayesian inference. In contrast, out of forty-six human operators, only 69.57% found the correct malfunction with simple scenario and 58.70% with complex scenario in the text-based HMI. In
McGeachie, Michael J; Chang, Hsun-Hsien; Weiss, Scott T
2014-06-01
Bayesian Networks (BN) have been a popular predictive modeling formalism in bioinformatics, but their application in modern genomics has been slowed by an inability to cleanly handle domains with mixed discrete and continuous variables. Existing free BN software packages either discretize continuous variables, which can lead to information loss, or do not include inference routines, which makes prediction with the BN impossible. We present CGBayesNets, a BN package focused around prediction of a clinical phenotype from mixed discrete and continuous variables, which fills these gaps. CGBayesNets implements Bayesian likelihood and inference algorithms for the conditional Gaussian Bayesian network (CGBNs) formalism, one appropriate for predicting an outcome of interest from, e.g., multimodal genomic data. We provide four different network learning algorithms, each making a different tradeoff between computational cost and network likelihood. CGBayesNets provides a full suite of functions for model exploration and verification, including cross validation, bootstrapping, and AUC manipulation. We highlight several results obtained previously with CGBayesNets, including predictive models of wood properties from tree genomics, leukemia subtype classification from mixed genomic data, and robust prediction of intensive care unit mortality outcomes from metabolomic profiles. We also provide detailed example analysis on public metabolomic and gene expression datasets. CGBayesNets is implemented in MATLAB and available as MATLAB source code, under an Open Source license and anonymous download at http://www.cgbayesnets.com.
Directory of Open Access Journals (Sweden)
Ross H Johnstone
2017-03-01
Full Text Available Dose-response (or ‘concentration-effect’ relationships commonly occur in biological and pharmacological systems and are well characterised by Hill curves. These curves are described by an equation with two parameters: the inhibitory concentration 50% (IC50; and the Hill coefficient. Typically just the ‘best fit’ parameter values are reported in the literature. Here we introduce a Python-based software tool, PyHillFit , and describe the underlying Bayesian inference methods that it uses, to infer probability distributions for these parameters as well as the level of experimental observation noise. The tool also allows for hierarchical fitting, characterising the effect of inter-experiment variability. We demonstrate the use of the tool on a recently published dataset on multiple ion channel inhibition by multiple drug compounds. We compare the maximum likelihood, Bayesian and hierarchical Bayesian approaches. We then show how uncertainty in dose-response inputs can be characterised and propagated into a cardiac action potential simulation to give a probability distribution on model outputs.
International Nuclear Information System (INIS)
Ma Xiang; Zabaras, Nicholas
2009-01-01
A new approach to modeling inverse problems using a Bayesian inference method is introduced. The Bayesian approach considers the unknown parameters as random variables and seeks the probabilistic distribution of the unknowns. By introducing the concept of the stochastic prior state space to the Bayesian formulation, we reformulate the deterministic forward problem as a stochastic one. The adaptive hierarchical sparse grid collocation (ASGC) method is used for constructing an interpolant to the solution of the forward model in this prior space which is large enough to capture all the variability/uncertainty in the posterior distribution of the unknown parameters. This solution can be considered as a function of the random unknowns and serves as a stochastic surrogate model for the likelihood calculation. Hierarchical Bayesian formulation is used to derive the posterior probability density function (PPDF). The spatial model is represented as a convolution of a smooth kernel and a Markov random field. The state space of the PPDF is explored using Markov chain Monte Carlo algorithms to obtain statistics of the unknowns. The likelihood calculation is performed by directly sampling the approximate stochastic solution obtained through the ASGC method. The technique is assessed on two nonlinear inverse problems: source inversion and permeability estimation in flow through porous media
Bayesian techniques for fatigue life prediction and for inference in linear time dependent PDEs
Scavino, Marco
2016-01-08
In this talk we introduce first the main characteristics of a systematic statistical approach to model calibration, model selection and model ranking when stress-life data are drawn from a collection of records of fatigue experiments. Focusing on Bayesian prediction assessment, we consider fatigue-limit models and random fatigue-limit models under different a priori assumptions. In the second part of the talk, we present a hierarchical Bayesian technique for the inference of the coefficients of time dependent linear PDEs, under the assumption that noisy measurements are available in both the interior of a domain of interest and from boundary conditions. We present a computational technique based on the marginalization of the contribution of the boundary parameters and apply it to inverse heat conduction problems.
Davis, A. D.; Huan, X.; Heimbach, P.; Marzouk, Y.
2017-12-01
Borehole data are essential for calibrating ice sheet models. However, field expeditions for acquiring borehole data are often time-consuming, expensive, and dangerous. It is thus essential to plan the best sampling locations that maximize the value of data while minimizing costs and risks. We present an uncertainty quantification (UQ) workflow based on rigorous probability framework to achieve these objectives. First, we employ an optimal experimental design (OED) procedure to compute borehole locations that yield the highest expected information gain. We take into account practical considerations of location accessibility (e.g., proximity to research sites, terrain, and ice velocity may affect feasibility of drilling) and robustness (e.g., real-time constraints such as weather may force researchers to drill at sub-optimal locations near those originally planned), by incorporating a penalty reflecting accessibility as well as sensitivity to deviations from the optimal locations. Next, we extract vertical temperature profiles from these boreholes and formulate a Bayesian inverse problem to reconstruct past surface temperatures. Using a model of temperature advection/diffusion, the top boundary condition (corresponding to surface temperatures) is calibrated via efficient Markov chain Monte Carlo (MCMC). The overall procedure can then be iterated to choose new optimal borehole locations for the next expeditions.Through this work, we demonstrate powerful UQ methods for designing experiments, calibrating models, making predictions, and assessing sensitivity--all performed under an uncertain environment. We develop a theoretical framework as well as practical software within an intuitive workflow, and illustrate their usefulness for combining data and models for environmental and climate research.
A Bayesian inference approach to unveil supply curves in electricity markets
DEFF Research Database (Denmark)
Mitridati, Lesia Marie-Jeanne Mariane; Pinson, Pierre
2017-01-01
in the literature on modeling this uncertainty. In this study we introduce a Bayesian inference approach to reveal the aggregate supply curve in a day-ahead electricity market. The proposed algorithm relies on Markov Chain Monte Carlo and Sequential Monte Carlo methods. The major appeal of this approach......With increased competition in wholesale electricity markets, the need for new decision-making tools for strategic producers has arisen. Optimal bidding strategies have traditionally been modeled as stochastic profit maximization problems. However, for producers with non-negligible market power...
Bayesian inference for multivariate point processes observed at sparsely distributed times
DEFF Research Database (Denmark)
Rasmussen, Jakob Gulddahl; Møller, Jesper; Aukema, B.H.
We consider statistical and computational aspects of simulation-based Bayesian inference for a multivariate point process which is only observed at sparsely distributed times. For specicity we consider a particular data set which has earlier been analyzed by a discrete time model involving unknown...... normalizing constants. We discuss the advantages and disadvantages of using continuous time processes compared to discrete time processes in the setting of the present paper as well as other spatial-temporal situations. Keywords: Bark beetle, conditional intensity, forest entomology, Markov chain Monte Carlo...
Directory of Open Access Journals (Sweden)
Y. Paudel
2013-03-01
Full Text Available This study applies Bayesian Inference to estimate flood risk for 53 dyke ring areas in the Netherlands, and focuses particularly on the data scarcity and extreme behaviour of catastrophe risk. The probability density curves of flood damage are estimated through Monte Carlo simulations. Based on these results, flood insurance premiums are estimated using two different practical methods that each account in different ways for an insurer's risk aversion and the dispersion rate of loss data. This study is of practical relevance because insurers have been considering the introduction of flood insurance in the Netherlands, which is currently not generally available.
DEFF Research Database (Denmark)
Meng, Weizhi; Li, Wenjuan; Xiang, Yang
2017-01-01
and experience for both patients and healthcare workers, and the underlying network architecture to support such devices is also referred to as medical smartphone networks (MSNs). MSNs, similar to other networks, are subject to a wide range of attacks (e.g. leakage of sensitive patient information by a malicious...... insider). In this work, we focus on MSNs and present a compact but efficient trust-based approach using Bayesian inference to identify malicious nodes in such an environment. We then demonstrate the effectiveness of our approach in detecting malicious nodes by evaluating the deployment of our proposed...
Paudel, Y.; Botzen, W. J. W.; Aerts, J. C. J. H.
2013-03-01
This study applies Bayesian Inference to estimate flood risk for 53 dyke ring areas in the Netherlands, and focuses particularly on the data scarcity and extreme behaviour of catastrophe risk. The probability density curves of flood damage are estimated through Monte Carlo simulations. Based on these results, flood insurance premiums are estimated using two different practical methods that each account in different ways for an insurer's risk aversion and the dispersion rate of loss data. This study is of practical relevance because insurers have been considering the introduction of flood insurance in the Netherlands, which is currently not generally available.
Zonta, Zivko J; Flotats, Xavier; Magrí, Albert
2014-08-01
The procedure commonly used for the assessment of the parameters included in activated sludge models (ASMs) relies on the estimation of their optimal value within a confidence region (i.e. frequentist inference). Once optimal values are estimated, parameter uncertainty is computed through the covariance matrix. However, alternative approaches based on the consideration of the model parameters as probability distributions (i.e. Bayesian inference), may be of interest. The aim of this work is to apply (and compare) both Bayesian and frequentist inference methods when assessing uncertainty for an ASM-type model, which considers intracellular storage and biomass growth, simultaneously. Practical identifiability was addressed exclusively considering respirometric profiles based on the oxygen uptake rate and with the aid of probabilistic global sensitivity analysis. Parameter uncertainty was thus estimated according to both the Bayesian and frequentist inferential procedures. Results were compared in order to evidence the strengths and weaknesses of both approaches. Since it was demonstrated that Bayesian inference could be reduced to a frequentist approach under particular hypotheses, the former can be considered as a more generalist methodology. Hence, the use of Bayesian inference is encouraged for tackling inferential issues in ASM environments.
Inferring gene and protein interactions using PubMed citations and consensus Bayesian networks.
Deeter, Anthony; Dalman, Mark; Haddad, Joseph; Duan, Zhong-Hui
2017-01-01
The PubMed database offers an extensive set of publication data that can be useful, yet inherently complex to use without automated computational techniques. Data repositories such as the Genomic Data Commons (GDC) and the Gene Expression Omnibus (GEO) offer experimental data storage and retrieval as well as curated gene expression profiles. Genetic interaction databases, including Reactome and Ingenuity Pathway Analysis, offer pathway and experiment data analysis using data curated from these publications and data repositories. We have created a method to generate and analyze consensus networks, inferring potential gene interactions, using large numbers of Bayesian networks generated by data mining publications in the PubMed database. Through the concept of network resolution, these consensus networks can be tailored to represent possible genetic interactions. We designed a set of experiments to confirm that our method is stable across variation in both sample and topological input sizes. Using gene product interactions from the KEGG pathway database and data mining PubMed publication abstracts, we verify that regardless of the network resolution or the inferred consensus network, our method is capable of inferring meaningful gene interactions through consensus Bayesian network generation with multiple, randomized topological orderings. Our method can not only confirm the existence of currently accepted interactions, but has the potential to hypothesize new ones as well. We show our method confirms the existence of known gene interactions such as JAK-STAT-PI3K-AKT-mTOR, infers novel gene interactions such as RAS- Bcl-2 and RAS-AKT, and found significant pathway-pathway interactions between the JAK-STAT signaling and Cardiac Muscle Contraction KEGG pathways.
Afrika Statistika ISSN 2316-090X Bayesian Inference of C-AR(1 ...
African Journals Online (AJOL)
2017-11-29
Nov 29, 2017 ... been carried out to validate the theoretical results, and then the ... In their study, they used the Gibbs sampling and Markov Chain Monte Carlo (MCMC) .... The posterior probability gives the information about the parameter(s).
Wavelet-Bayesian inference of cosmic strings embedded in the cosmic microwave background
McEwen, J. D.; Feeney, S. M.; Peiris, H. V.; Wiaux, Y.; Ringeval, C.; Bouchet, F. R.
2017-12-01
Cosmic strings are a well-motivated extension to the standard cosmological model and could induce a subdominant component in the anisotropies of the cosmic microwave background (CMB), in addition to the standard inflationary component. The detection of strings, while observationally challenging, would provide a direct probe of physics at very high-energy scales. We develop a framework for cosmic string inference from observations of the CMB made over the celestial sphere, performing a Bayesian analysis in wavelet space where the string-induced CMB component has distinct statistical properties to the standard inflationary component. Our wavelet-Bayesian framework provides a principled approach to compute the posterior distribution of the string tension Gμ and the Bayesian evidence ratio comparing the string model to the standard inflationary model. Furthermore, we present a technique to recover an estimate of any string-induced CMB map embedded in observational data. Using Planck-like simulations, we demonstrate the application of our framework and evaluate its performance. The method is sensitive to Gμ ∼ 5 × 10-7 for Nambu-Goto string simulations that include an integrated Sachs-Wolfe contribution only and do not include any recombination effects, before any parameters of the analysis are optimized. The sensitivity of the method compares favourably with other techniques applied to the same simulations.
Multi-model polynomial chaos surrogate dictionary for Bayesian inference in elasticity problems
Contreras, Andres A.
2016-09-19
A method is presented for inferring the presence of an inclusion inside a domain; the proposed approach is suitable to be used in a diagnostic device with low computational power. Specifically, we use the Bayesian framework for the inference of stiff inclusions embedded in a soft matrix, mimicking tumors in soft tissues. We rely on a polynomial chaos (PC) surrogate to accelerate the inference process. The PC surrogate predicts the dependence of the displacements field with the random elastic moduli of the materials, and are computed by means of the stochastic Galerkin (SG) projection method. Moreover, the inclusion\\'s geometry is assumed to be unknown, and this is addressed by using a dictionary consisting of several geometrical models with different configurations. A model selection approach based on the evidence provided by the data (Bayes factors) is used to discriminate among the different geometrical models and select the most suitable one. The idea of using a dictionary of pre-computed geometrical models helps to maintain the computational cost of the inference process very low, as most of the computational burden is carried out off-line for the resolution of the SG problems. Numerical tests are used to validate the methodology, assess its performance, and analyze the robustness to model errors. © 2016 Elsevier Ltd
Hadwin, Paul J.; Sipkens, T. A.; Thomson, K. A.; Liu, F.; Daun, K. J.
2016-01-01
Auto-correlated laser-induced incandescence (AC-LII) infers the soot volume fraction (SVF) of soot particles by comparing the spectral incandescence from laser-energized particles to the pyrometrically inferred peak soot temperature. This calculation requires detailed knowledge of model parameters such as the absorption function of soot, which may vary with combustion chemistry, soot age, and the internal structure of the soot. This work presents a Bayesian methodology to quantify such uncertainties. This technique treats the additional "nuisance" model parameters, including the soot absorption function, as stochastic variables and incorporates the current state of knowledge of these parameters into the inference process through maximum entropy priors. While standard AC-LII analysis provides a point estimate of the SVF, Bayesian techniques infer the posterior probability density, which will allow scientists and engineers to better assess the reliability of AC-LII inferred SVFs in the context of environmental regulations and competing diagnostics.
Bayesian inference of earthquake parameters from buoy data using a polynomial chaos-based surrogate
Giraldi, Loic
2017-04-07
This work addresses the estimation of the parameters of an earthquake model by the consequent tsunami, with an application to the Chile 2010 event. We are particularly interested in the Bayesian inference of the location, the orientation, and the slip of an Okada-based model of the earthquake ocean floor displacement. The tsunami numerical model is based on the GeoClaw software while the observational data is provided by a single DARTⓇ buoy. We propose in this paper a methodology based on polynomial chaos expansion to construct a surrogate model of the wave height at the buoy location. A correlated noise model is first proposed in order to represent the discrepancy between the computational model and the data. This step is necessary, as a classical independent Gaussian noise is shown to be unsuitable for modeling the error, and to prevent convergence of the Markov Chain Monte Carlo sampler. Second, the polynomial chaos model is subsequently improved to handle the variability of the arrival time of the wave, using a preconditioned non-intrusive spectral method. Finally, the construction of a reduced model dedicated to Bayesian inference is proposed. Numerical results are presented and discussed.
Hierarchical Bayesian inference of the initial mass function in composite stellar populations
Dries, M.; Trager, S. C.; Koopmans, L. V. E.; Popping, G.; Somerville, R. S.
2018-03-01
The initial mass function (IMF) is a key ingredient in many studies of galaxy formation and evolution. Although the IMF is often assumed to be universal, there is continuing evidence that it is not universal. Spectroscopic studies that derive the IMF of the unresolved stellar populations of a galaxy often assume that this spectrum can be described by a single stellar population (SSP). To alleviate these limitations, in this paper we have developed a unique hierarchical Bayesian framework for modelling composite stellar populations (CSPs). Within this framework, we use a parametrized IMF prior to regulate a direct inference of the IMF. We use this new framework to determine the number of SSPs that is required to fit a set of realistic CSP mock spectra. The CSP mock spectra that we use are based on semi-analytic models and have an IMF that varies as a function of stellar velocity dispersion of the galaxy. Our results suggest that using a single SSP biases the determination of the IMF slope to a higher value than the true slope, although the trend with stellar velocity dispersion is overall recovered. If we include more SSPs in the fit, the Bayesian evidence increases significantly and the inferred IMF slopes of our mock spectra converge, within the errors, to their true values. Most of the bias is already removed by using two SSPs instead of one. We show that we can reconstruct the variable IMF of our mock spectra for signal-to-noise ratios exceeding ˜75.
International Nuclear Information System (INIS)
Unkelbach, J; Oelfke, U
2005-01-01
We present a method to calculate dose uncertainties due to inter-fraction organ movements in fractionated radiotherapy, i.e. in addition to the expectation value of the dose distribution a variance distribution is calculated. To calculate the expectation value of the dose distribution in the presence of organ movements, one estimates a probability distribution of possible patient geometries. The respective variance of the expected dose distribution arises for two reasons: first, the patient is irradiated with a finite number of fractions only and second, the probability distribution of patient geometries has to be estimated from a small number of images and is therefore not exactly known. To quantify the total dose variance, we propose a method that is based on the principle of Bayesian inference. The method is of particular interest when organ motion is incorporated in inverse IMRT planning by means of inverse planning performed on a probability distribution of patient geometries. In order to make this a robust approach, it turns out that the dose variance should be considered (and minimized) in the optimization process. As an application of the presented concept of Bayesian inference, we compare three approaches to inverse planning based on probability distributions that account for an increasing degree of uncertainty. The Bayes theorem further provides a concept to interpolate between patient specific data and population-based knowledge on organ motion which is relevant since the number of CT images of a patient is typically small
Sraj, Ihab
2015-10-22
This paper addresses model dimensionality reduction for Bayesian inference based on prior Gaussian fields with uncertainty in the covariance function hyper-parameters. The dimensionality reduction is traditionally achieved using the Karhunen-Loève expansion of a prior Gaussian process assuming covariance function with fixed hyper-parameters, despite the fact that these are uncertain in nature. The posterior distribution of the Karhunen-Loève coordinates is then inferred using available observations. The resulting inferred field is therefore dependent on the assumed hyper-parameters. Here, we seek to efficiently estimate both the field and covariance hyper-parameters using Bayesian inference. To this end, a generalized Karhunen-Loève expansion is derived using a coordinate transformation to account for the dependence with respect to the covariance hyper-parameters. Polynomial Chaos expansions are employed for the acceleration of the Bayesian inference using similar coordinate transformations, enabling us to avoid expanding explicitly the solution dependence on the uncertain hyper-parameters. We demonstrate the feasibility of the proposed method on a transient diffusion equation by inferring spatially-varying log-diffusivity fields from noisy data. The inferred profiles were found closer to the true profiles when including the hyper-parameters’ uncertainty in the inference formulation.
Evidence cross-validation and Bayesian inference of MAST plasma equilibria
Energy Technology Data Exchange (ETDEWEB)
Nessi, G. T. von; Hole, M. J. [Research School of Physical Sciences and Engineering, Australian National University, Canberra ACT 0200 (Australia); Svensson, J. [Max-Planck-Institut fuer Plasmaphysik, D-17491 Greifswald (Germany); Appel, L. [EURATOM/CCFE Fusion Association, Culham Science Centre, Abingdon, Oxon OX14 3DB (United Kingdom)
2012-01-15
In this paper, current profiles for plasma discharges on the mega-ampere spherical tokamak are directly calculated from pickup coil, flux loop, and motional-Stark effect observations via methods based in the statistical theory of Bayesian analysis. By representing toroidal plasma current as a series of axisymmetric current beams with rectangular cross-section and inferring the current for each one of these beams, flux-surface geometry and q-profiles are subsequently calculated by elementary application of Biot-Savart's law. The use of this plasma model in the context of Bayesian analysis was pioneered by Svensson and Werner on the joint-European tokamak [Svensson and Werner,Plasma Phys. Controlled Fusion 50(8), 085002 (2008)]. In this framework, linear forward models are used to generate diagnostic predictions, and the probability distribution for the currents in the collection of plasma beams was subsequently calculated directly via application of Bayes' formula. In this work, we introduce a new diagnostic technique to identify and remove outlier observations associated with diagnostics falling out of calibration or suffering from an unidentified malfunction. These modifications enable a good agreement between Bayesian inference of the last-closed flux-surface with other corroborating data, such as that from force balance considerations using EFIT++[Appel et al., ''A unified approach to equilibrium reconstruction'' Proceedings of the 33rd EPS Conference on Plasma Physics (Rome, Italy, 2006)]. In addition, this analysis also yields errors on the plasma current profile and flux-surface geometry as well as directly predicting the Shafranov shift of the plasma core.
Evidence cross-validation and Bayesian inference of MAST plasma equilibria
International Nuclear Information System (INIS)
Nessi, G. T. von; Hole, M. J.; Svensson, J.; Appel, L.
2012-01-01
In this paper, current profiles for plasma discharges on the mega-ampere spherical tokamak are directly calculated from pickup coil, flux loop, and motional-Stark effect observations via methods based in the statistical theory of Bayesian analysis. By representing toroidal plasma current as a series of axisymmetric current beams with rectangular cross-section and inferring the current for each one of these beams, flux-surface geometry and q-profiles are subsequently calculated by elementary application of Biot-Savart's law. The use of this plasma model in the context of Bayesian analysis was pioneered by Svensson and Werner on the joint-European tokamak [Svensson and Werner,Plasma Phys. Controlled Fusion 50(8), 085002 (2008)]. In this framework, linear forward models are used to generate diagnostic predictions, and the probability distribution for the currents in the collection of plasma beams was subsequently calculated directly via application of Bayes' formula. In this work, we introduce a new diagnostic technique to identify and remove outlier observations associated with diagnostics falling out of calibration or suffering from an unidentified malfunction. These modifications enable a good agreement between Bayesian inference of the last-closed flux-surface with other corroborating data, such as that from force balance considerations using EFIT++[Appel et al., ''A unified approach to equilibrium reconstruction'' Proceedings of the 33rd EPS Conference on Plasma Physics (Rome, Italy, 2006)]. In addition, this analysis also yields errors on the plasma current profile and flux-surface geometry as well as directly predicting the Shafranov shift of the plasma core.
Probabilistic Damage Characterization Using the Computationally-Efficient Bayesian Approach
Warner, James E.; Hochhalter, Jacob D.
2016-01-01
This work presents a computationally-ecient approach for damage determination that quanti es uncertainty in the provided diagnosis. Given strain sensor data that are polluted with measurement errors, Bayesian inference is used to estimate the location, size, and orientation of damage. This approach uses Bayes' Theorem to combine any prior knowledge an analyst may have about the nature of the damage with information provided implicitly by the strain sensor data to form a posterior probability distribution over possible damage states. The unknown damage parameters are then estimated based on samples drawn numerically from this distribution using a Markov Chain Monte Carlo (MCMC) sampling algorithm. Several modi cations are made to the traditional Bayesian inference approach to provide signi cant computational speedup. First, an ecient surrogate model is constructed using sparse grid interpolation to replace a costly nite element model that must otherwise be evaluated for each sample drawn with MCMC. Next, the standard Bayesian posterior distribution is modi ed using a weighted likelihood formulation, which is shown to improve the convergence of the sampling process. Finally, a robust MCMC algorithm, Delayed Rejection Adaptive Metropolis (DRAM), is adopted to sample the probability distribution more eciently. Numerical examples demonstrate that the proposed framework e ectively provides damage estimates with uncertainty quanti cation and can yield orders of magnitude speedup over standard Bayesian approaches.
International Nuclear Information System (INIS)
Nakamura, Makoto
2009-01-01
It is important for Level 1 PSA to quantify input reliability parameters and their uncertainty. Bayesian methods for inference of system/component unavailability, however, are not well studied. At present practitioners allocate the uncertainty (i.e. error factor) of the unavailability based on engineering judgment. Systematic methods based on Bayesian statistics are needed for quantification of such uncertainty. In this study we have developed a new method for Bayesian inference of unavailability, where the posterior of system/component unavailability is described by the inverted gamma distribution. We show that the average of the posterior comes close to the point estimate of the unavailability as the number of outages goes to infinity. That indicates validity of the new method. Using plant data recorded in NUCIA, we have applied the new method to inference of system unavailability under unplanned outages due to violations of LCO at BWRs in Japan. According to the inference results, the unavailability is populated in the order of 10 -5 -10 -4 and the error factor is within 1-2. Thus, the new Bayesian method allows one to quantify magnitudes and widths (i.e. error factor) of uncertainty distributions of unavailability. (author)
Kim, Daesang; El Gharamti, Iman; Bisetti, Fabrizio; Farooq, Aamir; Knio, Omar
2016-01-01
A new Bayesian inference method has been developed and applied to Furan shock tube experimental data for efficient statistical inferences of the Arrhenius parameters of two OH radical consumption reactions. The collected experimental data, which
TYPE Ia SUPERNOVA LIGHT-CURVE INFERENCE: HIERARCHICAL BAYESIAN ANALYSIS IN THE NEAR-INFRARED
International Nuclear Information System (INIS)
Mandel, Kaisey S.; Friedman, Andrew S.; Kirshner, Robert P.; Wood-Vasey, W. Michael
2009-01-01
We present a comprehensive statistical analysis of the properties of Type Ia supernova (SN Ia) light curves in the near-infrared using recent data from Peters Automated InfraRed Imaging TELescope and the literature. We construct a hierarchical Bayesian framework, incorporating several uncertainties including photometric error, peculiar velocities, dust extinction, and intrinsic variations, for principled and coherent statistical inference. SN Ia light-curve inferences are drawn from the global posterior probability of parameters describing both individual supernovae and the population conditioned on the entire SN Ia NIR data set. The logical structure of the hierarchical model is represented by a directed acyclic graph. Fully Bayesian analysis of the model and data is enabled by an efficient Markov Chain Monte Carlo algorithm exploiting the conditional probabilistic structure using Gibbs sampling. We apply this framework to the JHK s SN Ia light-curve data. A new light-curve model captures the observed J-band light-curve shape variations. The marginal intrinsic variances in peak absolute magnitudes are σ(M J ) = 0.17 ± 0.03, σ(M H ) = 0.11 ± 0.03, and σ(M Ks ) = 0.19 ± 0.04. We describe the first quantitative evidence for correlations between the NIR absolute magnitudes and J-band light-curve shapes, and demonstrate their utility for distance estimation. The average residual in the Hubble diagram for the training set SNe at cz > 2000kms -1 is 0.10 mag. The new application of bootstrap cross-validation to SN Ia light-curve inference tests the sensitivity of the statistical model fit to the finite sample and estimates the prediction error at 0.15 mag. These results demonstrate that SN Ia NIR light curves are as effective as corrected optical light curves, and, because they are less vulnerable to dust absorption, they have great potential as precise and accurate cosmological distance indicators.
Calibrated birth-death phylogenetic time-tree priors for bayesian inference.
Heled, Joseph; Drummond, Alexei J
2015-05-01
Here we introduce a general class of multiple calibration birth-death tree priors for use in Bayesian phylogenetic inference. All tree priors in this class separate ancestral node heights into a set of "calibrated nodes" and "uncalibrated nodes" such that the marginal distribution of the calibrated nodes is user-specified whereas the density ratio of the birth-death prior is retained for trees with equal values for the calibrated nodes. We describe two formulations, one in which the calibration information informs the prior on ranked tree topologies, through the (conditional) prior, and the other which factorizes the prior on divergence times and ranked topologies, thus allowing uniform, or any arbitrary prior distribution on ranked topologies. Although the first of these formulations has some attractive properties, the algorithm we present for computing its prior density is computationally intensive. However, the second formulation is always faster and computationally efficient for up to six calibrations. We demonstrate the utility of the new class of multiple-calibration tree priors using both small simulations and a real-world analysis and compare the results to existing schemes. The two new calibrated tree priors described in this article offer greater flexibility and control of prior specification in calibrated time-tree inference and divergence time dating, and will remove the need for indirect approaches to the assessment of the combined effect of calibration densities and tree priors in Bayesian phylogenetic inference. © The Author(s) 2014. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.
Inference of reactive transport model parameters using a Bayesian multivariate approach
Carniato, Luca; Schoups, Gerrit; van de Giesen, Nick
2014-08-01
Parameter estimation of subsurface transport models from multispecies data requires the definition of an objective function that includes different types of measurements. Common approaches are weighted least squares (WLS), where weights are specified a priori for each measurement, and weighted least squares with weight estimation (WLS(we)) where weights are estimated from the data together with the parameters. In this study, we formulate the parameter estimation task as a multivariate Bayesian inference problem. The WLS and WLS(we) methods are special cases in this framework, corresponding to specific prior assumptions about the residual covariance matrix. The Bayesian perspective allows for generalizations to cases where residual correlation is important and for efficient inference by analytically integrating out the variances (weights) and selected covariances from the joint posterior. Specifically, the WLS and WLS(we) methods are compared to a multivariate (MV) approach that accounts for specific residual correlations without the need for explicit estimation of the error parameters. When applied to inference of reactive transport model parameters from column-scale data on dissolved species concentrations, the following results were obtained: (1) accounting for residual correlation between species provides more accurate parameter estimation for high residual correlation levels whereas its influence for predictive uncertainty is negligible, (2) integrating out the (co)variances leads to an efficient estimation of the full joint posterior with a reduced computational effort compared to the WLS(we) method, and (3) in the presence of model structural errors, none of the methods is able to identify the correct parameter values.
Albert, Carlo; Ulzega, Simone; Stoop, Ruedi
2016-04-01
Parameter inference is a fundamental problem in data-driven modeling. Given observed data that is believed to be a realization of some parameterized model, the aim is to find parameter values that are able to explain the observed data. In many situations, the dominant sources of uncertainty must be included into the model for making reliable predictions. This naturally leads to stochastic models. Stochastic models render parameter inference much harder, as the aim then is to find a distribution of likely parameter values. In Bayesian statistics, which is a consistent framework for data-driven learning, this so-called posterior distribution can be used to make probabilistic predictions. We propose a novel, exact, and very efficient approach for generating posterior parameter distributions for stochastic differential equation models calibrated to measured time series. The algorithm is inspired by reinterpreting the posterior distribution as a statistical mechanics partition function of an object akin to a polymer, where the measurements are mapped on heavier beads compared to those of the simulated data. To arrive at distribution samples, we employ a Hamiltonian Monte Carlo approach combined with a multiple time-scale integration. A separation of time scales naturally arises if either the number of measurement points or the number of simulation points becomes large. Furthermore, at least for one-dimensional problems, we can decouple the harmonic modes between measurement points and solve the fastest part of their dynamics analytically. Our approach is applicable to a wide range of inference problems and is highly parallelizable.
Nonparametric Bayesian inference for mean residual life functions in survival analysis.
Poynor, Valerie; Kottas, Athanasios
2018-01-19
Modeling and inference for survival analysis problems typically revolves around different functions related to the survival distribution. Here, we focus on the mean residual life (MRL) function, which provides the expected remaining lifetime given that a subject has survived (i.e. is event-free) up to a particular time. This function is of direct interest in reliability, medical, and actuarial fields. In addition to its practical interpretation, the MRL function characterizes the survival distribution. We develop general Bayesian nonparametric inference for MRL functions built from a Dirichlet process mixture model for the associated survival distribution. The resulting model for the MRL function admits a representation as a mixture of the kernel MRL functions with time-dependent mixture weights. This model structure allows for a wide range of shapes for the MRL function. Particular emphasis is placed on the selection of the mixture kernel, taken to be a gamma distribution, to obtain desirable properties for the MRL function arising from the mixture model. The inference method is illustrated with a data set of two experimental groups and a data set involving right censoring. The supplementary material available at Biostatistics online provides further results on empirical performance of the model, using simulated data examples. © The Author 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
A Bayesian method for inferring transmission chains in a partially observed epidemic.
Energy Technology Data Exchange (ETDEWEB)
Marzouk, Youssef M.; Ray, Jaideep
2008-10-01
We present a Bayesian approach for estimating transmission chains and rates in the Abakaliki smallpox epidemic of 1967. The epidemic affected 30 individuals in a community of 74; only the dates of appearance of symptoms were recorded. Our model assumes stochastic transmission of the infections over a social network. Distinct binomial random graphs model intra- and inter-compound social connections, while disease transmission over each link is treated as a Poisson process. Link probabilities and rate parameters are objects of inference. Dates of infection and recovery comprise the remaining unknowns. Distributions for smallpox incubation and recovery periods are obtained from historical data. Using Markov chain Monte Carlo, we explore the joint posterior distribution of the scalar parameters and provide an expected connectivity pattern for the social graph and infection pathway.
Mohammad-Djafari, Ali
2015-01-01
The main object of this tutorial article is first to review the main inference tools using Bayesian approach, Entropy, Information theory and their corresponding geometries. This review is focused mainly on the ways these tools have been used in data, signal and image processing. After a short introduction of the different quantities related to the Bayes rule, the entropy and the Maximum Entropy Principle (MEP), relative entropy and the Kullback-Leibler divergence, Fisher information, we will study their use in different fields of data and signal processing such as: entropy in source separation, Fisher information in model order selection, different Maximum Entropy based methods in time series spectral estimation and finally, general linear inverse problems.
Mocapy++ - A toolkit for inference and learning in dynamic Bayesian networks
Directory of Open Access Journals (Sweden)
Hamelryck Thomas
2010-03-01
Full Text Available Abstract Background Mocapy++ is a toolkit for parameter learning and inference in dynamic Bayesian networks (DBNs. It supports a wide range of DBN architectures and probability distributions, including distributions from directional statistics (the statistics of angles, directions and orientations. Results The program package is freely available under the GNU General Public Licence (GPL from SourceForge http://sourceforge.net/projects/mocapy. The package contains the source for building the Mocapy++ library, several usage examples and the user manual. Conclusions Mocapy++ is especially suitable for constructing probabilistic models of biomolecular structure, due to its support for directional statistics. In particular, it supports the Kent distribution on the sphere and the bivariate von Mises distribution on the torus. These distributions have proven useful to formulate probabilistic models of protein and RNA structure in atomic detail.
Energy Technology Data Exchange (ETDEWEB)
Chakraborty, Shubhankar; Das, Prasanta Kr., E-mail: pkd@mech.iitkgp.ernet.in [Department of Mechanical Engineering, Indian Institute of Technology Kharagpur, Kharagpur 721302 (India); Roy Chaudhuri, Partha [Department of Physics, Indian Institute of Technology Kharagpur, Kharagpur 721302 (India)
2016-07-15
In this communication, a novel optical technique has been proposed for the reconstruction of the shape of a Taylor bubble using measurements from multiple arrays of optical sensors. The deviation of an optical beam passing through the bubble depends on the contour of bubble surface. A theoretical model of the deviation of a beam during the traverse of a Taylor bubble through it has been developed. Using this model and the time history of the deviation captured by the sensor array, the bubble shape has been reconstructed. The reconstruction has been performed using an inverse algorithm based on Bayesian inference technique and Markov chain Monte Carlo sampling algorithm. The reconstructed nose shape has been compared with the true shape, extracted through image processing of high speed images. Finally, an error analysis has been performed to pinpoint the sources of the errors.
Multi-Objective data analysis using Bayesian Inference for MagLIF experiments
Knapp, Patrick; Glinksy, Michael; Evans, Matthew; Gom, Matth; Han, Stephanie; Harding, Eric; Slutz, Steve; Hahn, Kelly; Harvey-Thompson, Adam; Geissel, Matthias; Ampleford, David; Jennings, Christopher; Schmit, Paul; Smith, Ian; Schwarz, Jens; Peterson, Kyle; Jones, Brent; Rochau, Gregory; Sinars, Daniel
2017-10-01
The MagLIF concept has recently demonstrated Gbar pressures and confinement of charged fusion products at stagnation. We present a new analysis methodology that allows for integration of multiple diagnostics including nuclear, x-ray imaging, and x-ray power to determine the temperature, pressure, liner areal density, and mix fraction. A simplified hot-spot model is used with a Bayesian inference network to determine the most probable model parameters that describe the observations while simultaneously revealing the principal uncertainties in the analysis. Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC., a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-NA-0003525.
A combined evidence Bayesian method for human ancestry inference applied to Afro-Colombians.
Rishishwar, Lavanya; Conley, Andrew B; Vidakovic, Brani; Jordan, I King
2015-12-15
Uniparental genetic markers, mitochondrial DNA (mtDNA) and Y chromosomal DNA, are widely used for the inference of human ancestry. However, the resolution of ancestral origins based on mtDNA haplotypes is limited by the fact that such haplotypes are often found to be distributed across wide geographical regions. We have addressed this issue here by combining two sources of ancestry information that have typically been considered separately: historical records regarding population origins and genetic information on mtDNA haplotypes. To combine these distinct data sources, we applied a Bayesian approach that considers historical records, in the form of prior probabilities, together with data on the geographical distribution of mtDNA haplotypes, formulated as likelihoods, to yield ancestry assignments from posterior probabilities. This combined evidence Bayesian approach to ancestry assignment was evaluated for its ability to accurately assign sub-continental African ancestral origins to Afro-Colombians based on their mtDNA haplotypes. We demonstrate that the incorporation of historical prior probabilities via this analytical framework can provide for substantially increased resolution in sub-continental African ancestry assignment for members of this population. In addition, a personalized approach to ancestry assignment that involves the tuning of priors to individual mtDNA haplotypes yields even greater resolution for individual ancestry assignment. Despite the fact that Colombia has a large population of Afro-descendants, the ancestry of this community has been understudied relative to populations with primarily European and Native American ancestry. Thus, the application of the kind of combined evidence approach developed here to the study of ancestry in the Afro-Colombian population has the potential to be impactful. The formal Bayesian analytical framework we propose for combining historical and genetic information also has the potential to be widely applied
Khan, Haseeb A; Arif, Ibrahim A; Bahkali, Ali H; Al Farhan, Ahmad H; Al Homaidan, Ali A
2008-10-06
This investigation was aimed to compare the inference of antelope phylogenies resulting from the 16S rRNA, cytochrome-b (cyt-b) and d-loop segments of mitochondrial DNA using three different computational models including Bayesian (BA), maximum parsimony (MP) and unweighted pair group method with arithmetic mean (UPGMA). The respective nucleotide sequences of three Oryx species (Oryx leucoryx, Oryx dammah and Oryx gazella) and an out-group (Addax nasomaculatus) were aligned and subjected to BA, MP and UPGMA models for comparing the topologies of respective phylogenetic trees. The 16S rRNA region possessed the highest frequency of conserved sequences (97.65%) followed by cyt-b (94.22%) and d-loop (87.29%). There were few transitions (2.35%) and none transversions in 16S rRNA as compared to cyt-b (5.61% transitions and 0.17% transversions) and d-loop (11.57% transitions and 1.14% transversions) while comparing the four taxa. All the three mitochondrial segments clearly differentiated the genus Addax from Oryx using the BA or UPGMA models. The topologies of all the gamma-corrected Bayesian trees were identical irrespective of the marker type. The UPGMA trees resulting from 16S rRNA and d-loop sequences were also identical (Oryx dammah grouped with Oryx leucoryx) to Bayesian trees except that the UPGMA tree based on cyt-b showed a slightly different phylogeny (Oryx dammah grouped with Oryx gazella) with a low bootstrap support. However, the MP model failed to differentiate the genus Addax from Oryx. These findings demonstrate the efficiency and robustness of BA and UPGMA methods for phylogenetic analysis of antelopes using mitochondrial markers.
Using Approximate Bayesian Computation to infer sex ratios from acoustic data.
Lehnen, Lisa; Schorcht, Wigbert; Karst, Inken; Biedermann, Martin; Kerth, Gerald; Puechmaille, Sebastien J
2018-01-01
Population sex ratios are of high ecological relevance, but are challenging to determine in species lacking conspicuous external cues indicating their sex. Acoustic sexing is an option if vocalizations differ between sexes, but is precluded by overlapping distributions of the values of male and female vocalizations in many species. A method allowing the inference of sex ratios despite such an overlap will therefore greatly increase the information extractable from acoustic data. To meet this demand, we developed a novel approach using Approximate Bayesian Computation (ABC) to infer the sex ratio of populations from acoustic data. Additionally, parameters characterizing the male and female distribution of acoustic values (mean and standard deviation) are inferred. This information is then used to probabilistically assign a sex to a single acoustic signal. We furthermore develop a simpler means of sex ratio estimation based on the exclusion of calls from the overlap zone. Applying our methods to simulated data demonstrates that sex ratio and acoustic parameter characteristics of males and females are reliably inferred by the ABC approach. Applying both the ABC and the exclusion method to empirical datasets (echolocation calls recorded in colonies of lesser horseshoe bats, Rhinolophus hipposideros) provides similar sex ratios as molecular sexing. Our methods aim to facilitate evidence-based conservation, and to benefit scientists investigating ecological or conservation questions related to sex- or group specific behaviour across a wide range of organisms emitting acoustic signals. The developed methodology is non-invasive, low-cost and time-efficient, thus allowing the study of many sites and individuals. We provide an R-script for the easy application of the method and discuss potential future extensions and fields of applications. The script can be easily adapted to account for numerous biological systems by adjusting the type and number of groups to be
Yu, Bin; Xu, Jia-Meng; Li, Shan; Chen, Cheng; Chen, Rui-Xin; Wang, Lei; Zhang, Yan; Wang, Ming-Hui
2017-10-06
Gene regulatory networks (GRNs) research reveals complex life phenomena from the perspective of gene interaction, which is an important research field in systems biology. Traditional Bayesian networks have a high computational complexity, and the network structure scoring model has a single feature. Information-based approaches cannot identify the direction of regulation. In order to make up for the shortcomings of the above methods, this paper presents a novel hybrid learning method (DBNCS) based on dynamic Bayesian network (DBN) to construct the multiple time-delayed GRNs for the first time, combining the comprehensive score (CS) with the DBN model. DBNCS algorithm first uses CMI2NI (conditional mutual inclusive information-based network inference) algorithm for network structure profiles learning, namely the construction of search space. Then the redundant regulations are removed by using the recursive optimization algorithm (RO), thereby reduce the false positive rate. Secondly, the network structure profiles are decomposed into a set of cliques without loss, which can significantly reduce the computational complexity. Finally, DBN model is used to identify the direction of gene regulation within the cliques and search for the optimal network structure. The performance of DBNCS algorithm is evaluated by the benchmark GRN datasets from DREAM challenge as well as the SOS DNA repair network in Escherichia coli , and compared with other state-of-the-art methods. The experimental results show the rationality of the algorithm design and the outstanding performance of the GRNs.
Bayesian inference for the distribution of grams of marijuana in a joint.
Ridgeway, Greg; Kilmer, Beau
2016-08-01
The average amount of marijuana in a joint is unknown, yet this figure is a critical quantity for creating credible measures of marijuana consumption. It is essential for projecting tax revenues post-legalization, estimating the size of illicit marijuana markets, and learning about how much marijuana users are consuming in order to understand health and behavioral consequences. Arrestee Drug Abuse Monitoring data collected between 2000 and 2010 contain relevant information on 10,628 marijuana transactions, joints and loose marijuana purchases, including the city in which the purchase occurred and the price paid for the marijuana. Using the Brown-Silverman drug pricing model to link marijuana price and weight, we are able to infer the distribution of grams of marijuana in a joint and provide a Bayesian posterior distribution for the mean weight of marijuana in a joint. We estimate that the mean weight of marijuana in a joint is 0.32g (95% Bayesian posterior interval: 0.30-0.35). Our estimate of the mean weight of marijuana in a joint is lower than figures commonly used to make estimates of marijuana consumption. These estimates can be incorporated into drug policy discussions to produce better understanding about illicit marijuana markets, the size of potential legalized marijuana markets, and health and behavior outcomes. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Walter, G.
2015-01-01
In the Bayesian approach to statistical inference, possibly subjective knowledge on model parameters can be expressed by so-called prior distributions. A prior distribution is updated, via Bayes’ Rule, to the so-called posterior distribution, which combines prior information and information from
Bayesian Analysis for Penalized Spline Regression Using WinBUGS
Directory of Open Access Journals (Sweden)
Ciprian M. Crainiceanu
2005-09-01
Full Text Available Penalized splines can be viewed as BLUPs in a mixed model framework, which allows the use of mixed model software for smoothing. Thus, software originally developed for Bayesian analysis of mixed models can be used for penalized spline regression. Bayesian inference for nonparametric models enjoys the flexibility of nonparametric models and the exact inference provided by the Bayesian inferential machinery. This paper provides a simple, yet comprehensive, set of programs for the implementation of nonparametric Bayesian analysis in WinBUGS. Good mixing properties of the MCMC chains are obtained by using low-rank thin-plate splines, while simulation times per iteration are reduced employing WinBUGS specific computational tricks.
Bayesian inference for heterogeneous caprock permeability based on above zone pressure monitoring
Energy Technology Data Exchange (ETDEWEB)
Namhata, Argha; Small, Mitchell J.; Dilmore, Robert M.; Nakles, David V.; King, Seth
2017-02-01
The presence of faults/ fractures or highly permeable zones in the primary sealing caprock of a CO2 storage reservoir can result in leakage of CO2. Monitoring of leakage requires the capability to detect and resolve the onset, location, and volume of leakage in a systematic and timely manner. Pressure-based monitoring possesses such capabilities. This study demonstrates a basis for monitoring network design based on the characterization of CO2 leakage scenarios through an assessment of the integrity and permeability of the caprock inferred from above zone pressure measurements. Four representative heterogeneous fractured seal types are characterized to demonstrate seal permeability ranging from highly permeable to impermeable. Based on Bayesian classification theory, the probability of each fractured caprock scenario given above zone pressure measurements with measurement error is inferred. The sensitivity to injection rate and caprock thickness is also evaluated and the probability of proper classification is calculated. The time required to distinguish between above zone pressure outcomes and the associated leakage scenarios is also computed.
Louzoun, Yoram; Alter, Idan; Gragert, Loren; Albrecht, Mark; Maiers, Martin
2018-05-01
Regardless of sampling depth, accurate genotype imputation is limited in regions of high polymorphism which often have a heavy-tailed haplotype frequency distribution. Many rare haplotypes are thus unobserved. Statistical methods to improve imputation by extending reference haplotype distributions using linkage disequilibrium patterns that relate allele and haplotype frequencies have not yet been explored. In the field of unrelated stem cell transplantation, imputation of highly polymorphic human leukocyte antigen (HLA) genes has an important application in identifying the best-matched stem cell donor when searching large registries totaling over 28,000,000 donors worldwide. Despite these large registry sizes, a significant proportion of searched patients present novel HLA haplotypes. Supporting this observation, HLA population genetic models have indicated that many extant HLA haplotypes remain unobserved. The absent haplotypes are a significant cause of error in haplotype matching. We have applied a Bayesian inference methodology for extending haplotype frequency distributions, using a model where new haplotypes are created by recombination of observed alleles. Applications of this joint probability model offer significant improvement in frequency distribution estimates over the best existing alternative methods, as we illustrate using five-locus HLA frequency data from the National Marrow Donor Program registry. Transplant matching algorithms and disease association studies involving phasing and imputation of rare variants may benefit from this statistical inference framework.
Bayesian nonparametric generative models for causal inference with missing at random covariates.
Roy, Jason; Lum, Kirsten J; Zeldow, Bret; Dworkin, Jordan D; Re, Vincent Lo; Daniels, Michael J
2018-03-26
We propose a general Bayesian nonparametric (BNP) approach to causal inference in the point treatment setting. The joint distribution of the observed data (outcome, treatment, and confounders) is modeled using an enriched Dirichlet process. The combination of the observed data model and causal assumptions allows us to identify any type of causal effect-differences, ratios, or quantile effects, either marginally or for subpopulations of interest. The proposed BNP model is well-suited for causal inference problems, as it does not require parametric assumptions about the distribution of confounders and naturally leads to a computationally efficient Gibbs sampling algorithm. By flexibly modeling the joint distribution, we are also able to impute (via data augmentation) values for missing covariates within the algorithm under an assumption of ignorable missingness, obviating the need to create separate imputed data sets. This approach for imputing the missing covariates has the additional advantage of guaranteeing congeniality between the imputation model and the analysis model, and because we use a BNP approach, parametric models are avoided for imputation. The performance of the method is assessed using simulation studies. The method is applied to data from a cohort study of human immunodeficiency virus/hepatitis C virus co-infected patients. © 2018, The International Biometric Society.
A Hamiltonian Monte–Carlo method for Bayesian inference of supermassive black hole binaries
International Nuclear Information System (INIS)
Porter, Edward K; Carré, Jérôme
2014-01-01
We investigate the use of a Hamiltonian Monte–Carlo to map out the posterior density function for supermassive black hole binaries. While previous Markov Chain Monte–Carlo (MCMC) methods, such as Metropolis–Hastings MCMC, have been successfully employed for a number of different gravitational wave sources, these methods are essentially random walk algorithms. The Hamiltonian Monte–Carlo treats the inverse likelihood surface as a ‘gravitational potential’ and by introducing canonical positions and momenta, dynamically evolves the Markov chain by solving Hamilton's equations of motion. This method is not as widely used as other MCMC algorithms due to the necessity of calculating gradients of the log-likelihood, which for most applications results in a bottleneck that makes the algorithm computationally prohibitive. We circumvent this problem by using accepted initial phase-space trajectory points to analytically fit for each of the individual gradients. Eliminating the waveform generation needed for the numerical derivatives reduces the total number of required templates for a 10 6 iteration chain from ∼10 9 to ∼10 6 . The result is in an implementation of the Hamiltonian Monte–Carlo that is faster, and more efficient by a factor of approximately the dimension of the parameter space, than a Hessian MCMC. (paper)
MCMC for Wind Power Simulation
Papaefthymiou, G.; Klöckl, B.
2008-01-01
This paper contributes a Markov chain Monte Carlo (MCMC) method for the direct generation of synthetic time series of wind power output. It is shown that obtaining a stochastic model directly in the wind power domain leads to reduced number of states and to lower order of the Markov chain at equal
Izquierdo, K.; Lekic, V.; Montesi, L.
2017-12-01
Gravity inversions are especially important for planetary applications since measurements of the variations in gravitational acceleration are often the only constraint available to map out lateral density variations in the interiors of planets and other Solar system objects. Currently, global gravity data is available for the terrestrial planets and the Moon. Although several methods for inverting these data have been developed and applied, the non-uniqueness of global density models that fit the data has not yet been fully characterized. We make use of Bayesian inference and a Reversible Jump Markov Chain Monte Carlo (RJMCMC) approach to develop a Trans-dimensional Hierarchical Bayesian (THB) inversion algorithm that yields a large sample of models that fit a gravity field. From this group of models, we can determine the most likely value of parameters of a global density model and a measure of the non-uniqueness of each parameter when the number of anomalies describing the gravity field is not fixed a priori. We explore the use of a parallel tempering algorithm and fast multipole method to reduce the number of iterations and computing time needed. We applied this method to a synthetic gravity field of the Moon and a long wavelength synthetic model of density anomalies in the Earth's lower mantle. We obtained a good match between the given gravity field and the gravity field produced by the most likely model in each inversion. The number of anomalies of the models showed parsimony of the algorithm, the value of the noise variance of the input data was retrieved, and the non-uniqueness of the models was quantified. Our results show that the ability to constrain the latitude and longitude of density anomalies, which is excellent at shallow locations (information about the overall density distribution of celestial bodies even when there is no other geophysical data available.
The R Package MitISEM: Efficient and Robust Simulation Procedures for Bayesian Inference
Directory of Open Access Journals (Sweden)
Nalan Baştürk
2017-07-01
Full Text Available This paper presents the R package MitISEM (mixture of t by importance sampling weighted expectation maximization which provides an automatic and flexible two-stage method to approximate a non-elliptical target density kernel - typically a posterior density kernel - using an adaptive mixture of Student t densities as approximating density. In the first stage a mixture of Student t densities is fitted to the target using an expectation maximization algorithm where each step of the optimization procedure is weighted using importance sampling. In the second stage this mixture density is a candidate density for efficient and robust application of importance sampling or the Metropolis-Hastings (MH method to estimate properties of the target distribution. The package enables Bayesian inference and prediction on model parameters and probabilities, in particular, for models where densities have multi-modal or other non-elliptical shapes like curved ridges. These shapes occur in research topics in several scientific fields. For instance, analysis of DNA data in bio-informatics, obtaining loans in the banking sector by heterogeneous groups in financial economics and analysis of education's effect on earned income in labor economics. The package MitISEM provides also an extended algorithm, 'sequential MitISEM', which substantially decreases computation time when the target density has to be approximated for increasing data samples. This occurs when the posterior or predictive density is updated with new observations and/or when one computes model probabilities using predictive likelihoods. We illustrate the MitISEM algorithm using three canonical statistical and econometric models that are characterized by several types of non-elliptical posterior shapes and that describe well-known data patterns in econometrics and finance. We show that MH using the candidate density obtained by MitISEM outperforms, in terms of numerical efficiency, MH using a simpler
qPR: An adaptive partial-report procedure based on Bayesian inference.
Baek, Jongsoo; Lesmes, Luis Andres; Lu, Zhong-Lin
2016-08-01
Iconic memory is best assessed with the partial report procedure in which an array of letters appears briefly on the screen and a poststimulus cue directs the observer to report the identity of the cued letter(s). Typically, 6-8 cue delays or 600-800 trials are tested to measure the iconic memory decay function. Here we develop a quick partial report, or qPR, procedure based on a Bayesian adaptive framework to estimate the iconic memory decay function with much reduced testing time. The iconic memory decay function is characterized by an exponential function and a joint probability distribution of its three parameters. Starting with a prior of the parameters, the method selects the stimulus to maximize the expected information gain in the next test trial. It then updates the posterior probability distribution of the parameters based on the observer's response using Bayesian inference. The procedure is reiterated until either the total number of trials or the precision of the parameter estimates reaches a certain criterion. Simulation studies showed that only 100 trials were necessary to reach an average absolute bias of 0.026 and a precision of 0.070 (both in terms of probability correct). A psychophysical validation experiment showed that estimates of the iconic memory decay function obtained with 100 qPR trials exhibited good precision (the half width of the 68.2% credible interval = 0.055) and excellent agreement with those obtained with 1,600 trials of the conventional method of constant stimuli procedure (RMSE = 0.063). Quick partial-report relieves the data collection burden in characterizing iconic memory and makes it possible to assess iconic memory in clinical populations.
Radial anisotropy of Northeast Asia inferred from Bayesian inversions of ambient noise data
Lee, S. J.; Kim, S.; Rhie, J.
2017-12-01
The eastern margin of the Eurasia plate exhibits complex tectonic settings due to interactions with the subducting Pacific and Philippine Sea plates and the colliding India plate. Distributed extensional basins and intraplate volcanoes, and their heterogeneous features in the region are not easily explained with a simple mechanism. Observations of radial anisotropy in the entire lithosphere and the part of the asthenosphere provide the most effective evidence for the deformation of the lithosphere and the associated variation of the lithosphere-asthenosphere boundary (LAB). To infer anisotropic structures of crustal and upper-mantle in this region, radial anisotropy is measured using ambient noise data. In a continuation of previous Rayleigh wave tomography study in Northeast Asia, we conduct Love wave tomography to determine radial anisotropy using the Bayesian inversion techniques. Continuous seismic noise recordings of 237 broad-band seismic stations are used and more than 55,000 group and phase velocities of fundamental mode are measured for periods of 5-60 s. Total 8 different types of dispersion maps of Love wave from this study (period 10-60 s), Rayleigh wave from previous tomographic study (Kim et al., 2016; period 8-70 s) and longer period data (period 70-200 s) from a global model (Ekstrom, 2011) are jointly inverted using a hierarchical and transdimensional Bayesian technique. For each grid-node, boundary depths, velocities and anisotropy parameters of layers are sampled simultaneously on the assumption of the layered half-space model. The constructed 3-D radial anisotropy model provides much more details about the crust and upper mantle anisotropic structures, and about the complex undulation of the LAB.
Thomas, D.L.; Johnson, D.; Griffith, B.
2006-01-01
Bayesian hierarchical discrete-choice model for resource selection can provide managers with 2 components of population-level inference: average population selection and variability of selection. Both components are necessary to make sound management decisions based on animal selection.
How does aging affect recognition-based inference? A hierarchical Bayesian modeling approach.
Horn, Sebastian S; Pachur, Thorsten; Mata, Rui
2015-01-01
The recognition heuristic (RH) is a simple strategy for probabilistic inference according to which recognized objects are judged to score higher on a criterion than unrecognized objects. In this article, a hierarchical Bayesian extension of the multinomial r-model is applied to measure use of the RH on the individual participant level and to re-evaluate differences between younger and older adults' strategy reliance across environments. Further, it is explored how individual r-model parameters relate to alternative measures of the use of recognition and other knowledge, such as adherence rates and indices from signal-detection theory (SDT). Both younger and older adults used the RH substantially more often in an environment with high than low recognition validity, reflecting adaptivity in strategy use across environments. In extension of previous analyses (based on adherence rates), hierarchical modeling revealed that in an environment with low recognition validity, (a) older adults had a stronger tendency than younger adults to rely on the RH and (b) variability in RH use between individuals was larger than in an environment with high recognition validity; variability did not differ between age groups. Further, the r-model parameters correlated moderately with an SDT measure expressing how well people can discriminate cases where the RH leads to a correct vs. incorrect inference; this suggests that the r-model and the SDT measures may offer complementary insights into the use of recognition in decision making. In conclusion, younger and older adults are largely adaptive in their application of the RH, but cognitive aging may be associated with an increased tendency to rely on this strategy. Copyright © 2014 Elsevier B.V. All rights reserved.
Aeschbacher, S; Futschik, A; Beaumont, M A
2013-02-01
We propose a two-step procedure for estimating multiple migration rates in an approximate Bayesian computation (ABC) framework, accounting for global nuisance parameters. The approach is not limited to migration, but generally of interest for inference problems with multiple parameters and a modular structure (e.g. independent sets of demes or loci). We condition on a known, but complex demographic model of a spatially subdivided population, motivated by the reintroduction of Alpine ibex (Capra ibex) into Switzerland. In the first step, the global parameters ancestral mutation rate and male mating skew have been estimated for the whole population in Aeschbacher et al. (Genetics 2012; 192: 1027). In the second step, we estimate in this study the migration rates independently for clusters of demes putatively connected by migration. For large clusters (many migration rates), ABC faces the problem of too many summary statistics. We therefore assess by simulation if estimation per pair of demes is a valid alternative. We find that the trade-off between reduced dimensionality for the pairwise estimation on the one hand and lower accuracy due to the assumption of pairwise independence on the other depends on the number of migration rates to be inferred: the accuracy of the pairwise approach increases with the number of parameters, relative to the joint estimation approach. To distinguish between low and zero migration, we perform ABC-type model comparison between a model with migration and one without. Applying the approach to microsatellite data from Alpine ibex, we find no evidence for substantial gene flow via migration, except for one pair of demes in one direction. © 2013 Blackwell Publishing Ltd.
Elsheikh, Ahmed H.; Hoteit, Ibrahim; Wheeler, Mary Fanett
2014-01-01
An efficient Bayesian calibration method based on the nested sampling (NS) algorithm and non-intrusive polynomial chaos method is presented. Nested sampling is a Bayesian sampling algorithm that builds a discrete representation of the posterior
Barham, Barbara J.; Barham, Peter J.; Campbell, Kate J.; Crawford, Robert J. M.; Grigg, Jennifer; Horswill, Cat; Morris, Taryn L.; Pichegru, Lorien; Steinfurth, Antje; Weller, Florian; Winker, Henning
2018-01-01
Global forage-fish landings are increasing, with potentially grave consequences for marine ecosystems. Predators of forage fish may be influenced by this harvest, but the nature of these effects is contentious. Experimental fishery manipulations offer the best solution to quantify population-level impacts, but are rare. We used Bayesian inference to examine changes in chick survival, body condition and population growth rate of endangered African penguins Spheniscus demersus in response to 8 years of alternating time–area closures around two pairs of colonies. Our results demonstrate that fishing closures improved chick survival and condition, after controlling for changing prey availability. However, this effect was inconsistent across sites and years, highlighting the difficultly of assessing management interventions in marine ecosystems. Nevertheless, modelled increases in population growth rates exceeded 1% at one colony; i.e. the threshold considered biologically meaningful by fisheries management in South Africa. Fishing closures evidently can improve the population trend of a forage-fish-dependent predator—we therefore recommend they continue in South Africa and support their application elsewhere. However, detecting demographic gains for mobile marine predators from small no-take zones requires experimental time frames and scales that will often exceed those desired by decision makers. PMID:29343602
A dynamic discretization method for reliability inference in Dynamic Bayesian Networks
International Nuclear Information System (INIS)
Zhu, Jiandao; Collette, Matthew
2015-01-01
The material and modeling parameters that drive structural reliability analysis for marine structures are subject to a significant uncertainty. This is especially true when time-dependent degradation mechanisms such as structural fatigue cracking are considered. Through inspection and monitoring, information such as crack location and size can be obtained to improve these parameters and the corresponding reliability estimates. Dynamic Bayesian Networks (DBNs) are a powerful and flexible tool to model dynamic system behavior and update reliability and uncertainty analysis with life cycle data for problems such as fatigue cracking. However, a central challenge in using DBNs is the need to discretize certain types of continuous random variables to perform network inference while still accurately tracking low-probability failure events. Most existing discretization methods focus on getting the overall shape of the distribution correct, with less emphasis on the tail region. Therefore, a novel scheme is presented specifically to estimate the likelihood of low-probability failure events. The scheme is an iterative algorithm which dynamically partitions the discretization intervals at each iteration. Through applications to two stochastic crack-growth example problems, the algorithm is shown to be robust and accurate. Comparisons are presented between the proposed approach and existing methods for the discretization problem. - Highlights: • A dynamic discretization method is developed for low-probability events in DBNs. • The method is compared to existing approaches on two crack growth problems. • The method is shown to improve on existing methods for low-probability events
Characteristics of SiC neutron sensor spectrum unfolding process based on Bayesian inference
Energy Technology Data Exchange (ETDEWEB)
Cetnar, Jerzy; Krolikowski, Igor [Faculty of Energy and Fuels AGH - University of Science and Technology, Al. Mickiewicza 30, 30-059 Krakow (Poland); Ottaviani, L. [IM2NP, UMR CNRS 7334, Aix-Marseille University, Case 231 -13397 Marseille Cedex 20 (France); Lyoussi, A. [CEA, DEN, DER, Instrumentation Sensors and Dosimetry Laboratory, Cadarache, F-13108 St-Paul-Lez-Durance (France)
2015-07-01
This paper deals with SiC detector signal interpretation in neutron radiation measurements in mixed neutron gamma radiation fields, which is called the detector inverse problem or the spectrum unfolding, and it aims in finding a representation of the primary radiation, based on the measured detector signals. In our novel methodology we resort to Bayesian inference approach. In the developed procedure the resultant spectra is unfolded form detector channels reading, where the estimated neutron fluence in a group structure is obtained with its statistical characteristic comprising of standard deviation and correlation matrix. In the paper we present results of unfolding process for case of D-T neutron source in neutron moderating environment. Discussions of statistical properties of obtained results are presented as well as of the physical meaning of obtained correlation matrix of estimated group fluence. The presented works has been carried out within the I-SMART project, which is part of the KIC InnoEnergy R and D program. (authors)
Bayesian Predictive Inference of a Proportion Under a Twofold Small-Area Model
Directory of Open Access Journals (Sweden)
Nandram Balgobin
2016-03-01
Full Text Available We extend the twofold small-area model of Stukel and Rao (1997; 1999 to accommodate binary data. An example is the Third International Mathematics and Science Study (TIMSS, in which pass-fail data for mathematics of students from US schools (clusters are available at the third grade by regions and communities (small areas. We compare the finite population proportions of these small areas. We present a hierarchical Bayesian model in which the firststage binary responses have independent Bernoulli distributions, and each subsequent stage is modeled using a beta distribution, which is parameterized by its mean and a correlation coefficient. This twofold small-area model has an intracluster correlation at the first stage and an intercluster correlation at the second stage. The final-stage mean and all correlations are assumed to be noninformative independent random variables. We show how to infer the finite population proportion of each area. We have applied our models to synthetic TIMSS data to show that the twofold model is preferred over a onefold small-area model that ignores the clustering within areas. We further compare these models using a simulation study, which shows that the intracluster correlation is particularly important.
Directory of Open Access Journals (Sweden)
Dario Cuevas Rivera
2015-10-01
Full Text Available The olfactory information that is received by the insect brain is encoded in the form of spatiotemporal patterns in the projection neurons of the antennal lobe. These dense and overlapping patterns are transformed into a sparse code in Kenyon cells in the mushroom body. Although it is clear that this sparse code is the basis for rapid categorization of odors, it is yet unclear how the sparse code in Kenyon cells is computed and what information it represents. Here we show that this computation can be modeled by sequential firing rate patterns using Lotka-Volterra equations and Bayesian online inference. This new model can be understood as an 'intelligent coincidence detector', which robustly and dynamically encodes the presence of specific odor features. We found that the model is able to qualitatively reproduce experimentally observed activity in both the projection neurons and the Kenyon cells. In particular, the model explains mechanistically how sparse activity in the Kenyon cells arises from the dense code in the projection neurons. The odor classification performance of the model proved to be robust against noise and time jitter in the observed input sequences. As in recent experimental results, we found that recognition of an odor happened very early during stimulus presentation in the model. Critically, by using the model, we found surprising but simple computational explanations for several experimental phenomena.
Schmit, C. J.; Pritchard, J. R.
2018-03-01
Next generation radio experiments such as LOFAR, HERA, and SKA are expected to probe the Epoch of Reionization (EoR) and claim a first direct detection of the cosmic 21cm signal within the next decade. Data volumes will be enormous and can thus potentially revolutionize our understanding of the early Universe and galaxy formation. However, numerical modelling of the EoR can be prohibitively expensive for Bayesian parameter inference and how to optimally extract information from incoming data is currently unclear. Emulation techniques for fast model evaluations have recently been proposed as a way to bypass costly simulations. We consider the use of artificial neural networks as a blind emulation technique. We study the impact of training duration and training set size on the quality of the network prediction and the resulting best-fitting values of a parameter search. A direct comparison is drawn between our emulation technique and an equivalent analysis using 21CMMC. We find good predictive capabilities of our network using training sets of as low as 100 model evaluations, which is within the capabilities of fully numerical radiative transfer codes.
International Nuclear Information System (INIS)
George, J.S.; Schmidt, D.M.; Wood, C.C.
1999-01-01
We have developed a Bayesian approach to the analysis of neural electromagnetic (MEG/EEG) data that can incorporate or fuse information from other imaging modalities and addresses the ill-posed inverse problem by sarnpliig the many different solutions which could have produced the given data. From these samples one can draw probabilistic inferences about regions of activation. Our source model assumes a variable number of variable size cortical regions of stimulus-correlated activity. An active region consists of locations on the cortical surf ace, within a sphere centered on some location in cortex. The number and radi of active regions can vary to defined maximum values. The goal of the analysis is to determine the posterior probability distribution for the set of parameters that govern the number, location, and extent of active regions. Markov Chain Monte Carlo is used to generate a large sample of sets of parameters distributed according to the posterior distribution. This sample is representative of the many different source distributions that could account for given data, and allows identification of probable (i.e. consistent) features across solutions. Examples of the use of this analysis technique with both simulated and empirical MEG data are presented
Directory of Open Access Journals (Sweden)
Antonio Canale
2017-06-01
Full Text Available msBP is an R package that implements a new method to perform Bayesian multiscale nonparametric inference introduced by Canale and Dunson (2016. The method, based on mixtures of multiscale beta dictionary densities, overcomes the drawbacks of Pólya trees and inherits many of the advantages of Dirichlet process mixture models. The key idea is that an infinitely-deep binary tree is introduced, with a beta dictionary density assigned to each node of the tree. Using a multiscale stick-breaking characterization, stochastically decreasing weights are assigned to each node. The result is an infinite mixture model. The package msBP implements a series of basic functions to deal with this family of priors such as random densities and numbers generation, creation and manipulation of binary tree objects, and generic functions to plot and print the results. In addition, it implements the Gibbs samplers for posterior computation to perform multiscale density estimation and multiscale testing of group differences described in Canale and Dunson (2016.
Bayesian inference of the heat transfer properties of a wall using experimental data
Iglesias, Marco
2016-01-06
A hierarchical Bayesian inference method is developed to estimate the thermal resistance and volumetric heat capacity of a wall. We apply our methodology to a real case study where measurements are recorded each minute from two temperature probes and two heat flux sensors placed on both sides of a solid brick wall along a period of almost five days. We model the heat transfer through the wall by means of the one-dimensional heat equation with Dirichlet boundary conditions. The initial/boundary conditions for the temperature are approximated by piecewise linear functions. We assume that temperature and heat flux measurements have independent Gaussian noise and derive the joint likelihood of the wall parameters and the initial/boundary conditions. Under the model assumptions, the boundary conditions are marginalized analytically from the joint likelihood. ApproximatedGaussian posterior distributions for the wall parameters and the initial condition parameter are obtained using the Laplace method, after incorporating the available prior information. The information gain is estimated under different experimental setups, to determine the best allocation of resources.
Duggento, Andrea; Stankovski, Tomislav; McClintock, Peter V. E.; Stefanovska, Aneta
2012-12-01
Living systems have time-evolving interactions that, until recently, could not be identified accurately from recorded time series in the presence of noise. Stankovski [Phys. Rev. Lett.PRLTAO0031-900710.1103/PhysRevLett.109.024101 109, 024101 (2012)] introduced a method based on dynamical Bayesian inference that facilitates the simultaneous detection of time-varying synchronization, directionality of influence, and coupling functions. It can distinguish unsynchronized dynamics from noise-induced phase slips. The method is based on phase dynamics, with Bayesian inference of the time-evolving parameters being achieved by shaping the prior densities to incorporate knowledge of previous samples. We now present the method in detail using numerically generated data, data from an analog electronic circuit, and cardiorespiratory data. We also generalize the method to encompass networks of interacting oscillators and thus demonstrate its applicability to small-scale networks.
Chen, Mingjie; Izady, Azizallah; Abdalla, Osman A.; Amerjeed, Mansoor
2018-02-01
Bayesian inference using Markov Chain Monte Carlo (MCMC) provides an explicit framework for stochastic calibration of hydrogeologic models accounting for uncertainties; however, the MCMC sampling entails a large number of model calls, and could easily become computationally unwieldy if the high-fidelity hydrogeologic model simulation is time consuming. This study proposes a surrogate-based Bayesian framework to address this notorious issue, and illustrates the methodology by inverse modeling a regional MODFLOW model. The high-fidelity groundwater model is approximated by a fast statistical model using Bagging Multivariate Adaptive Regression Spline (BMARS) algorithm, and hence the MCMC sampling can be efficiently performed. In this study, the MODFLOW model is developed to simulate the groundwater flow in an arid region of Oman consisting of mountain-coast aquifers, and used to run representative simulations to generate training dataset for BMARS model construction. A BMARS-based Sobol' method is also employed to efficiently calculate input parameter sensitivities, which are used to evaluate and rank their importance for the groundwater flow model system. According to sensitivity analysis, insensitive parameters are screened out of Bayesian inversion of the MODFLOW model, further saving computing efforts. The posterior probability distribution of input parameters is efficiently inferred from the prescribed prior distribution using observed head data, demonstrating that the presented BMARS-based Bayesian framework is an efficient tool to reduce parameter uncertainties of a groundwater system.
Directory of Open Access Journals (Sweden)
Hero Alfred
2010-11-01
Full Text Available Abstract Background Nonparametric Bayesian techniques have been developed recently to extend the sophistication of factor models, allowing one to infer the number of appropriate factors from the observed data. We consider such techniques for sparse factor analysis, with application to gene-expression data from three virus challenge studies. Particular attention is placed on employing the Beta Process (BP, the Indian Buffet Process (IBP, and related sparseness-promoting techniques to infer a proper number of factors. The posterior density function on the model parameters is computed using Gibbs sampling and variational Bayesian (VB analysis. Results Time-evolving gene-expression data are considered for respiratory syncytial virus (RSV, Rhino virus, and influenza, using blood samples from healthy human subjects. These data were acquired in three challenge studies, each executed after receiving institutional review board (IRB approval from Duke University. Comparisons are made between several alternative means of per-forming nonparametric factor analysis on these data, with comparisons as well to sparse-PCA and Penalized Matrix Decomposition (PMD, closely related non-Bayesian approaches. Conclusions Applying the Beta Process to the factor scores, or to the singular values of a pseudo-SVD construction, the proposed algorithms infer the number of factors in gene-expression data. For real data the "true" number of factors is unknown; in our simulations we consider a range of noise variances, and the proposed Bayesian models inferred the number of factors accurately relative to other methods in the literature, such as sparse-PCA and PMD. We have also identified a "pan-viral" factor of importance for each of the three viruses considered in this study. We have identified a set of genes associated with this pan-viral factor, of interest for early detection of such viruses based upon the host response, as quantified via gene-expression data.
Chen, Bo; Chen, Minhua; Paisley, John; Zaas, Aimee; Woods, Christopher; Ginsburg, Geoffrey S; Hero, Alfred; Lucas, Joseph; Dunson, David; Carin, Lawrence
2010-11-09
Nonparametric Bayesian techniques have been developed recently to extend the sophistication of factor models, allowing one to infer the number of appropriate factors from the observed data. We consider such techniques for sparse factor analysis, with application to gene-expression data from three virus challenge studies. Particular attention is placed on employing the Beta Process (BP), the Indian Buffet Process (IBP), and related sparseness-promoting techniques to infer a proper number of factors. The posterior density function on the model parameters is computed using Gibbs sampling and variational Bayesian (VB) analysis. Time-evolving gene-expression data are considered for respiratory syncytial virus (RSV), Rhino virus, and influenza, using blood samples from healthy human subjects. These data were acquired in three challenge studies, each executed after receiving institutional review board (IRB) approval from Duke University. Comparisons are made between several alternative means of per-forming nonparametric factor analysis on these data, with comparisons as well to sparse-PCA and Penalized Matrix Decomposition (PMD), closely related non-Bayesian approaches. Applying the Beta Process to the factor scores, or to the singular values of a pseudo-SVD construction, the proposed algorithms infer the number of factors in gene-expression data. For real data the "true" number of factors is unknown; in our simulations we consider a range of noise variances, and the proposed Bayesian models inferred the number of factors accurately relative to other methods in the literature, such as sparse-PCA and PMD. We have also identified a "pan-viral" factor of importance for each of the three viruses considered in this study. We have identified a set of genes associated with this pan-viral factor, of interest for early detection of such viruses based upon the host response, as quantified via gene-expression data.
Energy Technology Data Exchange (ETDEWEB)
La Russa, D [The Ottawa Hospital Cancer Centre, Ottawa, ON (Canada)
2015-06-15
Purpose: The purpose of this project is to develop a robust method of parameter estimation for a Poisson-based TCP model using Bayesian inference. Methods: Bayesian inference was performed using the PyMC3 probabilistic programming framework written in Python. A Poisson-based TCP regression model that accounts for clonogen proliferation was fit to observed rates of local relapse as a function of equivalent dose in 2 Gy fractions for a population of 623 stage-I non-small-cell lung cancer patients. The Slice Markov Chain Monte Carlo sampling algorithm was used to sample the posterior distributions, and was initiated using the maximum of the posterior distributions found by optimization. The calculation of TCP with each sample step required integration over the free parameter α, which was performed using an adaptive 24-point Gauss-Legendre quadrature. Convergence was verified via inspection of the trace plot and posterior distribution for each of the fit parameters, as well as with comparisons of the most probable parameter values with their respective maximum likelihood estimates. Results: Posterior distributions for α, the standard deviation of α (σ), the average tumour cell-doubling time (Td), and the repopulation delay time (Tk), were generated assuming α/β = 10 Gy, and a fixed clonogen density of 10{sup 7} cm−{sup 3}. Posterior predictive plots generated from samples from these posterior distributions are in excellent agreement with the observed rates of local relapse used in the Bayesian inference. The most probable values of the model parameters also agree well with maximum likelihood estimates. Conclusion: A robust method of performing Bayesian inference of TCP data using a complex TCP model has been established.
Jakkareddy, Pradeep S.; Balaji, C.
2016-09-01
This paper employs the Bayesian based Metropolis Hasting - Markov Chain Monte Carlo algorithm to solve inverse heat transfer problem of determining the spatially varying heat transfer coefficient from a flat plate with flush mounted discrete heat sources with measured temperatures at the bottom of the plate. The Nusselt number is assumed to be of the form Nu = aReb(x/l)c . To input reasonable values of ’a’ and ‘b’ into the inverse problem, first limited two dimensional conjugate convection simulations were done with Comsol. Based on the guidance from this different values of ‘a’ and ‘b’ are input to a computationally less complex problem of conjugate conduction in the flat plate (15mm thickness) and temperature distributions at the bottom of the plate which is a more convenient location for measuring the temperatures without disturbing the flow were obtained. Since the goal of this work is to demonstrate the eficiacy of the Bayesian approach to accurately retrieve ‘a’ and ‘b’, numerically generated temperatures with known values of ‘a’ and ‘b’ are treated as ‘surrogate’ experimental data. The inverse problem is then solved by repeatedly using the forward solutions together with the MH-MCMC aprroach. To speed up the estimation, the forward model is replaced by an artificial neural network. The mean, maximum-a-posteriori and standard deviation of the estimated parameters ‘a’ and ‘b’ are reported. The robustness of the proposed method is examined, by synthetically adding noise to the temperatures.
Kim, Daesang; El Gharamti, Iman; Hantouche, Mireille; Elwardani, Ahmed Elsaid; Farooq, Aamir; Bisetti, Fabrizio; Knio, Omar
2017-01-01
We developed a novel two-step hierarchical method for the Bayesian inference of the rate parameters of a target reaction from time-resolved concentration measurements in shock tubes. The method was applied to the calibration of the parameters
Multi-scale inference of interaction rules in animal groups using Bayesian model selection.
Directory of Open Access Journals (Sweden)
Richard P Mann
Full Text Available Inference of interaction rules of animals moving in groups usually relies on an analysis of large scale system behaviour. Models are tuned through repeated simulation until they match the observed behaviour. More recent work has used the fine scale motions of animals to validate and fit the rules of interaction of animals in groups. Here, we use a Bayesian methodology to compare a variety of models to the collective motion of glass prawns (Paratya australiensis. We show that these exhibit a stereotypical 'phase transition', whereby an increase in density leads to the onset of collective motion in one direction. We fit models to this data, which range from: a mean-field model where all prawns interact globally; to a spatial Markovian model where prawns are self-propelled particles influenced only by the current positions and directions of their neighbours; up to non-Markovian models where prawns have 'memory' of previous interactions, integrating their experiences over time when deciding to change behaviour. We show that the mean-field model fits the large scale behaviour of the system, but does not capture the observed locality of interactions. Traditional self-propelled particle models fail to capture the fine scale dynamics of the system. The most sophisticated model, the non-Markovian model, provides a good match to the data at both the fine scale and in terms of reproducing global dynamics, while maintaining a biologically plausible perceptual range. We conclude that prawns' movements are influenced by not just the current direction of nearby conspecifics, but also those encountered in the recent past. Given the simplicity of prawns as a study system our research suggests that self-propelled particle models of collective motion should, if they are to be realistic at multiple biological scales, include memory of previous interactions and other non-Markovian effects.
Energy Technology Data Exchange (ETDEWEB)
Kang, Seongkeun; Seong, Poong Hyun [Korea Advanced Institute of Science and Technology, Daejeon (Korea, Republic of)
2014-05-15
The purpose of this paper is to confirm if Bayesian inference can properly reflect the situation awareness of real human operators, and find the difference between the situation of ideal and practical operators, and investigate the factors which contributes to those difference. As a results, human can not think like computer. If human can memorize all the information, and their thinking process is same to the CPU of computer, the results of these two experiments come out more than 99%. However the probability of finding right malfunction by humans are only 64.52% in simple experiment, and 51.61% in complex experiment. Cognition is the mental processing that includes the attention of working memory, comprehending and producing language, calculating, reasoning, problem solving, and decision making. There are many reasons why human thinking process is different with computer, but in this experiment, we suggest that the working memory is the most important factor. Humans have limited working memory which has only seven chunks capacity. These seven chunks are called magic number. If there are more than seven sequential information, people start to forget the previous information because their working memory capacity is running over. We can check how much working memory affects to the result through the simple experiment. Then what if we neglect the effect of working memory? The total number of subjects who have incorrect memory is 7 (subject 3, 5, 6, 7, 8, 15, 25). They could find the right malfunction if the memory hadn't changed because of lack of working memory. Then the probability of find correct malfunction will be increased to 87.10% from 64.52%. Complex experiment has similar result. In this case, eight subjects(1, 5, 8, 9, 15, 17, 18, 30) had changed the memory, and it affects to find the right malfunction. Considering it, then the probability would be (16+8)/31 = 77.42%.
Palacios, Julia A; Minin, Vladimir N
2013-03-01
Changes in population size influence genetic diversity of the population and, as a result, leave a signature of these changes in individual genomes in the population. We are interested in the inverse problem of reconstructing past population dynamics from genomic data. We start with a standard framework based on the coalescent, a stochastic process that generates genealogies connecting randomly sampled individuals from the population of interest. These genealogies serve as a glue between the population demographic history and genomic sequences. It turns out that only the times of genealogical lineage coalescences contain information about population size dynamics. Viewing these coalescent times as a point process, estimating population size trajectories is equivalent to estimating a conditional intensity of this point process. Therefore, our inverse problem is similar to estimating an inhomogeneous Poisson process intensity function. We demonstrate how recent advances in Gaussian process-based nonparametric inference for Poisson processes can be extended to Bayesian nonparametric estimation of population size dynamics under the coalescent. We compare our Gaussian process (GP) approach to one of the state-of-the-art Gaussian Markov random field (GMRF) methods for estimating population trajectories. Using simulated data, we demonstrate that our method has better accuracy and precision. Next, we analyze two genealogies reconstructed from real sequences of hepatitis C and human Influenza A viruses. In both cases, we recover more believed aspects of the viral demographic histories than the GMRF approach. We also find that our GP method produces more reasonable uncertainty estimates than the GMRF method. Copyright © 2013, The International Biometric Society.
Koskela, J. J.; Croke, B. W. F.; Koivusalo, H.; Jakeman, A. J.; Kokkonen, T.
2012-11-01
Bayesian inference is used to study the effect of precipitation and model structural uncertainty on estimates of model parameters and confidence limits of predictive variables in a conceptual rainfall-runoff model in the snow-fed Rudbäck catchment (142 ha) in southern Finland. The IHACRES model is coupled with a simple degree day model to account for snow accumulation and melt. The posterior probability distribution of the model parameters is sampled by using the Differential Evolution Adaptive Metropolis (DREAM(ZS)) algorithm and the generalized likelihood function. Precipitation uncertainty is taken into account by introducing additional latent variables that were used as multipliers for individual storm events. Results suggest that occasional snow water equivalent (SWE) observations together with daily streamflow observations do not contain enough information to simultaneously identify model parameters, precipitation uncertainty and model structural uncertainty in the Rudbäck catchment. The addition of an autoregressive component to account for model structure error and latent variables having uniform priors to account for input uncertainty lead to dubious posterior distributions of model parameters. Thus our hypothesis that informative priors for latent variables could be replaced by additional SWE data could not be confirmed. The model was found to work adequately in 1-day-ahead simulation mode, but the results were poor in the simulation batch mode. This was caused by the interaction of parameters that were used to describe different sources of uncertainty. The findings may have lessons for other cases where parameterizations are similarly high in relation to available prior information.
Hallo, Miroslav; Asano, Kimiyuki; Gallovič, František
2017-09-01
On April 16, 2016, Kumamoto prefecture in Kyushu region, Japan, was devastated by a shallow M JMA7.3 earthquake. The series of foreshocks started by M JMA6.5 foreshock 28 h before the mainshock. They have originated in Hinagu fault zone intersecting the mainshock Futagawa fault zone; hence, the tectonic background for this earthquake sequence is rather complex. Here we infer centroid moment tensors (CMTs) for 11 events with M JMA between 4.8 and 6.5, using strong motion records of the K-NET, KiK-net and F-net networks. We use upgraded Bayesian full-waveform inversion code ISOLA-ObsPy, which takes into account uncertainty of the velocity model. Such an approach allows us to reliably assess uncertainty of the CMT parameters including the centroid position. The solutions show significant systematic spatial and temporal variations throughout the sequence. Foreshocks are right-lateral steeply dipping strike-slip events connected to the NE-SW shear zone. Those located close to the intersection of the Hinagu and Futagawa fault zones are dipping slightly to ESE, while those in the southern area are dipping to WNW. Contrarily, aftershocks are mostly normal dip-slip events, being related to the N-S extensional tectonic regime. Most of the deviatoric moment tensors contain only minor CLVD component, which can be attributed to the velocity model uncertainty. Nevertheless, two of the CMTs involve a significant CLVD component, which may reflect complex rupture process. Decomposition of those moment tensors into two pure shear moment tensors suggests combined right-lateral strike-slip and normal dip-slip mechanisms, consistent with the tectonic settings of the intersection of the Hinagu and Futagawa fault zones.[Figure not available: see fulltext.
International Nuclear Information System (INIS)
Kang, Seongkeun; Seong, Poong Hyun
2014-01-01
The purpose of this paper is to confirm if Bayesian inference can properly reflect the situation awareness of real human operators, and find the difference between the situation of ideal and practical operators, and investigate the factors which contributes to those difference. As a results, human can not think like computer. If human can memorize all the information, and their thinking process is same to the CPU of computer, the results of these two experiments come out more than 99%. However the probability of finding right malfunction by humans are only 64.52% in simple experiment, and 51.61% in complex experiment. Cognition is the mental processing that includes the attention of working memory, comprehending and producing language, calculating, reasoning, problem solving, and decision making. There are many reasons why human thinking process is different with computer, but in this experiment, we suggest that the working memory is the most important factor. Humans have limited working memory which has only seven chunks capacity. These seven chunks are called magic number. If there are more than seven sequential information, people start to forget the previous information because their working memory capacity is running over. We can check how much working memory affects to the result through the simple experiment. Then what if we neglect the effect of working memory? The total number of subjects who have incorrect memory is 7 (subject 3, 5, 6, 7, 8, 15, 25). They could find the right malfunction if the memory hadn't changed because of lack of working memory. Then the probability of find correct malfunction will be increased to 87.10% from 64.52%. Complex experiment has similar result. In this case, eight subjects(1, 5, 8, 9, 15, 17, 18, 30) had changed the memory, and it affects to find the right malfunction. Considering it, then the probability would be (16+8)/31 = 77.42%
Spatial variability of coastal wetland resilience to sea-level rise using Bayesian inference
Hardy, T.; Wu, W.
2017-12-01
The coastal wetlands in the Northern Gulf of Mexico (NGOM) account for 40% of coastal wetland area in the United States and provide various ecosystem services to the region and broader areas. Increasing rates of relative sea-level rise (RSLR), and reduced sediment input have increased coastal wetland loss in the NGOM, accounting for 80% of coastal wetland loss in the nation. Traditional models for predicting the impact of RSLR on coastal wetlands in the NGOM have focused on coastal erosion driven by geophysical variables only, and/or at small spatial extents. Here we developed a model in Bayesian inference to make probabilistic prediction of wetland loss in the entire NGOM as a function of vegetation productivity and geophysical attributes. We also studied how restoration efforts help maintain the area of coastal wetlands. Vegetation productivity contributes organic matter to wetland sedimentation and was approximated using the remotely sensed normalized difference moisture index (NDMI). The geophysical variables include RSLR, tidal range, river discharge, coastal slope, and wave height. We found a significantly positive relation between wetland loss and RSLR, which varied significantly at different river discharge regimes. There also existed a significantly negative relation between wetland loss and NDMI, indicating that in-situ vegetation productivity contributed to wetland resilience to RSLR. This relation did not vary significantly between river discharge regimes. The spatial relation revealed three areas of high RSLR but relatively low wetland loss; these areas were associated with wetland restoration projects in coastal Louisiana. Two projects were breakwater projects, where hard materials were placed off-shore to reduce wave action and promote sedimentation. And one project was a vegetation planting project used to promote sedimentation and wetland stabilization. We further developed an interactive web tool that allows stakeholders to develop similar wetland
Multi-scale inference of interaction rules in animal groups using Bayesian model selection.
Directory of Open Access Journals (Sweden)
Richard P Mann
2012-01-01
Full Text Available Inference of interaction rules of animals moving in groups usually relies on an analysis of large scale system behaviour. Models are tuned through repeated simulation until they match the observed behaviour. More recent work has used the fine scale motions of animals to validate and fit the rules of interaction of animals in groups. Here, we use a Bayesian methodology to compare a variety of models to the collective motion of glass prawns (Paratya australiensis. We show that these exhibit a stereotypical 'phase transition', whereby an increase in density leads to the onset of collective motion in one direction. We fit models to this data, which range from: a mean-field model where all prawns interact globally; to a spatial Markovian model where prawns are self-propelled particles influenced only by the current positions and directions of their neighbours; up to non-Markovian models where prawns have 'memory' of previous interactions, integrating their experiences over time when deciding to change behaviour. We show that the mean-field model fits the large scale behaviour of the system, but does not capture fine scale rules of interaction, which are primarily mediated by physical contact. Conversely, the Markovian self-propelled particle model captures the fine scale rules of interaction but fails to reproduce global dynamics. The most sophisticated model, the non-Markovian model, provides a good match to the data at both the fine scale and in terms of reproducing global dynamics. We conclude that prawns' movements are influenced by not just the current direction of nearby conspecifics, but also those encountered in the recent past. Given the simplicity of prawns as a study system our research suggests that self-propelled particle models of collective motion should, if they are to be realistic at multiple biological scales, include memory of previous interactions and other non-Markovian effects.
A new Bayesian Inference-based Phase Associator for Earthquake Early Warning
Meier, Men-Andrin; Heaton, Thomas; Clinton, John; Wiemer, Stefan
2013-04-01
State of the art network-based Earthquake Early Warning (EEW) systems can provide warnings for large magnitude 7+ earthquakes. Although regions in the direct vicinity of the epicenter will not receive warnings prior to damaging shaking, real-time event characterization is available before the destructive S-wave arrival across much of the strongly affected region. In contrast, in the case of the more frequent medium size events, such as the devastating 1994 Mw6.7 Northridge, California, earthquake, providing timely warning to the smaller damage zone is more difficult. For such events the "blind zone" of current systems (e.g. the CISN ShakeAlert system in California) is similar in size to the area over which severe damage occurs. We propose a faster and more robust Bayesian inference-based event associator, that in contrast to the current standard associators (e.g. Earthworm Binder), is tailored to EEW and exploits information other than only phase arrival times. In particular, the associator potentially allows for reliable automated event association with as little as two observations, which, compared to the ShakeAlert system, would speed up the real-time characterizations by about ten seconds and thus reduce the blind zone area by up to 80%. We compile an extensive data set of regional and teleseismic earthquake and noise waveforms spanning a wide range of earthquake magnitudes and tectonic regimes. We pass these waveforms through a causal real-time filterbank with passband filters between 0.1 and 50Hz, and, updating every second from the event detection, extract the maximum amplitudes in each frequency band. Using this dataset, we define distributions of amplitude maxima in each passband as a function of epicentral distance and magnitude. For the real-time data, we pass incoming broadband and strong motion waveforms through the same filterbank and extract an evolving set of maximum amplitudes in each passband. We use the maximum amplitude distributions to check
Huang, Yi-Fei; Golding, G Brian
2015-02-15
A number of statistical phylogenetic methods have been developed to infer conserved functional sites or regions in proteins. Many methods, e.g. Rate4Site, apply the standard phylogenetic models to infer site-specific substitution rates and totally ignore the spatial correlation of substitution rates in protein tertiary structures, which may reduce their power to identify conserved functional patches in protein tertiary structures when the sequences used in the analysis are highly similar. The 3D sliding window method has been proposed to infer conserved functional patches in protein tertiary structures, but the window size, which reflects the strength of the spatial correlation, must be predefined and is not inferred from data. We recently developed GP4Rate to solve these problems under the Bayesian framework. Unfortunately, GP4Rate is computationally slow. Here, we present an intuitive web server, FuncPatch, to perform a fast approximate Bayesian inference of conserved functional patches in protein tertiary structures. Both simulations and four case studies based on empirical data suggest that FuncPatch is a good approximation to GP4Rate. However, FuncPatch is orders of magnitudes faster than GP4Rate. In addition, simulations suggest that FuncPatch is potentially a useful tool complementary to Rate4Site, but the 3D sliding window method is less powerful than FuncPatch and Rate4Site. The functional patches predicted by FuncPatch in the four case studies are supported by experimental evidence, which corroborates the usefulness of FuncPatch. The software FuncPatch is freely available at the web site, http://info.mcmaster.ca/yifei/FuncPatch golding@mcmaster.ca Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Raithel, Carolyn A.; Özel, Feryal; Psaltis, Dimitrios
2017-08-01
One of the key goals of observing neutron stars is to infer the equation of state (EoS) of the cold, ultradense matter in their interiors. Here, we present a Bayesian statistical method of inferring the pressures at five fixed densities, from a sample of mock neutron star masses and radii. We show that while five polytropic segments are needed for maximum flexibility in the absence of any prior knowledge of the EoS, regularizers are also necessary to ensure that simple underlying EoS are not over-parameterized. For ideal data with small measurement uncertainties, we show that the pressure at roughly twice the nuclear saturation density, {ρ }{sat}, can be inferred to within 0.3 dex for many realizations of potential sources of uncertainties. The pressures of more complicated EoS with significant phase transitions can also be inferred to within ˜30%. We also find that marginalizing the multi-dimensional parameter space of pressure to infer a mass-radius relation can lead to biases of nearly 1 km in radius, toward larger radii. Using the full, five-dimensional posterior likelihoods avoids this bias.
Modelling maximum river flow by using Bayesian Markov Chain Monte Carlo
Cheong, R. Y.; Gabda, D.
2017-09-01
Analysis of flood trends is vital since flooding threatens human living in terms of financial, environment and security. The data of annual maximum river flows in Sabah were fitted into generalized extreme value (GEV) distribution. Maximum likelihood estimator (MLE) raised naturally when working with GEV distribution. However, previous researches showed that MLE provide unstable results especially in small sample size. In this study, we used different Bayesian Markov Chain Monte Carlo (MCMC) based on Metropolis-Hastings algorithm to estimate GEV parameters. Bayesian MCMC method is a statistical inference which studies the parameter estimation by using posterior distribution based on Bayes’ theorem. Metropolis-Hastings algorithm is used to overcome the high dimensional state space faced in Monte Carlo method. This approach also considers more uncertainty in parameter estimation which then presents a better prediction on maximum river flow in Sabah.
Systematic validation of non-equilibrium thermochemical models using Bayesian inference
Miki, Kenji; Panesi, Marco; Prudhomme, Serge
2015-01-01
and Treanor [3], and Park [4]. Up to 16 uncertain parameters are estimated using Bayesian updating based on the latest absolute volumetric radiance data collected at the Electric Arc Shock Tube (EAST) installed inside the NASA Ames Research Center. Following
DEFF Research Database (Denmark)
Hooghoudt, Jan Otto; Barroso, Margarida; Waagepetersen, Rasmus Plenge
2017-01-01
Főrster resonance energy transfer (FRET) is a quantum-physical phenomenon where energy may be transferred from one molecule to a neighbour molecule if the molecules are close enough. Using fluorophore molecule marking of proteins in a cell it is possible to measure in microscopic images to what....... In this paper we propose a new likelihood-based approach to statistical inference for FRET microscopic data. The likelihood function is obtained from a detailed modeling of the FRET data generating mechanism conditional on a protein configuration. We next follow a Bayesian approach and introduce a spatial point...
Estimating the Effective Sample Size of Tree Topologies from Bayesian Phylogenetic Analyses
Lanfear, Robert; Hua, Xia; Warren, Dan L.
2016-01-01
Bayesian phylogenetic analyses estimate posterior distributions of phylogenetic tree topologies and other parameters using Markov chain Monte Carlo (MCMC) methods. Before making inferences from these distributions, it is important to assess their adequacy. To this end, the effective sample size (ESS) estimates how many truly independent samples of a given parameter the output of the MCMC represents. The ESS of a parameter is frequently much lower than the number of samples taken from the MCMC because sequential samples from the chain can be non-independent due to autocorrelation. Typically, phylogeneticists use a rule of thumb that the ESS of all parameters should be greater than 200. However, we have no method to calculate an ESS of tree topology samples, despite the fact that the tree topology is often the parameter of primary interest and is almost always central to the estimation of other parameters. That is, we lack a method to determine whether we have adequately sampled one of the most important parameters in our analyses. In this study, we address this problem by developing methods to estimate the ESS for tree topologies. We combine these methods with two new diagnostic plots for assessing posterior samples of tree topologies, and compare their performance on simulated and empirical data sets. Combined, the methods we present provide new ways to assess the mixing and convergence of phylogenetic tree topologies in Bayesian MCMC analyses. PMID:27435794
Bayesian inference – a way to combine statistical data and semantic analysis meaningfully
Directory of Open Access Journals (Sweden)
Eila Lindfors
2011-11-01
Full Text Available This article focuses on presenting the possibilities of Bayesian modelling (Finite Mixture Modelling in the semantic analysis of statistically modelled data. The probability of a hypothesis in relation to the data available is an important question in inductive reasoning. Bayesian modelling allows the researcher to use many models at a time and provides tools to evaluate the goodness of different models. The researcher should always be aware that there is no such thing as the exact probability of an exact event. This is the reason for using probabilistic models. Each model presents a different perspective on the phenomenon in focus, and the researcher has to choose the most probable model with a view to previous research and the knowledge available.The idea of Bayesian modelling is illustrated here by presenting two different sets of data, one from craft science research (n=167 and the other (n=63 from educational research (Lindfors, 2007, 2002. The principles of how to build models and how to combine different profiles are described in the light of the research mentioned.Bayesian modelling is an analysis based on calculating probabilities in relation to a specific set of quantitative data. It is a tool for handling data and interpreting it semantically. The reliability of the analysis arises from an argumentation of which model can be selected from the model space as the basis for an interpretation, and on which arguments.Keywords: method, sloyd, Bayesian modelling, student teachersURN:NBN:no-29959
Yu, Jianbo
2015-12-01
Prognostics is much efficient to achieve zero-downtime performance, maximum productivity and proactive maintenance of machines. Prognostics intends to assess and predict the time evolution of machine health degradation so that machine failures can be predicted and prevented. A novel prognostics system is developed based on the data-model-fusion scheme using the Bayesian inference-based self-organizing map (SOM) and an integration of logistic regression (LR) and high-order particle filtering (HOPF). In this prognostics system, a baseline SOM is constructed to model the data distribution space of healthy machine under an assumption that predictable fault patterns are not available. Bayesian inference-based probability (BIP) derived from the baseline SOM is developed as a quantification indication of machine health degradation. BIP is capable of offering failure probability for the monitored machine, which has intuitionist explanation related to health degradation state. Based on those historic BIPs, the constructed LR and its modeling noise constitute a high-order Markov process (HOMP) to describe machine health propagation. HOPF is used to solve the HOMP estimation to predict the evolution of the machine health in the form of a probability density function (PDF). An on-line model update scheme is developed to adapt the Markov process changes to machine health dynamics quickly. The experimental results on a bearing test-bed illustrate the potential applications of the proposed system as an effective and simple tool for machine health prognostics.
Zhou, X.; Albertson, J. D.
2016-12-01
Natural gas is considered as a bridge fuel towards clean energy due to its potential lower greenhouse gas emission comparing with other fossil fuels. Despite numerous efforts, an efficient and cost-effective approach to monitor fugitive methane emissions along the natural gas production-supply chain has not been developed yet. Recently, mobile methane measurement has been introduced which applies a Bayesian approach to probabilistically infer methane emission rates and update estimates recursively when new measurements become available. However, the likelihood function, especially the error term which determines the shape of the estimate uncertainty, is not rigorously defined and evaluated with field data. To address this issue, we performed a series of near-source (using a specialized vehicle mounted with fast response methane analyzers and a GPS unit. Methane concentrations were measured at two different heights along mobile traversals downwind of the sources, and concurrent wind and temperature data are recorded by nearby 3-D sonic anemometers. With known methane release rates, the measurements were used to determine the functional form and the parameterization of the likelihood function in the Bayesian inference scheme under different meteorological conditions.
Adaptive multiscale MCMC algorithm for uncertainty quantification in seismic parameter estimation
Tan, Xiaosi
2014-08-05
Formulating an inverse problem in a Bayesian framework has several major advantages (Sen and Stoffa, 1996). It allows finding multiple solutions subject to flexible a priori information and performing uncertainty quantification in the inverse problem. In this paper, we consider Bayesian inversion for the parameter estimation in seismic wave propagation. The Bayes\\' theorem allows writing the posterior distribution via the likelihood function and the prior distribution where the latter represents our prior knowledge about physical properties. One of the popular algorithms for sampling this posterior distribution is Markov chain Monte Carlo (MCMC), which involves making proposals and calculating their acceptance probabilities. However, for large-scale problems, MCMC is prohibitevely expensive as it requires many forward runs. In this paper, we propose a multilevel MCMC algorithm that employs multilevel forward simulations. Multilevel forward simulations are derived using Generalized Multiscale Finite Element Methods that we have proposed earlier (Efendiev et al., 2013a; Chung et al., 2013). Our overall Bayesian inversion approach provides a substantial speed-up both in the process of the sampling via preconditioning using approximate posteriors and the computation of the forward problems for different proposals by using the adaptive nature of multiscale methods. These aspects of the method are discussed n the paper. This paper is motivated by earlier work of M. Sen and his collaborators (Hong and Sen, 2007; Hong, 2008) who proposed the development of efficient MCMC techniques for seismic applications. In the paper, we present some preliminary numerical results.
Scliar, Marilia O; Gouveia, Mateus H; Benazzo, Andrea; Ghirotto, Silvia; Fagundes, Nelson J R; Leal, Thiago P; Magalhães, Wagner C S; Pereira, Latife; Rodrigues, Maira R; Soares-Souza, Giordano B; Cabrera, Lilia; Berg, Douglas E; Gilman, Robert H; Bertorelle, Giorgio; Tarazona-Santos, Eduardo
2014-09-30
Archaeology reports millenary cultural contacts between Peruvian Coast-Andes and the Amazon Yunga, a rainforest transitional region between Andes and Lower Amazonia. To clarify the relationships between cultural and biological evolution of these populations, in particular between Amazon Yungas and Andeans, we used DNA-sequence data, a model-based Bayesian approach and several statistical validations to infer a set of demographic parameters. We found that the genetic diversity of the Shimaa (an Amazon Yunga population) is a subset of that of Quechuas from Central-Andes. Using the Isolation-with-Migration population genetics model, we inferred that the Shimaa ancestors were a small subgroup that split less than 5300 years ago (after the development of complex societies) from an ancestral Andean population. After the split, the most plausible scenario compatible with our results is that the ancestors of Shimaas moved toward the Peruvian Amazon Yunga and incorporated the culture and language of some of their neighbors, but not a substantial amount of their genes. We validated our results using Approximate Bayesian Computations, posterior predictive tests and the analysis of pseudo-observed datasets. We presented a case study in which model-based Bayesian approaches, combined with necessary statistical validations, shed light into the prehistoric demographic relationship between Andeans and a population from the Amazon Yunga. Our results offer a testable model for the peopling of this large transitional environmental region between the Andes and the Lower Amazonia. However, studies on larger samples and involving more populations of these regions are necessary to confirm if the predominant Andean biological origin of the Shimaas is the rule, and not the exception.
Directory of Open Access Journals (Sweden)
M. Subbiah
2017-01-01
Full Text Available Extensive statistical practice has shown the importance and relevance of the inferential problem of estimating probability parameters in a binomial experiment; especially on the issues of competing intervals from frequentist, Bayesian, and Bootstrap approaches. The package written in the free R environment and presented in this paper tries to take care of the issues just highlighted, by pooling a number of widely available and well-performing methods and apporting on them essential variations. A wide range of functions helps users with differing skills to estimate, evaluate, summarize, numerically and graphically, various measures adopting either the frequentist or the Bayesian paradigm.
Bayesian networks in educational assessment
Almond, Russell G; Steinberg, Linda S; Yan, Duanli; Williamson, David M
2015-01-01
Bayesian inference networks, a synthesis of statistics and expert systems, have advanced reasoning under uncertainty in medicine, business, and social sciences. This innovative volume is the first comprehensive treatment exploring how they can be applied to design and analyze innovative educational assessments. Part I develops Bayes nets’ foundations in assessment, statistics, and graph theory, and works through the real-time updating algorithm. Part II addresses parametric forms for use with assessment, model-checking techniques, and estimation with the EM algorithm and Markov chain Monte Carlo (MCMC). A unique feature is the volume’s grounding in Evidence-Centered Design (ECD) framework for assessment design. This “design forward” approach enables designers to take full advantage of Bayes nets’ modularity and ability to model complex evidentiary relationships that arise from performance in interactive, technology-rich assessments such as simulations. Part III describes ECD, situates Bayes nets as ...
Czech Academy of Sciences Publication Activity Database
Fernandes, R.; Eley, Y.; Brabec, Marek; Lucquin, A.; Millard, A.; Craig, O.E.
2018-01-01
Roč. 117, March (2018), s. 31-42 ISSN 0146-6380 Institutional support: RVO:67985807 Keywords : Fatty acids * carbon isotopes * pottery use * Bayesian mixing models * FRUITS Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 3.081, year: 2016
Adaptive Bayesian inference on the mean of an infinite-dimensional normal distribution
Belitser, E.; Ghosal, S.
2003-01-01
We consider the problem of estimating the mean of an infinite-break dimensional normal distribution from the Bayesian perspective. Under the assumption that the unknown true mean satisfies a "smoothness condition," we first derive the convergence rate of the posterior distribution for a prior that
International Nuclear Information System (INIS)
Blanc, Guillermo A.; Kewley, Lisa; Vogt, Frédéric P. A.; Dopita, Michael A.
2015-01-01
We present a new method for inferring the metallicity (Z) and ionization parameter (q) of H II regions and star-forming galaxies using strong nebular emission lines (SELs). We use Bayesian inference to derive the joint and marginalized posterior probability density functions for Z and q given a set of observed line fluxes and an input photoionization model. Our approach allows the use of arbitrary sets of SELs and the inclusion of flux upper limits. The method provides a self-consistent way of determining the physical conditions of ionized nebulae that is not tied to the arbitrary choice of a particular SEL diagnostic and uses all the available information. Unlike theoretically calibrated SEL diagnostics, the method is flexible and not tied to a particular photoionization model. We describe our algorithm, validate it against other methods, and present a tool that implements it called IZI. Using a sample of nearby extragalactic H II regions, we assess the performance of commonly used SEL abundance diagnostics. We also use a sample of 22 local H II regions having both direct and recombination line (RL) oxygen abundance measurements in the literature to study discrepancies in the abundance scale between different methods. We find that oxygen abundances derived through Bayesian inference using currently available photoionization models in the literature can be in good (∼30%) agreement with RL abundances, although some models perform significantly better than others. We also confirm that abundances measured using the direct method are typically ∼0.2 dex lower than both RL and photoionization-model-based abundances
Energy Technology Data Exchange (ETDEWEB)
Blanc, Guillermo A. [Observatories of the Carnegie Institution for Science, 813 Santa Barbara Street, Pasadena, CA 91101 (United States); Kewley, Lisa; Vogt, Frédéric P. A.; Dopita, Michael A. [Research School of Astronomy and Astrophysics, Australian National University, Cotter Road, Weston, ACT 2611 (Australia)
2015-01-10
We present a new method for inferring the metallicity (Z) and ionization parameter (q) of H II regions and star-forming galaxies using strong nebular emission lines (SELs). We use Bayesian inference to derive the joint and marginalized posterior probability density functions for Z and q given a set of observed line fluxes and an input photoionization model. Our approach allows the use of arbitrary sets of SELs and the inclusion of flux upper limits. The method provides a self-consistent way of determining the physical conditions of ionized nebulae that is not tied to the arbitrary choice of a particular SEL diagnostic and uses all the available information. Unlike theoretically calibrated SEL diagnostics, the method is flexible and not tied to a particular photoionization model. We describe our algorithm, validate it against other methods, and present a tool that implements it called IZI. Using a sample of nearby extragalactic H II regions, we assess the performance of commonly used SEL abundance diagnostics. We also use a sample of 22 local H II regions having both direct and recombination line (RL) oxygen abundance measurements in the literature to study discrepancies in the abundance scale between different methods. We find that oxygen abundances derived through Bayesian inference using currently available photoionization models in the literature can be in good (∼30%) agreement with RL abundances, although some models perform significantly better than others. We also confirm that abundances measured using the direct method are typically ∼0.2 dex lower than both RL and photoionization-model-based abundances.
Trajectory averaging for stochastic approximation MCMC algorithms
Liang, Faming
2010-01-01
to the stochastic approximation Monte Carlo algorithm [Liang, Liu and Carroll J. Amer. Statist. Assoc. 102 (2007) 305-320]. The application of the trajectory averaging estimator to other stochastic approximationMCMC algorithms, for example, a stochastic
Multi-model polynomial chaos surrogate dictionary for Bayesian inference in elasticity problems
Contreras, Andres A.; Le Maî tre, Olivier P.; Aquino, Wilkins; Knio, Omar
2016-01-01
of stiff inclusions embedded in a soft matrix, mimicking tumors in soft tissues. We rely on a polynomial chaos (PC) surrogate to accelerate the inference process. The PC surrogate predicts the dependence of the displacements field with the random elastic
Air kerma rate estimation by means of in-situ gamma spectrometry: A Bayesian approach
International Nuclear Information System (INIS)
Cabal, Gonzalo; Kluson, Jaroslav
2008-01-01
Full text: Bayesian inference is used to determine the Air Kerma Rate based on a set of in situ environmental gamma spectra measurements performed with a NaI(Tl) scintillation detector. A natural advantage of such approach is the possibility to quantify uncertainty not only in the Air Kerma Rate estimation but also for the gamma spectra which is unfolded within the procedure. The measurements were performed using a 3'' x 3'' NaI(Tl) scintillation detector. The response matrices of such detection system were calculated using a Monte Carlo code. For the calculations of the spectra as well as the Air Kerma Rate the WinBugs program was used. WinBugs is a dedicated software for Bayesian inference using Monte Carlo Markov chain methods (MCMC). The results of such calculations are shown and compared with other non-Bayesian approachs such as the Scofield-Gold iterative method and the Maximum Entropy Method
From least squares to multilevel modeling: A graphical introduction to Bayesian inference
Loredo, Thomas J.
2016-01-01
This tutorial presentation will introduce some of the key ideas and techniques involved in applying Bayesian methods to problems in astrostatistics. The focus will be on the big picture: understanding the foundations (interpreting probability, Bayes's theorem, the law of total probability and marginalization), making connections to traditional methods (propagation of errors, least squares, chi-squared, maximum likelihood, Monte Carlo simulation), and highlighting problems where a Bayesian approach can be particularly powerful (Poisson processes, density estimation and curve fitting with measurement error). The "graphical" component of the title reflects an emphasis on pictorial representations of some of the math, but also on the use of graphical models (multilevel or hierarchical models) for analyzing complex data. Code for some examples from the talk will be available to participants, in Python and in the Stan probabilistic programming language.
DEFF Research Database (Denmark)
Iglesias, Juan Eugenio; Sabuncu, Mert Rory; Van Leemput, Koen
2013-01-01
Many segmentation algorithms in medical image analysis use Bayesian modeling to augment local image appearance with prior anatomical knowledge. Such methods often contain a large number of free parameters that are first estimated and then kept fixed during the actual segmentation process. However...... in an Alzheimer’s disease classification task. As an additional benefit, the technique also allows one to compute informative “error bars” on the volume estimates of individual structures....
Inferring the most probable maps of underground utilities using Bayesian mapping model
Bilal, Muhammad; Khan, Wasiq; Muggleton, Jennifer; Rustighi, Emiliano; Jenks, Hugo; Pennock, Steve R.; Atkins, Phil R.; Cohn, Anthony
2018-03-01
Mapping the Underworld (MTU), a major initiative in the UK, is focused on addressing social, environmental and economic consequences raised from the inability to locate buried underground utilities (such as pipes and cables) by developing a multi-sensor mobile device. The aim of MTU device is to locate different types of buried assets in real time with the use of automated data processing techniques and statutory records. The statutory records, even though typically being inaccurate and incomplete, provide useful prior information on what is buried under the ground and where. However, the integration of information from multiple sensors (raw data) with these qualitative maps and their visualization is challenging and requires the implementation of robust machine learning/data fusion approaches. An approach for automated creation of revised maps was developed as a Bayesian Mapping model in this paper by integrating the knowledge extracted from sensors raw data and available statutory records. The combination of statutory records with the hypotheses from sensors was for initial estimation of what might be found underground and roughly where. The maps were (re)constructed using automated image segmentation techniques for hypotheses extraction and Bayesian classification techniques for segment-manhole connections. The model consisting of image segmentation algorithm and various Bayesian classification techniques (segment recognition and expectation maximization (EM) algorithm) provided robust performance on various simulated as well as real sites in terms of predicting linear/non-linear segments and constructing refined 2D/3D maps.
Hagemann, M. W.; Gleason, C. J.; Durand, M. T.
2017-11-01
The forthcoming Surface Water and Ocean Topography (SWOT) NASA satellite mission will measure water surface width, height, and slope of major rivers worldwide. The resulting data could provide an unprecedented account of river discharge at continental scales, but reliable methods need to be identified prior to launch. Here we present a novel algorithm for discharge estimation from only remotely sensed stream width, slope, and height at multiple locations along a mass-conserved river segment. The algorithm, termed the Bayesian AMHG-Manning (BAM) algorithm, implements a Bayesian formulation of streamflow uncertainty using a combination of Manning's equation and at-many-stations hydraulic geometry (AMHG). Bayesian methods provide a statistically defensible approach to generating discharge estimates in a physically underconstrained system but rely on prior distributions that quantify the a priori uncertainty of unknown quantities including discharge and hydraulic equation parameters. These were obtained from literature-reported values and from a USGS data set of acoustic Doppler current profiler (ADCP) measurements at USGS stream gauges. A data set of simulated widths, slopes, and heights from 19 rivers was used to evaluate the algorithms using a set of performance metrics. Results across the 19 rivers indicate an improvement in performance of BAM over previously tested methods and highlight a path forward in solving discharge estimation using solely satellite remote sensing.
Safner, T.; Miller, M.P.; McRae, B.H.; Fortin, M.-J.; Manel, S.
2011-01-01
Recently, techniques available for identifying clusters of individuals or boundaries between clusters using genetic data from natural populations have expanded rapidly. Consequently, there is a need to evaluate these different techniques. We used spatially-explicit simulation models to compare three spatial Bayesian clustering programs and two edge detection methods. Spatially-structured populations were simulated where a continuous population was subdivided by barriers. We evaluated the ability of each method to correctly identify boundary locations while varying: (i) time after divergence, (ii) strength of isolation by distance, (iii) level of genetic diversity, and (iv) amount of gene flow across barriers. To further evaluate the methods' effectiveness to detect genetic clusters in natural populations, we used previously published data on North American pumas and a European shrub. Our results show that with simulated and empirical data, the Bayesian spatial clustering algorithms outperformed direct edge detection methods. All methods incorrectly detected boundaries in the presence of strong patterns of isolation by distance. Based on this finding, we support the application of Bayesian spatial clustering algorithms for boundary detection in empirical datasets, with necessary tests for the influence of isolation by distance. ?? 2011 by the authors; licensee MDPI, Basel, Switzerland.
Inference on inspiral signals using LISA MLDC data
International Nuclear Information System (INIS)
Roever, Christian; Stroeer, Alexander; Bloomer, Ed; Christensen, Nelson; Clark, James; Hendry, Martin; Messenger, Chris; Meyer, Renate; Pitkin, Matt; Toher, Jennifer; Umstaetter, Richard; Vecchio, Alberto; Veitch, John; Woan, Graham
2007-01-01
In this paper, we describe a Bayesian inference framework for the analysis of data obtained by LISA. We set up a model for binary inspiral signals as defined for the Mock LISA Data Challenge 1.2 (MLDC), and implemented a Markov chain Monte Carlo (MCMC) algorithm to facilitate exploration and integration of the posterior distribution over the nine-dimensional parameter space. Here, we present intermediate results showing how, using this method, information about the nine parameters can be extracted from the data
Goodlet, Brent R.; Mills, Leah; Bales, Ben; Charpagne, Marie-Agathe; Murray, Sean P.; Lenthe, William C.; Petzold, Linda; Pollock, Tresa M.
2018-06-01
Bayesian inference is employed to precisely evaluate single crystal elastic properties of novel γ -γ ' Co- and CoNi-based superalloys from simple and non-destructive resonant ultrasound spectroscopy (RUS) measurements. Nine alloys from three Co-, CoNi-, and Ni-based alloy classes were evaluated in the fully aged condition, with one alloy per class also evaluated in the solution heat-treated condition. Comparisons are made between the elastic properties of the three alloy classes and among the alloys of a single class, with the following trends observed. A monotonic rise in the c_{44} (shear) elastic constant by a total of 12 pct is observed between the three alloy classes as Co is substituted for Ni. Elastic anisotropy ( A) is also increased, with a large majority of the nearly 13 pct increase occurring after Co becomes the dominant constituent. Together the five CoNi alloys, with Co:Ni ratios from 1:1 to 1.5:1, exhibited remarkably similar properties with an average A 1.8 pct greater than the Ni-based alloy CMSX-4. Custom code demonstrating a substantial advance over previously reported methods for RUS inversion is also reported here for the first time. CmdStan-RUS is built upon the open-source probabilistic programing language of Stan and formulates the inverse problem using Bayesian methods. Bayesian posterior distributions are efficiently computed with Hamiltonian Monte Carlo (HMC), while initial parameterization is randomly generated from weakly informative prior distributions. Remarkably robust convergence behavior is demonstrated across multiple independent HMC chains in spite of initial parameterization often very far from actual parameter values. Experimental procedures are substantially simplified by allowing any arbitrary misorientation between the specimen and crystal axes, as elastic properties and misorientation are estimated simultaneously.
Ponciano, José Miguel
2017-11-22
Using a nonparametric Bayesian approach Palacios and Minin (2013) dramatically improved the accuracy, precision of Bayesian inference of population size trajectories from gene genealogies. These authors proposed an extension of a Gaussian Process (GP) nonparametric inferential method for the intensity function of non-homogeneous Poisson processes. They found that not only the statistical properties of the estimators were improved with their method, but also, that key aspects of the demographic histories were recovered. The authors' work represents the first Bayesian nonparametric solution to this inferential problem because they specify a convenient prior belief without a particular functional form on the population trajectory. Their approach works so well and provides such a profound understanding of the biological process, that the question arises as to how truly "biology-free" their approach really is. Using well-known concepts of stochastic population dynamics, here I demonstrate that in fact, Palacios and Minin's GP model can be cast as a parametric population growth model with density dependence and environmental stochasticity. Making this link between population genetics and stochastic population dynamics modeling provides novel insights into eliciting biologically meaningful priors for the trajectory of the effective population size. The results presented here also bring novel understanding of GP as models for the evolution of a trait. Thus, the ecological principles foundation of Palacios and Minin (2013)'s prior adds to the conceptual and scientific value of these authors' inferential approach. I conclude this note by listing a series of insights brought about by this connection with Ecology. Copyright © 2017 The Author. Published by Elsevier Inc. All rights reserved.
Underwood, Kristen L.; Rizzo, Donna M.; Schroth, Andrew W.; Dewoolkar, Mandar M.
2017-12-01
Given the variable biogeochemical, physical, and hydrological processes driving fluvial sediment and nutrient export, the water science and management communities need data-driven methods to identify regions prone to production and transport under variable hydrometeorological conditions. We use Bayesian analysis to segment concentration-discharge linear regression models for total suspended solids (TSS) and particulate and dissolved phosphorus (PP, DP) using 22 years of monitoring data from 18 Lake Champlain watersheds. Bayesian inference was leveraged to estimate segmented regression model parameters and identify threshold position. The identified threshold positions demonstrated a considerable range below and above the median discharge—which has been used previously as the default breakpoint in segmented regression models to discern differences between pre and post-threshold export regimes. We then applied a Self-Organizing Map (SOM), which partitioned the watersheds into clusters of TSS, PP, and DP export regimes using watershed characteristics, as well as Bayesian regression intercepts and slopes. A SOM defined two clusters of high-flux basins, one where PP flux was predominantly episodic and hydrologically driven; and another in which the sediment and nutrient sourcing and mobilization were more bimodal, resulting from both hydrologic processes at post-threshold discharges and reactive processes (e.g., nutrient cycling or lateral/vertical exchanges of fine sediment) at prethreshold discharges. A separate DP SOM defined two high-flux clusters exhibiting a bimodal concentration-discharge response, but driven by differing land use. Our novel framework shows promise as a tool with broad management application that provides insights into landscape drivers of riverine solute and sediment export.
Bayesian inference for data assimilation using Least-Squares Finite Element methods
International Nuclear Information System (INIS)
Dwight, Richard P
2010-01-01
It has recently been observed that Least-Squares Finite Element methods (LS-FEMs) can be used to assimilate experimental data into approximations of PDEs in a natural way, as shown by Heyes et al. in the case of incompressible Navier-Stokes flow. The approach was shown to be effective without regularization terms, and can handle substantial noise in the experimental data without filtering. Of great practical importance is that - unlike other data assimilation techniques - it is not significantly more expensive than a single physical simulation. However the method as presented so far in the literature is not set in the context of an inverse problem framework, so that for example the meaning of the final result is unclear. In this paper it is shown that the method can be interpreted as finding a maximum a posteriori (MAP) estimator in a Bayesian approach to data assimilation, with normally distributed observational noise, and a Bayesian prior based on an appropriate norm of the governing equations. In this setting the method may be seen to have several desirable properties: most importantly discretization and modelling error in the simulation code does not affect the solution in limit of complete experimental information, so these errors do not have to be modelled statistically. Also the Bayesian interpretation better justifies the choice of the method, and some useful generalizations become apparent. The technique is applied to incompressible Navier-Stokes flow in a pipe with added velocity data, where its effectiveness, robustness to noise, and application to inverse problems is demonstrated.
Bayesian inference in a discrete shock model using confounded common cause data
International Nuclear Information System (INIS)
Kvam, Paul H.; Martz, Harry F.
1995-01-01
We consider redundant systems of identical components for which reliability is assessed statistically using only demand-based failures and successes. Direct assessment of system reliability can lead to gross errors in estimation if there exist external events in the working environment that cause two or more components in the system to fail in the same demand period which have not been included in the reliability model. We develop a simple Bayesian model for estimating component reliability and the corresponding probability of common cause failure in operating systems for which the data is confounded; that is, the common cause failures cannot be distinguished from multiple independent component failures in the narrative event descriptions
Bayesian inference for spatio-temporal spike-and-slab priors
DEFF Research Database (Denmark)
Andersen, Michael Riis; Vehtari, Aki; Winther, Ole
2017-01-01
a transformed Gaussian process on the spike-and-slab probabilities. An expectation propagation (EP) algorithm for posterior inference under the proposed model is derived. For large scale problems, the standard EP algorithm can be prohibitively slow. We therefore introduce three different approximation schemes...
A cost minimisation and Bayesian inference model predicts startle reflex modulation across species.
Bach, Dominik R
2015-04-07
In many species, rapid defensive reflexes are paramount to escaping acute danger. These reflexes are modulated by the state of the environment. This is exemplified in fear-potentiated startle, a more vigorous startle response during conditioned anticipation of an unrelated threatening event. Extant explanations of this phenomenon build on descriptive models of underlying psychological states, or neural processes. Yet, they fail to predict invigorated startle during reward anticipation and instructed attention, and do not explain why startle reflex modulation evolved. Here, we fill this lacuna by developing a normative cost minimisation model based on Bayesian optimality principles. This model predicts the observed pattern of startle modification by rewards, punishments, instructed attention, and several other states. Moreover, the mathematical formalism furnishes predictions that can be tested experimentally. Comparing the model with existing data suggests a specific neural implementation of the underlying computations which yields close approximations to the optimal solution under most circumstances. This analysis puts startle modification into the framework of Bayesian decision theory and predictive coding, and illustrates the importance of an adaptive perspective to interpret defensive behaviour across species. Copyright © 2015 The Author. Published by Elsevier Ltd.. All rights reserved.
Directory of Open Access Journals (Sweden)
Mariana Inés Pocovi
2015-06-01
Full Text Available Understanding the population structure and genetic diversity in sugarcane (Saccharum officinarum L. accessions from INTA germplasm bank (Argentina will be of great importance for germplasm collection and breeding improvement as it will identify diverse parental combinations to create segregating progenies with maximum genetic variability for further selection. A Bayesian approach, ordination methods (PCoA, Principal Coordinate Analysis and clustering analysis (UPGMA, Unweighted Pair Group Method with Arithmetic Mean were applied to this purpose. Sixty three INTA sugarcane hybrids were genotyped for 107 Simple Sequence Repeat (SSR and 136 Amplified Fragment Length Polymorphism (AFLP loci. Given the low probability values found with AFLP for individual assignment (4.7%, microsatellites seemed to perform better (54% for STRUCTURE analysis that revealed the germplasm to exist in five optimum groups with partly corresponding to their origin. However clusters shown high degree of admixture, F ST values confirmed the existence of differences among groups. Dissimilarity coefficients ranged from 0.079 to 0.651. PCoA separated sugarcane in groups that did not agree with those identified by STRUCTURE. The clustering including all genotypes neither showed resemblance to populations find by STRUCTURE, but clustering performed considering only individuals displaying a proportional membership > 0.6 in their primary population obtained with STRUCTURE showed close similarities. The Bayesian method indubitably brought more information on cultivar origins than classical PCoA and hierarchical clustering method.
Bayesian inference in mass flow simulations - from back calculation to prediction
Kofler, Andreas; Fischer, Jan-Thomas; Hellweger, Valentin; Huber, Andreas; Mergili, Martin; Pudasaini, Shiva; Fellin, Wolfgang; Oberguggenberger, Michael
2017-04-01
Mass flow simulations are an integral part of hazard assessment. Determining the hazard potential requires a multidisciplinary approach, including different scientific fields such as geomorphology, meteorology, physics, civil engineering and mathematics. An important task in snow avalanche simulation is to predict process intensities (runout, flow velocity and depth, ...). The application of probabilistic methods allows one to develop a comprehensive simulation concept, ranging from back to forward calculation and finally to prediction of mass flow events. In this context optimized parameter sets for the used simulation model or intensities of the modeled mass flow process (e.g. runout distances) are represented by probability distributions. Existing deterministic flow models, in particular with respect to snow avalanche dynamics, contain several parameters (e.g. friction). Some of these parameters are more conceptual than physical and their direct measurement in the field is hardly possible. Hence, parameters have to be optimized by matching simulation results to field observations. This inverse problem can be solved by a Bayesian approach (Markov chain Monte Carlo). The optimization process yields parameter distributions, that can be utilized for probabilistic reconstruction and prediction of avalanche events. Arising challenges include the limited amount of observations, correlations appearing in model parameters or observed avalanche characteristics (e.g. velocity and runout) and the accurate handling of ensemble simulations, always taking into account the related uncertainties. Here we present an operational Bayesian simulation framework with r.avaflow, the open source GIS simulation model for granular avalanches and debris flows.
Bayesian inference in genetic parameter estimation of visual scores in Nellore beef-cattle
2009-01-01
The aim of this study was to estimate the components of variance and genetic parameters for the visual scores which constitute the Morphological Evaluation System (MES), such as body structure (S), precocity (P) and musculature (M) in Nellore beef-cattle at the weaning and yearling stages, by using threshold Bayesian models. The information used for this was gleaned from visual scores of 5,407 animals evaluated at the weaning and 2,649 at the yearling stages. The genetic parameters for visual score traits were estimated through two-trait analysis, using the threshold animal model, with Bayesian statistics methodology and MTGSAM (Multiple Trait Gibbs Sampler for Animal Models) threshold software. Heritability estimates for S, P and M were 0.68, 0.65 and 0.62 (at weaning) and 0.44, 0.38 and 0.32 (at the yearling stage), respectively. Heritability estimates for S, P and M were found to be high, and so it is expected that these traits should respond favorably to direct selection. The visual scores evaluated at the weaning and yearling stages might be used in the composition of new selection indexes, as they presented sufficient genetic variability to promote genetic progress in such morphological traits. PMID:21637450
Energy Technology Data Exchange (ETDEWEB)
Marzouk, Youssef; Fast P. (Lawrence Livermore National Laboratory, Livermore, CA); Kraus, M. (Peterson AFB, CO); Ray, J. P.
2006-01-01
Terrorist attacks using an aerosolized pathogen preparation have gained credibility as a national security concern after the anthrax attacks of 2001. The ability to characterize such attacks, i.e., to estimate the number of people infected, the time of infection, and the average dose received, is important when planning a medical response. We address this question of characterization by formulating a Bayesian inverse problem predicated on a short time-series of diagnosed patients exhibiting symptoms. To be of relevance to response planning, we limit ourselves to 3-5 days of data. In tests performed with anthrax as the pathogen, we find that these data are usually sufficient, especially if the model of the outbreak used in the inverse problem is an accurate one. In some cases the scarcity of data may initially support outbreak characterizations at odds with the true one, but with sufficient data the correct inferences are recovered; in other words, the inverse problem posed and its solution methodology are consistent. We also explore the effect of model error-situations for which the model used in the inverse problem is only a partially accurate representation of the outbreak; here, the model predictions and the observations differ by more than a random noise. We find that while there is a consistent discrepancy between the inferred and the true characterizations, they are also close enough to be of relevance when planning a response.
Prudhomme, Serge
2015-01-07
The need for surrogate models and adaptive methods can be best appreciated if one is interested in parameter estimation using a Bayesian calibration procedure for validation purposes. We extend here our latest work on error decomposition and adaptive refinement for response surfaces to the development of surrogate models that can be substituted for the full models to estimate the parameters of Reynolds-averaged Navier-Stokes models. The error estimates and adaptive schemes are driven here by a quantity of interest and are thus based on the approximation of an adjoint problem. We will focus in particular to the accurate estimation of evidences to facilitate model selection. The methodology will be illustrated on the Spalart-Allmaras RANS model for turbulence simulation.
Galliano, Frédéric
2018-05-01
This article presents a new dust spectral energy distribution (SED) model, named HerBIE, aimed at eliminating the noise-induced correlations and large scatter obtained when performing least-squares fits. The originality of this code is to apply the hierarchical Bayesian approach to full dust models, including realistic optical properties, stochastic heating, and the mixing of physical conditions in the observed regions. We test the performances of our model by applying it to synthetic observations. We explore the impact on the recovered parameters of several effects: signal-to-noise ratio, SED shape, sample size, the presence of intrinsic correlations, the wavelength coverage, and the use of different SED model components. We show that this method is very efficient: the recovered parameters are consistently distributed around their true values. We do not find any clear bias, even for the most degenerate parameters, or with extreme signal-to-noise ratios.
Prudhomme, Serge
2015-01-01
The need for surrogate models and adaptive methods can be best appreciated if one is interested in parameter estimation using a Bayesian calibration procedure for validation purposes. We extend here our latest work on error decomposition and adaptive refinement for response surfaces to the development of surrogate models that can be substituted for the full models to estimate the parameters of Reynolds-averaged Navier-Stokes models. The error estimates and adaptive schemes are driven here by a quantity of interest and are thus based on the approximation of an adjoint problem. We will focus in particular to the accurate estimation of evidences to facilitate model selection. The methodology will be illustrated on the Spalart-Allmaras RANS model for turbulence simulation.
Partial inversion of elliptic operator to speed up computation of likelihood in Bayesian inference
Litvinenko, Alexander
2017-08-09
In this paper, we speed up the solution of inverse problems in Bayesian settings. By computing the likelihood, the most expensive part of the Bayesian formula, one compares the available measurement data with the simulated data. To get simulated data, repeated solution of the forward problem is required. This could be a great challenge. Often, the available measurement is a functional $F(u)$ of the solution $u$ or a small part of $u$. Typical examples of $F(u)$ are the solution in a point, solution on a coarser grid, in a small subdomain, the mean value in a subdomain. It is a waste of computational resources to evaluate, first, the whole solution and then compute a part of it. In this work, we compute the functional $F(u)$ direct, without computing the full inverse operator and without computing the whole solution $u$. The main ingredients of the developed approach are the hierarchical domain decomposition technique, the finite element method and the Schur complements. To speed up computations and to reduce the storage cost, we approximate the forward operator and the Schur complement in the hierarchical matrix format. Applying the hierarchical matrix technique, we reduced the computing cost to $\\\\mathcal{O}(k^2n \\\\log^2 n)$, where $k\\\\ll n$ and $n$ is the number of degrees of freedom. Up to the $\\\\H$-matrix accuracy, the computation of the functional $F(u)$ is exact. To reduce the computational resources further, we can approximate $F(u)$ on, for instance, multiple coarse meshes. The offered method is well suited for solving multiscale problems. A disadvantage of this method is the assumption that one has to have access to the discretisation and to the procedure of assembling the Galerkin matrix.
Fukuda, J.; Johnson, K. M.
2009-12-01
Studies utilizing inversions of geodetic data for the spatial distribution of coseismic slip on faults typically present the result as a single fault plane and slip distribution. Commonly the geometry of the fault plane is assumed to be known a priori and the data are inverted for slip. However, sometimes there is not strong a priori information on the geometry of the fault that produced the earthquake and the data is not always strong enough to completely resolve the fault geometry. We develop a method to solve for the full posterior probability distribution of fault slip and fault geometry parameters in a Bayesian framework using Monte Carlo methods. The slip inversion problem is particularly challenging because it often involves multiple data sets with unknown relative weights (e.g. InSAR, GPS), model parameters that are related linearly (slip) and nonlinearly (fault geometry) through the theoretical model to surface observations, prior information on model parameters, and a regularization prior to stabilize the inversion. We present the theoretical framework and solution method for a Bayesian inversion that can handle all of these aspects of the problem. The method handles the mixed linear/nonlinear nature of the problem through combination of both analytical least-squares solutions and Monte Carlo methods. We first illustrate and validate the inversion scheme using synthetic data sets. We then apply the method to inversion of geodetic data from the 2003 M6.6 San Simeon, California earthquake. We show that the uncertainty in strike and dip of the fault plane is over 20 degrees. We characterize the uncertainty in the slip estimate with a volume around the mean fault solution in which the slip most likely occurred. Slip likely occurred somewhere in a volume that extends 5-10 km in either direction normal to the fault plane. We implement slip inversions with both traditional, kinematic smoothing constraints on slip and a simple physical condition of uniform stress
Wang, L.; Davis, J. L.; Tamisiea, M. E.
2017-12-01
The Antarctic ice sheet (AIS) holds about 60% of all fresh water on the Earth, an amount equivalent to about 58 m of sea-level rise. Observation of AIS mass change is thus essential in determining and predicting its contribution to sea level. While the ice mass loss estimates for West Antarctica (WA) and the Antarctic Peninsula (AP) are in good agreement, what the mass balance over East Antarctica (EA) is, and whether or not it compensates for the mass loss is under debate. Besides the different error sources and sensitivities of different measurement types, complex spatial and temporal variabilities would be another factor complicating the accurate estimation of the AIS mass balance. Therefore, a model that allows for variabilities in both melting rate and seasonal signals would seem appropriate in the estimation of present-day AIS melting. We present a stochastic filter technique, which enables the Bayesian separation of the systematic stripe noise and mass signal in decade-length GRACE monthly gravity series, and allows the estimation of time-variable seasonal and inter-annual components in the signals. One of the primary advantages of this Bayesian method is that it yields statistically rigorous uncertainty estimates reflecting the inherent spatial resolution of the data. By applying the stochastic filter to the decade-long GRACE observations, we present the temporal variabilities of the AIS mass balance at basin scale, particularly over East Antarctica, and decipher the EA mass variations in the past decade, and their role in affecting overall AIS mass balance and sea level.
Exploring the Connection Between Sampling Problems in Bayesian Inference and Statistical Mechanics
Pohorille, Andrew
2006-01-01
The Bayesian and statistical mechanical communities often share the same objective in their work - estimating and integrating probability distribution functions (pdfs) describing stochastic systems, models or processes. Frequently, these pdfs are complex functions of random variables exhibiting multiple, well separated local minima. Conventional strategies for sampling such pdfs are inefficient, sometimes leading to an apparent non-ergodic behavior. Several recently developed techniques for handling this problem have been successfully applied in statistical mechanics. In the multicanonical and Wang-Landau Monte Carlo (MC) methods, the correct pdfs are recovered from uniform sampling of the parameter space by iteratively establishing proper weighting factors connecting these distributions. Trivial generalizations allow for sampling from any chosen pdf. The closely related transition matrix method relies on estimating transition probabilities between different states. All these methods proved to generate estimates of pdfs with high statistical accuracy. In another MC technique, parallel tempering, several random walks, each corresponding to a different value of a parameter (e.g. "temperature"), are generated and occasionally exchanged using the Metropolis criterion. This method can be considered as a statistically correct version of simulated annealing. An alternative approach is to represent the set of independent variables as a Hamiltonian system. Considerab!e progress has been made in understanding how to ensure that the system obeys the equipartition theorem or, equivalently, that coupling between the variables is correctly described. Then a host of techniques developed for dynamical systems can be used. Among them, probably the most powerful is the Adaptive Biasing Force method, in which thermodynamic integration and biased sampling are combined to yield very efficient estimates of pdfs. The third class of methods deals with transitions between states described
Bayesian inference in an item response theory model with a generalized student t link function
Azevedo, Caio L. N.; Migon, Helio S.
2012-10-01
In this paper we introduce a new item response theory (IRT) model with a generalized Student t-link function with unknown degrees of freedom (df), named generalized t-link (GtL) IRT model. In this model we consider only the difficulty parameter in the item response function. GtL is an alternative to the two parameter logit and probit models, since the degrees of freedom (df) play a similar role to the discrimination parameter. However, the behavior of the curves of the GtL is different from those of the two parameter models and the usual Student t link, since in GtL the curve obtained from different df's can cross the probit curves in more than one latent trait level. The GtL model has similar proprieties to the generalized linear mixed models, such as the existence of sufficient statistics and easy parameter interpretation. Also, many techniques of parameter estimation, model fit assessment and residual analysis developed for that models can be used for the GtL model. We develop fully Bayesian estimation and model fit assessment tools through a Metropolis-Hastings step within Gibbs sampling algorithm. We consider a prior sensitivity choice concerning the degrees of freedom. The simulation study indicates that the algorithm recovers all parameters properly. In addition, some Bayesian model fit assessment tools are considered. Finally, a real data set is analyzed using our approach and other usual models. The results indicate that our model fits the data better than the two parameter models.
Harris Recurrence and MCMC: A Simplified Approach
DEFF Research Database (Denmark)
Asmussen, Søren; Glynn, Peter W.
A key result underlying the theory of MCMC is that any η-irreducible Markov chain having a transition density with respect to η and possessing a stationary distribution is automatically positive Harris recurrent. This paper provides a short self-contained proof of this fact.......A key result underlying the theory of MCMC is that any η-irreducible Markov chain having a transition density with respect to η and possessing a stationary distribution is automatically positive Harris recurrent. This paper provides a short self-contained proof of this fact....
Niwayama, Ritsuya; Nagao, Hiromichi; Kitajima, Tomoya S; Hufnagel, Lars; Shinohara, Kyosuke; Higuchi, Tomoyuki; Ishikawa, Takuji; Kimura, Akatsuki
2016-01-01
Cellular structures are hydrodynamically interconnected, such that force generation in one location can move distal structures. One example of this phenomenon is cytoplasmic streaming, whereby active forces at the cell cortex induce streaming of the entire cytoplasm. However, it is not known how the spatial distribution and magnitude of these forces move distant objects within the cell. To address this issue, we developed a computational method that used cytoplasm hydrodynamics to infer the spatial distribution of shear stress at the cell cortex induced by active force generators from experimentally obtained flow field of cytoplasmic streaming. By applying this method, we determined the shear-stress distribution that quantitatively reproduces in vivo flow fields in Caenorhabditis elegans embryos and mouse oocytes during meiosis II. Shear stress in mouse oocytes were predicted to localize to a narrower cortical region than that with a high cortical flow velocity and corresponded with the localization of the cortical actin cap. The predicted patterns of pressure gradient in both species were consistent with species-specific cytoplasmic streaming functions. The shear-stress distribution inferred by our method can contribute to the characterization of active force generation driving biological streaming.
Yildiz, Izzet B; von Kriegstein, Katharina; Kiebel, Stefan J
2013-01-01
Our knowledge about the computational mechanisms underlying human learning and recognition of sound sequences, especially speech, is still very limited. One difficulty in deciphering the exact means by which humans recognize speech is that there are scarce experimental findings at a neuronal, microscopic level. Here, we show that our neuronal-computational understanding of speech learning and recognition may be vastly improved by looking at an animal model, i.e., the songbird, which faces the same challenge as humans: to learn and decode complex auditory input, in an online fashion. Motivated by striking similarities between the human and songbird neural recognition systems at the macroscopic level, we assumed that the human brain uses the same computational principles at a microscopic level and translated a birdsong model into a novel human sound learning and recognition model with an emphasis on speech. We show that the resulting Bayesian model with a hierarchy of nonlinear dynamical systems can learn speech samples such as words rapidly and recognize them robustly, even in adverse conditions. In addition, we show that recognition can be performed even when words are spoken by different speakers and with different accents-an everyday situation in which current state-of-the-art speech recognition models often fail. The model can also be used to qualitatively explain behavioral data on human speech learning and derive predictions for future experiments.
Directory of Open Access Journals (Sweden)
Izzet B Yildiz
Full Text Available Our knowledge about the computational mechanisms underlying human learning and recognition of sound sequences, especially speech, is still very limited. One difficulty in deciphering the exact means by which humans recognize speech is that there are scarce experimental findings at a neuronal, microscopic level. Here, we show that our neuronal-computational understanding of speech learning and recognition may be vastly improved by looking at an animal model, i.e., the songbird, which faces the same challenge as humans: to learn and decode complex auditory input, in an online fashion. Motivated by striking similarities between the human and songbird neural recognition systems at the macroscopic level, we assumed that the human brain uses the same computational principles at a microscopic level and translated a birdsong model into a novel human sound learning and recognition model with an emphasis on speech. We show that the resulting Bayesian model with a hierarchy of nonlinear dynamical systems can learn speech samples such as words rapidly and recognize them robustly, even in adverse conditions. In addition, we show that recognition can be performed even when words are spoken by different speakers and with different accents-an everyday situation in which current state-of-the-art speech recognition models often fail. The model can also be used to qualitatively explain behavioral data on human speech learning and derive predictions for future experiments.
Systematic validation of non-equilibrium thermochemical models using Bayesian inference
Miki, Kenji
2015-10-01
© 2015 Elsevier Inc. The validation process proposed by Babuška et al. [1] is applied to thermochemical models describing post-shock flow conditions. In this validation approach, experimental data is involved only in the calibration of the models, and the decision process is based on quantities of interest (QoIs) predicted on scenarios that are not necessarily amenable experimentally. Moreover, uncertainties present in the experimental data, as well as those resulting from an incomplete physical model description, are propagated to the QoIs. We investigate four commonly used thermochemical models: a one-temperature model (which assumes thermal equilibrium among all inner modes), and two-temperature models developed by Macheret et al. [2], Marrone and Treanor [3], and Park [4]. Up to 16 uncertain parameters are estimated using Bayesian updating based on the latest absolute volumetric radiance data collected at the Electric Arc Shock Tube (EAST) installed inside the NASA Ames Research Center. Following the solution of the inverse problems, the forward problems are solved in order to predict the radiative heat flux, QoI, and examine the validity of these models. Our results show that all four models are invalid, but for different reasons: the one-temperature model simply fails to reproduce the data while the two-temperature models exhibit unacceptably large uncertainties in the QoI predictions.
Park, Taeyoung; Jeong, Jong-Hyeon; Lee, Jae Won
2012-08-15
There is often an interest in estimating a residual life function as a summary measure of survival data. For ease in presentation of the potential therapeutic effect of a new drug, investigators may summarize survival data in terms of the remaining life years of patients. Under heavy right censoring, however, some reasonably high quantiles (e.g., median) of a residual lifetime distribution cannot be always estimated via a popular nonparametric approach on the basis of the Kaplan-Meier estimator. To overcome the difficulties in dealing with heavily censored survival data, this paper develops a Bayesian nonparametric approach that takes advantage of a fully model-based but highly flexible probabilistic framework. We use a Dirichlet process mixture of Weibull distributions to avoid strong parametric assumptions on the unknown failure time distribution, making it possible to estimate any quantile residual life function under heavy censoring. Posterior computation through Markov chain Monte Carlo is straightforward and efficient because of conjugacy properties and partial collapse. We illustrate the proposed methods by using both simulated data and heavily censored survival data from a recent breast cancer clinical trial conducted by the National Surgical Adjuvant Breast and Bowel Project. Copyright © 2012 John Wiley & Sons, Ltd.
A framework for Bayesian nonparametric inference for causal effects of mediation.
Kim, Chanmin; Daniels, Michael J; Marcus, Bess H; Roy, Jason A
2017-06-01
We propose a Bayesian non-parametric (BNP) framework for estimating causal effects of mediation, the natural direct, and indirect, effects. The strategy is to do this in two parts. Part 1 is a flexible model (using BNP) for the observed data distribution. Part 2 is a set of uncheckable assumptions with sensitivity parameters that in conjunction with Part 1 allows identification and estimation of the causal parameters and allows for uncertainty about these assumptions via priors on the sensitivity parameters. For Part 1, we specify a Dirichlet process mixture of multivariate normals as a prior on the joint distribution of the outcome, mediator, and covariates. This approach allows us to obtain a (simple) closed form of each marginal distribution. For Part 2, we consider two sets of assumptions: (a) the standard sequential ignorability (Imai et al., 2010) and (b) weakened set of the conditional independence type assumptions introduced in Daniels et al. (2012) and propose sensitivity analyses for both. We use this approach to assess mediation in a physical activity promotion trial. © 2016, The International Biometric Society.
Baur, Brittany; Bozdag, Serdar
2015-04-01
One of the challenging and important computational problems in systems biology is to infer gene regulatory networks (GRNs) of biological systems. Several methods that exploit gene expression data have been developed to tackle this problem. In this study, we propose the use of copy number and DNA methylation data to infer GRNs. We developed an algorithm that scores regulatory interactions between genes based on canonical correlation analysis. In this algorithm, copy number or DNA methylation variables are treated as potential regulator variables, and expression variables are treated as potential target variables. We first validated that the canonical correlation analysis method is able to infer true interactions in high accuracy. We showed that the use of DNA methylation or copy number datasets leads to improved inference over steady-state expression. Our results also showed that epigenetic and structural information could be used to infer directionality of regulatory interactions. Additional improvements in GRN inference can be gleaned from incorporating the result in an informative prior in a dynamic Bayesian algorithm. This is the first study that incorporates copy number and DNA methylation into an informative prior in dynamic Bayesian framework. By closely examining top-scoring interactions with different sources of epigenetic or structural information, we also identified potential novel regulatory interactions.
Rapid molecular evolution of human bocavirus revealed by Bayesian coalescent inference.
Zehender, Gianguglielmo; De Maddalena, Chiara; Canuti, Marta; Zappa, Alessandra; Amendola, Antonella; Lai, Alessia; Galli, Massimo; Tanzi, Elisabetta
2010-03-01
Human bocavirus (HBoV) is a linear single-stranded DNA virus belonging to the Parvoviridae family that has recently been isolated from the upper respiratory tract of children with acute respiratory infection. All of the strains observed so far segregate into two genotypes (1 and 2) with a low level of polymorphism. Given the recent description of the infection and the lack of epidemiological and molecular data, we estimated the virus's rates of molecular evolution and population dynamics. A dataset of forty-nine dated VP2 sequences, including also eight new isolates obtained from pharyngeal swabs of Italian patients with acute respiratory tract infections, was submitted to phylogenetic analysis. The model parameters, evolutionary rates and population dynamics were co-estimated using a Bayesian Markov Chain Monte Carlo approach, and site-specific positive and negative selection was also investigated. Recombination was investigated by seven different methods and one suspected recombinant strain was excluded from further analysis. The estimated mean evolutionary rate of HBoV was 8.6x10(-4)subs/site/year, and that of the 1st+2nd codon positions was more than 15 times less than that of the 3rd codon position. Viral population dynamics analysis revealed that the two known genotypes diverged recently (mean tMRCA: 24 years), and that the epidemic due to HBoV genotype 2 grew exponentially at a rate of 1.01year(-1). Selection analysis of the partial VP2 showed that 8.5% of sites were under significant negative pressure and the absence of positive selection. Our results show that, like other parvoviruses, HBoV is characterised by a rapid evolution. The low level of polymorphism is probably due to a relatively recent divergence between the circulating genotypes and strong purifying selection acting on viral antigens.
Directory of Open Access Journals (Sweden)
Pablo Fresia
Full Text Available Insect pest phylogeography might be shaped both by biogeographic events and by human influence. Here, we conducted an approximate Bayesian computation (ABC analysis to investigate the phylogeography of the New World screwworm fly, Cochliomyia hominivorax, with the aim of understanding its population history and its order and time of divergence. Our ABC analysis supports that populations spread from North to South in the Americas, in at least two different moments. The first split occurred between the North/Central American and South American populations in the end of the Last Glacial Maximum (15,300-19,000 YBP. The second split occurred between the North and South Amazonian populations in the transition between the Pleistocene and the Holocene eras (9,100-11,000 YBP. The species also experienced population expansion. Phylogenetic analysis likewise suggests this north to south colonization and Maxent models suggest an increase in the number of suitable areas in South America from the past to present. We found that the phylogeographic patterns observed in C. hominivorax cannot be explained only by climatic oscillations and can be connected to host population histories. Interestingly we found these patterns are very coincident with general patterns of ancient human movements in the Americas, suggesting that humans might have played a crucial role in shaping the distribution and population structure of this insect pest. This work presents the first hypothesis test regarding the processes that shaped the current phylogeographic structure of C. hominivorax and represents an alternate perspective on investigating the problem of insect pests.
Final Report: Large-Scale Optimization for Bayesian Inference in Complex Systems
Energy Technology Data Exchange (ETDEWEB)
Ghattas, Omar [The University of Texas at Austin
2013-10-15
The SAGUARO (Scalable Algorithms for Groundwater Uncertainty Analysis and Robust Optimiza- tion) Project focuses on the development of scalable numerical algorithms for large-scale Bayesian inversion in complex systems that capitalize on advances in large-scale simulation-based optimiza- tion and inversion methods. Our research is directed in three complementary areas: efficient approximations of the Hessian operator, reductions in complexity of forward simulations via stochastic spectral approximations and model reduction, and employing large-scale optimization concepts to accelerate sampling. Our efforts are integrated in the context of a challenging testbed problem that considers subsurface reacting flow and transport. The MIT component of the SAGUARO Project addresses the intractability of conventional sampling methods for large-scale statistical inverse problems by devising reduced-order models that are faithful to the full-order model over a wide range of parameter values; sampling then employs the reduced model rather than the full model, resulting in very large computational savings. Results indicate little effect on the computed posterior distribution. On the other hand, in the Texas-Georgia Tech component of the project, we retain the full-order model, but exploit inverse problem structure (adjoint-based gradients and partial Hessian information of the parameter-to- observation map) to implicitly extract lower dimensional information on the posterior distribution; this greatly speeds up sampling methods, so that fewer sampling points are needed. We can think of these two approaches as "reduce then sample" and "sample then reduce." In fact, these two approaches are complementary, and can be used in conjunction with each other. Moreover, they both exploit deterministic inverse problem structure, in the form of adjoint-based gradient and Hessian information of the underlying parameter-to-observation map, to achieve their speedups.
Comparison of sampling techniques for Bayesian parameter estimation
Allison, Rupert; Dunkley, Joanna
2014-02-01
The posterior probability distribution for a set of model parameters encodes all that the data have to tell us in the context of a given model; it is the fundamental quantity for Bayesian parameter estimation. In order to infer the posterior probability distribution we have to decide how to explore parameter space. Here we compare three prescriptions for how parameter space is navigated, discussing their relative merits. We consider Metropolis-Hasting sampling, nested sampling and affine-invariant ensemble Markov chain Monte Carlo (MCMC) sampling. We focus on their performance on toy-model Gaussian likelihoods and on a real-world cosmological data set. We outline the sampling algorithms themselves and elaborate on performance diagnostics such as convergence time, scope for parallelization, dimensional scaling, requisite tunings and suitability for non-Gaussian distributions. We find that nested sampling delivers high-fidelity estimates for posterior statistics at low computational cost, and should be adopted in favour of Metropolis-Hastings in many cases. Affine-invariant MCMC is competitive when computing clusters can be utilized for massive parallelization. Affine-invariant MCMC and existing extensions to nested sampling naturally probe multimodal and curving distributions.
Bayesian median regression for temporal gene expression data
Yu, Keming; Vinciotti, Veronica; Liu, Xiaohui; 't Hoen, Peter A. C.
2007-09-01
Most of the existing methods for the identification of biologically interesting genes in a temporal expression profiling dataset do not fully exploit the temporal ordering in the dataset and are based on normality assumptions for the gene expression. In this paper, we introduce a Bayesian median regression model to detect genes whose temporal profile is significantly different across a number of biological conditions. The regression model is defined by a polynomial function where both time and condition effects as well as interactions between the two are included. MCMC-based inference returns the posterior distribution of the polynomial coefficients. From this a simple Bayes factor test is proposed to test for significance. The estimation of the median rather than the mean, and within a Bayesian framework, increases the robustness of the method compared to a Hotelling T2-test previously suggested. This is shown on simulated data and on muscular dystrophy gene expression data.
Directory of Open Access Journals (Sweden)
Mario A Pardo
Full Text Available We inferred the population densities of blue whales (Balaenoptera musculus and short-beaked common dolphins (Delphinus delphis in the Northeast Pacific Ocean as functions of the water-column's physical structure by implementing hierarchical models in a Bayesian framework. This approach allowed us to propagate the uncertainty of the field observations into the inference of species-habitat relationships and to generate spatially explicit population density predictions with reduced effects of sampling heterogeneity. Our hypothesis was that the large-scale spatial distributions of these two cetacean species respond primarily to ecological processes resulting from shoaling and outcropping of the pycnocline in regions of wind-forced upwelling and eddy-like circulation. Physically, these processes affect the thermodynamic balance of the water column, decreasing its volume and thus the height of the absolute dynamic topography (ADT. Biologically, they lead to elevated primary productivity and persistent aggregation of low-trophic-level prey. Unlike other remotely sensed variables, ADT provides information about the structure of the entire water column and it is also routinely measured at high spatial-temporal resolution by satellite altimeters with uniform global coverage. Our models provide spatially explicit population density predictions for both species, even in areas where the pycnocline shoals but does not outcrop (e.g. the Costa Rica Dome and the North Equatorial Countercurrent thermocline ridge. Interannual variations in distribution during El Niño anomalies suggest that the population density of both species decreases dramatically in the Equatorial Cold Tongue and the Costa Rica Dome, and that their distributions retract to particular areas that remain productive, such as the more oceanic waters in the central California Current System, the northern Gulf of California, the North Equatorial Countercurrent thermocline ridge, and the more
Energy Technology Data Exchange (ETDEWEB)
Weyant, Anja; Wood-Vasey, W. Michael [Pittsburgh Particle Physics, Astrophysics, and Cosmology Center (PITT PACC), Physics and Astronomy Department, University of Pittsburgh, Pittsburgh, PA 15260 (United States); Schafer, Chad, E-mail: anw19@pitt.edu [Department of Statistics, Carnegie Mellon University, Pittsburgh, PA 15213 (United States)
2013-02-20
Cosmological inference becomes increasingly difficult when complex data-generating processes cannot be modeled by simple probability distributions. With the ever-increasing size of data sets in cosmology, there is an increasing burden placed on adequate modeling; systematic errors in the model will dominate where previously these were swamped by statistical errors. For example, Gaussian distributions are an insufficient representation for errors in quantities like photometric redshifts. Likewise, it can be difficult to quantify analytically the distribution of errors that are introduced in complex fitting codes. Without a simple form for these distributions, it becomes difficult to accurately construct a likelihood function for the data as a function of parameters of interest. Approximate Bayesian computation (ABC) provides a means of probing the posterior distribution when direct calculation of a sufficiently accurate likelihood is intractable. ABC allows one to bypass direct calculation of the likelihood but instead relies upon the ability to simulate the forward process that generated the data. These simulations can naturally incorporate priors placed on nuisance parameters, and hence these can be marginalized in a natural way. We present and discuss ABC methods in the context of supernova cosmology using data from the SDSS-II Supernova Survey. Assuming a flat cosmology and constant dark energy equation of state, we demonstrate that ABC can recover an accurate posterior distribution. Finally, we show that ABC can still produce an accurate posterior distribution when we contaminate the sample with Type IIP supernovae.
International Nuclear Information System (INIS)
Weyant, Anja; Wood-Vasey, W. Michael; Schafer, Chad
2013-01-01
Cosmological inference becomes increasingly difficult when complex data-generating processes cannot be modeled by simple probability distributions. With the ever-increasing size of data sets in cosmology, there is an increasing burden placed on adequate modeling; systematic errors in the model will dominate where previously these were swamped by statistical errors. For example, Gaussian distributions are an insufficient representation for errors in quantities like photometric redshifts. Likewise, it can be difficult to quantify analytically the distribution of errors that are introduced in complex fitting codes. Without a simple form for these distributions, it becomes difficult to accurately construct a likelihood function for the data as a function of parameters of interest. Approximate Bayesian computation (ABC) provides a means of probing the posterior distribution when direct calculation of a sufficiently accurate likelihood is intractable. ABC allows one to bypass direct calculation of the likelihood but instead relies upon the ability to simulate the forward process that generated the data. These simulations can naturally incorporate priors placed on nuisance parameters, and hence these can be marginalized in a natural way. We present and discuss ABC methods in the context of supernova cosmology using data from the SDSS-II Supernova Survey. Assuming a flat cosmology and constant dark energy equation of state, we demonstrate that ABC can recover an accurate posterior distribution. Finally, we show that ABC can still produce an accurate posterior distribution when we contaminate the sample with Type IIP supernovae.
International Nuclear Information System (INIS)
Karras, D A; Mertzios, G B
2009-01-01
A novel approach is presented in this paper for improving anisotropic diffusion PDE models, based on the Perona–Malik equation. A solution is proposed from an engineering perspective to adaptively estimate the parameters of the regularizing function in this equation. The goal of such a new adaptive diffusion scheme is to better preserve edges when the anisotropic diffusion PDE models are applied to image enhancement tasks. The proposed adaptive parameter estimation in the anisotropic diffusion PDE model involves self-organizing maps and Bayesian inference to define edge probabilities accurately. The proposed modifications attempt to capture not only simple edges but also difficult textural edges and incorporate their probability in the anisotropic diffusion model. In the context of the application of PDE models to image processing such adaptive schemes are closely related to the discrete image representation problem and the investigation of more suitable discretization algorithms using constraints derived from image processing theory. The proposed adaptive anisotropic diffusion model illustrates these concepts when it is numerically approximated by various discretization schemes in a database of magnetic resonance images (MRI), where it is shown to be efficient in image filtering and restoration applications
Xing, Junliang; Ai, Haizhou; Liu, Liwei; Lao, Shihong
2011-06-01
Multiple object tracking (MOT) is a very challenging task yet of fundamental importance for many practical applications. In this paper, we focus on the problem of tracking multiple players in sports video which is even more difficult due to the abrupt movements of players and their complex interactions. To handle the difficulties in this problem, we present a new MOT algorithm which contributes both in the observation modeling level and in the tracking strategy level. For the observation modeling, we develop a progressive observation modeling process that is able to provide strong tracking observations and greatly facilitate the tracking task. For the tracking strategy, we propose a dual-mode two-way Bayesian inference approach which dynamically switches between an offline general model and an online dedicated model to deal with single isolated object tracking and multiple occluded object tracking integrally by forward filtering and backward smoothing. Extensive experiments on different kinds of sports videos, including football, basketball, as well as hockey, demonstrate the effectiveness and efficiency of the proposed method.
Höhna, Sebastian; May, Michael R; Moore, Brian R
2016-03-01
Many fundamental questions in evolutionary biology entail estimating rates of lineage diversification (speciation-extinction) that are modeled using birth-death branching processes. We leverage recent advances in branching-process theory to develop a flexible Bayesian framework for specifying diversification models-where rates are constant, vary continuously, or change episodically through time-and implement numerical methods to estimate parameters of these models from molecular phylogenies, even when species sampling is incomplete. We enable both statistical inference and efficient simulation under these models. We also provide robust methods for comparing the relative and absolute fit of competing branching-process models to a given tree, thereby providing rigorous tests of biological hypotheses regarding patterns and processes of lineage diversification. The source code for TESS is freely available at http://cran.r-project.org/web/packages/TESS/ CONTACT: Sebastian.Hoehna@gmail.com. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
DeLannoy, Gabrielle J. M.; Reichle, Rolf H.; Vrugt, Jasper A.
2013-01-01
Uncertainties in L-band (1.4 GHz) radiative transfer modeling (RTM) affect the simulation of brightness temperatures (Tb) over land and the inversion of satellite-observed Tb into soil moisture retrievals. In particular, accurate estimates of the microwave soil roughness, vegetation opacity and scattering albedo for large-scale applications are difficult to obtain from field studies and often lack an uncertainty estimate. Here, a Markov Chain Monte Carlo (MCMC) simulation method is used to determine satellite-scale estimates of RTM parameters and their posterior uncertainty by minimizing the misfit between long-term averages and standard deviations of simulated and observed Tb at a range of incidence angles, at horizontal and vertical polarization, and for morning and evening overpasses. Tb simulations are generated with the Goddard Earth Observing System (GEOS-5) and confronted with Tb observations from the Soil Moisture Ocean Salinity (SMOS) mission. The MCMC algorithm suggests that the relative uncertainty of the RTM parameter estimates is typically less than 25 of the maximum a posteriori density (MAP) parameter value. Furthermore, the actual root-mean-square-differences in long-term Tb averages and standard deviations are found consistent with the respective estimated total simulation and observation error standard deviations of m3.1K and s2.4K. It is also shown that the MAP parameter values estimated through MCMC simulation are in close agreement with those obtained with Particle Swarm Optimization (PSO).
von Hansen, Yann; Mehlich, Alexander; Pelz, Benjamin; Rief, Matthias; Netz, Roland R
2012-09-01
The thermal fluctuations of micron-sized beads in dual trap optical tweezer experiments contain complete dynamic information about the viscoelastic properties of the embedding medium and-if present-macromolecular constructs connecting the two beads. To quantitatively interpret the spectral properties of the measured signals, a detailed understanding of the instrumental characteristics is required. To this end, we present a theoretical description of the signal processing in a typical dual trap optical tweezer experiment accounting for polarization crosstalk and instrumental noise and discuss the effect of finite statistics. To infer the unknown parameters from experimental data, a maximum likelihood method based on the statistical properties of the stochastic signals is derived. In a first step, the method can be used for calibration purposes: We propose a scheme involving three consecutive measurements (both traps empty, first one occupied and second empty, and vice versa), by which all instrumental and physical parameters of the setup are determined. We test our approach for a simple model system, namely a pair of unconnected, but hydrodynamically interacting spheres. The comparison to theoretical predictions based on instantaneous as well as retarded hydrodynamics emphasizes the importance of hydrodynamic retardation effects due to vorticity diffusion in the fluid. For more complex experimental scenarios, where macromolecular constructs are tethered between the two beads, the same maximum likelihood method in conjunction with dynamic deconvolution theory will in a second step allow one to determine the viscoelastic properties of the tethered element connecting the two beads.
Bayesian Spatial Modelling with R-INLA
Directory of Open Access Journals (Sweden)
Finn Lindgren
2015-02-01
Full Text Available The principles behind the interface to continuous domain spatial models in the R- INLA software package for R are described. The integrated nested Laplace approximation (INLA approach proposed by Rue, Martino, and Chopin (2009 is a computationally effective alternative to MCMC for Bayesian inference. INLA is designed for latent Gaussian models, a very wide and flexible class of models ranging from (generalized linear mixed to spatial and spatio-temporal models. Combined with the stochastic partial differential equation approach (SPDE, Lindgren, Rue, and Lindstrm 2011, one can accommodate all kinds of geographically referenced data, including areal and geostatistical ones, as well as spatial point process data. The implementation interface covers stationary spatial mod- els, non-stationary spatial models, and also spatio-temporal models, and is applicable in epidemiology, ecology, environmental risk assessment, as well as general geostatistics.
Vehicle Trajectory Estimation Using Spatio-Temporal MCMC
Directory of Open Access Journals (Sweden)
Francois Bardet
2010-01-01
Full Text Available This paper presents an algorithm for modeling and tracking vehicles in video sequences within one integrated framework. Most of the solutions are based on sequential methods that make inference according to current information. In contrast, we propose a deferred logical inference method that makes a decision according to a sequence of observations, thus processing a spatio-temporal search on the whole trajectory. One of the drawbacks of deferred logical inference methods is that the solution space of hypotheses grows exponentially related to the depth of observation. Our approach takes into account both the kinematic model of the vehicle and a driver behavior model in order to reduce the space of the solutions. The resulting proposed state model explains the trajectory with only 11 parameters. The solution space is then sampled with a Markov Chain Monte Carlo (MCMC that uses a model-driven proposal distribution in order to control random walk behavior. We demonstrate our method on real video sequences from which we have ground truth provided by a RTK GPS (Real-Time Kinematic GPS. Experimental results show that the proposed algorithm outperforms a sequential inference solution (particle filter.
Bayesian Statistics: Concepts and Applications in Animal Breeding – A Review
Directory of Open Access Journals (Sweden)
Lsxmikant-Sambhaji Kokate
2011-07-01
Full Text Available Statistics uses two major approaches- conventional (or frequentist and Bayesian approach. Bayesian approach provides a complete paradigm for both statistical inference and decision making under uncertainty. Bayesian methods solve many of the difficulties faced by conventional statistical methods, and extend the applicability of statistical methods. It exploits the use of probabilistic models to formulate scientific problems. To use Bayesian statistics, there is computational difficulty and secondly, Bayesian methods require specifying prior probability distributions. Markov Chain Monte-Carlo (MCMC methods were applied to overcome the computational difficulty, and interest in Bayesian methods was renewed. In Bayesian statistics, Bayesian structural equation model (SEM is used. It provides a powerful and flexible approach for studying quantitative traits for wide spectrum problems and thus it has no operational difficulties, with the exception of some complex cases. In this method, the problems are solved at ease, and the statisticians feel it comfortable with the particular way of expressing the results and employing the software available to analyze a large variety of problems.
The Bayesian group lasso for confounded spatial data
Hefley, Trevor J.; Hooten, Mevin B.; Hanks, Ephraim M.; Russell, Robin E.; Walsh, Daniel P.
2017-01-01
Generalized linear mixed models for spatial processes are widely used in applied statistics. In many applications of the spatial generalized linear mixed model (SGLMM), the goal is to obtain inference about regression coefficients while achieving optimal predictive ability. When implementing the SGLMM, multicollinearity among covariates and the spatial random effects can make computation challenging and influence inference. We present a Bayesian group lasso prior with a single tuning parameter that can be chosen to optimize predictive ability of the SGLMM and jointly regularize the regression coefficients and spatial random effect. We implement the group lasso SGLMM using efficient Markov chain Monte Carlo (MCMC) algorithms and demonstrate how multicollinearity among covariates and the spatial random effect can be monitored as a derived quantity. To test our method, we compared several parameterizations of the SGLMM using simulated data and two examples from plant ecology and disease ecology. In all examples, problematic levels multicollinearity occurred and influenced sampling efficiency and inference. We found that the group lasso prior resulted in roughly twice the effective sample size for MCMC samples of regression coefficients and can have higher and less variable predictive accuracy based on out-of-sample data when compared to the standard SGLMM.
Sraj, Ihab; Mandli, Kyle T.; Knio, Omar; Dawson, Clint N.; Hoteit, Ibrahim
2014-01-01
Tsunami computational models are employed to explore multiple flooding scenarios and to predict water elevations. However, accurate estimation of water elevations requires accurate estimation of many model parameters including the Manning's n friction parameterization. Our objective is to develop an efficient approach for the uncertainty quantification and inference of the Manning's n coefficient which we characterize here by three different parameters set to be constant in the on-shore, near-shore and deep-water regions as defined using iso-baths. We use Polynomial Chaos (PC) to build an inexpensive surrogate for the G. eoC. law model and employ Bayesian inference to estimate and quantify uncertainties related to relevant parameters using the DART buoy data collected during the Tōhoku tsunami. The surrogate model significantly reduces the computational burden of the Markov Chain Monte-Carlo (MCMC) sampling of the Bayesian inference. The PC surrogate is also used to perform a sensitivity analysis.
Sraj, Ihab
2014-11-01
Tsunami computational models are employed to explore multiple flooding scenarios and to predict water elevations. However, accurate estimation of water elevations requires accurate estimation of many model parameters including the Manning\\'s n friction parameterization. Our objective is to develop an efficient approach for the uncertainty quantification and inference of the Manning\\'s n coefficient which we characterize here by three different parameters set to be constant in the on-shore, near-shore and deep-water regions as defined using iso-baths. We use Polynomial Chaos (PC) to build an inexpensive surrogate for the G. eoC. law model and employ Bayesian inference to estimate and quantify uncertainties related to relevant parameters using the DART buoy data collected during the Tōhoku tsunami. The surrogate model significantly reduces the computational burden of the Markov Chain Monte-Carlo (MCMC) sampling of the Bayesian inference. The PC surrogate is also used to perform a sensitivity analysis.
Strategies for MCMC computation inquantitative genetics
DEFF Research Database (Denmark)
Waagepetersen, Rasmus; Ibánēz-Escriche, Noelia; Sorensen, Daniel
another extension of the linear mixed model introducing genetic random effects influencing the log residual variances of the observations thereby producing a genetically structured variance heterogeneity. Considerable computational problems arise when abandoning the standard linear mixed model. Maximum...... the various algorithms in the context of the heterogeneous variance model. Apart from being a model of great interest in its own right, this model has proven to be a hard test for MCMC methods. We compare the performances of the different algorithms when applied to three real datasets which differ markedly...... results of applying two MCMC schemes to data sets with pig litter sizes, rabbit litter sizes, and snail weights. Some concluding remarks are given in Section 5....
DEFF Research Database (Denmark)
Choi, Sang-Hyeon; Lee, Ikjin; Jeong, Cheol-Ho
2016-01-01
Sabine absorption coefficient is a widely used one deduced from reverberation time measurements via the Sabine equation. First- and second-level Bayesian analysis are used to estimate the flow resistivity of a sound absorber and the influences of the test chambers from Sabine absorption...... coefficients measured in 13 different reverberation chambers. The first-level Bayesian analysis is more general than the second-level Bayesian analysis. Sharper posterior distribution can be acquired by the second-level Bayesian analysis than the one by the first-level Bayesian analysis because more data...... are used to set more reliable prior distribution. The estimated room’s influences by the first- and the second-level Bayesian analyses are similar to the estimated results by the mean absolute error minimization....
Directory of Open Access Journals (Sweden)
Simon Boitard
2016-03-01
Full Text Available Inferring the ancestral dynamics of effective population size is a long-standing question in population genetics, which can now be tackled much more accurately thanks to the massive genomic data available in many species. Several promising methods that take advantage of whole-genome sequences have been recently developed in this context. However, they can only be applied to rather small samples, which limits their ability to estimate recent population size history. Besides, they can be very sensitive to sequencing or phasing errors. Here we introduce a new approximate Bayesian computation approach named PopSizeABC that allows estimating the evolution of the effective population size through time, using a large sample of complete genomes. This sample is summarized using the folded allele frequency spectrum and the average zygotic linkage disequilibrium at different bins of physical distance, two classes of statistics that are widely used in population genetics and can be easily computed from unphased and unpolarized SNP data. Our approach provides accurate estimations of past population sizes, from the very first generations before present back to the expected time to the most recent common ancestor of the sample, as shown by simulations under a wide range of demographic scenarios. When applied to samples of 15 or 25 complete genomes in four cattle breeds (Angus, Fleckvieh, Holstein and Jersey, PopSizeABC revealed a series of population declines, related to historical events such as domestication or modern breed creation. We further highlight that our approach is robust to sequencing errors, provided summary statistics are computed from SNPs with common alleles.
Gogoshin, Grigoriy; Boerwinkle, Eric; Rodin, Andrei S
2017-04-01
Bayesian network (BN) reconstruction is a prototypical systems biology data analysis approach that has been successfully used to reverse engineer and model networks reflecting different layers of biological organization (ranging from genetic to epigenetic to cellular pathway to metabolomic). It is especially relevant in the context of modern (ongoing and prospective) studies that generate heterogeneous high-throughput omics datasets. However, there are both theoretical and practical obstacles to the seamless application of BN modeling to such big data, including computational inefficiency of optimal BN structure search algorithms, ambiguity in data discretization, mixing data types, imputation and validation, and, in general, limited scalability in both reconstruction and visualization of BNs. To overcome these and other obstacles, we present BNOmics, an improved algorithm and software toolkit for inferring and analyzing BNs from omics datasets. BNOmics aims at comprehensive systems biology-type data exploration, including both generating new biological hypothesis and testing and validating the existing ones. Novel aspects of the algorithm center around increasing scalability and applicability to varying data types (with different explicit and implicit distributional assumptions) within the same analysis framework. An output and visualization interface to widely available graph-rendering software is also included. Three diverse applications are detailed. BNOmics was originally developed in the context of genetic epidemiology data and is being continuously optimized to keep pace with the ever-increasing inflow of available large-scale omics datasets. As such, the software scalability and usability on the less than exotic computer hardware are a priority, as well as the applicability of the algorithm and software to the heterogeneous datasets containing many data types-single-nucleotide polymorphisms and other genetic/epigenetic/transcriptome variables, metabolite
Yau, C; Papaspiliopoulos, O; Roberts, G O; Holmes, C
2011-01-01
We consider the development of Bayesian Nonparametric methods for product partition models such as Hidden Markov Models and change point models. Our approach uses a Mixture of Dirichlet Process (MDP) model for the unknown sampling distribution (likelihood) for the observations arising in each state and a computationally efficient data augmentation scheme to aid inference. The method uses novel MCMC methodology which combines recent retrospective sampling methods with the use of slice sampler variables. The methodology is computationally efficient, both in terms of MCMC mixing properties, and robustness to the length of the time series being investigated. Moreover, the method is easy to implement requiring little or no user-interaction. We apply our methodology to the analysis of genomic copy number variation.
Introduction to Bayesian statistics
Bolstad, William M
2017-01-01
There is a strong upsurge in the use of Bayesian methods in applied statistical analysis, yet most introductory statistics texts only present frequentist methods. Bayesian statistics has many important advantages that students should learn about if they are going into fields where statistics will be used. In this Third Edition, four newly-added chapters address topics that reflect the rapid advances in the field of Bayesian staistics. The author continues to provide a Bayesian treatment of introductory statistical topics, such as scientific data gathering, discrete random variables, robust Bayesian methods, and Bayesian approaches to inferenfe cfor discrete random variables, bionomial proprotion, Poisson, normal mean, and simple linear regression. In addition, newly-developing topics in the field are presented in four new chapters: Bayesian inference with unknown mean and variance; Bayesian inference for Multivariate Normal mean vector; Bayesian inference for Multiple Linear RegressionModel; and Computati...
Karabatsos, George
2017-02-01
Most of applied statistics involves regression analysis of data. In practice, it is important to specify a regression model that has minimal assumptions which are not violated by data, to ensure that statistical inferences from the model are informative and not misleading. This paper presents a stand-alone and menu-driven software package, Bayesian Regression: Nonparametric and Parametric Models, constructed from MATLAB Compiler. Currently, this package gives the user a choice from 83 Bayesian models for data analysis. They include 47 Bayesian nonparametric (BNP) infinite-mixture regression models; 5 BNP infinite-mixture models for density estimation; and 31 normal random effects models (HLMs), including normal linear models. Each of the 78 regression models handles either a continuous, binary, or ordinal dependent variable, and can handle multi-level (grouped) data. All 83 Bayesian models can handle the analysis of weighted observations (e.g., for meta-analysis), and the analysis of left-censored, right-censored, and/or interval-censored data. Each BNP infinite-mixture model has a mixture distribution assigned one of various BNP prior distributions, including priors defined by either the Dirichlet process, Pitman-Yor process (including the normalized stable process), beta (two-parameter) process, normalized inverse-Gaussian process, geometric weights prior, dependent Dirichlet process, or the dependent infinite-probits prior. The software user can mouse-click to select a Bayesian model and perform data analysis via Markov chain Monte Carlo (MCMC) sampling. After the sampling completes, the software automatically opens text output that reports MCMC-based estimates of the model's posterior distribution and model predictive fit to the data. Additional text and/or graphical output can be generated by mouse-clicking other menu options. This includes output of MCMC convergence analyses, and estimates of the model's posterior predictive distribution, for selected
A Bayesian method for detecting pairwise associations in compositional data.
Directory of Open Access Journals (Sweden)
Emma Schwager
2017-11-01
Full Text Available Compositional data consist of vectors of proportions normalized to a constant sum from a basis of unobserved counts. The sum constraint makes inference on correlations between unconstrained features challenging due to the information loss from normalization. However, such correlations are of long-standing interest in fields including ecology. We propose a novel Bayesian framework (BAnOCC: Bayesian Analysis of Compositional Covariance to estimate a sparse precision matrix through a LASSO prior. The resulting posterior, generated by MCMC sampling, allows uncertainty quantification of any function of the precision matrix, including the correlation matrix. We also use a first-order Taylor expansion to approximate the transformation from the unobserved counts to the composition in order to investigate what characteristics of the unobserved counts can make the correlations more or less difficult to infer. On simulated datasets, we show that BAnOCC infers the true network as well as previous methods while offering the advantage of posterior inference. Larger and more realistic simulated datasets further showed that BAnOCC performs well as measured by type I and type II error rates. Finally, we apply BAnOCC to a microbial ecology dataset from the Human Microbiome Project, which in addition to reproducing established ecological results revealed unique, competition-based roles for Proteobacteria in multiple distinct habitats.
Jameel, M. Y.; Brewer, S.; Fiorella, R.; Tipple, B. J.; Bowen, G. J.; Terry, S.
2017-12-01
Public water supply systems (PWSS) are complex distribution systems and critical infrastructure, making them vulnerable to physical disruption and contamination. Exploring the susceptibility of PWSS to such perturbations requires detailed knowledge of the supply system structure and operation. Although the physical structure of supply systems (i.e., pipeline connection) is usually well documented for developed cities, the actual flow patterns of water in these systems are typically unknown or estimated based on hydrodynamic models with limited observational validation. Here, we present a novel method for mapping the flow structure of water in a large, complex PWSS, building upon recent work highlighting the potential of stable isotopes of water (SIW) to document water management practices within complex PWSS. We sampled a major water distribution system of the Salt Lake Valley, Utah, measuring SIW of water sources, treatment facilities, and numerous sites within in the supply system. We then developed a hierarchical Bayesian (HB) isotope mixing model to quantify the proportion of water supplied by different sources at sites within the supply system. Known production volumes and spatial distance effects were used to define the prior probabilities for each source; however, we did not include other physical information about the supply system. Our results were in general agreement with those obtained by hydrodynamic models and provide quantitative estimates of contributions of different water sources to a given site along with robust estimates of uncertainty. Secondary properties of the supply system, such as regions of "static" and "dynamic" source (e.g., regions supplied dominantly by one source vs. those experiencing active mixing between multiple sources), can be inferred from the results. The isotope-based HB isotope mixing model offers a new investigative technique for analyzing PWSS and documenting aspects of supply system structure and operation that are
A Bayesian model for binary Markov chains
Directory of Open Access Journals (Sweden)
Belkheir Essebbar
2004-02-01
Full Text Available This note is concerned with Bayesian estimation of the transition probabilities of a binary Markov chain observed from heterogeneous individuals. The model is founded on the Jeffreys' prior which allows for transition probabilities to be correlated. The Bayesian estimator is approximated by means of Monte Carlo Markov chain (MCMC techniques. The performance of the Bayesian estimates is illustrated by analyzing a small simulated data set.
Han, Feng; Zheng, Yi
2018-06-01
Significant Input uncertainty is a major source of error in watershed water quality (WWQ) modeling. It remains challenging to address the input uncertainty in a rigorous Bayesian framework. This study develops the Bayesian Analysis of Input and Parametric Uncertainties (BAIPU), an approach for the joint analysis of input and parametric uncertainties through a tight coupling of Markov Chain Monte Carlo (MCMC) analysis and Bayesian Model Averaging (BMA). The formal likelihood function for this approach is derived considering a lag-1 autocorrelated, heteroscedastic, and Skew Exponential Power (SEP) distributed error model. A series of numerical experiments were performed based on a synthetic nitrate pollution case and on a real study case in the Newport Bay Watershed, California. The Soil and Water Assessment Tool (SWAT) and Differential Evolution Adaptive Metropolis (DREAM(ZS)) were used as the representative WWQ model and MCMC algorithm, respectively. The major findings include the following: (1) the BAIPU can be implemented and used to appropriately identify the uncertain parameters and characterize the predictive uncertainty; (2) the compensation effect between the input and parametric uncertainties can seriously mislead the modeling based management decisions, if the input uncertainty is not explicitly accounted for; (3) the BAIPU accounts for the interaction between the input and parametric uncertainties and therefore provides more accurate calibration and uncertainty results than a sequential analysis of the uncertainties; and (4) the BAIPU quantifies the credibility of different input assumptions on a statistical basis and can be implemented as an effective inverse modeling approach to the joint inference of parameters and inputs.
A Bayesian approach for parameter estimation and prediction using a computationally intensive model
International Nuclear Information System (INIS)
Higdon, Dave; McDonnell, Jordan D; Schunck, Nicolas; Sarich, Jason; Wild, Stefan M
2015-01-01
Bayesian methods have been successful in quantifying uncertainty in physics-based problems in parameter estimation and prediction. In these cases, physical measurements y are modeled as the best fit of a physics-based model η(θ), where θ denotes the uncertain, best input setting. Hence the statistical model is of the form y=η(θ)+ϵ, where ϵ accounts for measurement, and possibly other, error sources. When nonlinearity is present in η(⋅), the resulting posterior distribution for the unknown parameters in the Bayesian formulation is typically complex and nonstandard, requiring computationally demanding computational approaches such as Markov chain Monte Carlo (MCMC) to produce multivariate draws from the posterior. Although generally applicable, MCMC requires thousands (or even millions) of evaluations of the physics model η(⋅). This requirement is problematic if the model takes hours or days to evaluate. To overcome this computational bottleneck, we present an approach adapted from Bayesian model calibration. This approach combines output from an ensemble of computational model runs with physical measurements, within a statistical formulation, to carry out inference. A key component of this approach is a statistical response surface, or emulator, estimated from the ensemble of model runs. We demonstrate this approach with a case study in estimating parameters for a density functional theory model, using experimental mass/binding energy measurements from a collection of atomic nuclei. We also demonstrate how this approach produces uncertainties in predictions for recent mass measurements obtained at Argonne National Laboratory. (paper)
Trajectory averaging for stochastic approximation MCMC algorithms
Liang, Faming
2010-10-01
The subject of stochastic approximation was founded by Robbins and Monro [Ann. Math. Statist. 22 (1951) 400-407]. After five decades of continual development, it has developed into an important area in systems control and optimization, and it has also served as a prototype for the development of adaptive algorithms for on-line estimation and control of stochastic systems. Recently, it has been used in statistics with Markov chain Monte Carlo for solving maximum likelihood estimation problems and for general simulation and optimizations. In this paper, we first show that the trajectory averaging estimator is asymptotically efficient for the stochastic approximation MCMC (SAMCMC) algorithm under mild conditions, and then apply this result to the stochastic approximation Monte Carlo algorithm [Liang, Liu and Carroll J. Amer. Statist. Assoc. 102 (2007) 305-320]. The application of the trajectory averaging estimator to other stochastic approximationMCMC algorithms, for example, a stochastic approximation MLE algorithm for missing data problems, is also considered in the paper. © Institute of Mathematical Statistics, 2010.
GSEVM v.2: MCMC software to analyse genetically structured environmental variance models
DEFF Research Database (Denmark)
Ibáñez-Escriche, N; Garcia, M; Sorensen, D
2010-01-01
This note provides a description of software that allows to fit Bayesian genetically structured variance models using Markov chain Monte Carlo (MCMC). The gsevm v.2 program was written in Fortran 90. The DOS and Unix executable programs, the user's guide, and some example files are freely available...... for research purposes at http://www.bdporc.irta.es/estudis.jsp. The main feature of the program is to compute Monte Carlo estimates of marginal posterior distributions of parameters of interest. The program is quite flexible, allowing the user to fit a variety of linear models at the level of the mean...
Directory of Open Access Journals (Sweden)
Wills Rachael A
2009-05-01
Full Text Available Abstract Background The problem of silent multiple comparisons is one of the most difficult statistical problems faced by scientists. It is a particular problem for investigating a one-off cancer cluster reported to a health department because any one of hundreds, or possibly thousands, of neighbourhoods, schools, or workplaces could have reported a cluster, which could have been for any one of several types of cancer or any one of several time periods. Methods This paper contrasts the frequentist approach with a Bayesian approach for dealing with silent multiple comparisons in the context of a one-off cluster reported to a health department. Two published cluster investigations were re-analysed using the Dunn-Sidak method to adjust frequentist p-values and confidence intervals for silent multiple comparisons. Bayesian methods were based on the Gamma distribution. Results Bayesian analysis with non-informative priors produced results similar to the frequentist analysis, and suggested that both clusters represented a statistical excess. In the frequentist framework, the statistical significance of both clusters was extremely sensitive to the number of silent multiple comparisons, which can only ever be a subjective "guesstimate". The Bayesian approach is also subjective: whether there is an apparent statistical excess depends on the specified prior. Conclusion In cluster investigations, the frequentist approach is just as subjective as the Bayesian approach, but the Bayesian approach is less ambitious in that it treats the analysis as a synthesis of data and personal judgements (possibly poor ones, rather than objective reality. Bayesian analysis is (arguably a useful tool to support complicated decision-making, because it makes the uncertainty associated with silent multiple comparisons explicit.
Bayesian methods for data analysis
Carlin, Bradley P.
2009-01-01
Approaches for statistical inference Introduction Motivating Vignettes Defining the Approaches The Bayes-Frequentist Controversy Some Basic Bayesian Models The Bayes approach Introduction Prior Distributions Bayesian Inference Hierarchical Modeling Model Assessment Nonparametric Methods Bayesian computation Introduction Asymptotic Methods Noniterative Monte Carlo Methods Markov Chain Monte Carlo Methods Model criticism and selection Bayesian Modeling Bayesian Robustness Model Assessment Bayes Factors via Marginal Density Estimation Bayes Factors
International Nuclear Information System (INIS)
Davis, Sergio; Gutiérrez, Gonzalo
2013-01-01
We present a systematic implementation of the recently developed Z-method for computing melting points of solids, augmented by a Bayesian analysis of the data obtained from molecular dynamics simulations. The use of Bayesian inference allows us to extract valuable information from limited data, reducing the computational cost of drawing the isochoric curve. From this Bayesian Z-method we obtain posterior distributions for the melting temperature T m , the critical superheating temperature T LS and the slopes dT/dE of the liquid and solid phases. The method therefore gives full quantification of the errors in the prediction of the melting point. This procedure is applied to the estimation of the melting point of Ti 2 GaN (one of the so-called MAX phases), a complex, laminar material, by density functional theory molecular dynamics, finding an estimate T m of 2591.61 ± 89.61 K, which is in good agreement with melting points of similar ceramics. (paper)
Kim, Daesang
2017-06-22
We developed a novel two-step hierarchical method for the Bayesian inference of the rate parameters of a target reaction from time-resolved concentration measurements in shock tubes. The method was applied to the calibration of the parameters of the reaction of hydroxyl with 2-methylfuran, which is studied experimentally via absorption measurements of the OH radical\\'s concentration following shock-heating. In the first step of the approach, each shock tube experiment is treated independently to infer the posterior distribution of the rate constant and error hyper-parameter that best explains the OH signal. In the second step, these posterior distributions are sampled to calibrate the parameters appearing in the Arrhenius reaction model for the rate constant. Furthermore, the second step is modified and repeated in order to explore alternative rate constant models and to assess the effect of uncertainties in the reflected shock\\'s temperature. Comparisons of the estimates obtained via the proposed methodology against the common least squares approach are presented. The relative merits of the novel Bayesian framework are highlighted, especially with respect to the opportunity to utilize the posterior distributions of the parameters in future uncertainty quantification studies.
Caticha, Ariel
2011-03-01
In this tutorial we review the essential arguments behing entropic inference. We focus on the epistemological notion of information and its relation to the Bayesian beliefs of rational agents. The problem of updating from a prior to a posterior probability distribution is tackled through an eliminative induction process that singles out the logarithmic relative entropy as the unique tool for inference. The resulting method of Maximum relative Entropy (ME), includes as special cases both MaxEnt and Bayes' rule, and therefore unifies the two themes of these workshops—the Maximum Entropy and the Bayesian methods—into a single general inference scheme.
Use of SAMC for Bayesian analysis of statistical models with intractable normalizing constants
Jin, Ick Hoon
2014-03-01
Statistical inference for the models with intractable normalizing constants has attracted much attention. During the past two decades, various approximation- or simulation-based methods have been proposed for the problem, such as the Monte Carlo maximum likelihood method and the auxiliary variable Markov chain Monte Carlo methods. The Bayesian stochastic approximation Monte Carlo algorithm specifically addresses this problem: It works by sampling from a sequence of approximate distributions with their average converging to the target posterior distribution, where the approximate distributions can be achieved using the stochastic approximation Monte Carlo algorithm. A strong law of large numbers is established for the Bayesian stochastic approximation Monte Carlo estimator under mild conditions. Compared to the Monte Carlo maximum likelihood method, the Bayesian stochastic approximation Monte Carlo algorithm is more robust to the initial guess of model parameters. Compared to the auxiliary variable MCMC methods, the Bayesian stochastic approximation Monte Carlo algorithm avoids the requirement for perfect samples, and thus can be applied to many models for which perfect sampling is not available or very expensive. The Bayesian stochastic approximation Monte Carlo algorithm also provides a general framework for approximate Bayesian analysis. © 2012 Elsevier B.V. All rights reserved.
Modeling error distributions of growth curve models through Bayesian methods.
Zhang, Zhiyong
2016-06-01
Growth curve models are widely used in social and behavioral sciences. However, typical growth curve models often assume that the errors are normally distributed although non-normal data may be even more common than normal data. In order to avoid possible statistical inference problems in blindly assuming normality, a general Bayesian framework is proposed to flexibly model normal and non-normal data through the explicit specification of the error distributions. A simulation study shows when the distribution of the error is correctly specified, one can avoid the loss in the efficiency of standard error estimates. A real example on the analysis of mathematical ability growth data from the Early Childhood Longitudinal Study, Kindergarten Class of 1998-99 is used to show the application of the proposed methods. Instructions and code on how to conduct growth curve analysis with both normal and non-normal error distributions using the the MCMC procedure of SAS are provided.
Directory of Open Access Journals (Sweden)
Ilaria A M Marino
Full Text Available A precise inference of past demographic histories including dating of demographic events using bayesian methods can only be achieved with the use of appropriate molecular rates and evolutionary models. Using a set of 596 mitochondrial cytochrome c oxidase I (COI sequences of two sister species of European green crabs of the genus Carcinus (C. maenas and C. aestuarii, our study shows how chronologies of past evolutionary events change significantly with the application of revised molecular rates that incorporate biogeographic events for calibration and appropriate demographic priors. A clear signal of demographic expansion was found for both species, dated between 10,000 and 20,000 years ago, which places the expansions events in a time frame following the Last Glacial Maximum (LGM. In the case of C. aestuarii, a population expansion was only inferred for the Adriatic-Ionian, suggestive of a colonization event following the flooding of the Adriatic Sea (18,000 years ago. For C. maenas, the demographic expansion inferred for the continental populations of West and North Europe might result from a northward recolonization from a southern refugium when the ice sheet retreated after the LGM. Collectively, our results highlight the importance of using adequate calibrations and demographic priors in order to avoid considerable overestimates of evolutionary time scales.
Yuan, Ying; MacKinnon, David P.
2009-01-01
This article proposes Bayesian analysis of mediation effects. Compared to conventional frequentist mediation analysis, the Bayesian approach has several advantages. First, it allows researchers to incorporate prior information into the mediation analysis, thus potentially improving the efficiency of estimates. Second, under the Bayesian mediation analysis, inference is straightforward and exact, which makes it appealing for studies with small samples. Third, the Bayesian approach is conceptua...
Bayesian artificial intelligence
Korb, Kevin B
2010-01-01
Updated and expanded, Bayesian Artificial Intelligence, Second Edition provides a practical and accessible introduction to the main concepts, foundation, and applications of Bayesian networks. It focuses on both the causal discovery of networks and Bayesian inference procedures. Adopting a causal interpretation of Bayesian networks, the authors discuss the use of Bayesian networks for causal modeling. They also draw on their own applied research to illustrate various applications of the technology.New to the Second EditionNew chapter on Bayesian network classifiersNew section on object-oriente
ABCtoolbox: a versatile toolkit for approximate Bayesian computations
Directory of Open Access Journals (Sweden)
Neuenschwander Samuel
2010-03-01
Full Text Available Abstract Background The estimation of demographic parameters from genetic data often requires the computation of likelihoods. However, the likelihood function is computationally intractable for many realistic evolutionary models, and the use of Bayesian inference has therefore been limited to very simple models. The situation changed recently with the advent of Approximate Bayesian Computation (ABC algorithms allowing one to obtain parameter posterior distributions based on simulations not requiring likelihood computations. Results Here we present ABCtoolbox, a series of open source programs to perform Approximate Bayesian Computations (ABC. It implements various ABC algorithms including rejection sampling, MCMC without likelihood, a Particle-based sampler and ABC-GLM. ABCtoolbox is bundled with, but not limited to, a program that allows parameter inference in a population genetics context and the simultaneous use of different types of markers with different ploidy levels. In addition, ABCtoolbox can also interact with most simulation and summary statistics computation programs. The usability of the ABCtoolbox is demonstrated by inferring the evolutionary history of two evolutionary lineages of Microtus arvalis. Using nuclear microsatellites and mitochondrial sequence data in the same estimation procedure enabled us to infer sex-specific population sizes and migration rates and to find that males show smaller population sizes but much higher levels of migration than females. Conclusion ABCtoolbox allows a user to perform all the necessary steps of a full ABC analysis, from parameter sampling from prior distributions, data simulations, computation of summary statistics, estimation of posterior distributions, model choice, validation of the estimation procedure, and visualization of the results.
A Bayesian alternative for multi-objective ecohydrological model specification
Tang, Yating; Marshall, Lucy; Sharma, Ashish; Ajami, Hoori
2018-01-01
Recent studies have identified the importance of vegetation processes in terrestrial hydrologic systems. Process-based ecohydrological models combine hydrological, physical, biochemical and ecological processes of the catchments, and as such are generally more complex and parametric than conceptual hydrological models. Thus, appropriate calibration objectives and model uncertainty analysis are essential for ecohydrological modeling. In recent years, Bayesian inference has become one of the most popular tools for quantifying the uncertainties in hydrological modeling with the development of Markov chain Monte Carlo (MCMC) techniques. The Bayesian approach offers an appealing alternative to traditional multi-objective hydrologic model calibrations by defining proper prior distributions that can be considered analogous to the ad-hoc weighting often prescribed in multi-objective calibration. Our study aims to develop appropriate prior distributions and likelihood functions that minimize the model uncertainties and bias within a Bayesian ecohydrological modeling framework based on a traditional Pareto-based model calibration technique. In our study, a Pareto-based multi-objective optimization and a formal Bayesian framework are implemented in a conceptual ecohydrological model that combines a hydrological model (HYMOD) and a modified Bucket Grassland Model (BGM). Simulations focused on one objective (streamflow/LAI) and multiple objectives (streamflow and LAI) with different emphasis defined via the prior distribution of the model error parameters. Results show more reliable outputs for both predicted streamflow and LAI using Bayesian multi-objective calibration with specified prior distributions for error parameters based on results from the Pareto front in the ecohydrological modeling. The methodology implemented here provides insight into the usefulness of multiobjective Bayesian calibration for ecohydrologic systems and the importance of appropriate prior
Serang, Oliver
2015-08-01
Observations depending on sums of random variables are common throughout many fields; however, no efficient solution is currently known for performing max-product inference on these sums of general discrete distributions (max-product inference can be used to obtain maximum a posteriori estimates). The limiting step to max-product inference is the max-convolution problem (sometimes presented in log-transformed form and denoted as "infimal convolution," "min-convolution," or "convolution on the tropical semiring"), for which no O(k log(k)) method is currently known. Presented here is an O(k log(k)) numerical method for estimating the max-convolution of two nonnegative vectors (e.g., two probability mass functions), where k is the length of the larger vector. This numerical max-convolution method is then demonstrated by performing fast max-product inference on a convolution tree, a data structure for performing fast inference given information on the sum of n discrete random variables in O(nk log(nk)log(n)) steps (where each random variable has an arbitrary prior distribution on k contiguous possible states). The numerical max-convolution method can be applied to specialized classes of hidden Markov models to reduce the runtime of computing the Viterbi path from nk(2) to nk log(k), and has potential application to the all-pairs shortest paths problem.
McNally, Kevin; Cotton, Richard; Cocker, John; Jones, Kate; Bartels, Mike; Rick, David; Price, Paul; Loizou, George
2012-01-01
There are numerous biomonitoring programs, both recent and ongoing, to evaluate environmental exposure of humans to chemicals. Due to the lack of exposure and kinetic data, the correlation of biomarker levels with exposure concentrations leads to difficulty in utilizing biomonitoring data for biological guidance values. Exposure reconstruction or reverse dosimetry is the retrospective interpretation of external exposure consistent with biomonitoring data. We investigated the integration of physiologically based pharmacokinetic modelling, global sensitivity analysis, Bayesian inference, and Markov chain Monte Carlo simulation to obtain a population estimate of inhalation exposure to m-xylene. We used exhaled breath and venous blood m-xylene and urinary 3-methylhippuric acid measurements from a controlled human volunteer study in order to evaluate the ability of our computational framework to predict known inhalation exposures. We also investigated the importance of model structure and dimensionality with respect to its ability to reconstruct exposure. PMID:22719759
Directory of Open Access Journals (Sweden)
Kevin McNally
2012-01-01
Full Text Available There are numerous biomonitoring programs, both recent and ongoing, to evaluate environmental exposure of humans to chemicals. Due to the lack of exposure and kinetic data, the correlation of biomarker levels with exposure concentrations leads to difficulty in utilizing biomonitoring data for biological guidance values. Exposure reconstruction or reverse dosimetry is the retrospective interpretation of external exposure consistent with biomonitoring data. We investigated the integration of physiologically based pharmacokinetic modelling, global sensitivity analysis, Bayesian inference, and Markov chain Monte Carlo simulation to obtain a population estimate of inhalation exposure to m-xylene. We used exhaled breath and venous blood m-xylene and urinary 3-methylhippuric acid measurements from a controlled human volunteer study in order to evaluate the ability of our computational framework to predict known inhalation exposures. We also investigated the importance of model structure and dimensionality with respect to its ability to reconstruct exposure.
van Lier-Walqui, M.; Morrison, H.; Kumjian, M. R.; Prat, O. P.
2016-12-01
Microphysical parameterization schemes have reached an impressive level of sophistication: numerous prognostic hydrometeor categories, and either size-resolved (bin) particle size distributions, or multiple prognostic moments of the size distribution. Yet, uncertainty in model representation of microphysical processes and the effects of microphysics on numerical simulation of weather has not shown a improvement commensurate with the advanced sophistication of these schemes. We posit that this may be caused by unconstrained assumptions of these schemes, such as ad-hoc parameter value choices and structural uncertainties (e.g. choice of a particular form for the size distribution). We present work on development and observational constraint of a novel microphysical parameterization approach, the Bayesian Observationally-constrained Statistical-physical Scheme (BOSS), which seeks to address these sources of uncertainty. Our framework avoids unnecessary a priori assumptions, and instead relies on observations to provide probabilistic constraint of the scheme structure and sensitivities to environmental and microphysical conditions. We harness the rich microphysical information content of polarimetric radar observations to develop and constrain BOSS within a Bayesian inference framework using a Markov Chain Monte Carlo sampler (see Kumjian et al., this meeting for details on development of an associated polarimetric forward operator). Our work shows how knowledge of microphysical processes is provided by polarimetric radar observations of diverse weather conditions, and which processes remain highly uncertain, even after considering observations.
MCMC estimation of multidimensional IRT models
Beguin, Anton; Glas, Cornelis A.W.
1998-01-01
A Bayesian procedure to estimate the three-parameter normal ogive model and a generalization to a model with multidimensional ability parameters are discussed. The procedure is a generalization of a procedure by J. Albert (1992) for estimating the two-parameter normal ogive model. The procedure will
Learning Bayesian network classifiers for credit scoring using Markov Chain Monte Carlo search
Baesens, B.; Egmont-Petersen, M.; Castelo, R.; Vanthienen, J.
2001-01-01
In this paper, we will evaluate the power and usefulness of Bayesian network classifiers for credit scoring. Various types of Bayesian network classifiers will be evaluated and contrasted including unrestricted Bayesian network classifiers learnt using Markov Chain Monte Carlo (MCMC) search.
Lu, Dan; Ricciuto, Daniel; Walker, Anthony; Safta, Cosmin; Munger, William
2017-09-01
Calibration of terrestrial ecosystem models is important but challenging. Bayesian inference implemented by Markov chain Monte Carlo (MCMC) sampling provides a comprehensive framework to estimate model parameters and associated uncertainties using their posterior distributions. The effectiveness and efficiency of the method strongly depend on the MCMC algorithm used. In this work, a differential evolution adaptive Metropolis (DREAM) algorithm is used to estimate posterior distributions of 21 parameters for the data assimilation linked ecosystem carbon (DALEC) model using 14 years of daily net ecosystem exchange data collected at the Harvard Forest Environmental Measurement Site eddy-flux tower. The calibration of DREAM results in a better model fit and predictive performance compared to the popular adaptive Metropolis (AM) scheme. Moreover, DREAM indicates that two parameters controlling autumn phenology have multiple modes in their posterior distributions while AM only identifies one mode. The application suggests that DREAM is very suitable to calibrate complex terrestrial ecosystem models, where the uncertain parameter size is usually large and existence of local optima is always a concern. In addition, this effort justifies the assumptions of the error model used in Bayesian calibration according to the residual analysis. The result indicates that a heteroscedastic, correlated, Gaussian error model is appropriate for the problem, and the consequent constructed likelihood function can alleviate the underestimation of parameter uncertainty that is usually caused by using uncorrelated error models.
Directory of Open Access Journals (Sweden)
Zoltán Csörnyei
2010-01-01
Full Text Available Genetic parameters of number of piglets born alive (NBA and gestation length (GL were analyzed for 39798 Hungarian Landrace (HLA, 141397 records and 70356 Hungarian Large White (HLW, 246961 records sows. Bivariate repeatability animal models were used, applying a Bayesian statistics. Estimated and heritabilitie repeatabilities (within brackets, were low for NBA, 0.07 (0.14 for HLA and 0.08 (0.17 for HLW, but somewhat higher for GL, 0.18 (0.27 for HLA and 0.26 (0.35 for HLW. Estimated genetic correlations between NBA and GL were low, -0.08 for HLA and -0.05 for HLW.
Beatty, William; Jay, Chadwick V.; Fischbach, Anthony S.
2016-01-01
State-space models offer researchers an objective approach to modeling complex animal location data sets, and state-space model behavior classifications are often assumed to have a link to animal behavior. In this study, we evaluated the behavioral classification accuracy of a Bayesian state-space model in Pacific walruses using Argos satellite tags with sensors to detect animal behavior in real time. We fit a two-state discrete-time continuous-space Bayesian state-space model to data from 306 Pacific walruses tagged in the Chukchi Sea. We matched predicted locations and behaviors from the state-space model (resident, transient behavior) to true animal behavior (foraging, swimming, hauled out) and evaluated classification accuracy with kappa statistics (κ) and root mean square error (RMSE). In addition, we compared biased random bridge utilization distributions generated with resident behavior locations to true foraging behavior locations to evaluate differences in space use patterns. Results indicated that the two-state model fairly classified true animal behavior (0.06 ≤ κ ≤ 0.26, 0.49 ≤ RMSE ≤ 0.59). Kernel overlap metrics indicated utilization distributions generated with resident behavior locations were generally smaller than utilization distributions generated with true foraging behavior locations. Consequently, we encourage researchers to carefully examine parameters and priors associated with behaviors in state-space models, and reconcile these parameters with the study species and its expected behaviors.
Bayesian target tracking based on particle filter
Institute of Scientific and Technical Information of China (English)
无
2005-01-01
For being able to deal with the nonlinear or non-Gaussian problems, particle filters have been studied by many researchers. Based on particle filter, the extended Kalman filter (EKF) proposal function is applied to Bayesian target tracking. Markov chain Monte Carlo (MCMC) method, the resampling step, etc novel techniques are also introduced into Bayesian target tracking. And the simulation results confirm the improved particle filter with these techniques outperforms the basic one.
Caticha, Ariel
2010-01-01
In this tutorial we review the essential arguments behing entropic inference. We focus on the epistemological notion of information and its relation to the Bayesian beliefs of rational agents. The problem of updating from a prior to a posterior probability distribution is tackled through an eliminative induction process that singles out the logarithmic relative entropy as the unique tool for inference. The resulting method of Maximum relative Entropy (ME), includes as special cases both MaxEn...
DEFF Research Database (Denmark)
Jeong, Cheol-Ho; Choi, Sang-Hyeon; Lee, Ikjin
2017-01-01
A Bayesian analysis is applied to determine the flow resistivity of a porous sample and the influence of the test chamber based on measured Sabine absorption coefficient data. The Sabine absorption coefficient measured in a reverberation chamber according to ISO 354 is influenced by the test...... chamber significantly, whereas the flow resistivity is a rather reproducible material property, from which the absorptive characteristics can be calculated through reliable models. Using Sabine absorption coefficients measured in 13 European reverberation chambers, the maximum a posteriori...... and the uncertainty of the flow resistivity and the test chamber’s influence are estimated. Inclusion of more than one chamber’s absorption data helps the flow resistivity converge towards a reliable value with a standard deviation below 17%...
Murray, Jessica R.; Minson, Sarah E.; Svarc, Jerry L.
2014-01-01
Fault creep, depending on its rate and spatial extent, is thought to reduce earthquake hazard by releasing tectonic strain aseismically. We use Bayesian inversion and a newly expanded GPS data set to infer the deep slip rates below assigned locking depths on the San Andreas, Maacama, and Bartlett Springs Faults of Northern California and, for the latter two, the spatially variable interseismic creep rate above the locking depth. We estimate deep slip rates of 21.5 ± 0.5, 13.1 ± 0.8, and 7.5 ± 0.7 mm/yr below 16 km, 9 km, and 13 km on the San Andreas, Maacama, and Bartlett Springs Faults, respectively. We infer that on average the Bartlett Springs fault creeps from the Earth's surface to 13 km depth, and below 5 km the creep rate approaches the deep slip rate. This implies that microseismicity may extend below the locking depth; however, we cannot rule out the presence of locked patches in the seismogenic zone that could generate moderate earthquakes. Our estimated Maacama creep rate, while comparable to the inferred deep slip rate at the Earth's surface, decreases with depth, implying a slip deficit exists. The Maacama deep slip rate estimate, 13.1 mm/yr, exceeds long-term geologic slip rate estimates, perhaps due to distributed off-fault strain or the presence of multiple active fault strands. While our creep rate estimates are relatively insensitive to choice of model locking depth, insufficient independent information regarding locking depths is a source of epistemic uncertainty that impacts deep slip rate estimates.
Advances in Bayesian Model Based Clustering Using Particle Learning
Energy Technology Data Exchange (ETDEWEB)
Merl, D M
2009-11-19
Recent work by Carvalho, Johannes, Lopes and Polson and Carvalho, Lopes, Polson and Taddy introduced a sequential Monte Carlo (SMC) alternative to traditional iterative Monte Carlo strategies (e.g. MCMC and EM) for Bayesian inference for a large class of dynamic models. The basis of SMC techniques involves representing the underlying inference problem as one of state space estimation, thus giving way to inference via particle filtering. The key insight of Carvalho et al was to construct the sequence of filtering distributions so as to make use of the posterior predictive distribution of the observable, a distribution usually only accessible in certain Bayesian settings. Access to this distribution allows a reversal of the usual propagate and resample steps characteristic of many SMC methods, thereby alleviating to a large extent many problems associated with particle degeneration. Furthermore, Carvalho et al point out that for many conjugate models the posterior distribution of the static variables can be parametrized in terms of [recursively defined] sufficient statistics of the previously observed data. For models where such sufficient statistics exist, particle learning as it is being called, is especially well suited for the analysis of streaming data do to the relative invariance of its algorithmic complexity with the number of data observations. Through a particle learning approach, a statistical model can be fit to data as the data is arriving, allowing at any instant during the observation process direct quantification of uncertainty surrounding underlying model parameters. Here we describe the use of a particle learning approach for fitting a standard Bayesian semiparametric mixture model as described in Carvalho, Lopes, Polson and Taddy. In Section 2 we briefly review the previously presented particle learning algorithm for the case of a Dirichlet process mixture of multivariate normals. In Section 3 we describe several novel extensions to the original
Energy Technology Data Exchange (ETDEWEB)
Sigeti, David E. [Los Alamos National Laboratory; Pelak, Robert A. [Los Alamos National Laboratory
2012-09-11
We present a Bayesian statistical methodology for identifying improvement in predictive simulations, including an analysis of the number of (presumably expensive) simulations that will need to be made in order to establish with a given level of confidence that an improvement has been observed. Our analysis assumes the ability to predict (or postdict) the same experiments with legacy and new simulation codes and uses a simple binomial model for the probability, {theta}, that, in an experiment chosen at random, the new code will provide a better prediction than the old. This model makes it possible to do statistical analysis with an absolute minimum of assumptions about the statistics of the quantities involved, at the price of discarding some potentially important information in the data. In particular, the analysis depends only on whether or not the new code predicts better than the old in any given experiment, and not on the magnitude of the improvement. We show how the posterior distribution for {theta} may be used, in a kind of Bayesian hypothesis testing, both to decide if an improvement has been observed and to quantify our confidence in that decision. We quantify the predictive probability that should be assigned, prior to taking any data, to the possibility of achieving a given level of confidence, as a function of sample size. We show how this predictive probability depends on the true value of {theta} and, in particular, how there will always be a region around {theta} = 1/2 where it is highly improbable that we will be able to identify an improvement in predictive capability, although the width of this region will shrink to zero as the sample size goes to infinity. We show how the posterior standard deviation may be used, as a kind of 'plan B metric' in the case that the analysis shows that {theta} is close to 1/2 and argue that such a plan B should generally be part of hypothesis testing. All the analysis presented in the paper is done with a
Renner, Susanne S; Zhang, Li-Bing
2004-06-01
Pistia stratiotes (water lettuce) and Lemna (duckweeds) are the only free-floating aquatic Araceae. The geographic origin and phylogenetic placement of these unrelated aroids present long-standing problems because of their highly modified reproductive structures and wide geographical distributions. We sampled chloroplast (trnL-trnF and rpl20-rps12 spacers, trnL intron) and mitochondrial sequences (nad1 b/c intron) for all genera implicated as close relatives of Pistia by morphological, restriction site, and sequencing data, and present a hypothesis about its geographic origin based on the consensus of trees obtained from the combined data, using Bayesian, maximum likelihood, parsimony, and distance analyses. Of the 14 genera closest to Pistia, only Alocasia, Arisaema, and Typhonium are species-rich, and the latter two were studied previously, facilitating the choice of representatives that span the roots of these genera. Results indicate that Pistia and the Seychelles endemic Protarum sechellarum are the basalmost branches in a grade comprising the tribes Colocasieae (Ariopsis, Steudnera, Remusatia, Alocasia, Colocasia), Arisaemateae (Arisaema, Pinellia), and Areae (Arum, Biarum, Dracunculus, Eminium, Helicodiceros, Theriophonum, Typhonium). Unexpectedly, all Areae genera are embedded in Typhonium, which throws new light on the geographic history of Areae. A Bayesian analysis of divergence times that explores the effects of multiple fossil and geological calibration points indicates that the Pistia lineage is 90 to 76 million years (my) old. The oldest fossils of the Pistia clade, though not Pistia itself, are 45-my-old leaves from Germany; the closest outgroup, Peltandreae (comprising a few species in Florida, the Mediterranean, and Madagascar), is known from 60-my-old leaves from Europe, Kazakhstan, North Dakota, and Tennessee. Based on the geographic ranges of close relatives, Pistia likely originated in the Tethys region, with Protarum then surviving on the
Fang, Fang; Chen, Jing; Jiang, Li-Yun; Qu, Yan-Hua; Qiao, Ge-Xia
2018-01-09
Biological invasion is considered one of the most important global environmental problems. Knowledge of the source and dispersal routes of invasion could facilitate the eradication and control of invasive species. Soybean aphid, Aphis glycines Matsumura, is one of the most destructive soybean pests. For effective management of this pest, we conducted genetic analyses and approximate Bayesian computation (ABC) analysis to determine the origins and dispersal of the aphid species, as well as the source of its invasion in the USA, using eight microsatellite loci and the mitochondrial cytochrome c oxidase subunit I (COI) gene. We were able to identify a significant isolation by distance (IBD) pattern and three genetic lineages in the microsatellite data but not in the mtDNA dataset. The genetic structure showed that the USA population has the closest relationship with those from Korea and Japan, indicating that the two latter populations might be the sources of the invasion to the USA. Both population genetic analyses and ABC showed that the northeastern populations in China were the possible sources of the further spread of A. glycines to Indonesia. The dispersal history of this aphid can provide useful information for pest management strategies and can further help predict areas at risk of invasion. This article is protected by copyright. All rights reserved.
Safari, Parviz; Danyali, Syyedeh Fatemeh; Rahimi, Mehdi
2018-06-02
Drought is the main abiotic stress seriously influencing wheat production. Information about the inheritance of drought tolerance is necessary to determine the most appropriate strategy to develop tolerant cultivars and populations. In this study, generation means analysis to identify the genetic effects controlling grain yield inheritance in water deficit and normal conditions was considered as a model selection problem in a Bayesian framework. Stochastic search variable selection (SSVS) was applied to identify the most important genetic effects and the best fitted models using different generations obtained from two crosses applying two water regimes in two growing seasons. The SSVS is used to evaluate the effect of each variable on the dependent variable via posterior variable inclusion probabilities. The model with the highest posterior probability is selected as the best model. In this study, the grain yield was controlled by the main effects (additive and non-additive effects) and epistatic. The results demonstrate that breeding methods such as recurrent selection and subsequent pedigree method and hybrid production can be useful to improve grain yield.
Xing, Dongyuan; Huang, Yangxin; Chen, Henian; Zhu, Yiliang; Dagne, Getachew A; Baldwin, Julie
2017-08-01
Semicontinuous data featured with an excessive proportion of zeros and right-skewed continuous positive values arise frequently in practice. One example would be the substance abuse/dependence symptoms data for which a substantial proportion of subjects investigated may report zero. Two-part mixed-effects models have been developed to analyze repeated measures of semicontinuous data from longitudinal studies. In this paper, we propose a flexible two-part mixed-effects model with skew distributions for correlated semicontinuous alcohol data under the framework of a Bayesian approach. The proposed model specification consists of two mixed-effects models linked by the correlated random effects: (i) a model on the occurrence of positive values using a generalized logistic mixed-effects model (Part I); and (ii) a model on the intensity of positive values using a linear mixed-effects model where the model errors follow skew distributions including skew- t and skew-normal distributions (Part II). The proposed method is illustrated with an alcohol abuse/dependence symptoms data from a longitudinal observational study, and the analytic results are reported by comparing potential models under different random-effects structures. Simulation studies are conducted to assess the performance of the proposed models and method.
International Nuclear Information System (INIS)
Wu Wei; Biber, Patrick D; Peterson, Mark S; Gong Chongfeng
2012-01-01
To study the impact of the Deepwater Horizon oil spill on photosynthesis of coastal salt marsh plants in Mississippi, we developed a hierarchical Bayesian (HB) model based on field measurements collected from July 2010 to November 2011. We sampled three locations in Davis Bayou, Mississippi (30.375°N, 88.790°W) representative of a range of oil spill impacts. Measured photosynthesis was negative (respiration only) at the heavily oiled location in July 2010 only, and rates started to increase by August 2010. Photosynthesis at the medium oiling location was lower than at the control location in July 2010 and it continued to decrease in September 2010. During winter 2010–2011, the contrast between the control and the two impacted locations was not as obvious as in the growing season of 2010. Photosynthesis increased through spring 2011 at the three locations and decreased starting with October at the control location and a month earlier (September) at the impacted locations. Using the field data, we developed an HB model. The model simulations agreed well with the measured photosynthesis, capturing most of the variability of the measured data. On the basis of the posteriors of the parameters, we found that air temperature and photosynthetic active radiation positively influenced photosynthesis whereas the leaf stress level negatively affected photosynthesis. The photosynthesis rates at the heavily impacted location had recovered to the status of the control location about 140 days after the initial impact, while the impact at the medium impact location was never severe enough to make photosynthesis significantly lower than that at the control location over the study period. The uncertainty in modeling photosynthesis rates mainly came from the individual and micro-site scales, and to a lesser extent from the leaf scale. (letter)
Wu, Zhen; Liu, Yong; Liang, Zhongyao; Wu, Sifeng; Guo, Huaicheng
2017-06-01
Lake eutrophication is associated with excessive anthropogenic nutrients (mainly nitrogen (N) and phosphorus (P)) and unobserved internal nutrient cycling. Despite the advances in understanding the role of external loadings, the contribution of internal nutrient cycling is still an open question. A dynamic mass-balance model was developed to simulate and measure the contributions of internal cycling and external loading. It was based on the temporal Bayesian Hierarchical Framework (BHM), where we explored the seasonal patterns in the dynamics of nutrient cycling processes and the limitation of N and P on phytoplankton growth in hyper-eutrophic Lake Dianchi, China. The dynamic patterns of the five state variables (Chla, TP, ammonia, nitrate and organic N) were simulated based on the model. Five parameters (algae growth rate, sediment exchange rate of N and P, nitrification rate and denitrification rate) were estimated based on BHM. The model provided a good fit to observations. Our model results highlighted the role of internal cycling of N and P in Lake Dianchi. The internal cycling processes contributed more than external loading to the N and P changes in the water column. Further insights into the nutrient limitation analysis indicated that the sediment exchange of P determined the P limitation. Allowing for the contribution of denitrification to N removal, N was the more limiting nutrient in most of the time, however, P was the more important nutrient for eutrophication management. For Lake Dianchi, it would not be possible to recover solely by reducing the external watershed nutrient load; the mechanisms of internal cycling should also be considered as an approach to inhibit the release of sediments and to enhance denitrification. Copyright © 2017 Elsevier Ltd. All rights reserved.
Wu, Mengmeng; Lin, Zhixiang; Ma, Shining; Chen, Ting; Jiang, Rui; Wong, Wing Hung
2017-12-01
Although genome-wide association studies (GWAS) have successfully identified thousands of genomic loci associated with hundreds of complex traits in the past decade, the debate about such problems as missing heritability and weak interpretability has been appealing for effective computational methods to facilitate the advanced analysis of the vast volume of existing and anticipated genetic data. Towards this goal, gene-level integrative GWAS analysis with the assumption that genes associated with a phenotype tend to be enriched in biological gene sets or gene networks has recently attracted much attention, due to such advantages as straightforward interpretation, less multiple testing burdens, and robustness across studies. However, existing methods in this category usually exploit non-tissue-specific gene networks and thus lack the ability to utilize informative tissue-specific characteristics. To overcome this limitation, we proposed a Bayesian approach called SIGNET (Simultaneously Inference of GeNEs and Tissues) to integrate GWAS data and multiple tissue-specific gene networks for the simultaneous inference of phenotype-associated genes and relevant tissues. Through extensive simulation studies, we showed the effectiveness of our method in finding both associated genes and relevant tissues for a phenotype. In applications to real GWAS data of 14 complex phenotypes, we demonstrated the power of our method in both deciphering genetic basis and discovering biological insights of a phenotype. With this understanding, we expect to see SIGNET as a valuable tool for integrative GWAS analysis, thereby boosting the prevention, diagnosis, and treatment of human inherited diseases and eventually facilitating precision medicine.
Drilleau, M.; Beucler, E.; Mocquet, A.; Verhoeven, O.; Moebs, G.; Burgos, G.; Montagner, J.
2013-12-01
Mineralogical transformations and matter transfers within the Earth's mantle make the 350-1000 km depth range (considered here as the mantle transition zone) highly heterogeneous and anisotropic. Most of the 3-D global tomographic models are anchored on small perturbations from 1-D models such as PREM, and are secondly interpreted in terms of temperature and composition distributions. However, the degree of heterogeneity in the transition zone can be strong enough so that the concept of a 1-D reference seismic model may be addressed. To avoid the use of any seismic reference model, we developed a Markov chain Monte Carlo algorithm to directly interpret surface wave dispersion curves in terms of temperature and radial anisotropy distributions, considering a given composition of the mantle. These interpretations are based on laboratory measurements of elastic moduli and Birch-Murnaghan equation of state. An originality of the algorithm is its ability to explore both smoothly varying models and first-order discontinuities, using C1-Bézier curves, which interpolate the randomly chosen values for depth, temperature and radial anisotropy. This parameterization is able to generate a self-adapting parameter space exploration while reducing the computing time. Using a Bayesian exploration, the probability distributions on temperature and anisotropy are governed by uncertainties on the data set. The method was successfully applied to both synthetic data and real dispersion curves. Surface wave measurements along the Vanuatu- California path suggest a strong anisotropy above 400 km depth which decreases below, and a monotonous temperature distribution between 350 and 1000 km depth. On the contrary, a negative shear wave anisotropy of about 2 % is found at the top of the transition zone below Eurasia. Considering compositions ranging from piclogite to pyrolite, the overall temperature profile and temperature gradient are higher for the continental path than for the oceanic
Bayesian and maximum likelihood estimation of genetic maps
DEFF Research Database (Denmark)
York, Thomas L.; Durrett, Richard T.; Tanksley, Steven
2005-01-01
There has recently been increased interest in the use of Markov Chain Monte Carlo (MCMC)-based Bayesian methods for estimating genetic maps. The advantage of these methods is that they can deal accurately with missing data and genotyping errors. Here we present an extension of the previous methods...... of genotyping errors. A similar advantage of the Bayesian method was not observed for missing data. We also re-analyse a recently published set of data from the eggplant and show that the use of the MCMC-based method leads to smaller estimates of genetic distances....
MCMC exploration of supermassive black hole binary inspirals
International Nuclear Information System (INIS)
Cornish, Neil J; Porter, Edward K
2006-01-01
The Laser Interferometer Space Antenna will be able to detect the inspiral and merger of super massive black hole binaries (SMBHBs) anywhere in the universe. Standard matched filtering techniques can be used to detect and characterize these systems. Markov Chain Monte Carlo (MCMC) methods are ideally suited to this and other LISA data analysis problems as they are able to efficiently handle models with large dimensions. Here we compare the posterior parameter distributions derived by an MCMC algorithm with the distributions predicted by the Fisher information matrix. We find excellent agreement for the extrinsic parameters, while the Fisher matrix slightly overestimates errors in the intrinsic parameters
Directory of Open Access Journals (Sweden)
Tamás Petkovits
Full Text Available Although the fungal order Mortierellales constitutes one of the largest classical groups of Zygomycota, its phylogeny is poorly understood and no modern taxonomic revision is currently available. In the present study, 90 type and reference strains were used to infer a comprehensive phylogeny of Mortierellales from the sequence data of the complete ITS region and the LSU and SSU genes with a special attention to the monophyly of the genus Mortierella. Out of 15 alternative partitioning strategies compared on the basis of Bayes factors, the one with the highest number of partitions was found optimal (with mixture models yielding the best likelihood and tree length values, implying a higher complexity of evolutionary patterns in the ribosomal genes than generally recognized. Modeling the ITS1, 5.8S, and ITS2, loci separately improved model fit significantly as compared to treating all as one and the same partition. Further, within-partition mixture models suggests that not only the SSU, LSU and ITS regions evolve under qualitatively and/or quantitatively different constraints, but that significant heterogeneity can be found within these loci also. The phylogenetic analysis indicated that the genus Mortierella is paraphyletic with respect to the genera Dissophora, Gamsiella and Lobosporangium and the resulting phylogeny contradict previous, morphology-based sectional classification of Mortierella. Based on tree structure and phenotypic traits, we recognize 12 major clades, for which we attempt to summarize phenotypic similarities. M. longicollis is closely related to the outgroup taxon Rhizopus oryzae, suggesting that it belongs to the Mucorales. Our results demonstrate that traits used in previous classifications of the Mortierellales are highly homoplastic and that the Mortierellales is in a need of a reclassification, where new, phylogenetically informative phenotypic traits should be identified, with molecular phylogenies playing a decisive role.
Bayesian object classification of gold nanoparticles
Konomi, Bledar A.; Dhavala, Soma S.; Huang, Jianhua Z.; Kundu, Subrata; Huitink, David; Liang, Hong; Ding, Yu; Mallick, Bani K.
2013-01-01
The properties of materials synthesized with nanoparticles (NPs) are highly correlated to the sizes and shapes of the nanoparticles. The transmission electron microscopy (TEM) imaging technique can be used to measure the morphological characteristics of NPs, which can be simple circles or more complex irregular polygons with varying degrees of scales and sizes. A major difficulty in analyzing the TEM images is the overlapping of objects, having different morphological properties with no specific information about the number of objects present. Furthermore, the objects lying along the boundary render automated image analysis much more difficult. To overcome these challenges, we propose a Bayesian method based on the marked-point process representation of the objects. We derive models, both for the marks which parameterize the morphological aspects and the points which determine the location of the objects. The proposed model is an automatic image segmentation and classification procedure, which simultaneously detects the boundaries and classifies the NPs into one of the predetermined shape families. We execute the inference by sampling the posterior distribution using Markov chainMonte Carlo (MCMC) since the posterior is doubly intractable. We apply our novel method to several TEM imaging samples of gold NPs, producing the needed statistical characterization of their morphology. © Institute of Mathematical Statistics, 2013.
BEAST: Bayesian evolutionary analysis by sampling trees
Directory of Open Access Journals (Sweden)
Drummond Alexei J
2007-11-01
Full Text Available Abstract Background The evolutionary analysis of molecular sequence variation is a statistical enterprise. This is reflected in the increased use of probabilistic models for phylogenetic inference, multiple sequence alignment, and molecular population genetics. Here we present BEAST: a fast, flexible software architecture for Bayesian analysis of molecular sequences related by an evolutionary tree. A large number of popular stochastic models of sequence evolution are provided and tree-based models suitable for both within- and between-species sequence data are implemented. Results BEAST version 1.4.6 consists of 81000 lines of Java source code, 779 classes and 81 packages. It provides models for DNA and protein sequence evolution, highly parametric coalescent analysis, relaxed clock phylogenetics, non-contemporaneous sequence data, statistical alignment and a wide range of options for prior distributions. BEAST source code is object-oriented, modular in design and freely available at http://beast-mcmc.googlecode.com/ under the GNU LGPL license. Conclusion BEAST is a powerful and flexible evolutionary analysis package for molecular sequence variation. It also provides a resource for the further development of new models and statistical methods of evolutionary analysis.
Bayesian object classification of gold nanoparticles
Konomi, Bledar A.
2013-06-01
The properties of materials synthesized with nanoparticles (NPs) are highly correlated to the sizes and shapes of the nanoparticles. The transmission electron microscopy (TEM) imaging technique can be used to measure the morphological characteristics of NPs, which can be simple circles or more complex irregular polygons with varying degrees of scales and sizes. A major difficulty in analyzing the TEM images is the overlapping of objects, having different morphological properties with no specific information about the number of objects present. Furthermore, the objects lying along the boundary render automated image analysis much more difficult. To overcome these challenges, we propose a Bayesian method based on the marked-point process representation of the objects. We derive models, both for the marks which parameterize the morphological aspects and the points which determine the location of the objects. The proposed model is an automatic image segmentation and classification procedure, which simultaneously detects the boundaries and classifies the NPs into one of the predetermined shape families. We execute the inference by sampling the posterior distribution using Markov chainMonte Carlo (MCMC) since the posterior is doubly intractable. We apply our novel method to several TEM imaging samples of gold NPs, producing the needed statistical characterization of their morphology. © Institute of Mathematical Statistics, 2013.
Bayesian Reliability Analysis of Non-Stationarity in Multi-agent Systems
Directory of Open Access Journals (Sweden)
TONT Gabriela
2013-05-01
Full Text Available The Bayesian methods provide information about the meaningful parameters in a statistical analysis obtained by combining the prior and sampling distributions to form the posterior distribution of theparameters. The desired inferences are obtained from this joint posterior. An estimation strategy for hierarchical models, where the resulting joint distribution of the associated model parameters cannotbe evaluated analytically, is to use sampling algorithms, known as Markov Chain Monte Carlo (MCMC methods, from which approximate solutions can be obtained. Both serial and parallel configurations of subcomponents are permitted. The capability of time-dependent method to describe a multi-state system is based on a case study, assessingthe operatial situation of studied system. The rationality and validity of the presented model are demonstrated via a case of study. The effect of randomness of the structural parameters is alsoexamined.
Directory of Open Access Journals (Sweden)
Guillermo Amo de Paz
Full Text Available There is a long-standing debate on the extent of vicariance and long-distance dispersal events to explain the current distribution of organisms, especially in those with small diaspores potentially prone to long-distance dispersal. Age estimates of clades play a crucial role in evaluating the impact of these processes. The aim of this study is to understand the evolutionary history of the largest clade of macrolichens, the parmelioid lichens (Parmeliaceae, Lecanoromycetes, Ascomycota by dating the origin of the group and its major lineages. They have a worldwide distribution with centers of distribution in the Neo- and Paleotropics, and semi-arid subtropical regions of the Southern Hemisphere. Phylogenetic analyses were performed using DNA sequences of nuLSU and mtSSU rDNA, and the protein-coding RPB1 gene. The three DNA regions had different evolutionary rates: RPB1 gave a rate two to four times higher than nuLSU and mtSSU. Divergence times of the major clades were estimated with partitioned BEAST analyses allowing different rates for each DNA region and using a relaxed clock model. Three calibrations points were used to date the tree: an inferred age at the stem of Lecanoromycetes, and two dated fossils: Parmelia in the parmelioid group, and Alectoria. Palaeoclimatic conditions and the palaeogeological area cladogram were compared to the dated phylogeny of parmelioid. The parmelioid group diversified around the K/T boundary, and the major clades diverged during the Eocene and Oligocene. The radiation of the genera occurred through globally changing climatic condition of the early Oligocene, Miocene and early Pliocene. The estimated divergence times are consistent with long-distance dispersal events being the major factor to explain the biogeographical distribution patterns of Southern Hemisphere parmelioids, especially for Africa-Australia disjunctions, because the sequential break-up of Gondwana started much earlier than the origin of these
Acceptance probability of IP-MCMC-PF: revisited
Iglesias Garcia, Fernando; Bocquel, Melanie; Mandal, Pranab K.; Driessen, Hans
2015-01-01
Tracking of multiple objects via particle filtering faces the difficulty of dealing effectively with high dimensional state spaces. One efficient solution consists of integrating Markov chain Monte Carlo (MCMC) sampling at the core of the particle filter. To accomplish such integration, a few
Bayesian binary regression model: an application to in-hospital death after AMI prediction
Directory of Open Access Journals (Sweden)
Aparecida D. P. Souza
2004-08-01
Full Text Available A Bayesian binary regression model is developed to predict death of patients after acute myocardial infarction (AMI. Markov Chain Monte Carlo (MCMC methods are used to make inference and to evaluate Bayesian binary regression models. A model building strategy based on Bayes factor is proposed and aspects of model validation are extensively discussed in the paper, including the posterior distribution for the c-index and the analysis of residuals. Risk assessment, based on variables easily available within minutes of the patients' arrival at the hospital, is very important to decide the course of the treatment. The identified model reveals itself strongly reliable and accurate, with a rate of correct classification of 88% and a concordance index of 83%.Um modelo bayesiano de regressão binária é desenvolvido para predizer óbito hospitalar em pacientes acometidos por infarto agudo do miocárdio. Métodos de Monte Carlo via Cadeias de Markov (MCMC são usados para fazer inferência e validação. Uma estratégia para construção de modelos, baseada no uso do fator de Bayes, é proposta e aspectos de validação são extensivamente discutidos neste artigo, incluindo a distribuição a posteriori para o índice de concordância e análise de resíduos. A determinação de fatores de risco, baseados em variáveis disponíveis na chegada do paciente ao hospital, é muito importante para a tomada de decisão sobre o curso do tratamento. O modelo identificado se revela fortemente confiável e acurado, com uma taxa de classificação correta de 88% e um índice de concordância de 83%.
Understanding Computational Bayesian Statistics
Bolstad, William M
2011-01-01
A hands-on introduction to computational statistics from a Bayesian point of view Providing a solid grounding in statistics while uniquely covering the topics from a Bayesian perspective, Understanding Computational Bayesian Statistics successfully guides readers through this new, cutting-edge approach. With its hands-on treatment of the topic, the book shows how samples can be drawn from the posterior distribution when the formula giving its shape is all that is known, and how Bayesian inferences can be based on these samples from the posterior. These ideas are illustrated on common statistic
Directory of Open Access Journals (Sweden)
Filippo Trentini
Full Text Available We consider modeling jointly microarray RNA expression and DNA copy number data. We propose Bayesian mixture models that define latent Gaussian probit scores for the DNA and RNA, and integrate between the two platforms via a regression of the RNA probit scores on the DNA probit scores. Such a regression conveniently allows us to include additional sample specific covariates such as biological conditions and clinical outcomes. The two developed methods are aimed respectively to make inference on differential behaviour of genes in patients showing different subtypes of breast cancer and to predict the pathological complete response (pCR of patients borrowing strength across the genomic platforms. Posterior inference is carried out via MCMC simulations. We demonstrate the proposed methodology using a published data set consisting of 121 breast cancer patients.
Can a significance test be genuinely Bayesian?
Pereira, Carlos A. de B.; Stern, Julio Michael; Wechsler, Sergio
2008-01-01
The Full Bayesian Significance Test, FBST, is extensively reviewed. Its test statistic, a genuine Bayesian measure of evidence, is discussed in detail. Its behavior in some problems of statistical inference like testing for independence in contingency tables is discussed.
A Bayesian setting for an inverse problem in heat transfer
Ruggeri, Fabrizio; Sawlan, Zaid A; Scavino, Marco; Tempone, Raul
2014-01-01
In this work a Bayesian setting is developed to infer the thermal conductivity, an unknown parameter that appears into heat equation. Temperature data are available on the basis of cooling experiments. The realistic assumption that the boundary data are noisy is introduced, for a given prescribed initial condition. We show how to derive the global likelihood function for the forward boundary-initial condition problem, given the values of the temperature field plus Gaussian noise. We assume that the thermal conductivity parameter can be modelled a priori through a lognormal distributed random variable or by means of a space-dependent stationary lognormal random field. In both cases, given Gaussian priors for the time-dependent Dirichlet boundary values, we marginalize out analytically the joint posterior distribution of and the random boundary conditions, TL and TR, using the linearity of the heat equation. Synthetic data are used to carry out the inference. We exploit the concentration of the posterior distribution of , using the Laplace approximation and therefore avoiding costly MCMC computations.
A Bayesian setting for an inverse problem in heat transfer
Ruggeri, Fabrizio
2014-01-06
In this work a Bayesian setting is developed to infer the thermal conductivity, an unknown parameter that appears into heat equation. Temperature data are available on the basis of cooling experiments. The realistic assumption that the boundary data are noisy is introduced, for a given prescribed initial condition. We show how to derive the global likelihood function for the forward boundary-initial condition problem, given the values of the temperature field plus Gaussian noise. We assume that the thermal conductivity parameter can be modelled a priori through a lognormal distributed random variable or by means of a space-dependent stationary lognormal random field. In both cases, given Gaussian priors for the time-dependent Dirichlet boundary values, we marginalize out analytically the joint posterior distribution of and the random boundary conditions, TL and TR, using the linearity of the heat equation. Synthetic data are used to carry out the inference. We exploit the concentration of the posterior distribution of , using the Laplace approximation and therefore avoiding costly MCMC computations.
Bayesian nonparametric estimation of hazard rate in monotone Aalen model
Czech Academy of Sciences Publication Activity Database
Timková, Jana
2014-01-01
Roč. 50, č. 6 (2014), s. 849-868 ISSN 0023-5954 Institutional support: RVO:67985556 Keywords : Aalen model * Bayesian estimation * MCMC Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.541, year: 2014 http://library.utia.cas.cz/separaty/2014/SI/timkova-0438210.pdf
Bayesian Inference in Statistical Analysis
Box, George E P
2011-01-01
The Wiley Classics Library consists of selected books that have become recognized classics in their respective fields. With these new unabridged and inexpensive editions, Wiley hopes to extend the life of these important works by making them available to future generations of mathematicians and scientists. Currently available in the Series: T. W. Anderson The Statistical Analysis of Time Series T. S. Arthanari & Yadolah Dodge Mathematical Programming in Statistics Emil Artin Geometric Algebra Norman T. J. Bailey The Elements of Stochastic Processes with Applications to the Natural Sciences Rob
Bayesian Regression of Thermodynamic Models of Redox Active Materials
Energy Technology Data Exchange (ETDEWEB)
Johnston, Katherine [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
2017-09-01
Finding a suitable functional redox material is a critical challenge to achieving scalable, economically viable technologies for storing concentrated solar energy in the form of a defected oxide. Demonstrating e ectiveness for thermal storage or solar fuel is largely accomplished by using a thermodynamic model derived from experimental data. The purpose of this project is to test the accuracy of our regression model on representative data sets. Determining the accuracy of the model includes parameter tting the model to the data, comparing the model using di erent numbers of param- eters, and analyzing the entropy and enthalpy calculated from the model. Three data sets were considered in this project: two demonstrating materials for solar fuels by wa- ter splitting and the other of a material for thermal storage. Using Bayesian Inference and Markov Chain Monte Carlo (MCMC), parameter estimation was preformed on the three data sets. Good results were achieved, except some there was some deviations on the edges of the data input ranges. The evidence values were then calculated in a variety of ways and used to compare models with di erent number of parameters. It was believed that at least one of the parameters was unnecessary and comparing evidence values demonstrated that the parameter was need on one data set and not signi cantly helpful on another. The entropy was calculated by taking the derivative in one variable and integrating over another. and its uncertainty was also calculated by evaluating the entropy over multiple MCMC samples. Afterwards, all the parts were written up as a tutorial for the Uncertainty Quanti cation Toolkit (UQTk).
Bayesian analysis in plant pathology.
Mila, A L; Carriquiry, A L
2004-09-01
ABSTRACT Bayesian methods are currently much discussed and applied in several disciplines from molecular biology to engineering. Bayesian inference is the process of fitting a probability model to a set of data and summarizing the results via probability distributions on the parameters of the model and unobserved quantities such as predictions for new observations. In this paper, after a short introduction of Bayesian inference, we present the basic features of Bayesian methodology using examples from sequencing genomic fragments and analyzing microarray gene-expressing levels, reconstructing disease maps, and designing experiments.
Uncertainty Quantification Bayesian Framework for Porous Media Flows
Demyanov, V.; Christie, M.; Erbas, D.
2005-12-01
Uncertainty quantification is an increasingly important aspect of many areas of applied science, where the challenge is to make reliable predictions about the performance of complex physical systems in the absence of complete or reliable data. Predicting flows of fluids through undersurface reservoirs is an example of a complex system where accuracy in prediction is needed (e.g. in oil industry it is essential for financial reasons). Simulation of fluid flow in oil reservoirs is usually carried out using large commercially written finite difference simulators solving conservation equations describing the multi-phase flow through the porous reservoir rocks, which is a highly computationally expensive task. This work examines a Bayesian Framework for uncertainty quantification in porous media flows that uses a stochastic sampling algorithm to generate models that match observed time series data. The framework is flexible for a wide range of general physical/statistical parametric models, which are used to describe the underlying hydro-geological process in its temporal dynamics. The approach is based on exploration of the parameter space and update of the prior beliefs about what the most likely model definitions are. Optimization problem for a highly parametric physical model usually have multiple solutions, which impact the uncertainty of the made predictions. Stochastic search algorithm (e.g. genetic algorithm) allows to identify multiple "good enough" models in the parameter space. Furthermore, inference of the generated model ensemble via MCMC based algorithm evaluates the posterior probability of the generated models and quantifies uncertainty of the predictions. Machine learning algorithm - Artificial Neural Networks - are used to speed up the identification of regions in parameter space where good matches to observed data can be found. Adaptive nature of ANN allows to develop different ways of integrating them into the Bayesian framework: as direct time
Ransom, Katherine M.; Bell, Andrew M.; Barber, Quinn E.; Kourakos, George; Harter, Thomas
2018-05-01
This study is focused on nitrogen loading from a wide variety of crop and land-use types in the Central Valley, California, USA, an intensively farmed region with high agricultural crop diversity. Nitrogen loading rates for several crop types have been measured based on field-scale experiments, and recent research has calculated nitrogen loading rates for crops throughout the Central Valley based on a mass balance approach. However, research is lacking to infer nitrogen loading rates for the broad diversity of crop and land-use types directly from groundwater nitrate measurements. Relating groundwater nitrate measurements to specific crops must account for the uncertainty about and multiplicity in contributing crops (and other land uses) to individual well measurements, and for the variability of nitrogen loading within farms and from farm to farm for the same crop type. In this study, we developed a Bayesian regression model that allowed us to estimate land-use-specific groundwater nitrogen loading rate probability distributions for 15 crop and land-use groups based on a database of recent nitrate measurements from 2149 private wells in the Central Valley. The water and natural, rice, and alfalfa and pasture groups had the lowest median estimated nitrogen loading rates, each with a median estimate below 5 kg N ha-1 yr-1. Confined animal feeding operations (dairies) and citrus and subtropical crops had the greatest median estimated nitrogen loading rates at approximately 269 and 65 kg N ha-1 yr-1, respectively. In general, our probability-based estimates compare favorably with previous direct measurements and with mass-balance-based estimates of nitrogen loading. Nitrogen mass-balance-based estimates are larger than our groundwater nitrate derived estimates for manured and nonmanured forage, nuts, cotton, tree fruit, and rice crops. These discrepancies are thought to be due to groundwater age mixing, dilution from infiltrating river water, or denitrification
Bayesian theory and applications
Dellaportas, Petros; Polson, Nicholas G; Stephens, David A
2013-01-01
The development of hierarchical models and Markov chain Monte Carlo (MCMC) techniques forms one of the most profound advances in Bayesian analysis since the 1970s and provides the basis for advances in virtually all areas of applied and theoretical Bayesian statistics. This volume guides the reader along a statistical journey that begins with the basic structure of Bayesian theory, and then provides details on most of the past and present advances in this field. The book has a unique format. There is an explanatory chapter devoted to each conceptual advance followed by journal-style chapters that provide applications or further advances on the concept. Thus, the volume is both a textbook and a compendium of papers covering a vast range of topics. It is appropriate for a well-informed novice interested in understanding the basic approach, methods and recent applications. Because of its advanced chapters and recent work, it is also appropriate for a more mature reader interested in recent applications and devel...
von der Linden, Wolfgang; Dose, Volker; von Toussaint, Udo
2014-06-01
Preface; Part I. Introduction: 1. The meaning of probability; 2. Basic definitions; 3. Bayesian inference; 4. Combinatrics; 5. Random walks; 6. Limit theorems; 7. Continuous distributions; 8. The central limit theorem; 9. Poisson processes and waiting times; Part II. Assigning Probabilities: 10. Transformation invariance; 11. Maximum entropy; 12. Qualified maximum entropy; 13. Global smoothness; Part III. Parameter Estimation: 14. Bayesian parameter estimation; 15. Frequentist parameter estimation; 16. The Cramer-Rao inequality; Part IV. Testing Hypotheses: 17. The Bayesian way; 18. The frequentist way; 19. Sampling distributions; 20. Bayesian vs frequentist hypothesis tests; Part V. Real World Applications: 21. Regression; 22. Inconsistent data; 23. Unrecognized signal contributions; 24. Change point problems; 25. Function estimation; 26. Integral equations; 27. Model selection; 28. Bayesian experimental design; Part VI. Probabilistic Numerical Techniques: 29. Numerical integration; 30. Monte Carlo methods; 31. Nested sampling; Appendixes; References; Index.
Minsley, Burke J.
2011-01-01
A meaningful interpretation of geophysical measurements requires an assessment of the space of models that are consistent with the data, rather than just a single, ‘best’ model which does not convey information about parameter uncertainty. For this purpose, a trans-dimensional Bayesian Markov chain Monte Carlo (MCMC) algorithm is developed for assessing frequencydomain electromagnetic (FDEM) data acquired from airborne or ground-based systems. By sampling the distribution of models that are consistent with measured data and any prior knowledge, valuable inferences can be made about parameter values such as the likely depth to an interface, the distribution of possible resistivity values as a function of depth and non-unique relationships between parameters. The trans-dimensional aspect of the algorithm allows the number of layers to be a free parameter that is controlled by the data, where models with fewer layers are inherently favoured, which provides a natural measure of parsimony and a signiﬁcant degree of ﬂexibility in parametrization. The MCMC algorithm is used with synthetic examples to illustrate how the distribution of acceptable models is affected by the choice of prior information, the system geometry and conﬁguration and the uncertainty in the measured system elevation. An airborne FDEM data set that was acquired for the purpose of hydrogeological characterization is also studied. The results compare favorably with traditional least-squares analysis, borehole resistivity and lithology logs from the site, and also provide new information about parameter uncertainty necessary for model assessment.
Aggelopoulos, Nikolaos C
2015-08-01
Perceptual inference refers to the ability to infer sensory stimuli from predictions that result from internal neural representations built through prior experience. Methods of Bayesian statistical inference and decision theory model cognition adequately by using error sensing either in guiding action or in "generative" models that predict the sensory information. In this framework, perception can be seen as a process qualitatively distinct from sensation, a process of information evaluation using previously acquired and stored representations (memories) that is guided by sensory feedback. The stored representations can be utilised as internal models of sensory stimuli enabling long term associations, for example in operant conditioning. Evidence for perceptual inference is contributed by such phenomena as the cortical co-localisation of object perception with object memory, the response invariance in the responses of some neurons to variations in the stimulus, as well as from situations in which perception can be dissociated from sensation. In the context of perceptual inference, sensory areas of the cerebral cortex that have been facilitated by a priming signal may be regarded as comparators in a closed feedback loop, similar to the better known motor reflexes in the sensorimotor system. The adult cerebral cortex can be regarded as similar to a servomechanism, in using sensory feedback to correct internal models, producing predictions of the outside world on the basis of past experience. Copyright © 2015 Elsevier Ltd. All rights reserved.
Borsboom, D.; Haig, B.D.
2013-01-01
Unlike most other statistical frameworks, Bayesian statistical inference is wedded to a particular approach in the philosophy of science (see Howson & Urbach, 2006); this approach is called Bayesianism. Rather than being concerned with model fitting, this position in the philosophy of science
Plug & Play object oriented Bayesian networks
DEFF Research Database (Denmark)
Bangsø, Olav; Flores, J.; Jensen, Finn Verner
2003-01-01
been shown to be quite suitable for dynamic domains as well. However, processing object oriented Bayesian networks in practice does not take advantage of their modular structure. Normally the object oriented Bayesian network is transformed into a Bayesian network and, inference is performed...... dynamic domains. The communication needed between instances is achieved by means of a fill-in propagation scheme....
Kim, Sungduk; Chen, Ming-Hui; Ibrahim, Joseph G.; Shah, Arvind K.; Lin, Jianxin
2013-01-01
In this paper, we propose a class of Box-Cox transformation regression models with multidimensional random effects for analyzing multivariate responses for individual patient data (IPD) in meta-analysis. Our modeling formulation uses a multivariate normal response meta-analysis model with multivariate random effects, in which each response is allowed to have its own Box-Cox transformation. Prior distributions are specified for the Box-Cox transformation parameters as well as the regression coefficients in this complex model, and the Deviance Information Criterion (DIC) is used to select the best transformation model. Since the model is quite complex, a novel Monte Carlo Markov chain (MCMC) sampling scheme is developed to sample from the joint posterior of the parameters. This model is motivated by a very rich dataset comprising 26 clinical trials involving cholesterol lowering drugs where the goal is to jointly model the three dimensional response consisting of Low Density Lipoprotein Cholesterol (LDL-C), High Density Lipoprotein Cholesterol (HDL-C), and Triglycerides (TG) (LDL-C, HDL-C, TG). Since the joint distribution of (LDL-C, HDL-C, TG) is not multivariate normal and in fact quite skewed, a Box-Cox transformation is needed to achieve normality. In the clinical literature, these three variables are usually analyzed univariately: however, a multivariate approach would be more appropriate since these variables are correlated with each other. A detailed analysis of these data is carried out using the proposed methodology. PMID:23580436
Ghosh, Sujit K
2010-01-01
Bayesian methods are rapidly becoming popular tools for making statistical inference in various fields of science including biology, engineering, finance, and genetics. One of the key aspects of Bayesian inferential method is its logical foundation that provides a coherent framework to utilize not only empirical but also scientific information available to a researcher. Prior knowledge arising from scientific background, expert judgment, or previously collected data is used to build a prior distribution which is then combined with current data via the likelihood function to characterize the current state of knowledge using the so-called posterior distribution. Bayesian methods allow the use of models of complex physical phenomena that were previously too difficult to estimate (e.g., using asymptotic approximations). Bayesian methods offer a means of more fully understanding issues that are central to many practical problems by allowing researchers to build integrated models based on hierarchical conditional distributions that can be estimated even with limited amounts of data. Furthermore, advances in numerical integration methods, particularly those based on Monte Carlo methods, have made it possible to compute the optimal Bayes estimators. However, there is a reasonably wide gap between the background of the empirically trained scientists and the full weight of Bayesian statistical inference. Hence, one of the goals of this chapter is to bridge the gap by offering elementary to advanced concepts that emphasize linkages between standard approaches and full probability modeling via Bayesian methods.
Bayesian modeling using WinBUGS
Ntzoufras, Ioannis
2009-01-01
A hands-on introduction to the principles of Bayesian modeling using WinBUGS Bayesian Modeling Using WinBUGS provides an easily accessible introduction to the use of WinBUGS programming techniques in a variety of Bayesian modeling settings. The author provides an accessible treatment of the topic, offering readers a smooth introduction to the principles of Bayesian modeling with detailed guidance on the practical implementation of key principles. The book begins with a basic introduction to Bayesian inference and the WinBUGS software and goes on to cover key topics, including: Markov Chain Monte Carlo algorithms in Bayesian inference Generalized linear models Bayesian hierarchical models Predictive distribution and model checking Bayesian model and variable evaluation Computational notes and screen captures illustrate the use of both WinBUGS as well as R software to apply the discussed techniques. Exercises at the end of each chapter allow readers to test their understanding of the presented concepts and all ...
Variational inference & deep learning : A new synthesis
Kingma, D.P.
2017-01-01
In this thesis, Variational Inference and Deep Learning: A New Synthesis, we propose novel solutions to the problems of variational (Bayesian) inference, generative modeling, representation learning, semi-supervised learning, and stochastic optimization.
Variational inference & deep learning: A new synthesis
Kingma, D.P.
2017-01-01
In this thesis, Variational Inference and Deep Learning: A New Synthesis, we propose novel solutions to the problems of variational (Bayesian) inference, generative modeling, representation learning, semi-supervised learning, and stochastic optimization.
Rahpeyma, Sahar; Halldorsson, Benedikt; Hrafnkelsson, Birgir; Jonsson, Sigurjon
2018-01-01
Knowledge of the characteristics of earthquake ground motion is fundamental for earthquake hazard assessments. Over small distances, relative to the source–site distance, where uniform site conditions are expected, the ground motion variability is also expected to be insignificant. However, despite being located on what has been characterized as a uniform lava‐rock site condition, considerable peak ground acceleration (PGA) variations were observed on stations of a small‐aperture array (covering approximately 1 km2) of accelerographs in Southwest Iceland during the Ölfus earthquake of magnitude 6.3 on May 29, 2008 and its sequence of aftershocks. We propose a novel Bayesian hierarchical model for the PGA variations accounting separately for earthquake event effects, station effects, and event‐station effects. An efficient posterior inference scheme based on Markov chain Monte Carlo (MCMC) simulations is proposed for the new model. The variance of the station effect is certainly different from zero according to the posterior density, indicating that individual station effects are different from one another. The Bayesian hierarchical model thus captures the observed PGA variations and quantifies to what extent the source and recording sites contribute to the overall variation in ground motions over relatively small distances on the lava‐rock site condition.
Rahpeyma, Sahar
2018-04-17
Knowledge of the characteristics of earthquake ground motion is fundamental for earthquake hazard assessments. Over small distances, relative to the source–site distance, where uniform site conditions are expected, the ground motion variability is also expected to be insignificant. However, despite being located on what has been characterized as a uniform lava‐rock site condition, considerable peak ground acceleration (PGA) variations were observed on stations of a small‐aperture array (covering approximately 1 km2) of accelerographs in Southwest Iceland during the Ölfus earthquake of magnitude 6.3 on May 29, 2008 and its sequence of aftershocks. We propose a novel Bayesian hierarchical model for the PGA variations accounting separately for earthquake event effects, station effects, and event‐station effects. An efficient posterior inference scheme based on Markov chain Monte Carlo (MCMC) simulations is proposed for the new model. The variance of the station effect is certainly different from zero according to the posterior density, indicating that individual station effects are different from one another. The Bayesian hierarchical model thus captures the observed PGA variations and quantifies to what extent the source and recording sites contribute to the overall variation in ground motions over relatively small distances on the lava‐rock site condition.
Directory of Open Access Journals (Sweden)
Moslem Moradi
2015-06-01
Full Text Available Here in, an application of a new seismic inversion algorithm in one of Iran’s oilfields is described. Stochastic (geostatistical seismic inversion, as a complementary method to deterministic inversion, is perceived as contribution combination of geostatistics and seismic inversion algorithm. This method integrates information from different data sources with different scales, as prior information in Bayesian statistics. Data integration leads to a probability density function (named as a posteriori probability that can yield a model of subsurface. The Markov Chain Monte Carlo (MCMC method is used to sample the posterior probability distribution, and the subsurface model characteristics can be extracted by analyzing a set of the samples. In this study, the theory of stochastic seismic inversion in a Bayesian framework was described and applied to infer P-impedance and porosity models. The comparison between the stochastic seismic inversion and the deterministic model based seismic inversion indicates that the stochastic seismic inversion can provide more detailed information of subsurface character. Since multiple realizations are extracted by this method, an estimation of pore volume and uncertainty in the estimation were analyzed.
Liu, Y.; Pau, G. S. H.; Finsterle, S.
2015-12-01
Parameter inversion involves inferring the model parameter values based on sparse observations of some observables. To infer the posterior probability distributions of the parameters, Markov chain Monte Carlo (MCMC) methods are typically used. However, the large number of forward simulations needed and limited computational resources limit the complexity of the hydrological model we can use in these methods. In view of this, we studied the implicit sampling (IS) method, an efficient importance sampling technique that generates samples in the high-probability region of the posterior distribution and thus reduces the number of forward simulations that we need to run. For a pilot-point inversion of a heterogeneous permeability field based on a synthetic ponded infiltration experiment simulated with TOUGH2 (a subsurface modeling code), we showed that IS with linear map provides an accurate Bayesian description of the parameterized permeability field at the pilot points with just approximately 500 forward simulations. We further studied the use of surrogate models to improve the computational efficiency of parameter inversion. We implemented two reduced-order models (ROMs) for the TOUGH2 forward model. One is based on polynomial chaos expansion (PCE), of which the coefficients are obtained using the sparse Bayesian learning technique to mitigate the "curse of dimensionality" of the PCE terms. The other model is Gaussian process regression (GPR) for which different covariance, likelihood and inference models are considered. Preliminary results indicate that ROMs constructed based on the prior parameter space perform poorly. It is thus impractical to replace this hydrological model by a ROM directly in a MCMC method. However, the IS method can work with a ROM constructed for parameters in the close vicinity of the maximum a posteriori probability (MAP) estimate. We will discuss the accuracy and computational efficiency of using ROMs in the implicit sampling procedure
Bayesian methods for proteomic biomarker development
Directory of Open Access Journals (Sweden)
Belinda Hernández
2015-12-01
In this review we provide an introduction to Bayesian inference and demonstrate some of the advantages of using a Bayesian framework. We summarize how Bayesian methods have been used previously in proteomics and other areas of bioinformatics. Finally, we describe some popular and emerging Bayesian models from the statistical literature and provide a worked tutorial including code snippets to show how these methods may be applied for the evaluation of proteomic biomarkers.
Numerical Methods for Bayesian Inverse Problems
Ernst, Oliver
2014-01-06
We present recent results on Bayesian inversion for a groundwater flow problem with an uncertain conductivity field. In particular, we show how direct and indirect measurements can be used to obtain a stochastic model for the unknown. The main tool here is Bayes’ theorem which merges the indirect data with the stochastic prior model for the conductivity field obtained by the direct measurements. Further, we demonstrate how the resulting posterior distribution of the quantity of interest, in this case travel times of radionuclide contaminants, can be obtained by Markov Chain Monte Carlo (MCMC) simulations. Moreover, we investigate new, promising MCMC methods which exploit geometrical features of the posterior and which are suited to infinite dimensions.
Numerical Methods for Bayesian Inverse Problems
Ernst, Oliver; Sprungk, Bjorn; Cliffe, K. Andrew; Starkloff, Hans-Jorg
2014-01-01
We present recent results on Bayesian inversion for a groundwater flow problem with an uncertain conductivity field. In particular, we show how direct and indirect measurements can be used to obtain a stochastic model for the unknown. The main tool here is Bayes’ theorem which merges the indirect data with the stochastic prior model for the conductivity field obtained by the direct measurements. Further, we demonstrate how the resulting posterior distribution of the quantity of interest, in this case travel times of radionuclide contaminants, can be obtained by Markov Chain Monte Carlo (MCMC) simulations. Moreover, we investigate new, promising MCMC methods which exploit geometrical features of the posterior and which are suited to infinite dimensions.
Directory of Open Access Journals (Sweden)
Thomas Christopher M
2011-07-01
Full Text Available Abstract Background IncP-1 plasmids are broad host range plasmids that have been found in clinical and environmental bacteria. They often carry genes for antibiotic resistance or catabolic pathways. The archetypal IncP-1 plasmid RK2 is a well-characterized biological system, with a fully sequenced and annotated genome and wide range of experimental measurements. Its central control operon, encoding two global regulators KorA and KorB, is a natural example of a negatively self-regulated operon. To increase our understanding of the regulation of this operon, we have constructed a dynamical mathematical model using Ordinary Differential Equations, and employed a Bayesian inference scheme, Markov Chain Monte Carlo (MCMC using the Metropolis-Hastings algorithm, as a way of integrating experimental measurements and a priori knowledge. We also compared MCMC and Metabolic Control Analysis (MCA as approaches for determining the sensitivity of model parameters. Results We identified two distinct sets of parameter values, with different biological interpretations, that fit and explain the experimental data. This allowed us to highlight the proportion of repressor protein as dimers as a key experimental measurement defining the dynamics of the system. Analysis of joint posterior distributions led to the identification of correlations between parameters for protein synthesis and partial repression by KorA or KorB dimers, indicating the necessary use of joint posteriors for correct parameter estimation. Using MCA, we demonstrated that the system is highly sensitive to the growth rate but insensitive to repressor monomerization rates in their selected value regions; the latter outcome was also confirmed by MCMC. Finally, by examining a series of different model refinements for partial repression by KorA or KorB dimers alone, we showed that a model including partial repression by KorA and KorB was most compatible with existing experimental data. Conclusions We
Bayesian estimates of linkage disequilibrium
Directory of Open Access Journals (Sweden)
Abad-Grau María M
2007-06-01
Full Text Available Abstract Background The maximum likelihood estimator of D' – a standard measure of linkage disequilibrium – is biased toward disequilibrium, and the bias is particularly evident in small samples and rare haplotypes. Results This paper proposes a Bayesian estimation of D' to address this problem. The reduction of the bias is achieved by using a prior distribution on the pair-wise associations between single nucleotide polymorphisms (SNPs that increases the likelihood of equilibrium with increasing physical distances between pairs of SNPs. We show how to compute the Bayesian estimate using a stochastic estimation based on MCMC methods, and also propose a numerical approximation to the Bayesian estimates that can be used to estimate patterns of LD in large datasets of SNPs. Conclusion Our Bayesian estimator of D' corrects the bias toward disequilibrium that affects the maximum likelihood estimator. A consequence of this feature is a more objective view about the extent of linkage disequilibrium in the human genome, and a more realistic number of tagging SNPs to fully exploit the power of genome wide association studies.
Molitor, John
2012-03-01
Bayesian methods have seen an increase in popularity in a wide variety of scientific fields, including epidemiology. One of the main reasons for their widespread application is the power of the Markov chain Monte Carlo (MCMC) techniques generally used to fit these models. As a result, researchers often implicitly associate Bayesian models with MCMC estimation procedures. However, Bayesian models do not always require Markov-chain-based methods for parameter estimation. This is important, as MCMC estimation methods, while generally quite powerful, are complex and computationally expensive and suffer from convergence problems related to the manner in which they generate correlated samples used to estimate probability distributions for parameters of interest. In this issue of the Journal, Cole et al. (Am J Epidemiol. 2012;175(5):368-375) present an interesting paper that discusses non-Markov-chain-based approaches to fitting Bayesian models. These methods, though limited, can overcome some of the problems associated with MCMC techniques and promise to provide simpler approaches to fitting Bayesian models. Applied researchers will find these estimation approaches intuitively appealing and will gain a deeper understanding of Bayesian models through their use. However, readers should be aware that other non-Markov-chain-based methods are currently in active development and have been widely published in other fields.
Gomes, Guilherme J. C.; Vrugt, Jasper A.; Vargas, Eurípedes A.
2016-04-01
The depth to bedrock controls a myriad of processes by influencing subsurface flow paths, erosion rates, soil moisture, and water uptake by plant roots. As hillslope interiors are very difficult and costly to illuminate and access, the topography of the bedrock surface is largely unknown. This essay is concerned with the prediction of spatial patterns in the depth to bedrock (DTB) using high-resolution topographic data, numerical modeling, and Bayesian analysis. Our DTB model builds on the bottom-up control on fresh-bedrock topography hypothesis of Rempe and Dietrich (2014) and includes a mass movement and bedrock-valley morphology term to extent the usefulness and general applicability of the model. We reconcile the DTB model with field observations using Bayesian analysis with the DREAM algorithm. We investigate explicitly the benefits of using spatially distributed parameter values to account implicitly, and in a relatively simple way, for rock mass heterogeneities that are very difficult, if not impossible, to characterize adequately in the field. We illustrate our method using an artificial data set of bedrock depth observations and then evaluate our DTB model with real-world data collected at the Papagaio river basin in Rio de Janeiro, Brazil. Our results demonstrate that the DTB model predicts accurately the observed bedrock depth data. The posterior mean DTB simulation is shown to be in good agreement with the measured data. The posterior prediction uncertainty of the DTB model can be propagated forward through hydromechanical models to derive probabilistic estimates of factors of safety.
Statistical inference an integrated Bayesianlikelihood approach
Aitkin, Murray
2010-01-01
Filling a gap in current Bayesian theory, Statistical Inference: An Integrated Bayesian/Likelihood Approach presents a unified Bayesian treatment of parameter inference and model comparisons that can be used with simple diffuse prior specifications. This novel approach provides new solutions to difficult model comparison problems and offers direct Bayesian counterparts of frequentist t-tests and other standard statistical methods for hypothesis testing.After an overview of the competing theories of statistical inference, the book introduces the Bayes/likelihood approach used throughout. It pre
DEFF Research Database (Denmark)
Shariati, M M; Korsgaard, I R; Sorensen, D
2009-01-01
model with unknown covariates (RNUC) is a model in which unknown environmental effects can be inferred jointly with the remaining parameters. The problem of identifiability of parameters at the level of the likelihood and the associated behaviour of MCMC chains were discussed using the RNUC...... as fixed and there are other fixed factors in the model, the contrasts involving environmental effects, the variance of environmental sensitivities (genetic slopes) and the residual variance are the only identifiable parameters. These different identifiability scenarios were generated by changing...... as an example. It was shown theoretically that when environmental effects (covariates) are considered as random effects, estimable functions of the fixed effects, (co)variance components and genetic effects are identifiable as well as the environmental effects. When the environmental effects are treated...
Bayesian analysis of spatial point processes in the neighbourhood of Voronoi networks
DEFF Research Database (Denmark)
Skare, Øivind; Møller, Jesper; Jensen, Eva Bjørn Vedel
2007-01-01
A model for an inhomogeneous Poisson process with high intensity near the edges of a Voronoi tessellation in 2D or 3D is proposed. The model is analysed in a Bayesian setting with priors on nuclei of the Voronoi tessellation and other model parameters. An MCMC algorithm is constructed to sample...
Bayesian analysis of spatial point processes in the neighbourhood of Voronoi networks
DEFF Research Database (Denmark)
Skare, Øivind; Møller, Jesper; Vedel Jensen, Eva B.
A model for an inhomogeneous Poisson process with high intensity near the edges of a Voronoi tessellation in 2D or 3D is proposed. The model is analysed in a Bayesian setting with priors on nuclei of the Voronoi tessellation and other model parameters. An MCMC algorithm is constructed to sample...
An MCMC determination of the primordial helium abundance
Aver, Erik; Olive, Keith A.; Skillman, Evan D.
2012-04-01
Spectroscopic observations of the chemical abundances in metal-poor H II regions provide an independent method for estimating the primordial helium abundance. H II regions are described by several physical parameters such as electron density, electron temperature, and reddening, in addition to y, the ratio of helium to hydrogen. It had been customary to estimate or determine self-consistently these parameters to calculate y. Frequentist analyses of the parameter space have been shown to be successful in these parameter determinations, and Markov Chain Monte Carlo (MCMC) techniques have proven to be very efficient in sampling this parameter space. Nevertheless, accurate determination of the primordial helium abundance from observations of H II regions is constrained by both systematic and statistical uncertainties. In an attempt to better reduce the latter, and continue to better characterize the former, we apply MCMC methods to the large dataset recently compiled by Izotov, Thuan, & Stasińska (2007). To improve the reliability of the determination, a high quality dataset is needed. In pursuit of this, a variety of cuts are explored. The efficacy of the He I λ4026 emission line as a constraint on the solutions is first examined, revealing the introduction of systematic bias through its absence. As a clear measure of the quality of the physical solution, a χ2 analysis proves instrumental in the selection of data compatible with the theoretical model. Nearly two-thirds of the observations fall outside a standard 95% confidence level cut, which highlights the care necessary in selecting systems and warrants further investigation into potential deficiencies of the model or data. In addition, the method also allows us to exclude systems for which parameter estimations are statistical outliers. As a result, the final selected dataset gains in reliability and exhibits improved consistency. Regression to zero metallicity yields Yp = 0.2534 ± 0.0083, in broad agreement
An MCMC determination of the primordial helium abundance
International Nuclear Information System (INIS)
Aver, Erik; Olive, Keith A.; Skillman, Evan D.
2012-01-01
Spectroscopic observations of the chemical abundances in metal-poor H II regions provide an independent method for estimating the primordial helium abundance. H II regions are described by several physical parameters such as electron density, electron temperature, and reddening, in addition to y, the ratio of helium to hydrogen. It had been customary to estimate or determine self-consistently these parameters to calculate y. Frequentist analyses of the parameter space have been shown to be successful in these parameter determinations, and Markov Chain Monte Carlo (MCMC) techniques have proven to be very efficient in sampling this parameter space. Nevertheless, accurate determination of the primordial helium abundance from observations of H II regions is constrained by both systematic and statistical uncertainties. In an attempt to better reduce the latter, and continue to better characterize the former, we apply MCMC methods to the large dataset recently compiled by Izotov, Thuan, and Stasińska (2007). To improve the reliability of the determination, a high quality dataset is needed. In pursuit of this, a variety of cuts are explored. The efficacy of the He I λ4026 emission line as a constraint on the solutions is first examined, revealing the introduction of systematic bias through its absence. As a clear measure of the quality of the physical solution, a χ 2 analysis proves instrumental in the selection of data compatible with the theoretical model. Nearly two-thirds of the observations fall outside a standard 95% confidence level cut, which highlights the care necessary in selecting systems and warrants further investigation into potential deficiencies of the model or data. In addition, the method also allows us to exclude systems for which parameter estimations are statistical outliers. As a result, the final selected dataset gains in reliability and exhibits improved consistency. Regression to zero metallicity yields Y p = 0.2534 ± 0.0083, in broad
Evaluating impacts using a BACI design, ratios, and a Bayesian approach with a focus on restoration
Conner, Mary M.; Saunders, W. Carl; Bouwes, Nicolaas; Jordan, Chris
2016-01-01
Before-after-control-impact (BACI) designs are an effective method to evaluate natural and human-induced perturbations on ecological variables when treatment sites cannot be randomly chosen. While effect sizes of interest can be tested with frequentist methods, using Bayesian Markov chain Monte Carlo (MCMC) sampling methods, probabilities of effect sizes, such as a ?20?% increase in density after restoration, can be directly estimated. Although BACI and Bayesian methods are used widely for as...
Efficiency of alternative MCMC strategies illustrated using the reaction norm model
DEFF Research Database (Denmark)
Shariati, Mohammad Mahdi; Sørensen, D.
2008-01-01
The Markov chain Monte Carlo (MCMC) strategy provides remarkable flexibility for fitting complex hierarchical models. However, when parameters are highly correlated in their posterior distributions and their number is large, a particular MCMC algorithm may perform poorly and the resulting...... in the low correlation scenario where SG was the best strategy. The two LH proposals could not compete with any of the Gibbs sampling algorithms. In this study it was not possible to find an MCMC strategy that performs optimally across the range of target distributions and across all possible values...
Directory of Open Access Journals (Sweden)
L.S. Vismara
2007-12-01
Full Text Available A dinâmica da população de plantas daninhas pode ser representada por um sistema de equações que relaciona as densidades de sementes produzidas e de plântulas em áreas de cultivo. Os valores dos parâmetros dos modelos podem ser inferidos diretamente de experimentação e análise estatística ou extraídos da literatura. O presente trabalho teve por objetivo estimar os parâmetros do modelo de densidade populacional de plantas daninhas, a partir de um experimento conduzido na área experimental da Embrapa Milho e Sorgo, Sete Lagoas, MG, via os procedimentos de inferências clássica e Bayesiana.Dynamics of weed populations can be described as a system of equations relating the produced seed and seedling densities in crop areas. The model parameter values can be either directly inferred from experimentation and statistical analysis or obtained from the literature. The objective of this work was to estimate the weed population density model parameters based on experimental field data at Embrapa Milho e Sorgo, Sete Lagoas, MG, using classic and Bayesian inferences.
Directory of Open Access Journals (Sweden)
José Marques Carneiro Júnior
2010-07-01
Full Text Available Dados simulados foram utilizados para comparar as metodologias Eblup eBayesiana, em dados com homogeneidade de variâncias, heterogeneidade de variância genética e heterogeneidade de variância genética e ambiental. Para obtenção dessas estruturas foram feitos descartes estratégicos dos valores genéticos aditivos e ambientais de acordo com o tipo de heterogeneidade e o nível de variabilidade desejada (alta, média ou baixa, sendo utilizados dois tamanhos de população (grande e pequena. Para a metodologia Bayesiana foram utilizados três níveis de informação a priori: não informativo, pouco informativo e informativo. A presença da heterogeneidade de variâncias causa problemas para a seleção dos melhores indivíduos, principalmente se a heterogeneidade estiver nos componentes de variância genética e ambiental, sendo os animais selecionadosequivocadamente do ambiente mais variável. Os métodos comparados tiveram resultados semelhantes, quando distribuições a priori não informativas foram utilizadas, e as populações de tamanho grande, de modo geral, apresentaram melhores predições de valores genéticos.Foi observado, para a metodologia Bayesiana, que o aumento no nível de informação a priori influencia positivamente as predições dos valores genéticos, principalmente para as populações pequenas. O método Bayesiano é indicado para populações de tamanho pequeno quando há disponibilidade de distribuições a priori informativas.Simulated data were used to compare EBLUP and Bayesian methods in datawith homogeneity of variance, heterogeneity of variance and genetic heterogeneity of genetic and environmental variance. For these structures were strategic disposal of additive genetic and environmental values in accordance with the type of heterogeneity and the desired level of variability: high, medium or low. We used two sizes of population: large and small. For the Bayesian methodology was used three levels of a
A computer program for uncertainty analysis integrating regression and Bayesian methods
Lu, Dan; Ye, Ming; Hill, Mary C.; Poeter, Eileen P.; Curtis, Gary
2014-01-01
This work develops a new functionality in UCODE_2014 to evaluate Bayesian credible intervals using the Markov Chain Monte Carlo (MCMC) method. The MCMC capability in UCODE_2014 is based on the FORTRAN version of the differential evolution adaptive Metropolis (DREAM) algorithm of Vrugt et al. (2009), which estimates the posterior probability density function of model parameters in high-dimensional and multimodal sampling problems. The UCODE MCMC capability provides eleven prior probability distributions and three ways to initialize the sampling process. It evaluates parametric and predictive uncertainties and it has parallel computing capability based on multiple chains to accelerate the sampling process. This paper tests and demonstrates the MCMC capability using a 10-dimensional multimodal mathematical function, a 100-dimensional Gaussian function, and a groundwater reactive transport model. The use of the MCMC capability is made straightforward and flexible by adopting the JUPITER API protocol. With the new MCMC capability, UCODE_2014 can be used to calculate three types of uncertainty intervals, which all can account for prior information: (1) linear confidence intervals which require linearity and Gaussian error assumptions and typically 10s–100s of highly parallelizable model runs after optimization, (2) nonlinear confidence intervals which require a smooth objective function surface and Gaussian observation error assumptions and typically 100s–1,000s of partially parallelizable model runs after optimization, and (3) MCMC Bayesian credible intervals which require few assumptions and commonly 10,000s–100,000s or more partially parallelizable model runs. Ready access allows users to select methods best suited to their work, and to compare methods in many circumstances.
Topics in Bayesian statistics and maximum entropy
International Nuclear Information System (INIS)
Mutihac, R.; Cicuttin, A.; Cerdeira, A.; Stanciulescu, C.
1998-12-01
Notions of Bayesian decision theory and maximum entropy methods are reviewed with particular emphasis on probabilistic inference and Bayesian modeling. The axiomatic approach is considered as the best justification of Bayesian analysis and maximum entropy principle applied in natural sciences. Particular emphasis is put on solving the inverse problem in digital image restoration and Bayesian modeling of neural networks. Further topics addressed briefly include language modeling, neutron scattering, multiuser detection and channel equalization in digital communications, genetic information, and Bayesian court decision-making. (author)
DEFF Research Database (Denmark)
Mørup, Morten; Schmidt, Mikkel N
2012-01-01
Many networks of scientific interest naturally decompose into clusters or communities with comparatively fewer external than internal links; however, current Bayesian models of network communities do not exert this intuitive notion of communities. We formulate a nonparametric Bayesian model...... for community detection consistent with an intuitive definition of communities and present a Markov chain Monte Carlo procedure for inferring the community structure. A Matlab toolbox with the proposed inference procedure is available for download. On synthetic and real networks, our model detects communities...... consistent with ground truth, and on real networks, it outperforms existing approaches in predicting missing links. This suggests that community structure is an important structural property of networks that should be explicitly modeled....
Bayesian spatial semi-parametric modeling of HIV variation in Kenya.
Directory of Open Access Journals (Sweden)
Oscar Ngesa
Full Text Available Spatial statistics has seen rapid application in many fields, especially epidemiology and public health. Many studies, nonetheless, make limited use of the geographical location information and also usually assume that the covariates, which are related to the response variable, have linear effects. We develop a Bayesian semi-parametric regression model for HIV prevalence data. Model estimation and inference is based on fully Bayesian approach via Markov Chain Monte Carlo (McMC. The model is applied to HIV prevalence data among men in Kenya, derived from the Kenya AIDS indicator survey, with n = 3,662. Past studies have concluded that HIV infection has a nonlinear association with age. In this study a smooth function based on penalized regression splines is used to estimate this nonlinear effect. Other covariates were assumed to have a linear effect. Spatial references to the counties were modeled as both structured and unstructured spatial effects. We observe that circumcision reduces the risk of HIV infection. The results also indicate that men in the urban areas were more likely to be infected by HIV as compared to their rural counterpart. Men with higher education had the lowest risk of HIV infection. A nonlinear relationship between HIV infection and age was established. Risk of HIV infection increases with age up to the age of 40 then declines with increase in age. Men who had STI in the last 12 months were more likely to be infected with HIV. Also men who had ever used a condom were found to have higher likelihood to be infected by HIV. A significant spatial variation of HIV infection in Kenya was also established. The study shows the practicality and flexibility of Bayesian semi-parametric regression model in analyzing epidemiological data.
Jafarzadeh, S Reza; Johnson, Wesley O; Gardner, Ian A
2016-03-15
The area under the receiver operating characteristic (ROC) curve (AUC) is used as a performance metric for quantitative tests. Although multiple biomarkers may be available for diagnostic or screening purposes, diagnostic accuracy is often assessed individually rather than in combination. In this paper, we consider the interesting problem of combining multiple biomarkers for use in a single diagnostic criterion with the goal of improving the diagnostic accuracy above that of an individual biomarker. The diagnostic criterion created from multiple biomarkers is based on the predictive probability of disease, conditional on given multiple biomarker outcomes. If the computed predictive probability exceeds a specified cutoff, the corresponding subject is allocated as 'diseased'. This defines a standard diagnostic criterion that has its own ROC curve, namely, the combined ROC (cROC). The AUC metric for cROC, namely, the combined AUC (cAUC), is used to compare the predictive criterion based on multiple biomarkers to one based on fewer biomarkers. A multivariate random-effects model is proposed for modeling multiple normally distributed dependent scores. Bayesian methods for estimating ROC curves and corresponding (marginal) AUCs are developed when a perfect reference standard is not available. In addition, cAUCs are computed to compare the accuracy of different combinations of biomarkers for diagnosis. The methods are evaluated using simulations and are applied to data for Johne's disease (paratuberculosis) in cattle. Copyright © 2015 John Wiley & Sons, Ltd.
Frasso, Gianluca; Lambert, Philippe
2016-10-01
SummaryThe 2014 Ebola outbreak in Sierra Leone is analyzed using a susceptible-exposed-infectious-removed (SEIR) epidemic compartmental model. The discrete time-stochastic model for the epidemic evolution is coupled to a set of ordinary differential equations describing the dynamics of the expected proportions of subjects in each epidemic state. The unknown parameters are estimated in a Bayesian framework by combining data on the number of new (laboratory confirmed) Ebola cases reported by the Ministry of Health and prior distributions for the transition rates elicited using information collected by the WHO during the follow-up of specific Ebola cases. The time-varying disease transmission rate is modeled in a flexible way using penalized B-splines. Our framework represents a valuable stochastic tool for the study of an epidemic dynamic even when only irregularly observed and possibly aggregated data are available. Simulations and the analysis of the 2014 Sierra Leone Ebola data highlight the merits of the proposed methodology. In particular, the flexible modeling of the disease transmission rate makes the estimation of the effective reproduction number robust to the misspecification of the initial epidemic states and to underreporting of the infectious cases. © The Author 2016. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
DEFF Research Database (Denmark)
Antoniou, Constantinos; Harrison, Glenn W.; Lau, Morten I.
2015-01-01
A large literature suggests that many individuals do not apply Bayes’ Rule when making decisions that depend on them correctly pooling prior information and sample data. We replicate and extend a classic experimental study of Bayesian updating from psychology, employing the methods of experimenta...... economics, with careful controls for the confounding effects of risk aversion. Our results show that risk aversion significantly alters inferences on deviations from Bayes’ Rule....
Bayesian linkage and segregation analysis: factoring the problem.
Matthysse, S
2000-01-01
Complex segregation analysis and linkage methods are mathematical techniques for the genetic dissection of complex diseases. They are used to delineate complex modes of familial transmission and to localize putative disease susceptibility loci to specific chromosomal locations. The computational problem of Bayesian linkage and segregation analysis is one of integration in high-dimensional spaces. In this paper, three available techniques for Bayesian linkage and segregation analysis are discussed: Markov Chain Monte Carlo (MCMC), importance sampling, and exact calculation. The contribution of each to the overall integration will be explicitly discussed.
Zhang, Linlin; Guindani, Michele; Versace, Francesco; Vannucci, Marina
2014-07-15
In this paper we present a novel wavelet-based Bayesian nonparametric regression model for the analysis of functional magnetic resonance imaging (fMRI) data. Our goal is to provide a joint analytical framework that allows to detect regions of the brain which exhibit neuronal activity in response to a stimulus and, simultaneously, infer the association, or clustering, of spatially remote voxels that exhibit fMRI time series with similar characteristics. We start by modeling the data with a hemodynamic response function (HRF) with a voxel-dependent shape parameter. We detect regions of the brain activated in response to a given stimulus by using mixture priors with a spike at zero on the coefficients of the regression model. We account for the complex spatial correlation structure of the brain by using a Markov random field (MRF) prior on the parameters guiding the selection of the activated voxels, therefore capturing correlation among nearby voxels. In order to infer association of the voxel time courses, we assume correlated errors, in particular long memory, and exploit the whitening properties of discrete wavelet transforms. Furthermore, we achieve clustering of the voxels by imposing a Dirichlet process (DP) prior on the parameters of the long memory process. For inference, we use Markov Chain Monte Carlo (MCMC) sampling techniques that combine Metropolis-Hastings schemes employed in Bayesian variable selection with sampling algorithms for nonparametric DP models. We explore the performance of the proposed model on simulated data, with both block- and event-related design, and on real fMRI data. Copyright © 2014 Elsevier Inc. All rights reserved.
Barraza Bernadas, V.; Grings, F.; Roitberg, E.; Perna, P.; Karszenbaum, H.
2017-12-01
The Dry Chaco region (DCF) has the highest absolute deforestation rates of all Argentinian forests. The most recent report indicates a current deforestation rate of 200,000 Ha year-1. In order to better monitor this process, DCF was chosen to implement an early warning program for illegal deforestation. Although the area is intensively studied using medium resolution imagery (Landsat), the products obtained have a yearly pace and therefore unsuited for an early warning program. In this paper, we evaluated the performance of an online Bayesian change-point detection algorithm for MODIS Enhanced Vegetation Index (EVI) and Land Surface Temperature (LST) datasets. The goal was to to monitor the abrupt changes in vegetation dynamics associated with deforestation events. We tested this model by simulating 16-day EVI and 8-day LST time series with varying amounts of seasonality, noise, length of the time series and by adding abrupt changes with different magnitudes. This model was then tested on real satellite time series available through the Google Earth Engine, over a pilot area in DCF, where deforestation was common in the 2004-2016 period. A comparison with yearly benchmark products based on Landsat images is also presented (REDAF dataset). The results shows the advantages of using an automatic model to detect a changepoint in the time series than using only visual inspection techniques. Simulating time series with varying amounts of seasonality and noise, and by adding abrupt changes at different times and magnitudes, revealed that this model is robust against noise, and is not influenced by changes in amplitude of the seasonal component. Furthermore, the results compared favorably with REDAF dataset (near 65% of agreement). These results show the potential to combine LST and EVI to identify deforestation events. This work is being developed within the frame of the national Forest Law for the protection and sustainable development of Native Forest in Argentina in
Norris, Peter M.; da Silva, Arlindo M.
2018-01-01
Part 1 of this series presented a Monte Carlo Bayesian method for constraining a complex statistical model of global circulation model (GCM) sub-gridcolumn moisture variability using high-resolution Moderate Resolution Imaging Spectroradiometer (MODIS) cloud data, thereby permitting parameter estimation and cloud data assimilation for large-scale models. This article performs some basic testing of this new approach, verifying that it does indeed reduce mean and standard deviation biases significantly with respect to the assimilated MODIS cloud optical depth, brightness temperature and cloud-top pressure and that it also improves the simulated rotational–Raman scattering cloud optical centroid pressure (OCP) against independent (non-assimilated) retrievals from the Ozone Monitoring Instrument (OMI). Of particular interest, the Monte Carlo method does show skill in the especially difficult case where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach allows non-gradient-based jumps into regions of non-zero cloud probability. In the example provided, the method is able to restore marine stratocumulus near the Californian coast, where the background state has a clear swath. This article also examines a number of algorithmic and physical sensitivities of the new method and provides guidance for its cost-effective implementation. One obvious difficulty for the method, and other cloud data assimilation methods as well, is the lack of information content in passive-radiometer-retrieved cloud observables on cloud vertical structure, beyond cloud-top pressure and optical thickness, thus necessitating strong dependence on the background vertical moisture structure. It is found that a simple flow-dependent correlation modification from Riishojgaard provides some help in this respect, by
da Silva, Arlindo M.; Norris, Peter M.
2013-01-01
Part I presented a Monte Carlo Bayesian method for constraining a complex statistical model of GCM sub-gridcolumn moisture variability using high-resolution MODIS cloud data, thereby permitting large-scale model parameter estimation and cloud data assimilation. This part performs some basic testing of this new approach, verifying that it does indeed significantly reduce mean and standard deviation biases with respect to the assimilated MODIS cloud optical depth, brightness temperature and cloud top pressure, and that it also improves the simulated rotational-Ramman scattering cloud optical centroid pressure (OCP) against independent (non-assimilated) retrievals from the OMI instrument. Of particular interest, the Monte Carlo method does show skill in the especially difficult case where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach allows finite jumps into regions of non-zero cloud probability. In the example provided, the method is able to restore marine stratocumulus near the Californian coast where the background state has a clear swath. This paper also examines a number of algorithmic and physical sensitivities of the new method and provides guidance for its cost-effective implementation. One obvious difficulty for the method, and other cloud data assimilation methods as well, is the lack of information content in the cloud observables on cloud vertical structure, beyond cloud top pressure and optical thickness, thus necessitating strong dependence on the background vertical moisture structure. It is found that a simple flow-dependent correlation modification due to Riishojgaard (1998) provides some help in this respect, by better honoring inversion structures in the background state.
Norris, Peter M.; da Silva, Arlindo M.
2016-01-01
Part 1 of this series presented a Monte Carlo Bayesian method for constraining a complex statistical model of global circulation model (GCM) sub-gridcolumn moisture variability using high-resolution Moderate Resolution Imaging Spectroradiometer (MODIS) cloud data, thereby permitting parameter estimation and cloud data assimilation for large-scale models. This article performs some basic testing of this new approach, verifying that it does indeed reduce mean and standard deviation biases significantly with respect to the assimilated MODIS cloud optical depth, brightness temperature and cloud-top pressure and that it also improves the simulated rotational-Raman scattering cloud optical centroid pressure (OCP) against independent (non-assimilated) retrievals from the Ozone Monitoring Instrument (OMI). Of particular interest, the Monte Carlo method does show skill in the especially difficult case where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach allows non-gradient-based jumps into regions of non-zero cloud probability. In the example provided, the method is able to restore marine stratocumulus near the Californian coast, where the background state has a clear swath. This article also examines a number of algorithmic and physical sensitivities of the new method and provides guidance for its cost-effective implementation. One obvious difficulty for the method, and other cloud data assimilation methods as well, is the lack of information content in passive-radiometer-retrieved cloud observables on cloud vertical structure, beyond cloud-top pressure and optical thickness, thus necessitating strong dependence on the background vertical moisture structure. It is found that a simple flow-dependent correlation modification from Riishojgaard provides some help in this respect, by
Advances in Bayesian Modeling in Educational Research
Levy, Roy
2016-01-01
In this article, I provide a conceptually oriented overview of Bayesian approaches to statistical inference and contrast them with frequentist approaches that currently dominate conventional practice in educational research. The features and advantages of Bayesian approaches are illustrated with examples spanning several statistical modeling…
Multiview Bayesian Correlated Component Analysis
DEFF Research Database (Denmark)
Kamronn, Simon Due; Poulsen, Andreas Trier; Hansen, Lars Kai
2015-01-01
are identical. Here we propose a hierarchical probabilistic model that can infer the level of universality in such multiview data, from completely unrelated representations, corresponding to canonical correlation analysis, to identical representations as in correlated component analysis. This new model, which...... we denote Bayesian correlated component analysis, evaluates favorably against three relevant algorithms in simulated data. A well-established benchmark EEG data set is used to further validate the new model and infer the variability of spatial representations across multiple subjects....
Bayesian Approaches to Imputation, Hypothesis Testing, and Parameter Estimation
Ross, Steven J.; Mackey, Beth
2015-01-01
This chapter introduces three applications of Bayesian inference to common and novel issues in second language research. After a review of the critiques of conventional hypothesis testing, our focus centers on ways Bayesian inference can be used for dealing with missing data, for testing theory-driven substantive hypotheses without a default null…
Bayesian Multiresolution Variable Selection for Ultra-High Dimensional Neuroimaging Data.
Zhao, Yize; Kang, Jian; Long, Qi
2018-01-01
Ultra-high dimensional variable selection has become increasingly important in analysis of neuroimaging data. For example, in the Autism Brain Imaging Data Exchange (ABIDE) study, neuroscientists are interested in identifying important biomarkers for early detection of the autism spectrum disorder (ASD) using high resolution brain images that include hundreds of thousands voxels. However, most existing methods are not feasible for solving this problem due to their extensive computational costs. In this work, we propose a novel multiresolution variable selection procedure under a Bayesian probit regression framework. It recursively uses posterior samples for coarser-scale variable selection to guide the posterior inference on finer-scale variable selection, leading to very efficient Markov chain Monte Carlo (MCMC) algorithms. The proposed algorithms are computationally feasible for ultra-high dimensional data. Also, our model incorporates two levels of structural information into variable selection using Ising priors: the spatial dependence between voxels and the functional connectivity between anatomical brain regions. Applied to the resting state functional magnetic resonance imaging (R-fMRI) data in the ABIDE study, our methods identify voxel-level imaging biomarkers highly predictive of the ASD, which are biologically meaningful and interpretable. Extensive simulations also show that our methods achieve better performance in variable selection compared to existing methods.
Ryu, Duchwan
2010-09-28
We consider nonparametric regression analysis in a generalized linear model (GLM) framework for data with covariates that are the subject-specific random effects of longitudinal measurements. The usual assumption that the effects of the longitudinal covariate processes are linear in the GLM may be unrealistic and if this happens it can cast doubt on the inference of observed covariate effects. Allowing the regression functions to be unknown, we propose to apply Bayesian nonparametric methods including cubic smoothing splines or P-splines for the possible nonlinearity and use an additive model in this complex setting. To improve computational efficiency, we propose the use of data-augmentation schemes. The approach allows flexible covariance structures for the random effects and within-subject measurement errors of the longitudinal processes. The posterior model space is explored through a Markov chain Monte Carlo (MCMC) sampler. The proposed methods are illustrated and compared to other approaches, the "naive" approach and the regression calibration, via simulations and by an application that investigates the relationship between obesity in adulthood and childhood growth curves. © 2010, The International Biometric Society.
Bayesian modeling of JET Li-BES for edge electron density profiles using Gaussian processes
Kwak, Sehyun; Svensson, Jakob; Brix, Mathias; Ghim, Young-Chul; JET Contributors Collaboration
2015-11-01
A Bayesian model for the JET lithium beam emission spectroscopy (Li-BES) system has been developed to infer edge electron density profiles. The 26 spatial channels measure emission profiles with ~15 ms temporal resolution and ~1 cm spatial resolution. The lithium I (2p-2s) line radiation in an emission spectrum is calculated using a multi-state model, which expresses collisions between the neutral lithium beam atoms and the plasma particles as a set of differential equations. The emission spectrum is described in the model including photon and electronic noise, spectral line shapes, interference filter curves, and relative calibrations. This spectral modeling gets rid of the need of separate background measurements for calculating the intensity of the line radiation. Gaussian processes are applied to model both emission spectrum and edge electron density profile, and the electron temperature to calculate all the rate coefficients is obtained from the JET high resolution Thomson scattering (HRTS) system. The posterior distributions of the edge electron density profile are explored via the numerical technique and the Markov chain Monte Carlo (MCMC) samplings. See the Appendix of F. Romanelli et al., Proceedings of the 25th IAEA Fusion Energy Conference 2014, Saint Petersburg, Russia.
Applications of Bayesian approach in modelling risk of malaria-related hospital mortality
Directory of Open Access Journals (Sweden)
Simbeye Jupiter S
2008-02-01
Full Text Available Abstract Background Malaria is a major public health problem in Malawi, however, quantifying its burden in a population is a challenge. Routine hospital data provide a proxy for measuring the incidence of severe malaria and for crudely estimating morbidity rates. Using such data, this paper proposes a method to describe trends, patterns and factors associated with in-hospital mortality attributed to the disease. Methods We develop semiparametric regression models which allow joint analysis of nonlinear effects of calendar time and continuous covariates, spatially structured variation, unstructured heterogeneity, and other fixed covariates. Modelling and inference use the fully Bayesian approach via Markov Chain Monte Carlo (MCMC simulation techniques. The methodology is applied to analyse data arising from paediatric wards in Zomba district, Malawi, between 2002 and 2003. Results and Conclusion We observe that the risk of dying in hospital is lower in the dry season, and for children who travel a distance of less than 5 kms to the hospital, but increases for those who are referred to the hospital. The results also indicate significant differences in both structured and unstructured spatial effects, and the health facility effects reveal considerable differences by type of facility or practice. More importantly, our approach shows non-linearities in the effect of metrical covariates on the probability of dying in hospital. The study emphasizes that the methodological framework used provides a useful tool for analysing the data at hand and of similar structure.
Cruz-Marcelo, Alejandro; Ensor, Katherine B.; Rosner, Gary L.
2011-01-01
The term structure of interest rates is used to price defaultable bonds and credit derivatives, as well as to infer the quality of bonds for risk management purposes. We introduce a model that jointly estimates term structures by means of a Bayesian hierarchical model with a prior probability model based on Dirichlet process mixtures. The modeling methodology borrows strength across term structures for purposes of estimation. The main advantage of our framework is its ability to produce reliable estimators at the company level even when there are only a few bonds per company. After describing the proposed model, we discuss an empirical application in which the term structure of 197 individual companies is estimated. The sample of 197 consists of 143 companies with only one or two bonds. In-sample and out-of-sample tests are used to quantify the improvement in accuracy that results from approximating the term structure of corporate bonds with estimators by company rather than by credit rating, the latter being a popular choice in the financial literature. A complete description of a Markov chain Monte Carlo (MCMC) scheme for the proposed model is available as Supplementary Material. PMID:21765566
International Nuclear Information System (INIS)
Cai, Caifang; Lambert, Marc; Rodet, Thomas
2014-01-01
First, we present the implementation of a random walk Metropolis-within-Gibbs (MWG) sampling method in flaw characterization based on a metamodeling method. The role of metamodeling is to reduce the computational time cost in Eddy Current Testing (ECT) forward model calculation. In such a way, the use of Markov Chain Monte Carlo (MCMC) methods becomes possible. Secondly, we analyze the influence of partially known parameters in Bayesian estimation. The objective is to evaluate the importance of providing more specific prior information. Simulation results show that even partially known information has great interest in providing more accurate flaw parameter estimations. The improvement ratio depends on the parameter dependence and the interest shows only when the provided information is specific enough
Modeling and Bayesian parameter estimation for shape memory alloy bending actuators
Crews, John H.; Smith, Ralph C.
2012-04-01
In this paper, we employ a homogenized energy model (HEM) for shape memory alloy (SMA) bending actuators. Additionally, we utilize a Bayesian method for quantifying parameter uncertainty. The system consists of a SMA wire attached to a flexible beam. As the actuator is heated, the beam bends, providing endoscopic motion. The model parameters are fit to experimental data using an ordinary least-squares approach. The uncertainty in the fit model parameters is then quantified using Markov Chain Monte Carlo (MCMC) methods. The MCMC algorithm provides bounds on the parameters, which will ultimately be used in robust control algorithms. One purpose of the paper is to test the feasibility of the Random Walk Metropolis algorithm, the MCMC method used here.
Bayesian image restoration, using configurations
DEFF Research Database (Denmark)
Thorarinsdottir, Thordis
configurations are expressed in terms of the mean normal measure of the random set. These probabilities are used as prior probabilities in a Bayesian image restoration approach. Estimation of the remaining parameters in the model is outlined for salt and pepper noise. The inference in the model is discussed...
Bayesian image restoration, using configurations
DEFF Research Database (Denmark)
Thorarinsdottir, Thordis Linda
2006-01-01
configurations are expressed in terms of the mean normal measure of the random set. These probabilities are used as prior probabilities in a Bayesian image restoration approach. Estimation of the remaining parameters in the model is outlined for the salt and pepper noise. The inference in the model is discussed...
MCMC-ODPR: Primer design optimization using Markov Chain Monte Carlo sampling
Directory of Open Access Journals (Sweden)
Kitchen James L
2012-11-01
Full Text Available Abstract Background Next generation sequencing technologies often require numerous primer designs that require good target coverage that can be financially costly. We aimed to develop a system that would implement primer reuse to design degenerate primers that could be designed around SNPs, thus find the fewest necessary primers and the lowest cost whilst maintaining an acceptable coverage and provide a cost effective solution. We have implemented Metropolis-Hastings Markov Chain Monte Carlo for optimizing primer reuse. We call it the Markov Chain Monte Carlo Optimized Degenerate Primer Reuse (MCMC-ODPR algorithm. Results After repeating the program 1020 times to assess the variance, an average of 17.14% fewer primers were found to be necessary using MCMC-ODPR for an equivalent coverage without implementing primer reuse. The algorithm was able to reuse primers up to five times. We compared MCMC-ODPR with single sequence primer design programs Primer3 and Primer-BLAST and achieved a lower primer cost per amplicon base covered of 0.21 and 0.19 and 0.18 primer nucleotides on three separate gene sequences, respectively. With multiple sequences, MCMC-ODPR achieved a lower cost per base covered of 0.19 than programs BatchPrimer3 and PAMPS, which achieved 0.25 and 0.64 primer nucleotides, respectively. Conclusions MCMC-ODPR is a useful tool for designing primers at various melting temperatures at good target coverage. By combining degeneracy with optimal primer reuse the user may increase coverage of sequences amplified by the designed primers at significantly lower costs. Our analyses showed that overall MCMC-ODPR outperformed the other primer-design programs in our study in terms of cost per covered base.
MCMC-ODPR: primer design optimization using Markov Chain Monte Carlo sampling.
Kitchen, James L; Moore, Jonathan D; Palmer, Sarah A; Allaby, Robin G
2012-11-05
Next generation sequencing technologies often require numerous primer designs that require good target coverage that can be financially costly. We aimed to develop a system that would implement primer reuse to design degenerate primers that could be designed around SNPs, thus find the fewest necessary primers and the lowest cost whilst maintaining an acceptable coverage and provide a cost effective solution. We have implemented Metropolis-Hastings Markov Chain Monte Carlo for optimizing primer reuse. We call it the Markov Chain Monte Carlo Optimized Degenerate Primer Reuse (MCMC-ODPR) algorithm. After repeating the program 1020 times to assess the variance, an average of 17.14% fewer primers were found to be necessary using MCMC-ODPR for an equivalent coverage without implementing primer reuse. The algorithm was able to reuse primers up to five times. We compared MCMC-ODPR with single sequence primer design programs Primer3 and Primer-BLAST and achieved a lower primer cost per amplicon base covered of 0.21 and 0.19 and 0.18 primer nucleotides on three separate gene sequences, respectively. With multiple sequences, MCMC-ODPR achieved a lower cost per base covered of 0.19 than programs BatchPrimer3 and PAMPS, which achieved 0.25 and 0.64 primer nucleotides, respectively. MCMC-ODPR is a useful tool for designing primers at various melting temperatures at good target coverage. By combining degeneracy with optimal primer reuse the user may increase coverage of sequences amplified by the designed primers at significantly lower costs. Our analyses showed that overall MCMC-ODPR outperformed the other primer-design programs in our study in terms of cost per covered base.
An MCMC Study of General Squark Flavour Mixing in the MSSM
Energy Technology Data Exchange (ETDEWEB)
Herrmann, Björn [Annecy, LAPTH; De Causmaecker, Karen [Intl. Solvay Inst., Brussels; Fuks, Benjamin [UPMC, Paris (main); Mahmoudi, Farvah [Lyon, Ecole Normale Superieure; O' Leary, Ben [Wurzburg U.; Porod, Werner [Wurzburg U.; Sekmen, Sezen [Kyungpook Natl. U.; Strobbe, Nadja [Fermilab
2015-10-05
We present an extensive study of non-minimally flavour violating (NMFV) terms in the Lagrangian of the Minimal Supersymmetric Standard Model (MSSM). We impose a variety of theoretical and experimental constraints and perform a detailed scan of the parameter space by means of a Markov Chain Monte-Carlo (MCMC) setup. This represents the first study of several non-zero flavour-violating elements within the MSSM. We present the results of the MCMC scan with a special focus on the flavour-violating parameters. Based on these results, we define benchmark scenarios for future studies of NMFV effects at the LHC.
Lesaffre, Emmanuel
2012-01-01
The growth of biostatistics has been phenomenal in recent years and has been marked by considerable technical innovation in both methodology and computational practicality. One area that has experienced significant growth is Bayesian methods. The growing use of Bayesian methodology has taken place partly due to an increasing number of practitioners valuing the Bayesian paradigm as matching that of scientific discovery. In addition, computational advances have allowed for more complex models to be fitted routinely to realistic data sets. Through examples, exercises and a combination of introd
Probability and Bayesian statistics
1987-01-01
This book contains selected and refereed contributions to the "Inter national Symposium on Probability and Bayesian Statistics" which was orga nized to celebrate the 80th birthday of Professor Bruno de Finetti at his birthplace Innsbruck in Austria. Since Professor de Finetti died in 1985 the symposium was dedicated to the memory of Bruno de Finetti and took place at Igls near Innsbruck from 23 to 26 September 1986. Some of the pa pers are published especially by the relationship to Bruno de Finetti's scientific work. The evolution of stochastics shows growing importance of probability as coherent assessment of numerical values as degrees of believe in certain events. This is the basis for Bayesian inference in the sense of modern statistics. The contributions in this volume cover a broad spectrum ranging from foundations of probability across psychological aspects of formulating sub jective probability statements, abstract measure theoretical considerations, contributions to theoretical statistics an...
ZHOU, Lin
1996-01-01
In this paper I consider social choices under uncertainty. I prove that any social choice rule that satisfies independence of irrelevant alternatives, translation invariance, and weak anonymity is consistent with ex post Bayesian utilitarianism
Bayesian analysis of data and model error in rainfall-runoff hydrological models
Kavetski, D.; Franks, S. W.; Kuczera, G.
2004-12-01
A major unresolved issue in the identification and use of conceptual hydrologic models is realistic description of uncertainty in the data and model structure. In particular, hydrologic parameters often cannot be measured directly and must be inferred (calibrated) from observed forcing/response data (typically, rainfall and runoff). However, rainfall varies significantly in space and time, yet is often estimated from sparse gauge networks. Recent work showed that current calibration methods (e.g., standard least squares, multi-objective calibration, generalized likelihood uncertainty estimation) ignore forcing uncertainty and assume that the rainfall is known exactly. Consequently, they can yield strongly biased and misleading parameter estimates. This deficiency confounds attempts to reliably test model hypotheses, to generalize results across catchments (the regionalization problem) and to quantify predictive uncertainty when the hydrologic model is extrapolated. This paper continues the development of a Bayesian total error analysis (BATEA) methodology for the calibration and identification of hydrologic models, which explicitly incorporates the uncertainty in both the forcing and response data, and allows systematic model comparison based on residual model errors and formal Bayesian hypothesis testing (e.g., using Bayes factors). BATEA is based on explicit stochastic models for both forcing and response uncertainty, whereas current techniques focus solely on response errors. Hence, unlike existing methods, the BATEA parameter equations directly reflect the modeler's confidence in all the data. We compare several approaches to approximating the parameter distributions: a) full Markov Chain Monte Carlo methods and b) simplified approaches based on linear approximations. Studies using synthetic and real data from the US and Australia show that BATEA systematically reduces the parameter bias, leads to more meaningful model fits and allows model comparison taking
International Nuclear Information System (INIS)
Puncher, M.; Birchall, A.; Bull, R. K.
2012-01-01
Estimating uncertainties on doses from bioassay data is of interest in epidemiology studies that estimate cancer risk from occupational exposures to radionuclides. Bayesian methods provide a logical framework to calculate these uncertainties. However, occupational exposures often consist of many intakes, and this can make the Bayesian calculation computationally intractable. This paper describes a novel strategy for increasing the computational speed of the calculation by simplifying the intake pattern to a single composite intake, termed as complex intake regime (CIR). In order to assess whether this approximation is accurate and fast enough for practical purposes, the method is implemented by the Weighted Likelihood Monte Carlo Sampling (WeLMoS) method and evaluated by comparing its performance with a Markov Chain Monte Carlo (MCMC) method. The MCMC method gives the full solution (all intakes are independent), but is very computationally intensive to apply routinely. Posterior distributions of model parameter values, intakes and doses are calculated for a representative sample of plutonium workers from the United Kingdom Atomic Energy cohort using the WeLMoS method with the CIR and the MCMC method. The distributions are in good agreement: posterior means and Q 0.025 and Q 0.975 quantiles are typically within 20 %. Furthermore, the WeLMoS method using the CIR converges quickly: a typical case history takes around 10-20 min on a fast workstation, whereas the MCMC method took around 12-hr. The advantages and disadvantages of the method are discussed. (authors)
A Bayesian Reflection on Surfaces
Directory of Open Access Journals (Sweden)
David R. Wolf
1999-10-01
Full Text Available Abstract: The topic of this paper is a novel Bayesian continuous-basis field representation and inference framework. Within this paper several problems are solved: The maximally informative inference of continuous-basis fields, that is where the basis for the field is itself a continuous object and not representable in a finite manner; the tradeoff between accuracy of representation in terms of information learned, and memory or storage capacity in bits; the approximation of probability distributions so that a maximal amount of information about the object being inferred is preserved; an information theoretic justification for multigrid methodology. The maximally informative field inference framework is described in full generality and denoted the Generalized Kalman Filter. The Generalized Kalman Filter allows the update of field knowledge from previous knowledge at any scale, and new data, to new knowledge at any other scale. An application example instance, the inference of continuous surfaces from measurements (for example, camera image data, is presented.
Bayesian networks improve causal environmental ...
Rule-based weight of evidence approaches to ecological risk assessment may not account for uncertainties and generally lack probabilistic integration of lines of evidence. Bayesian networks allow causal inferences to be made from evidence by including causal knowledge about the problem, using this knowledge with probabilistic calculus to combine multiple lines of evidence, and minimizing biases in predicting or diagnosing causal relationships. Too often, sources of uncertainty in conventional weight of evidence approaches are ignored that can be accounted for with Bayesian networks. Specifying and propagating uncertainties improve the ability of models to incorporate strength of the evidence in the risk management phase of an assessment. Probabilistic inference from a Bayesian network allows evaluation of changes in uncertainty for variables from the evidence. The network structure and probabilistic framework of a Bayesian approach provide advantages over qualitative approaches in weight of evidence for capturing the impacts of multiple sources of quantifiable uncertainty on predictions of ecological risk. Bayesian networks can facilitate the development of evidence-based policy under conditions of uncertainty by incorporating analytical inaccuracies or the implications of imperfect information, structuring and communicating causal issues through qualitative directed graph formulations, and quantitatively comparing the causal power of multiple stressors on value
Bayesian Group Bridge for Bi-level Variable Selection.
Mallick, Himel; Yi, Nengjun
2017-06-01
A Bayesian bi-level variable selection method (BAGB: Bayesian Analysis of Group Bridge) is developed for regularized regression and classification. This new development is motivated by grouped data, where generic variables can be divided into multiple groups, with variables in the same group being mechanistically related or statistically correlated. As an alternative to frequentist group variable selection methods, BAGB incorporates structural information among predictors through a group-wise shrinkage prior. Posterior computation proceeds via an efficient MCMC algorithm. In addition to the usual ease-of-interpretation of hierarchical linear models, the Bayesian formulation produces valid standard errors, a feature that is notably absent in the frequentist framework. Empirical evidence of the attractiveness of the method is illustrated by extensive Monte Carlo simulations and real data analysis. Finally, several extensions of this new approach are presented, providing a unified framework for bi-level variable selection in general models with flexible penalties.
Technical Note: Approximate Bayesian parameterization of a process-based tropical forest model
Hartig, F.; Dislich, C.; Wiegand, T.; Huth, A.
2014-02-01
Inverse parameter estimation of process-based models is a long-standing problem in many scientific disciplines. A key question for inverse parameter estimation is how to define the metric that quantifies how well model predictions fit to the data. This metric can be expressed by general cost or objective functions, but statistical inversion methods require a particular metric, the probability of observing the data given the model parameters, known as the likelihood. For technical and computational reasons, likelihoods for process-based stochastic models are usually based on general assumptions about variability in the observed data, and not on the stochasticity generated by the model. Only in recent years have new methods become available that allow the generation of likelihoods directly from stochastic simulations. Previous applications of these approximate Bayesian methods have concentrated on relatively simple models. Here, we report on the application of a simulation-based likelihood approximation for FORMIND, a parameter-rich individual-based model of tropical forest dynamics. We show that approximate Bayesian inference, based on a parametric likelihood approximation placed in a conventional Markov chain Monte Carlo (MCMC) sampler, performs well in retrieving known parameter values from virtual inventory data generated by the forest model. We analyze the results of the parameter estimation, examine its sensitivity to the choice and aggregation of model outputs and observed data (summary statistics), and demonstrate the application of this method by fitting the FORMIND model to field data from an Ecuadorian tropical forest. Finally, we discuss how this approach differs from approximate Bayesian computation (ABC), another method commonly used to generate simulation-based likelihood approximations. Our results demonstrate that simulation-based inference, which offers considerable conceptual advantages over more traditional methods for inverse parameter estimation
Technical Note: Approximate Bayesian parameterization of a complex tropical forest model
Hartig, F.; Dislich, C.; Wiegand, T.; Huth, A.
2013-08-01
Inverse parameter estimation of process-based models is a long-standing problem in ecology and evolution. A key problem of inverse parameter estimation is to define a metric that quantifies how well model predictions fit to the data. Such a metric can be expressed by general cost or objective functions, but statistical inversion approaches are based on a particular metric, the probability of observing the data given the model, known as the likelihood. Deriving likelihoods for dynamic models requires making assumptions about the probability for observations to deviate from mean model predictions. For technical reasons, these assumptions are usually derived without explicit consideration of the processes in the simulation. Only in recent years have new methods become available that allow generating likelihoods directly from stochastic simulations. Previous applications of these approximate Bayesian methods have concentrated on relatively simple models. Here, we report on the application of a simulation-based likelihood approximation for FORMIND, a parameter-rich individual-based model of tropical forest dynamics. We show that approximate Bayesian inference, based on a parametric likelihood approximation placed in a conventional MCMC, performs well in retrieving known parameter values from virtual field data generated by the forest model. We analyze the results of the parameter estimation, examine the sensitivity towards the choice and aggregation of model outputs and observed data (summary statistics), and show results from using this method to fit the FORMIND model to field data from an Ecuadorian tropical forest. Finally, we discuss differences of this approach to Approximate Bayesian Computing (ABC), another commonly used method to generate simulation-based likelihood approximations. Our results demonstrate that simulation-based inference, which offers considerable conceptual advantages over more traditional methods for inverse parameter estimation, can