An Integrated Procedure for Bayesian Reliability Inference Using MCMC
Directory of Open Access Journals (Sweden)
Jing Lin
2014-01-01
The recent proliferation of Markov chain Monte Carlo (MCMC) approaches has led to the use of Bayesian inference in a wide variety of fields. To facilitate MCMC applications, this paper proposes an integrated procedure for Bayesian inference using MCMC methods, from a reliability perspective. The goal is to build a framework for related academic research and engineering applications to implement modern computational-based Bayesian approaches, especially for reliability inferences. The procedure developed here is a continuous improvement process with four stages (Plan, Do, Study, and Action) and 11 steps: (1) data preparation; (2) prior inspection and integration; (3) prior selection; (4) model selection; (5) posterior sampling; (6) MCMC convergence diagnostic; (7) Monte Carlo error diagnostic; (8) model improvement; (9) model comparison; (10) inference making; (11) data updating and inference improvement. The paper illustrates the proposed procedure using a case study.
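The abstract above names an MCMC convergence diagnostic and a Monte Carlo error diagnostic as separate steps but does not say which diagnostics the paper uses. As a rough illustration only, the sketch below assumes two common choices: the Gelman-Rubin potential scale reduction factor for convergence and a batch-means estimate for the Monte Carlo standard error. The function names and the synthetic chains are hypothetical and are not taken from the paper.

```python
import numpy as np

def gelman_rubin(chains):
    """Potential scale reduction factor (R-hat) for an array of shape (m chains, n draws)."""
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    within = chains.var(axis=1, ddof=1).mean()       # W: mean within-chain variance
    between = n * chains.mean(axis=1).var(ddof=1)    # B: between-chain variance
    pooled = (n - 1) / n * within + between / n      # pooled estimate of the posterior variance
    return float(np.sqrt(pooled / within))

def batch_means_mcse(draws, n_batches=20):
    """Monte Carlo standard error of the chain mean, estimated from batch means."""
    draws = np.asarray(draws, dtype=float)
    size = len(draws) // n_batches
    means = draws[: size * n_batches].reshape(n_batches, size).mean(axis=1)
    return float(means.std(ddof=1) / np.sqrt(n_batches))

# Synthetic stand-ins for sampler output (assumption: 4 chains of 2000 draws each).
rng = np.random.default_rng(0)
mixed = rng.normal(size=(4, 2000))                         # chains exploring the same target
stuck = mixed + np.array([[0.0], [5.0], [10.0], [15.0]])   # chains stuck in different regions

rhat_mixed = gelman_rubin(mixed)      # close to 1 when chains agree
rhat_stuck = gelman_rubin(stuck)      # well above 1 when chains disagree
mcse = batch_means_mcse(mixed[0])     # error of the posterior-mean estimate from one chain
```

An R-hat near 1 suggests the chains agree with one another, while the batch-means figure indicates how many digits of a posterior mean can be trusted; other diagnostics would fit the same two steps equally well.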
Hu, Zixi; Yao, Zhewei; Li, Jinglai
2017-03-01
Many scientific and engineering problems require performing Bayesian inference for unknowns of infinite dimension. In such problems, many standard Markov chain Monte Carlo (MCMC) algorithms become arbitrarily slow under mesh refinement, which is referred to as being dimension dependent. To address this, a family of dimension-independent MCMC algorithms, known as the preconditioned Crank-Nicolson (pCN) methods, was proposed to sample the infinite-dimensional parameters. In this work we develop an adaptive version of the pCN algorithm, where the covariance operator of the proposal distribution is adjusted based on sampling history to improve the simulation efficiency. We show that the proposed algorithm satisfies an important ergodicity condition under some mild assumptions. Finally, we provide numerical examples to demonstrate the performance of the proposed method.
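For orientation, here is a minimal sketch of the standard (non-adaptive) pCN step that the abstract builds on, assuming a Gaussian prior N(0, C) discretised to a finite vector. The proposal is v = sqrt(1 - beta^2) u + beta xi with xi drawn from the prior, and the acceptance ratio involves only the negative log-likelihood. The adaptive covariance update described in the abstract is not reproduced here; `pcn_sample`, `neg_log_like`, `sample_prior` and the toy problem are hypothetical names and choices made for illustration.

```python
import numpy as np

def pcn_sample(neg_log_like, sample_prior, u0, beta=0.3, n_steps=20000, rng=None):
    """Plain pCN sampler for a posterior whose prior is the Gaussian N(0, C).

    neg_log_like(u) returns Phi(u), the negative log-likelihood (user supplied).
    sample_prior(rng) returns one draw xi ~ N(0, C).
    """
    rng = np.random.default_rng() if rng is None else rng
    u = np.asarray(u0, dtype=float)
    phi_u = neg_log_like(u)
    chain = np.empty((n_steps, u.size))
    accepted = 0
    for i in range(n_steps):
        xi = sample_prior(rng)
        v = np.sqrt(1.0 - beta**2) * u + beta * xi   # prior-preserving proposal
        phi_v = neg_log_like(v)
        # The prior terms cancel, so only the likelihood enters the acceptance test.
        if np.log(rng.uniform()) < phi_u - phi_v:
            u, phi_u = v, phi_v
            accepted += 1
        chain[i] = u
    return chain, accepted / n_steps

# Toy conjugate check (assumption): prior N(0, I), data y = u + N(0, I) noise,
# so the exact posterior mean is y / 2.
d = 10
y = np.linspace(-1.0, 1.0, d)
post_mean = y / 2.0
chain, acc = pcn_sample(
    neg_log_like=lambda u: 0.5 * np.sum((y - u) ** 2),
    sample_prior=lambda rng: rng.standard_normal(d),
    u0=np.zeros(d),
    rng=np.random.default_rng(0),
)
```

Because the proposal leaves the prior invariant, the acceptance rate does not collapse as the discretisation dimension `d` grows, which is the dimension-independence property the abstract refers to.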
Soliman, Ahmed A.; Al Sobhi, Mashail M.
2015-02-01
This article deals with the problem of estimating parameters of the Gompertz distribution (GD) based on progressive first-failure censored data using Bayesian and non-Bayesian approaches. The two-sample prediction problem is considered to derive Bayesian prediction bounds for both future order statistics and future record values based on progressive first-failure censored informative samples from the GD. Sampling schemes such as first-failure censoring, progressive type II censoring, type II censoring and complete sampling can be obtained as special cases of the progressive first-failure censored scheme. A Markov chain Monte Carlo (MCMC) method with a Gibbs sampling procedure is used to compute the Bayes estimates and to construct the corresponding credible intervals of the parameters. A simulation study has been conducted to compare the proposed Bayes estimators with the maximum likelihood estimators (MLEs). Finally, some numerical computations with a real data set are presented to illustrate all the proposed inferential procedures.
The neighborhood MCMC sampler for learning Bayesian networks
Alyami, Salem A.; Azad, A. K. M.; Keith, Jonathan M.
2016-07-01
Getting stuck in local maxima is a problem that arises while learning Bayesian network (BN) structures. In this paper, we studied a recently proposed Markov chain Monte Carlo (MCMC) sampler, called the Neighbourhood sampler (NS), and examined how efficiently it can sample BNs when local maxima are present. We assume that a posterior distribution f(N,E|D) has been defined, where D represents data relevant to the inference, and N and E are the sets of nodes and directed edges, respectively. We illustrate the new approach by sampling from such a distribution, and inferring BNs. The simulations conducted in this paper show that the new learning approach substantially avoids getting stuck in local modes of the distribution, and achieves a more rapid rate of convergence, compared to other common algorithms, e.g. the MCMC Metropolis-Hastings sampler.
Larjo, Antti; Lähdesmäki, Harri
2015-12-01
Bayesian networks have become popular for modeling probabilistic relationships between entities. As their structure can also be given a causal interpretation about the studied system, they can be used to learn, for example, regulatory relationships of genes or proteins in biological networks and pathways. Inference of the Bayesian network structure is complicated by the size of the model structure space, necessitating the use of optimization methods or sampling techniques, such as Markov chain Monte Carlo (MCMC) methods. However, convergence of MCMC chains is in many cases slow and can become an even harder issue as the dataset size grows. We show here how to improve convergence in the Bayesian network structure space by using an adjustable proposal distribution with the possibility to propose a wide range of steps in the structure space, and demonstrate improved network structure inference by analyzing phosphoprotein data from the human primary T cell signaling network.
Bayesian inference for Hawkes processes
DEFF Research Database (Denmark)
Rasmussen, Jakob Gulddahl
2013-01-01
The Hawkes process is a practically and theoretically important class of point processes, but parameter-estimation for such a process can pose various problems. In this paper we explore and compare two approaches to Bayesian inference. The first approach is based on the so-called conditional...... intensity function, while the second approach is based on an underlying clustering and branching structure in the Hawkes process. For practical use, MCMC (Markov chain Monte Carlo) methods are employed. The two approaches are compared numerically using three examples of the Hawkes process....
Bayesian inference for OPC modeling
Burbine, Andrew; Sturtevant, John; Fryer, David; Smith, Bruce W.
2016-03-01
The use of optical proximity correction (OPC) demands increasingly accurate models of the photolithographic process. Model building and inference techniques in the data science community have seen great strides in the past two decades which make better use of available information. This paper aims to demonstrate the predictive power of Bayesian inference as a method for parameter selection in lithographic models by quantifying the uncertainty associated with model inputs and wafer data. Specifically, the method combines the model builder's prior information about each modelling assumption with the maximization of each observation's likelihood as a Student's t-distributed random variable. Through the use of a Markov chain Monte Carlo (MCMC) algorithm, a model's parameter space is explored to find the most credible parameter values. During parameter exploration, the parameters' posterior distributions are generated by applying Bayes' rule, using a likelihood function and the a priori knowledge supplied. The MCMC algorithm used, an affine invariant ensemble sampler (AIES), is implemented by initializing many walkers which semi-independently explore the space. The convergence of these walkers to global maxima of the likelihood volume determines the parameter values' highest density intervals (HDI) to reveal champion models. We show that this method of parameter selection provides insights into the data that traditional methods do not and outline continued experiments to vet the method.
Bayesian Cosmological inference beyond statistical isotropy
Souradeep, Tarun; Das, Santanu; Wandelt, Benjamin
2016-10-01
With the advent of rich data sets, the computationally challenging task of inference in cosmology has come to rely on stochastic sampling methods. First, I review the widely used MCMC approach to inferring cosmological parameters and present an improved adaptive implementation, SCoPE, developed by our group. Next, I present a general method for Bayesian inference of the underlying covariance structure of random fields on a sphere. We employ the Bipolar Spherical Harmonic (BipoSH) representation of general covariance structure on the sphere. We illustrate the efficacy of the method with a principled approach to assess violation of statistical isotropy (SI) in the sky maps of Cosmic Microwave Background (CMB) fluctuations. The general, principled approach to Bayesian inference of the covariance structure in a random field on a sphere presented here has huge potential for application to many other aspects of cosmology and astronomy, as well as more distant areas of research like geosciences and climate modelling.
Bayesian network reconstruction using systems genetics data: comparison of MCMC methods.
Tasaki, Shinya; Sauerwine, Ben; Hoff, Bruce; Toyoshiba, Hiroyoshi; Gaiteri, Chris; Chaibub Neto, Elias
2015-04-01
Reconstructing biological networks using high-throughput technologies has the potential to produce condition-specific interactomes. But are these reconstructed networks a reliable source of biological interactions? Do some network inference methods offer dramatically improved performance on certain types of networks? To facilitate the use of network inference methods in systems biology, we report a large-scale simulation study comparing the ability of Markov chain Monte Carlo (MCMC) samplers to reverse engineer Bayesian networks. The MCMC samplers we investigated included foundational and state-of-the-art Metropolis-Hastings and Gibbs sampling approaches, as well as novel samplers we have designed. To enable a comprehensive comparison, we simulated gene expression and genetics data from known network structures under a range of biologically plausible scenarios. We examine the overall quality of network inference via different methods, as well as how their performance is affected by network characteristics. Our simulations reveal that network size, edge density, and strength of gene-to-gene signaling are major parameters that differentiate the performance of various samplers. Specifically, more recent samplers including our novel methods outperform traditional samplers for highly interconnected large networks with strong gene-to-gene signaling. Our newly developed samplers show comparable or superior performance to the top existing methods. Moreover, this performance gain is strongest in networks with biologically oriented topology, which indicates that our novel samplers are suitable for inferring biological networks. The performance of MCMC samplers in this simulation framework can guide the choice of methods for network reconstruction using systems genetics data.
Probabilistic Inferences in Bayesian Networks
Ding, Jianguo
2010-01-01
This chapter summarizes the popular inference methods in Bayesian networks. The results demonstrate that evidence can be propagated across a Bayesian network along any link, whether in a forward, backward or intercausal style. The belief updating of Bayesian networks can be obtained by various available inference techniques. Theoretically, exact inference in Bayesian networks is feasible and manageable. However, the computation and inference are NP-hard. That means, in applications, in ...
Computationally efficient Bayesian inference for inverse problems.
Energy Technology Data Exchange (ETDEWEB)
Marzouk, Youssef M.; Najm, Habib N.; Rahn, Larry A.
2007-10-01
Bayesian statistics provides a foundation for inference from noisy and incomplete data, a natural mechanism for regularization in the form of prior information, and a quantitative assessment of uncertainty in the inferred results. Inverse problems - representing indirect estimation of model parameters, inputs, or structural components - can be fruitfully cast in this framework. Complex and computationally intensive forward models arising in physical applications, however, can render a Bayesian approach prohibitive. This difficulty is compounded by high-dimensional model spaces, as when the unknown is a spatiotemporal field. We present new algorithmic developments for Bayesian inference in this context, showing strong connections with the forward propagation of uncertainty. In particular, we introduce a stochastic spectral formulation that dramatically accelerates the Bayesian solution of inverse problems via rapid evaluation of a surrogate posterior. We also explore dimensionality reduction for the inference of spatiotemporal fields, using truncated spectral representations of Gaussian process priors. These new approaches are demonstrated on scalar transport problems arising in contaminant source inversion and in the inference of inhomogeneous material or transport properties. We also present a Bayesian framework for parameter estimation in stochastic models, where intrinsic stochasticity may be intermingled with observational noise. Evaluation of a likelihood function may not be analytically tractable in these cases, and thus several alternative Markov chain Monte Carlo (MCMC) schemes, operating on the product space of the observations and the parameters, are introduced.
Bayesian inference in geomagnetism
Backus, George E.
1988-01-01
The inverse problem in empirical geomagnetic modeling is investigated, with critical examination of recently published studies. Particular attention is given to the use of Bayesian inference (BI) to select the damping parameter lambda in the uniqueness portion of the inverse problem. The mathematical bases of BI and stochastic inversion are explored, with consideration of bound-softening problems and resolution in linear Gaussian BI. The problem of estimating the radial magnetic field B(r) at the earth core-mantle boundary from surface and satellite measurements is then analyzed in detail, with specific attention to the selection of lambda in the studies of Gubbins (1983) and Gubbins and Bloxham (1985). It is argued that the selection method is inappropriate and leads to lambda values much larger than those that would result if a reasonable bound on the heat flow at the CMB were assumed.
Improving the structure MCMC sampler for Bayesian networks by introducing a new edge reversal move
Grzegorczyk, Marco; Husmeier, Dirk
2008-01-01
Applications of Bayesian networks in systems biology are computationally demanding due to the large number of model parameters. Conventional MCMC schemes based on proposal moves in structure space tend to be too slow in mixing and convergence, and have recently been superseded by proposal moves in t
Murakami, Yohei; Takada, Shoji
2013-01-01
When model parameters in systems biology are not available from experiments, they need to be inferred so that the resulting simulation reproduces the experimentally known phenomena. For this purpose, Bayesian statistics with Markov chain Monte Carlo (MCMC) is a useful method. Conventional MCMC needs a likelihood to evaluate the posterior distribution of acceptable parameters, while approximate Bayesian computation (ABC) MCMC evaluates the posterior distribution using a qualitative fitness measure. However, neither of these algorithms can deal with a mixture of quantitative (i.e., likelihood) and qualitative fitness measures simultaneously. Here, to deal with this mixture, we formulated a Bayesian formula for hybrid fitness measures (HFM) and implemented it in MCMC (MCMC-HFM). We tested MCMC-HFM first on a kinetic toy model with a positive feedback. Inferring kinetic parameters mainly related to the positive feedback, we found that MCMC-HFM reliably inferred them using both qualitative and quantitative fitness measures. Then, we applied MCMC-HFM to a previously proposed apoptosis signal transduction network. For kinetic parameters related to implicit positive feedbacks, which are important for bistability and irreversibility of the output, MCMC-HFM reliably inferred these kinetic parameters. In particular, some kinetic parameters that have experimental estimates were inferred without using these data and the results were consistent with experiments. Moreover, for some parameters, the mixed use of quantitative and qualitative fitness measures narrowed down the acceptable range of parameters.
Inference in hybrid Bayesian networks
DEFF Research Database (Denmark)
Langseth, Helge; Nielsen, Thomas Dyhre; Rumí, Rafael
2009-01-01
Since the 1980s, Bayesian Networks (BNs) have become increasingly popular for building statistical models of complex systems. This is particularly true for boolean systems, where BNs often prove to be a more efficient modelling framework than traditional reliability-techniques (like fault trees...... decade's research on inference in hybrid Bayesian networks. The discussions are linked to an example model for estimating human reliability....
Bayesian Inference: with ecological applications
Link, William A.; Barker, Richard J.
2010-01-01
This text provides a mathematically rigorous yet accessible and engaging introduction to Bayesian inference with relevant examples that will be of interest to biologists working in the fields of ecology, wildlife management and environmental studies as well as students in advanced undergraduate statistics. This text opens the door to Bayesian inference, taking advantage of modern computational efficiencies and easily accessible software to evaluate complex hierarchical models.
Institute of Scientific and Technical Information of China (English)
Sheng Zheng
2013-01-01
The estimation of lower atmospheric refractivity from radar sea clutter (RFC) is a complicated nonlinear optimization problem. This paper deals with the RFC problem in a Bayesian framework. It uses the unbiased Markov chain Monte Carlo (MCMC) sampling technique, which can provide accurate posterior probability distributions of the estimated refractivity parameters by using an electromagnetic split-step fast Fourier transform terrain parabolic equation propagation model within a Bayesian inversion framework. In contrast to the global optimization algorithm, the Bayesian-MCMC can obtain not only the approximate solutions, but also the probability distributions of the solutions, that is, uncertainty analyses of solutions. The Bayesian-MCMC algorithm is implemented on the simulation radar sea-clutter data and the real radar sea-clutter data. Reference data are assumed to be simulation data and refractivity profiles are obtained using a helicopter. The inversion algorithm is assessed (i) by comparing the estimated refractivity profiles from the assumed simulation and the helicopter sounding data; and (ii) by the one-dimensional (1D) and two-dimensional (2D) posterior probability distributions of the solutions.
Partial Order MCMC for Structure Discovery in Bayesian Networks
Niinimaki, Teppo; Koivisto, Mikko
2012-01-01
We present a new Markov chain Monte Carlo method for estimating posterior probabilities of structural features in Bayesian networks. The method draws samples from the posterior distribution of partial orders on the nodes; for each sampled partial order, the conditional probabilities of interest are computed exactly. We give both analytical and empirical results that suggest the superiority of the new method compared to previous methods, which sample either directed acyclic graphs or linear orders on the nodes.
Perception, illusions and Bayesian inference.
Nour, Matthew M; Nour, Joseph M
2015-01-01
Descriptive psychopathology makes a distinction between veridical perception and illusory perception. In both cases a perception is tied to a sensory stimulus, but in illusions the perception is of a false object. This article re-examines this distinction in light of new work in theoretical and computational neurobiology, which views all perception as a form of Bayesian statistical inference that combines sensory signals with prior expectations. Bayesian perceptual inference can solve the 'inverse optics' problem of veridical perception and provides a biologically plausible account of a number of illusory phenomena, suggesting that veridical and illusory perceptions are generated by precisely the same inferential mechanisms.
Probability biases as Bayesian inference
Directory of Open Access Journals (Sweden)
Andre C. R. Martins
2006-11-01
In this article, I will show how several observed biases in human probabilistic reasoning can be partially explained as good heuristics for making inferences in an environment where probabilities have uncertainties associated with them. Previous results show that the weight functions and the observed violations of coalescing and stochastic dominance can be understood from a Bayesian point of view. We will review those results and see that Bayesian methods should also be used as part of the explanation behind other known biases. That means that, although the observed errors are still errors under the...... be understood as adaptations to the solution of real-life problems. Heuristics that allow fast evaluations and mimic a Bayesian inference would be an evolutionary advantage, since they would give us an efficient way of making decisions. In that sense, it should be no surprise that humans reason with probability as it has been observed.
Bayesian Inference for Radio Observations
Lochner, Michelle; Zwart, Jonathan T L; Smirnov, Oleg; Bassett, Bruce A; Oozeer, Nadeem; Kunz, Martin
2015-01-01
(Abridged) New telescopes like the Square Kilometre Array (SKA) will push into a new sensitivity regime and expose systematics, such as direction-dependent effects, that could previously be ignored. Current methods for handling such systematics rely on alternating best estimates of instrumental calibration and models of the underlying sky, which can lead to inaccurate uncertainty estimates and biased results because such methods ignore any correlations between parameters. These deconvolution algorithms produce a single image that is assumed to be a true representation of the sky, when in fact it is just one realisation of an infinite ensemble of images compatible with the noise in the data. In contrast, here we report a Bayesian formalism that simultaneously infers both systematics and science. Our technique, Bayesian Inference for Radio Observations (BIRO), determines all parameters directly from the raw data, bypassing image-making entirely, by sampling from the joint posterior probability distribution. Thi...
Bayesian inference on proportional elections.
Directory of Open Access Journals (Sweden)
Gabriel Hideki Vatanabe Brunello
Polls for majoritarian voting systems usually show estimates of the percentage of votes for each candidate. However, proportional vote systems do not necessarily guarantee that the candidate with the highest percentage of votes will be elected. Thus, traditional methods used in majoritarian elections cannot be applied to proportional elections. In this context, the purpose of this paper was to perform Bayesian inference on proportional elections considering the Brazilian system of seat distribution. More specifically, a methodology to estimate the probability that a given party will have representation in the Chamber of Deputies was developed. Inferences were made in a Bayesian scenario using the Monte Carlo simulation technique, and the developed methodology was applied to data from the 2010 Brazilian elections for Members of the Legislative Assembly and the Federal Chamber of Deputies. A performance rate was also presented to evaluate the efficiency of the methodology. Calculations and simulations were carried out using the free R statistical software.
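The kind of Monte Carlo calculation the abstract describes can be sketched as follows. This is an illustration under stated assumptions, not the paper's method: poll counts are given a Dirichlet posterior with a uniform prior, and seats are allocated with the plain D'Hondt highest-averages rule as a stand-in for the Brazilian seat-distribution rules, which the abstract does not spell out. The paper used R; Python is used here only for the sketch, and the function names and poll numbers are hypothetical.

```python
import numpy as np

def dhondt(votes, n_seats):
    """Allocate n_seats by the D'Hondt highest-averages rule (stand-in allocation rule)."""
    votes = np.asarray(votes, dtype=float)
    seats = np.zeros(len(votes), dtype=int)
    for _ in range(n_seats):
        # The next seat goes to the party with the largest quotient votes / (seats + 1).
        seats[np.argmax(votes / (seats + 1))] += 1
    return seats

def representation_probability(poll_counts, n_seats, n_draws=5000, rng=None):
    """Posterior probability that each party wins at least one seat."""
    rng = np.random.default_rng() if rng is None else rng
    alpha = np.asarray(poll_counts, dtype=float) + 1.0   # uniform Dirichlet prior (assumption)
    shares = rng.dirichlet(alpha, size=n_draws)          # posterior draws of vote shares
    seated = np.array([dhondt(s, n_seats) > 0 for s in shares])
    return seated.mean(axis=0)

# Deterministic check of the allocation rule on a small example.
example_seats = dhondt([100, 80, 30, 20], 8)

# Hypothetical poll of 1000 respondents across four parties, 10 seats at stake.
probs = representation_probability(
    [400, 300, 200, 100], n_seats=10, rng=np.random.default_rng(1)
)
```

Swapping `dhondt` for a function that implements the actual electoral-quotient rules would leave the rest of the simulation unchanged, since the posterior draws and the seat rule are kept separate.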
Bayesian inference on EMRI signals using low frequency approximations
Ali, Asad; Meyer, Renate; Röver, Christian (doi:10.1088/0264-9381/29/14/145014)
2013-01-01
Extreme mass ratio inspirals (EMRIs) are thought to be one of the most exciting gravitational wave sources to be detected with LISA. Due to their complicated nature and weak amplitudes the detection and parameter estimation of such sources is a challenging task. In this paper we present a statistical methodology based on Bayesian inference in which the estimation of parameters is carried out by advanced Markov chain Monte Carlo (MCMC) algorithms such as parallel tempering MCMC. We analysed high and medium mass EMRI systems that fall well inside the low frequency range of LISA. In the context of the Mock LISA Data Challenges, our investigation and results are also the first instance in which a fully Markovian algorithm is applied for EMRI searches. Results show that our algorithm worked well in recovering EMRI signals from different (simulated) LISA data sets having single and multiple EMRI sources and holds great promise for posterior computation under more realistic conditions. The search and estimation meth...
Large-Scale Distributed Bayesian Matrix Factorization using Stochastic Gradient MCMC
Ahn, S.; Korattikara, A.; Liu, N.; Rajan, S.; Welling, M.
2015-01-01
Despite having various attractive qualities such as high prediction accuracy and the ability to quantify uncertainty and avoid overfitting, Bayesian Matrix Factorization has not been widely adopted because of the prohibitive cost of inference. In this paper, we propose a scalable distributed Bayesian
Bayesian Methods for Hackers: Probabilistic Programming and Bayesian Inference
Davidson-Pilon, Cameron
2016-01-01
Bayesian methods of inference are deeply natural and extremely powerful. However, most discussions of Bayesian inference rely on intensely complex mathematical analyses and artificial examples, making it inaccessible to anyone without a strong mathematical background. Now, though, Cameron Davidson-Pilon introduces Bayesian inference from a computational perspective, bridging theory to practice–freeing you to get results using computing power. Bayesian Methods for Hackers illuminates Bayesian inference through probabilistic programming with the powerful PyMC language and the closely related Python tools NumPy, SciPy, and Matplotlib. Using this approach, you can reach effective solutions in small increments, without extensive mathematical intervention. Davidson-Pilon begins by introducing the concepts underlying Bayesian inference, comparing it with other techniques and guiding you through building and training your first Bayesian model. Next, he introduces PyMC through a series of detailed examples a...
DEFF Research Database (Denmark)
Picchini, Umberto; Forman, Julie Lyng
2016-01-01
In recent years, dynamical modelling has been provided with a range of breakthrough methods to perform exact Bayesian inference. However, it is often computationally infeasible to apply exact statistical methodologies in the context of large data sets and complex models. This paper considers...... a nonlinear stochastic differential equation model observed with correlated measurement errors and an application to protein folding modelling. An approximate Bayesian computation (ABC)-MCMC algorithm is suggested to allow inference for model parameters within reasonable time constraints. The ABC algorithm...... applications. A simulation study is conducted to compare our strategy with exact Bayesian inference, the latter being two orders of magnitude slower than ABC-MCMC for the considered set-up. Finally, the ABC algorithm is applied to a large protein data set. The suggested methodology is fairly general...
Compiling Relational Bayesian Networks for Exact Inference
DEFF Research Database (Denmark)
Jaeger, Manfred; Chavira, Mark; Darwiche, Adnan
2004-01-01
We describe a system for exact inference with relational Bayesian networks as defined in the publicly available Primula tool. The system is based on compiling propositional instances of relational Bayesian networks into arithmetic circuits and then performing online inference by evaluating...... and differentiating these circuits in time linear in their size. We report on experimental results showing the successful compilation, and efficient inference, on relational Bayesian networks whose Primula-generated propositional instances have thousands of variables, and whose jointrees have clusters...
Nonparametric Bayesian inference in biostatistics
Müller, Peter
2015-01-01
As chapters in this book demonstrate, BNP has important uses in clinical sciences and inference for issues like unknown partitions in genomics. Nonparametric Bayesian approaches (BNP) play an ever expanding role in biostatistical inference from use in proteomics to clinical trials. Many research problems involve an abundance of data and require flexible and complex probability models beyond the traditional parametric approaches. As this book's expert contributors show, BNP approaches can be the answer. Survival Analysis, in particular survival regression, has traditionally used BNP, but BNP's potential is now very broad. This applies to important tasks like arrangement of patients into clinically meaningful subpopulations and segmenting the genome into functionally distinct regions. This book is designed to both review and introduce application areas for BNP. While existing books provide theoretical foundations, this book connects theory to practice through engaging examples and research questions. Chapters c...
Bayesian Inference with Optimal Maps
Moselhy, Tarek A El
2011-01-01
We present a new approach to Bayesian inference that entirely avoids Markov chain simulation, by constructing a map that pushes forward the prior measure to the posterior measure. Existence and uniqueness of a suitable measure-preserving map is established by formulating the problem in the context of optimal transport theory. We discuss various means of explicitly parameterizing the map and computing it efficiently through solution of an optimization problem, exploiting gradient information from the forward model when possible. The resulting algorithm overcomes many of the computational bottlenecks associated with Markov chain Monte Carlo. Advantages of a map-based representation of the posterior include analytical expressions for posterior moments and the ability to generate arbitrary numbers of independent posterior samples without additional likelihood evaluations or forward solves. The optimization approach also provides clear convergence criteria for posterior approximation and facilitates model selectio...
A Bayesian MCMC method for point process models with intractable normalising constants
DEFF Research Database (Denmark)
Berthelsen, Kasper Klitgaard; Møller, Jesper
2004-01-01
to simulate from the "unknown distribution", perfect simulation algorithms become useful. We illustrate the method in cases where the likelihood is given by a Markov point process model. Particularly, we consider semi-parametric Bayesian inference in connection to both inhomogeneous Markov point process models...
Parallel local approximation MCMC for expensive models
Conrad, Patrick; Davis, Andrew; Marzouk, Youssef; Pillai, Natesh; Smith, Aaron
2016-01-01
Performing Bayesian inference via Markov chain Monte Carlo (MCMC) can be exceedingly expensive when posterior evaluations invoke the evaluation of a computationally expensive model, such as a system of partial differential equations. In recent work [Conrad et al. JASA 2015, arXiv:1402.1694] we described a framework for constructing and refining local approximations of such models during an MCMC simulation. These posterior-adapted approximations harness regularity of the model to reduce the c...
Compiling Relational Bayesian Networks for Exact Inference
DEFF Research Database (Denmark)
Jaeger, Manfred; Darwiche, Adnan; Chavira, Mark
2006-01-01
We describe in this paper a system for exact inference with relational Bayesian networks as defined in the publicly available PRIMULA tool. The system is based on compiling propositional instances of relational Bayesian networks into arithmetic circuits and then performing online inference by evaluating and differentiating these circuits in time linear in their size. We report on experimental results showing successful compilation and efficient inference on relational Bayesian networks, whose PRIMULA-generated propositional instances have thousands of variables, and whose jointrees have clusters...
An application of Bayesian inference for solar-like pulsators
Benomar, O.
2008-12-01
As the amount of data collected by space-borne asteroseismic instruments (such as CoRoT and Kepler) increases drastically, it will be useful to have automated processes to extract a maximum of information from these data. The use of a Bayesian approach could be very helpful for this goal. Only a few attempts have been made in this direction (e.g. Brewer et al. 2007). We propose to use Markov Chain Monte Carlo (MCMC) simulations with Metropolis-Hastings (MH) based algorithms to infer the main stellar oscillation parameters from the power spectrum, in the case of solar-like pulsators. Given a number of modes to be fitted, the algorithm is able to give the best set of parameters (frequency, linewidth, amplitude, rotational splitting) corresponding to a chosen input model. We illustrate this algorithm with one of the first CoRoT targets: HD 49933.
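As an illustration of the Metropolis-Hastings idea mentioned in the abstract above, here is a minimal random-walk sketch in Python. It is not the author's code: the target `log_target` (a one-dimensional Gaussian with mean 3 and standard deviation 0.5), the step size, and all function names are assumptions chosen only to keep the example self-contained. A real application would replace `log_target` with the log-posterior of the oscillation parameters given the power spectrum.

```python
import math
import random


def log_target(x):
    # Hypothetical unnormalised log-posterior: Gaussian, mean 3.0, sd 0.5.
    return -0.5 * ((x - 3.0) / 0.5) ** 2


def metropolis_hastings(log_p, x0, n_steps, step=0.5, seed=0):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    x, lp = x0, log_p(x0)
    samples = []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step)
        lp_proposal = log_p(proposal)
        # Symmetric proposal, so the acceptance probability is
        # min(1, p(proposal) / p(x)).
        accept_prob = math.exp(min(0.0, lp_proposal - lp))
        if rng.random() < accept_prob:
            x, lp = proposal, lp_proposal
        samples.append(x)
    return samples


chain = metropolis_hastings(log_target, x0=0.0, n_steps=20000)
kept = chain[2000:]  # discard an (assumed) burn-in of 2000 steps
posterior_mean = sum(kept) / len(kept)
print("posterior mean estimate:", posterior_mean)
```

The chain starts far from the mode on purpose, so that the role of the discarded burn-in segment is visible; the burn-in length here is a guess, not a tuned value.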
Bayesian Inference Methods for Sparse Channel Estimation
DEFF Research Database (Denmark)
Pedersen, Niels Lovmand
2013-01-01
This thesis deals with sparse Bayesian learning (SBL) with application to radio channel estimation. As opposed to the classical approach for sparse signal representation, we focus on the problem of inferring complex signals. Our investigations within SBL constitute the basis for the development of Bayesian inference algorithms for sparse channel estimation. Sparse inference methods aim at finding the sparse representation of a signal given in some overcomplete dictionary of basis vectors. Within this context, one of our main contributions to the field of SBL is a hierarchical representation ... analysis of the complex prior representation, where we show that the ability to induce sparse estimates of a given prior heavily depends on the inference method used and, interestingly, whether real or complex variables are inferred. We also show that the Bayesian estimators derived from the proposed...
Bayesian Inference on Gravitational Waves
Directory of Open Access Journals (Sweden)
Asad Ali
2015-12-01
The Bayesian approach is becoming increasingly popular among the astrophysics data analysis communities. However, the Pakistan statistics communities are unaware of this fertile interaction between the two disciplines. Bayesian methods have been in use to address astronomical problems since the very birth of the Bayes probability in the eighteenth century. Today the Bayesian methods for the detection and parameter estimation of gravitational waves have solid theoretical grounds with strong promise for realistic applications. This article aims to introduce the Pakistan statistics communities to the applications of Bayesian Monte Carlo methods in the analysis of gravitational wave data, with an overview of the Bayesian signal detection and estimation methods and a demonstration by a couple of simplified examples.
Picturing classical and quantum Bayesian inference
Coecke, Bob
2011-01-01
We introduce a graphical framework for Bayesian inference that is sufficiently general to accommodate not just the standard case but also recent proposals for a theory of quantum Bayesian inference wherein one considers density operators rather than probability distributions as representative of degrees of belief. The diagrammatic framework is stated in the graphical language of symmetric monoidal categories and of compact structures and Frobenius structures therein, in which Bayesian inversion boils down to transposition with respect to an appropriate compact structure. We characterize classical Bayesian inference in terms of a graphical property and demonstrate that our approach eliminates some purely conventional elements that appear in common representations thereof, such as whether degrees of belief are represented by probabilities or entropic quantities. We also introduce a quantum-like calculus wherein the Frobenius structure is noncommutative and show that it can accommodate Leifer's calculus of `cond...
AGNfitter: A Bayesian MCMC approach to fitting spectral energy distributions of AGN
Rivera, Gabriela Calistro; Hennawi, Joseph F; Hogg, David W
2016-01-01
We present AGNfitter, a publicly available open-source algorithm implementing a fully Bayesian Markov Chain Monte Carlo method to fit the spectral energy distributions (SEDs) of active galactic nuclei (AGN) from the sub-mm to the UV, allowing one to robustly disentangle the physical processes responsible for their emission. AGNfitter makes use of a large library of theoretical, empirical, and semi-empirical models to characterize both the nuclear and host galaxy emission simultaneously. The model consists of four physical emission components: an accretion disk, a torus of AGN heated dust, stellar populations, and cold dust in star forming regions. AGNfitter determines the posterior distributions of numerous parameters that govern the physics of AGN with a fully Bayesian treatment of errors and parameter degeneracies, allowing one to infer integrated luminosities, dust attenuation parameters, stellar masses, and star formation rates. We tested AGNfitter's performance on real data by fitting the SEDs of a sample...
An Intuitive Dashboard for Bayesian Network Inference
Reddy, Vikas; Charisse Farr, Anna; Wu, Paul; Mengersen, Kerrie; Yarlagadda, Prasad K. D. V.
2014-03-01
Current Bayesian network software packages provide a good graphical interface for users who design and develop Bayesian networks for various applications. However, the intended end-users of these networks may not necessarily find such an interface appealing and at times it could be overwhelming, particularly when the number of nodes in the network is large. To circumvent this problem, this paper presents an intuitive dashboard, which provides an additional layer of abstraction, enabling the end-users to easily perform inferences over the Bayesian networks. Unlike most software packages, which display the nodes and arcs of the network, the developed tool organises the nodes based on the cause-and-effect relationship, making the user-interaction more intuitive and friendly. In addition to performing various types of inferences, the users can conveniently use the tool to verify the behaviour of the developed Bayesian network. The tool has been developed using the Qt and SMILE libraries in C++.
Tactile length contraction as Bayesian inference.
Tong, Jonathan; Ngo, Vy; Goldreich, Daniel
2016-08-01
To perceive, the brain must interpret stimulus-evoked neural activity. This is challenging: The stochastic nature of the neural response renders its interpretation inherently uncertain. Perception would be optimized if the brain used Bayesian inference to interpret inputs in light of expectations derived from experience. Bayesian inference would improve perception on average but cause illusions when stimuli violate expectation. Intriguingly, tactile, auditory, and visual perception are all prone to length contraction illusions, characterized by the dramatic underestimation of the distance between punctate stimuli delivered in rapid succession; the origin of these illusions has been mysterious. We previously proposed that length contraction illusions occur because the brain interprets punctate stimulus sequences using Bayesian inference with a low-velocity expectation. A novel prediction of our Bayesian observer model is that length contraction should intensify if stimuli are made more difficult to localize. Here we report a tactile psychophysical study that tested this prediction. Twenty humans compared two distances on the forearm: a fixed reference distance defined by two taps with 1-s temporal separation and an adjustable comparison distance defined by two taps with temporal separation t ≤ 1 s. We observed significant length contraction: As t was decreased, participants perceived the two distances as equal only when the comparison distance was made progressively greater than the reference distance. Furthermore, the use of weaker taps significantly enhanced participants' length contraction. These findings confirm the model's predictions, supporting the view that the spatiotemporal percept is a best estimate resulting from a Bayesian inference process.
Variational Bayesian Inference of Line Spectra
DEFF Research Database (Denmark)
Badiu, Mihai Alin; Hansen, Thomas Lundgaard; Fleury, Bernard Henri
2016-01-01
In this paper, we address the fundamental problem of line spectral estimation in a Bayesian framework. We target model order and parameter estimation via variational inference in a probabilistic model in which the frequencies are continuous-valued, i.e., not restricted to a grid; and the coefficients are governed by a Bernoulli-Gaussian prior model turning model order selection into binary sequence detection. Unlike earlier works which retain only point estimates of the frequencies, we undertake a more complete Bayesian treatment by estimating the posterior probability density functions (pdfs...
Decision generation tools and Bayesian inference
Jannson, Tomasz; Wang, Wenjian; Forrester, Thomas; Kostrzewski, Andrew; Veeris, Christian; Nielsen, Thomas
2014-05-01
Digital Decision Generation (DDG) tools are important software sub-systems of Command and Control (C2) systems and technologies. In this paper, we present a special type of DDG based on Bayesian inference, related to adverse (hostile) networks, including such important applications as terrorism-related and organized crime networks.
Bayesian Inference in Queueing Networks
Sutton, Charles
2010-01-01
Modern Web services, such as those at Google, Yahoo!, and Amazon, handle billions of requests per day on clusters of thousands of computers. Because these services operate under strict performance requirements, a statistical understanding of their performance is of great practical interest. Such services are modeled by networks of queues, where one queue models each of the individual computers in the system. A key challenge is that the data is incomplete, because recording detailed information about every request to a heavily used system can require unacceptable overhead. In this paper we develop a Bayesian perspective on queueing models in which the arrival and departure times that are not observed are treated as latent variables. Underlying this viewpoint is the observation that a queueing model defines a deterministic transformation between the data and a set of independent variables called the service times. With this viewpoint in hand, we sample from the posterior distribution over missing data and model...
AGNfitter: A Bayesian MCMC Approach to Fitting Spectral Energy Distributions of AGNs
Calistro Rivera, Gabriela; Lusso, Elisabeta; Hennawi, Joseph F.; Hogg, David W.
2016-12-01
We present AGNfitter, a publicly available open-source algorithm implementing a fully Bayesian Markov Chain Monte Carlo method to fit the spectral energy distributions (SEDs) of active galactic nuclei (AGNs) from the sub-millimeter to the UV, allowing one to robustly disentangle the physical processes responsible for their emission. AGNfitter makes use of a large library of theoretical, empirical, and semi-empirical models to characterize both the nuclear and host galaxy emission simultaneously. The model consists of four physical emission components: an accretion disk, a torus of AGN heated dust, stellar populations, and cold dust in star-forming regions. AGNfitter determines the posterior distributions of numerous parameters that govern the physics of AGNs with a fully Bayesian treatment of errors and parameter degeneracies, allowing one to infer integrated luminosities, dust attenuation parameters, stellar masses, and star-formation rates. We tested AGNfitter’s performance on real data by fitting the SEDs of a sample of 714 X-ray selected AGNs from the XMM-COSMOS survey, spectroscopically classified as Type1 (unobscured) and Type2 (obscured) AGNs by their optical-UV emission lines. We find that two independent model parameters, namely the reddening of the accretion disk and the column density of the dusty torus, are good proxies for AGN obscuration, allowing us to develop a strategy for classifying AGNs as Type1 or Type2, based solely on an SED-fitting analysis. Our classification scheme is in excellent agreement with the spectroscopic classification, giving a completeness fraction of ~86% and ~70%, and an efficiency of ~80% and ~77%, for Type1 and Type2 AGNs, respectively.
The NIFTY way of Bayesian signal inference
Energy Technology Data Exchange (ETDEWEB)
Selig, Marco, E-mail: mselig@mpa-Garching.mpg.de [Max Planck Institut für Astrophysik, Karl-Schwarzschild-Straße 1, D-85748 Garching, Germany, and Ludwig-Maximilians-Universität München, Geschwister-Scholl-Platz 1, D-80539 München (Germany)
2014-12-05
We introduce NIFTY, 'Numerical Information Field Theory', a software package for the development of Bayesian signal inference algorithms that operate independently from any underlying spatial grid and its resolution. A large number of Bayesian and Maximum Entropy methods for 1D signal reconstruction, 2D imaging, as well as 3D tomography, appear formally similar, but one often finds individualized implementations that are neither flexible nor easily transferable. Signal inference in the framework of NIFTY can be done in an abstract way, such that algorithms, prototyped in 1D, can be applied to real world problems in higher-dimensional settings. NIFTY as a versatile library is applicable and already has been applied in 1D, 2D, 3D and spherical settings. A recent application is the D³PO algorithm targeting the non-trivial task of denoising, deconvolving, and decomposing photon observations in high energy astronomy.
Bayesianism and inference to the best explanation
Directory of Open Access Journals (Sweden)
Valeriano IRANZO
2008-01-01
Bayesianism and Inference to the best explanation (IBE) are two different models of inference. Recently there has been some debate about the possibility of “bayesianizing” IBE. Firstly I explore several alternatives to include explanatory considerations in Bayes’s Theorem. Then I distinguish two different interpretations of prior probabilities: “IBE-Bayesianism” (IBE-Bay) and “frequentist-Bayesianism” (Freq-Bay). After detailing the content of the latter, I propose a rule for assessing the priors. I also argue that Freq-Bay: (i) endorses a role for explanatory value in the assessment of scientific hypotheses; (ii) avoids a purely subjectivist reading of prior probabilities; and (iii) fits better than IBE-Bayesianism with two basic facts about science, i.e., the prominent role played by empirical testing and the existence of many scientific theories in the past that failed to fulfil their promises and were subsequently abandoned.
Using Alien Coins to Test Whether Simple Inference Is Bayesian
Cassey, Peter; Hawkins, Guy E.; Donkin, Chris; Brown, Scott D.
2016-01-01
Reasoning and inference are well-studied aspects of basic cognition that have been explained as statistically optimal Bayesian inference. Using a simplified experimental design, we conducted quantitative comparisons between Bayesian inference and human inference at the level of individuals. In 3 experiments, with more than 13,000 participants, we…
Bayesian Estimation and Inference Using Stochastic Electronics.
Thakur, Chetan Singh; Afshar, Saeed; Wang, Runchun M; Hamilton, Tara J; Tapson, Jonathan; van Schaik, André
2016-01-01
In this paper, we present the implementation of two types of Bayesian inference problems to demonstrate the potential of building probabilistic algorithms in hardware using a single set of building blocks with the ability to perform these computations in real time. The first implementation, referred to as the BEAST (Bayesian Estimation and Stochastic Tracker), demonstrates a simple problem where an observer uses an underlying Hidden Markov Model (HMM) to track a target in one dimension. In this implementation, sensors make noisy observations of the target position at discrete time steps. The tracker learns the transition model for target movement, and the observation model for the noisy sensors, and uses these to estimate the target position by solving the Bayesian recursive equation online. We show the tracking performance of the system and demonstrate how it can learn the observation model, the transition model, and the external distractor (noise) probability interfering with the observations. In the second implementation, referred to as the Bayesian INference in DAG (BIND), we show how inference can be performed in a Directed Acyclic Graph (DAG) using stochastic circuits. We show how these building blocks can be easily implemented using simple digital logic gates. An advantage of the stochastic electronic implementation is that it is robust to certain types of noise, which may become an issue in integrated circuit (IC) technology with feature sizes on the order of tens of nanometers due to their low noise margin, the effect of high-energy cosmic rays and the low supply voltage. In our framework, the flipping of random individual bits would not affect the system performance because information is encoded in a bit stream.
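The "Bayesian recursive equation" for HMM tracking referred to in the abstract above can be sketched in software as a discrete predict/update filter. The sketch below is a plain Python illustration, not the stochastic-circuit implementation of the paper; the three-cell track, the transition matrix, and the sensor model (correct cell reported with probability 0.8) are hypothetical values chosen for the example.

```python
def predict(belief, transition):
    # transition[i][j] is the assumed P(next cell = j | current cell = i).
    n = len(belief)
    return [sum(belief[i] * transition[i][j] for i in range(n))
            for j in range(n)]


def update(belief, likelihood):
    # Multiply the predicted belief by the observation likelihood, renormalise.
    unnormalised = [b * l for b, l in zip(belief, likelihood)]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


def sensor_likelihood(observed_cell, n_cells, p_correct=0.8):
    # Hypothetical sensor: true cell with p_correct, other cells share the rest.
    p_wrong = (1.0 - p_correct) / (n_cells - 1)
    return [p_correct if c == observed_cell else p_wrong
            for c in range(n_cells)]


# Hypothetical motion model: stay or move one cell to the right.
transition = [
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [0.0, 0.0, 1.0],
]

belief = [1.0 / 3.0] * 3          # uniform prior over the three cells
for observation in [0, 1, 2]:     # noisy position reports, one per time step
    belief = predict(belief, transition)
    belief = update(belief, sensor_likelihood(observation, len(belief)))

print("posterior over cells:", belief)
```

Each loop iteration applies one step of the recursion: propagate the belief through the transition model, then reweight it by the likelihood of the new observation.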
Trans-Dimensional Bayesian Inference for Gravitational Lens Substructures
Brewer, Brendon J; Lewis, Geraint F
2015-01-01
We introduce a Bayesian solution to the problem of inferring the density profile of strong gravitational lenses when the lens galaxy may contain multiple dark or faint substructures. The source and lens models are based on a superposition of an unknown number of non-negative basis functions (or "blobs") whose form was chosen with speed as a primary criterion. The prior distribution for the blobs' properties is specified hierarchically, so the mass function of substructures is a natural output of the method. We use reversible jump Markov Chain Monte Carlo (MCMC) within Diffusive Nested Sampling (DNS) to sample the posterior distribution and evaluate the marginal likelihood of the model, including the summation over the unknown number of blobs in the source and the lens. We demonstrate the method on a simulated data set with a single substructure, which is recovered well with moderate uncertainties. We also apply the method to the g-band image of the "Cosmic Horseshoe" system, and find some hints of potential s...
Geometric ergodicity of a hybrid sampler for Bayesian inference of phylogenetic branch lengths.
Spade, David A; Herbei, Radu; Kubatko, Laura S
2015-10-01
One of the fundamental goals in phylogenetics is to make inferences about the evolutionary pattern among a group of individuals, such as genes or species, using present-day genetic material. This pattern is represented by a phylogenetic tree, and as computational methods have caught up to the statistical theory, Bayesian methods of making inferences about phylogenetic trees have become increasingly popular. Bayesian inference of phylogenetic trees requires sampling from intractable probability distributions. Common methods of sampling from these distributions include Markov chain Monte Carlo (MCMC) and Sequential Monte Carlo (SMC) methods, and one way that both of these methods can proceed is by first simulating a tree topology and then taking a sample from the posterior distribution of the branch lengths given the tree topology and the data set. In many MCMC methods, it is difficult to verify that the underlying Markov chain is geometrically ergodic, and thus, it is necessary to rely on output-based convergence diagnostics in order to assess convergence on an ad hoc basis. These diagnostics suffer from several important limitations, so in an effort to circumvent these limitations, this work establishes geometric convergence for a particular Markov chain that is used to sample branch lengths under a fairly general class of nucleotide substitution models and provides a numerical method for estimating the time this Markov chain takes to converge.
Bayesian inference for pulsar timing models
Vigeland, Sarah J
2013-01-01
The extremely regular, periodic radio emission from millisecond pulsars makes them useful tools for studying neutron star astrophysics, general relativity, and low-frequency gravitational waves. These studies require that the observed pulse times of arrival are fit to complicated timing models that describe numerous effects such as the astrometry of the source, the evolution of the pulsar's spin, the presence of a binary companion, and the propagation of the pulses through the interstellar medium. In this paper, we discuss the benefits of using Bayesian inference to obtain these timing solutions. These include the validation of linearized least-squares model fits when they are correct, and the proper characterization of parameter uncertainties when they are not; the incorporation of prior parameter information and of models of correlated noise; and the Bayesian comparison of alternative timing models. We describe our computational setup, which combines the timing models of tempo2 with the nested-sampling integ...
Human collective intelligence as distributed Bayesian inference
Krafft, Peter M; Pan, Wei; Della Penna, Nicolás; Altshuler, Yaniv; Shmueli, Erez; Tenenbaum, Joshua B; Pentland, Alex
2016-01-01
Collective intelligence is believed to underlie the remarkable success of human society. The formation of accurate shared beliefs is one of the key components of human collective intelligence. How are accurate shared beliefs formed in groups of fallible individuals? Answering this question requires a multiscale analysis. We must understand both the individual decision mechanisms people use, and the properties and dynamics of those mechanisms in the aggregate. As of yet, mathematical tools for such an approach have been lacking. To address this gap, we introduce a new analytical framework: We propose that groups arrive at accurate shared beliefs via distributed Bayesian inference. Distributed inference occurs through information processing at the individual level, and yields rational belief formation at the group level. We instantiate this framework in a new model of human social decision-making, which we validate using a dataset we collected of over 50,000 users of an online social trading platform where inves...
MultiNest: an efficient and robust Bayesian inference tool for cosmology and particle physics
Feroz, F; Bridges, M
2008-01-01
We present further development and the first public release of our multimodal nested sampling algorithm, called MultiNest. This Bayesian inference tool calculates the evidence, with an associated error estimate, and produces posterior samples from distributions that may contain multiple modes and pronounced (curving) degeneracies in high dimensions. The developments presented here lead to further substantial improvements in sampling efficiency and robustness, as compared to the original algorithm presented in Feroz & Hobson (2008), which itself significantly outperformed existing MCMC techniques in a wide range of astrophysical inference problems. The accuracy and economy of the MultiNest algorithm are demonstrated by application to two toy problems and to a cosmological inference problem focussing on the extension of the vanilla $\Lambda$CDM model to include spatial curvature and a varying equation of state for dark energy. The MultiNest software, which is fully parallelized using MPI and includes an inte...
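To make the notion of "calculating the evidence" by nested sampling concrete, the following is a deliberately simplified one-dimensional sketch in Python. It is not MultiNest and does not use its ellipsoidal or multimodal machinery: new live points are drawn by plain rejection sampling from a uniform prior on [0, 1], which only works for toy problems. The likelihood (a normalised Gaussian with mean 0.5 and standard deviation 0.1), the number of live points, and the iteration count are all assumptions for illustration. For this toy setup the true evidence is very close to 1, so the estimated log-evidence should be near 0.

```python
import math
import random


def log_likelihood(x):
    # Hypothetical toy likelihood: normalised Gaussian, mean 0.5, sd 0.1.
    sigma = 0.1
    return (-0.5 * ((x - 0.5) / sigma) ** 2
            - math.log(sigma * math.sqrt(2.0 * math.pi)))


def log_add_exp(a, b):
    # Numerically stable log(exp(a) + exp(b)).
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def nested_sampling(log_l, n_live=100, n_iter=600, seed=1):
    rng = random.Random(seed)
    live = [rng.random() for _ in range(n_live)]   # draws from the U(0,1) prior
    live_logl = [log_l(x) for x in live]
    log_z = -math.inf
    log_x_prev = 0.0                               # log prior volume, X_0 = 1
    for i in range(1, n_iter + 1):
        worst = min(range(n_live), key=live_logl.__getitem__)
        threshold = live_logl[worst]
        log_x = -i / n_live                        # expected shrinkage X_i = exp(-i/N)
        # Shell width w_i = X_{i-1} - X_i, computed in log space.
        log_w = log_x_prev + math.log1p(-math.exp(log_x - log_x_prev))
        log_z = log_add_exp(log_z, threshold + log_w)
        # Replace the worst point by a prior draw with higher likelihood.
        while True:
            candidate = rng.random()
            if log_l(candidate) > threshold:
                break
        live[worst] = candidate
        live_logl[worst] = log_l(candidate)
        log_x_prev = log_x
    # Add the contribution of the remaining live points.
    for ll in live_logl:
        log_z = log_add_exp(log_z, ll + log_x_prev - math.log(n_live))
    return log_z


print("estimated log-evidence:", nested_sampling(log_likelihood))
```

Rejection sampling from the full prior becomes hopelessly inefficient as the likelihood constraint tightens in higher dimensions, which is the bottleneck that dedicated nested sampling implementations are designed to address.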
Bayesian Inference of a Multivariate Regression Model
Directory of Open Access Journals (Sweden)
Marick S. Sinay
2014-01-01
We explore Bayesian inference of a multivariate linear regression model with use of a flexible prior for the covariance structure. The commonly adopted Bayesian setup involves the conjugate prior, multivariate normal distribution for the regression coefficients and inverse Wishart specification for the covariance matrix. Here we depart from this approach and propose a novel Bayesian estimator for the covariance. A multivariate normal prior for the unique elements of the matrix logarithm of the covariance matrix is considered. Such structure allows for a richer class of prior distributions for the covariance, with respect to strength of beliefs in prior location hyperparameters, as well as the added ability to model potential correlation amongst the covariance structure. The posterior moments of all relevant parameters of interest are calculated based upon numerical results via a Markov chain Monte Carlo procedure. The Metropolis-Hastings-within-Gibbs algorithm is invoked to account for the construction of a proposal density that closely matches the shape of the target posterior distribution. As an application of the proposed technique, we investigate a multiple regression based upon the 1980 High School and Beyond Survey.
Universal Darwinism as a process of Bayesian inference
Campbell, John O
2016-01-01
Many of the mathematical frameworks describing natural selection are equivalent to Bayes' Theorem, also known as Bayesian updating. By definition, a process of Bayesian Inference is one which involves a Bayesian update, so we may conclude that these frameworks describe natural selection as a process of Bayesian inference. Thus natural selection serves as a counter example to a widely-held interpretation that restricts Bayesian Inference to human mental processes (including the endeavors of statisticians). As Bayesian inference can always be cast in terms of (variational) free energy minimization, natural selection can be viewed as comprising two components: a generative model of an "experiment" in the external world environment, and the results of that "experiment" or the "surprise" entailed by predicted and actual outcomes of the "experiment". Minimization of free energy implies that the implicit measure of "surprise" experienced serves to update the generative model in a Bayesian manner. This description clo...
Variational Bayesian Inference of Line Spectra
DEFF Research Database (Denmark)
Badiu, Mihai Alin; Hansen, Thomas Lundgaard; Fleury, Bernard Henri
2017-01-01
In this paper, we address the fundamental problem of line spectral estimation in a Bayesian framework. We target model order and parameter estimation via variational inference in a probabilistic model in which the frequencies are continuous-valued, i.e., not restricted to a grid ...) of the frequencies and computing expectations over them. Thus, we additionally capture and operate with the uncertainty of the frequency estimates. Aiming to maximize the model evidence, variational optimization provides analytic approximations of the posterior pdfs and also gives estimates of the additional ... just point estimates, significantly improves the performance. The performance of VALSE is superior to that of state-of-the-art methods and closely approaches the Cramér-Rao bound computed for the true model order.
Hierarchical Bayesian inference in the visual cortex
Lee, Tai Sing; Mumford, David
2003-07-01
Traditional views of visual processing suggest that early visual neurons in areas V1 and V2 are static spatiotemporal filters that extract local features from a visual scene. The extracted information is then channeled through a feedforward chain of modules in successively higher visual areas for further analysis. Recent electrophysiological recordings from early visual neurons in awake behaving monkeys reveal that there are many levels of complexity in the information processing of the early visual cortex, as seen in the long-latency responses of its neurons. These new findings suggest that activity in the early visual cortex is tightly coupled and highly interactive with the rest of the visual system. They lead us to propose a new theoretical setting based on the mathematical framework of hierarchical Bayesian inference for reasoning about the visual system. In this framework, the recurrent feedforward/feedback loops in the cortex serve to integrate top-down contextual priors and bottom-up observations so as to implement concurrent probabilistic inference along the visual hierarchy. We suggest that the algorithms of particle filtering and Bayesian-belief propagation might model these interactive cortical computations. We review some recent neurophysiological evidence that supports the plausibility of these ideas.
Porter, Edward K
2014-01-01
With the advance in computational resources, Bayesian inference is increasingly becoming the standard tool of practice in GW astronomy. However, algorithms such as Markov Chain Monte Carlo (MCMC) require a large number of iterations to guarantee convergence to the target density. Each chain demands a large number of evaluations of the likelihood function, and in the case of a Hessian MCMC, calculations of the Fisher information matrix for use as a proposal distribution. As each iteration requires the generation of at least one gravitational waveform, we very quickly reach a point of exclusion for current Bayesian algorithms, especially for low mass systems where the length of the waveforms is large and the waveform generation time is on the order of seconds. This suddenly demands a timescale of many weeks for a single MCMC. As each likelihood and Fisher information matrix calculation requires the evaluation of noise-weighted scalar products, we demonstrate that by using the linearity of integration, and the f...
A Fast Iterative Bayesian Inference Algorithm for Sparse Channel Estimation
DEFF Research Database (Denmark)
Pedersen, Niels Lovmand; Manchón, Carles Navarro; Fleury, Bernard Henri
2013-01-01
representation of the Bessel K probability density function; a highly efficient, fast iterative Bayesian inference method is then applied to the proposed model. The resulting estimator outperforms other state-of-the-art Bayesian and non-Bayesian estimators, either by yielding lower mean squared estimation error...
Bayesian electron density inference from JET lithium beam emission spectra using Gaussian processes
Kwak, Sehyun; Brix, M; Ghim, Y -c
2016-01-01
A Bayesian model to infer edge electron density profiles is developed for the JET lithium beam emission spectroscopy system, measuring Li I line radiation using 26 channels with ~1 cm spatial resolution and 10~20 ms temporal resolution. The density profile is modelled using a Gaussian process prior, and the uncertainty of the density profile is calculated by a Markov Chain Monte Carlo (MCMC) scheme. From the spectra measured by the transmission grating spectrometer, the Li line intensities are extracted, and modelled as a function of the plasma density by a multi-state model which describes the relevant processes between neutral lithium beam atoms and plasma particles. The spectral model fully takes into account interference filter and instrument effects, that are separately estimated, again using Gaussian processes. The line intensities are inferred based on a spectral model consistent with the measured spectra within their uncertainties, which includes photon statistics and electronic noise. Our newly devel...
Quantum-Like Representation of Non-Bayesian Inference
Asano, M.; Basieva, I.; Khrennikov, A.; Ohya, M.; Tanaka, Y.
2013-01-01
This research is related to the problem of "irrational decision making or inference" that has been discussed in cognitive psychology. There are some experimental studies whose statistical data cannot be described by classical probability theory. The process of decision making that generates these data cannot be reduced to classical Bayesian inference. For this problem, a number of quantum-like cognitive models of decision making have been proposed. Our previous work represented classical Bayesian inference in a natural way in the framework of quantum mechanics. Using this representation, in this paper we discuss non-Bayesian (irrational) inference that is biased by effects like quantum interference. Further, we describe the "psychological factor" disturbing "rationality" as an "environment" correlating with the "main system" of usual Bayesian inference.
Universal Darwinism As a Process of Bayesian Inference.
Campbell, John O
2016-01-01
Many of the mathematical frameworks describing natural selection are equivalent to Bayes' Theorem, also known as Bayesian updating. By definition, a process of Bayesian Inference is one which involves a Bayesian update, so we may conclude that these frameworks describe natural selection as a process of Bayesian inference. Thus, natural selection serves as a counter example to a widely-held interpretation that restricts Bayesian Inference to human mental processes (including the endeavors of statisticians). As Bayesian inference can always be cast in terms of (variational) free energy minimization, natural selection can be viewed as comprising two components: a generative model of an "experiment" in the external world environment, and the results of that "experiment" or the "surprise" entailed by predicted and actual outcomes of the "experiment." Minimization of free energy implies that the implicit measure of "surprise" experienced serves to update the generative model in a Bayesian manner. This description closely accords with the mechanisms of generalized Darwinian process proposed both by Dawkins, in terms of replicators and vehicles, and Campbell, in terms of inferential systems. Bayesian inference is an algorithm for the accumulation of evidence-based knowledge. This algorithm is now seen to operate over a wide range of evolutionary processes, including natural selection, the evolution of mental models and cultural evolutionary processes, notably including science itself. The variational principle of free energy minimization may thus serve as a unifying mathematical framework for universal Darwinism, the study of evolutionary processes operating throughout nature.
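The equivalence claimed in this abstract can be illustrated with a small sketch. It assumes the discrete replicator equation as the model of selection, with type frequencies playing the role of the prior and fitness values playing the role of the likelihood; the function names and the numbers are illustrative and are not taken from the paper.

```python
def bayes_update(prior, likelihood):
    """Posterior proportional to prior times likelihood, then normalized."""
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def replicator_step(freqs, fitness):
    """Discrete replicator equation: p_i' = p_i * f_i / (mean fitness)."""
    mean_fitness = sum(p * f for p, f in zip(freqs, fitness))
    return [p * f / mean_fitness for p, f in zip(freqs, fitness)]


# Hypothetical population of three types (illustrative numbers only).
freqs = [0.5, 0.3, 0.2]    # current frequencies, read as a prior
fitness = [1.0, 2.0, 4.0]  # relative fitness, read as a likelihood

posterior = bayes_update(freqs, fitness)
next_gen = replicator_step(freqs, fitness)
```

Under these assumptions the two functions compute the same quantity: the mean fitness is exactly the normalizing constant (the evidence) of the Bayesian update.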
Universal Darwinism as a process of Bayesian inference
Directory of Open Access Journals (Sweden)
John Oberon Campbell
2016-06-01
Many of the mathematical frameworks describing natural selection are equivalent to Bayes' Theorem, also known as Bayesian updating. By definition, a process of Bayesian Inference is one which involves a Bayesian update, so we may conclude that these frameworks describe natural selection as a process of Bayesian inference. Thus natural selection serves as a counter example to a widely-held interpretation that restricts Bayesian Inference to human mental processes (including the endeavors of statisticians). As Bayesian inference can always be cast in terms of (variational) free energy minimization, natural selection can be viewed as comprising two components: a generative model of an 'experiment' in the external world environment, and the results of that 'experiment' or the 'surprise' entailed by predicted and actual outcomes of the 'experiment'. Minimization of free energy implies that the implicit measure of 'surprise' experienced serves to update the generative model in a Bayesian manner. This description closely accords with the mechanisms of generalized Darwinian process proposed both by Dawkins, in terms of replicators and vehicles, and Campbell, in terms of inferential systems. Bayesian inference is an algorithm for the accumulation of evidence-based knowledge. This algorithm is now seen to operate over a wide range of evolutionary processes, including natural selection, the evolution of mental models and cultural evolutionary processes, notably including science itself. The variational principle of free energy minimization may thus serve as a unifying mathematical framework for universal Darwinism, the study of evolutionary processes operating throughout nature.
Numerical approximations for speeding up MCMC inference in the infinite relational model
DEFF Research Database (Denmark)
Schmidt, Mikkel Nørgaard; Albers, Kristoffer Jon
2015-01-01
The infinite relational model (IRM) is a powerful model for discovering clusters in complex networks; however, the computational speed of Markov chain Monte Carlo inference in the model can be a limiting factor when analyzing large networks. We investigate how using numerical approximations of th...
Bayesian Information Criterion as an Alternative way of Statistical Inference
Directory of Open Access Journals (Sweden)
Nadejda Yu. Gubanova
2012-05-01
The article treats the Bayesian information criterion (BIC) as an alternative to traditional methods of statistical inference based on null hypothesis significance testing (NHST). A comparison of ANOVA and BIC results for a psychological experiment is discussed.
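As a rough illustration of how BIC can stand in for a one-way ANOVA, the sketch below compares a single-mean model with a separate-means model using the standard definition BIC = k ln n - 2 ln L, evaluated for Gaussian errors at the maximum-likelihood variance. The data are invented for illustration and the helper names are assumptions; nothing here reproduces the article's experiment.

```python
import math


def gaussian_bic(rss, n, k):
    """BIC = k*ln(n) - 2*ln(L_hat) for a Gaussian model with MLE variance rss/n."""
    sigma2 = rss / n
    neg2_loglik = n * (math.log(2.0 * math.pi * sigma2) + 1.0)
    return k * math.log(n) + neg2_loglik


def rss_about_mean(values):
    """Residual sum of squares around the sample mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


# Made-up scores for three experimental conditions (illustrative only).
groups = [
    [4.1, 3.9, 4.3, 4.0],
    [5.2, 5.0, 5.4, 5.1],
    [6.1, 5.8, 6.0, 6.2],
]
pooled = [v for g in groups for v in g]
n = len(pooled)

# Null model: one common mean plus a variance (k = 2 parameters).
bic_null = gaussian_bic(rss_about_mean(pooled), n, k=2)
# Alternative: one mean per group plus a shared variance (k = 4 parameters).
bic_groups = gaussian_bic(sum(rss_about_mean(g) for g in groups), n, k=len(groups) + 1)

delta = bic_null - bic_groups  # positive values favor the group-means model
```

The model with the lower BIC is preferred; unlike a p-value, the difference `delta` can also express support for the null model when it is negative.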
Bayesian multimodel inference for dose-response studies
Link, W.A.; Albers, P.H.
2007-01-01
Statistical inference in dose-response studies is model-based: the analyst posits a mathematical model of the relation between exposure and response, estimates parameters of the model, and reports conclusions conditional on the model. Such analyses rarely include any accounting for the uncertainties associated with model selection. The Bayesian inferential system provides a convenient framework for model selection and multimodel inference. In this paper we briefly describe the Bayesian paradigm and Bayesian multimodel inference. We then present a family of models for multinomial dose-response data and apply Bayesian multimodel inferential methods to the analysis of data on the reproductive success of American kestrels (Falco sparverius) exposed to various sublethal dietary concentrations of methylmercury.
Sraj, Ihab; Zedler, Sarah E.; Knio, Omar M.; Jackson, Charles S.; Hoteit, Ibrahim
2016-12-01
The authors present a Polynomial Chaos (PC)-based Bayesian inference method for quantifying the uncertainties of the K-Profile Parametrization (KPP) within the MIT General Circulation Model (MITgcm) of the tropical Pacific. The inference of the uncertain parameters is based on a Markov Chain Monte Carlo (MCMC) scheme that utilizes a newly formulated test statistic taking into account the different components representing the structures of turbulent mixing on both daily and seasonal timescales in addition to the data quality, and filters for the effects of parameter perturbations over those due to changes in the wind. To avoid the prohibitive computational cost of integrating the MITgcm model at each MCMC iteration, we build a surrogate model for the test statistic using the PC method. To filter out the noise in the model predictions and avoid related convergence issues, we resort to a Basis-Pursuit-DeNoising (BPDN) compressed sensing approach to determine the PC coefficients of a representative surrogate model. The PC surrogate is then used to evaluate the test statistic in the MCMC step for sampling the posterior of the uncertain parameters. Results of the posteriors indicate good agreement with the default values for two parameters of the KPP model, namely the critical bulk and gradient Richardson numbers, while the posteriors of the remaining parameters were barely informative.
Sraj, Ihab
2016-08-26
The authors present a polynomial chaos (PC)-based Bayesian inference method for quantifying the uncertainties of the K-profile parameterization (KPP) within the MIT general circulation model (MITgcm) of the tropical Pacific. The inference of the uncertain parameters is based on a Markov chain Monte Carlo (MCMC) scheme that utilizes a newly formulated test statistic taking into account the different components representing the structures of turbulent mixing on both daily and seasonal time scales in addition to the data quality, and filters for the effects of parameter perturbations over those as a result of changes in the wind. To avoid the prohibitive computational cost of integrating the MITgcm model at each MCMC iteration, a surrogate model for the test statistic using the PC method is built. Because of the noise in the model predictions, a basis-pursuit-denoising (BPDN) compressed sensing approach is employed to determine the PC coefficients of a representative surrogate model. The PC surrogate is then used to evaluate the test statistic in the MCMC step for sampling the posterior of the uncertain parameters. Results of the posteriors indicate good agreement with the default values for two parameters of the KPP model, namely the critical bulk and gradient Richardson numbers; while the posteriors of the remaining parameters were barely informative. © 2016 American Meteorological Society.
Bayesian Inference and Online Learning in Poisson Neuronal Networks.
Huang, Yanping; Rao, Rajesh P N
2016-08-01
Motivated by the growing evidence for Bayesian computation in the brain, we show how a two-layer recurrent network of Poisson neurons can perform both approximate Bayesian inference and learning for any hidden Markov model. The lower-layer sensory neurons receive noisy measurements of hidden world states. The higher-layer neurons infer a posterior distribution over world states via Bayesian inference from inputs generated by sensory neurons. We demonstrate how such a neuronal network with synaptic plasticity can implement a form of Bayesian inference similar to Monte Carlo methods such as particle filtering. Each spike in a higher-layer neuron represents a sample of a particular hidden world state. The spiking activity across the neural population approximates the posterior distribution over hidden states. In this model, variability in spiking is regarded not as a nuisance but as an integral feature that provides the variability necessary for sampling during inference. We demonstrate how the network can learn the likelihood model, as well as the transition probabilities underlying the dynamics, using a Hebbian learning rule. We present results illustrating the ability of the network to perform inference and learning for arbitrary hidden Markov models.
Bayesian Networks: Aspects of Approximate Inference
Bolt, J.H.
2008-01-01
A Bayesian network can be used to model concisely the probabilistic knowledge with respect to a given problem domain. Such a network consists of an acyclic directed graph in which the nodes represent stochastic variables, supplemented with probabilities indicating the strength of the influences betw
Nonparametric Bayesian inference of the microcanonical stochastic block model
Peixoto, Tiago P
2016-01-01
A principled approach to characterize the hidden modular structure of networks is to formulate generative models, and then infer their parameters from data. When the desired structure is composed of modules or "communities", a suitable choice for this task is the stochastic block model (SBM), where nodes are divided into groups, and the placement of edges is conditioned on the group memberships. Here, we present a nonparametric Bayesian method to infer the modular structure of empirical networks, including the number of modules and their hierarchical organization. We focus on a microcanonical variant of the SBM, where the structure is imposed via hard constraints. We show how this simple model variation allows simultaneously for two important improvements over more traditional inference approaches: 1. Deeper Bayesian hierarchies, with noninformative priors replaced by sequences of priors and hyperpriors, that not only remove limitations that seriously degrade the inference on large networks, but also reveal s...
Variations on Bayesian Prediction and Inference
2016-05-09
"Variations on Bayesian prediction and inference" Ryan Martin Department of Mathematics, Statistics, and Computer Science University of Illinois at Chicago...using statistical ideas/methods. We recently learned that this new project will be supported, in part, by the National Science Foundation. 2.2 Problem 2...41. Kalli, M., Griffin, J. E., Walker, S. G. (2011). Slice sampling mixture models. Statistics and Computing 21, 93–105. Koenker, R. (2005). Quantile
Event Source Identification of Water Pollution Based on Bayesian-MCMC
Institute of Scientific and Technical Information of China (English)
陈海洋; 滕彦国; 王金生; 宋柳霆; 周振瑶
2012-01-01
For the ill-posed inverse problem in environmental hydraulics, an inversion model was constructed based on Bayesian inference and a two-dimensional water quality model. Markov chain Monte Carlo simulation was applied to obtain the posterior probability distributions of the source's position, intensity, and initial release time. The case study shows that Bayesian inference with Markov chain Monte Carlo sampling is well suited to inverse problems such as contamination source identification, with high accuracy and small error. Compared with the identification results of a hybrid genetic algorithm and pattern search method, the presented approach showed higher stability and robustness on the same inverse problem.
Fast Bayesian inference of optical trap stiffness and particle diffusion
Bera, Sudipta; Paul, Shuvojit; Singh, Rajesh; Ghosh, Dipanjan; Kundu, Avijit; Banerjee, Ayan; Adhikari, R.
2017-01-01
Bayesian inference provides a principled way of estimating the parameters of a stochastic process that is observed discretely in time. The overdamped Brownian motion of a particle confined in an optical trap is generally modelled by the Ornstein-Uhlenbeck process and can be observed directly in experiment. Here we present Bayesian methods for inferring the parameters of this process, the trap stiffness and the particle diffusion coefficient, that use exact likelihoods and sufficient statistics to arrive at simple expressions for the maximum a posteriori estimates. This obviates the need for Monte Carlo sampling and yields methods that are both fast and accurate. We apply these to experimental data and demonstrate their advantage over commonly used non-Bayesian fitting methods.
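A minimal sketch of this kind of sampling-free estimate follows. It assumes the Ornstein-Uhlenbeck model dx = -λ x dt + sqrt(2D) dW with relaxation rate λ (trap stiffness divided by friction) and diffusion coefficient D, flat priors, and a likelihood conditioned on the first observation, so that the maximum a posteriori estimate coincides with the maximum-likelihood one. The closed-form expressions below are the textbook ones for the exact discretization and are not necessarily those derived by the authors; all names and numerical values are illustrative.

```python
import math
import random


def simulate_ou(lam, diff, dt, n_steps, seed=0):
    """Sample an OU path using the exact one-step transition density."""
    rng = random.Random(seed)
    a = math.exp(-lam * dt)                          # one-step autocorrelation
    step_sd = math.sqrt(diff / lam * (1.0 - a * a))  # std of the one-step noise
    x = rng.gauss(0.0, math.sqrt(diff / lam))        # start in the stationary state
    path = [x]
    for _ in range(n_steps):
        x = a * x + rng.gauss(0.0, step_sd)
        path.append(x)
    return path


def estimate_ou(path, dt):
    """Closed-form estimates of (lam, diff) from three sufficient statistics."""
    s_xx = sum(x * x for x in path[:-1])
    s_xy = sum(x * y for x, y in zip(path[:-1], path[1:]))
    s_yy = sum(y * y for y in path[1:])
    n = len(path) - 1
    a_hat = s_xy / s_xx                      # estimate of exp(-lam * dt)
    var_hat = (s_yy - a_hat * s_xy) / n      # estimate of the one-step noise variance
    lam_hat = -math.log(a_hat) / dt
    diff_hat = lam_hat * var_hat / (1.0 - a_hat * a_hat)
    return lam_hat, diff_hat


# Synthetic data with known parameters (illustrative values only).
path = simulate_ou(lam=5.0, diff=0.5, dt=0.01, n_steps=50000, seed=1)
lam_hat, diff_hat = estimate_ou(path, dt=0.01)
```

Because the estimates depend on the data only through three sums, they can be computed in a single pass over the trajectory, which is what makes Monte Carlo sampling unnecessary in this setting.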
Fast Bayesian inference of optical trap stiffness and particle diffusion
Bera, Sudipta; Singh, Rajesh; Ghosh, Dipanjan; Kundu, Avijit; Banerjee, Ayan; Adhikari, R
2016-01-01
Bayesian inference provides a principled way of estimating the parameters of a stochastic process that is observed discretely in time. The overdamped Brownian motion of a particle confined in an optical trap is generally modelled by the Ornstein-Uhlenbeck process and can be observed directly in experiment. Here we present Bayesian methods for inferring the parameters of this process, the trap stiffness and the particle diffusion coefficient, that use exact likelihoods and sufficient statistics to arrive at simple expressions for the maximum a posteriori estimates. This obviates the need for Monte Carlo sampling and yields methods that are both fast and accurate. We apply these to experimental data and demonstrate their advantage over commonly used non-Bayesian fitting methods.
Bayesian Inference in Monte-Carlo Tree Search
Tesauro, Gerald; Segal, Richard
2012-01-01
Monte-Carlo Tree Search (MCTS) methods are drawing great interest after yielding breakthrough results in computer Go. This paper proposes a Bayesian approach to MCTS that is inspired by distribution-free approaches such as UCT [13], yet significantly differs in important respects. The Bayesian framework allows potentially much more accurate (Bayes-optimal) estimation of node values and node uncertainties from a limited number of simulation trials. We further propose propagating inference in the tree via fast analytic Gaussian approximation methods: this can make the overhead of Bayesian inference manageable in domains such as Go, while preserving high accuracy of expected-value estimates. We find substantial empirical outperformance of UCT in an idealized bandit-tree test environment, where we can obtain valuable insights by comparing with known ground truth. Additionally we rigorously prove on-policy and off-policy convergence of the proposed methods.
Stan: A Probabilistic Programming Language for Bayesian Inference and Optimization
Gelman, Andrew; Lee, Daniel; Guo, Jiqiang
2015-01-01
Stan is a free and open-source C++ program that performs Bayesian inference or optimization for arbitrary user-specified models and can be called from the command line, R, Python, Matlab, or Julia and has great promise for fitting large and complex statistical models in many areas of application. We discuss Stan from users' and developers'…
Bayesian inference of structural brain networks.
Hinne, Max; Heskes, Tom; Beckmann, Christian F; van Gerven, Marcel A J
2013-02-01
Structural brain networks are used to model white-matter connectivity between spatially segregated brain regions. The presence, location and orientation of these white matter tracts can be derived using diffusion-weighted magnetic resonance imaging in combination with probabilistic tractography. Unfortunately, as of yet, none of the existing approaches provide an undisputed way of inferring brain networks from the streamline distributions which tractography produces. State-of-the-art methods rely on an arbitrary threshold or, alternatively, yield weighted results that are difficult to interpret. In this paper, we provide a generative model that explicitly describes how structural brain networks lead to observed streamline distributions. This allows us to draw principled conclusions about brain networks, which we validate using simultaneously acquired resting-state functional MRI data. Inference may be further informed by means of a prior which combines connectivity estimates from multiple subjects. Based on this prior, we obtain networks that significantly improve on the conventional approach.
Bayesian inference of chemical kinetic models from proposed reactions
Galagali, Nikhil
2015-02-01
© 2014 Elsevier Ltd. Bayesian inference provides a natural framework for combining experimental data with prior knowledge to develop chemical kinetic models and quantify the associated uncertainties, not only in parameter values but also in model structure. Most existing applications of Bayesian model selection methods to chemical kinetics have been limited to comparisons among a small set of models, however. The significant computational cost of evaluating posterior model probabilities renders traditional Bayesian methods infeasible when the model space becomes large. We present a new framework for tractable Bayesian model inference and uncertainty quantification using a large number of systematically generated model hypotheses. The approach involves imposing point-mass mixture priors over rate constants and exploring the resulting posterior distribution using an adaptive Markov chain Monte Carlo method. The posterior samples are used to identify plausible models, to quantify rate constant uncertainties, and to extract key diagnostic information about model structure, such as the reactions and operating pathways most strongly supported by the data. We provide numerical demonstrations of the proposed framework by inferring kinetic models for catalytic steam and dry reforming of methane using available experimental data.
Bayesian Inference in the Modern Design of Experiments
DeLoach, Richard
2008-01-01
This paper provides an elementary tutorial overview of Bayesian inference and its potential for application in aerospace experimentation in general and wind tunnel testing in particular. Bayes Theorem is reviewed and examples are provided to illustrate how it can be applied to objectively revise prior knowledge by incorporating insights subsequently obtained from additional observations, resulting in new (posterior) knowledge that combines information from both sources. A logical merger of Bayesian methods and certain aspects of Response Surface Modeling is explored. Specific applications to wind tunnel testing, computational code validation, and instrumentation calibration are discussed.
Energy Technology Data Exchange (ETDEWEB)
Zhang, Guannan [ORNL; Webster, Clayton G [ORNL; Gunzburger, Max D [ORNL
2012-09-01
Although Bayesian analysis has become vital to the quantification of prediction uncertainty in groundwater modeling, its application has been hindered due to the computational cost associated with numerous model executions needed for exploring the posterior probability density function (PPDF) of model parameters. This is particularly the case when the PPDF is estimated using Markov Chain Monte Carlo (MCMC) sampling. In this study, we develop a new approach that improves computational efficiency of Bayesian inference by constructing a surrogate system based on an adaptive sparse-grid high-order stochastic collocation (aSG-hSC) method. Unlike previous works using first-order hierarchical basis, we utilize a compactly supported higher-order hierarchical basis to construct the surrogate system, resulting in a significant reduction in the number of computational simulations required. In addition, we use hierarchical surplus as an error indicator to determine adaptive sparse grids. This allows local refinement in the uncertain domain and/or anisotropic detection with respect to the random model parameters, which further improves computational efficiency. Finally, we incorporate a global optimization technique and propose an iterative algorithm for building the surrogate system for the PPDF with multiple significant modes. Once the surrogate system is determined, the PPDF can be evaluated by sampling the surrogate system directly with very little computational cost. The developed method is evaluated first using a simple analytical density function with multiple modes and then using two synthetic groundwater reactive transport models. The groundwater models represent different levels of complexity; the first example involves coupled linear reactions and the second example simulates nonlinear uranium surface complexation. The results show that the aSG-hSC is an effective and efficient tool for Bayesian inference in groundwater modeling in comparison with conventional
Bayesian Inference of Giant Exoplanet Physics
Thorngren, Daniel; Fortney, Jonathan J.
2017-01-01
The physical processes within a giant planet directly set its observed radius for a given mass, age, and insolation. The important aspects are the planet’s bulk composition and its interior thermal evolution. By studying many giant planets as an ensemble, we can gain insight into this physics. We demonstrate two novel examples here. We examine 50 cooler transiting giant planets, whose insolation is sufficiently low (T_eff < 1000 K) that they are not affected by the hot Jupiter radius inflation effect. For these planets, the thermal evolution is relatively well understood, and we show that the bulk planet metallicity increases with the total planet mass, which directly impacts plans for future atmospheric studies. We also examine the relation with stellar metallicity and discuss how these relations place new constraints on the core accretion model of planet formation. Our newest work seeks to quantify the flow of energy into hot Jupiters needed to explain their enlarged radii, in addition to their bulk composition. Because the former is related to stellar insolation and the latter is related to mass, we are able to create a hierarchical Bayesian model to disentangle the two effects in our sample of ~300 transiting giant planets. Our results show conclusively that the inflation power is not a simple fraction of stellar insolation: instead, the power increases with incident flux at a much higher rate. We use these results to test published models of giant planet inflation and to provide accurate empirical mass-radius relations for giant planets.
Dimension-independent likelihood-informed MCMC
Cui, Tiangang
2015-10-08
Many Bayesian inference problems require exploring the posterior distribution of high-dimensional parameters that represent the discretization of an underlying function. This work introduces a family of Markov chain Monte Carlo (MCMC) samplers that can adapt to the particular structure of a posterior distribution over functions. Two distinct lines of research intersect in the methods developed here. First, we introduce a general class of operator-weighted proposal distributions that are well defined on function space, such that the performance of the resulting MCMC samplers is independent of the discretization of the function. Second, by exploiting local Hessian information and any associated low-dimensional structure in the change from prior to posterior distributions, we develop an inhomogeneous discretization scheme for the Langevin stochastic differential equation that yields operator-weighted proposals adapted to the non-Gaussian structure of the posterior. The resulting dimension-independent and likelihood-informed (DILI) MCMC samplers may be useful for a large class of high-dimensional problems where the target probability measure has a density with respect to a Gaussian reference measure. Two nonlinear inverse problems are used to demonstrate the efficiency of these DILI samplers: an elliptic PDE coefficient inverse problem and path reconstruction in a conditioned diffusion.
Bayesian inference data evaluation and decisions
Harney, Hanns Ludwig
2016-01-01
This new edition offers a comprehensive introduction to the analysis of data using Bayes rule. It generalizes Gaussian error intervals to situations in which the data follow distributions other than Gaussian. This is particularly useful when the observed parameter is barely above the background or the histogram of multiparametric data contains many empty bins, so that the determination of the validity of a theory cannot be based on the chi-squared-criterion. In addition to the solutions of practical problems, this approach provides an epistemic insight: the logic of quantum mechanics is obtained as the logic of unbiased inference from counting data. New sections feature factorizing parameters, commuting parameters, observables in quantum mechanics, the art of fitting with coherent and with incoherent alternatives and fitting with multinomial distribution. Additional problems and examples help deepen the knowledge. Requiring no knowledge of quantum mechanics, the book is written on introductory level, with man...
Bayesian electron density inference from JET lithium beam emission spectra using Gaussian processes
Kwak, Sehyun; Svensson, J.; Brix, M.; Ghim, Y.-C.; Contributors, JET
2017-03-01
A Bayesian model to infer edge electron density profiles is developed for the JET lithium beam emission spectroscopy (Li-BES) system, measuring Li I (2p-2s) line radiation using 26 channels with ∼1 cm spatial resolution and 10∼20 ms temporal resolution. The density profile is modelled using a Gaussian process prior, and the uncertainty of the density profile is calculated by a Markov Chain Monte Carlo (MCMC) scheme. From the spectra measured by the transmission grating spectrometer, the Li I line intensities are extracted, and modelled as a function of the plasma density by a multi-state model which describes the relevant processes between neutral lithium beam atoms and plasma particles. The spectral model fully takes into account interference filter and instrument effects, that are separately estimated, again using Gaussian processes. The line intensities are inferred based on a spectral model consistent with the measured spectra within their uncertainties, which includes photon statistics and electronic noise. Our newly developed method to infer JET edge electron density profiles has the following advantages in comparison to the conventional method: (i) providing full posterior distributions of edge density profiles, including their associated uncertainties, (ii) the available radial range for density profiles is increased to the full observation range (∼26 cm), (iii) an assumption of monotonic electron density profile is not necessary, (iv) the absolute calibration factor of the diagnostic system is automatically estimated overcoming the limitation of the conventional technique and allowing us to infer the electron density profiles for all pulses without preprocessing the data or an additional boundary condition, and (v) since the full spectrum is modelled, the procedure of modulating the beam to measure the background signal is only necessary for the case of overlapping of the Li I line with impurity lines.
Bayesian inference from count data using discrete uniform priors.
Comoglio, Federico; Fracchia, Letizia; Rinaldi, Maurizio
2013-01-01
We consider a set of sample counts obtained by sampling arbitrary fractions of a finite volume containing a homogeneously dispersed population of identical objects. We report a Bayesian derivation of the posterior probability distribution of the population size using a binomial likelihood and non-conjugate, discrete uniform priors under sampling with or without replacement. Our derivation yields a computationally feasible formula that can prove useful in a variety of statistical problems involving absolute quantification under uncertainty. We implemented our algorithm in the R package dupiR and compared it with a previously proposed Bayesian method based on a Gamma prior. As a showcase, we demonstrate that our inference framework can be used to estimate bacterial survival curves from measurements characterized by extremely low or zero counts and rather high sampling fractions. All in all, we provide a versatile, general purpose algorithm to infer population sizes from count data, which can find application in a broad spectrum of biological and physical problems.
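The single-sample case of this setup can be sketched directly. The code assumes one count k observed in a sampled fraction r of the volume, a binomial likelihood C(N, k) r^k (1 - r)^(N - k), and a discrete uniform prior on the population size N over a user-chosen range. It does not reproduce the dupiR interface or the paper's formula for multiple samples or sampling without replacement; function names and numbers are illustrative.

```python
import math


def population_posterior(k, r, n_min, n_max):
    """Posterior over population size N under a discrete uniform prior.

    Returns a dict mapping each N in [max(n_min, k), n_max] to its
    posterior probability given count k in a sampled fraction r.
    """
    lower = max(n_min, k)  # N cannot be smaller than the observed count
    weights = {
        n: math.comb(n, k) * r ** k * (1.0 - r) ** (n - k)
        for n in range(lower, n_max + 1)
    }
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}


# Illustrative numbers: 4 objects counted in 15% of the volume.
posterior = population_posterior(k=4, r=0.15, n_min=0, n_max=300)
mode = max(posterior, key=posterior.get)
mean = sum(n * p for n, p in posterior.items())

# A zero count is still informative: mass concentrates on small N.
zero_posterior = population_posterior(k=0, r=0.15, n_min=0, n_max=300)
zero_mode = max(zero_posterior, key=zero_posterior.get)
```

The uniform prior makes the posterior proportional to the likelihood over the chosen range, so the upper bound `n_max` should be set high enough that the truncated tail carries negligible mass.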
Granger causality vs. dynamic Bayesian network inference: a comparative study
Directory of Open Access Journals (Sweden)
Feng Jianfeng
2009-04-01
Background: In computational biology, one often faces the problem of deriving the causal relationship among different elements such as genes, proteins, metabolites, neurons and so on, based upon multi-dimensional temporal data. Currently, there are two common approaches used to explore the network structure among elements. One is the Granger causality approach, and the other is the dynamic Bayesian network inference approach. Both have at least a few thousand publications reported in the literature. A key issue is to choose which approach is used to tackle the data, in particular when they give rise to contradictory results. Results: In this paper, we provide an answer by focusing on a systematic and computationally intensive comparison between the two approaches on both synthesized and experimental data. For synthesized data, a critical point of the data length is found: the dynamic Bayesian network outperforms the Granger causality approach when the data length is short, and vice versa. We then test our results in experimental data of short length, which is a common scenario in current biological experiments: it is again confirmed that the dynamic Bayesian network works better. Conclusion: When the data length is short, the dynamic Bayesian network inference performs better than the Granger causality approach; otherwise the Granger causality approach is better.
Inference of gene pathways using mixture Bayesian networks
Directory of Open Access Journals (Sweden)
Ko Younhee
2009-05-01
Full Text Available Abstract Background Inference of gene networks typically relies on measurements across a wide range of conditions or treatments. Although one network structure is predicted, the relationship between genes could vary across conditions. A comprehensive approach to infer general and condition-dependent gene networks was evaluated. This approach integrated Bayesian network and Gaussian mixture models to describe continuous microarray gene expression measurements, and three gene networks were predicted. Results The first reconstructions of a circadian rhythm pathway in honey bees and an adherens junction pathway in mouse embryos were obtained. In addition, general and condition-specific gene relationships, some unexpected, were detected in these two pathways and in a yeast cell-cycle pathway. The mixture Bayesian network approach identified all (honey bee circadian rhythm and mouse adherens junction pathways) or the vast majority (yeast cell-cycle pathway) of the gene relationships reported in empirical studies. Findings across the three pathways and data sets indicate that the mixture Bayesian network approach is well-suited to infer gene pathways based on microarray data. Furthermore, the interpretation of model estimates provided a broader understanding of the relationships between genes. The mixture models offered a comprehensive description of the relationships among genes in complex biological processes or across a wide range of conditions. The mixture parameter estimates and corresponding odds that the gene network inferred for a sample pertained to each mixture component allowed the uncovering of both general and condition-dependent gene relationships and patterns of expression. Conclusion This study demonstrated the two main benefits of learning gene pathways using mixture Bayesian networks. First, the identification of the optimal number of mixture components supported by the data offered a robust approach to infer gene relationships and
Halo detection via large-scale Bayesian inference
Merson, Alexander I.; Jasche, Jens; Abdalla, Filipe B.; Lahav, Ofer; Wandelt, Benjamin; Jones, D. Heath; Colless, Matthew
2016-08-01
We present a proof-of-concept of a novel and fully Bayesian methodology designed to detect haloes of different masses in cosmological observations subject to noise and systematic uncertainties. Our methodology combines the previously published Bayesian large-scale structure inference algorithm, HAmiltonian Density Estimation and Sampling algorithm (HADES), and a Bayesian chain rule (the Blackwell-Rao estimator), which we use to connect the inferred density field to the properties of dark matter haloes. To demonstrate the capability of our approach, we construct a realistic galaxy mock catalogue emulating the wide-area 6-degree Field Galaxy Survey, which has a median redshift of approximately 0.05. Application of HADES to the catalogue provides us with accurately inferred three-dimensional density fields and corresponding quantification of uncertainties inherent to any cosmological observation. We then use a cosmological simulation to relate the amplitude of the density field to the probability of detecting a halo with mass above a specified threshold. With this information, we can sum over the HADES density field realisations to construct maps of detection probabilities and demonstrate the validity of this approach within our mock scenario. We find that the probability of successful detection of haloes in the mock catalogue increases as a function of the signal to noise of the local galaxy observations. Our proposed methodology can easily be extended to account for more complex scientific questions and is a promising novel tool to analyse the cosmic large-scale structure in observations.
Bayesian inference in the numerical solution of Laplace's equation
Mendes, Fábio Macêdo; da Costa Júnior, Edson Alves
2012-05-01
Inference is not unrelated to numerical analysis: given partial information about a mathematical problem, one has to estimate the unknown "true solution" and uncertainties. Many methods of interpolation (least squares, Kriging, Tikhonov regularization, etc.) also have a probabilistic interpretation. O'Hagan showed that quadratures can also be constructed explicitly as a form of Bayesian inference (O'Hagan, A., BAYESIAN STATISTICS (1992) 4, pp. 345-363). In his framework, the integrand is modeled as a Gaussian process. It is then possible to build a reliable estimate for the value of the integral by conditioning the stochastic process on the known values of the integrand at a finite set of points. The present work applies a similar method to the problem of solving Laplace's equation inside a closed boundary. First, one needs a Gaussian process that yields arbitrary harmonic functions. Secondly, the boundaries (Dirichlet or Neumann conditions) are used to update these probabilities and to estimate the solution in the whole domain. This procedure is similar to the widely used Boundary Element Method, but differs from it in the treatment of the boundaries. The language of Bayesian inference gives more flexibility on how the boundary conditions and conservation laws can be handled. This flexibility can be used to attain greater accuracy using a coarser discretization of the boundary and can open doors to more efficient implementations.
Andrade, Daniel
2012-01-01
We present a new method to propagate lower bounds on conditional probability distributions in conventional Bayesian networks. Our method guarantees to provide outer approximations of the exact lower bounds. A key advantage is that we can use any available algorithms and tools for Bayesian networks in order to represent and infer lower bounds. This new method yields results that are provably exact for trees with binary variables, and results which are competitive with existing approximations in credal networks for all other network structures. Our method is not limited to a specific kind of network structure. In principle, it is also not restricted to a specific kind of inference, but we restrict our analysis to prognostic inference in this article. The computational complexity is superior to that of other existing approaches.
MCMC for non-linear state space models using ensembles of latent sequences
2013-01-01
Non-linear state space models are a widely-used class of models for biological, economic, and physical processes. Fitting these models to observed data is a difficult inference problem that has no straightforward solution. We take a Bayesian approach to the inference of unknown parameters of a non-linear state model; this, in turn, requires the availability of efficient Markov Chain Monte Carlo (MCMC) sampling methods for the latent (hidden) variables and model parameters. Using the ensemble ...
Simulation based bayesian econometric inference: principles and some recent computational advances.
L.F. Hoogerheide (Lennart); H.K. van Dijk (Herman); R.D. van Oest (Rutger)
2007-01-01
In this paper we discuss several aspects of simulation based Bayesian econometric inference. We start at an elementary level on basic concepts of Bayesian analysis; evaluating integrals by simulation methods is a crucial ingredient in Bayesian inference. Next, the most popular and well-
DEFF Research Database (Denmark)
Møller, Jesper
.1 with the title ‘Inference'.) This contribution concerns statistical inference for parametric models used in stochastic geometry and based on quick and simple simulation free procedures as well as more comprehensive methods using Markov chain Monte Carlo (MCMC) simulations. Due to space limitations the focus...
A Full Bayesian Approach for Boolean Genetic Network Inference
Han, Shengtong; Wong, Raymond K. W.; Lee, Thomas C. M.; Shen, Linghao; Li, Shuo-Yen R.; Fan, Xiaodan
2014-01-01
Boolean networks are a simple but efficient model for describing gene regulatory systems. A number of algorithms have been proposed to infer Boolean networks. However, these methods do not take full consideration of the effects of noise and model uncertainty. In this paper, we propose a full Bayesian approach to infer Boolean genetic networks. Markov chain Monte Carlo algorithms are used to obtain the posterior samples of both the network structure and the related parameters. In addition to regular link addition and removal moves, which can guarantee the irreducibility of the Markov chain for traversing the whole network space, carefully constructed mixture proposals are used to improve the Markov chain Monte Carlo convergence. Both simulations and a real application on cell-cycle data show that our method is more powerful than existing methods for the inference of both the topology and logic relations of the Boolean network from observed data. PMID:25551820
A full bayesian approach for boolean genetic network inference.
Directory of Open Access Journals (Sweden)
Shengtong Han
Full Text Available Boolean networks are a simple but efficient model for describing gene regulatory systems. A number of algorithms have been proposed to infer Boolean networks. However, these methods do not take full consideration of the effects of noise and model uncertainty. In this paper, we propose a full Bayesian approach to infer Boolean genetic networks. Markov chain Monte Carlo algorithms are used to obtain the posterior samples of both the network structure and the related parameters. In addition to regular link addition and removal moves, which can guarantee the irreducibility of the Markov chain for traversing the whole network space, carefully constructed mixture proposals are used to improve the Markov chain Monte Carlo convergence. Both simulations and a real application on cell-cycle data show that our method is more powerful than existing methods for the inference of both the topology and logic relations of the Boolean network from observed data.
Bayesian inference of synaptic quantal parameters from correlated vesicle release
Directory of Open Access Journals (Sweden)
Alexander D Bird
2016-11-01
Full Text Available Synaptic transmission is both history-dependent and stochastic, resulting in varying responses to presentations of the same presynaptic stimulus. This complicates attempts to infer synaptic parameters and has led to the proposal of a number of different strategies for their quantification. Recently Bayesian approaches have been applied to make more efficient use of the data collected in paired intracellular recordings. Methods have been developed that either provide a complete model of the distribution of amplitudes for isolated responses or approximate the amplitude distributions of a train of post-synaptic potentials, with correct short-term synaptic dynamics but neglecting correlations. In both cases the methods provided significantly improved inference of model parameters as compared to existing mean-variance fitting approaches. However, for synapses with high release probability, low vesicle number or relatively low restock rate and for data in which only one or few repeats of the same pattern are available, correlations between serial events can allow for the extraction of significantly more information from experiment: a more complete Bayesian approach would take this into account also. This has not been possible previously because of the technical difficulty in calculating the likelihood of amplitudes seen in correlated post-synaptic potential trains; however, recent theoretical advances have now rendered the likelihood calculation tractable for a broad class of synaptic dynamics models. Here we present a compact mathematical form for the likelihood in terms of a matrix product and demonstrate how marginals of the posterior provide information on covariance of parameter distributions. The associated computer code for Bayesian parameter inference for a variety of models of synaptic dynamics is provided in the supplementary material allowing for quantal and dynamical parameters to be readily inferred from experimental data sets.
Bayesian Inference of Synaptic Quantal Parameters from Correlated Vesicle Release
Bird, Alex D.; Wall, Mark J.; Richardson, Magnus J. E.
2016-01-01
Synaptic transmission is both history-dependent and stochastic, resulting in varying responses to presentations of the same presynaptic stimulus. This complicates attempts to infer synaptic parameters and has led to the proposal of a number of different strategies for their quantification. Recently Bayesian approaches have been applied to make more efficient use of the data collected in paired intracellular recordings. Methods have been developed that either provide a complete model of the distribution of amplitudes for isolated responses or approximate the amplitude distributions of a train of post-synaptic potentials, with correct short-term synaptic dynamics but neglecting correlations. In both cases the methods provided significantly improved inference of model parameters as compared to existing mean-variance fitting approaches. However, for synapses with high release probability, low vesicle number or relatively low restock rate and for data in which only one or few repeats of the same pattern are available, correlations between serial events can allow for the extraction of significantly more information from experiment: a more complete Bayesian approach would take this into account also. This has not been possible previously because of the technical difficulty in calculating the likelihood of amplitudes seen in correlated post-synaptic potential trains; however, recent theoretical advances have now rendered the likelihood calculation tractable for a broad class of synaptic dynamics models. Here we present a compact mathematical form for the likelihood in terms of a matrix product and demonstrate how marginals of the posterior provide information on covariance of parameter distributions. The associated computer code for Bayesian parameter inference for a variety of models of synaptic dynamics is provided in the Supplementary Material allowing for quantal and dynamical parameters to be readily inferred from experimental data sets. PMID:27932970
Jordan, Paul; Brunschwig, Hadassa; Luedin, Eric
2008-01-01
Bayesian mixed effects modeling is an appropriate method for estimating both population-specific and subject-specific times to steady state. In addition to pure estimation, the approach allows one to determine the time until a certain fraction of individuals of a population has reached steady state with a pre-specified certainty. In this paper a mixed effects model for the parameters of a nonlinear pharmacokinetic model is used within a Bayesian framework. Model fitting by means of Markov Chain Monte Carlo methods as implemented in the Gibbs sampler as well as the extraction of estimates and probability statements of interest are described. Finally, the proposed approach is illustrated by application to trough data from a multiple dose clinical trial.
Bayesian Inference for Radio Observations - Going beyond deconvolution
Lochner, Michelle; Kunz, Martin; Natarajan, Iniyan; Oozeer, Nadeem; Smirnov, Oleg; Zwart, Jon
2015-01-01
Radio interferometers suffer from the problem of missing information in their data, due to the gaps between the antennas. This results in artifacts, such as bright rings around sources, in the images obtained. Multiple deconvolution algorithms have been proposed to solve this problem and produce cleaner radio images. However, these algorithms are unable to correctly estimate uncertainties in derived scientific parameters or to always include the effects of instrumental errors. We propose an alternative technique called Bayesian Inference for Radio Observations (BIRO) which uses a Bayesian statistical framework to determine the scientific parameters and instrumental errors simultaneously directly from the raw data, without making an image. We use a simple simulation of Westerbork Synthesis Radio Telescope data including pointing errors and beam parameters as instrumental effects, to demonstrate the use of BIRO.
Nonparametric Bayesian inference of the microcanonical stochastic block model
Peixoto, Tiago P.
2017-01-01
A principled approach to characterize the hidden modular structure of networks is to formulate generative models and then infer their parameters from data. When the desired structure is composed of modules or "communities," a suitable choice for this task is the stochastic block model (SBM), where nodes are divided into groups, and the placement of edges is conditioned on the group memberships. Here, we present a nonparametric Bayesian method to infer the modular structure of empirical networks, including the number of modules and their hierarchical organization. We focus on a microcanonical variant of the SBM, where the structure is imposed via hard constraints, i.e., the generated networks are not allowed to violate the patterns imposed by the model. We show how this simple model variation allows simultaneously for two important improvements over more traditional inference approaches: (1) deeper Bayesian hierarchies, with noninformative priors replaced by sequences of priors and hyperpriors, which not only remove limitations that seriously degrade the inference on large networks but also reveal structures at multiple scales; (2) a very efficient inference algorithm that scales well not only for networks with a large number of nodes and edges but also with an unlimited number of modules. We show also how this approach can be used to sample modular hierarchies from the posterior distribution, as well as to perform model selection. We discuss and analyze the differences between sampling from the posterior and simply finding the single parameter estimate that maximizes it. Furthermore, we expose a direct equivalence between our microcanonical approach and alternative derivations based on the canonical SBM.
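To make the generative side of the stochastic block model concrete, the sketch below samples a small undirected network whose edge probabilities depend only on group membership. Note the assumptions: this is the traditional canonical (Bernoulli) SBM, not the microcanonical variant with hard constraints that the abstract focuses on, and the function name sample_sbm and the parameters p_in and p_out are hypothetical choices for illustration.

```python
import itertools
import random


def sample_sbm(groups, p_in, p_out, seed=0):
    """Sample an undirected graph from a two-parameter canonical SBM.

    groups[i] is the group label of node i. Each pair of nodes is linked
    independently with probability p_in if both nodes share a group,
    and with probability p_out otherwise.
    """
    rng = random.Random(seed)
    edges = []
    for i, j in itertools.combinations(range(len(groups)), 2):
        p = p_in if groups[i] == groups[j] else p_out
        if rng.random() < p:
            edges.append((i, j))
    return edges


# Two groups of 20 nodes each; dense inside groups, sparse between them.
groups = [0] * 20 + [1] * 20
edges = sample_sbm(groups, p_in=0.5, p_out=0.02)
within = sum(1 for i, j in edges if groups[i] == groups[j])
between = len(edges) - within
```

Inference runs in the opposite direction: given only the edges, one asks which assignment of nodes to groups (and how many groups) makes the observed network most plausible.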
Inference of Gene Regulatory Network Based on Local Bayesian Networks.
Liu, Fei; Zhang, Shao-Wu; Guo, Wei-Feng; Wei, Ze-Gang; Chen, Luonan
2016-08-01
The inference of gene regulatory networks (GRNs) from expression data can mine the direct regulations among genes and gain deep insights into biological processes at a network level. During past decades, numerous computational approaches have been introduced for inferring the GRNs. However, many of them still suffer from various problems, e.g., Bayesian network (BN) methods cannot handle large-scale networks due to their high computational complexity, while information theory-based methods cannot identify the directions of regulatory interactions and also suffer from false positive/negative problems. To overcome the limitations, in this work we present a novel algorithm, namely local Bayesian network (LBN), to infer GRNs from gene expression data by using the network decomposition strategy and false-positive edge elimination scheme. Specifically, the LBN algorithm first uses conditional mutual information (CMI) to construct an initial network or GRN, which is decomposed into a number of local networks or GRNs. Then, the BN method is employed to generate a series of local BNs by selecting the k-nearest neighbors of each gene as its candidate regulatory genes, which significantly reduces the exponential search space from all possible GRN structures. Integrating these local BNs forms a tentative network or GRN by performing CMI, which reduces redundant regulations in the GRN and thus alleviates the false positive problem. The final network or GRN can be obtained by iteratively performing CMI and local BN on the tentative network. In the iterative process, the false or redundant regulations are gradually removed. When tested on the benchmark GRN datasets from DREAM challenge as well as the SOS DNA repair network in E. coli, our results suggest that LBN outperforms other state-of-the-art methods (ARACNE, GENIE3 and NARROMI) significantly, with more accurate and robust performance. In particular, the decomposition strategy with local Bayesian networks not only effectively reduce
Applying Bayesian Inference to Galileon Solutions of the Muon Problem
Lamm, Henry
2016-01-01
We derive corrections to atomic energy levels from disformal couplings in Galileon theories. Through Bayesian inference, we constrain the cut-off radii and Galileon scale via these corrections. To connect different atomic systems, we assume the various cut-off radii related by a 1-parameter family of solutions. This introduces a new parameter $\alpha$ which is also constrained. In this model, we predict shifts to muonic helium of $\delta E_{He^3}=1.97^{+9.28}_{-1.87}$ meV and $\delta E_{He^4}=1.69^{+9.25}_{-1.61}$ meV.
Bayesian inference for inverse problems occurring in uncertainty analysis
Fu, Shuai; Celeux, Gilles; Bousquet, Nicolas; Couplet, Mathieu
2012-01-01
The inverse problem considered here is to estimate the distribution of a non-observed random variable $X$ from some noisy observed data $Y$ linked to $X$ through a time-consuming physical model $H$. Bayesian inference is considered to take into account prior expert knowledge on $X$ in a small sample size setting. A Metropolis-Hastings within Gibbs algorithm is proposed to compute the posterior distribution of the parameters of $X$ through a data augmentation process. Since calls to $H$ are qu...
Bayesian Inference for Structured Spike and Slab Priors
DEFF Research Database (Denmark)
Andersen, Michael Riis; Winther, Ole; Hansen, Lars Kai
2014-01-01
Sparse signal recovery addresses the problem of solving underdetermined linear inverse problems subject to a sparsity constraint. We propose a novel prior formulation, the structured spike and slab prior, which allows a priori knowledge of the sparsity pattern to be incorporated by imposing a spatial...... Gaussian process on the spike and slab probabilities. Thus, prior information on the structure of the sparsity pattern can be encoded using generic covariance functions. Furthermore, we provide a Bayesian inference scheme for the proposed model based on the expectation propagation framework. Using...
Inference-less Density Estimation using Copula Bayesian Networks
Elidan, Gal
2012-01-01
We consider learning continuous probabilistic graphical models in the face of missing data. For non-Gaussian models, learning the parameters and structure of such models depends on our ability to perform efficient inference, and can be prohibitive even for relatively modest domains. Recently, we introduced the Copula Bayesian Network (CBN) density model - a flexible framework that captures complex high-dimensional dependency structures while offering direct control over the univariate marginals, leading to improved generalization. In this work we show that the CBN model also offers significant computational advantages when training data is partially observed. Concretely, we leverage the specialized form of the model to derive a computationally amenable learning objective that is a lower bound on the log-likelihood function. Importantly, our energy-like bound circumvents the need for costly inference of an auxiliary distribution, thus facilitating practical learning of high-dimensional densities. We demonstr...
Sparse kernel learning with LASSO and Bayesian inference algorithm.
Gao, Junbin; Kwan, Paul W; Shi, Daming
2010-03-01
Kernelized LASSO (Least Absolute Shrinkage and Selection Operator) has been investigated in two separate recent papers [Gao, J., Antolovich, M., & Kwan, P. H. (2008). L1 LASSO and its Bayesian inference. In W. Wobcke, & M. Zhang (Eds.), Lecture notes in computer science: Vol. 5360 (pp. 318-324); Wang, G., Yeung, D. Y., & Lochovsky, F. (2007). The kernel path in kernelized LASSO. In International conference on artificial intelligence and statistics (pp. 580-587). San Juan, Puerto Rico: MIT Press]. This paper is concerned with learning kernels under the LASSO formulation by adopting a generative Bayesian learning and inference approach. A new robust learning algorithm is proposed which produces a sparse kernel model with the capability of learning regularized parameters and kernel hyperparameters. A comparison with state-of-the-art methods for constructing sparse regression models such as the relevance vector machine (RVM) and the local regularization assisted orthogonal least squares regression (LROLS) is given. The new algorithm is also demonstrated to possess considerable computational advantages.
Bayesian inference for generalized linear models for spiking neurons
Directory of Open Access Journals (Sweden)
Sebastian Gerwinn
2010-05-01
Full Text Available Generalized Linear Models (GLMs) are commonly used statistical methods for modelling the relationship between neural population activity and presented stimuli. When the dimension of the parameter space is large, strong regularization has to be used in order to fit GLMs to datasets of realistic size without overfitting. By imposing properly chosen priors over parameters, Bayesian inference provides an effective and principled approach for achieving regularization. Here we show how the posterior distribution over model parameters of GLMs can be approximated by a Gaussian using the Expectation Propagation algorithm. In this way, we obtain an estimate of the posterior mean and posterior covariance, allowing us to calculate Bayesian confidence intervals that characterize the uncertainty about the optimal solution. From the posterior we also obtain a different point estimate, namely the posterior mean as opposed to the commonly used maximum a posteriori estimate. We systematically compare the different inference techniques on simulated data as well as on multi-electrode recordings of retinal ganglion cells, and explore the effects of the chosen prior and the performance measure used. We find that good performance can be achieved by choosing a Laplace prior together with the posterior mean estimate.
Bayesian inference from count data using discrete uniform priors.
Directory of Open Access Journals (Sweden)
Federico Comoglio
Full Text Available We consider a set of sample counts obtained by sampling arbitrary fractions of a finite volume containing a homogeneously dispersed population of identical objects. We report a Bayesian derivation of the posterior probability distribution of the population size using a binomial likelihood and non-conjugate, discrete uniform priors under sampling with or without replacement. Our derivation yields a computationally feasible formula that can prove useful in a variety of statistical problems involving absolute quantification under uncertainty. We implemented our algorithm in the R package dupiR and compared it with a previously proposed Bayesian method based on a Gamma prior. As a showcase, we demonstrate that our inference framework can be used to estimate bacterial survival curves from measurements characterized by extremely low or zero counts and rather high sampling fractions. All in all, we provide a versatile, general purpose algorithm to infer population sizes from count data, which can find application in a broad spectrum of biological and physical problems.
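The combination of a binomial likelihood with a discrete uniform prior can be illustrated by brute-force enumeration over candidate population sizes. The sketch below is not the closed-form formula derived in the paper and does not use the dupiR API; it assumes independent samples (the with-replacement case), sampling fractions strictly between 0 and 1, and a prior range whose upper end is at least the largest count. The function names (log_binom_pmf, population_posterior) are hypothetical.

```python
import math


def log_binom_pmf(k, n, r):
    """Log of the binomial pmf C(n, k) * r**k * (1 - r)**(n - k), with 0 < r < 1."""
    if k > n:
        return float("-inf")  # cannot count more objects than exist
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(r) + (n - k) * math.log1p(-r)


def population_posterior(counts, fractions, n_min, n_max):
    """Posterior over the population size N under a uniform prior on n_min..n_max.

    counts[i] objects were observed in a sample covering fractions[i] of the volume.
    """
    support = range(n_min, n_max + 1)
    log_post = [
        sum(log_binom_pmf(k, n, r) for k, r in zip(counts, fractions))
        for n in support
    ]
    peak = max(log_post)  # subtract the peak before exponentiating, for stability
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    return {n: w / total for n, w in zip(support, weights)}


# Two samples, each covering 10% of the volume, with 3 and 5 objects counted.
posterior = population_posterior(counts=[3, 5], fractions=[0.1, 0.1], n_min=0, n_max=500)
map_estimate = max(posterior, key=posterior.get)
```

Because the prior is flat, the posterior is simply the normalized likelihood restricted to the prior range, so population sizes smaller than the largest observed count receive zero probability.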
Bayesian Inference for Signal-Based Seismic Monitoring
Moore, D.
2015-12-01
Traditional seismic monitoring systems rely on discrete detections produced by station processing software, discarding significant information present in the original recorded signal. SIG-VISA (Signal-based Vertically Integrated Seismic Analysis) is a system for global seismic monitoring through Bayesian inference on seismic signals. By modeling signals directly, our forward model is able to incorporate a rich representation of the physics underlying the signal generation process, including source mechanisms, wave propagation, and station response. This allows inference in the model to recover the qualitative behavior of recent geophysical methods including waveform matching and double-differencing, all as part of a unified Bayesian monitoring system that simultaneously detects and locates events from a global network of stations. We demonstrate recent progress in scaling up SIG-VISA to efficiently process the data stream of global signals recorded by the International Monitoring System (IMS), including comparisons against existing processing methods that show increased sensitivity from our signal-based model and in particular the ability to locate events (including aftershock sequences that can tax analyst processing) precisely from waveform correlation effects. We also provide a Bayesian analysis of an alleged low-magnitude event near the DPRK test site in May 2010 [1] [2], investigating whether such an event could plausibly be detected through automated processing in a signal-based monitoring system. [1] Zhang, Miao and Wen, Lianxing. "Seismological Evidence for a Low-Yield Nuclear Test on 12 May 2010 in North Korea". Seismological Research Letters, January/February 2015. [2] Richards, Paul. "A Seismic Event in North Korea on 12 May 2010". CTBTO SnT 2015 oral presentation, video at https://video-archive.ctbto.org/index.php/kmc/preview/partner_id/103/uiconf_id/4421629/entry_id/0_ymmtpps0/delivery/http
Ravenna, Matteo; Lebedev, Sergei
2016-04-01
We develop a Markov Chain Monte Carlo method for joint inversion of Rayleigh- and Love-wave dispersion curves that is able to yield robust radially and azimuthally anisotropic shear velocity profiles, with resolution to depths down to the transition zone. The probabilistic feature of the algorithm is a powerful tool that is able to provide error assessment of the shear velocity models, quantify non-uniqueness and address the issue of data noise estimation by treating it as an unknown parameter in the inversion. In a fixed dimensional Bayesian formulation, we choose to set the number of parameters relatively high, with a denser parametrization in the uppermost mantle in order to have a good resolution of the Lithosphere-Asthenosphere Boundary region. We apply the MCMC algorithm to the inversion of surface-wave phase velocities accurately determined in broad period ranges in a few test regions. In the Baikal-Mongolia region we invert Rayleigh- and Love-wave dispersion curves for radially anisotropic structure (Vsv, Vsh) of the crust and upper mantle. In the Tuscany region, where we have phase velocity data with good azimuthal coverage, a different implementation of the algorithm is applied that is able to resolve azimuthal anisotropy; the Rayleigh wave dispersion curves measured at different azimuths have been inverted for the Vsv structure and the depth distribution of the 2-psi azimuthal anisotropy of the region, with good resolution down to asthenospheric depths.
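The idea of treating the data noise as an unknown parameter of the inversion can be shown on a toy problem. The sketch below is far simpler than a surface-wave inversion and is not the authors' algorithm: it assumes a single model parameter (a constant level), Gaussian noise, flat priors on the level and on the logarithm of the noise standard deviation, and a plain random-walk Metropolis sampler. All names and settings (log_posterior, metropolis, step, the synthetic data) are illustrative assumptions.

```python
import math
import random


def log_posterior(level, log_sigma, n, s1, s2):
    """Gaussian log-likelihood (flat priors on level and log_sigma), up to a constant.

    n, s1, s2 are the number of data points, their sum and their sum of squares,
    so the residual sum of squares can be evaluated without looping over the data.
    """
    rss = s2 - 2.0 * level * s1 + n * level * level
    return -n * log_sigma - rss / (2.0 * math.exp(2.0 * log_sigma))


def metropolis(data, n_iter=20000, step=0.05, seed=1):
    """Random-walk Metropolis over (level, log_sigma); returns (level, sigma) samples."""
    rng = random.Random(seed)
    n, s1, s2 = len(data), sum(data), sum(d * d for d in data)
    level, log_sigma = data[0], 0.0
    current = log_posterior(level, log_sigma, n, s1, s2)
    samples = []
    for _ in range(n_iter):
        cand_level = level + rng.gauss(0.0, step)
        cand_log_sigma = log_sigma + rng.gauss(0.0, step)
        cand = log_posterior(cand_level, cand_log_sigma, n, s1, s2)
        # Accept with probability min(1, posterior ratio); the proposal is symmetric.
        if math.log(1.0 - rng.random()) < cand - current:
            level, log_sigma, current = cand_level, cand_log_sigma, cand
        samples.append((level, math.exp(log_sigma)))
    return samples


# Synthetic data (assumed): true level 3.5, true noise standard deviation 0.2.
data_rng = random.Random(0)
data = [3.5 + data_rng.gauss(0.0, 0.2) for _ in range(200)]
samples = metropolis(data)[5000:]  # discard the first 5000 iterations as burn-in
post_level = sum(m for m, _ in samples) / len(samples)
post_sigma = sum(s for _, s in samples) / len(samples)
```

The spread of the sampled sigma values then reflects how well the data constrain their own noise level, instead of relying on a value fixed in advance.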
Bayesian inference for identifying interaction rules in moving animal groups.
Directory of Open Access Journals (Sweden)
Richard P Mann
Full Text Available The emergence of similar collective patterns from different self-propelled particle models of animal groups points to a restricted set of "universal" classes for these patterns. While universality is interesting, it is often the fine details of animal interactions that are of biological importance. Universality thus presents a challenge to inferring such interactions from macroscopic group dynamics since these can be consistent with many underlying interaction models. We present a Bayesian framework for learning animal interaction rules from fine scale recordings of animal movements in swarms. We apply these techniques to the inverse problem of inferring interaction rules from simulation models, showing that parameters can often be inferred from a small number of observations. Our methodology allows us to quantify our confidence in parameter fitting. For example, we show that attraction and alignment terms can be reliably estimated when animals are milling in a torus shape, while interaction radius cannot be reliably measured in such a situation. We assess the importance of rate of data collection and show how to test different models, such as topological and metric neighbourhood models. Taken together our results both inform the design of experiments on animal interactions and suggest how these data should be best analysed.
Bayesian inference for identifying interaction rules in moving animal groups.
Mann, Richard P
2011-01-01
The emergence of similar collective patterns from different self-propelled particle models of animal groups points to a restricted set of "universal" classes for these patterns. While universality is interesting, it is often the fine details of animal interactions that are of biological importance. Universality thus presents a challenge to inferring such interactions from macroscopic group dynamics since these can be consistent with many underlying interaction models. We present a Bayesian framework for learning animal interaction rules from fine scale recordings of animal movements in swarms. We apply these techniques to the inverse problem of inferring interaction rules from simulation models, showing that parameters can often be inferred from a small number of observations. Our methodology allows us to quantify our confidence in parameter fitting. For example, we show that attraction and alignment terms can be reliably estimated when animals are milling in a torus shape, while interaction radius cannot be reliably measured in such a situation. We assess the importance of rate of data collection and show how to test different models, such as topological and metric neighbourhood models. Taken together our results both inform the design of experiments on animal interactions and suggest how these data should be best analysed.
A tutorial on time-evolving dynamical Bayesian inference
Stankovski, Tomislav; Duggento, Andrea; McClintock, Peter V. E.; Stefanovska, Aneta
2014-12-01
In view of the current availability and variety of measured data, there is an increasing demand for powerful signal processing tools that can cope successfully with the associated problems that often arise when data are being analysed. In practice many of the data-generating systems are not only time-variable, but also influenced by neighbouring systems and subject to random fluctuations (noise) from their environments. To encompass problems of this kind, we present a tutorial about the dynamical Bayesian inference of time-evolving coupled systems in the presence of noise. It includes the necessary theoretical description and the algorithms for its implementation. For general programming purposes, a pseudocode description is also given. Examples based on coupled phase and limit-cycle oscillators illustrate the salient features of phase dynamics inference. State domain inference is illustrated with an example of coupled chaotic oscillators. The applicability of the latter example to secure communications based on the modulation of coupling functions is outlined. MatLab codes for implementation of the method, as well as for the explicit examples, accompany the tutorial.
Applying Bayesian inference to Galileon solutions of the muon problem
Lamm, Henry
2016-12-01
We derive corrections to atomic energy levels from disformal couplings in Galileon theories. Through Bayesian inference, we constrain the cutoff radii and Galileon scale via these corrections. To connect different atomic systems, we assume the various cutoff radii related by a one-parameter family of solutions. This introduces a new parameter α which is also constrained. In this model, we predict shifts to muonic helium of $\delta E_{\mathrm{He3}} = 1.97^{+9.28}_{-1.87}$ meV and $\delta E_{\mathrm{He4}} = 1.69^{+9.25}_{-1.61}$ meV as well as for true muonium, $\delta E_{\mathrm{TM}} = 0.06^{+0.46}_{-0.05}$ meV.
Unsupervised Transient Light Curve Analysis Via Hierarchical Bayesian Inference
Sanders, Nathan; Soderberg, Alicia
2014-01-01
Historically, light curve studies of supernovae (SNe) and other transient classes have focused on individual objects with copious and high signal-to-noise observations. In the nascent era of wide field transient searches, objects with detailed observations are decreasing as a fraction of the overall known SN population, and this strategy sacrifices the majority of the information contained in the data about the underlying population of transients. A population level modeling approach, simultaneously fitting all available observations of objects in a transient sub-class of interest, fully mines the data to infer the properties of the population and avoids certain systematic biases. We present a novel hierarchical Bayesian statistical model for population level modeling of transient light curves, and discuss its implementation using an efficient Hamiltonian Monte Carlo technique. As a test case, we apply this model to the Type IIP SN sample from the Pan-STARRS1 Medium Deep Survey, consisting of 18,837 photometr...
Bayesian large-scale structure inference and cosmic web analysis
Leclercq, Florent
2015-01-01
Surveys of the cosmic large-scale structure carry opportunities for building and testing cosmological theories about the origin and evolution of the Universe. This endeavor requires appropriate data assimilation tools, for establishing the contact between survey catalogs and models of structure formation. In this thesis, we present an innovative statistical approach for the ab initio simultaneous analysis of the formation history and morphology of the cosmic web: the BORG algorithm infers the primordial density fluctuations and produces physical reconstructions of the dark matter distribution that underlies observed galaxies, by assimilating the survey data into a cosmological structure formation model. The method, based on Bayesian probability theory, provides accurate means of uncertainty quantification. We demonstrate the application of BORG to the Sloan Digital Sky Survey data and describe the primordial and late-time large-scale structure in the observed volume. We show how the approach has led to the fi...
Bayesian inference on the sphere beyond statistical isotropy
Das, Santanu; Souradeep, Tarun
2015-01-01
We present a general method for Bayesian inference of the underlying covariance structure of random fields on a sphere. We employ the Bipolar Spherical Harmonic (BipoSH) representation of general covariance structure on the sphere. We illustrate the efficacy of the method as a principled approach to assess violation of statistical isotropy (SI) in the sky maps of Cosmic Microwave Background (CMB) fluctuations. SI violation in observed CMB maps arises due to known physical effects such as Doppler boost and weak lensing; as-yet-unknown theoretical possibilities like cosmic topology and subtle violations of the cosmological principle; and expected observational artefacts of scanning the sky with a non-circular beam, masking, foreground residuals, anisotropic noise, etc. We explicitly demonstrate the recovery of the input SI violation signals with their full statistics in simulated CMB maps. Our formalism easily adapts to exploring parametric physical models with non-SI covariance, as we illustrate for the in...
Bayesian inference underlies the contraction bias in delayed comparison tasks.
Directory of Open Access Journals (Sweden)
Paymon Ashourian
Full Text Available Delayed comparison tasks are widely used in the study of working memory and perception in psychology and neuroscience. It has long been known, however, that decisions in these tasks are biased. When the two stimuli in a delayed comparison trial are small in magnitude, subjects tend to report that the first stimulus is larger than the second stimulus. In contrast, subjects tend to report that the second stimulus is larger than the first when the stimuli are relatively large. Here we study the computational principles underlying this bias, also known as the contraction bias. We propose that the contraction bias results from a Bayesian computation in which a noisy representation of a magnitude is combined with a priori information about the distribution of magnitudes to optimize performance. We test our hypothesis on choice behavior in a visual delayed comparison experiment by studying the effect of (i) changing the prior distribution and (ii) changing the uncertainty in the memorized stimulus. We show that choice behavior in both manipulations is consistent with the performance of an observer who uses Bayesian inference in order to improve performance. Moreover, our results suggest that the contraction bias arises during memory retrieval/decision making and not during memory encoding. These results support the notion that the contraction bias illusion can be understood as resulting from optimality considerations.
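One standard way to see how combining a noisy representation with prior information can produce such a bias is the conjugate Gaussian case. The Gaussian prior and Gaussian memory noise below are illustrative assumptions, not necessarily the model fitted in the study, and the symbols $\mu_0$, $\sigma_0$, $\sigma$, $r$ and $w$ are introduced here only for this sketch.

```latex
\begin{align*}
m &\sim \mathcal{N}(\mu_0, \sigma_0^2), \qquad r \mid m \sim \mathcal{N}(m, \sigma^2), \\
\mathbb{E}[m \mid r] &= w\, r + (1 - w)\, \mu_0, \qquad w = \frac{\sigma_0^2}{\sigma_0^2 + \sigma^2}, \\
\mathbb{E}[m \mid r] - r &= (1 - w)\,(\mu_0 - r).
\end{align*}
```

Under these assumptions the estimate of the remembered first stimulus is pulled toward the prior mean $\mu_0$: upward when $r$ lies below $\mu_0$ and downward when it lies above, which matches the direction of the contraction bias described in the abstract. A larger memory noise $\sigma$ gives a smaller weight $w$ and a stronger pull, and shifting the prior shifts the point toward which estimates contract.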
Link, William A.; Eaton, Mitchell J.
2012-01-01
1. Markov chain Monte Carlo (MCMC) is a simulation technique that has revolutionised the analysis of ecological data, allowing the fitting of complex models in a Bayesian framework. Since 2001, there have been nearly 200 papers using MCMC in publications of the Ecological Society of America and the British Ecological Society, including more than 75 in the journal Ecology and 35 in the Journal of Applied Ecology.
vs. a polynomial chaos-based MCMC
Siripatana, Adil
2014-08-01
Bayesian Inference of Manning's n coefficient in a Storm Surge Model Framework: comparison between Kalman filter and polynomial based method. Adil Siripatana. Conventional coastal ocean models solve the shallow water equations, which describe the conservation of mass and momentum when the horizontal length scale is much greater than the vertical length scale. In this case vertical pressure gradients in the momentum equations are nearly hydrostatic. The outputs of coastal ocean models are thus sensitive to the bottom stress terms defined through the formulation of Manning's n coefficients. This thesis considers the Bayesian inference problem of the Manning's n coefficient in the context of storm surge based on the coastal ocean ADCIRC model. In the first part of the thesis, we apply an ensemble-based Kalman filter, the singular evolutive interpolated Kalman (SEIK) filter, to estimate both a constant Manning's n coefficient and a 2-D parameterized Manning's coefficient on one ideal and one more realistic domain using observation system simulation experiments (OSSEs). We study the sensitivity of the system to the ensemble size. We also assess the benefits of using an inflation factor on the filter performance. To study the limitation of the Gaussian restricted assumption on the SEIK filter, we also implemented in the second part of this thesis a Markov Chain Monte Carlo (MCMC) method based on a Generalized Polynomial chaos (gPc) approach for the estimation of the 1-D and 2-D Manning's n coefficient. The gPc is used to build a surrogate model that imitates the ADCIRC model in order to make the computational cost of implementing the MCMC with the ADCIRC model reasonable. We evaluate the performance of the MCMC-gPc approach and study its robustness to different OSSE scenarios. We also compare its estimates with those resulting from SEIK in terms of parameter estimates and full distributions. We present a full analysis of the solution of these two methods, of the
Bayesian inference for partially identified models exploring the limits of limited data
Gustafson, Paul
2015-01-01
Introduction; Identification; What Is against Us?; What Is for Us?; Some Simple Examples of Partially Identified Models; The Road Ahead; The Structure of Inference in Partially Identified Models; Bayesian Inference; The Structure of Posterior Distributions in PIMs; Computational Strategies; Strength of Bayesian Updating, Revisited; Posterior Moments; Credible Intervals; Evaluating the Worth of Inference; Partial Identification versus Model Misspecification; The Siren Call of Identification; Comp
Bayesian Inference for Functional Dynamics Exploring in fMRI Data.
Guo, Xuan; Liu, Bing; Chen, Le; Chen, Guantao; Pan, Yi; Zhang, Jing
2016-01-01
This paper aims to review state-of-the-art Bayesian-inference-based methods applied to functional magnetic resonance imaging (fMRI) data. Particularly, we focus on one specific long-standing challenge in the computational modeling of fMRI datasets: how to effectively explore typical functional interactions from fMRI time series and the corresponding boundaries of temporal segments. Bayesian inference is a method of statistical inference which has been shown to be a powerful tool to encode dependence relationships among the variables with uncertainty. Here we provide an introduction to a group of Bayesian-inference-based methods for fMRI data analysis, which were designed to detect magnitude or functional connectivity change points and to infer their functional interaction patterns based on corresponding temporal boundaries. We also provide a comparison of three popular Bayesian models, that is, Bayesian Magnitude Change Point Model (BMCPM), Bayesian Connectivity Change Point Model (BCCPM), and Dynamic Bayesian Variable Partition Model (DBVPM), and give a summary of their applications. We envision that more delicate Bayesian inference models will be emerging and play increasingly important roles in modeling brain functions in the years to come.
Bayesian Inference for Functional Dynamics Exploring in fMRI Data
Directory of Open Access Journals (Sweden)
Xuan Guo
2016-01-01
Full Text Available This paper aims to review state-of-the-art Bayesian-inference-based methods applied to functional magnetic resonance imaging (fMRI) data. Particularly, we focus on one specific long-standing challenge in the computational modeling of fMRI datasets: how to effectively explore typical functional interactions from fMRI time series and the corresponding boundaries of temporal segments. Bayesian inference is a method of statistical inference which has been shown to be a powerful tool to encode dependence relationships among the variables with uncertainty. Here we provide an introduction to a group of Bayesian-inference-based methods for fMRI data analysis, which were designed to detect magnitude or functional connectivity change points and to infer their functional interaction patterns based on corresponding temporal boundaries. We also provide a comparison of three popular Bayesian models, that is, Bayesian Magnitude Change Point Model (BMCPM), Bayesian Connectivity Change Point Model (BCCPM), and Dynamic Bayesian Variable Partition Model (DBVPM), and give a summary of their applications. We envision that more delicate Bayesian inference models will be emerging and play increasingly important roles in modeling brain functions in the years to come.
Rahmati, Vahid; Kirmse, Knut; Marković, Dimitrije; Holthoff, Knut; Kiebel, Stefan J
2016-02-01
Calcium imaging has been used as a promising technique to monitor the dynamic activity of neuronal populations. However, the calcium trace is temporally smeared which restricts the extraction of quantities of interest such as spike trains of individual neurons. To address this issue, spike reconstruction algorithms have been introduced. One limitation of such reconstructions is that the underlying models are not informed about the biophysics of spike and burst generations. Such existing prior knowledge might be useful for constraining the possible solutions of spikes. Here we describe, in a novel Bayesian approach, how principled knowledge about neuronal dynamics can be employed to infer biophysical variables and parameters from fluorescence traces. By using both synthetic and in vitro recorded fluorescence traces, we demonstrate that the new approach is able to reconstruct different repetitive spiking and/or bursting patterns with accurate single spike resolution. Furthermore, we show that the high inference precision of the new approach is preserved even if the fluorescence trace is rather noisy or if the fluorescence transients show slow rise kinetics lasting several hundred milliseconds, and inhomogeneous rise and decay times. In addition, we discuss the use of the new approach for inferring parameter changes, e.g. due to a pharmacological intervention, as well as for inferring complex characteristics of immature neuronal circuits.
Hernández, Mario R.; Francés, Félix
2015-04-01
One phase of the hydrological model implementation process that contributes significantly to the uncertainty of hydrological predictions is the calibration phase, in which values of the unknown model parameters are tuned by optimizing an objective function. An unsuitable error model (e.g. Standard Least Squares, or SLS) introduces noise into the estimation of the parameters. The main sources of this noise are the input errors and the structural deficiencies of the hydrological model. Thus, the biased calibrated parameters cause the model divergence phenomenon, where the error variance of the (spatially and temporally) forecasted flows far exceeds the error variance in the fitting period, and provoke the loss of part or all of the physical meaning of the modeled processes. In other words, they yield a calibrated hydrological model that works well, but not for the right reasons. Besides, an unsuitable error model yields an unreliable predictive uncertainty assessment. Hence, with the aim of preventing all these undesirable effects, this research focuses on the Bayesian joint inference (BJI) of both the hydrological and error model parameters, considering a general additive (GA) error model that allows for correlation, non-stationarity (in variance and bias) and non-normality of model residuals. The hydrological model used is a conceptual distributed model called TETIS, with a particular split structure of the effective model parameters. Bayesian inference has been performed with the aid of a Markov Chain Monte Carlo (MCMC) algorithm called Dream-ZS. The MCMC algorithm quantifies the uncertainty of the hydrological and error model parameters by obtaining the joint posterior probability distribution, conditioned on the observed flows. The BJI methodology is a very powerful and reliable tool, but it must be used correctly; that is, if non-stationarity in error variance and bias is modeled, the Total Laws must be taken into account. The results of this research show that the
Bayesian Inference in Polling Technique: 1992 Presidential Polls.
Satake, Eiki
1994-01-01
Explores the potential utility of Bayesian statistical methods in determining the predictability of multiple polls. Compares Bayesian techniques to the classical statistical method employed by pollsters. Considers these questions in the context of the 1992 presidential elections. (HB)
Bayesian analysis of Markov point processes
DEFF Research Database (Denmark)
Berthelsen, Kasper Klitgaard; Møller, Jesper
2006-01-01
Recently Møller, Pettitt, Berthelsen and Reeves introduced a new MCMC methodology for drawing samples from a posterior distribution when the likelihood function is only specified up to a normalising constant. We illustrate the method in the setting of Bayesian inference for Markov point processes...
Bayesian inference and life testing plans for generalized exponential distribution
Institute of Scientific and Technical Information of China (English)
KUNDU, Debasis; PRADHAN, Biswabrata
2009-01-01
Recently the generalized exponential distribution has received considerable attention. In this paper, we deal with the Bayesian inference of the unknown parameters of the progressively censored generalized exponential distribution. It is assumed that the scale and the shape parameters have independent gamma priors. The Bayes estimates of the unknown parameters cannot be obtained in closed form. Lindley’s approximation and the importance sampling technique have been suggested to compute the approximate Bayes estimates. The Markov Chain Monte Carlo method has been used to compute the approximate Bayes estimates and also to construct the highest posterior density credible intervals. We also provide different criteria to compare two different sampling schemes and hence to find the optimal sampling schemes. It is observed that finding the optimum censoring procedure is a computationally expensive process, and we recommend using the sub-optimal censoring procedure, which can be obtained very easily. Monte Carlo simulations are performed to compare the performances of the different methods and one data analysis has been performed for illustrative purposes.
Bayesian Inference for NASA Probabilistic Risk and Reliability Analysis
Dezfuli, Homayoon; Kelly, Dana; Smith, Curtis; Vedros, Kurt; Galyean, William
2009-01-01
This document, Bayesian Inference for NASA Probabilistic Risk and Reliability Analysis, is intended to provide guidelines for the collection and evaluation of risk and reliability-related data. It is aimed at scientists and engineers familiar with risk and reliability methods and provides a hands-on approach to the investigation and application of a variety of risk and reliability data assessment methods, tools, and techniques. This document provides both a broad perspective on data analysis collection and evaluation issues and a narrow focus on the methods to implement a comprehensive information repository. The topics addressed herein cover the fundamentals of how data and information are to be used in risk and reliability analysis models and their potential role in decision making. Understanding these topics is essential to attaining a risk informed decision making environment that is being sought by NASA requirements and procedures such as NPR 8000.4 (Agency Risk Management Procedural Requirements), NPR 8705.05 (Probabilistic Risk Assessment Procedures for NASA Programs and Projects), and the System Safety requirements of NPR 8715.3 (NASA General Safety Program Requirements).
Self-associations influence task-performance through Bayesian inference
Directory of Open Access Journals (Sweden)
Sara L Bengtsson
2013-08-01
Full Text Available The way we think about ourselves impacts greatly on our behaviour. This paper describes a behavioural study and a computational model that sheds new light on this important area. Participants were primed 'clever' and 'stupid' using a scrambled sentence task, and we measured the effect on response time and error-rate on a rule-association task. First, we observed a confirmation bias effect in that associations to being 'stupid' led to a gradual decrease in performance, whereas associations to being 'clever' did not. Second, we observed that the activated self-concepts selectively modified attention towards one's performance. There was an early to late double dissociation in RTs in that primed 'clever' resulted in RT increase following error responses, whereas primed 'stupid' resulted in RT increase following correct responses. We propose a computational model of subjects' behaviour based on the logic of the experimental task that involves two processes; memory for rules and the integration of rules with subsequent visual cues. The model also incorporates an adaptive decision threshold based on Bayes rule, whereby decision thresholds are increased if integration was inferred to be faulty. Fitting the computational model to experimental data confirmed our hypothesis that priming affects the memory process. This model explains both the confirmation bias and double dissociation effects and demonstrates that Bayesian inferential principles can be used to study the effect of self-concepts on behaviour.
Directory of Open Access Journals (Sweden)
Oliver Serang
Full Text Available Exact Bayesian inference can sometimes be performed efficiently for special cases where a function has commutative and associative symmetry of its inputs (called "causal independence"). For this reason, it is desirable to exploit such symmetry on big data sets. Here we present a method to exploit a general form of this symmetry on probabilistic adder nodes by transforming those probabilistic adder nodes into a probabilistic convolution tree with which dynamic programming computes exact probabilities. A substantial speedup is demonstrated using an illustrative example that can arise when identifying splice forms with bottom-up mass spectrometry-based proteomics. On this example, even state-of-the-art exact inference algorithms require a runtime more than exponential in the number of splice forms considered. By using the probabilistic convolution tree, we reduce the runtime to O(k log(k)^2) and the space to O(k log(k)), where k is the number of variables joined by an additive or cardinal operator. This approach, which can also be used with junction tree inference, is applicable to graphs with arbitrary dependency on counting variables or cardinalities and can be used on diverse problems and fields like forward error correcting codes, elemental decomposition, and spectral demixing. The approach also trivially generalizes to multiple dimensions.
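The core idea of combining additive variables pairwise in a tree can be sketched as follows. This is only the forward pass (the distribution of the sum of independent discrete variables), written with a naive quadratic convolution; the function names are hypothetical, and the fast convolutions and backward message passing needed to reach the stated complexity and to compute posteriors are not shown.

```python
def convolve(p, q):
    """Distribution of X + Y for independent X ~ p, Y ~ q on {0, 1, 2, ...}."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] += p_i * q_j
    return out


def sum_distribution(pmfs):
    """Combine the pmfs pairwise, layer by layer, as in a balanced binary tree."""
    if not pmfs:
        raise ValueError("need at least one distribution")
    layer = [list(p) for p in pmfs]
    while len(layer) > 1:
        merged = [convolve(layer[i], layer[i + 1])
                  for i in range(0, len(layer) - 1, 2)]
        if len(layer) % 2:  # an odd node is carried up unchanged
            merged.append(layer[-1])
        layer = merged
    return layer[0]


if __name__ == "__main__":
    # Three fair coin flips: the number of heads is Binomial(3, 0.5).
    print(sum_distribution([[0.5, 0.5]] * 3))
```

Because each layer halves the number of nodes, the tree has logarithmic depth, which is where the logarithmic factors in the quoted runtime and space bounds come from once each convolution is done efficiently.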
Mocapy++ - a toolkit for inference and learning in dynamic Bayesian networks
DEFF Research Database (Denmark)
Paluszewski, Martin; Hamelryck, Thomas Wim
2010-01-01
Background: Mocapy++ is a toolkit for parameter learning and inference in dynamic Bayesian networks (DBNs). It supports a wide range of DBN architectures and probability distributions, including distributions from directional statistics (the statistics of angles, directions and orientations...
Quantifying MCMC exploration of phylogenetic tree space.
Whidden, Chris; Matsen, Frederick A
2015-05-01
In order to gain an understanding of the effectiveness of phylogenetic Markov chain Monte Carlo (MCMC), it is important to understand how quickly the empirical distribution of the MCMC converges to the posterior distribution. In this article, we investigate this problem on phylogenetic tree topologies with a metric that is especially well suited to the task: the subtree prune-and-regraft (SPR) metric. This metric directly corresponds to the minimum number of MCMC rearrangements required to move between trees in common phylogenetic MCMC implementations. We develop a novel graph-based approach to analyze tree posteriors and find that the SPR metric is much more informative than simpler metrics that are unrelated to MCMC moves. In doing so, we show conclusively that topological peaks do occur in Bayesian phylogenetic posteriors from real data sets as sampled with standard MCMC approaches, investigate the efficiency of Metropolis-coupled MCMC (MCMCMC) in traversing the valleys between peaks, and show that conditional clade distribution (CCD) can have systematic problems when there are multiple peaks.
Bayesian approaches to spatial inference: Modelling and computational challenges and solutions
Moores, Matthew; Mengersen, Kerrie
2014-12-01
We discuss a range of Bayesian modelling approaches for spatial data and investigate some of the associated computational challenges. This paper commences with a brief review of Bayesian mixture models and Markov random fields, with enabling computational algorithms including Markov chain Monte Carlo (MCMC) and integrated nested Laplace approximation (INLA). Following this, we focus on the Potts model as a canonical approach, and discuss the challenge of estimating the inverse temperature parameter that controls the degree of spatial smoothing. We compare three approaches to addressing the doubly intractable nature of the likelihood, namely pseudo-likelihood, path sampling and the exchange algorithm. These techniques are applied to satellite data used to analyse water quality in the Great Barrier Reef.
UNSUPERVISED TRANSIENT LIGHT CURVE ANALYSIS VIA HIERARCHICAL BAYESIAN INFERENCE
Energy Technology Data Exchange (ETDEWEB)
Sanders, N. E.; Soderberg, A. M. [Harvard-Smithsonian Center for Astrophysics, 60 Garden Street, Cambridge, MA 02138 (United States); Betancourt, M., E-mail: nsanders@cfa.harvard.edu [Department of Statistics, University of Warwick, Coventry CV4 7AL (United Kingdom)
2015-02-10
Historically, light curve studies of supernovae (SNe) and other transient classes have focused on individual objects with copious and high signal-to-noise observations. In the nascent era of wide field transient searches, objects with detailed observations are decreasing as a fraction of the overall known SN population, and this strategy sacrifices the majority of the information contained in the data about the underlying population of transients. A population level modeling approach, simultaneously fitting all available observations of objects in a transient sub-class of interest, fully mines the data to infer the properties of the population and avoids certain systematic biases. We present a novel hierarchical Bayesian statistical model for population level modeling of transient light curves, and discuss its implementation using an efficient Hamiltonian Monte Carlo technique. As a test case, we apply this model to the Type IIP SN sample from the Pan-STARRS1 Medium Deep Survey, consisting of 18,837 photometric observations of 76 SNe, corresponding to a joint posterior distribution with 9176 parameters under our model. Our hierarchical model fits provide improved constraints on light curve parameters relevant to the physical properties of their progenitor stars relative to modeling individual light curves alone. Moreover, we directly evaluate the probability for occurrence rates of unseen light curve characteristics from the model hyperparameters, addressing observational biases in survey methodology. We view this modeling framework as an unsupervised machine learning technique with the ability to maximize scientific returns from data to be collected by future wide field transient searches like LSST.
Elsheikh, Ahmed H.
2014-02-01
An efficient Bayesian calibration method based on the nested sampling (NS) algorithm and non-intrusive polynomial chaos method is presented. Nested sampling is a Bayesian sampling algorithm that builds a discrete representation of the posterior distributions by iteratively re-focusing a set of samples to high likelihood regions. NS allows representing the posterior probability density function (PDF) with a smaller number of samples and reduces the curse of dimensionality effects. The main difficulty of the NS algorithm is in the constrained sampling step which is commonly performed using a random walk Markov Chain Monte-Carlo (MCMC) algorithm. In this work, we perform a two-stage sampling using a polynomial chaos response surface to filter out rejected samples in the Markov Chain Monte-Carlo method. The combined use of nested sampling and the two-stage MCMC based on approximate response surfaces provides significant computational gains in terms of the number of simulation runs. The proposed algorithm is applied for calibration and model selection of subsurface flow models. © 2013.
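The filtering step can be illustrated with a generic two-stage (delayed-acceptance) Metropolis sampler, in which a cheap surrogate screens proposals before the expensive model is evaluated. This is a sketch of that general technique under assumed toy densities, not the authors' implementation: the standard-normal "expensive" target, the deliberately imperfect surrogate and every name below are hypothetical, and the nested sampling outer loop is omitted.

```python
import math
import random


def log_target(x):
    """Stand-in for the expensive simulator-based log posterior (standard normal)."""
    return -0.5 * x * x


def log_surrogate(x):
    """Cheap approximation of log_target, deliberately slightly wrong."""
    return -0.5 * ((x - 0.2) / 1.2) ** 2


def two_stage_metropolis(n_steps=20000, step=1.0, seed=0):
    """Return (samples, number of expensive evaluations inside the loop)."""
    rng = random.Random(seed)
    x = 0.0
    lt_x, ls_x = log_target(x), log_surrogate(x)
    samples, expensive_calls = [], 0
    for _ in range(n_steps):
        y = x + rng.gauss(0.0, step)  # symmetric random-walk proposal
        ls_y = log_surrogate(y)
        # Stage 1: screen the proposal using the surrogate only.
        if rng.random() < math.exp(min(0.0, ls_y - ls_x)):
            lt_y = log_target(y)
            expensive_calls += 1
            # Stage 2: correct for the surrogate error so the chain still
            # targets the expensive density.
            log_ratio = (lt_y - lt_x) - (ls_y - ls_x)
            if rng.random() < math.exp(min(0.0, log_ratio)):
                x, lt_x, ls_x = y, lt_y, ls_y
        samples.append(x)
    return samples, expensive_calls


if __name__ == "__main__":
    draws, calls = two_stage_metropolis()
    print("expensive evaluations:", calls, "of", len(draws), "proposals")
```

Proposals rejected at the first stage never reach the expensive model, so the saving grows with the quality of the surrogate, while the second-stage ratio keeps the samples distributed according to the expensive target.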
Bayesian inference of nonlinear unsteady aerodynamics from aeroelastic limit cycle oscillations
Energy Technology Data Exchange (ETDEWEB)
Sandhu, Rimple [Department of Civil and Environmental Engineering, Carleton University, Ottawa, Ontario (Canada); Poirel, Dominique [Department of Mechanical and Aerospace Engineering, Royal Military College of Canada, Kingston, Ontario (Canada); Pettit, Chris [Department of Aerospace Engineering, United States Naval Academy, Annapolis, MD (United States); Khalil, Mohammad [Department of Civil and Environmental Engineering, Carleton University, Ottawa, Ontario (Canada); Sarkar, Abhijit, E-mail: abhijit.sarkar@carleton.ca [Department of Civil and Environmental Engineering, Carleton University, Ottawa, Ontario (Canada)
2016-07-01
A Bayesian model selection and parameter estimation algorithm is applied to investigate the influence of nonlinear and unsteady aerodynamic loads on the limit cycle oscillation (LCO) of a pitching airfoil in the transitional Reynolds number regime. At small angles of attack, laminar boundary layer trailing edge separation causes negative aerodynamic damping leading to the LCO. The fluid–structure interaction of the rigid, but elastically mounted, airfoil and nonlinear unsteady aerodynamics is represented by two coupled nonlinear stochastic ordinary differential equations containing uncertain parameters and model approximation errors. Several plausible aerodynamic models with increasing complexity are proposed to describe the aeroelastic system leading to LCO. The likelihood in the posterior parameter probability density function (pdf) is available semi-analytically using the extended Kalman filter for the state estimation of the coupled nonlinear structural and unsteady aerodynamic model. The posterior parameter pdf is sampled using a parallel and adaptive Markov Chain Monte Carlo (MCMC) algorithm. The posterior probability of each model is estimated using the Chib–Jeliazkov method that directly uses the posterior MCMC samples for evidence (marginal likelihood) computation. The Bayesian algorithm is validated through a numerical study and then applied to model the nonlinear unsteady aerodynamic loads using wind-tunnel test data at various Reynolds numbers.
Bayesian inference of nonlinear unsteady aerodynamics from aeroelastic limit cycle oscillations
Sandhu, Rimple; Poirel, Dominique; Pettit, Chris; Khalil, Mohammad; Sarkar, Abhijit
2016-07-01
Bayesian posterior distributions without Markov chains.
Cole, Stephen R; Chu, Haitao; Greenland, Sander; Hamra, Ghassan; Richardson, David B
2012-03-01
Bayesian posterior parameter distributions are often simulated using Markov chain Monte Carlo (MCMC) methods. However, MCMC methods are not always necessary and do not help the uninitiated understand Bayesian inference. As a bridge to understanding Bayesian inference, the authors illustrate a transparent rejection sampling method. In example 1, they illustrate rejection sampling using 36 cases and 198 controls from a case-control study (1976-1983) assessing the relation between residential exposure to magnetic fields and the development of childhood cancer. Results from rejection sampling (odds ratio (OR) = 1.69, 95% posterior interval (PI): 0.57, 5.00) were similar to MCMC results (OR = 1.69, 95% PI: 0.58, 4.95) and approximations from data-augmentation priors (OR = 1.74, 95% PI: 0.60, 5.06). In example 2, the authors apply rejection sampling to a cohort study of 315 human immunodeficiency virus seroconverters (1984-1998) to assess the relation between viral load after infection and 5-year incidence of acquired immunodeficiency syndrome, adjusting for (continuous) age at seroconversion and race. In this more complex example, rejection sampling required a notably longer run time than MCMC sampling but remained feasible and again yielded similar results. The transparency of the proposed approach comes at a price of being less broadly applicable than MCMC.
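The rejection sampling idea described in this abstract can be sketched on a toy problem. The example below is only an illustration under stated assumptions: it uses a hypothetical binomial data set (7 events in 20 trials) with a uniform prior on the event probability, not the case-control or cohort data from the paper, and the function names are invented for this sketch. A value is drawn from the prior and kept with probability equal to its likelihood divided by the maximum likelihood, so the kept values are draws from the posterior.

```python
import math
import random


def log_likelihood(p, k, n):
    """Binomial log-likelihood of k events in n trials, up to a constant."""
    if p <= 0.0 or p >= 1.0:
        return float("-inf")
    return k * math.log(p) + (n - k) * math.log(1.0 - p)


def rejection_sample_posterior(k, n, n_draws, seed=0):
    """Draw from the posterior of p under a Uniform(0, 1) prior.

    Assumes 0 < k < n so that the maximum likelihood estimate k / n
    lies strictly inside (0, 1).
    """
    rng = random.Random(seed)
    log_l_max = log_likelihood(k / n, k, n)  # bound used for acceptance
    accepted = []
    while len(accepted) < n_draws:
        p = rng.random()          # candidate drawn from the prior
        u = 1.0 - rng.random()    # uniform on (0, 1], safe for log
        # Accept with probability L(p) / L_max.
        if math.log(u) < log_likelihood(p, k, n) - log_l_max:
            accepted.append(p)
    return accepted
```

For this conjugate toy case the exact posterior is Beta(k + 1, n - k + 1), so the mean of the accepted draws can be checked against (k + 1) / (n + 2). In higher dimensions the acceptance rate typically falls quickly, which is consistent with the longer run time the authors report for their second example.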
Bayesian Inference Networks and Spreading Activation in Hypertext Systems.
Savoy, Jacques
1992-01-01
Describes a method based on Bayesian networks for searching hypertext systems. Discussion covers the use of Bayesian networks for structuring index terms and representing user information needs; use of link semantics based on constrained spreading activation to find starting points for browsing; and evaluation of a prototype system. (64…
DEFF Research Database (Denmark)
Heller, Rasmus; Chikhi, Lounes; Siegismund, Hans
2013-01-01
when it is violated. Among the most widely applied demographic inference methods are Bayesian skyline plots (BSPs), which are used across a range of biological fields. Violations of the panmixia assumption are to be expected in many biological systems, but the consequences for skyline plot inferences...
Beigy, Hamid; Ahmad, Ashar; Masoudi-Nejad, Ali; Fröhlich, Holger
2017-01-01
Inferring the structure of molecular networks from time series protein or gene expression data provides valuable information about the complex biological processes of the cell. Causal network structure inference has been approached using different methods in the past. Most causal network inference techniques, such as Dynamic Bayesian Networks and ordinary differential equations, are limited by their computational complexity and thus make large scale inference infeasible. This is specifically true if a Bayesian framework is applied in order to deal with the unavoidable uncertainty about the correct model. We devise a novel Bayesian network reverse engineering approach using ordinary differential equations with the ability to include non-linearity. Besides modeling arbitrary, possibly combinatorial and time dependent perturbations with unknown targets, one of our main contributions is the use of Expectation Propagation, an algorithm for approximate Bayesian inference over large scale network structures in short computation time. We further explore the possibility of integrating prior knowledge into network inference. We evaluate the proposed model on DREAM4 and DREAM8 data and find it competitive against several state-of-the-art existing network inference methods. PMID:28166542
Bayesian parameter inference and model selection by population annealing in systems biology.
Murakami, Yohei
2014-01-01
Parameter inference and model selection are very important for mathematical modeling in systems biology. Bayesian statistics can be used to conduct both parameter inference and model selection. In particular, the framework known as approximate Bayesian computation is often used for parameter inference and model selection in systems biology. However, Monte Carlo methods need to be used to compute Bayesian posterior distributions. In addition, the posterior distributions of parameters are sometimes almost uniform or very similar to their prior distributions. In such cases, it is difficult to choose one specific parameter value with high credibility as the representative value of the distribution. To overcome these problems, we introduced one of the population Monte Carlo algorithms, population annealing. Although population annealing is usually used in statistical mechanics, we showed that it can be used to compute Bayesian posterior distributions in the approximate Bayesian computation framework. To deal with the un-identifiability of the representative values of parameters, we proposed running the simulations with a parameter ensemble sampled from the posterior distribution, named the "posterior parameter ensemble". We showed that population annealing is an efficient and convenient algorithm for generating the posterior parameter ensemble. We also showed that simulations with the posterior parameter ensemble can not only reproduce the data used for parameter inference, but also capture and predict data that were not used for parameter inference. Lastly, we introduced the marginal likelihood in the approximate Bayesian computation framework for Bayesian model selection. We showed that population annealing enables us to compute the marginal likelihood in the approximate Bayesian computation framework and conduct model selection based on the Bayes factor.
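As background for the approximate Bayesian computation (ABC) framework mentioned in this abstract, the following is a minimal sketch of plain ABC rejection sampling. It is not the population annealing algorithm of the paper. The model (normally distributed data with unit variance and unknown mean), the uniform prior on [-5, 5], the tolerance, and all function names are assumptions chosen for illustration.

```python
import random


def simulate_mean(mu, n, rng):
    """Simulate n draws from Normal(mu, 1) and return their sample mean."""
    return sum(rng.gauss(mu, 1.0) for _ in range(n)) / n


def abc_rejection(observed_mean, n, tolerance, n_accept, seed=0):
    """Plain ABC rejection: keep prior draws whose simulated summary
    statistic falls within `tolerance` of the observed summary."""
    rng = random.Random(seed)
    accepted = []
    while len(accepted) < n_accept:
        mu = rng.uniform(-5.0, 5.0)  # draw from the assumed uniform prior
        if abs(simulate_mean(mu, n, rng) - observed_mean) < tolerance:
            accepted.append(mu)
    return accepted
```

The accepted values form a sample from an approximation to the posterior, which plays the role of the "posterior parameter ensemble" in the abstract. Running a model forward once per accepted value gives an ensemble of predictions rather than a single representative curve.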
Synthesis of MCMC and Belief Propagation
Energy Technology Data Exchange (ETDEWEB)
Ahn, Sungsoo [Korea Advanced Institute of Science and Technology, Daejeon (South Korea); Chertkov, Michael [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Shin, Jinwoo [Korea Advanced Institute of Science and Technology, Daejeon (South Korea)
2016-05-27
Markov Chain Monte Carlo (MCMC) and Belief Propagation (BP) are the most popular algorithms for computational inference in Graphical Models (GM). In principle, MCMC is an exact probabilistic method which, however, often suffers from exponentially slow mixing. In contrast, BP is a deterministic method, which is typically fast and empirically very successful, but in general lacks control of accuracy over loopy graphs. In this paper, we introduce MCMC algorithms correcting the approximation error of BP, i.e., we provide a way to compensate for BP errors via a consecutive BP-aware MCMC. Our framework is based on the Loop Calculus (LC) approach, which allows one to express the BP error as a sum of weighted generalized loops. Although the full series is computationally intractable, it is known that a truncated series, summing up all 2-regular loops, is computable in polynomial-time for planar pair-wise binary GMs and it also provides a highly accurate approximation empirically. Motivated by this, we first propose a polynomial-time approximation MCMC scheme for the truncated series of general (non-planar) pair-wise binary models. Our main idea here is to use the Worm algorithm, known to provide fast mixing in other (related) problems, and then design an appropriate rejection scheme to sample 2-regular loops. Furthermore, we also design an efficient rejection-free MCMC scheme for approximating the full series. The main novelty underlying our design is in utilizing the concept of cycle basis, which provides an efficient decomposition of the generalized loops. In essence, the proposed MCMC schemes run on a transformed GM built upon the non-trivial BP solution, and our experiments show that this synthesis of BP and MCMC outperforms both direct MCMC and bare BP schemes.
Protein NMR Structure Refinement based on Bayesian Inference
Ikeya, Teppei; Ikeda, Shiro; Kigawa, Takanori; Ito, Yutaka; Güntert, Peter
2016-03-01
Nuclear Magnetic Resonance (NMR) spectroscopy is a tool to investigate three-dimensional (3D) structures and dynamics of biomacromolecules at atomic resolution in solution or more natural environments such as living cells. Since NMR data are principally only spectra with peak signals, it is required to properly deduce structural information from the sparse experimental data with their imperfections and uncertainty, and to visualize 3D conformations by NMR structure calculation. In order to efficiently analyse the data, Rieping et al. proposed a new structure calculation method based on Bayes’ theorem. We implemented a similar approach into the program CYANA with some modifications. It allows us to handle automatic NOE cross peak assignments in unambiguous and ambiguous usages, and to create a prior distribution based on a physical force field with the generalized Born implicit water model. The sampling scheme for obtaining the posterior is performed by a hybrid Monte Carlo algorithm combined with Markov chain Monte Carlo (MCMC) by the Gibbs sampler, and molecular dynamics simulation (MD) for obtaining a canonical ensemble of conformations. Since it is not trivial to search the entire function space particularly for exploring the conformational prior due to the extraordinarily large conformation space of proteins, the replica exchange method is performed, in which several MCMC calculations with different temperatures run in parallel as replicas. It is shown with simulated data or randomly deleted experimental peaks that the new structure calculation method can provide accurate structures even with fewer peaks, especially compared with the conventional method. In particular, it dramatically improves in-cell structures of the proteins GB1 and TTHA1718 using exclusively information obtained in living Escherichia coli (E. coli) cells.
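The replica exchange idea named in this abstract (several MCMC chains at different temperatures, run in parallel and occasionally swapped) can be sketched on a toy target. This is not the CYANA implementation. The one-dimensional double-well energy, the temperature ladder, the step sizes, and the function names are all assumptions made for illustration.

```python
import math
import random


def energy(x):
    """Toy double-well energy with minima at x = -2 and x = +2."""
    return (x * x - 4.0) ** 2


def replica_exchange(temperatures, n_sweeps, seed=0):
    """Run one Metropolis chain per temperature and attempt one swap
    between a random pair of neighbouring temperatures per sweep.
    Returns the states visited by the coldest replica."""
    rng = random.Random(seed)
    states = [2.0 for _ in temperatures]  # all replicas start in one well
    cold_samples = []
    for _ in range(n_sweeps):
        # Within-replica random-walk Metropolis update.
        for r, temp in enumerate(temperatures):
            proposal = states[r] + rng.gauss(0.0, 0.3 * math.sqrt(temp))
            delta = energy(proposal) - energy(states[r])
            if delta <= 0.0 or rng.random() < math.exp(-delta / temp):
                states[r] = proposal
        # Swap attempt between neighbouring temperatures r and r + 1.
        r = rng.randrange(len(temperatures) - 1)
        d_beta = 1.0 / temperatures[r] - 1.0 / temperatures[r + 1]
        log_accept = d_beta * (energy(states[r]) - energy(states[r + 1]))
        if log_accept >= 0.0 or rng.random() < math.exp(log_accept):
            states[r], states[r + 1] = states[r + 1], states[r]
        cold_samples.append(states[0])
    return cold_samples
```

With a ladder such as `[1.0, 4.0, 16.0]`, the hottest replica crosses the barrier at x = 0 easily and passes configurations down to the coldest one through swaps. A single chain at temperature 1 would very rarely leave the well it started in, which is the trapping problem that replica exchange is meant to relieve.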
Jannson, Tomasz; Wang, Wenjian; Hodelin, Juan; Forrester, Thomas; Romanov, Volodymyr; Kostrzewski, Andrew
2016-05-01
In this paper, Bayesian Binary Sensing (BBS) is discussed as an effective tool for Bayesian Inference (BI) evaluation in interdisciplinary areas such as ISR (and, C3I), Homeland Security, QC, medicine, defense, and many others. In particular, Hilbertian Sine (HS) as an absolute measure of BI, is introduced, while avoiding relativity of decision threshold identification, as in the case of traditional measures of BI, related to false positives and false negatives.
Learning Bayesian networks for discrete data
Liang, Faming
2009-02-01
Bayesian networks have received much attention in the recent literature. In this article, we propose an approach to learn Bayesian networks using the stochastic approximation Monte Carlo (SAMC) algorithm. Our approach has two nice features. Firstly, it possesses the self-adjusting mechanism and thus avoids essentially the local-trap problem suffered by conventional MCMC simulation-based approaches in learning Bayesian networks. Secondly, it falls into the class of dynamic importance sampling algorithms; the network features can be inferred by dynamically weighted averaging the samples generated in the learning process, and the resulting estimates can have much lower variation than the single model-based estimates. The numerical results indicate that our approach can mix much faster over the space of Bayesian networks than the conventional MCMC simulation-based approaches. © 2008 Elsevier B.V. All rights reserved.
Sparse Bayesian Inference and the Temperature Structure of the Solar Corona
Warren, Harry P.; Byers, Jeff M.; Crump, Nicholas A.
2017-02-01
Measuring the temperature structure of the solar atmosphere is critical to understanding how it is heated to high temperatures. Unfortunately, the temperature of the upper atmosphere cannot be observed directly, but must be inferred from spectrally resolved observations of individual emission lines that span a wide range of temperatures. Such observations are “inverted” to determine the distribution of plasma temperatures along the line of sight. This inversion is ill posed and, in the absence of regularization, tends to produce wildly oscillatory solutions. We introduce the application of sparse Bayesian inference to the problem of inferring the temperature structure of the solar corona. Within a Bayesian framework a preference for solutions that utilize a minimum number of basis functions can be encoded into the prior and many ad hoc assumptions can be avoided. We demonstrate the efficacy of the Bayesian approach by considering a test library of 40 assumed temperature distributions.
Sparse Bayesian Inference and the Temperature Structure of the Solar Corona
Warren, Harry P; Crump, Nicholas A
2016-01-01
Analysis of Gumbel Model for Software Reliability Using Bayesian Paradigm
Directory of Open Access Journals (Sweden)
Raj Kumar
2012-12-01
In this paper, we illustrate the suitability of the Gumbel model for software reliability data. The model parameters are estimated using likelihood-based inferential procedures, both classical and Bayesian. The quasi Newton-Raphson algorithm is applied to obtain the maximum likelihood estimates and associated probability intervals. The Bayesian estimates of the parameters of the Gumbel model are obtained using the Markov Chain Monte Carlo (MCMC) simulation method in OpenBUGS (established software for Bayesian analysis using Markov Chain Monte Carlo methods). R functions are developed to study the statistical properties, model validation and comparison tools of the model, and the output analysis of MCMC samples generated from OpenBUGS. Details of applying MCMC to parameter estimation for the Gumbel model are elaborated, and a real software reliability data set is considered to illustrate the methods of inference discussed in this paper.
Nonparametric Bayesian inference for multidimensional compound Poisson processes
S. Gugushvili; F. van der Meulen; P. Spreij
2015-01-01
Given a sample from a discretely observed multidimensional compound Poisson process, we study the problem of nonparametric estimation of its jump size density r0 and intensity λ0. We take a nonparametric Bayesian approach to the problem and determine posterior contraction rates in this context, whic
Fancher, Chris M.; Han, Zhen; Levin, Igor; Page, Katharine; Reich, Brian J.; Smith, Ralph C.; Wilson, Alyson G.; Jones, Jacob L.
2016-01-01
A Bayesian inference method for refining crystallographic structures is presented. The distribution of model parameters is stochastically sampled using Markov chain Monte Carlo. Posterior probability distributions are constructed for all model parameters to properly quantify uncertainty by appropriately modeling the heteroskedasticity and correlation of the error structure. The proposed method is demonstrated by analyzing a National Institute of Standards and Technology silicon standard reference material. The results obtained by Bayesian inference are compared with those determined by Rietveld refinement. Posterior probability distributions of model parameters provide both estimates and uncertainties. The new method better estimates the true uncertainties in the model as compared to the Rietveld method. PMID:27550221
Institute of Scientific and Technical Information of China (English)
LIU Jianfeng; ZHANG Yuan; ZHANG Qin; WANG Lixian; ZHANG Jigang
2006-01-01
It is challenging to map Quantitative Trait Loci (QTL) underlying complex discrete traits, which usually show a discontinuous distribution and carry less information, using conventional statistical methods. The Bayesian-Markov chain Monte Carlo (Bayesian-MCMC) approach is the key procedure in mapping QTL for complex binary traits, as it provides a complete posterior distribution for the QTL parameters using all prior information. As a consequence, Bayesian estimates of all variables of interest can be obtained straightforwardly from their posterior samples simulated by the MCMC algorithm. In our study, the utility of Bayesian-MCMC is demonstrated using several simulated outbred animal full-sib families with different family structures for a complex binary trait underlain by both a QTL and a polygene. Under the Identity-by-Descent-based variance component random model, three MCMC-based samplers, namely Gibbs sampling, the Metropolis algorithm and reversible jump MCMC, were implemented to generate the joint posterior distribution of all unknowns, so that the QTL parameters were obtained by Bayesian statistical inference. The results showed that the Bayesian-MCMC approach works well and is robust under different family structures and QTL effects. As family size increases and the number of families decreases, the accuracy of the parameter estimates improves. When the true QTL has a small effect, an outbred population experimental design with large family size is the optimal mapping strategy.
Bayesian inference for a wavefront model of the Neolithisation of Europe
Baggaley, Andrew W; Shukurov, Anvar; Boys, Richard J; Golightly, Andrew
2012-01-01
We consider a wavefront model for the spread of Neolithic culture across Europe, and use Bayesian inference techniques to provide estimates for the parameters within this model, as constrained by radiocarbon data from Southern and Western Europe. Our wavefront model allows for both an isotropic background spread (incorporating the effects of local geography), and a localized anisotropic spread associated with major waterways. We introduce an innovative numerical scheme to track the wavefront, allowing us to simulate the times of the first arrival at any site orders of magnitude more efficiently than traditional PDE approaches. We adopt a Bayesian approach to inference and use Gaussian process emulators to facilitate further increases in efficiency in the inference scheme, thereby making Markov chain Monte Carlo methods practical. We allow for uncertainty in the fit of our model, and also infer a parameter specifying the magnitude of this uncertainty. We obtain a magnitude for the background spread of order 1 ...
Clustered nested sampling: efficient Bayesian inference for cosmology
Shaw, R; Hobson, M P
2007-01-01
Bayesian model selection provides the cosmologist with an exacting tool to distinguish between competing models based purely on the data, via the Bayesian evidence. Previous methods to calculate this quantity either lacked general applicability or were computationally demanding. However, nested sampling (Skilling 2004), which was recently applied successfully to cosmology by Mukherjee et al. 2006, overcomes both of these impediments. Their implementation restricts the parameter space sampled, and thus improves the efficiency, using a decreasing ellipsoidal bound in the $n$-dimensional parameter space centred on the maximum likelihood point. However, if the likelihood function contains any multi-modality, then the ellipse is prevented from constraining the sampling region efficiently. In this paper we introduce a method of clustered ellipsoidal nested sampling which can form multiple ellipses around each individual peak in the likelihood. In addition we have implemented a method for determining the expectation...
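To make the evidence calculation concrete, here is a minimal sketch of basic nested sampling on a toy one-dimensional problem. It does not implement the ellipsoidal bound or the clustering described in the abstract: replacement points are found by simple rejection from the prior, which is only workable for small problems. The Gaussian likelihood, the uniform prior on [0, 1], and the function names are assumptions made for illustration.

```python
import math
import random


def likelihood(x):
    """Toy likelihood: Gaussian centred at 0.5 with standard deviation 0.1."""
    sigma = 0.1
    norm = sigma * math.sqrt(2.0 * math.pi)
    return math.exp(-0.5 * ((x - 0.5) / sigma) ** 2) / norm


def nested_sampling_evidence(n_live=100, n_iter=600, seed=0):
    """Estimate the evidence Z = integral of L(x) over a Uniform(0, 1) prior."""
    rng = random.Random(seed)
    live = [rng.random() for _ in range(n_live)]
    live_l = [likelihood(x) for x in live]
    evidence = 0.0
    x_prev = 1.0  # prior mass enclosed by the current likelihood contour
    for i in range(1, n_iter + 1):
        worst = min(range(n_live), key=live_l.__getitem__)
        l_star = live_l[worst]
        x_i = math.exp(-i / n_live)  # expected shrinkage of the prior mass
        evidence += l_star * (x_prev - x_i)
        x_prev = x_i
        # Replace the worst point by a prior draw with L > l_star.
        while True:
            candidate = rng.random()
            l_candidate = likelihood(candidate)
            if l_candidate > l_star:
                break
        live[worst] = candidate
        live_l[worst] = l_candidate
    # Add the contribution of the remaining live points.
    evidence += x_prev * sum(live_l) / n_live
    return evidence
```

Because almost all of the Gaussian mass lies inside [0, 1], the true evidence for this toy case is very close to 1. The inner rejection loop becomes slower as the contour shrinks, since fewer prior draws land inside it; bounding the live points with one or more ellipsoids is a way to reduce that cost.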
Declarative Modeling and Bayesian Inference of Dark Matter Halos
Kronberger, Gabriel
2013-01-01
Probabilistic programming allows specification of probabilistic models in a declarative manner. Recently, several new software systems and languages for probabilistic programming have been developed on the basis of newly developed and improved methods for approximate inference in probabilistic models. In this contribution a probabilistic model for an idealized dark matter localization problem is described. We first derive the probabilistic model for the inference of dark matter locations and masses, and then show how this model can be implemented using BUGS and Infer.NET, two software systems for probabilistic programming. Finally, the different capabilities of both systems are discussed. The presented dark matter model includes mainly non-conjugate factors, thus, it is difficult to implement this model with Infer.NET.
Strategies for MCMC computation in quantitative genetics
DEFF Research Database (Denmark)
Waagepetersen, Rasmus; Ibánez, N.; Sorensen, Daniel
2006-01-01
but with a sparse inverse. Maximum likelihood inference and Bayesian inference for the linear mixed model are well-studied topics (Sorensen and Gianola, 2002). Regarding Bayesian inference, with appropriate choice of priors, the full conditional distributions are standard distributions and Gibbs sampling can...
Energy Technology Data Exchange (ETDEWEB)
Kim, Joo Yeon; Lee, Seung Hyun; Park, Tai Jin [Korean Association for Radiation Application, Seoul (Korea, Republic of)
2016-06-15
Any real application of Bayesian inference must acknowledge that both the prior distribution and the likelihood function have only been specified as more or less convenient approximations to whatever the analyzer's true belief might be. If the inferences from the Bayesian analysis are to be trusted, it is important to determine that they are robust to such variations of prior and likelihood as might also be consistent with the analyzer's stated beliefs. Robust Bayesian inference was applied to atmospheric dispersion assessment using the Gaussian plume model. The scope of the contaminations was specified as the uncertainty in distribution type and the parametric variability. The probability distribution of the model parameters was assumed to be contaminated by symmetric unimodal and unimodal distributions. The distribution of the sector-averaged relative concentrations was then calculated by applying the contaminated priors to the model parameters. The sector-averaged concentrations for stability class were compared by applying the symmetric unimodal and unimodal priors, respectively, as the contaminating ones based on the class of ε-contamination. Although ε was assumed to be 10%, the medians obtained with the symmetric unimodal priors agreed to within 10% with those obtained with the plausible priors. However, the medians obtained with the unimodal priors agreed only to within 20% with those obtained with the plausible priors at a few downwind distances. The robustness question was addressed by estimating how robust the results of the Bayesian inferences are to reasonable variations of the plausible priors. From these robust inferences, it is reasonable to apply the symmetric unimodal priors for analyzing the robustness of the Bayesian inferences.
Boos, Moritz; Seer, Caroline; Lange, Florian; Kopp, Bruno
2016-01-01
Cognitive determinants of probabilistic inference were examined using hierarchical Bayesian modeling techniques. A classic urn-ball paradigm served as experimental strategy, involving a factorial two (prior probabilities) by two (likelihoods) design. Five computational models of cognitive processes were compared with the observed behavior. Parameter-free Bayesian posterior probabilities and parameter-free base rate neglect provided inadequate models of probabilistic inference. The introduction of distorted subjective probabilities yielded more robust and generalizable results. A general class of (inverted) S-shaped probability weighting functions had been proposed; however, the possibility of large differences in probability distortions not only across experimental conditions, but also across individuals, seems critical for the model's success. It also seems advantageous to consider individual differences in parameters of probability weighting as being sampled from weakly informative prior distributions of individual parameter values. Thus, the results from hierarchical Bayesian modeling converge with previous results in revealing that probability weighting parameters show considerable task dependency and individual differences. Methodologically, this work exemplifies the usefulness of hierarchical Bayesian modeling techniques for cognitive psychology. Theoretically, human probabilistic inference might be best described as the application of individualized strategic policies for Bayesian belief revision.
Directory of Open Access Journals (Sweden)
Moritz Boos
2016-05-01
Cognitive determinants of probabilistic inference were examined using hierarchical Bayesian modelling techniques. A classic urn-ball paradigm served as experimental strategy, involving a factorial two (prior probabilities) by two (likelihoods) design. Five computational models of cognitive processes were compared with the observed behaviour. Parameter-free Bayesian posterior probabilities and parameter-free base rate neglect provided inadequate models of probabilistic inference. The introduction of distorted subjective probabilities yielded more robust and generalizable results. A general class of (inverted) S-shaped probability weighting functions had been proposed; however, the possibility of large differences in probability distortions not only across experimental conditions, but also across individuals, seems critical for the model’s success. It also seems advantageous to consider individual differences in parameters of probability weighting as being sampled from weakly informative prior distributions of individual parameter values. Thus, the results from hierarchical Bayesian modelling converge with previous results in revealing that probability weighting parameters show considerable task dependency and individual differences. Methodologically, this work exemplifies the usefulness of hierarchical Bayesian modelling techniques for cognitive psychology. Theoretically, human probabilistic inference might be best described as the application of individualized strategic policies for Bayesian belief revision.
Perkins, Simon; Zwart, Jonathan; Natarajan, Iniyan; Smirnov, Oleg
2015-01-01
We present Montblanc, a GPU implementation of the Radio interferometer measurement equation (RIME) in support of the Bayesian inference for radio observations (BIRO) technique. BIRO uses Bayesian inference to select sky models that best match the visibilities observed by a radio interferometer. To accomplish this, BIRO evaluates the RIME multiple times, varying sky model parameters to produce multiple model visibilities. Chi-squared values computed from the model and observed visibilities are used as likelihood values to drive the Bayesian sampling process and select the best sky model. As most of the elements of the RIME and chi-squared calculation are independent of one another, they are highly amenable to parallel computation. Additionally, Montblanc caters for iterative RIME evaluation to produce multiple chi-squared values. Only modified model parameters are transferred to the GPU between each iteration. We implemented Montblanc as a Python package based upon NVIDIA's CUDA architecture. As such, it is ea...
DEFF Research Database (Denmark)
Møller, Jesper; Jacobsen, Robert Dahl
Gaussian, and the wavelet coefficients have log-variances equal to the hidden states. We argue why this provides a flexible model where frequentist and Bayesian inference procedures become tractable for estimation of parameters and hidden states. Our methodology is illustrated for denoising and edge...... detection problems in two-dimensional images....
Bayesian Inference and Prediction in an M/G/1 with Optional Second Service
Mohammadi, A.; Salehi-Rad, M. R.
2012-01-01
In this article, we exploit the Bayesian inference and prediction for an M/G/1 queuing model with optional second re-service. In this model, a service unit attends customers arriving following a Poisson process and demanding service according to a general distribution and some of customers need to r
Explaining Inference on a Population of Independent Agents Using Bayesian Networks
Sutovsky, Peter
2013-01-01
The main goal of this research is to design, implement, and evaluate a novel explanation method, the hierarchical explanation method (HEM), for explaining Bayesian network (BN) inference when the network is modeling a population of conditionally independent agents, each of which is modeled as a subnetwork. For example, consider disease-outbreak…
Using SAS PROC MCMC for Item Response Theory Models
Ames, Allison J.; Samonte, Kelli
2015-01-01
Interest in using Bayesian methods for estimating item response theory models has grown at a remarkable rate in recent years. This attentiveness to Bayesian estimation has also inspired a growth in available software such as WinBUGS, R packages, BMIRT, MPLUS, and SAS PROC MCMC. This article intends to provide an accessible overview of Bayesian…
Bayesian Inference of Empirical Coefficient for Foundation Settlement
Institute of Scientific and Technical Information of China (English)
LI Zhen-yu; WANG Yong-he; YANG Guo-lin
2009-01-01
A new approach based on Bayesian theory is proposed to determine the empirical coefficient in soil settlement calculation. The prior distribution is assumed to be uniform on [0.2, 1.4]. The posterior density function is developed by combining the prior distribution with the information from observed samples at four locations on a passenger dedicated line. The results show that the posterior distribution of the empirical coefficient follows a Gaussian distribution. The mean value of the empirical coefficient decreases gradually with increasing load on the ground, while the variation of the variance shows no regularity.
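A bounded uniform prior combined with a handful of observations can be worked through on a grid. The sketch below is an assumption-laden illustration, not the authors' model: the abstract does not state the likelihood, so a Gaussian measurement error with known standard deviation is assumed, and the four observed coefficient values are hypothetical. Only the prior range [0.2, 1.4] comes from the abstract.

```python
import math


def grid_posterior(observations, sigma, lower=0.2, upper=1.4, n_grid=1201):
    """Posterior of a coefficient theta on a grid, for a Uniform(lower, upper)
    prior and an assumed Gaussian likelihood with known sigma."""
    step = (upper - lower) / (n_grid - 1)
    grid = [lower + i * step for i in range(n_grid)]
    # The uniform prior is constant on the grid, so only the
    # log-likelihood matters inside [lower, upper].
    log_post = [
        -0.5 * sum(((y - theta) / sigma) ** 2 for y in observations)
        for theta in grid
    ]
    peak = max(log_post)
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    return grid, [w / total for w in weights]


def posterior_mean_sd(grid, probs):
    """Mean and standard deviation of a discrete posterior."""
    mean = sum(t * p for t, p in zip(grid, probs))
    var = sum((t - mean) ** 2 * p for t, p in zip(grid, probs))
    return mean, math.sqrt(var)
```

With hypothetical observations such as `[0.82, 0.91, 0.78, 0.88]` and `sigma = 0.1`, the posterior mass sits well inside the prior bounds, so the result is effectively a Gaussian centred on the sample mean. Under these assumptions, that agrees with the abstract's remark that the posterior is Gaussian despite the uniform prior.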
Progress on Bayesian Inference of the Fast Ion Distribution Function
DEFF Research Database (Denmark)
Stagner, L.; Heidbrink, W.W.; Chen, X.;
2013-01-01
The fast-ion distribution function (DF) has a complicated dependence on several phase-space variables. The standard analysis procedure in energetic particle research is to compute the DF theoretically, use that DF in forward modeling to predict diagnostic signals, then compare with measured data...... sensitivity of the measurements are incorporated into Bayesian likelihood probabilities. Prior probabilities describe physical constraints. This poster will show reconstructions of classically described, low-power, MHD-quiescent distribution functions from actual FIDA measurements. A description of the full...
DEFF Research Database (Denmark)
Iglesias, Juan Eugenio; Sabuncu, Mert Rory; Van Leemput, Koen
2013-01-01
Many segmentation algorithms in medical image analysis use Bayesian modeling to augment local image appearance with prior anatomical knowledge. Such methods often contain a large number of free parameters that are first estimated and then kept fixed during the actual segmentation process. However......, a faithful Bayesian analysis would marginalize over such parameters, accounting for their uncertainty by considering all possible values they may take. Here we propose to incorporate this uncertainty into Bayesian segmentation methods in order to improve the inference process. In particular, we approximate...... the required marginalization over model parameters using computationally efficient Markov chain Monte Carlo techniques. We illustrate the proposed approach using a recently developed Bayesian method for the segmentation of hippocampal subfields in brain MRI scans, showing a significant improvement...
Iglesias, Juan Eugenio; Sabuncu, Mert Rory; Van Leemput, Koen
2013-10-01
Many segmentation algorithms in medical image analysis use Bayesian modeling to augment local image appearance with prior anatomical knowledge. Such methods often contain a large number of free parameters that are first estimated and then kept fixed during the actual segmentation process. However, a faithful Bayesian analysis would marginalize over such parameters, accounting for their uncertainty by considering all possible values they may take. Here we propose to incorporate this uncertainty into Bayesian segmentation methods in order to improve the inference process. In particular, we approximate the required marginalization over model parameters using computationally efficient Markov chain Monte Carlo techniques. We illustrate the proposed approach using a recently developed Bayesian method for the segmentation of hippocampal subfields in brain MRI scans, showing a significant improvement in an Alzheimer's disease classification task. As an additional benefit, the technique also allows one to compute informative "error bars" on the volume estimates of individual structures.
Bayesian inference and model comparison for metallic fatigue data
Babuška, Ivo
2016-02-23
In this work, we present a statistical treatment of stress-life (S-N) data drawn from a collection of records of fatigue experiments that were performed on 75S-T6 aluminum alloys. Our main objective is to predict the fatigue life of materials by providing a systematic approach to model calibration, model selection and model ranking with reference to S-N data. To this purpose, we consider fatigue-limit models and random fatigue-limit models that are specially designed to allow the treatment of the run-outs (right-censored data). We first fit the models to the data by maximum likelihood methods and estimate the quantiles of the life distribution of the alloy specimen. To assess the robustness of the estimation of the quantile functions, we obtain bootstrap confidence bands by stratified resampling with respect to the cycle ratio. We then compare and rank the models by classical measures of fit based on information criteria. We also consider a Bayesian approach that provides, under the prior distribution of the model parameters selected by the user, their simulation-based posterior distributions. We implement and apply Bayesian model comparison methods, such as Bayes factor ranking and predictive information criteria based on cross-validation techniques under various a priori scenarios.
Bayesian inference analyses of the polygenic architecture of rheumatoid arthritis
Stahl, Eli A.; Wegmann, Daniel; Trynka, Gosia; Gutierrez-Achury, Javier; Do, Ron; Voight, Benjamin F.; Kraft, Peter; Chen, Robert; Kallberg, Henrik J.; Kurreeman, Fina A. S.; Kathiresan, Sekar; Wijmenga, Cisca; Gregersen, Peter K.; Alfredsson, Lars; Siminovitch, Katherine A.; Worthington, Jane; de Bakker, Paul I. W.; Raychaudhuri, Soumya; Plenge, Robert M.
2012-01-01
The genetic architectures of common, complex diseases are largely uncharacterized. We modeled the genetic architecture underlying genome-wide association study (GWAS) data for rheumatoid arthritis and developed a new method using polygenic risk-score analyses to infer the total liability-scale varia
Bayesian Computation Methods for Inferring Regulatory Network Models Using Biomedical Data.
Tian, Tianhai
2016-01-01
The rapid advancement of high-throughput technologies provides huge amounts of information on gene expression and protein activity at the genome-wide scale. The availability of genomics, transcriptomics, proteomics, and metabolomics datasets gives an unprecedented opportunity to study detailed molecular regulation, which is very important to precision medicine. However, it is still a significant challenge to design effective and efficient methods to infer the network structure and dynamic properties of regulatory networks. In recent years a number of computing methods have been designed to explore the regulatory mechanisms as well as estimate unknown model parameters. Among them, the Bayesian inference method can combine both prior knowledge and experimental data to generate updated information regarding the regulatory mechanisms. This chapter gives a brief review of Bayesian statistical methods that are used to infer the network structure and estimate model parameters based on experimental data.
Strelioff, Christopher C; Crutchfield, James P; Hübler, Alfred W
2007-07-01
Markov chains are a natural and well understood tool for describing one-dimensional patterns in time or space. We show how to infer kth order Markov chains, for arbitrary k, from finite data by applying Bayesian methods to both parameter estimation and model-order selection. Extending existing results for multinomial models of discrete data, we connect inference to statistical mechanics through information-theoretic (type theory) techniques. We establish a direct relationship between Bayesian evidence and the partition function which allows for straightforward calculation of the expectation and variance of the conditional relative entropy and the source entropy rate. Finally, we introduce a method that uses finite data-size scaling with model-order comparison to infer the structure of out-of-class processes.
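Parameter estimation and model-order selection for a kth-order chain can be sketched with a conjugate prior. The code below is a minimal sketch that assumes an independent symmetric Dirichlet prior (concentration `alpha`) on each context's transition row; the function names and the toy sequence are hypothetical and are not taken from the paper. All orders are scored on the same positions (from `start` onward) so that their evidences are comparable.

```python
import math
from collections import defaultdict

def transition_counts(seq, k, start):
    """Count the symbols that follow each length-k context, for positions >= start."""
    counts = defaultdict(lambda: defaultdict(int))
    for i in range(start, len(seq)):
        context = tuple(seq[i - k:i])  # empty tuple when k == 0
        counts[context][seq[i]] += 1
    return counts

def log_evidence(seq, k, alphabet, alpha=1.0, start=None):
    """Log marginal likelihood of a k-th order chain under symmetric
    Dirichlet(alpha) priors on every context's transition probabilities."""
    start = k if start is None else start
    a = len(alphabet)
    total = 0.0
    for row in transition_counts(seq, k, start).values():
        n = sum(row.values())
        total += math.lgamma(a * alpha) - math.lgamma(n + a * alpha)
        total += sum(math.lgamma(row.get(s, 0) + alpha) - math.lgamma(alpha)
                     for s in alphabet)
    return total

def posterior_mean(seq, k, alphabet, alpha=1.0):
    """Posterior mean transition probabilities for each observed context."""
    a = len(alphabet)
    out = {}
    for context, row in transition_counts(seq, k, k).items():
        n = sum(row.values())
        out[context] = {s: (row.get(s, 0) + alpha) / (n + a * alpha)
                        for s in alphabet}
    return out

# Toy data: a strictly alternating binary sequence (first-order structure).
seq = [i % 2 for i in range(40)]
k_max = 2
scores = {k: log_evidence(seq, k, [0, 1], start=k_max) for k in range(k_max + 1)}
print(scores)
print(posterior_mean(seq, 1, [0, 1]))
```

On this toy sequence the zeroth-order model has much lower evidence than the first-order model, while the second-order model ties with the first-order one because the data never distinguish them; a prior over orders would be needed to prefer the simpler model.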
Learning an Astronomical Catalog of the Visible Universe through Scalable Bayesian Inference
Regier, Jeffrey; Giordano, Ryan; Thomas, Rollin; Schlegel, David; McAuliffe, Jon; Prabhat,
2016-01-01
Celeste is a procedure for inferring astronomical catalogs that attains state-of-the-art scientific results. To date, Celeste has been scaled to at most hundreds of megabytes of astronomical images: Bayesian posterior inference is notoriously demanding computationally. In this paper, we report on a scalable, parallel version of Celeste, suitable for learning catalogs from modern large-scale astronomical datasets. Our algorithmic innovations include a fast numerical optimization routine for Bayesian posterior inference and a statistically efficient scheme for decomposing astronomical optimization problems into subproblems. Our scalable implementation is written entirely in Julia, a new high-level dynamic programming language designed for scientific and numerical computing. We use Julia's high-level constructs for shared and distributed memory parallelism, and demonstrate effective load balancing and efficient scaling on up to 8192 Xeon cores on the NERSC Cori supercomputer.
Hierarchical Bayesian inference of galaxy redshift distributions from photometric surveys
Leistedt, Boris; Peiris, Hiranya V
2016-01-01
Accurately characterizing the redshift distributions of galaxies is essential for analysing deep photometric surveys and testing cosmological models. We present a technique to simultaneously infer redshift distributions and individual redshifts from photometric galaxy catalogues. Our model constructs a piecewise constant representation (effectively a histogram) of the distribution of galaxy types and redshifts, the parameters of which are efficiently inferred from noisy photometric flux measurements. This approach can be seen as a generalization of template-fitting photometric redshift methods and relies on a library of spectral templates to relate the photometric fluxes of individual galaxies to their redshifts. We illustrate this technique on simulated galaxy survey data, and demonstrate that it delivers correct posterior distributions on the underlying type and redshift distributions, as well as on the individual types and redshifts of galaxies. We show that even with uninformative priors, large photometri...
Directory of Open Access Journals (Sweden)
A. A. Zolotin
2015-07-01
Full Text Available Posteriori inference is one of the three kinds of probabilistic-logic inference in the theory of probabilistic graphical models and the basis for processing knowledge patterns with probabilistic uncertainty using Bayesian networks. The paper deals with the task of describing local posteriori inference in algebraic Bayesian networks, a class of probabilistic graphical models, by means of matrix-vector equations. These equations rely essentially on the tensor product of matrices, the Kronecker power, and the Hadamard product. Matrix equations are obtained for calculating vectors of posteriori probabilities within posteriori inference in knowledge patterns with quanta propositions. Equations of the same type have already been discussed within the theory of algebraic Bayesian networks, but they were built only for posteriori inference in knowledge patterns on ideals of conjuncts. During the synthesis and development of matrix-vector equations on probability vectors of quanta propositions, a number of earlier results concerning normalizing factors in posteriori inference and the assignment of a linear projective operator with a selector vector were adapted. We consider all three types of incoming evidence (deterministic, stochastic, and inaccurate) combined with scalar and interval estimates of the probability of truth of propositional formulas in the knowledge patterns. Linear programming problems are formed; their solution gives the desired interval values of posterior probabilities in the case of inaccurate evidence or interval estimates in a knowledge pattern. This description of posteriori inference makes it possible to extend the set of knowledge pattern types that can be used in local and global posteriori inference, as well as to simplify complex software implementation by using existing third-party libraries that effectively support the representation and processing of matrices and vectors when
Johnson, Eric D; Tubau, Elisabet
2016-09-27
Presenting natural frequencies facilitates Bayesian inferences relative to using percentages. Nevertheless, many people, including highly educated and skilled reasoners, still fail to provide Bayesian responses to these computationally simple problems. We show that the complexity of relational reasoning (e.g., the structural mapping between the presented and requested relations) can help explain the remaining difficulties. With a non-Bayesian inference that required identical arithmetic but afforded a more direct structural mapping, performance was universally high. Furthermore, reducing the relational demands of the task through questions that directed reasoners to use the presented statistics, as compared with questions that prompted the representation of a second, similar sample, also significantly improved reasoning. Distinct error patterns were also observed between these presented- and similar-sample scenarios, which suggested differences in relational-reasoning strategies. On the other hand, while higher numeracy was associated with better Bayesian reasoning, higher-numerate reasoners were not immune to the relational complexity of the task. Together, these findings validate the relational-reasoning view of Bayesian problem solving and highlight the importance of considering not only the presented task structure, but also the complexity of the structural alignment between the presented and requested relations.
Natural frequencies improve Bayesian reasoning in simple and complex inference tasks.
Hoffrage, Ulrich; Krauss, Stefan; Martignon, Laura; Gigerenzer, Gerd
2015-01-01
Representing statistical information in terms of natural frequencies rather than probabilities improves performance in Bayesian inference tasks. This beneficial effect of natural frequencies has been demonstrated in a variety of applied domains such as medicine, law, and education. Yet all the research and applications so far have been limited to situations where one dichotomous cue is used to infer which of two hypotheses is true. Real-life applications, however, often involve situations where cues (e.g., medical tests) have more than one value, where more than two hypotheses (e.g., diseases) are considered, or where more than one cue is available. In Study 1, we show that natural frequencies, compared to information stated in terms of probabilities, consistently increase the proportion of Bayesian inferences made by medical students in four conditions (three cue values, three hypotheses, two cues, or three cues) by an average of 37 percentage points. In Study 2, we show that teaching natural frequencies for simple tasks with one dichotomous cue and two hypotheses leads to a transfer of learning to complex tasks with three cue values and two cues, with a proportion of 40 and 81% correct inferences, respectively. Thus, natural frequencies facilitate Bayesian reasoning in a much broader class of situations than previously thought.
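The difference between the two presentation formats can be made concrete with the simplest case of one dichotomous cue and two hypotheses. The sketch below uses hypothetical numbers (1% base rate, 80% hit rate, 10% false-alarm rate, reference population of 1000) that are not taken from the study; it only shows that both formats encode the same Bayesian computation.

```python
from fractions import Fraction

def posterior_from_probabilities(prior, hit_rate, false_alarm_rate):
    """Bayes' rule stated with probabilities."""
    numerator = prior * hit_rate
    return numerator / (numerator + (1 - prior) * false_alarm_rate)

def posterior_from_natural_frequencies(population, prior, hit_rate, false_alarm_rate):
    """The same inference stated as counts in a reference population."""
    with_condition = population * prior
    true_positives = with_condition * hit_rate
    false_positives = (population - with_condition) * false_alarm_rate
    return true_positives / (true_positives + false_positives)

# Hypothetical illustrative values, kept exact with Fraction.
prior = Fraction(1, 100)
hit_rate = Fraction(80, 100)
false_alarm_rate = Fraction(10, 100)

p_prob = posterior_from_probabilities(prior, hit_rate, false_alarm_rate)
p_freq = posterior_from_natural_frequencies(1000, prior, hit_rate, false_alarm_rate)
print(p_prob, p_freq)
```

In natural-frequency terms: of 1000 people, 10 have the condition and 8 of them test positive; 99 of the remaining 990 also test positive, so 8 of the 107 people with a positive test have the condition.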
Large-Scale Optimization for Bayesian Inference in Complex Systems
Energy Technology Data Exchange (ETDEWEB)
Willcox, Karen [MIT; Marzouk, Youssef [MIT
2013-11-12
The SAGUARO (Scalable Algorithms for Groundwater Uncertainty Analysis and Robust Optimization) Project focused on the development of scalable numerical algorithms for large-scale Bayesian inversion in complex systems that capitalize on advances in large-scale simulation-based optimization and inversion methods. The project was a collaborative effort among MIT, the University of Texas at Austin, Georgia Institute of Technology, and Sandia National Laboratories. The research was directed in three complementary areas: efficient approximations of the Hessian operator, reductions in complexity of forward simulations via stochastic spectral approximations and model reduction, and employing large-scale optimization concepts to accelerate sampling. The MIT-Sandia component of the SAGUARO Project addressed the intractability of conventional sampling methods for large-scale statistical inverse problems by devising reduced-order models that are faithful to the full-order model over a wide range of parameter values; sampling then employs the reduced model rather than the full model, resulting in very large computational savings. Results indicate little effect on the computed posterior distribution. On the other hand, in the Texas-Georgia Tech component of the project, we retain the full-order model, but exploit inverse problem structure (adjoint-based gradients and partial Hessian information of the parameter-to-observation map) to implicitly extract lower dimensional information on the posterior distribution; this greatly speeds up sampling methods, so that fewer sampling points are needed. We can think of these two approaches as "reduce then sample" and "sample then reduce." In fact, these two approaches are complementary, and can be used in conjunction with each other. Moreover, they both exploit deterministic inverse problem structure, in the form of adjoint-based gradient and Hessian information of the underlying parameter-to-observation map, to
Directory of Open Access Journals (Sweden)
Ali Ahmed
2011-01-01
Full Text Available Problem statement: Similarity-based Virtual Screening (VS) deals with a large amount of data containing irrelevant and/or redundant fragments or features. The recent use of Bayesian networks as an alternative to existing tools for similarity-based VS has received noticeable attention from researchers in the field of chemoinformatics. Approach: To this end, different models of Bayesian network have been developed. In this study, we enhance the Bayesian Inference Network (BIN) using a subset of selected molecular features. Results: In this approach, a few features were filtered from the molecular fingerprint features based on a feature selection approach. Conclusion: Simulated virtual screening experiments with MDL Drug Data Report (MDDR) data sets showed that the proposed method provides simple ways of enhancing the cost effectiveness of ligand-based virtual screening searches, especially for higher diversity data sets.
Wavelet-Bayesian inference of cosmic strings embedded in the cosmic microwave background
McEwen, J D; Peiris, H V; Wiaux, Y; Ringeval, C; Bouchet, F R
2016-01-01
Cosmic strings are a well-motivated extension to the standard cosmological model and could induce a subdominant component in the anisotropies of the cosmic microwave background (CMB), in addition to the standard inflationary component. The detection of strings, while observationally challenging, would provide a direct probe of physics at very high energy scales. We develop a new framework for cosmic string inference, constructing a Bayesian analysis in wavelet space where the string-induced CMB component has distinct statistical properties to the standard inflationary component. Our wavelet-Bayesian framework provides a principled approach to compute the posterior distribution of the string tension $G\mu$ and the Bayesian evidence ratio comparing the string model to the standard inflationary model. Furthermore, we present a technique to recover an estimate of any string-induced CMB map embedded in observational data. Using Planck-like simulations we demonstrate the application of our framework and evaluate it...
Bayesian Spatial Modelling with R-INLA
Finn Lindgren; Håvard Rue
2015-01-01
The principles behind the interface to continuous domain spatial models in the R-INLA software package for R are described. The integrated nested Laplace approximation (INLA) approach proposed by Rue, Martino, and Chopin (2009) is a computationally effective alternative to MCMC for Bayesian inference. INLA is designed for latent Gaussian models, a very wide and flexible class of models ranging from (generalized) linear mixed to spatial and spatio-temporal models. Combined with the stochastic...
DEFF Research Database (Denmark)
Møller, Jesper
2010-01-01
Chapter 9: This contribution concerns statistical inference for parametric models used in stochastic geometry and based on quick and simple simulation-free procedures as well as more comprehensive methods based on a maximum likelihood or Bayesian approach combined with Markov chain Monte Carlo...
2017-01-01
Gene regulatory networks (GRNs) play an important role in cellular systems and are important for understanding biological processes. Many algorithms have been developed to infer the GRNs. However, most algorithms only pay attention to the gene expression data but do not consider the topology information in their inference process, while incorporating this information can partially compensate for the lack of reliable expression data. Here we develop a Bayesian group lasso with spike and slab priors to perform gene selection and estimation for nonparametric models. B-spline basis functions are used to capture the nonlinear relationships flexibly and penalties are used to avoid overfitting. Further, we incorporate the topology information into the Bayesian method as a prior. We present the application of our method on DREAM3 and DREAM4 datasets and two real biological datasets. The results show that our method performs better than existing methods and the topology information prior can improve the result. PMID:28133490
Directory of Open Access Journals (Sweden)
Yue Fan
2017-01-01
Full Text Available Gene regulatory networks (GRNs) play an important role in cellular systems and are important for understanding biological processes. Many algorithms have been developed to infer the GRNs. However, most algorithms only pay attention to the gene expression data but do not consider the topology information in their inference process, while incorporating this information can partially compensate for the lack of reliable expression data. Here we develop a Bayesian group lasso with spike and slab priors to perform gene selection and estimation for nonparametric models. B-spline basis functions are used to capture the nonlinear relationships flexibly and penalties are used to avoid overfitting. Further, we incorporate the topology information into the Bayesian method as a prior. We present the application of our method on DREAM3 and DREAM4 datasets and two real biological datasets. The results show that our method performs better than existing methods and the topology information prior can improve the result.
Ball, William T; Egerton, Jack S; Haigh, Joanna D
2014-01-01
We investigate the relationship between spectral solar irradiance (SSI) and ozone in the tropical upper stratosphere. We find that solar cycle (SC) changes in ozone can be well approximated by considering the ozone response to SSI changes in a small number of individual wavelength bands between 176 and 310 nm, operating independently of each other. Additionally, we find that the ozone varies approximately linearly with changes in the SSI. Using these facts, we present a Bayesian formalism for inferring SC SSI changes and uncertainties from measured SC ozone profiles. Bayesian inference is a powerful, mathematically self-consistent method of considering both the uncertainties of the data and additional external information to provide the best estimate of parameters being estimated. Using this method, we show that, given measurement uncertainties in both ozone and SSI datasets, it is not currently possible to distinguish between observed or modelled SSI datasets using available estimates of ozone change profiles, ...
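Because the abstract states that ozone responds approximately linearly to SSI changes, the core of such an inversion can be illustrated with a conjugate Gaussian update. The sketch below treats a single wavelength band with a known linear sensitivity; the Gaussian prior, the known noise variance, the function name, and all numbers are assumptions made for illustration and do not reproduce the authors' multi-band formalism.

```python
def linear_gaussian_update(prior_mean, prior_var, sensitivity, observations, noise_var):
    """Posterior mean and variance of x when each observation is
    y = sensitivity * x + noise, with a Gaussian prior on x and
    independent Gaussian noise of known variance (assumed model)."""
    precision = 1.0 / prior_var + len(observations) * sensitivity ** 2 / noise_var
    post_var = 1.0 / precision
    post_mean = post_var * (prior_mean / prior_var
                            + sensitivity * sum(observations) / noise_var)
    return post_mean, post_var

# Hypothetical values: x is the SSI change in one band (arbitrary units),
# observations are ozone changes measured at three altitudes.
post_mean, post_var = linear_gaussian_update(
    prior_mean=0.0, prior_var=4.0,
    sensitivity=0.5, observations=[1.0, 1.2, 0.8], noise_var=0.25)
print(post_mean, post_var)
```

The posterior variance shrinks as observations accumulate or as their noise decreases; when the ozone measurement uncertainty is large relative to the sensitivity, the posterior stays broad, which is the situation in which competing SSI datasets cannot be told apart.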
Dorn, C; Khan, A; Heng, K; Alibert, Y; Helled, R; Rivoldini, A; Benz, W
2016-01-01
We aim to present a generalized Bayesian inference method for constraining interiors of super Earths and sub-Neptunes. Our methodology succeeds in quantifying the degeneracy and correlation of structural parameters for high dimensional parameter spaces. Specifically, we identify what constraints can be placed on composition and thickness of core, mantle, ice, ocean, and atmospheric layers given observations of mass, radius, and bulk refractory abundance constraints (Fe, Mg, Si) from observations of the host star's photospheric composition. We employed a full probabilistic Bayesian inference analysis that formally accounts for observational and model uncertainties. Using a Markov chain Monte Carlo technique, we computed joint and marginal posterior probability distributions for all structural parameters of interest. We included state-of-the-art structural models based on self-consistent thermodynamics of core, mantle, high-pressure ice, and liquid water. Furthermore, we tested and compared two different atmosp...
Prudhomme, Serge
2015-09-17
Parameter estimation for complex models using Bayesian inference is usually a very costly process as it requires a large number of solves of the forward problem. We show here how the construction of adaptive surrogate models using a posteriori error estimates for quantities of interest can significantly reduce the computational cost in problems of statistical inference. As surrogate models provide only approximations of the true solutions of the forward problem, it is nevertheless necessary to control these errors in order to construct an accurate reduced model with respect to the observables utilized in the identification of the model parameters. Effectiveness of the proposed approach is demonstrated on a numerical example dealing with the Spalart–Allmaras model for the simulation of turbulent channel flows. In particular, we illustrate how Bayesian model selection using the adapted surrogate model in place of solving the coupled nonlinear equations leads to the same quality of results while requiring fewer nonlinear PDE solves.
VIGoR: Variational Bayesian Inference for Genome-Wide Regression
Directory of Open Access Journals (Sweden)
Akio Onogi
2016-04-01
Full Text Available Genome-wide regression using a number of genome-wide markers as predictors is now widely used for genome-wide association mapping and genomic prediction. We developed novel software for genome-wide regression which we named VIGoR (variational Bayesian inference for genome-wide regression). Variational Bayesian inference is computationally much faster than widely used Markov chain Monte Carlo algorithms. VIGoR implements seven regression methods, and is provided as a command line program package for Linux/Mac, and as a cross-platform R package. In addition to model fitting, cross-validation and hyperparameter tuning using cross-validation can be automatically performed by modifying a single argument. VIGoR is available at https://github.com/Onogi/VIGoR. The R package is also available at https://cran.r-project.org/web/packages/VIGoR/index.html.
BAYESIAN INFERENCE OF HIDDEN GAMMA WEAR PROCESS MODEL FOR SURVIVAL DATA WITH TIES.
Sinha, Arijit; Chi, Zhiyi; Chen, Ming-Hui
2015-10-01
Survival data often contain tied event times. Inference without careful treatment of the ties can lead to biased estimates. This paper develops the Bayesian analysis of a stochastic wear process model to fit survival data that might have a large number of ties. Under a general wear process model, we derive the likelihood of parameters. When the wear process is a Gamma process, the likelihood has a semi-closed form that allows posterior sampling to be carried out for the parameters, hence achieving model selection using Bayesian deviance information criterion. An innovative simulation algorithm via direct forward sampling and Gibbs sampling is developed to sample event times that may have ties in the presence of arbitrary covariates; this provides a tool to assess the precision of inference. An extensive simulation study is reported and a data set is used to further illustrate the proposed methodology.
Bayesian Inference using Neural Net Likelihood Models for Protein Secondary Structure Prediction
Directory of Open Access Journals (Sweden)
Seong-Gon Kim
2011-06-01
Full Text Available Several techniques such as Neural Networks, Genetic Algorithms, Decision Trees and other statistical or heuristic methods have been used in the past to approach the complex non-linear task of predicting the Alpha-helices, Beta-sheets and Turns of a protein's secondary structure. This project introduces a new machine learning method that uses offline-trained Multilayered Perceptrons (MLP) as the likelihood models within a Bayesian Inference framework to predict protein secondary structures. Varying window sizes are used to extract neighboring amino acid information, which is passed back and forth between the Neural Net models and the Bayesian Inference process until the posterior secondary structure probability converges.
Schmidt, Paul; Schmid, Volker J; Gaser, Christian; Buck, Dorothea; Bührlen, Susanne; Förschler, Annette; Mühlau, Mark
2013-01-01
Aiming at iron-related T2-hypointensity, which is related to normal aging and neurodegenerative processes, we here present two practicable approaches, based on Bayesian inference, for preprocessing and statistical analysis of a complex set of structural MRI data. In particular, Markov Chain Monte Carlo methods were used to simulate posterior distributions. First, we rendered a segmentation algorithm that uses outlier detection based on model checking techniques within a Bayesian mixture model. Second, we rendered an analytical tool comprising a Bayesian regression model with smoothness priors (in the form of Gaussian Markov random fields) mitigating the necessity to smooth data prior to statistical analysis. For validation, we used simulated data and MRI data of 27 healthy controls (age: [Formula: see text]; range, [Formula: see text]). We first observed robust segmentation of both simulated T2-hypointensities and gray-matter regions known to be T2-hypointense. Second, simulated data and images of segmented T2-hypointensity were analyzed. We found not only robust identification of simulated effects but also a biologically plausible age-related increase of T2-hypointensity primarily within the dentate nucleus but also within the globus pallidus, substantia nigra, and red nucleus. Our results indicate that fully Bayesian inference can successfully be applied for preprocessing and statistical analysis of structural MRI data.
Spatial attention, precision, and Bayesian inference: a study of saccadic response speed.
Vossel, Simone; Mathys, Christoph; Daunizeau, Jean; Bauer, Markus; Driver, Jon; Friston, Karl J; Stephan, Klaas E
2014-06-01
Inferring the environment's statistical structure and adapting behavior accordingly is a fundamental modus operandi of the brain. A simple form of this faculty based on spatial attentional orienting can be studied with Posner's location-cueing paradigm in which a cue indicates the target location with a known probability. The present study focuses on a more complex version of this task, where probabilistic context (percentage of cue validity) changes unpredictably over time, thereby creating a volatile environment. Saccadic response speed (RS) was recorded in 15 subjects and used to estimate subject-specific parameters of a Bayesian learning scheme modeling the subjects' trial-by-trial updates of beliefs. Different response models, specifying how computational states translate into observable behavior, were compared using Bayesian model selection. Saccadic RS was most plausibly explained as a function of the precision of the belief about the causes of sensory input. This finding is in accordance with current Bayesian theories of brain function, and specifically with the proposal that spatial attention is mediated by a precision-dependent gain modulation of sensory input. Our results provide empirical support for precision-dependent changes in beliefs about saccade target locations and motivate future neuroimaging and neuropharmacological studies of how Bayesian inference may determine spatial attention.
Dorazio, R.M.; Johnson, F.A.
2003-01-01
Bayesian inference and decision theory may be used in the solution of relatively complex problems of natural resource management, owing to recent advances in statistical theory and computing. In particular, Markov chain Monte Carlo algorithms provide a computational framework for fitting models of adequate complexity and for evaluating the expected consequences of alternative management actions. We illustrate these features using an example based on management of waterfowl habitat.
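Evaluating the expected consequences of alternative management actions from MCMC output amounts to averaging a utility over posterior draws. The following is a minimal sketch under stated assumptions: the posterior draws, the candidate actions, and the utility function are all hypothetical and are not the waterfowl-habitat model used by the authors.

```python
def expected_utilities(posterior_draws, actions, utility):
    """Monte Carlo estimate of expected utility for each action,
    averaging utility(action, theta) over posterior draws of theta."""
    n = len(posterior_draws)
    return {a: sum(utility(a, th) for th in posterior_draws) / n for a in actions}

def best_action(posterior_draws, actions, utility):
    """Return the action with the highest estimated expected utility."""
    scores = expected_utilities(posterior_draws, actions, utility)
    return max(scores, key=scores.get), scores

def utility(area, theta):
    # Hypothetical utility: the benefit saturates at an uncertain capacity
    # theta, while the cost grows linearly with the managed area.
    return min(area, theta) - 0.4 * area

# Hypothetical posterior draws of theta (in practice, MCMC samples).
draws = [0.8, 1.0, 1.2, 0.9, 1.1]
actions = [0.5, 1.0, 1.5]  # candidate managed areas, arbitrary units
choice, scores = best_action(draws, actions, utility)
print(choice, scores)
```

Averaging over draws rather than plugging in a point estimate lets parameter uncertainty flow into the decision, which is the main reason for pairing MCMC with decision theory.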
Bayesian Statistical Inference in Ion-Channel Models with Exact Missed Event Correction.
Epstein, Michael; Calderhead, Ben; Girolami, Mark A; Sivilotti, Lucia G
2016-07-26
The stochastic behavior of single ion channels is most often described as an aggregated continuous-time Markov process with discrete states. For ligand-gated channels each state can represent a different conformation of the channel protein or a different number of bound ligands. Single-channel recordings show only whether the channel is open or shut: states of equal conductance are aggregated, so transitions between them have to be inferred indirectly. The requirement to filter noise from the raw signal further complicates the modeling process, as it limits the time resolution of the data. The consequence of the reduced bandwidth is that openings or shuttings that are shorter than the resolution cannot be observed; these are known as missed events. Postulated models fitted using filtered data must therefore explicitly account for missed events to avoid bias in the estimation of rate parameters and therefore assess parameter identifiability accurately. In this article, we present the first, to our knowledge, Bayesian modeling of ion-channels with exact missed events correction. Bayesian analysis represents uncertain knowledge of the true value of model parameters by considering these parameters as random variables. This allows us to gain a full appreciation of parameter identifiability and uncertainty when estimating values for model parameters. However, Bayesian inference is particularly challenging in this context as the correction for missed events increases the computational complexity of the model likelihood. Nonetheless, we successfully implemented a two-step Markov chain Monte Carlo method that we called "BICME", which performs Bayesian inference in models of realistic complexity. The method is demonstrated on synthetic and real single-channel data from muscle nicotinic acetylcholine channels. We show that parameter uncertainty can be characterized more accurately than with maximum-likelihood methods. Our code for performing inference in these ion channel
Energy Technology Data Exchange (ETDEWEB)
Kang, Seong Keun; Seong, Poong Hyun [KAIST, Daejeon (Korea, Republic of)
2014-08-15
Bayesian methodology has been widely used in various research fields. It is a method of inference that uses Bayes' rule to update the probability estimate for a given hypothesis as additional evidence is acquired. According to current research, malfunctions of a nuclear power plant can be detected using Bayesian inference, which continually accumulates newly incoming data and updates its estimate. However, that research rests on the assumption that people process information as perfectly as a computer, which is open to criticism and may cause problems in real-world applications. Studies in cognitive psychology indicate that when the amount of information grows, people cannot retain all of the data because their memory capacity, known as working memory, is limited, and their attention is limited as well. The purpose of this paper is to consider these psychological factors and to determine how much working memory and attention affect the resulting estimate based on Bayesian inference. To confirm this, an experiment on human subjects is needed, and the experimental tool is the Compact Nuclear Simulator (CNS).
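The updating rule described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' CNS experiment: the hypothesis names, the likelihood values, and the `memory_limit` parameter (a crude stand-in for limited working memory that keeps only the most recent observations) are all hypothetical.

```python
def bayes_update(prior, likelihoods):
    """Apply Bayes' rule once over a discrete set of hypotheses."""
    unnormalized = {h: prior[h] * likelihoods[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: v / total for h, v in unnormalized.items()}


def sequential_posterior(prior, evidence, memory_limit=None):
    """Update on each piece of evidence in order.

    If memory_limit is a positive integer, only the most recent
    memory_limit observations are used (hypothetical memory model).
    """
    if memory_limit is not None:
        evidence = evidence[-memory_limit:]
    posterior = dict(prior)
    for likelihoods in evidence:
        posterior = bayes_update(posterior, likelihoods)
    return posterior


# Assumed likelihoods P(observation | hypothesis) for two kinds of reading.
prior = {"normal": 0.5, "malfunction": 0.5}
alarm = {"normal": 0.2, "malfunction": 0.8}
quiet = {"normal": 0.7, "malfunction": 0.3}
evidence = [alarm, alarm, quiet, quiet]

full = sequential_posterior(prior, evidence)
limited = sequential_posterior(prior, evidence, memory_limit=2)
print(full["malfunction"], limited["malfunction"])
```

With these assumed numbers, an observer who remembers only the last two readings ends up with a much lower malfunction probability than one who uses the full history, which is the kind of gap the abstract proposes to measure.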
Inferring cellular regulatory networks with Bayesian model averaging for linear regression (BMALR).
Huang, Xun; Zi, Zhike
2014-08-01
Bayesian network and linear regression methods have been widely applied to reconstruct cellular regulatory networks. In this work, we propose a Bayesian model averaging for linear regression (BMALR) method to infer molecular interactions in biological systems. This method uses a new closed form solution to compute the posterior probabilities of the edges from regulators to the target gene within a hybrid framework of Bayesian model averaging and linear regression methods. We have assessed the performance of BMALR by benchmarking on both in silico DREAM datasets and real experimental datasets. The results show that BMALR achieves both high prediction accuracy and high computational efficiency across different benchmarks. A pre-processing of the datasets with the log transformation can further improve the performance of BMALR, leading to a new top overall performance. In addition, BMALR can achieve robust high performance in community predictions when it is combined with other competing methods. The proposed method BMALR is competitive compared to the existing network inference methods. Therefore, BMALR will be useful to infer regulatory interactions in biological networks. A free open source software tool for the BMALR algorithm is available at https://sites.google.com/site/bmalr4netinfer/.
Estimating uncertainty and reliability of social network data using Bayesian inference.
Farine, Damien R; Strandburg-Peshkin, Ariana
2015-09-01
Social network analysis provides a useful lens through which to view the structure of animal societies, and as a result its use is increasingly widespread. One challenge that many studies of animal social networks face is dealing with limited sample sizes, which introduces the potential for a high level of uncertainty in estimating the rates of association or interaction between individuals. We present a method based on Bayesian inference to incorporate uncertainty into network analyses. We test the reliability of this method at capturing both local and global properties of simulated networks, and compare it to a recently suggested method based on bootstrapping. Our results suggest that Bayesian inference can provide useful information about the underlying certainty in an observed network. When networks are well sampled, observed networks approach the real underlying social structure. However, when sampling is sparse, Bayesian inferred networks can provide realistic uncertainty estimates around edge weights. We also suggest a potential method for estimating the reliability of an observed network given the amount of sampling performed. This paper highlights how relatively simple procedures can be used to estimate uncertainty and reliability in studies using animal social network analysis.
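One simple way to attach uncertainty to an edge weight is a Beta-binomial model, sketched below. The abstract does not specify the authors' model, so the binomial sampling assumption, the uniform Beta(1, 1) prior, and the function name `edge_weight_posterior` are assumptions made for illustration.

```python
import random


def edge_weight_posterior(together, total, a=1.0, b=1.0, draws=20000, seed=0):
    """Posterior for an association rate under a Beta(a, b) prior.

    together: sampling periods in which the pair was seen associated
    total:    sampling periods in which the pair could have been seen
    Returns the posterior mean and a Monte Carlo 95% credible interval.
    """
    rng = random.Random(seed)
    post_a = a + together
    post_b = b + (total - together)
    samples = sorted(rng.betavariate(post_a, post_b) for _ in range(draws))
    mean = post_a / (post_a + post_b)
    interval = (samples[int(0.025 * draws)], samples[int(0.975 * draws)])
    return mean, interval


# Same observed association rate (0.5), very different sampling effort.
sparse_mean, sparse_ci = edge_weight_posterior(2, 4)
dense_mean, dense_ci = edge_weight_posterior(50, 100)
print(sparse_mean, sparse_ci)
print(dense_mean, dense_ci)
```

Both pairs have the same point estimate, but the sparsely sampled pair has a far wider credible interval, which mirrors the abstract's point about uncertainty around edge weights when sampling is sparse.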
Trans-dimensional Bayesian inference for large sequential data sets
Mandolesi, E.; Dettmer, J.; Dosso, S. E.; Holland, C. W.
2015-12-01
This work develops a sequential Monte Carlo method to infer seismic parameters of layered seabeds from large sequential reflection-coefficient data sets. The approach provides parameter estimates and uncertainties along survey tracks with the goal to aid in the detection of unexploded ordnance in shallow water. The sequential data are acquired by a moving platform with source and receiver array towed close to the seabed. This geometry requires consideration of spherical reflection coefficients, computed efficiently by massively parallel implementation of the Sommerfeld integral via Levin integration on a graphics processing unit. The seabed is parametrized with a trans-dimensional model to account for changes in the environment (i.e. changes in layering) along the track. The method combines advanced Markov chain Monte Carlo methods (annealing) with particle filtering (resampling). Since data from closely-spaced source transmissions (pings) often sample similar environments, the solution from one ping can be utilized to efficiently estimate the posterior for data from subsequent pings. Since reflection-coefficient data are highly informative, the likelihood function can be extremely peaked, resulting in little overlap between posteriors of adjacent pings. This is addressed by adding bridging distributions (via annealed importance sampling) between pings for more efficient transitions. The approach assumes the environment to be changing slowly enough to justify the local 1D parametrization. However, bridging allows rapid changes between pings to be addressed and we demonstrate the method to be stable in such situations. Results are in terms of trans-D parameter estimates and uncertainties along the track. The algorithm is examined for realistic simulated data along a track and applied to a dataset collected by an autonomous underwater vehicle on the Malta Plateau, Mediterranean Sea. [Work supported by the SERDP, DoD.]
Statistical detection of EEG synchrony using empirical bayesian inference.
Directory of Open Access Journals (Sweden)
Archana K Singh
There is growing interest in understanding how the brain utilizes synchronized oscillatory activity to integrate information across functionally connected regions. Computing phase-locking values (PLV) between EEG signals is a popular method for quantifying such synchronizations and elucidating their role in cognitive tasks. However, high-dimensionality in PLV data incurs a serious multiple testing problem. Standard multiple testing methods in neuroimaging research (e.g., false discovery rate, FDR) suffer severe loss of power, because they fail to exploit complex dependence structure between hypotheses that vary in spectral, temporal and spatial dimension. Previously, we showed that a hierarchical FDR and optimal discovery procedures could be effectively applied for PLV analysis to provide better power than FDR. In this article, we revisit the multiple comparison problem from a new Empirical Bayes perspective and propose the application of the local FDR method (locFDR; Efron, 2001) for PLV synchrony analysis to compute FDR as a posterior probability that an observed statistic belongs to a null hypothesis. We demonstrate the application of Efron's Empirical Bayes approach for PLV synchrony analysis for the first time. We use simulations to validate the specificity and sensitivity of locFDR and a real EEG dataset from a visual search study for experimental validation. We also compare locFDR with hierarchical FDR and optimal discovery procedures in both simulation and experimental analyses. Our simulation results showed that the locFDR can effectively control false positives without compromising on the power of PLV synchrony inference. Our results from the application locFDR on experiment data detected more significant discoveries than our previously proposed methods whereas the standard FDR method failed to detect any significant discoveries.
Statistical detection of EEG synchrony using empirical bayesian inference.
Singh, Archana K; Asoh, Hideki; Takeda, Yuji; Phillips, Steven
2015-01-01
There is growing interest in understanding how the brain utilizes synchronized oscillatory activity to integrate information across functionally connected regions. Computing phase-locking values (PLV) between EEG signals is a popular method for quantifying such synchronizations and elucidating their role in cognitive tasks. However, high-dimensionality in PLV data incurs a serious multiple testing problem. Standard multiple testing methods in neuroimaging research (e.g., false discovery rate, FDR) suffer severe loss of power, because they fail to exploit complex dependence structure between hypotheses that vary in spectral, temporal and spatial dimension. Previously, we showed that a hierarchical FDR and optimal discovery procedures could be effectively applied for PLV analysis to provide better power than FDR. In this article, we revisit the multiple comparison problem from a new Empirical Bayes perspective and propose the application of the local FDR method (locFDR; Efron, 2001) for PLV synchrony analysis to compute FDR as a posterior probability that an observed statistic belongs to a null hypothesis. We demonstrate the application of Efron's Empirical Bayes approach for PLV synchrony analysis for the first time. We use simulations to validate the specificity and sensitivity of locFDR and a real EEG dataset from a visual search study for experimental validation. We also compare locFDR with hierarchical FDR and optimal discovery procedures in both simulation and experimental analyses. Our simulation results showed that the locFDR can effectively control false positives without compromising on the power of PLV synchrony inference. Our results from the application locFDR on experiment data detected more significant discoveries than our previously proposed methods whereas the standard FDR method failed to detect any significant discoveries.
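The idea of local FDR as a posterior null probability can be illustrated with a two-group mixture, fdr(z) = pi0 * f0(z) / f(z). The sketch below assumes a fully known Gaussian mixture with hypothetical parameters (`pi0`, `mu1`, `sigma1`); Efron's locFDR procedure instead estimates the mixture density and the null from the data, which this example does not attempt.

```python
import math


def normal_pdf(z, mu=0.0, sigma=1.0):
    """Density of a normal distribution at z."""
    return math.exp(-0.5 * ((z - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def local_fdr(z, pi0=0.9, mu1=3.0, sigma1=1.0):
    """Posterior probability that a statistic z comes from the null group.

    Assumes a standard normal null with prior weight pi0 and a normal
    alternative N(mu1, sigma1^2); all parameter values are illustrative.
    """
    null_part = pi0 * normal_pdf(z)
    alt_part = (1.0 - pi0) * normal_pdf(z, mu1, sigma1)
    return null_part / (null_part + alt_part)


for z in (0.0, 1.5, 3.0, 4.0):
    print(z, round(local_fdr(z), 4))
```

A statistic would typically be reported as a discovery when its local FDR falls below a chosen threshold; the threshold itself is a design choice that the abstract does not state.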
Li, Shi; Mukherjee, Bhramar; Batterman, Stuart; Ghosh, Malay
2013-12-01
Case-crossover designs are widely used to study short-term exposure effects on the risk of acute adverse health events. While the frequentist literature on this topic is vast, there is no Bayesian work in this general area. The contribution of this paper is twofold. First, the paper establishes Bayesian equivalence results that require characterization of the set of priors under which the posterior distributions of the risk ratio parameters based on a case-crossover and time-series analysis are identical. Second, the paper studies inferential issues under case-crossover designs in a Bayesian framework. Traditionally, a conditional logistic regression is used for inference on risk-ratio parameters in case-crossover studies. We consider instead a more general full likelihood-based approach which makes less restrictive assumptions on the risk functions. Formulation of a full likelihood leads to growth in the number of parameters proportional to the sample size. We propose a semi-parametric Bayesian approach using a Dirichlet process prior to handle the random nuisance parameters that appear in a full likelihood formulation. We carry out a simulation study to compare the Bayesian methods based on full and conditional likelihood with the standard frequentist approaches for case-crossover and time-series analysis. The proposed methods are illustrated through the Detroit Asthma Morbidity, Air Quality and Traffic study, which examines the association between acute asthma risk and ambient air pollutant concentrations.
Implementing relevance feedback in ligand-based virtual screening using Bayesian inference network.
Abdo, Ammar; Salim, Naomie; Ahmed, Ali
2011-10-01
Recently, the use of the Bayesian network as an alternative to existing tools for similarity-based virtual screening has received noticeable attention from researchers in the chemoinformatics field. The main aim of the Bayesian network model is to improve the retrieval effectiveness of similarity-based virtual screening. To this end, different models of the Bayesian network have been developed. In our previous works, the retrieval performance of the Bayesian network was observed to improve significantly when multiple reference structures or fragment weightings were used. In this article, the authors enhance the Bayesian inference network (BIN) using the relevance feedback information. In this approach, a few high-ranking structures of unknown activity were filtered from the outputs of BIN, based on a single active reference structure, to form a set of active reference structures. This set of active reference structures was used in two distinct techniques for carrying out such BIN searching: reweighting the fragments in the reference structures and group fusion techniques. Simulated virtual screening experiments with three MDL Drug Data Report data sets showed that the proposed techniques provide simple ways of enhancing the cost-effectiveness of ligand-based virtual screening searches, especially for higher diversity data sets.
Bayesian inference of T Tauri star properties using multi-wavelength survey photometry
Barentsen, Geert; Drew, Janet E; Sale, Stuart E
2012-01-01
There are many pertinent open issues in the area of star and planet formation. Large statistical samples of young stars across star-forming regions are needed to trigger a breakthrough in our understanding, but most optical studies are based on a wide variety of spectrographs and analysis methods, which introduces large biases. Here we show how graphical Bayesian networks can be employed to construct a hierarchical probabilistic model which allows pre-main sequence ages, masses, accretion rates, and extinctions to be estimated using two widely available photometric survey databases (IPHAS r/i/Halpha and 2MASS J-band magnitudes.) Because our approach does not rely on spectroscopy, it can easily be applied to homogeneously study the large number of clusters for which Gaia will yield membership lists. We explain how the analysis is carried out using the Markov Chain Monte Carlo (MCMC) method and provide Python source code. We then demonstrate its use on 587 known low-mass members of the star-forming region NGC 2...
Unraveling multiple changes in complex climate time series using Bayesian inference
Berner, Nadine; Trauth, Martin H.; Holschneider, Matthias
2016-04-01
Change points in time series are perceived as heterogeneities in the statistical or dynamical characteristics of observations. Unraveling such transitions yields essential information for the understanding of the observed system. The precise detection and basic characterization of underlying changes is therefore of particular importance in environmental sciences. We present a kernel-based Bayesian inference approach to investigate direct as well as indirect climate observations for multiple generic transition events. In order to develop a diagnostic approach designed to capture a variety of natural processes, the basic statistical features of central tendency and dispersion are used to locally approximate a complex time series by a generic transition model. A Bayesian inversion approach is developed to robustly infer on the location and the generic patterns of such a transition. To systematically investigate time series for multiple changes occurring at different temporal scales, the Bayesian inversion is extended to a kernel-based inference approach. By introducing basic kernel measures, the kernel inference results are composed into a proxy probability to a posterior distribution of multiple transitions. Thus, based on a generic transition model a probability expression is derived that is capable to indicate multiple changes within a complex time series. We discuss the method's performance by investigating direct and indirect climate observations. The approach is applied to environmental time series (about 100 a), from the weather station in Tuscaloosa, Alabama, and confirms documented instrumentation changes. Moreover, the approach is used to investigate a set of complex terrigenous dust records from the ODP sites 659, 721/722 and 967 interpreted as climate indicators of the African region of the Plio-Pleistocene period (about 5 Ma). The detailed inference unravels multiple transitions underlying the indirect climate observations coinciding with established
A Bayesian Framework that integrates heterogeneous data for inferring gene regulatory networks
Directory of Open Access Journals (Sweden)
Tapesh Santra
2014-05-01
Reconstruction of gene regulatory networks (GRNs) from experimental data is a fundamental challenge in systems biology. A number of computational approaches have been developed to infer GRNs from mRNA expression profiles. However, expression profiles alone are proving to be insufficient for inferring GRN topologies with reasonable accuracy. Recently, it has been shown that integration of external data sources (such as gene and protein sequence information, gene ontology data, protein-protein interactions) with mRNA expression profiles may increase the reliability of the inference process. Here, I propose a new approach that incorporates transcription factor binding sites (TFBS) and physical protein interactions (PPI) among transcription factors (TFs) in a Bayesian Variable Selection (BVS) algorithm which can infer GRNs from mRNA expression profiles subjected to genetic perturbations. Using real experimental data, I show that the integration of TFBS and PPI data with mRNA expression profiles leads to significantly more accurate networks than those inferred from expression profiles alone. Additionally, the performance of the proposed algorithm is compared with a series of LASSO regression based network inference methods that can also incorporate prior knowledge in the inference framework. The results of this comparison suggest that BVS can outperform LASSO regression based method in some circumstances.
A bayesian framework that integrates heterogeneous data for inferring gene regulatory networks.
Santra, Tapesh
2014-01-01
Reconstruction of gene regulatory networks (GRNs) from experimental data is a fundamental challenge in systems biology. A number of computational approaches have been developed to infer GRNs from mRNA expression profiles. However, expression profiles alone are proving to be insufficient for inferring GRN topologies with reasonable accuracy. Recently, it has been shown that integration of external data sources (such as gene and protein sequence information, gene ontology data, protein-protein interactions) with mRNA expression profiles may increase the reliability of the inference process. Here, I propose a new approach that incorporates transcription factor binding sites (TFBS) and physical protein interactions (PPI) among transcription factors (TFs) in a Bayesian variable selection (BVS) algorithm which can infer GRNs from mRNA expression profiles subjected to genetic perturbations. Using real experimental data, I show that the integration of TFBS and PPI data with mRNA expression profiles leads to significantly more accurate networks than those inferred from expression profiles alone. Additionally, the performance of the proposed algorithm is compared with a series of least absolute shrinkage and selection operator (LASSO) regression-based network inference methods that can also incorporate prior knowledge in the inference framework. The results of this comparison suggest that BVS can outperform LASSO regression-based method in some circumstances.
AGNfitter: SED-fitting code for AGN and galaxies from a MCMC approach
Calistro Rivera, Gabriela; Lusso, Elisabeta; Hennawi, Joseph F.; Hogg, David W.
2016-07-01
AGNfitter is a fully Bayesian MCMC method to fit the spectral energy distributions (SEDs) of active galactic nuclei (AGN) and galaxies from the sub-mm to the UV; it enables robust disentanglement of the physical processes responsible for the emission of sources. Written in Python, AGNfitter makes use of a large library of theoretical, empirical, and semi-empirical models to characterize both the nuclear and host galaxy emission simultaneously. The model consists of four physical emission components: an accretion disk, a torus of AGN heated dust, stellar populations, and cold dust in star forming regions. AGNfitter determines the posterior distributions of numerous parameters that govern the physics of AGN with a fully Bayesian treatment of errors and parameter degeneracies, allowing one to infer integrated luminosities, dust attenuation parameters, stellar masses, and star formation rates.
An application of MCMC simulation in mortality projection for populations with limited data
Directory of Open Access Journals (Sweden)
Jackie Li
2014-01-01
Objective: In this paper, we investigate the use of Bayesian modeling and Markov chain Monte Carlo (MCMC) simulation, via the software WinBUGS, to project future mortality for populations with limited data. In particular, we adapt some extensions of the Lee-Carter method under the Bayesian framework to allow for situations in which mortality data are scarce. Our approach would be useful for certain developing nations that have not been regularly collecting death counts and population statistics. Inferences of the model estimates and forecasts can readily be drawn from the simulated samples. Information on another population resembling the population under study can be exploited and incorporated into the prior distributions in order to facilitate the construction of probability intervals. The two sets of data can also be modeled in a joint manner. We demonstrate an application of this approach to some data from China and Taiwan.
Truth, models, model sets, AIC, and multimodel inference: a Bayesian perspective
Barker, Richard J.; Link, William A.
2015-01-01
Statistical inference begins with viewing data as realizations of stochastic processes. Mathematical models provide partial descriptions of these processes; inference is the process of using the data to obtain a more complete description of the stochastic processes. Wildlife and ecological scientists have become increasingly concerned with the conditional nature of model-based inference: what if the model is wrong? Over the last 2 decades, Akaike's Information Criterion (AIC) has been widely and increasingly used in wildlife statistics for 2 related purposes, first for model choice and second to quantify model uncertainty. We argue that for the second of these purposes, the Bayesian paradigm provides the natural framework for describing uncertainty associated with model choice and provides the most easily communicated basis for model weighting. Moreover, Bayesian arguments provide the sole justification for interpreting model weights (including AIC weights) as coherent (mathematically self consistent) model probabilities. This interpretation requires treating the model as an exact description of the data-generating mechanism. We discuss the implications of this assumption, and conclude that more emphasis is needed on model checking to provide confidence in the quality of inference.
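The model weights discussed in this abstract are usually computed from AIC differences as w_i = exp(-Δ_i / 2) / Σ_j exp(-Δ_j / 2), where Δ_i is the AIC of model i minus the smallest AIC in the set. The sketch below applies that standard formula to made-up AIC values; the function name and the numbers are illustrative only.

```python
import math


def akaike_weights(aic_values):
    """Convert AIC scores into normalized model weights."""
    best = min(aic_values)
    # Relative likelihood of each model given its AIC difference from the best.
    relative = [math.exp(-0.5 * (aic - best)) for aic in aic_values]
    total = sum(relative)
    return [r / total for r in relative]


# Hypothetical AIC scores for three candidate models.
weights = akaike_weights([100.0, 102.0, 110.0])
print([round(w, 4) for w in weights])
```

As the abstract argues, reading these weights as model probabilities is a Bayesian interpretation and is conditional on the candidate model set, so the numbers say nothing about models that were never considered.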
Jennen, Danyel G J; van Leeuwen, Danitsja M; Hendrickx, Diana M; Gottschalk, Ralph W H; van Delft, Joost H M; Kleinjans, Jos C S
2015-10-19
Microarray-based transcriptomic analysis has been demonstrated to hold the opportunity to study the effects of human exposure to, e.g., chemical carcinogens at the whole genome level, thus yielding broad-ranging molecular information on possible carcinogenic effects. Since genes do not operate individually but rather through concerted interactions, analyzing and visualizing networks of genes should provide important mechanistic information, especially upon connecting them to functional parameters, such as those derived from measurements of biomarkers for exposure and carcinogenic risk. Conventional methods such as hierarchical clustering and correlation analyses are frequently used to address these complex interactions but are limited as they do not provide directional causal dependence relationships. Therefore, our aim was to apply Bayesian network inference with the purpose of phenotypic anchoring of modified gene expressions. We investigated a use case on transcriptomic responses to cigarette smoking in humans, in association with plasma cotinine levels as biomarkers of exposure and aromatic DNA-adducts in blood cells as biomarkers of carcinogenic risk. Many of the genes that appear in the Bayesian networks surrounding plasma cotinine, and to a lesser extent around aromatic DNA-adducts, hold biologically relevant functions in inducing severe adverse effects of smoking. In conclusion, this study shows that Bayesian network inference enables unbiased phenotypic anchoring of transcriptomics responses. Furthermore, in all inferred Bayesian networks several dependencies are found which point to known but also to new relationships between the expression of specific genes, cigarette smoke exposure, DNA damaging-effects, and smoking-related diseases, in particular associated with apoptosis, DNA repair, and tumor suppression, as well as with autoimmunity.
Foreman-Mackey, Daniel; Hogg, David W.; Lang, Dustin; Goodman, Jonathan
2013-03-01
We introduce a stable, well tested Python implementation of the affine-invariant ensemble sampler for Markov chain Monte Carlo (MCMC) proposed by Goodman & Weare (2010). The code is open source and has already been used in several published projects in the astrophysics literature. The algorithm behind emcee has several advantages over traditional MCMC sampling methods and it has excellent performance as measured by the autocorrelation time (or function calls per independent sample). One major advantage of the algorithm is that it requires hand-tuning of only 1 or 2 parameters compared to ∼N² for a traditional algorithm in an N-dimensional parameter space. In this document, we describe the algorithm and the details of our implementation. Exploiting the parallelism of the ensemble method, emcee permits any user to take advantage of multiple CPU cores without extra effort. The code is available online at http://dan.iel.fm/emcee under the GNU General Public License v2.
Foreman-Mackey, Daniel; Lang, Dustin; Goodman, Jonathan
2012-01-01
We introduce a stable, well tested Python implementation of the affine-invariant ensemble sampler for Markov chain Monte Carlo (MCMC) proposed by Goodman & Weare (2010). The code is open source and has already been used in several published projects in the astrophysics literature. The algorithm behind emcee has several advantages over traditional MCMC sampling methods and it has excellent performance as measured by the autocorrelation time (or function calls per independent sample). One major advantage of the algorithm is that it requires hand-tuning of only 1 or 2 parameters compared to $\sim N^2$ for a traditional algorithm in an N-dimensional parameter space. In this document, we describe the algorithm and the details of our implementation and API. Exploiting the parallelism of the ensemble method, emcee permits any user to take advantage of multiple CPU cores without extra effort. The code is available online at http://danfm.ca/emcee under the GNU General Public License v2.
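The core of the Goodman & Weare sampler is the "stretch move", which can be sketched in pure Python as follows. This is a simplified serial illustration and not the emcee code or API: it updates one walker at a time rather than splitting the ensemble for parallel updates, and the target density (a standard normal), the function names, and the settings are assumptions chosen for the example. The single tuning parameter is the stretch scale `a`.

```python
import math
import random


def log_prob(x):
    """Log-density of the assumed target: independent standard normals."""
    return -0.5 * sum(v * v for v in x)


def stretch_move_sampler(log_prob, walkers, n_steps, a=2.0, seed=0):
    """Serial affine-invariant ensemble sampler using the stretch move."""
    rng = random.Random(seed)
    walkers = [list(w) for w in walkers]
    n, dim = len(walkers), len(walkers[0])
    logp = [log_prob(w) for w in walkers]
    chain = []
    for _ in range(n_steps):
        for k in range(n):
            # Pick a different walker j uniformly at random.
            j = rng.randrange(n - 1)
            if j >= k:
                j += 1
            # Draw z on [1/a, a] with density proportional to 1/sqrt(z).
            z = ((a - 1.0) * rng.random() + 1.0) ** 2 / a
            proposal = [walkers[j][d] + z * (walkers[k][d] - walkers[j][d])
                        for d in range(dim)]
            new_logp = log_prob(proposal)
            # Acceptance ratio includes the z^(dim - 1) factor.
            log_accept = (dim - 1) * math.log(z) + new_logp - logp[k]
            if rng.random() < math.exp(min(0.0, log_accept)):
                walkers[k], logp[k] = proposal, new_logp
        chain.extend(list(w) for w in walkers)
    return chain


init_rng = random.Random(1)
start = [[init_rng.gauss(0.0, 1.0) for _ in range(2)] for _ in range(10)]
chain = stretch_move_sampler(log_prob, start, n_steps=2000)
kept = chain[5000:]  # discard early samples as burn-in
mean0 = sum(x[0] for x in kept) / len(kept)
var0 = sum((x[0] - mean0) ** 2 for x in kept) / len(kept)
print(mean0, var0)
```

Because each proposal is built from the positions of the other walkers, the move adapts to the scale and orientation of the target without a hand-tuned proposal covariance, which is the advantage the abstract highlights.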
Institute of Scientific and Technical Information of China (English)
陈亮; 程汉文; 吴乐南
2009-01-01
A nonparametric Bayesian method is presented to classify M-ary phase shift keying (MPSK) signals based on their constellation diagrams. MPSK signals with unknown signal-to-noise ratios (SNRs) are modeled as a mixture of Gaussian distributions with unknown means and covariances in the complex plane, and nonparametric Bayesian inference is used to estimate the probability density of the signals and thereby classify them. The inference introduces a Dirichlet process as the prior on the mixing proportions and a normal inverse Wishart (NIW) distribution as the prior on the unknown means and covariances. Given the received signals, the mixing proportions, means, and covariances are adjusted iteratively by a Markov chain Monte Carlo (MCMC) algorithm based on Gibbs sampling. After repeated iterations, an estimate of the density of the modulated signal is obtained. Simulation results show that the correct recognition rate for 2/4/8PSK exceeds 95% when SNR > 5 dB and more than 1 600 symbols are used.
Revision of HCM Delay Model Based on Bayesian Inference
Institute of Scientific and Technical Information of China (English)
张惠玲; 孙剑; 邵海鹏
2011-01-01
This paper addresses the estimation of delay at signalized intersections with the HCM2000 delay model when a single cycle is used as the analysis period, and proposes a method for revising the model's parameters. The precision of the HCM2000 delay model is tested using a t-test. The degree of saturation, the start-up lost time, and the intersection geometry adjustment factor in the model are analyzed. Bayesian inference and Markov chain Monte Carlo (MCMC) simulation are used to revise the model's parameters. The results show that the method can improve the model's precision when a single cycle is used as the analysis period.
Directory of Open Access Journals (Sweden)
Abel Palafox
2014-01-01
We address a prototype inverse scattering problem at the interface of applied mathematics, statistics, and scientific computing. We pose the acoustic inverse scattering problem from a Bayesian inference perspective and simulate from the posterior distribution using MCMC. The PDE forward map is implemented using high performance computing methods. We implement a standard Bayesian model selection method to estimate an effective number of Fourier coefficients that may be retrieved from noisy data within a standard formulation.
Planetary micro-rover operations on Mars using a Bayesian framework for inference and control
Post, Mark A.; Li, Junquan; Quine, Brendan M.
2016-03-01
With the recent progress toward the application of commercially-available hardware to small-scale space missions, it is now becoming feasible for groups of small, efficient robots based on low-power embedded hardware to perform simple tasks on other planets in the place of large-scale, heavy and expensive robots. In this paper, we describe design and programming of the Beaver micro-rover developed for Northern Light, a Canadian initiative to send a small lander and rover to Mars to study the Martian surface and subsurface. For a small, hardware-limited rover to handle an uncertain and mostly unknown environment without constant management by human operators, we use a Bayesian network of discrete random variables as an abstraction of expert knowledge about the rover and its environment, and inference operations for control. A framework for efficient construction and inference into a Bayesian network using only the C language and fixed-point mathematics on embedded hardware has been developed for the Beaver to make intelligent decisions with minimal sensor data. We study the performance of the Beaver as it probabilistically maps a simple outdoor environment with sensor models that include uncertainty. Results indicate that the Beaver and other small and simple robotic platforms can make use of a Bayesian network to make intelligent decisions in uncertain planetary environments.
McGeachie, Michael J; Chang, Hsun-Hsien; Weiss, Scott T
2014-06-01
Bayesian Networks (BN) have been a popular predictive modeling formalism in bioinformatics, but their application in modern genomics has been slowed by an inability to cleanly handle domains with mixed discrete and continuous variables. Existing free BN software packages either discretize continuous variables, which can lead to information loss, or do not include inference routines, which makes prediction with the BN impossible. We present CGBayesNets, a BN package focused around prediction of a clinical phenotype from mixed discrete and continuous variables, which fills these gaps. CGBayesNets implements Bayesian likelihood and inference algorithms for the conditional Gaussian Bayesian network (CGBNs) formalism, one appropriate for predicting an outcome of interest from, e.g., multimodal genomic data. We provide four different network learning algorithms, each making a different tradeoff between computational cost and network likelihood. CGBayesNets provides a full suite of functions for model exploration and verification, including cross validation, bootstrapping, and AUC manipulation. We highlight several results obtained previously with CGBayesNets, including predictive models of wood properties from tree genomics, leukemia subtype classification from mixed genomic data, and robust prediction of intensive care unit mortality outcomes from metabolomic profiles. We also provide detailed example analysis on public metabolomic and gene expression datasets. CGBayesNets is implemented in MATLAB and available as MATLAB source code, under an Open Source license and anonymous download at http://www.cgbayesnets.com.
Inference of emission rates from multiple sources using Bayesian probability theory.
Yee, Eugene; Flesch, Thomas K
2010-03-01
The determination of atmospheric emission rates from multiple sources using inversion (regularized least-squares or best-fit technique) is known to be very susceptible to measurement and model errors in the problem, rendering the solution unusable. In this paper, a new perspective is offered for this problem: namely, it is argued that the problem should be addressed as one of inference rather than inversion. Towards this objective, Bayesian probability theory is used to estimate the emission rates from multiple sources. The posterior probability distribution for the emission rates is derived, accounting fully for the measurement errors in the concentration data and the model errors in the dispersion model used to interpret the data. The Bayesian inferential methodology for emission rate recovery is validated against real dispersion data, obtained from a field experiment involving various source-sensor geometries (scenarios) consisting of four synthetic area sources and eight concentration sensors. The recovery of discrete emission rates from three different scenarios obtained using Bayesian inference and singular value decomposition inversion are compared and contrasted.
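To make the "inference rather than inversion" idea concrete, the following is a minimal sketch for two sources under a linear-Gaussian model, where the posterior is available in closed form. The linear source-sensor matrix `A`, the Gaussian noise and prior, and all numerical values are illustrative assumptions; they are not the dispersion model, priors, or data of the paper.

```python
def posterior_two_sources(A, c, noise_sd, prior_sd):
    """Gaussian posterior (mean, covariance) for two emission rates.

    Assumed model (hypothetical): c_i = A[i][0]*q1 + A[i][1]*q2 + noise,
    noise ~ N(0, noise_sd^2), independent priors q_j ~ N(0, prior_sd^2).
    """
    inv_noise_var = 1.0 / noise_sd ** 2
    # Posterior precision matrix P = A^T A / noise_var + I / prior_var
    p11 = inv_noise_var * sum(row[0] * row[0] for row in A) + 1.0 / prior_sd ** 2
    p12 = inv_noise_var * sum(row[0] * row[1] for row in A)
    p22 = inv_noise_var * sum(row[1] * row[1] for row in A) + 1.0 / prior_sd ** 2
    # Right-hand side b = A^T c / noise_var
    b1 = inv_noise_var * sum(row[0] * ci for row, ci in zip(A, c))
    b2 = inv_noise_var * sum(row[1] * ci for row, ci in zip(A, c))
    # Posterior covariance is the inverse of the 2x2 precision matrix
    det = p11 * p22 - p12 * p12
    cov = [[p22 / det, -p12 / det], [-p12 / det, p11 / det]]
    mean = [cov[0][0] * b1 + cov[0][1] * b2, cov[1][0] * b1 + cov[1][1] * b2]
    return mean, cov


# Four hypothetical sensors, two hypothetical sources, noise-free synthetic data
A = [[1.0, 0.2], [0.3, 1.0], [0.5, 0.5], [0.8, 0.1]]
true_q = [2.0, 5.0]
c = [row[0] * true_q[0] + row[1] * true_q[1] for row in A]
mean, cov = posterior_two_sources(A, c, noise_sd=0.1, prior_sd=100.0)
```

Unlike a least-squares inversion, the result carries a covariance matrix, so the uncertainty in each recovered rate and the correlation between them are reported alongside the point estimate.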
Wang, Xiaoxiao; Wang, Huan; Huang, Jinfeng; Zhou, Yifeng; Tzvetanov, Tzvetomir
2017-01-01
The contrast sensitivity function that spans the two dimensions of contrast and spatial frequency is crucial in predicting functional vision both in research and clinical applications. In this study, the use of Bayesian inference was proposed to determine the parameters of the two-dimensional contrast sensitivity function. Two-dimensional Bayesian inference was extensively simulated in comparison to classical one-dimensional measures. Its performance on two-dimensional data gathered with different sampling algorithms was also investigated. The results showed that the two-dimensional Bayesian inference method significantly improved the accuracy and precision of the contrast sensitivity function, as compared to the more common one-dimensional estimates. In addition, applying two-dimensional Bayesian estimation to the final data set showed similar levels of reliability and efficiency across widely disparate and established sampling methods (from classical one-dimensional sampling, such as Ψ or staircase, to more novel multi-dimensional sampling methods, such as quick contrast sensitivity function and Fisher information gain). Furthermore, the improvements observed following the application of Bayesian inference were maintained even when the prior poorly matched the subject's contrast sensitivity function. Simulation results were confirmed in a psychophysical experiment. The results indicated that two-dimensional Bayesian inference of contrast sensitivity function data provides similar estimates across a wide range of sampling methods. The present study likely has implications for the measurement of contrast sensitivity function in various settings (including research and clinical settings) and would facilitate the comparison of existing data from previous studies. PMID:28119563
BioEM: GPU-accelerated computing of Bayesian inference of electron microscopy images
Cossio, Pilar; Baruffa, Fabio; Rampp, Markus; Lindenstruth, Volker; Hummer, Gerhard
2016-01-01
In cryo-electron microscopy (EM), molecular structures are determined from large numbers of projection images of individual particles. To harness the full power of this single-molecule information, we use the Bayesian inference of EM (BioEM) formalism. By ranking structural models using posterior probabilities calculated for individual images, BioEM in principle addresses the challenge of working with highly dynamic or heterogeneous systems not easily handled in traditional EM reconstruction. However, the calculation of these posteriors for large numbers of particles and models is computationally demanding. Here we present highly parallelized, GPU-accelerated computer software that performs this task efficiently. Our flexible formulation employs CUDA, OpenMP, and MPI parallelization combined with both CPU and GPU computing. The resulting BioEM software scales nearly ideally both on pure CPU and on CPU+GPU architectures, thus enabling Bayesian analysis of tens of thousands of images in a reasonable time. The g...
Improving Bayesian population dynamics inference: a coalescent-based model for multiple loci.
Gill, Mandev S; Lemey, Philippe; Faria, Nuno R; Rambaut, Andrew; Shapiro, Beth; Suchard, Marc A
2013-03-01
Effective population size is fundamental in population genetics and characterizes genetic diversity. To infer past population dynamics from molecular sequence data, coalescent-based models have been developed for Bayesian nonparametric estimation of effective population size over time. Among the most successful is a Gaussian Markov random field (GMRF) model for a single gene locus. Here, we present a generalization of the GMRF model that allows for the analysis of multilocus sequence data. Using simulated data, we demonstrate the improved performance of our method to recover true population trajectories and the time to the most recent common ancestor (TMRCA). We analyze a multilocus alignment of HIV-1 CRF02_AG gene sequences sampled from Cameroon. Our results are consistent with HIV prevalence data and uncover some aspects of the population history that go undetected in Bayesian parametric estimation. Finally, we recover an older and more reconcilable TMRCA for a classic ancient DNA data set.
Moving in time: Bayesian causal inference explains movement coordination to auditory beats.
Elliott, Mark T; Wing, Alan M; Welchman, Andrew E
2014-07-07
Many everyday skilled actions depend on moving in time with signals that are embedded in complex auditory streams (e.g. musical performance, dancing or simply holding a conversation). Such behaviour is apparently effortless; however, it is not known how humans combine auditory signals to support movement production and coordination. Here, we test how participants synchronize their movements when there are potentially conflicting auditory targets to guide their actions. Participants tapped their fingers in time with two simultaneously presented metronomes of equal tempo, but differing in phase and temporal regularity. Synchronization therefore depended on integrating the two timing cues into a single-event estimate or treating the cues as independent and thereby selecting one signal over the other. We show that a Bayesian inference process explains the situations in which participants choose to integrate or separate signals, and predicts motor timing errors. Simulations of this causal inference process demonstrate that this model provides a better description of the data than other plausible models. Our findings suggest that humans exploit a Bayesian inference process to control movement timing in situations where the origin of auditory signals needs to be resolved.
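As a rough illustration of the "integrate" case described above, the sketch below fuses two timing cues by weighting each with its reliability (inverse variance), which is the standard Bayesian result for two independent Gaussian cues. The function name and the numbers are hypothetical, and the sketch does not model the decision between integrating and separating the cues that the study examines.

```python
def integrate_cues(t1, var1, t2, var2):
    """Fuse two Gaussian timing cues into a single-event estimate.

    Each cue is weighted by its reliability (1 / variance); the fused
    variance is smaller than either input variance.
    """
    reliability1 = 1.0 / var1
    reliability2 = 1.0 / var2
    weight1 = reliability1 / (reliability1 + reliability2)
    fused_time = weight1 * t1 + (1.0 - weight1) * t2
    fused_var = 1.0 / (reliability1 + reliability2)
    return fused_time, fused_var


# Hypothetical example: a regular metronome at 0 ms (variance 1)
# and a jittered one at 40 ms (variance 4)
fused_time, fused_var = integrate_cues(0.0, 1.0, 40.0, 4.0)
```

The fused estimate sits closer to the more regular cue, which is the qualitative pattern one would expect if the two metronomes are attributed to a common event.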
Austerity in MCMC Land: Cutting the Metropolis-Hastings Budget
Korattikara, A.; Chen, Y.; Welling, M.
2014-01-01
Can we make Bayesian posterior MCMC sampling more efficient when faced with very large datasets? We argue that computing the likelihood for N datapoints in the Metropolis-Hastings (MH) test to reach a single binary decision is computationally inefficient. We introduce an approximate MH rule based on
Austerity in MCMC Land: Cutting the Metropolis-Hastings Budget
Korattikara, A.; Chen, Y.; Welling, M.
2013-01-01
Can we make Bayesian posterior MCMC sampling more efficient when faced with very large datasets? We argue that computing the likelihood for N datapoints twice in order to reach a single binary decision is computationally inefficient. We introduce an approximate Metropolis-Hastings rule based on a se
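For context, here is a minimal sketch of the exact Metropolis-Hastings test that the abstract describes as costly: every accept/reject decision requires a log-likelihood pass over all N data points. The Gaussian model with known variance, the flat prior, the step size, and the function names are illustrative assumptions; the approximate rule proposed in the paper is not reproduced here.

```python
import math
import random


def log_likelihood(mu, data, sigma=1.0):
    # One full pass over all N data points: O(N) per call
    return sum(-0.5 * ((x - mu) / sigma) ** 2 for x in data)


def metropolis_hastings(data, n_steps=2000, step=0.2, seed=0):
    rng = random.Random(seed)
    mu = 0.0
    current_ll = log_likelihood(mu, data)
    samples = []
    for _ in range(n_steps):
        proposal = mu + rng.gauss(0.0, step)
        # O(N) work to reach a single binary accept/reject decision
        proposal_ll = log_likelihood(proposal, data)
        # Symmetric proposal and flat prior: accept with prob min(1, L'/L).
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < proposal_ll - current_ll:
            mu, current_ll = proposal, proposal_ll
        samples.append(mu)
    return samples


data_rng = random.Random(1)
data = [data_rng.gauss(3.0, 1.0) for _ in range(500)]
samples = metropolis_hastings(data)
kept = samples[500:]  # discard burn-in
posterior_mean = sum(kept) / len(kept)
```

With N in the millions, the `log_likelihood` call dominates the run time, which is the cost the authors aim to cut.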
Hinsen, Konrad; Kneller, Gerald R
2016-10-21
Anomalous diffusion is characterized by its asymptotic behavior for t → ∞. This makes it difficult to detect and describe in particle trajectories from experiments or computer simulations, which are necessarily of finite length. We propose a new approach using Bayesian inference applied directly to the observed trajectories sampled at different time scales. We illustrate the performance of this approach using random trajectories with known statistical properties and then use it for analyzing the motion of lipid molecules in the plane of a lipid bilayer.
Bayesian inference for multivariate point processes observed at sparsely distributed times
DEFF Research Database (Denmark)
Rasmussen, Jakob Gulddahl; Møller, Jesper; Aukema, B.H.;
We consider statistical and computational aspects of simulation-based Bayesian inference for a multivariate point process which is only observed at sparsely distributed times. For specificity we consider a particular data set which has earlier been analyzed by a discrete time model involving unknown normalizing constants. We discuss the advantages and disadvantages of using continuous time processes compared to discrete time processes in the setting of the present paper as well as other spatial-temporal situations. Keywords: Bark beetle, conditional intensity, forest entomology, Markov chain Monte Carlo...
DIP -- Diagnostics for Insufficiencies of Posterior calculations in Bayesian signal inference
Dorn, Sebastian; Enßlin, Torsten A
2013-01-01
We present an error-diagnostic validation method for posterior distributions in Bayesian signal inference. It transfers deviations from the correct posterior into characteristic deviations from a uniform distribution of a quantity constructed for this purpose. We show that this method is able to reveal and discriminate several kinds of numerical and approximation errors. For this we present a number of analytical examples of posteriors with incorrect variance, skewness, position of the maximum, or normalization. We show further how this test can be applied to multidimensional signals.
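The idea of mapping posterior errors onto deviations from uniformity can be sketched with a conjugate Gaussian toy model. Here the test quantity is taken to be the claimed posterior CDF evaluated at the true signal, which is uniform on [0, 1] when the posterior is correct; whether this matches the paper's construction in detail is an assumption, and the model, the `variance_scale` knob, and the function names are hypothetical.

```python
import math
import random


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def diagnostic_values(n_trials, noise_var=0.5, variance_scale=1.0, seed=0):
    """Claimed-posterior CDF at the true signal, over repeated simulations.

    variance_scale = 1 uses the exact posterior variance; other values
    mimic a posterior calculation that reports the wrong variance.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(n_trials):
        signal = rng.gauss(0.0, 1.0)  # draw from the prior N(0, 1)
        datum = signal + rng.gauss(0.0, math.sqrt(noise_var))
        post_var = 1.0 / (1.0 + 1.0 / noise_var)  # exact conjugate result
        post_mean = post_var * datum / noise_var
        claimed_var = variance_scale * post_var
        values.append(normal_cdf((signal - post_mean) / math.sqrt(claimed_var)))
    return values


def histogram(values, n_bins=5):
    counts = [0] * n_bins
    for v in values:
        counts[min(int(v * n_bins), n_bins - 1)] += 1
    return [count / len(values) for count in counts]


correct = histogram(diagnostic_values(20000))
too_narrow = histogram(diagnostic_values(20000, variance_scale=0.25))
```

In this toy setting the correct posterior gives a flat histogram, while an underestimated variance piles mass into the outer bins, so the shape of the deviation hints at the kind of error.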
Bayesian inference of the initial conditions from large-scale structure surveys
Leclercq, Florent
2016-10-01
Analysis of three-dimensional cosmological surveys has the potential to answer outstanding questions on the initial conditions from which structure appeared, and therefore on the very high energy physics at play in the early Universe. We report on recently proposed statistical data analysis methods designed to study the primordial large-scale structure via physical inference of the initial conditions in a fully Bayesian framework, and applications to the Sloan Digital Sky Survey data release 7. We illustrate how this approach led to a detailed characterization of the dynamic cosmic web underlying the observed galaxy distribution, based on the tidal environment.
Bayesian inference for kinetic models of biotransformation using a generalized rate equation.
Ying, Shanshan; Zhang, Jiangjiang; Zeng, Lingzao; Shi, Jiachun; Wu, Laosheng
2017-03-06
Selecting proper rate equations for the kinetic models is essential to quantify biotransformation processes in the environment. Bayesian model selection method can be used to evaluate the candidate models. However, comparisons of all plausible models can result in high computational cost, while limiting the number of candidate models may lead to biased results. In this work, we developed an integrated Bayesian method to simultaneously perform model selection and parameter estimation by using a generalized rate equation. In the approach, the model hypotheses were represented by discrete parameters and the rate constants were represented by continuous parameters. Then Bayesian inference of the kinetic models was solved by implementing Markov Chain Monte Carlo simulation for parameter estimation with the mixed (i.e., discrete and continuous) priors. The validity of this approach was illustrated through a synthetic case and a nitrogen transformation experimental study. It showed that our method can successfully identify the plausible models and parameters, as well as uncertainties therein. Thus this method can provide a powerful tool to reveal more insightful information for the complex biotransformation processes.
Bayesian inference of local geomagnetic secular variation curves: application to archaeomagnetism
Lanos, Philippe
2014-05-01
The errors that occur at different stages of the archaeomagnetic calibration process are combined using Bayesian hierarchical modelling. The archaeomagnetic data obtained from archaeological structures such as hearths, kilns or sets of bricks and tiles exhibit considerable experimental errors and are generally more or less well dated by archaeological context, history or chronometric methods (14C, TL, dendrochronology, etc.). They can also be associated with stratigraphic observations which provide prior relative chronological information. The modelling we propose allows all these observations and errors to be linked together thanks to appropriate prior probability densities. The model also includes penalized cubic splines for estimating the univariate, spherical or three-dimensional curves for the secular variation of the geomagnetic field (inclination, declination, intensity) over time at a local place. The mean smooth curve we obtain, with its posterior Bayesian envelope, adapts to the effects of variability in the density of reference points over time. Moreover, the hierarchical modelling also provides an efficient way to penalize outliers automatically. With this new posterior estimate of the curve, the Bayesian statistical framework then allows the calendar dates of undated archaeological features (such as kilns) to be estimated from one, two or three geomagnetic parameters (inclination, declination and/or intensity). Date estimates are presented in the same way as those that arise from radiocarbon dating. In order to illustrate the model and the inference method used, we will present results based on recently published French, Bulgarian and Austrian datasets.
Recognizing recurrent neural networks (rRNN): Bayesian inference for recurrent neural networks.
Bitzer, Sebastian; Kiebel, Stefan J
2012-07-01
Recurrent neural networks (RNNs) are widely used in computational neuroscience and machine learning applications. In an RNN, each neuron computes its output as a nonlinear function of its integrated input. While the importance of RNNs, especially as models of brain processing, is undisputed, it is also widely acknowledged that the computations in standard RNN models may be an over-simplification of what real neuronal networks compute. Here, we suggest that the RNN approach may be made computationally more powerful by its fusion with Bayesian inference techniques for nonlinear dynamical systems. In this scheme, we use an RNN as a generative model of dynamic input caused by the environment, e.g. of speech or kinematics. Given this generative RNN model, we derive Bayesian update equations that can decode its output. Critically, these updates define a 'recognizing RNN' (rRNN), in which neurons compute and exchange prediction and prediction error messages. The rRNN has several desirable features that a conventional RNN does not have, e.g. fast decoding of dynamic stimuli and robustness to initial conditions and noise. Furthermore, it implements a predictive coding scheme for dynamic inputs. We suggest that the Bayesian inversion of RNNs may be useful both as a model of brain function and as a machine learning tool. We illustrate the use of the rRNN by an application to the online decoding (i.e. recognition) of human kinematics.
NetDiff - Bayesian model selection for differential gene regulatory network inference.
Thorne, Thomas
2016-12-16
Differential networks allow us to better understand the changes in cellular processes that are exhibited in conditions of interest, identifying variations in gene regulation or protein interaction between, for example, cases and controls, or in response to external stimuli. Here we present a novel methodology for the inference of differential gene regulatory networks from gene expression microarray data. Specifically we apply a Bayesian model selection approach to compare models of conserved and varying network structure, and use Gaussian graphical models to represent the network structures. We apply a variational inference approach to the learning of Gaussian graphical models of gene regulatory networks, that enables us to perform Bayesian model selection that is significantly more computationally efficient than Markov Chain Monte Carlo approaches. Our method is demonstrated to be more robust than independent analysis of data from multiple conditions when applied to synthetic network data, generating fewer false positive predictions of differential edges. We demonstrate the utility of our approach on real world gene expression microarray data by applying it to existing data from amyotrophic lateral sclerosis cases with and without mutations in C9orf72, and controls, where we are able to identify differential network interactions for further investigation.
Multi-model polynomial chaos surrogate dictionary for Bayesian inference in elasticity problems
Contreras, Andres A.
2016-09-19
A method is presented for inferring the presence of an inclusion inside a domain; the proposed approach is suitable for use in a diagnostic device with low computational power. Specifically, we use the Bayesian framework for the inference of stiff inclusions embedded in a soft matrix, mimicking tumors in soft tissues. We rely on a polynomial chaos (PC) surrogate to accelerate the inference process. The PC surrogate predicts the dependence of the displacement field on the random elastic moduli of the materials, and is computed by means of the stochastic Galerkin (SG) projection method. Moreover, the inclusion's geometry is assumed to be unknown, and this is addressed by using a dictionary consisting of several geometrical models with different configurations. A model selection approach based on the evidence provided by the data (Bayes factors) is used to discriminate among the different geometrical models and select the most suitable one. The idea of using a dictionary of pre-computed geometrical models helps to keep the computational cost of the inference process very low, as most of the computational burden is carried out off-line in the solution of the SG problems. Numerical tests are used to validate the methodology, assess its performance, and analyze the robustness to model errors. © 2016 Elsevier Ltd
Sraj, Ihab
2015-10-22
This paper addresses model dimensionality reduction for Bayesian inference based on prior Gaussian fields with uncertainty in the covariance function hyper-parameters. The dimensionality reduction is traditionally achieved using the Karhunen-Loève expansion of a prior Gaussian process assuming covariance function with fixed hyper-parameters, despite the fact that these are uncertain in nature. The posterior distribution of the Karhunen-Loève coordinates is then inferred using available observations. The resulting inferred field is therefore dependent on the assumed hyper-parameters. Here, we seek to efficiently estimate both the field and covariance hyper-parameters using Bayesian inference. To this end, a generalized Karhunen-Loève expansion is derived using a coordinate transformation to account for the dependence with respect to the covariance hyper-parameters. Polynomial Chaos expansions are employed for the acceleration of the Bayesian inference using similar coordinate transformations, enabling us to avoid expanding explicitly the solution dependence on the uncertain hyper-parameters. We demonstrate the feasibility of the proposed method on a transient diffusion equation by inferring spatially-varying log-diffusivity fields from noisy data. The inferred profiles were found closer to the true profiles when including the hyper-parameters’ uncertainty in the inference formulation.
Bickel, David R
2011-01-01
In statistical practice, whether a Bayesian or frequentist approach is used in inference depends not only on the availability of prior information but also on the attitude taken toward partial prior information, with frequentists tending to be more cautious than Bayesians. The proposed framework defines that attitude in terms of a specified amount of caution, thereby enabling data analysis at the level of caution desired and on the basis of any prior information. The caution parameter represents the attitude toward partial prior information in much the same way as a loss function represents the attitude toward risk. When there is very little prior information and nonzero caution, the resulting inferences correspond to those of the candidate confidence intervals and p-values that are most similar to the credible intervals and hypothesis probabilities of the specified Bayesian posterior. On the other hand, in the presence of a known physical distribution of the parameter, inferences are based only on the corres...
Understanding the formation and evolution of interstellar ices: a Bayesian approach
Energy Technology Data Exchange (ETDEWEB)
Makrymallis, Antonios; Viti, Serena, E-mail: antonios@star.ucl.ac.uk [Department of Physics and Astronomy, University College London, London WC1E 6BT (United Kingdom)
2014-10-10
Understanding the physical conditions of dark molecular clouds and star-forming regions is an inverse problem subject to complicated chemistry that varies nonlinearly with both time and the physical environment. In this paper, we apply a Bayesian approach based on a Markov chain Monte Carlo (MCMC) method for solving the nonlinear inverse problems encountered in astrochemical modeling. We use observations for ice and gas species in dark molecular clouds and a time-dependent, gas-grain chemical model to infer the values of the physical and chemical parameters that characterize quiescent regions of molecular clouds. We show evidence that in high-dimensional problems, MCMC algorithms provide a more efficient and complete solution than more classical strategies. The results of our MCMC method enable us to derive statistical estimates and uncertainties for the physical parameters of interest as a result of the Bayesian treatment.
Understanding the Scalability of Bayesian Network Inference Using Clique Tree Growth Curves
Mengshoel, Ole J.
2010-01-01
One of the main approaches to performing computation in Bayesian networks (BNs) is clique tree clustering and propagation. The clique tree approach consists of propagation in a clique tree compiled from a Bayesian network, and while it was introduced in the 1980s, there is still a lack of understanding of how clique tree computation time depends on variations in BN size and structure. In this article, we improve this understanding by developing an approach to characterizing clique tree growth as a function of parameters that can be computed in polynomial time from BNs, specifically: (i) the ratio of the number of a BN's non-root nodes to the number of root nodes, and (ii) the expected number of moral edges in their moral graphs. Analytically, we partition the set of cliques in a clique tree into different sets, and introduce a growth curve for the total size of each set. For the special case of bipartite BNs, there are two sets and two growth curves, a mixed clique growth curve and a root clique growth curve. In experiments, where random bipartite BNs generated using the BPART algorithm are studied, we systematically increase the out-degree of the root nodes in bipartite Bayesian networks, by increasing the number of leaf nodes. Surprisingly, root clique growth is well-approximated by Gompertz growth curves, an S-shaped family of curves that has previously been used to describe growth processes in biology, medicine, and neuroscience. We believe that this research improves the understanding of the scaling behavior of clique tree clustering for a certain class of Bayesian networks; presents an aid for trade-off studies of clique tree clustering using growth curves; and ultimately provides a foundation for benchmarking and developing improved BN inference and machine learning algorithms.
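For readers unfamiliar with Gompertz curves, the sketch below evaluates one common parameterisation, g(x) = a · exp(−b · exp(−c · x)), which rises slowly, accelerates, and then saturates at the asymptote a. The parameter names and values are hypothetical and are not fitted to the clique tree data discussed in the article.

```python
import math


def gompertz(x, asymptote, displacement, rate):
    """Gompertz curve g(x) = a * exp(-b * exp(-c * x)).

    asymptote (a): upper limit approached as x grows
    displacement (b): shifts the curve along the x axis
    rate (c): controls how quickly growth saturates
    """
    return asymptote * math.exp(-displacement * math.exp(-rate * x))


# Hypothetical parameters, evaluated on a coarse grid of x values
xs = list(range(0, 21, 4))
curve = [gompertz(x, asymptote=1000.0, displacement=8.0, rate=0.4) for x in xs]

# The inflection point lies at x = ln(b) / c, where g equals a / e
inflection_x = math.log(8.0) / 0.4
inflection_value = gompertz(inflection_x, 1000.0, 8.0, 0.4)
```

The inflection at a/e, below half of the asymptote, is what gives the curve its asymmetric S shape compared with a logistic curve.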
Mengersen, Kerrie
2017-01-01
Objectives: In recent years, large-scale longitudinal neuroimaging studies have improved our understanding of healthy ageing and pathologies including Alzheimer's disease (AD). A particular focus of these studies is group differences and identification of participants at risk of deteriorating to a worse diagnosis. For this, statistical analysis using linear mixed-effects (LME) models is used to account for correlated observations from individuals measured over time. A Bayesian framework for LME models in AD is introduced in this paper to provide additional insight often not found in current LME volumetric analyses. Setting and participants: A longitudinal neuroimaging case study of ageing was analysed, covering 260 participants diagnosed as either healthy controls (HC), mild cognitive impaired (MCI) or AD. Bayesian LME models for the ventricle and hippocampus regions were used to: (1) estimate how the volumes of these regions change over time by diagnosis, (2) identify high-risk non-AD individuals with AD-like degeneration and (3) determine probabilistic trajectories of diagnosis groups over age. Results: We observed (1) large differences in the average rate of change of volume for the ventricle and hippocampus regions between diagnosis groups, (2) high-risk individuals who had progressed from HC to MCI and displayed similar rates of deterioration as AD counterparts, and (3) critical time points which indicate where deterioration of regions begins to diverge between the diagnosis groups. Conclusions: To the best of our knowledge, this is the first application of Bayesian LME models to neuroimaging data which provides inference on a population and individual level in the AD field. The application of a Bayesian LME framework allows for additional information to be extracted from longitudinal studies. This provides health professionals with valuable information of neurodegeneration stages, and a potential to provide a better understanding of disease pathology
Likelihood-based inference for clustered line transect data
DEFF Research Database (Denmark)
Waagepetersen, Rasmus; Schweder, Tore
2006-01-01
The uncertainty in estimation of spatial animal density from line transect surveys depends on the degree of spatial clustering in the animal population. To quantify the clustering we model line transect data as independent thinnings of spatial shot-noise Cox processes. Likelihood-based inference...... is implemented using Markov chain Monte Carlo (MCMC) methods to obtain efficient estimates of spatial clustering parameters. Uncertainty is addressed using parametric bootstrap or by consideration of posterior distributions in a Bayesian setting. Maximum likelihood estimation and Bayesian inference are compared...
Probabilistic Damage Characterization Using the Computationally-Efficient Bayesian Approach
Warner, James E.; Hochhalter, Jacob D.
2016-01-01
This work presents a computationally-efficient approach for damage determination that quantifies uncertainty in the provided diagnosis. Given strain sensor data that are polluted with measurement errors, Bayesian inference is used to estimate the location, size, and orientation of damage. This approach uses Bayes' Theorem to combine any prior knowledge an analyst may have about the nature of the damage with information provided implicitly by the strain sensor data to form a posterior probability distribution over possible damage states. The unknown damage parameters are then estimated based on samples drawn numerically from this distribution using a Markov Chain Monte Carlo (MCMC) sampling algorithm. Several modifications are made to the traditional Bayesian inference approach to provide significant computational speedup. First, an efficient surrogate model is constructed using sparse grid interpolation to replace a costly finite element model that must otherwise be evaluated for each sample drawn with MCMC. Next, the standard Bayesian posterior distribution is modified using a weighted likelihood formulation, which is shown to improve the convergence of the sampling process. Finally, a robust MCMC algorithm, Delayed Rejection Adaptive Metropolis (DRAM), is adopted to sample the probability distribution more efficiently. Numerical examples demonstrate that the proposed framework effectively provides damage estimates with uncertainty quantification and can yield orders of magnitude speedup over standard Bayesian approaches.
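To make the sampling step concrete, the sketch below draws posterior samples with a plain random-walk Metropolis algorithm. It is a deliberately simpler stand-in for DRAM (no delayed rejection, no proposal adaptation), and the one-parameter Gaussian model, the function names and all default values are assumptions chosen for illustration rather than anything taken from the abstract.

```python
import math
import random


def log_posterior(theta, data, sigma=1.0, prior_sd=10.0):
    """Unnormalized log posterior for a toy model (assumed for illustration):
    Gaussian likelihood with known noise sigma and a zero-mean Gaussian prior."""
    log_prior = -0.5 * (theta / prior_sd) ** 2
    log_like = -0.5 * sum(((y - theta) / sigma) ** 2 for y in data)
    return log_prior + log_like


def metropolis(data, n_samples=5000, step=0.5, start=0.0, seed=0):
    """Random-walk Metropolis sampler; returns the full chain including burn-in."""
    rng = random.Random(seed)
    theta = start
    current = log_posterior(theta, data)
    samples = []
    for _ in range(n_samples):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, data)
        # Accept with probability min(1, posterior ratio); 1 - random() lies in (0, 1].
        if math.log(1.0 - rng.random()) < candidate - current:
            theta, current = proposal, candidate
        samples.append(theta)
    return samples


if __name__ == "__main__":
    chain = metropolis([1.8, 2.1, 2.4, 1.9, 2.3])
    kept = chain[1000:]  # discard burn-in
    print("posterior mean estimate:", sum(kept) / len(kept))
```

In a setting like the one described, each call to the log posterior would evaluate the surrogate model instead of a closed-form likelihood, which is where the reported speedup would come from.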
Energy Technology Data Exchange (ETDEWEB)
Dana L. Kelly; Albert Malkhasyan
2010-06-01
There is a nearly ubiquitous assumption in PSA that parameter values are at least piecewise-constant in time. As a result, Bayesian inference tends to incorporate many years of plant operation, over which there have been significant changes in plant operational and maintenance practices, plant management, etc. These changes can cause significant changes in parameter values over time; however, failure to perform Bayesian inference in the proper time-dependent framework can mask these changes. Failure to question the assumption of constant parameter values, and failure to perform Bayesian inference in the proper time-dependent framework were noted as important issues in NUREG/CR-6813, performed for the U. S. Nuclear Regulatory Commission’s Advisory Committee on Reactor Safeguards in 2003. That report noted that “industry lacks tools to perform time-trend analysis with Bayesian updating.” This paper describes an application of time-dependent Bayesian inference methods developed for the European Commission Ageing PSA Network. These methods utilize open-source software, implementing Markov chain Monte Carlo sampling. The paper also illustrates the development of a generic prior distribution, which incorporates multiple sources of generic data via weighting factors that address differences in key influences, such as vendor, component boundaries, conditions of the operating environment, etc.
Alsing, Justin; Jaffe, Andrew H
2016-01-01
We apply two Bayesian hierarchical inference schemes to infer shear power spectra, shear maps and cosmological parameters from the CFHTLenS weak lensing survey - the first application of this method to data. In the first approach, we sample the joint posterior distribution of the shear maps and power spectra by Gibbs sampling, with minimal model assumptions. In the second approach, we sample the joint posterior of the shear maps and cosmological parameters, providing a new, accurate and principled approach to cosmological parameter inference from cosmic shear data. As a first demonstration on data we perform a 2-bin tomographic analysis to constrain cosmological parameters and investigate the possibility of photometric redshift bias in the CFHTLenS data. Under the baseline $\Lambda$CDM model we constrain $S_8 = \sigma_8(\Omega_\mathrm{m}/0.3)^{0.5} = 0.67^{+0.03}_{-0.03}$ $(68\%)$, consistent with previous CFHTLenS analysis but in tension with Planck. Adding neutrino m...
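The alternating structure of a Gibbs sampler over a map and its power can be illustrated with a scalar toy problem. The sketch assumes data d = s + n with signal s ~ N(0, C), noise n ~ N(0, N) of known variance, and a Jeffreys prior p(C) proportional to 1/C; it is not the CFHTLenS pipeline, and the function and variable names are hypothetical.

```python
import random


def gibbs_signal_variance(data, noise_var, n_iter=400, seed=0):
    """Toy Gibbs sampler alternating between the signal values and their variance C.

    Assumed model: d_i = s_i + n_i, s_i ~ N(0, C), n_i ~ N(0, noise_var),
    with prior p(C) proportional to 1/C. Returns the chain of C draws.
    """
    rng = random.Random(seed)
    n = len(data)
    signal_var = 1.0  # arbitrary starting value
    chain = []
    for _ in range(n_iter):
        # Step 1: draw s | C, d. Each element is Gaussian with a
        # Wiener-filter mean and variance C * N / (C + N).
        weight = signal_var / (signal_var + noise_var)
        cond_sd = (weight * noise_var) ** 0.5
        signal = [rng.gauss(weight * d, cond_sd) for d in data]
        # Step 2: draw C | s, an inverse-gamma with shape n/2 and
        # scale sum(s^2)/2, sampled as scale / Gamma(shape, 1).
        scale = 0.5 * sum(s * s for s in signal)
        signal_var = scale / rng.gammavariate(0.5 * n, 1.0)
        chain.append(signal_var)
    return chain


if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic data with true signal variance 4 and noise variance 1.
    data = [rng.gauss(0.0, 2.0) + rng.gauss(0.0, 1.0) for _ in range(1000)]
    chain = gibbs_signal_variance(data, noise_var=1.0)
    kept = chain[100:]
    print("posterior mean of C:", sum(kept) / len(kept))
```

Each step only requires sampling from a standard conditional distribution, which is what makes this kind of scheme attractive when the joint posterior is high-dimensional.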
Albert, Carlo; Ulzega, Simone; Stoop, Ruedi
2016-04-01
Parameter inference is a fundamental problem in data-driven modeling. Given observed data that is believed to be a realization of some parameterized model, the aim is to find parameter values that are able to explain the observed data. In many situations, the dominant sources of uncertainty must be included into the model for making reliable predictions. This naturally leads to stochastic models. Stochastic models render parameter inference much harder, as the aim then is to find a distribution of likely parameter values. In Bayesian statistics, which is a consistent framework for data-driven learning, this so-called posterior distribution can be used to make probabilistic predictions. We propose a novel, exact, and very efficient approach for generating posterior parameter distributions for stochastic differential equation models calibrated to measured time series. The algorithm is inspired by reinterpreting the posterior distribution as a statistical mechanics partition function of an object akin to a polymer, where the measurements are mapped on heavier beads compared to those of the simulated data. To arrive at distribution samples, we employ a Hamiltonian Monte Carlo approach combined with a multiple time-scale integration. A separation of time scales naturally arises if either the number of measurement points or the number of simulation points becomes large. Furthermore, at least for one-dimensional problems, we can decouple the harmonic modes between measurement points and solve the fastest part of their dynamics analytically. Our approach is applicable to a wide range of inference problems and is highly parallelizable.
Ursino, Mauro; Cuppini, Cristiano; Magosso, Elisa
2017-03-01
Recent theoretical and experimental studies suggest that in multisensory conditions, the brain performs a near-optimal Bayesian estimate of external events, giving more weight to the more reliable stimuli. However, the neural mechanisms responsible for this behavior, and its progressive maturation in a multisensory environment, are still insufficiently understood. The aim of this letter is to analyze this problem with a neural network model of audiovisual integration, based on probabilistic population coding-the idea that a population of neurons can encode probability functions to perform Bayesian inference. The model consists of two chains of unisensory neurons (auditory and visual) topologically organized. They receive the corresponding input through a plastic receptive field and reciprocally exchange plastic cross-modal synapses, which encode the spatial co-occurrence of visual-auditory inputs. A third chain of multisensory neurons performs a simple sum of auditory and visual excitations. The work includes a theoretical part and a computer simulation study. We show how a simple rule for synapse learning (consisting of Hebbian reinforcement and a decay term) can be used during training to shrink the receptive fields and encode the unisensory likelihood functions. Hence, after training, each unisensory area realizes a maximum likelihood estimate of stimulus position (auditory or visual). In cross-modal conditions, the same learning rule can encode information on prior probability into the cross-modal synapses. Computer simulations confirm the theoretical results and show that the proposed network can realize a maximum likelihood estimate of auditory (or visual) positions in unimodal conditions and a Bayesian estimate, with moderate deviations from optimality, in cross-modal conditions. Furthermore, the model explains the ventriloquism illusion and, looking at the activity in the multimodal neurons, explains the automatic reweighting of auditory and visual inputs
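The statement that the optimal estimate gives more weight to the more reliable stimulus can be written down directly for the simplest case. The sketch below assumes independent Gaussian noise on each cue and a flat prior, so it is the textbook inverse-variance rule rather than the neural network model of the abstract; the function name and the example numbers are illustrative.

```python
def combine_cues(x_auditory, sd_auditory, x_visual, sd_visual):
    """Reliability-weighted (inverse-variance) combination of two position cues.

    Assumes independent Gaussian noise on each cue and a flat prior.
    Returns the combined estimate and its standard deviation.
    """
    w_a = 1.0 / sd_auditory ** 2  # reliability of the auditory cue
    w_v = 1.0 / sd_visual ** 2    # reliability of the visual cue
    estimate = (w_a * x_auditory + w_v * x_visual) / (w_a + w_v)
    sd = (1.0 / (w_a + w_v)) ** 0.5
    return estimate, sd


if __name__ == "__main__":
    # A blurry sound at 10 degrees and a sharp flash at 0 degrees:
    # the combined estimate is pulled strongly toward the visual position.
    print(combine_cues(10.0, 4.0, 0.0, 1.0))
```

With a much sharper visual cue the perceived sound location shifts toward the flash, which is the qualitative pattern of the ventriloquism illusion.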
Directory of Open Access Journals (Sweden)
Haseeb A. Khan
2008-01-01
Full Text Available This investigation aimed to compare the inference of antelope phylogenies resulting from the 16S rRNA, cytochrome-b (cyt-b) and d-loop segments of mitochondrial DNA using three different computational models including Bayesian (BA), maximum parsimony (MP) and unweighted pair group method with arithmetic mean (UPGMA). The respective nucleotide sequences of three Oryx species (Oryx leucoryx, Oryx dammah and Oryx gazella) and an out-group (Addax nasomaculatus) were aligned and subjected to BA, MP and UPGMA models for comparing the topologies of respective phylogenetic trees. The 16S rRNA region possessed the highest frequency of conserved sequences (97.65%) followed by cyt-b (94.22%) and d-loop (87.29%). There were few transitions (2.35%) and no transversions in 16S rRNA as compared to cyt-b (5.61% transitions and 0.17% transversions) and d-loop (11.57% transitions and 1.14% transversions) when comparing the four taxa. All three mitochondrial segments clearly differentiated the genus Addax from Oryx using the BA or UPGMA models. The topologies of all the gamma-corrected Bayesian trees were identical irrespective of the marker type. The UPGMA trees resulting from 16S rRNA and d-loop sequences were also identical (Oryx dammah grouped with Oryx leucoryx) to the Bayesian trees, except that the UPGMA tree based on cyt-b showed a slightly different phylogeny (Oryx dammah grouped with Oryx gazella) with low bootstrap support. However, the MP model failed to differentiate the genus Addax from Oryx. These findings demonstrate the efficiency and robustness of BA and UPGMA methods for phylogenetic analysis of antelopes using mitochondrial markers.
Bayesian approaches to infer the physical properties of star-forming galaxies at cosmic dawn
Salmon, Brett Weston Killebrew
In this thesis, I seek to advance our understanding of galaxy formation and evolution in the early universe. Using the largest single project ever conducted by the Hubble Space Telescope (the Cosmic Assembly Near-infrared Deep Extragalactic Legacy Survey, CANDELS) I use deep and wide broadband photometric imaging to infer the physical properties of galaxies from z=8.5 to z=1.5. First, I will present a study that extends the relationship between the star-formation rates (SFRs) and stellar masses (M⋆) of galaxies to 3.5attenuated in galaxies. I calculate the Bayesian evidence for galaxies under different assumptions of their underlying dust-attenuation law. By modeling galaxy ultraviolet-to-near-IR broadband CANDELS data I produce Bayesian evidence towards the dust law in individual galaxies that is confirmed by their observed IR luminosities. Moreover, I find a tight correlation between the strength of attenuation in galaxies and their dust law, a relation reinforced by the results from radiative transfer simulations. Finally, I use the Bayesian methods developed in this thesis to study the number density of SFR in galaxies from z=8 to z=4, and resolve the current disconnect between its evolution and that of the stellar mass function. In doing so, I place the first constraints on the dust law of z>4 galaxies, finding it obeys a similar relation as found at z˜2. I find a clear excess in number density at high SFRs. This new SFR function is in better agreement with the observed stellar mass functions, the few to-date infrared detections at high redshifts, and the connection to the observed distribution of lower redshift infrared sources. Together, these studies greatly improve our understanding of the galaxy star-formation histories, the nature of their dust attenuation, and the distribution of SFR among some of the most distant galaxies in the universe.
Mocapy++ - A toolkit for inference and learning in dynamic Bayesian networks
Directory of Open Access Journals (Sweden)
Hamelryck Thomas
2010-03-01
Full Text Available Abstract Background Mocapy++ is a toolkit for parameter learning and inference in dynamic Bayesian networks (DBNs). It supports a wide range of DBN architectures and probability distributions, including distributions from directional statistics (the statistics of angles, directions and orientations). Results The program package is freely available under the GNU General Public Licence (GPL) from SourceForge http://sourceforge.net/projects/mocapy. The package contains the source for building the Mocapy++ library, several usage examples and the user manual. Conclusions Mocapy++ is especially suitable for constructing probabilistic models of biomolecular structure, due to its support for directional statistics. In particular, it supports the Kent distribution on the sphere and the bivariate von Mises distribution on the torus. These distributions have proven useful to formulate probabilistic models of protein and RNA structure in atomic detail.
Improving PWR core simulations by Monte Carlo uncertainty analysis and Bayesian inference
Castro, Emilio; Buss, Oliver; Garcia-Herranz, Nuria; Hoefer, Axel; Porsch, Dieter
2016-01-01
A Monte Carlo-based Bayesian inference model is applied to the prediction of reactor operation parameters of a PWR nuclear power plant. In this non-perturbative framework, high-dimensional covariance information describing the uncertainty of microscopic nuclear data is combined with measured reactor operation data in order to provide statistically sound, well founded uncertainty estimates of integral parameters, such as the boron letdown curve and the burnup-dependent reactor power distribution. The performance of this methodology is assessed in a blind test approach, where we use measurements of a given reactor cycle to improve the prediction of the subsequent cycle. As it turns out, the resulting improvement of the prediction quality is impressive. In particular, the prediction uncertainty of the boron letdown curve, which is of utmost importance for the planning of the reactor cycle length, can be reduced by one order of magnitude by including the boron concentration measurement information of the previous...
cosmoabc: Likelihood-free inference via Population Monte Carlo Approximate Bayesian Computation
Ishida, E E O; Penna-Lima, M; Cisewski, J; de Souza, R S; Trindade, A M M; Cameron, E
2015-01-01
Approximate Bayesian Computation (ABC) enables parameter inference for complex physical systems in cases where the true likelihood function is unknown, unavailable, or computationally too expensive. It relies on the forward simulation of mock data and comparison between observed and synthetic catalogues. Here we present cosmoabc, a Python ABC sampler featuring a Population Monte Carlo (PMC) variation of the original ABC algorithm, which uses an adaptive importance sampling scheme. The code is very flexible and can be easily coupled to an external simulator, while allowing arbitrary distance and prior functions to be incorporated. As an example of practical application, we coupled cosmoabc with the numcosmo library and demonstrate how it can be used to estimate posterior probability distributions over cosmological parameters based on measurements of galaxy cluster number counts without computing the likelihood function. cosmoabc is published under the GPLv3 license on PyPI and GitHub and documentation is availabl...
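The forward-simulate-and-compare idea behind ABC can be sketched with the basic rejection variant. This is not the Population Monte Carlo scheme and it does not use the cosmoabc API; the simulator, prior, distance function and tolerance below are all hypothetical placeholders for a toy Gaussian-mean problem.

```python
import random


def abc_rejection(observed, simulate, prior_sample, distance, tolerance, n_draws, seed=0):
    """Basic ABC rejection sampler: keep prior draws whose mock data lie
    within `tolerance` of the observed data under `distance`."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)       # draw a parameter from the prior
        mock = simulate(theta, rng)     # forward-simulate a mock data set
        if distance(observed, mock) <= tolerance:
            accepted.append(theta)
    return accepted


# Toy ingredients (assumptions for illustration only).
def simulate(theta, rng, n=50):
    """Mock catalogue: n Gaussian draws with mean theta and unit variance."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def prior_sample(rng):
    """Flat prior on the mean over [-5, 5]."""
    return rng.uniform(-5.0, 5.0)


def distance(a, b):
    """Absolute difference of sample means, used as the summary statistic."""
    return abs(sum(a) / len(a) - sum(b) / len(b))


if __name__ == "__main__":
    observed = simulate(2.0, random.Random(1))
    posterior = abc_rejection(observed, simulate, prior_sample, distance, 0.2, 5000)
    print(len(posterior), "accepted; mean:", sum(posterior) / len(posterior))
```

A PMC variant would replace the fixed tolerance and prior draws with a sequence of shrinking tolerances and importance-weighted particle populations, trading this simplicity for far fewer wasted simulations.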
Chakraborty, Shubhankar; Roy Chaudhuri, Partha; Das, Prasanta Kr
2016-07-01
In this communication, a novel optical technique is proposed for reconstructing the shape of a Taylor bubble using measurements from multiple arrays of optical sensors. The deviation of an optical beam passing through the bubble depends on the contour of the bubble surface. A theoretical model of the deviation of a beam during the traverse of a Taylor bubble through it has been developed. Using this model and the time history of the deviation captured by the sensor array, the bubble shape has been reconstructed. The reconstruction has been performed using an inverse algorithm based on a Bayesian inference technique and a Markov chain Monte Carlo sampling algorithm. The reconstructed nose shape has been compared with the true shape, extracted through image processing of high speed images. Finally, an error analysis has been performed to pinpoint the sources of the errors.
Wu, Dongfeng; Rosner, Gary L; Broemeling, Lyle
2005-12-01
This article extends previous probability models for periodic breast cancer screening examinations. The specific aim is to provide statistical inference for age dependence of sensitivity and the transition probability from the disease free to the preclinical state. The setting is a periodic screening program in which a cohort of initially asymptomatic women undergo a sequence of breast cancer screening exams. We use age as a covariate in the estimation of screening sensitivity and the transition probability simultaneously, both from a frequentist point of view and within a Bayesian framework. We apply our method to the Health Insurance Plan of Greater New York study of female breast cancer and give age-dependent sensitivity and transition probability density estimates. The inferential methodology we develop is also applicable when analyzing studies of modalities for early detection of other types of progressive chronic diseases.
Bayesian inference for a wave-front model of the neolithization of Europe.
Baggaley, Andrew W; Sarson, Graeme R; Shukurov, Anvar; Boys, Richard J; Golightly, Andrew
2012-07-01
We consider a wave-front model for the spread of neolithic culture across Europe, and use Bayesian inference techniques to provide estimates for the parameters within this model, as constrained by radiocarbon data from southern and western Europe. Our wave-front model allows for both an isotropic background spread (incorporating the effects of local geography) and a localized anisotropic spread associated with major waterways. We introduce an innovative numerical scheme to track the wave front, and use Gaussian process emulators to further increase the efficiency of our model, thereby making Markov chain Monte Carlo methods practical. We allow for uncertainty in the fit of our model, and discuss the inferred distribution of the parameter specifying this uncertainty, along with the distributions of the parameters of our wave-front model. We subsequently use predictive distributions, taking account of parameter uncertainty, to identify radiocarbon sites which do not agree well with our model. These sites may warrant further archaeological study or motivate refinements to the model.
Duforet-Frebourg, Nicolas; Blum, Michael G B
2014-04-01
Patterns of isolation-by-distance (IBD) arise when population differentiation increases with increasing geographic distances. Patterns of IBD are usually caused by local spatial dispersal, which explains why differences of allele frequencies between populations accumulate with distance. However, spatial variations of demographic parameters such as migration rate or population density can generate nonstationary patterns of IBD where the rate at which genetic differentiation accumulates varies across space. To characterize nonstationary patterns of IBD, we infer local genetic differentiation based on Bayesian kriging. Local genetic differentiation for a sampled population is defined as the average genetic differentiation between the sampled population and fictive neighboring populations. To avoid defining populations in advance, the method can also be applied at the scale of individuals making it relevant for landscape genetics. Inference of local genetic differentiation relies on a matrix of pairwise similarity or dissimilarity between populations or individuals such as matrices of FST between pairs of populations. Simulation studies show that maps of local genetic differentiation can reveal barriers to gene flow but also other patterns such as continuous variations of gene flow across habitat. The potential of the method is illustrated with two datasets: single nucleotide polymorphisms from human Swedish populations and dominant markers for alpine plant species.
Bayesian Inference of the Composition and Inflation Power of Hot Jupiters
Thorngren, Daniel Peter; Fortney, Jonathan J.
2016-10-01
The radius of a planet for a given mass is the result of its composition and thermal evolutionary history. For cooler giants, where thermal evolution is relatively well-understood, we can infer a planet's bulk composition from its mass, radius, stellar insolation and age, since, all else being equal, more metal-rich planets are smaller and denser. For inflated hot giants, there is a degeneracy between inferred composition and inflation power. Within a Bayesian framework we examine both groups, beginning with the cool giant planets. Among these, we observe that the internal heavy-element mass correlates well with the total planet mass, and the metal enrichment relative to the parent star is correlated negatively with planet mass. However, it appears that there is not a simple relation between the planet heavy-element mass and stellar metallicity. These fundamental "mass-metallicity" results are consistent with the core accretion model of planet formation. For the hotter inflated gas giants, we estimate the functional dependence of inflation power on stellar insolation by demanding that the same metal to mass relation applies to both cold and hot gas giants. We consider various forms for this relation and the resulting outliers. This inflation power result is robust to assumptions about metal placement within the planet and equation of state because it relies only on matching the two groups of planets. These results serve as a new way to connect models of planet inflation to existing observations of giant planets.
Overstall, Antony M; Woods, David C
2013-06-01
Bayesian inference is considered for statistical models that depend on the evaluation of a computationally expensive computer code or simulator. For such situations, the number of evaluations of the likelihood function, and hence of the unnormalized posterior probability density function, is determined by the available computational resource and may be extremely limited. We present a new example of such a simulator that describes the properties of human embryonic stem cells using data from optical trapping experiments. This application is used to motivate a novel strategy for Bayesian inference which exploits a Gaussian process approximation of the simulator and allows computationally efficient Markov chain Monte Carlo inference. The advantages of this strategy over previous methodology are that it is less reliant on the determination of tuning parameters and allows the application of model diagnostic procedures that require no additional evaluations of the simulator. We show the advantages of our method on synthetic examples and demonstrate its application on stem cell experiments.
Analysis of simulated data for the KArlsruhe TRItium Neutrino experiment using Bayesian inference
DEFF Research Database (Denmark)
Riis, Anna Sejersen; Hannestad, Steen; Weinheimer, C.
2011-01-01
neutrinos. As an alternative to the frequentist minimization methods used in the analysis of the earlier experiments in Mainz and Troitsk we have been investigating Markov chain Monte Carlo (MCMC) methods which are very well suited for probing multiparameter spaces. We found that implementing the KATRIN χ2...
Bayesian inference reveals ancient origin of simian foamy virus in orangutans.
Reid, Michael J C; Switzer, William M; Schillaci, Michael A; Klegarth, Amy R; Campbell, Ellsworth; Ragonnet, Manon; Joanisse, Isabelle; Caminiti, Kyna; Lowenberger, Carl A; Galdikas, Birute Mary F; Hollocher, Hope; Sandstrom, Paul A; Brooks, James I
2017-03-05
Simian foamy viruses (SFVs) infect most nonhuman primate species and appear to co-evolve with their hosts. This co-evolutionary signal is particularly strong among great apes, including orangutans (genus Pongo). Previous studies have identified three distinct orangutan SFV clades. The first of these three clades is composed of SFV from P. abelii from Sumatra, the second consists of SFV from P. pygmaeus from Borneo, while the third clade is mixed, comprising an SFV strain found in both species of orangutan. The existence of the mixed clade has been attributed to an expansion of P. pygmaeus into Sumatra following the Mount Toba super-volcanic eruption about 73,000 years ago. Divergence dating, however, has yet to be performed to establish a temporal association with the Toba eruption. Here, we use a Bayesian framework and a relaxed molecular clock model with fossil calibrations to test the Toba hypothesis and to gain a more complete understanding of the evolutionary history of orangutan SFV. As with previous studies, our results show a similar three-clade orangutan SFV phylogeny, along with strong statistical support for SFV-host co-evolution in orangutans. Using Bayesian inference, we date the origin of orangutan SFV to >4.7 million years ago (mya), while the mixed species clade dates to approximately 1.7 mya, >1.6 million years older than the Toba super-eruption. These results, combined with fossil and paleogeographic evidence, suggest that the origin of SFV in Sumatran and Bornean orangutans, including the mixed species clade, likely occurred on the mainland of Indo-China during the Late Pliocene and Calabrian stage of the Pleistocene, respectively.
Genetic parameters for buffalo milk yield and milk quality traits using Bayesian inference.
Aspilcueta-Borquis, R R; Araujo Neto, F R; Baldi, F; Bignardi, A B; Albuquerque, L G; Tonhati, H
2010-05-01
The availability of accurate genetic parameters for important economic traits in milking buffaloes is critical for implementation of a genetic evaluation program. In the present study, heritabilities and genetic correlations for fat (FY305), protein (PY305), and milk (MY305) yields, milk fat (%F) and protein (%P) percentages, and SCS were estimated using Bayesian methodology. A total of 4,907 lactations from 1,985 cows were used. The (co)variance components were estimated using multiple-trait analysis by a Bayesian inference method, applying an animal model, through Gibbs sampling. The model included the fixed effects of contemporary groups (herd-year and calving season), number of milkings (2 levels), and age of cow at calving as a covariable (linear and quadratic effects). The additive genetic, permanent environmental, and residual effects were included as random effects in the model. The posterior means of heritability distributions for MY305, FY305, PY305, %F, %P, and SCS were 0.22, 0.21, 0.23, 0.33, 0.39, and 0.26, respectively. The genetic correlation estimates ranged from -0.13 (between %P and SCS) to 0.94 (between MY305 and PY305). The permanent environmental correlation estimates ranged from -0.38 (between MY305 and %P) to 0.97 (between MY305 and PY305). Residual and phenotypic correlation estimates ranged from -0.26 (between PY305 and SCS) to 0.97 (between MY305 and PY305) and from -0.26 (between MY305 and SCS) to 0.97 (between MY305 and PY305), respectively. Milk yield, milk components, and milk somatic cell counts have enough genetic variation for selection purposes. The genetic correlation estimates suggest that milk components and milk somatic cell counts would be only slightly affected if increasing milk yield were the selection goal. Selecting to increase FY305 or PY305 will also increase MY305, %P, and %F.
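For reference, the reported parameters are functions of the (co)variance components sampled by the Gibbs chain. Assuming the usual definitions for an animal model with additive genetic (a), permanent environmental (pe) and residual (e) effects, which the abstract does not spell out, the heritability of a trait and the genetic correlation between traits i and j would be computed at each iteration as:

```latex
h^2 = \frac{\sigma_a^2}{\sigma_a^2 + \sigma_{pe}^2 + \sigma_e^2},
\qquad
r_g(i,j) = \frac{\sigma_{a,ij}}{\sqrt{\sigma_{a,i}^2 \, \sigma_{a,j}^2}}
```

Here \sigma_{a,ij} denotes the additive genetic covariance between the two traits. The posterior means quoted above would then be averages of these quantities over the retained Gibbs samples.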
Strategies for MCMC computation in quantitative genetics
DEFF Research Database (Denmark)
Waagepetersen, Rasmus; Ibáñez-Escriche, Noelia; Sorensen, Daniel
both in size and regarding the inferences concerning the genetic covariance parameters. Section 2 discusses general strategies for obtaining efficient MCMC algorithms while Section 3 considers these strategies in the specific context of the San Cristobal-Gaudy et al. (1998) model. Section 4 presents...... be implemented relatively straightforwardly. The assumptions of normality, linearity, and variance homogeneity are in many cases not valid. One may then consider generalized linear mixed models where the genetic random effects enter at the level of the linear predictor. San Cristobal-Gaudy et al. (1998) proposed...... likelihood inference is complicated since it is not possible to evaluate explicitly the likelihood function and conventional Gibbs sampling is difficult since the full conditional distributions are not anymore of standard forms. The aim of this paper is to discuss strategies to obtain efficient Markov chain...
Dorn, Caroline; Venturini, Julia; Khan, Amir; Heng, Kevin; Alibert, Yann; Helled, Ravit; Rivoldini, Attilio; Benz, Willy
2017-01-01
Aims: We aim to present a generalized Bayesian inference method for constraining interiors of super Earths and sub-Neptunes. Our methodology succeeds in quantifying the degeneracy and correlation of structural parameters for high dimensional parameter spaces. Specifically, we identify what constraints can be placed on composition and thickness of core, mantle, ice, ocean, and atmospheric layers given observations of mass, radius, and bulk refractory abundance constraints (Fe, Mg, Si) from observations of the host star's photospheric composition. Methods: We employed a full probabilistic Bayesian inference analysis that formally accounts for observational and model uncertainties. Using a Markov chain Monte Carlo technique, we computed joint and marginal posterior probability distributions for all structural parameters of interest. We included state-of-the-art structural models based on self-consistent thermodynamics of core, mantle, high-pressure ice, and liquid water. Furthermore, we tested and compared two different atmospheric models that are tailored for modeling thick and thin atmospheres, respectively. Results: First, we validate our method against Neptune. Second, we apply it to synthetic exoplanets of fixed mass and determine the effect on interior structure and composition when (1) radius; (2) atmospheric model; (3) data uncertainties; (4) semi-major axes; (5) atmospheric composition (i.e., a priori assumption of enriched envelopes versus pure H/He envelopes); and (6) prior distributions are varied. Conclusions: Our main conclusions are: (1) given available data, the range of possible interior structures is large; quantification of the degeneracy of possible interiors is therefore indispensable for meaningful planet characterization. (2) Our method predicts models that agree with independent estimates of Neptune's interior. (3) Increasing the precision in mass and radius leads to much improved constraints on ice mass fraction, size of rocky interior, but
Xu, Chengcheng; Wang, Wei; Liu, Pan; Li, Zhibin
2015-12-01
This study aimed to develop a real-time crash risk model with limited data in China by using Bayesian meta-analysis and Bayesian inference approach. A systematic review was first conducted by using three different Bayesian meta-analyses, including the fixed effect meta-analysis, the random effect meta-analysis, and the meta-regression. The meta-analyses provided a numerical summary of the effects of traffic variables on crash risks by quantitatively synthesizing results from previous studies. The random effect meta-analysis and the meta-regression produced a more conservative estimate for the effects of traffic variables compared with the fixed effect meta-analysis. Then, the meta-analyses results were used as informative priors for developing crash risk models with limited data. Three different meta-analyses significantly affect model fit and prediction accuracy. The model based on meta-regression can increase the prediction accuracy by about 15% as compared to the model that was directly developed with limited data. Finally, the Bayesian predictive densities analysis was used to identify the outliers in the limited data. It can further improve the prediction accuracy by 5.0%.
Comparing rates of springtail predation by web-building spiders using Bayesian inference.
Welch, Kelton D; Schofield, Matthew R; Chapman, Eric G; Harwood, James D
2014-08-01
A major goal of gut-content analysis is to quantify predation rates by predators in the field, which could provide insights into the mechanisms behind ecosystem structure and function, as well as quantification of ecosystem services provided. However, percentage-positive results from molecular assays are strongly influenced by factors other than predation rate, and thus can only be reliably used to quantify predation rates under very restrictive conditions. Here, we develop two statistical approaches, one using a parametric bootstrap and the other in terms of Bayesian inference, to build upon previous techniques that use DNA decay rates to rank predators by their rate of prey consumption, by allowing a statistical assessment of confidence in the inferred ranking. To demonstrate the utility of this technique in evaluating ecological data, we test web-building spiders for predation on a primary prey item, springtails. Using these approaches we found that an orb-weaving spider consumes springtail prey at a higher rate than a syntopic sheet-weaving spider, despite occupying microhabitats where springtails are less frequently encountered. We suggest that spider-web architecture (orb web vs. sheet web) is a primary determinant of prey-consumption rates within this assemblage of predators, which demonstrates the potential influence of predator foraging behaviour on trophic web structure. We also discuss how additional assumptions can be incorporated into the same analysis to allow broader application of the technique beyond the specific example presented. We believe that such modelling techniques can greatly advance the field of molecular gut-content analysis.
Raue, Andreas; Theis, Fabian Joachim; Timmer, Jens
2012-01-01
Increasingly complex applications involve large datasets in combination with non-linear and high dimensional mathematical models. In this context, statistical inference is a challenging issue that calls for pragmatic approaches that take advantage of both Bayesian and frequentist methods. The elegance of Bayesian methodology is founded in the propagation of information content provided by experimental data and prior assumptions to the posterior probability distribution of model predictions. However, for complex applications experimental data and prior assumptions potentially constrain the posterior probability distribution insufficiently. In these situations Bayesian Markov chain Monte Carlo sampling can be infeasible. From a frequentist point of view insufficient experimental data and prior assumptions can be interpreted as non-identifiability. The profile likelihood approach offers to detect and to resolve non-identifiability by experimental design iteratively. Therefore, it allows one to better constrain t...
Characteristics of SiC neutron sensor spectrum unfolding process based on Bayesian inference
Energy Technology Data Exchange (ETDEWEB)
Cetnar, Jerzy; Krolikowski, Igor [Faculty of Energy and Fuels AGH - University of Science and Technology, Al. Mickiewicza 30, 30-059 Krakow (Poland); Ottaviani, L. [IM2NP, UMR CNRS 7334, Aix-Marseille University, Case 231 -13397 Marseille Cedex 20 (France); Lyoussi, A. [CEA, DEN, DER, Instrumentation Sensors and Dosimetry Laboratory, Cadarache, F-13108 St-Paul-Lez-Durance (France)
2015-07-01
This paper deals with SiC detector signal interpretation in neutron radiation measurements in mixed neutron-gamma radiation fields, which is called the detector inverse problem or spectrum unfolding, and which aims at finding a representation of the primary radiation based on the measured detector signals. In our novel methodology we resort to a Bayesian inference approach. In the developed procedure the resultant spectrum is unfolded from the detector channel readings, where the estimated neutron fluence in a group structure is obtained together with its statistical characteristics, comprising the standard deviation and the correlation matrix. In the paper we present results of the unfolding process for the case of a D-T neutron source in a neutron-moderating environment. The statistical properties of the obtained results are discussed, as well as the physical meaning of the obtained correlation matrix of the estimated group fluence. The presented work has been carried out within the I-SMART project, which is part of the KIC InnoEnergy R and D program. (authors)
Palacios, Julia A; Minin, Vladimir N
2013-03-01
Changes in population size influence genetic diversity of the population and, as a result, leave a signature of these changes in individual genomes in the population. We are interested in the inverse problem of reconstructing past population dynamics from genomic data. We start with a standard framework based on the coalescent, a stochastic process that generates genealogies connecting randomly sampled individuals from the population of interest. These genealogies serve as a glue between the population demographic history and genomic sequences. It turns out that only the times of genealogical lineage coalescences contain information about population size dynamics. Viewing these coalescent times as a point process, estimating population size trajectories is equivalent to estimating a conditional intensity of this point process. Therefore, our inverse problem is similar to estimating an inhomogeneous Poisson process intensity function. We demonstrate how recent advances in Gaussian process-based nonparametric inference for Poisson processes can be extended to Bayesian nonparametric estimation of population size dynamics under the coalescent. We compare our Gaussian process (GP) approach to one of the state-of-the-art Gaussian Markov random field (GMRF) methods for estimating population trajectories. Using simulated data, we demonstrate that our method has better accuracy and precision. Next, we analyze two genealogies reconstructed from real sequences of hepatitis C and human Influenza A viruses. In both cases, we recover more believed aspects of the viral demographic histories than the GMRF approach. We also find that our GP method produces more reasonable uncertainty estimates than the GMRF method.
Bayesian inference on earthquake size distribution: a case study in Italy
Licia, Faenza; Carlo, Meletti; Laura, Sandri
2010-05-01
This paper is focused on the study of the statistical distribution of earthquake size by using Bayesian inference. The strategy consists in the definition of an a priori distribution based on instrumental seismicity, modeled as a power law distribution. By using the observed historical data, the power law is then modified in order to obtain the posterior distribution. The aim of this paper is to define the earthquake size distribution using all the available seismic data (i.e., instrumental and historical catalogs) and a robust statistical technique. We apply this methodology to Italian seismicity, dividing the territory into source zones as done for the seismic hazard assessment, taken here as a reference model. The results suggest that each area has its own peculiar trend: while the power law is able to capture the mean aspect of the earthquake size distribution, the posterior emphasizes different slopes in different areas. Our results are in general agreement with the ones used in the seismic hazard assessment in Italy. However, there are areas in which a flattening in the curve is shown, meaning a significant departure from the power law behavior and implying that there are some local aspects that a power law distribution is not able to capture.
Genetic parameters for five traits in Africanized honeybees using Bayesian inference
Padilha, Alessandro Haiduck; Sattler, Aroni; Cobuci, Jaime Araújo; McManus, Concepta Margaret
2013-01-01
Heritability and genetic correlations for honey (HP) and propolis production (PP), hygienic behavior (HB), syrup-collection rate (SCR) and percentage of mites on adult bees (PMAB) of a population of Africanized honeybees were estimated. Data from 110 queen bees over three generations were evaluated. Single and multi-trait models were analyzed by Bayesian Inference using MTGSAM. The localization of the hive was significant for SCR and HB and highly significant for PP. Season-year was highly significant only for SCR. The number of frames with bees was significant for HP and PP, including SCR. The heritability estimates were 0.16 for HP, 0.23 for SCR, 0.52 for HB, 0.66 for PP, and 0.13 for PMAB. The genetic correlations were positive among productive traits (PP, HP and SCR) and negative between productive traits and HB, except between PP and HB. Genetic correlations between PMAB and other traits, in general, were negative, except with PP. The study permitted to identify honeybees for improved propolis and honey production. Hygienic behavior may be improved as a consequence of selecting for improved propolis production. The rate of syrup consumption and propolis production may be included in a selection index to enhance honeybee traits. PMID:23885203
Cuevas Rivera, Dario; Bitzer, Sebastian; Kiebel, Stefan J.
2015-01-01
The olfactory information that is received by the insect brain is encoded in the form of spatiotemporal patterns in the projection neurons of the antennal lobe. These dense and overlapping patterns are transformed into a sparse code in Kenyon cells in the mushroom body. Although it is clear that this sparse code is the basis for rapid categorization of odors, it is yet unclear how the sparse code in Kenyon cells is computed and what information it represents. Here we show that this computation can be modeled by sequential firing rate patterns using Lotka-Volterra equations and Bayesian online inference. This new model can be understood as an ‘intelligent coincidence detector’, which robustly and dynamically encodes the presence of specific odor features. We found that the model is able to qualitatively reproduce experimentally observed activity in both the projection neurons and the Kenyon cells. In particular, the model explains mechanistically how sparse activity in the Kenyon cells arises from the dense code in the projection neurons. The odor classification performance of the model proved to be robust against noise and time jitter in the observed input sequences. As in recent experimental results, we found that recognition of an odor happened very early during stimulus presentation in the model. Critically, by using the model, we found surprising but simple computational explanations for several experimental phenomena. PMID:26451888
The Application of Bayesian Inference to Gravitational Waves from Core-Collapse Supernovae
Gossan, Sarah; Ott, Christian; Kalmus, Peter; Logue, Joshua; Heng, Siong
2013-04-01
The gravitational wave (GW) signature of core-collapse supernovae (CCSNe) encodes important information on the supernova explosion mechanism, the workings of which cannot be explored via observations in the electromagnetic spectrum. Recent research has shown that the CCSNe explosion mechanism can be inferred through the application of Bayesian model selection to gravitational wave signals from supernova explosions powered by the neutrino, magnetorotational and acoustic mechanisms. Extending this work, we apply Principal Component Analysis to the GW spectrograms from CCSNe to take into account also the time-frequency evolution of the emitted signals. We do so in the context of Advanced LIGO, to establish if any improvement on distinguishing between various explosion mechanisms can be obtained. Further to this, we consider a five-detector network of interferometers (comprised of the two Advanced LIGO detectors, Advanced Virgo, LIGO India and KAGRA) and generalize the aforementioned analysis for a source of known position but unknown distance, using realistic, re-colored detector data (as opposed to Gaussian noise), in order to make more reliable statements regarding our ability to distinguish between various explosion mechanisms on the basis of their GW signatures.
Energy Technology Data Exchange (ETDEWEB)
George, J.S.; Schmidt, D.M.; Wood, C.C.
1999-02-01
We have developed a Bayesian approach to the analysis of neural electromagnetic (MEG/EEG) data that can incorporate or fuse information from other imaging modalities and addresses the ill-posed inverse problem by sampling the many different solutions which could have produced the given data. From these samples one can draw probabilistic inferences about regions of activation. Our source model assumes a variable number of variable size cortical regions of stimulus-correlated activity. An active region consists of locations on the cortical surface, within a sphere centered on some location in cortex. The number and radii of active regions can vary to defined maximum values. The goal of the analysis is to determine the posterior probability distribution for the set of parameters that govern the number, location, and extent of active regions. Markov Chain Monte Carlo is used to generate a large sample of sets of parameters distributed according to the posterior distribution. This sample is representative of the many different source distributions that could account for given data, and allows identification of probable (i.e. consistent) features across solutions. Examples of the use of this analysis technique with both simulated and empirical MEG data are presented.
Directory of Open Access Journals (Sweden)
Mateus José Sudano
2011-01-01
The objective of this experiment was to test in vitro embryo production (IVP) as a tool to estimate fertility performance in zebu bulls using Bayesian inference statistics. Oocytes were matured and fertilized in vitro using sperm cells from three different Zebu bulls (V, T, and G). The three bulls presented similar results with regard to pronuclear formation and blastocyst formation rates. However, the cleavage rates were different between bulls. The estimated conception rates based on combined data of cleavage and blastocyst formation were very similar to the true conception rates observed for the same bulls after a fixed-time artificial insemination program. Moreover, even when we used cleavage rate data only or blastocyst formation data only, the estimated conception rates were still close to the true conception rates. We conclude that Bayesian inference is an effective statistical procedure to estimate in vivo bull fertility using data from IVP.
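One simple way to picture how rate data of this kind can feed a Bayesian estimate is a conjugate Beta-Binomial update. The sketch below is an assumption for illustration only: the abstract does not state the model used, and the counts, the uniform Beta(1, 1) prior, and the function names are all hypothetical.

```python
def beta_binomial_update(alpha, beta, successes, trials):
    """Posterior Beta parameters after observing binomial data."""
    return alpha + successes, beta + (trials - successes)

def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)

# Hypothetical counts for one bull: 60 cleaved embryos out of 100 oocytes,
# combined with a uniform Beta(1, 1) prior on the underlying rate.
a, b = beta_binomial_update(1.0, 1.0, successes=60, trials=100)
print(a, b, beta_mean(a, b))
```

The posterior Beta(61, 41) summarises both the point estimate and its uncertainty, and it could itself serve as the prior when further data for the same bull arrive.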
Godsey, Brian
2013-01-01
Inferring gene regulatory networks from expression data is difficult, but it is common and often useful. Most network problems are under-determined--there are more parameters than data points--and therefore data or parameter set reduction is often necessary. Correlation between variables in the model also confounds network coefficient inference. In this paper, we present an algorithm that uses integrated, probabilistic clustering to ease the problems of under-determination and correlated variables within a fully Bayesian framework. Specifically, ours is a dynamic Bayesian network with integrated Gaussian mixture clustering, which we fit using variational Bayesian methods. We show, using public, simulated time-course data sets from the DREAM4 Challenge, that our algorithm outperforms non-clustering methods in many cases (7 out of 25) with fewer samples, rarely underperforms (1 out of 25), and often selects a non-clustering model if it better describes the data. Source code (GNU Octave) for BAyesian Clustering Over Networks (BACON) and sample data are available at: http://code.google.com/p/bacon-for-genetic-networks.
Energy Technology Data Exchange (ETDEWEB)
La Russa, D [The Ottawa Hospital Cancer Centre, Ottawa, ON (Canada)
2015-06-15
Purpose: The purpose of this project is to develop a robust method of parameter estimation for a Poisson-based TCP model using Bayesian inference. Methods: Bayesian inference was performed using the PyMC3 probabilistic programming framework written in Python. A Poisson-based TCP regression model that accounts for clonogen proliferation was fit to observed rates of local relapse as a function of equivalent dose in 2 Gy fractions for a population of 623 stage-I non-small-cell lung cancer patients. The Slice Markov Chain Monte Carlo sampling algorithm was used to sample the posterior distributions, and was initiated using the maximum of the posterior distributions found by optimization. The calculation of TCP with each sample step required integration over the free parameter α, which was performed using an adaptive 24-point Gauss-Legendre quadrature. Convergence was verified via inspection of the trace plot and posterior distribution for each of the fit parameters, as well as with comparisons of the most probable parameter values with their respective maximum likelihood estimates. Results: Posterior distributions for α, the standard deviation of α (σ), the average tumour cell-doubling time (Td), and the repopulation delay time (Tk), were generated assuming α/β = 10 Gy, and a fixed clonogen density of 10^7 cm^-3. Posterior predictive plots generated from samples from these posterior distributions are in excellent agreement with the observed rates of local relapse used in the Bayesian inference. The most probable values of the model parameters also agree well with maximum likelihood estimates. Conclusion: A robust method of performing Bayesian inference of TCP data using a complex TCP model has been established.
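To make the structure of such a model concrete, here is a rough sketch of a Poisson TCP with linear-quadratic cell kill, delayed repopulation, and averaging over a normal distribution of α. It is an assumption-laden illustration rather than the author's code: the functional form is a commonly used textbook one, a plain midpoint rule is swapped in for the adaptive Gauss-Legendre quadrature mentioned in the abstract, and the clonogen number, parameter values, and function names are hypothetical.

```python
import math

def tcp_poisson(alpha, eqd2, n_clonogens, alpha_beta=10.0, t=0.0, t_k=0.0, t_d=math.inf):
    """Poisson TCP for a single radiosensitivity alpha (Gy^-1).

    eqd2 is the equivalent dose in 2 Gy fractions (Gy); t, t_k, t_d are in days.
    """
    log_sf = -alpha * eqd2 * (1.0 + 2.0 / alpha_beta)        # linear-quadratic cell kill
    log_sf += math.log(2.0) * max(t - t_k, 0.0) / t_d        # repopulation after delay t_k
    return math.exp(-n_clonogens * math.exp(log_sf))

def tcp_population(alpha_mean, alpha_sd, eqd2, n_clonogens, n_steps=400, width=5.0, **kwargs):
    """Average TCP over a normal distribution of alpha (midpoint rule, truncated at 0)."""
    lo = max(alpha_mean - width * alpha_sd, 0.0)
    hi = alpha_mean + width * alpha_sd
    step = (hi - lo) / n_steps
    num = den = 0.0
    for i in range(n_steps):
        alpha = lo + (i + 0.5) * step
        weight = math.exp(-0.5 * ((alpha - alpha_mean) / alpha_sd) ** 2)
        num += weight * tcp_poisson(alpha, eqd2, n_clonogens, **kwargs)
        den += weight
    return num / den  # dividing by the summed weights renormalises the truncated density

# Hypothetical values: alpha = 0.3 +/- 0.05 Gy^-1 and 1e7 clonogens.
for dose in (50.0, 70.0, 100.0):
    print(dose, tcp_population(0.3, 0.05, dose, 1e7))
```

In a full analysis a function like this would sit inside the likelihood that the sampler evaluates at every step, which is why the cost of the inner integration matters.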
Jakkareddy, Pradeep S.; Balaji, C.
2016-09-01
This paper employs the Bayesian Metropolis-Hastings Markov chain Monte Carlo (MH-MCMC) algorithm to solve the inverse heat transfer problem of determining the spatially varying heat transfer coefficient from a flat plate with flush-mounted discrete heat sources, using temperatures measured at the bottom of the plate. The Nusselt number is assumed to be of the form Nu = a Re^b (x/l)^c. To input reasonable values of 'a' and 'b' into the inverse problem, limited two-dimensional conjugate convection simulations were first done with Comsol. Based on the guidance from these, different values of 'a' and 'b' are input to a computationally less complex problem of conjugate conduction in the flat plate (15 mm thickness), and temperature distributions are obtained at the bottom of the plate, which is a more convenient location for measuring the temperatures without disturbing the flow. Since the goal of this work is to demonstrate the efficacy of the Bayesian approach in accurately retrieving 'a' and 'b', numerically generated temperatures with known values of 'a' and 'b' are treated as 'surrogate' experimental data. The inverse problem is then solved by repeatedly using the forward solutions together with the MH-MCMC approach. To speed up the estimation, the forward model is replaced by an artificial neural network. The mean, maximum a posteriori, and standard deviation of the estimated parameters 'a' and 'b' are reported. The robustness of the proposed method is examined by synthetically adding noise to the temperatures.
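The retrieval of 'a' and 'b' from surrogate data can be sketched with a small random-walk Metropolis-Hastings sampler. This is a toy under stated assumptions, not the authors' setup: the forward model is reduced to Nu = a Re^b (no conduction solver, no neural network, no (x/l)^c term), noise is Gaussian on ln(Nu), the prior is flat, and all values and names are hypothetical. The regressor ln(Re) is centred so that the two sampled parameters are nearly uncorrelated.

```python
import math
import random

def metropolis_hastings(log_post, x0, step, n_samples, rng):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    x = list(x0)
    lp = log_post(x)
    chain, accepted = [], 0
    for _ in range(n_samples):
        proposal = [xi + rng.gauss(0.0, step) for xi in x]
        lp_new = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, lp_new - lp)):
            x, lp = proposal, lp_new
            accepted += 1
        chain.append(list(x))
    return chain, accepted / n_samples

# Surrogate "measurements" generated from known a and b plus noise.
rng = random.Random(0)
a_true, b_true, sigma = 0.3, 0.6, 0.05
reynolds = [1000.0, 2000.0, 4000.0, 8000.0, 16000.0]
log_re = [math.log(r) for r in reynolds]
x_bar = sum(log_re) / len(log_re)
log_nu = [math.log(a_true) + b_true * x + rng.gauss(0.0, sigma) for x in log_re]

def log_post(theta):
    """Flat prior, Gaussian noise on ln(Nu); theta = (c, b) with c = ln(a) + b * x_bar."""
    c, b = theta
    resid = [y - (c + b * (x - x_bar)) for x, y in zip(log_re, log_nu)]
    return -0.5 * sum(r * r for r in resid) / sigma**2

chain, acc_rate = metropolis_hastings(log_post, [3.0, 0.5], 0.03, 20000, rng)
kept = chain[2000:]  # discard burn-in
b_mean = sum(s[1] for s in kept) / len(kept)
a_mean = sum(math.exp(s[0] - s[1] * x_bar) for s in kept) / len(kept)
print(f"a ~ {a_mean:.3f}, b ~ {b_mean:.3f}, acceptance {acc_rate:.2f}")
```

Replacing `log_post` with a call to a fast surrogate of the forward model is where a neural network would enter in the workflow the abstract describes.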
Gelman, Andrew; Robert, Christian P.; Rousseau, Judith
2010-01-01
For many decades, statisticians have made attempts to prepare the Bayesian omelette without breaking the Bayesian eggs; that is, to obtain probabilistic likelihood-based inferences without relying on informative prior distributions. A recent example is Murray Aitkin's book, Statistical Inference, which presents an approach to statistical hypothesis testing based on comparisons of posterior distributions of likelihoods under competing models. Aitkin develops and illustrates his me...
MCMC for Wind Power Simulation
Papaefthymiou, G.; Klöckl, B.
2008-01-01
This paper contributes a Markov chain Monte Carlo (MCMC) method for the direct generation of synthetic time series of wind power output. It is shown that obtaining a stochastic model directly in the wind power domain leads to reduced number of states and to lower order of the Markov chain at equal p
Bayesian inference of genetic parameters for ultrasound scanning traits of Kivircik lambs.
Cemal, I; Karaman, E; Firat, M Z; Yilmaz, O; Ata, N; Karaca, O
2017-03-01
Ultrasound scanning traits have been adopted in selection programs in many countries to improve carcass traits for lean meat production. As the genetic parameters of the traits of interest are important for breeding programs, the aim of the present investigation was to estimate these parameters. The estimated parameters were direct and maternal heritability as well as genetic correlations between the studied traits. The traits were backfat thickness (BFT), skin+backfat thickness (SBFT), eye muscle depth (MD) and live weight on the day of scanning (LW). The breed investigated was Kivircik, which has high-quality meat. Six different multi-trait animal models were fitted using a Bayesian approach to determine the most suitable model for the data. Based on the deviance information criterion, a model that includes direct additive genetic effects, maternal additive genetic effects, direct-maternal genetic covariance and maternal permanent environmental effects proved to be the most appropriate for the data, and therefore inferences were built on the results of that model. The direct heritability estimates for BFT, SBFT, MD and LW were 0.26, 0.26, 0.23 and 0.09, whereas the maternal heritability estimates were 0.27, 0.27, 0.24 and 0.20, respectively. Negative genetic correlations were obtained between direct and maternal effects for BFT, SBFT and MD. Both direct and maternal genetic correlations between traits were favorable, whereas BFT-MD and SBFT-MD had negligible direct genetic correlation. The highest direct and maternal genetic correlations were between BFT and SBFT (0.39) and between MD and LW (0.48), respectively. Our results, in general, indicated that maternal effects should be accounted for in estimation of genetic parameters of ultrasound scanning traits in Kivircik lambs, and SBFT can be used as a selection criterion to improve BFT.
Condition monitoring of distributed systems using two-stage Bayesian inference data fusion
Jaramillo, Víctor H.; Ottewill, James R.; Dudek, Rafał; Lepiarczyk, Dariusz; Pawlik, Paweł
2017-03-01
In industrial practice, condition monitoring is typically applied to critical machinery. A particular piece of machinery may have its own condition monitoring system that allows the health condition of said piece of equipment to be assessed independently of any connected assets. However, industrial machines are typically complex sets of components that continuously interact with one another. In some cases, dynamics resulting from the inception and development of a fault can propagate between individual components. For example, a fault in one component may lead to an increased vibration level in both the faulty component, as well as in connected healthy components. In such cases, a condition monitoring system focusing on a specific element in a connected set of components may either incorrectly indicate a fault, or conversely, a fault might be missed or masked due to the interaction of a piece of equipment with neighboring machines. In such cases, a more holistic condition monitoring approach that can not only account for such interactions, but utilize them to provide a more complete and definitive diagnostic picture of the health of the machinery is highly desirable. In this paper, a Two-Stage Bayesian Inference approach allowing data from separate condition monitoring systems to be combined is presented. Data from distributed condition monitoring systems are combined in two stages, the first data fusion occurring at a local, or component, level, and the second fusion combining data at a global level. Data obtained from an experimental rig consisting of an electric motor, two gearboxes, and a load, operating under a range of different fault conditions is used to illustrate the efficacy of the method at pinpointing the root cause of a problem. The obtained results suggest that the approach is adept at refining the diagnostic information obtained from each of the different machine components monitored, therefore improving the reliability of the health assessment of
Multi-scale inference of interaction rules in animal groups using Bayesian model selection.
Mann, Richard P; Perna, Andrea; Strömbom, Daniel; Garnett, Roman; Herbert-Read, James E; Sumpter, David J T; Ward, Ashley J W
2012-01-01
Inference of interaction rules of animals moving in groups usually relies on an analysis of large scale system behaviour. Models are tuned through repeated simulation until they match the observed behaviour. More recent work has used the fine scale motions of animals to validate and fit the rules of interaction of animals in groups. Here, we use a Bayesian methodology to compare a variety of models to the collective motion of glass prawns (Paratya australiensis). We show that these exhibit a stereotypical 'phase transition', whereby an increase in density leads to the onset of collective motion in one direction. We fit models to this data, which range from: a mean-field model where all prawns interact globally; to a spatial Markovian model where prawns are self-propelled particles influenced only by the current positions and directions of their neighbours; up to non-Markovian models where prawns have 'memory' of previous interactions, integrating their experiences over time when deciding to change behaviour. We show that the mean-field model fits the large scale behaviour of the system, but does not capture fine scale rules of interaction, which are primarily mediated by physical contact. Conversely, the Markovian self-propelled particle model captures the fine scale rules of interaction but fails to reproduce global dynamics. The most sophisticated model, the non-Markovian model, provides a good match to the data at both the fine scale and in terms of reproducing global dynamics. We conclude that prawns' movements are influenced by not just the current direction of nearby conspecifics, but also those encountered in the recent past. Given the simplicity of prawns as a study system our research suggests that self-propelled particle models of collective motion should, if they are to be realistic at multiple biological scales, include memory of previous interactions and other non-Markovian effects.
Energy Technology Data Exchange (ETDEWEB)
Kang, Seongkeun; Seong, Poong Hyun [Korea Advanced Institute of Science and Technology, Daejeon (Korea, Republic of)
2014-05-15
The purpose of this paper is to confirm whether Bayesian inference can properly reflect the situation awareness of real human operators, to find the difference between ideal and practical operators, and to investigate the factors that contribute to this difference. As a result, we find that humans cannot think like a computer. If humans could memorize all the information and their thinking process were the same as the CPU of a computer, the results of these two experiments would come out at more than 99%. However, the probability of humans finding the right malfunction is only 64.52% in the simple experiment and 51.61% in the complex experiment. Cognition is the mental processing that includes the attention of working memory, comprehending and producing language, calculating, reasoning, problem solving, and decision making. There are many reasons why the human thinking process differs from that of a computer, but in this experiment we suggest that working memory is the most important factor. Humans have limited working memory, with a capacity of only seven chunks; this is called the magic number. If there are more than seven sequential pieces of information, people start to forget the earlier ones because their working memory capacity is exceeded. We can check how much working memory affects the result through the simple experiment. What if we neglect the effect of working memory? The total number of subjects who had incorrect memory is 7 (subjects 3, 5, 6, 7, 8, 15, 25). They could have found the right malfunction if their memory had not changed because of the lack of working memory. The probability of finding the correct malfunction would then increase from 64.52% to 87.10%. The complex experiment has a similar result. In this case, eight subjects (1, 5, 8, 9, 15, 17, 18, 30) had changed memory, which affected finding the right malfunction. Taking this into account, the probability would be (16+8)/31 = 77.42%.
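The percentages quoted above can be checked with a few lines of arithmetic. The total of 31 subjects and the count of 16 correct answers in the complex experiment come from the expression (16+8)/31 in the abstract; the count of 20 correct answers in the simple experiment is not stated and is inferred here from 64.52% of 31, so treat it as an assumption.

```python
N_SUBJECTS = 31

def percent(count, total=N_SUBJECTS):
    """Share of subjects as a percentage, rounded to two decimals."""
    return round(100.0 * count / total, 2)

simple_observed = percent(20)         # 20 correct answers is inferred, not stated
simple_corrected = percent(20 + 7)    # add the 7 subjects with memory errors
complex_observed = percent(16)
complex_corrected = percent(16 + 8)   # add the 8 subjects with memory errors

print(simple_observed, simple_corrected, complex_observed, complex_corrected)
```

All four values agree with the figures in the abstract, which supports reading the "corrected" probabilities as the observed successes plus the subjects whose only failure was a memory error.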
Chiang, Sharon; Guindani, Michele; Yeh, Hsiang J; Haneef, Zulfi; Stern, John M; Vannucci, Marina
2017-03-01
In this article, a multi-subject vector autoregressive (VAR) modeling approach is proposed for inference on effective connectivity based on resting-state functional MRI data. The framework uses a Bayesian variable selection approach to allow for simultaneous inference on effective connectivity at both the subject- and group-level. Furthermore, it accounts for multi-modal data by integrating structural imaging information into the prior model, encouraging effective connectivity between structurally connected regions. The authors demonstrate through simulation studies that their approach results in improved inference on effective connectivity at both the subject- and group-level, compared with currently used methods. The article concludes by illustrating the method on temporal lobe epilepsy data, where resting-state functional MRI and structural MRI were used. Hum Brain Mapp 38:1311-1332, 2017. © 2016 Wiley Periodicals, Inc.
Energy Technology Data Exchange (ETDEWEB)
Chin, George; Choudhury, Sutanay; Kangas, Lars J.; McFarlane, Sally A.; Marquez, Andres
2011-09-01
Long viewed as a strong statistical inference technique, Bayesian networks have emerged as an important class of applications for high-performance computing. We have applied an architecture-conscious approach to parallelizing the Lauritzen-Spiegelhalter Junction Tree algorithm for exact inferencing in Bayesian networks. In optimizing the Junction Tree algorithm, we have implemented both in-clique and topological parallelism strategies to best leverage the fine-grained synchronization and massive-scale multithreading of the Cray XMT architecture. Two topological techniques were developed to parallelize the evidence propagation process through the Bayesian network. One technique involves intelligent scheduling of junction tree nodes based on their topology and relative size. The second technique involves decomposing the junction tree into a much finer tree-like representation to offer many more opportunities for parallelism. We evaluate these optimizations on five different Bayesian networks and report our findings and observations. Another important contribution of this paper is to demonstrate the application of massive-scale multithreading for load balancing and the use of implicit parallelism-based compiler optimizations in designing scalable inferencing algorithms.
Learning Weight Uncertainty with Stochastic Gradient MCMC for Shape Classification
Energy Technology Data Exchange (ETDEWEB)
Li, Chunyuan; Stevens, Andrew J.; Chen, Changyou; Pu, Yunchen; Gan, Zhe; Carin, Lawrence
2016-08-10
Learning the representation of shape cues in 2D & 3D objects for recognition is a fundamental task in computer vision. Deep neural networks (DNNs) have shown promising performance on this task. Due to the large variability of shapes, accurate recognition relies on good estimates of model uncertainty, ignored in traditional training of DNNs, typically learned via stochastic optimization. This paper leverages recent advances in stochastic gradient Markov Chain Monte Carlo (SG-MCMC) to learn weight uncertainty in DNNs. It yields principled Bayesian interpretations for the commonly used Dropout/DropConnect techniques and incorporates them into the SG-MCMC framework. Extensive experiments on 2D & 3D shape datasets and various DNN models demonstrate the superiority of the proposed approach over stochastic optimization. Our approach yields higher recognition accuracy when used in conjunction with Dropout and Batch-Normalization.
Geometric MCMC for infinite-dimensional inverse problems
Beskos, Alexandros; Girolami, Mark; Lan, Shiwei; Farrell, Patrick E.; Stuart, Andrew M.
2017-04-01
Bayesian inverse problems often involve sampling posterior distributions on infinite-dimensional function spaces. Traditional Markov chain Monte Carlo (MCMC) algorithms are characterized by deteriorating mixing times upon mesh-refinement, when the finite-dimensional approximations become more accurate. Such methods are typically forced to reduce step-sizes as the discretization gets finer, and thus are expensive as a function of dimension. Recently, a new class of MCMC methods with mesh-independent convergence times has emerged. However, few of them take into account the geometry of the posterior informed by the data. At the same time, recently developed geometric MCMC algorithms have been found to be powerful in exploring complicated distributions that deviate significantly from elliptic Gaussian laws, but are in general computationally intractable for models defined in infinite dimensions. In this work, we combine geometric methods on a finite-dimensional subspace with mesh-independent infinite-dimensional approaches. Our objective is to speed up MCMC mixing times, without significantly increasing the computational cost per step (for instance, in comparison with the vanilla preconditioned Crank-Nicolson (pCN) method). This is achieved by using ideas from geometric MCMC to probe the complex structure of an intrinsic finite-dimensional subspace where most data information concentrates, while retaining robust mixing times as the dimension grows by using pCN-like methods in the complementary subspace. The resulting algorithms are demonstrated in the context of three challenging inverse problems arising in subsurface flow, heat conduction and incompressible flow control. The algorithms exhibit up to two orders of magnitude improvement in sampling efficiency when compared with the pCN method.
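For orientation, the vanilla pCN baseline that this abstract compares against can be sketched in a few lines. This is a minimal illustration, not the authors' geometric algorithm: it assumes a standard Gaussian prior N(0, I) on a finite-dimensional discretization, and the names `pcn_sample` and `toy_phi` and the toy likelihood are hypothetical.

```python
import math
import random

def pcn_sample(phi, dim, n_steps, beta=0.5, seed=0):
    """Vanilla pCN sampler for a target proportional to exp(-phi(u)) * N(u; 0, I).

    The identity prior covariance is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    u = [0.0] * dim
    phi_u = phi(u)
    shrink = math.sqrt(1.0 - beta * beta)
    chain, accepted = [], 0
    for _ in range(n_steps):
        # Proposal: shrink the current state and add fresh prior noise.
        v = [shrink * ui + beta * rng.gauss(0.0, 1.0) for ui in u]
        phi_v = phi(v)
        # The proposal is reversible with respect to the prior, so the
        # acceptance ratio depends only on the likelihood potential phi.
        if math.log(1.0 - rng.random()) < phi_u - phi_v:
            u, phi_u = v, phi_v
            accepted += 1
        chain.append(u)
    return chain, accepted / n_steps

def toy_phi(u, y=1.0, noise_sd=1.0):
    # Hypothetical potential: the first coordinate is observed with Gaussian noise.
    return 0.5 * ((u[0] - y) / noise_sd) ** 2

chain, acc_rate = pcn_sample(toy_phi, dim=5, n_steps=20000)
kept = chain[2000:]  # discard burn-in
post_mean_0 = sum(state[0] for state in kept) / len(kept)
print(acc_rate, post_mean_0)
```

In this toy setting the exact posterior of the first coordinate is N(0.5, 0.5), which gives a simple check on the sampler. Because the prior term cancels in the acceptance ratio, the same step size `beta` remains usable as `dim` grows, which is the mesh-independence property the abstract refers to.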
Fontanazza, C M; Freni, G; Notaro, V
2012-01-01
Flood damage in urbanized watersheds may be assessed by combining flood depth-damage curves with the outputs of urban flood models. The complexity of the physical processes that must be simulated and the limited amount of data available for model calibration may lead to high uncertainty in the model results and consequently in damage estimation. Moreover, depth-damage functions are usually affected by significant uncertainty related to the collected data and to the simplified structure of the regression law that is used. The present paper analyses the uncertainty connected to the flood damage estimate obtained by combining hydraulic models and depth-damage curves. A Bayesian inference analysis is proposed along with a probabilistic approach for parameter estimation. The analysis demonstrated that the Bayesian approach is very effective, considering that the available databases are usually short.
Sandoval-Castellanos, Edson; Palkopoulou, Eleftheria; Dalén, Love
2014-01-01
Inference of population demographic history has vastly improved in recent years due to a number of technological and theoretical advances including the use of ancient DNA. Approximate Bayesian computation (ABC) stands among the most promising methods due to its simple theoretical foundation and exceptional flexibility. However, limited availability of user-friendly programs that perform ABC analysis renders it difficult to implement, and hence programming skills are frequently required. In addition, there is limited availability of programs able to deal with heterochronous data. Here we present the software BaySICS: Bayesian Statistical Inference of Coalescent Simulations. BaySICS provides an integrated and user-friendly platform that performs ABC analyses by means of coalescent simulations from DNA sequence data. It estimates historical demographic population parameters and performs hypothesis testing by means of Bayes factors obtained from model comparisons. Although providing specific features that improve inference from datasets with heterochronous data, BaySICS also has several capabilities making it a suitable tool for analysing contemporary genetic datasets. Those capabilities include joint analysis of independent tables, a graphical interface and the implementation of Markov-chain Monte Carlo without likelihoods.
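The core idea of ABC can be illustrated with the simplest variant, rejection sampling. The sketch below is not BaySICS and does not use coalescent simulations: the Gaussian toy simulator, the uniform prior, the sample-mean summary statistic, and the function name `abc_rejection` are all assumptions chosen for illustration.

```python
import random
import statistics

def abc_rejection(observed, simulate, prior_draw, summary, tol, n_draws, seed=0):
    """Keep prior draws whose simulated summary lies within tol of the observed one."""
    rng = random.Random(seed)
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_draw(rng)                 # 1. draw a parameter from the prior
        s_sim = summary(simulate(theta, rng))   # 2. simulate data and summarize it
        if abs(s_sim - s_obs) <= tol:           # 3. accept if close to the observation
            accepted.append(theta)
    return accepted

# Toy problem (hypothetical): infer the mean of a normal distribution with sd 1.
data_rng = random.Random(1)
observed = [data_rng.gauss(2.0, 1.0) for _ in range(50)]

posterior = abc_rejection(
    observed,
    simulate=lambda theta, rng: [rng.gauss(theta, 1.0) for _ in range(50)],
    prior_draw=lambda rng: rng.uniform(-5.0, 5.0),
    summary=statistics.fmean,
    tol=0.1,
    n_draws=5000,
)
print(len(posterior), statistics.fmean(posterior))
```

No likelihood is evaluated anywhere; only the ability to simulate from the model is needed. The likelihood-free MCMC mentioned in the abstract replaces the independent prior draws with a Markov chain proposal, which this sketch does not implement.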
Sankararaman, Shankar
2016-01-01
This paper presents a computational framework for uncertainty characterization and propagation, and sensitivity analysis under the presence of aleatory and epistemic uncertainty, and develops a rigorous methodology for efficient refinement of epistemic uncertainty by identifying important epistemic variables that significantly affect the overall performance of an engineering system. The proposed methodology is illustrated using the NASA Langley Uncertainty Quantification Challenge (NASA-LUQC) problem that deals with uncertainty analysis of a generic transport model (GTM). First, Bayesian inference is used to infer subsystem-level epistemic quantities using the subsystem-level model and corresponding data. Second, tools of variance-based global sensitivity analysis are used to identify four important epistemic variables (this limitation specified in the NASA-LUQC is reflective of practical engineering situations where not all epistemic variables can be refined due to time/budget constraints) that significantly affect system-level performance. The most significant contribution of this paper is the development of the sequential refinement methodology, where epistemic variables for refinement are not identified all-at-once. Instead, only one variable is first identified, and then, Bayesian inference and global sensitivity calculations are repeated to identify the next important variable. This procedure is continued until all 4 variables are identified and the refinement in the system-level performance is computed. The advantages of the proposed sequential refinement methodology over the all-at-once uncertainty refinement approach are explained, and then applied to the NASA Langley Uncertainty Quantification Challenge problem.
Efficiency of alternative McMC strategies illustrated using the reaction norm model
DEFF Research Database (Denmark)
Shariati, M; Sorensen, D
2008-01-01
The Markov chain Monte Carlo (MCMC) strategy provides remarkable flexibility for fitting complex hierarchical models. However, when parameters are highly correlated in their posterior distributions and their number is large, a particular MCMC algorithm may perform poorly and the resulting...... inferences may be affected. The objective of this study was to compare the efficiency (in terms of the asymptotic variance of features of posterior distributions of chosen parameters, and in terms of computing cost) of six MCMC strategies to sample parameters using simulated data generated with a reaction...
Nitrate vulnerability projections from Bayesian inference of multiple groundwater age tracers
Alikhani, Jamal; Deinhart, Amanda L.; Visser, Ate; Bibby, Richard K.; Purtschert, Roland; Moran, Jean E.; Massoudieh, Arash; Esser, Bradley K.
2016-12-01
Nitrate is a major source of contamination of groundwater in the United States and around the world. We tested the applicability of multiple groundwater age tracers (3H, 3He, 4He, 14C, 13C, and 85Kr) in projecting future trends of nitrate concentration in 9 long-screened, public drinking water wells in Turlock, California, where nitrate concentrations are increasing toward the regulatory limit. Very low 85Kr concentrations and apparent 3H/3He ages point to a relatively old modern fraction (40-50 years), diluted with pre-modern groundwater, corroborated by the onset and slope of increasing nitrate concentrations. An inverse Gaussian-Dirac model was chosen to represent the age distribution of the sampled groundwater at each well. Model parameters were estimated using Bayesian inference, resulting in the posterior probability distribution - including the associated uncertainty - of the parameters and projected nitrate concentrations. Three scenarios were considered: combined historic nitrate and age tracer data, the sole use of nitrate data, and the sole use of age tracer data. Each scenario was evaluated based on the ability of the model to reproduce the data and the level of reliability of the nitrate projections. The tracer-only scenario closely reproduced tracer concentrations, but not observed trends in the nitrate concentration. Both cases that included nitrate data resulted in good agreement with historical nitrate trends. Use of combined tracers and nitrate data resulted in a narrower range of projections of future nitrate levels. However, use of combined tracer and nitrate data resulted in a larger discrepancy between modeled and measured tracers for some of the tracers. Despite nitrate trend slopes between 0.56 and 1.73 mg/L/year in 7 of the 9 wells, the probability that concentrations will increase to levels above the MCL by 2040 is over 95% for only two of the wells, and below 15% in the other wells, due to a leveling off of reconstructed historical
Alvarado Mora, Mónica Viviana; Romano, Camila Malta; Gomes-Gouvêa, Michele Soares; Gutierrez, Maria Fernanda; Botelho, Livia; Carrilho, Flair José; Pinho, João Renato Rebello
2011-01-01
Hepatitis B is a worldwide health problem affecting about 2 billion people and more than 350 million are chronic carriers of the virus. Nine HBV genotypes (A to I) have been described. The geographical distribution of HBV genotypes is not completely understood due to the limited number of samples from some parts of the world. One such example is Colombia, in which few studies have described the HBV genotypes. In this study, we characterized HBV genotypes in 143 HBsAg-positive volunteer blood donors from Colombia. A fragment of 1306 bp partially comprising HBsAg and the DNA polymerase coding regions (S/POL) was amplified and sequenced. Bayesian phylogenetic analyses were conducted using the Markov Chain Monte Carlo (MCMC) approach to obtain the maximum clade credibility (MCC) tree using BEAST v.1.5.3. Of all samples, 68 were positive and 52 were successfully sequenced. Genotype F was the most prevalent in this population (77%) - subgenotypes F3 (75%) and F1b (2%). Genotype G (7.7%) and subgenotype A2 (15.3%) were also found. Genotype G sequence analysis suggests distinct introductions of this genotype in the country. Furthermore, we estimated the time of the most recent common ancestor (TMRCA) for each HBV/F subgenotype and also for Colombian F3 sequences using two different datasets: (i) 77 sequences comprising 1306 bp of S/POL region and (ii) 283 sequences comprising 681 bp of S/POL region. We also used two other previously estimated evolutionary rates: (i) 2.60 × 10⁻⁴ s/s/y and (ii) 1.5 × 10⁻⁵ s/s/y. Here we report the HBV genotypes circulating in Colombia and estimated the TMRCA for the four different subgenotypes of genotype F.
Analogical and Category-Based Inference: A Theoretical Integration with Bayesian Causal Models
Holyoak, Keith J.; Lee, Hee Seung; Lu, Hongjing
2010-01-01
A fundamental issue for theories of human induction is to specify constraints on potential inferences. For inferences based on shared category membership, an analogy, and/or a relational schema, it appears that the basic goal of induction is to make accurate and goal-relevant inferences that are sensitive to uncertainty. People can use source…
Yang, Yuqing; Chen, Ning; Chen, Ting
2017-01-25
The inference of associations between environmental factors and microbes and among microbes is critical to interpreting metagenomic data, but compositional bias, indirect associations resulting from common factors, and variance within metagenomic sequencing data limit the discovery of associations. To account for these problems, we propose metagenomic Lognormal-Dirichlet-Multinomial (mLDM), a hierarchical Bayesian model with sparsity constraints, to estimate absolute microbial abundance and simultaneously infer both conditionally dependent associations among microbes and direct associations between microbes and environmental factors. We empirically show the effectiveness of the mLDM model using synthetic data, data from the TARA Oceans project, and a colorectal cancer dataset. Finally, we apply mLDM to 16S sequencing data from the western English Channel and report several associations. Our model can be used on both natural environmental and human metagenomic datasets, promoting the understanding of associations in the microbial community.
Morrissey, Edward R; Juárez, Miguel A; Denby, Katherine J; Burroughs, Nigel J
2011-10-01
We propose a semiparametric Bayesian model, based on penalized splines, for the recovery of the time-invariant topology of a causal interaction network from longitudinal data. Our motivation is inference of gene regulatory networks from low-resolution microarray time series, where existence of nonlinear interactions is well known. Parenthood relations are mapped by augmenting the model with kinship indicators and providing these with either an overall or gene-wise hierarchical structure. Appropriate specification of the prior is crucial to control the flexibility of the splines, especially under circumstances of scarce data; thus, we provide an informative, proper prior. Substantive improvement in network inference over a linear model is demonstrated using synthetic data drawn from ordinary differential equation models and gene expression from an experimental data set of the Arabidopsis thaliana circadian rhythm.
Bayesian inference – a way to combine statistical data and semantic analysis meaningfully
Directory of Open Access Journals (Sweden)
Eila Lindfors
2011-11-01
This article focuses on presenting the possibilities of Bayesian modelling (Finite Mixture Modelling) in the semantic analysis of statistically modelled data. The probability of a hypothesis in relation to the data available is an important question in inductive reasoning. Bayesian modelling allows the researcher to use many models at a time and provides tools to evaluate the goodness of different models. The researcher should always be aware that there is no such thing as the exact probability of an exact event. This is the reason for using probabilistic models. Each model presents a different perspective on the phenomenon in focus, and the researcher has to choose the most probable model with a view to previous research and the knowledge available. The idea of Bayesian modelling is illustrated here by presenting two different sets of data, one from craft science research (n=167) and the other (n=63) from educational research (Lindfors, 2007, 2002). The principles of how to build models and how to combine different profiles are described in the light of the research mentioned. Bayesian modelling is an analysis based on calculating probabilities in relation to a specific set of quantitative data. It is a tool for handling data and interpreting it semantically. The reliability of the analysis arises from an argumentation of which model can be selected from the model space as the basis for an interpretation, and on which arguments. Keywords: method, sloyd, Bayesian modelling, student teachers. URN:NBN:no-29959
GSEVM v.2: MCMC software to analyse genetically structured environmental variance models
DEFF Research Database (Denmark)
Ibáñez-Escriche, N; Garcia, M; Sorensen, D
2010-01-01
This note describes software for fitting Bayesian genetically structured variance models using Markov chain Monte Carlo (MCMC). The gsevm v.2 program was written in Fortran 90. The DOS and Unix executable programs, the user's guide, and some example files are freely availab...
Profile-Based LC-MS data alignment--a Bayesian approach.
Tsai, Tsung-Heng; Tadesse, Mahlet G; Wang, Yue; Ressom, Habtom W
2013-01-01
A Bayesian alignment model (BAM) is proposed for alignment of liquid chromatography-mass spectrometry (LC-MS) data. BAM belongs to the category of profile-based approaches, which are composed of two major components: a prototype function and a set of mapping functions. Appropriate estimation of these functions is crucial for good alignment results. BAM uses Markov chain Monte Carlo (MCMC) methods to draw inference on the model parameters and improves on existing MCMC-based alignment methods through 1) the implementation of an efficient MCMC sampler and 2) an adaptive selection of knots. A block Metropolis-Hastings algorithm that mitigates the problem of the MCMC sampler getting stuck at local modes of the posterior distribution is used for the update of the mapping function coefficients. In addition, a stochastic search variable selection (SSVS) methodology is used to determine the number and positions of knots. We applied BAM to a simulated data set, an LC-MS proteomic data set, and two LC-MS metabolomic data sets, and compared its performance with the Bayesian hierarchical curve registration (BHCR) model, the dynamic time-warping (DTW) model, and the continuous profile model (CPM). The advantage of applying appropriate profile-based retention time correction prior to performing a feature-based approach is also demonstrated through the metabolomic data sets.
Tang, An-Min; Tang, Nian-Sheng
2015-02-28
We propose a semiparametric multivariate skew-normal joint model for multivariate longitudinal and multivariate survival data. One main feature of the posited model is that we relax the commonly used normality assumption for random effects and within-subject error by using a centered Dirichlet process prior to specify the random effects distribution and using a multivariate skew-normal distribution to specify the within-subject error distribution and model trajectory functions of longitudinal responses semiparametrically. A Bayesian approach is proposed to simultaneously obtain Bayesian estimates of unknown parameters, random effects and nonparametric functions by combining the Gibbs sampler and the Metropolis-Hastings algorithm. Particularly, a Bayesian local influence approach is developed to assess the effect of minor perturbations to within-subject measurement error and random effects. Several simulation studies and an example are presented to illustrate the proposed methodologies.
Bayesian Inference with Missing Data
Institute of Scientific and Technical Information of China (English)
虞健飞; 张恒喜; 朱家元
2002-01-01
Recently, Bayesian networks (BN) have become a notable research direction in data mining. In this paper we first introduce missing data mechanisms, and then some methods for Bayesian inference with missing data based on these mechanisms. These methods should be useful in practice, especially when data are scarce and expensive. It can be foreseen that, with the methods presented here, Bayesian networks will become a powerful tool in data mining.
Subbiah, M.; Rajeswaran, V.
Extensive statistical practice has shown the importance and relevance of the inferential problem of estimating probability parameters in a binomial experiment, especially regarding competing intervals from frequentist, Bayesian, and bootstrap approaches. The package, written in the free R environment and presented in this paper, addresses these issues by pooling a number of widely available and well-performing methods and applying essential variations to them. A wide range of functions helps users with differing skills to estimate, evaluate, and summarize, numerically and graphically, various measures adopting either the frequentist or the Bayesian paradigm.
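To make the notion of competing intervals concrete, the sketch below computes three common 95% intervals for a binomial proportion. It is written in Python rather than R and does not reproduce the package's functions: the function names are hypothetical, and the Jeffreys credible interval is approximated by Monte Carlo draws from the Beta posterior to avoid third-party dependencies.

```python
import math
import random

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval

def wald_interval(x, n, z=Z95):
    """Frequentist Wald interval, clipped to [0, 1]."""
    p = x / n
    half = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half), min(1.0, p + half)

def wilson_interval(x, n, z=Z95):
    """Frequentist Wilson score interval."""
    p = x / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

def jeffreys_interval_mc(x, n, level=0.95, draws=50000, seed=0):
    """Bayesian equal-tailed interval under the Jeffreys prior Beta(1/2, 1/2),
    approximated from sorted Monte Carlo draws of the Beta posterior."""
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(x + 0.5, n - x + 0.5) for _ in range(draws))
    lo = samples[int((1 - level) / 2 * draws)]
    hi = samples[int((1 + level) / 2 * draws) - 1]
    return lo, hi

x, n = 7, 20  # illustrative data: 7 successes in 20 trials
wald = wald_interval(x, n)
wilson = wilson_interval(x, n)
jeffreys = jeffreys_interval_mc(x, n)
print(wald, wilson, jeffreys)
```

With small samples the three intervals differ visibly in both location and width, which is the kind of comparison the abstract describes.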
Bayesian networks in educational assessment
Almond, Russell G; Steinberg, Linda S; Yan, Duanli; Williamson, David M
2015-01-01
Bayesian inference networks, a synthesis of statistics and expert systems, have advanced reasoning under uncertainty in medicine, business, and social sciences. This innovative volume is the first comprehensive treatment exploring how they can be applied to design and analyze innovative educational assessments. Part I develops Bayes nets’ foundations in assessment, statistics, and graph theory, and works through the real-time updating algorithm. Part II addresses parametric forms for use with assessment, model-checking techniques, and estimation with the EM algorithm and Markov chain Monte Carlo (MCMC). A unique feature is the volume’s grounding in Evidence-Centered Design (ECD) framework for assessment design. This “design forward” approach enables designers to take full advantage of Bayes nets’ modularity and ability to model complex evidentiary relationships that arise from performance in interactive, technology-rich assessments such as simulations. Part III describes ECD, situates Bayes nets as ...
Bayesian Inference for Growth Mixture Models with Latent Class Dependent Missing Data
Lu, Zhenqiu Laura; Zhang, Zhiyong; Lubke, Gitta
2011-01-01
"Growth mixture models" (GMMs) with nonignorable missing data have drawn increasing attention in research communities but have not been fully studied. The goal of this article is to propose and to evaluate a Bayesian method to estimate the GMMs with latent class dependent missing data. An extended GMM is first presented in which class…
A Bayesian network approach for causal inferences in pesticide risk assessment and management
Pesticide risk assessment and management must balance societal benefits and ecosystem protection, based on quantified risks and the strength of the causal linkages between uses of the pesticide and socioeconomic and ecological endpoints of concern. A Bayesian network (BN) is a gr...
Energy Technology Data Exchange (ETDEWEB)
Blanc, Guillermo A. [Observatories of the Carnegie Institution for Science, 813 Santa Barbara Street, Pasadena, CA 91101 (United States); Kewley, Lisa; Vogt, Frédéric P. A.; Dopita, Michael A. [Research School of Astronomy and Astrophysics, Australian National University, Cotter Road, Weston, ACT 2611 (Australia)
2015-01-10
We present a new method for inferring the metallicity (Z) and ionization parameter (q) of H II regions and star-forming galaxies using strong nebular emission lines (SELs). We use Bayesian inference to derive the joint and marginalized posterior probability density functions for Z and q given a set of observed line fluxes and an input photoionization model. Our approach allows the use of arbitrary sets of SELs and the inclusion of flux upper limits. The method provides a self-consistent way of determining the physical conditions of ionized nebulae that is not tied to the arbitrary choice of a particular SEL diagnostic and uses all the available information. Unlike theoretically calibrated SEL diagnostics, the method is flexible and not tied to a particular photoionization model. We describe our algorithm, validate it against other methods, and present a tool that implements it called IZI. Using a sample of nearby extragalactic H II regions, we assess the performance of commonly used SEL abundance diagnostics. We also use a sample of 22 local H II regions having both direct and recombination line (RL) oxygen abundance measurements in the literature to study discrepancies in the abundance scale between different methods. We find that oxygen abundances derived through Bayesian inference using currently available photoionization models in the literature can be in good (∼30%) agreement with RL abundances, although some models perform significantly better than others. We also confirm that abundances measured using the direct method are typically ∼0.2 dex lower than both RL and photoionization-model-based abundances.
Li, Peng; Gong, Ping; Li, Haoni; Perkins, Edward J; Wang, Nan; Zhang, Chaoyang
2014-12-01
The Dialogue for Reverse Engineering Assessments and Methods (DREAM) project was initiated in 2006 as a community-wide effort for the development of network inference challenges for rigorous assessment of reverse engineering methods for biological networks. We participated in the in silico network inference challenge of DREAM3 in 2008. Here we report the details of our approach and its performance on the synthetic challenge datasets. In our methodology, we first developed a model called relative change ratio (RCR), which took advantage of the heterozygous knockdown data and null-mutant knockout data provided by the challenge, in order to identify the potential regulators for the genes. With this information, a time-delayed dynamic Bayesian network (TDBN) approach was then used to infer gene regulatory networks from time series trajectory datasets. Our approach considerably reduced the search space of TDBN; hence, it gained much higher efficiency and accuracy. The networks predicted using our approach were evaluated comparatively along with 29 other submissions by two metrics (area under the ROC curve and area under the precision-recall curve). The overall performance of our approach ranked second among all participating teams.
Davies, Andrew J; Hope, Max J
2015-07-15
Contingency plans are essential in guiding the response to marine oil spills. However, they are written before the pollution event occurs so must contain some degree of assumption and prediction and hence may be unsuitable for a real incident when it occurs. The use of Bayesian networks in ecology, environmental management, oil spill contingency planning and post-incident analysis is reviewed and analysed to establish their suitability for use as real-time environmental decision support systems during an oil spill response. It is demonstrated that Bayesian networks are appropriate for facilitating the re-assessment and re-validation of contingency plans following pollutant release, thus helping ensure that the optimum response strategy is adopted. This can minimise the possibility of sub-optimal response strategies causing additional environmental and socioeconomic damage beyond the original pollution event.
From least squares to multilevel modeling: A graphical introduction to Bayesian inference
Loredo, Thomas J.
2016-01-01
This tutorial presentation will introduce some of the key ideas and techniques involved in applying Bayesian methods to problems in astrostatistics. The focus will be on the big picture: understanding the foundations (interpreting probability, Bayes's theorem, the law of total probability and marginalization), making connections to traditional methods (propagation of errors, least squares, chi-squared, maximum likelihood, Monte Carlo simulation), and highlighting problems where a Bayesian approach can be particularly powerful (Poisson processes, density estimation and curve fitting with measurement error). The "graphical" component of the title reflects an emphasis on pictorial representations of some of the math, but also on the use of graphical models (multilevel or hierarchical models) for analyzing complex data. Code for some examples from the talk will be available to participants, in Python and in the Stan probabilistic programming language.
Bayesian inference of the resonance content of p(gamma,K^+)Lambda
De Cruz, Lesley; Vancraeyveld, Pieter; Ryckebusch, Jan
2011-01-01
A Bayesian analysis of the world's p(gamma,K^+)Lambda data is presented. We find that the following nucleon resonances have the highest probability of contributing to the reaction: S11(1535), S11(1650), F15(1680), P13(1720), D13(1900), P13(1900), P11(1900), and F15(2000). We adopt a Regge-plus-resonance framework featuring consistent couplings for nucleon resonances up to spin J=5/2. We evaluate all possible combinations of 11 candidate resonances. The best model is selected from the 2048 model variants by calculating the Bayesian evidence values against the world's p(gamma,K^+)Lambda data.
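The figure of 2048 model variants follows from switching each of the 11 candidate resonances on or off, giving 2^11 subsets. A short enumeration makes this explicit; the labels `R1` to `R11` are placeholders, since the abstract does not list all 11 candidates, and no evidence values are computed here.

```python
from itertools import combinations

# Placeholder labels for the 11 candidate resonances (hypothetical names).
candidates = [f"R{i}" for i in range(1, 12)]

# Every subset of the candidates, from the empty set (background only)
# up to the full set, defines one model variant.
model_variants = [
    subset
    for k in range(len(candidates) + 1)
    for subset in combinations(candidates, k)
]

n_variants = len(model_variants)
print(n_variants)
```

In a model-selection setting of this kind, each subset would be fitted to the data and assigned an evidence value, and the subset with the highest evidence would be retained.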
DEFF Research Database (Denmark)
Ehsani, Alireza; Sørensen, Peter; Pomp, Daniel;
2012-01-01
Background To understand the genetic architecture of complex traits and bridge the genotype-phenotype gap, it is useful to study intermediate -omics data, e.g. the transcriptome. The present study introduces a method for simultaneous quantification of the contributions from single nucleotide...... polymorphisms (SNPs) and transcript abundances in explaining phenotypic variance, using Bayesian whole-omics models. Bayesian mixed models and variable selection models were used and, based on parameter samples from the model posterior distributions, explained variances were further partitioned at the level......-modal distribution of genomic values collapses, when gene expressions are added to the model Conclusions With increased availability of various -omics data, integrative approaches are promising tools for understanding the genetic architecture of complex traits. Partitioning of explained variances at the chromosome...
Wu, Chieh-Hsi; Drummond, Alexei J
2011-05-01
We provide a framework for Bayesian coalescent inference from microsatellite data that enables inference of population history parameters averaged over microsatellite mutation models. To achieve this we first implemented a rich family of microsatellite mutation models and related components in the software package BEAST. BEAST is a powerful tool that performs Bayesian MCMC analysis on molecular data to make coalescent and evolutionary inferences. Our implementation permits the application of existing nonparametric methods to microsatellite data. The implemented microsatellite models are based on the replication slippage mechanism and focus on three properties of microsatellite mutation: length dependency of mutation rate, mutational bias toward expansion or contraction, and number of repeat units changed in a single mutation event. We develop a new model that facilitates microsatellite model averaging and Bayesian model selection by transdimensional MCMC. With Bayesian model averaging, the posterior distributions of population history parameters are integrated across a set of microsatellite models and thus account for model uncertainty. Simulated data are used to evaluate our method in terms of accuracy and precision of estimation and also identification of the true mutation model. Finally we apply our method to a red colobus monkey data set as an example.
Bayesian inference of the resonance content of p(γ, K+)Λ
Directory of Open Access Journals (Sweden)
Ryckebusch J.
2012-12-01
Full Text Available A Bayesian analysis of the world's p(γ, K+)Λ data is presented. We adopt a Regge-plus-resonance framework featuring consistent couplings for nucleon resonances up to spin J = 5/2, and evaluate 2048 model variants considering all possible combinations of 11 candidate resonances. The best model, labeled RPR-2011, is discussed with special emphasis on nucleon resonances in the 1900-MeV mass region.
Bayesian inference of the resonance content of p(gamma,K+)Lambda
Vancraeyveld, Pieter; Ryckebusch, Jan; Vrancx, Tom
2012-01-01
A Bayesian analysis of the world's p(gamma,K+)Lambda data is presented. We adopt a Regge-plus-resonance framework featuring consistent couplings for nucleon resonances up to spin J=5/2, and evaluate 2048 model variants considering all possible combinations of 11 candidate resonances. The best model, labeled RPR-2011, is discussed with special emphasis on nucleon resonances in the 1900-MeV mass region.
Safner, T.; Miller, M.P.; McRae, B.H.; Fortin, M.-J.; Manel, S.
2011-01-01
Recently, techniques available for identifying clusters of individuals or boundaries between clusters using genetic data from natural populations have expanded rapidly. Consequently, there is a need to evaluate these different techniques. We used spatially-explicit simulation models to compare three spatial Bayesian clustering programs and two edge detection methods. Spatially-structured populations were simulated where a continuous population was subdivided by barriers. We evaluated the ability of each method to correctly identify boundary locations while varying: (i) time after divergence, (ii) strength of isolation by distance, (iii) level of genetic diversity, and (iv) amount of gene flow across barriers. To further evaluate the methods' effectiveness to detect genetic clusters in natural populations, we used previously published data on North American pumas and a European shrub. Our results show that with simulated and empirical data, the Bayesian spatial clustering algorithms outperformed direct edge detection methods. All methods incorrectly detected boundaries in the presence of strong patterns of isolation by distance. Based on this finding, we support the application of Bayesian spatial clustering algorithms for boundary detection in empirical datasets, with necessary tests for the influence of isolation by distance. © 2011 by the authors; licensee MDPI, Basel, Switzerland.
Adaptive multiscale MCMC algorithm for uncertainty quantification in seismic parameter estimation
Tan, Xiaosi
2014-08-05
Formulating an inverse problem in a Bayesian framework has several major advantages (Sen and Stoffa, 1996). It allows finding multiple solutions subject to flexible a priori information and performing uncertainty quantification in the inverse problem. In this paper, we consider Bayesian inversion for the parameter estimation in seismic wave propagation. Bayes' theorem allows writing the posterior distribution via the likelihood function and the prior distribution, where the latter represents our prior knowledge about physical properties. One of the popular algorithms for sampling this posterior distribution is Markov chain Monte Carlo (MCMC), which involves making proposals and calculating their acceptance probabilities. However, for large-scale problems, MCMC is prohibitively expensive as it requires many forward runs. In this paper, we propose a multilevel MCMC algorithm that employs multilevel forward simulations. Multilevel forward simulations are derived using Generalized Multiscale Finite Element Methods that we have proposed earlier (Efendiev et al., 2013a; Chung et al., 2013). Our overall Bayesian inversion approach provides a substantial speed-up both in the process of the sampling via preconditioning using approximate posteriors and the computation of the forward problems for different proposals by using the adaptive nature of multiscale methods. These aspects of the method are discussed in the paper. This paper is motivated by earlier work of M. Sen and his collaborators (Hong and Sen, 2007; Hong, 2008) who proposed the development of efficient MCMC techniques for seismic applications. In the paper, we present some preliminary numerical results.
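The idea of preconditioning the sampler with an approximate posterior can be illustrated with a generic two-stage (delayed-acceptance) Metropolis sketch. This is not the authors' multiscale implementation: the functions `log_fine` and `log_coarse` are hypothetical one-dimensional stand-ins for the expensive fine-scale posterior and a cheap coarse-scale approximation, and the proposal is assumed to be a symmetric Gaussian random walk.

```python
import math
import random

def log_fine(x):
    # Stand-in for the expensive fine-scale log-posterior (toy standard normal).
    return -0.5 * x * x

def log_coarse(x):
    # Stand-in for a cheap, slightly biased coarse-scale approximation.
    return -0.5 * ((x - 0.3) / 1.2) ** 2

def two_stage_mh(n_steps, step=2.5, x0=0.0, seed=0):
    """Two-stage Metropolis with a symmetric random-walk proposal.

    Returns the chain and the number of fine-model evaluations made
    for proposals (the initial evaluation at x0 is not counted).
    """
    rng = random.Random(seed)
    x, lf, lc = x0, log_fine(x0), log_coarse(x0)
    chain, fine_calls = [], 0
    for _ in range(n_steps):
        y = x + rng.gauss(0.0, step)
        lc_y = log_coarse(y)
        # Stage 1: screen the proposal using the coarse model only.
        if math.log(1.0 - rng.random()) < lc_y - lc:
            fine_calls += 1
            lf_y = log_fine(y)
            # Stage 2: correct with the fine/coarse ratio so that the
            # chain still targets the fine-scale posterior.
            if math.log(1.0 - rng.random()) < (lf_y - lf) - (lc_y - lc):
                x, lf, lc = y, lf_y, lc_y
        chain.append(x)
    return chain, fine_calls

chain, fine_calls = two_stage_mh(20000)
print(fine_calls, "fine-model evaluations for", len(chain), "steps")
```

Proposals rejected at the first stage never trigger a fine-model evaluation, which is where the saving in forward runs would come from in a setting where the fine model dominates the cost.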
DEFF Research Database (Denmark)
Heller, Rasmus; Lorenzen, Eline D.; Okello, J.B.A;
2008-01-01
Genetic studies concerned with the demographic history of wildlife species can help elucidate the role of climate change and other forces such as human activity in shaping patterns of divergence and distribution. The African buffalo (Syncerus caffer) declined dramatically during the rinderpest...... pandemic in the late 1800s, but little is known about the earlier demographic history of the species. We analysed genetic variation at 17 microsatellite loci and a 302-bp fragment of the mitochondrial DNA control region to infer past demographic changes in buffalo populations from East Africa. Two Bayesian...... of African buffalo population declines in the order of 75-98%, starting in the mid-Holocene (approximately 3-7000 years ago). The signature of decline was remarkably consistent using two different coalescent-based methods and two types of molecular markers. Exploratory analyses involving various prior...
Markov chain Monte Carlo inference for Markov jump processes via the linear noise approximation.
Stathopoulos, Vassilios; Girolami, Mark A
2013-02-13
Bayesian analysis for Markov jump processes (MJPs) is a non-trivial and challenging problem. Although exact inference is theoretically possible, it is computationally demanding, thus its applicability is limited to a small class of problems. In this paper, we describe the application of Riemann manifold Markov chain Monte Carlo (MCMC) methods using an approximation to the likelihood of the MJP that is valid when the system modelled is near its thermodynamic limit. The proposed approach is both statistically and computationally efficient whereas the convergence rate and mixing of the chains allow for fast MCMC inference. The methodology is evaluated using numerical simulations on two problems from chemical kinetics and one from systems biology.
Langmore, Ian; Davis, Anthony B.; Bal, Guillaume; Marzouk, Youssef M.
2012-01-01
We describe a method for accelerating a 3D Monte Carlo forward radiative transfer model to the point where it can be used in a new kind of Bayesian retrieval framework. The remote sensing challenge is to detect and quantify a chemical effluent of a known absorbing gas produced by an industrial facility in a deep valley. The available data is a single low resolution noisy image of the scene in the near IR at an absorbing wavelength for the gas of interest. The detected sunlight has been multiply reflected by the variable terrain and/or scattered by an aerosol that is assumed partially known and partially unknown. We thus introduce a new class of remote sensing algorithms best described as "multi-pixel" techniques that call necessarily for a 3D radiative transfer model (but demonstrated here in 2D); they can be added to conventional ones that exploit typically multi- or hyper-spectral data, sometimes with multi-angle capability, with or without information about polarization. The novel Bayesian inference methodology uses adaptively, with efficiency in mind, the fact that a Monte Carlo forward model has a known and controllable uncertainty depending on the number of sun-to-detector paths used.
Bayesian inference in discrete models with a cure fraction
Fernandes, Luísa Martins
2014-01-01
This work presents inference for the discrete Weibull model for survival data with a cure fraction. Inference was carried out in a Bayesian setting using MCMC (Markov chain Monte Carlo) techniques. Point estimates of the model parameters and their respective HPD (Highest Posterior Density) credible intervals are presented, as well as a genuinely Bayesian significance test, the FBST (Full Bayesian Significance Test), as a form...
Directory of Open Access Journals (Sweden)
Mariana Inés Pocovi
2015-06-01
Full Text Available Understanding the population structure and genetic diversity in sugarcane (Saccharum officinarum L.) accessions from the INTA germplasm bank (Argentina) will be of great importance for germplasm collection and breeding improvement, as it will identify diverse parental combinations to create segregating progenies with maximum genetic variability for further selection. A Bayesian approach, ordination methods (PCoA, Principal Coordinate Analysis) and clustering analysis (UPGMA, Unweighted Pair Group Method with Arithmetic Mean) were applied for this purpose. Sixty-three INTA sugarcane hybrids were genotyped for 107 Simple Sequence Repeat (SSR) and 136 Amplified Fragment Length Polymorphism (AFLP) loci. Given the low probability values found with AFLP for individual assignment (4.7%), microsatellites seemed to perform better (54%) for the STRUCTURE analysis, which revealed the germplasm to exist in five optimum groups partly corresponding to their origin. Although the clusters showed a high degree of admixture, F_ST values confirmed the existence of differences among groups. Dissimilarity coefficients ranged from 0.079 to 0.651. PCoA separated sugarcane into groups that did not agree with those identified by STRUCTURE. The clustering including all genotypes did not resemble the populations found by STRUCTURE either, but clustering performed considering only individuals displaying a proportional membership > 0.6 in their primary population obtained with STRUCTURE showed close similarities. The Bayesian method indubitably brought more information on cultivar origins than the classical PCoA and hierarchical clustering methods.
Blanc, Guillermo A; Vogt, Frédéric P A; Dopita, Michael A
2014-01-01
We present a new method for inferring the metallicity (Z) and ionization parameter (q) of HII regions and star-forming galaxies using strong nebular emission lines (SEL). We use Bayesian inference to derive the joint and marginalized posterior probability density functions for Z and q given a set of observed line fluxes and an input photo-ionization model. Our approach allows the use of arbitrary sets of SELs and the inclusion of flux upper limits. The method provides a self-consistent way of determining the physical conditions of ionized nebulae that is not tied to the arbitrary choice of a particular SEL diagnostic and uses all the available information. Unlike theoretically calibrated SEL diagnostics the method is flexible and not tied to a particular photo-ionization model. We describe our algorithm, validate it against other methods, and present a tool that implements it called IZI. Using a sample of nearby extra-galactic HII regions we assess the performance of commonly used SEL abundance diagnostics. W...
Energy Technology Data Exchange (ETDEWEB)
Marzouk, Youssef; Fast P. (Lawrence Livermore National Laboratory, Livermore, CA); Kraus, M. (Peterson AFB, CO); Ray, J. P.
2006-01-01
Terrorist attacks using an aerosolized pathogen preparation have gained credibility as a national security concern after the anthrax attacks of 2001. The ability to characterize such attacks, i.e., to estimate the number of people infected, the time of infection, and the average dose received, is important when planning a medical response. We address this question of characterization by formulating a Bayesian inverse problem predicated on a short time-series of diagnosed patients exhibiting symptoms. To be of relevance to response planning, we limit ourselves to 3-5 days of data. In tests performed with anthrax as the pathogen, we find that these data are usually sufficient, especially if the model of the outbreak used in the inverse problem is an accurate one. In some cases the scarcity of data may initially support outbreak characterizations at odds with the true one, but with sufficient data the correct inferences are recovered; in other words, the inverse problem posed and its solution methodology are consistent. We also explore the effect of model error, that is, situations for which the model used in the inverse problem is only a partially accurate representation of the outbreak; here, the model predictions and the observations differ by more than a random noise. We find that while there is a consistent discrepancy between the inferred and the true characterizations, they are also close enough to be of relevance when planning a response.
MCMC with Strings and Branes: The Suburban Algorithm
Heckman, Jonathan J; Vigoda, Ben
2016-01-01
Motivated by the physics of strings and branes, we introduce a general suite of Markov chain Monte Carlo (MCMC) "suburban samplers" (i.e., spread out Metropolis). The suburban algorithm involves an ensemble of statistical agents connected together by a random network. Performance of the collective in reaching a fast and accurate inference depends primarily on the average number of nearest neighbor connections. Increasing the average number of neighbors above zero initially leads to an increase in performance, though there is a critical connectivity with effective dimension d_eff ~ 1, above which "groupthink" takes over, and the performance of the sampler declines.
Bayesian inference of non-positive spectral functions in quantum field theory
Rothkopf, Alexander
2016-01-01
We present the generalization to non positive definite spectral functions of a recently proposed Bayesian deconvolution approach (BR method). The novel prior used here retains many of the beneficial analytic properties of the original method, in particular it allows us to integrate out the hyperparameter $\alpha$ directly. To preserve the underlying axiom of scale invariance, we introduce a second default-model related function, whose role is discussed. Our reconstruction prescription is contrasted with existing direct methods, as well as with an approach where shift functions are introduced to compensate for negative spectral features. A mock spectrum analysis inspired by the study of gluon spectral functions in QCD illustrates the capabilities of this new approach.
MacCallum, Justin L; Perez, Alberto; Dill, Ken A
2015-06-02
More than 100,000 protein structures are now known at atomic detail. However, far more are not yet known, particularly among large or complex proteins. Often, experimental information is only semireliable because it is uncertain, limited, or confusing in important ways. Some experiments give sparse information, some give ambiguous or nonspecific information, and others give uncertain information, where some is right, some is wrong, but we don't know which. We describe a method called Modeling Employing Limited Data (MELD) that can harness such problematic information in a physics-based, Bayesian framework for improved structure determination. We apply MELD to eight proteins of known structure for which such problematic structural data are available, including a sparse NMR dataset, two ambiguous EPR datasets, and four uncertain datasets taken from sequence evolution data. MELD gives excellent structures, indicating its promise for experimental biomolecule structure determination where only semireliable data are available.
Bayesian inference for functional response in a stochastic predator-prey system.
Gilioli, Gianni; Pasquali, Sara; Ruggeri, Fabrizio
2008-02-01
We present a Bayesian method for functional response parameter estimation starting from time series of field data on predator-prey dynamics. Population dynamics is described by a system of stochastic differential equations in which behavioral stochasticities are represented by noise terms affecting each population as well as their interaction. We focus on the estimation of a behavioral parameter appearing in the functional response of predator to prey abundance when a small number of observations is available. To deal with small sample sizes, latent data are introduced between each pair of field observations and are considered as missing data. The method is applied to both simulated and observational data. The results obtained using different numbers of latent data are compared with those achieved following a frequentist approach. As a case study, we consider an acarine predator-prey system relevant to biological control problems.
Formulating Quantum Theory as a Causally Neutral Theory of Bayesian Inference
Leifer, M S
2011-01-01
Quantum theory can be viewed as a generalization of classical probability theory, but the analogy as it has been developed so far is not complete. Classical probability theory is independent of causal structure, whereas the conventional quantum formalism requires causal structure to be fixed in advance. In this paper, we develop the formalism of quantum conditional states, which unifies the description of experiments involving two systems at a single time with the description of those involving a single system at two times. The analogies between quantum theory and classical probability theory are expressed succinctly within the formalism and it unifies the mathematical description of distinct concepts, such as ensemble preparation procedures, measurements, and quantum dynamics. We introduce a quantum generalization of Bayes' theorem and the associated notion of Bayesian conditioning. Conditioning a quantum state on a classical variable is the correct rule for updating quantum states in light of classical data...
Bayesian Inference for Reliability of Systems and Networks Using the Survival Signature.
Aslett, Louis J M; Coolen, Frank P A; Wilson, Simon P
2015-09-01
The concept of survival signature has recently been introduced as an alternative to the signature for reliability quantification of systems. While these two concepts are closely related for systems consisting of a single type of component, the survival signature is also suitable for systems with multiple types of component, which is not the case for the signature. This also enables the use of the survival signature for reliability of networks. In this article, we present the use of the survival signature for reliability quantification of systems and networks from a Bayesian perspective. We assume that data are available on tested components that are exchangeable with those in the actual system or network of interest. These data consist of failure times and possibly right-censoring times. We present both a nonparametric and parametric approach.
Bayesian Inference for LISA Pathfinder using Markov Chain Monte Carlo Methods
Ferraioli, Luigi; Plagnol, Eric
2012-01-01
We present a parameter estimation procedure based on a Bayesian framework by applying a Markov Chain Monte Carlo algorithm to the calibration of the dynamical parameters of a space based gravitational wave detector. The method is based on the Metropolis-Hastings algorithm and a two-stage annealing treatment in order to ensure an effective exploration of the parameter space at the beginning of the chain. We compare two versions of the algorithm with an application to a LISA Pathfinder data analysis problem. The two algorithms share the same heating strategy but with one moving in coordinate directions using proposals from a multivariate Gaussian distribution, while the other uses the natural logarithm of some parameters and proposes jumps in the eigen-space of the Fisher Information matrix. The algorithm proposing jumps in the eigen-space of the Fisher Information matrix demonstrates a higher acceptance rate and a slightly better convergence towards the equilibrium parameter distributions in the application to...
Inferring Alcoholism SNPs and Regulatory Chemical Compounds Based on Ensemble Bayesian Network.
Chen, Huan; Sun, Jiatong; Jiang, Hong; Wang, Xianyue; Wu, Lingxiang; Wu, Wei; Wang, Qh
2016-12-20
The disturbance of consciousness is one of the most common symptoms in those who have alcoholism and may cause disability and mortality. Previous studies indicated that several single nucleotide polymorphisms (SNPs) increase susceptibility to alcoholism. In this study, we utilized the Ensemble Bayesian Network (EBN) method to identify causal SNPs of alcoholism based on the verified GAW14 data. Thirteen out of eighteen SNPs directly connected with alcoholism were found to be concordant with potential risk regions of alcoholism in the OMIM database. As a number of SNPs were found to contribute to alterations in gene expression, known as expression quantitative trait loci (eQTLs), we further sought to identify chemical compounds acting as regulators of alcoholism genes captured by causal SNPs. Chloroprene and valproic acid were identified as the expression regulators for genes C11orf66 and SALL3, respectively, which were captured by alcoholism SNPs.
Nakatani-Webster, Eri; Nath, Abhinav
2017-03-14
Amyloid formation is implicated in a number of human diseases, and is thought to proceed via a nucleation-dependent polymerization mechanism. Experimenters often wish to relate changes in amyloid formation kinetics, for example, in response to small molecules to specific mechanistic steps along this pathway. However, fitting kinetic fibril formation data to a complex model including explicit rate constants results in an ill-posed problem with a vast number of potential solutions. The levels of uncertainty remaining in parameters calculated from these models, arising both from experimental noise and high levels of degeneracy or codependency in parameters, is often unclear. Here, we demonstrate that a combination of explicit mathematical models with an approximate Bayesian computation approach can be used to assign the mechanistic effects of modulators on amyloid fibril formation. We show that even when exact rate constants cannot be extracted, parameters derived from these rate constants can be recovered and used to assign mechanistic effects and their relative magnitudes with a great deal of confidence. Furthermore, approximate Bayesian computation provides a robust method for visualizing uncertainty remaining in the model parameters, regardless of its origin. We apply these methods to the problem of heparin-mediated tau polymerization, which displays complex kinetic behavior not amenable to analysis by more traditional methods. Our analysis indicates that the role of heparin cannot be explained by enhancement of nucleation alone, as has been previously proposed. The methods described here are applicable to a wide range of systems, as models can be easily adapted to account for new reactions and reversibility.
Directory of Open Access Journals (Sweden)
Oliver Ratmann
Full Text Available A key priority in infectious disease research is to understand the ecological and evolutionary drivers of viral diseases from data on disease incidence as well as viral genetic and antigenic variation. We propose using a simulation-based, Bayesian method known as Approximate Bayesian Computation (ABC) to fit and assess phylodynamic models that simulate pathogen evolution and ecology against summaries of these data. We illustrate the versatility of the method by analyzing two spatial models describing the phylodynamics of interpandemic human influenza virus subtype A(H3N2). The first model captures antigenic drift phenomenologically with continuously waning immunity, and the second epochal evolution model describes the replacement of major, relatively long-lived antigenic clusters. Combining features of long-term surveillance data from The Netherlands with features of influenza A(H3N2) hemagglutinin gene sequences sampled in northern Europe, key phylodynamic parameters can be estimated with ABC. Goodness-of-fit analyses reveal that the irregularity in interannual incidence and H3N2's ladder-like hemagglutinin phylogeny are quantitatively only reproduced under the epochal evolution model within a spatial context. However, the concomitant incidence dynamics result in a very large reproductive number and are not consistent with empirical estimates of H3N2's population level attack rate. These results demonstrate that the interactions between the evolutionary and ecological processes impose multiple quantitative constraints on the phylodynamic trajectories of influenza A(H3N2), so that sequence and surveillance data can be used synergistically. ABC, one of several data synthesis approaches, can easily interface a broad class of phylodynamic models with various types of data but requires careful calibration of the summaries and tolerance parameters.
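The roles of the summaries and the tolerance can be seen in the simplest form of ABC, rejection sampling. The sketch below is a toy illustration, not the phylodynamic analysis described above: the simulator (Gaussian data with unknown mean), the uniform prior, the choice of the sample mean as summary, and the tolerance value are all assumptions made for the example.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def simulate(theta, rng, n=50):
    # Toy simulator: n Gaussian observations with mean theta and unit variance.
    return [rng.gauss(theta, 1.0) for _ in range(n)]

def prior_draw(rng):
    # Assumed prior: uniform on [-5, 5].
    return rng.uniform(-5.0, 5.0)

def abc_rejection(observed, tol, n_draws, seed=0):
    """Keep prior draws whose simulated summary lies within tol of the observed one."""
    rng = random.Random(seed)
    s_obs = mean(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_draw(rng)
        s_sim = mean(simulate(theta, rng))
        if abs(s_sim - s_obs) <= tol:
            accepted.append(theta)
    return accepted

# Synthetic "observed" data generated with a known mean of 2.0.
observed = simulate(2.0, random.Random(1))
posterior = abc_rejection(observed, tol=0.1, n_draws=20000)
print(len(posterior), "accepted draws; approximate posterior mean", mean(posterior))
```

A smaller tolerance brings the accepted draws closer to the exact posterior but lowers the acceptance rate, which is one reason the calibration of tolerances matters in practice.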
Exploring the Connection Between Sampling Problems in Bayesian Inference and Statistical Mechanics
Pohorille, Andrew
2006-01-01
The Bayesian and statistical mechanical communities often share the same objective in their work - estimating and integrating probability distribution functions (pdfs) describing stochastic systems, models or processes. Frequently, these pdfs are complex functions of random variables exhibiting multiple, well separated local minima. Conventional strategies for sampling such pdfs are inefficient, sometimes leading to an apparent non-ergodic behavior. Several recently developed techniques for handling this problem have been successfully applied in statistical mechanics. In the multicanonical and Wang-Landau Monte Carlo (MC) methods, the correct pdfs are recovered from uniform sampling of the parameter space by iteratively establishing proper weighting factors connecting these distributions. Trivial generalizations allow for sampling from any chosen pdf. The closely related transition matrix method relies on estimating transition probabilities between different states. All these methods proved to generate estimates of pdfs with high statistical accuracy. In another MC technique, parallel tempering, several random walks, each corresponding to a different value of a parameter (e.g. "temperature"), are generated and occasionally exchanged using the Metropolis criterion. This method can be considered as a statistically correct version of simulated annealing. An alternative approach is to represent the set of independent variables as a Hamiltonian system. Considerable progress has been made in understanding how to ensure that the system obeys the equipartition theorem or, equivalently, that coupling between the variables is correctly described. Then a host of techniques developed for dynamical systems can be used. Among them, probably the most powerful is the Adaptive Biasing Force method, in which thermodynamic integration and biased sampling are combined to yield very efficient estimates of pdfs. The third class of methods deals with transitions between states described
Nonparametric Bayesian Inference for Mean Residual Life Functions in Survival Analysis
Poynor, Valerie; Kottas, Athanasios
2014-01-01
Modeling and inference for survival analysis problems typically revolves around different functions related to the survival distribution. Here, we focus on the mean residual life function which provides the expected remaining lifetime given that a subject has survived (i.e., is event-free) up to a particular time. This function is of direct interest in reliability, medical, and actuarial fields. In addition to its practical interpretation, the mean residual life function characterizes the sur...
Energy Technology Data Exchange (ETDEWEB)
Huang, Maoyi [Earth System Analysis and Modeling Group, Pacific Northwest National Laboratory, Richland Washington USA; Ray, Jaideep [Sandia National Laboratories, Livermore California USA; Hou, Zhangshuan [Hydrology Technical Group, Pacific Northwest National Laboratory, Richland Washington USA; Ren, Huiying [Hydrology Technical Group, Pacific Northwest National Laboratory, Richland Washington USA; Liu, Ying [Earth System Analysis and Modeling Group, Pacific Northwest National Laboratory, Richland Washington USA; Swiler, Laura [Sandia National Laboratories, Albuquerque New Mexico USA
2016-07-04
The Community Land Model (CLM) has been widely used in climate and Earth system modeling. Accurate estimation of model parameters is needed for reliable model simulations and predictions under current and future conditions, respectively. In our previous work, a subset of hydrological parameters has been identified to have significant impact on surface energy fluxes at selected flux tower sites based on parameter screening and sensitivity analysis, which indicate that the parameters could potentially be estimated from surface flux observations at the towers. To date, such estimates do not exist. In this paper, we assess the feasibility of applying a Bayesian model calibration technique to estimate CLM parameters at selected flux tower sites under various site conditions. The parameters are estimated as a joint probability density function (PDF) that provides estimates of uncertainty of the parameters being inverted, conditional on climatologically-average latent heat fluxes derived from observations. We find that the simulated mean latent heat fluxes from CLM using the calibrated parameters are generally improved at all sites when compared to those obtained with CLM simulations using default parameter sets. Further, our calibration method also results in credibility bounds around the simulated mean fluxes which bracket the measured data. The modes (or maximum a posteriori values) and 95% credibility intervals of the site-specific posterior PDFs are tabulated as suggested parameter values for each site. Analysis of relationships between the posterior PDFs and site conditions suggests that the parameter values are likely correlated with the plant functional type, which needs to be confirmed in future studies by extending the approach to more sites.
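Tabulating a mode and a 95% credibility interval from posterior samples can be sketched as follows. This is an illustrative assumption about how such summaries might be computed, not the procedure used in the study: the interval is taken as equal-tailed, the mode is estimated from a histogram, and the Gaussian samples stand in for real posterior draws of a single parameter.

```python
import random

def credible_interval(samples, level=0.95):
    """Equal-tailed credible interval from sorted posterior samples."""
    xs = sorted(samples)
    n = len(xs)
    lo_idx = int((1.0 - level) / 2.0 * (n - 1))
    hi_idx = int(round((1.0 + level) / 2.0 * (n - 1)))
    return xs[lo_idx], xs[hi_idx]

def histogram_mode(samples, n_bins=30):
    """Centre of the most populated histogram bin (assumes non-constant samples)."""
    lo, hi = min(samples), max(samples)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in samples:
        k = min(int((x - lo) / width), n_bins - 1)
        counts[k] += 1
    best = counts.index(max(counts))
    return lo + (best + 0.5) * width

# Stand-in posterior draws for one hypothetical parameter.
rng = random.Random(0)
samples = [rng.gauss(3.0, 0.5) for _ in range(20000)]
low, high = credible_interval(samples)
mode = histogram_mode(samples)
print("mode ~", mode, "95% interval ~", (low, high))
```

For skewed or multimodal posteriors an equal-tailed interval and a highest-density interval can differ noticeably, so the choice should be stated alongside any tabulated values.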
Kaiser, Jacob L; Bland, Cassidy L; Klinke, David J
2016-03-01
Cancer arises from a deregulation of both intracellular and intercellular networks that maintain system homeostasis. Identifying the architecture of these networks and how they are changed in cancer is a pre-requisite for designing drugs to restore homeostasis. Since intercellular networks only appear in intact systems, it is difficult to identify how these networks become altered in human cancer using many of the common experimental models. To overcome this, we used the diversity in normal and malignant human tissue samples from the Cancer Genome Atlas (TCGA) database of human breast cancer to identify the topology associated with intercellular networks in vivo. To improve the underlying biological signals, we constructed Bayesian networks using metagene constructs, which represented groups of genes that are concomitantly associated with different immune and cancer states. We also used bootstrap resampling to establish the significance associated with the inferred networks. In short, we found opposing relationships between cell proliferation and epithelial-to-mesenchymal transformation (EMT) with regards to macrophage polarization. These results were consistent across multiple carcinomas in that proliferation was associated with a type 1 cell-mediated anti-tumor immune response and EMT was associated with a pro-tumor anti-inflammatory response. To address the identifiability of these networks from other datasets, we could identify the relationship between EMT and macrophage polarization with fewer samples when the Bayesian network was generated from malignant samples alone. However, the relationship between proliferation and macrophage polarization was identified with fewer samples when the samples were taken from a combination of the normal and malignant samples. © 2016 American Institute of Chemical Engineers Biotechnol. Prog., 32:470-479, 2016.
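Bootstrap resampling for assessing the support of an inferred relationship can be sketched in a few lines. The example below is a deliberately simplified stand-in: instead of learning a Bayesian network, it treats an "edge" as present when the absolute Pearson correlation exceeds a threshold, and the variables `x`, `y`, `z` and the threshold of 0.3 are hypothetical choices for illustration.

```python
import random

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def edge_support(xs, ys, n_boot=500, threshold=0.3, seed=0):
    """Fraction of bootstrap resamples in which |correlation| exceeds threshold."""
    rng = random.Random(seed)
    n = len(xs)
    hits = 0
    for _ in range(n_boot):
        # Resample the paired observations with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        r = pearson([xs[i] for i in idx], [ys[i] for i in idx])
        if abs(r) > threshold:
            hits += 1
    return hits / n_boot

# Synthetic data: y depends on x, z is independent of both.
rng = random.Random(42)
x = [rng.gauss(0.0, 1.0) for _ in range(200)]
y = [0.8 * xi + rng.gauss(0.0, 0.6) for xi in x]
z = [rng.gauss(0.0, 1.0) for _ in range(200)]

support_xy = edge_support(x, y)
support_xz = edge_support(x, z)
print("support x-y:", support_xy, "support x-z:", support_xz)
```

In a network setting, the same loop would wrap the structure-learning step, and an edge would be reported with the fraction of resampled networks in which it appears.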
Systematic validation of non-equilibrium thermochemical models using Bayesian inference
Energy Technology Data Exchange (ETDEWEB)
Miki, Kenji [NASA Glenn Research Center, OAI, 22800 Cedar Point Rd, Cleveland, OH 44142 (United States); Panesi, Marco, E-mail: mpanesi@illinois.edu [Department of Aerospace Engineering, University of Illinois at Urbana-Champaign, 306 Talbot Lab, 104 S. Wright St., Urbana, IL 61801 (United States); Prudhomme, Serge [Département de mathématiques et de génie industriel, Ecole Polytechnique de Montréal, C.P. 6079, succ. Centre-ville, Montréal, QC, H3C 3A7 (Canada)
2015-10-01
The validation process proposed by Babuška et al. [1] is applied to thermochemical models describing post-shock flow conditions. In this validation approach, experimental data is involved only in the calibration of the models, and the decision process is based on quantities of interest (QoIs) predicted on scenarios that are not necessarily amenable experimentally. Moreover, uncertainties present in the experimental data, as well as those resulting from an incomplete physical model description, are propagated to the QoIs. We investigate four commonly used thermochemical models: a one-temperature model (which assumes thermal equilibrium among all inner modes), and two-temperature models developed by Macheret et al. [2], Marrone and Treanor [3], and Park [4]. Up to 16 uncertain parameters are estimated using Bayesian updating based on the latest absolute volumetric radiance data collected at the Electric Arc Shock Tube (EAST) installed inside the NASA Ames Research Center. Following the solution of the inverse problems, the forward problems are solved in order to predict the radiative heat flux, QoI, and examine the validity of these models. Our results show that all four models are invalid, but for different reasons: the one-temperature model simply fails to reproduce the data while the two-temperature models exhibit unacceptably large uncertainties in the QoI predictions.
Klein, E K; Oddou-Muratorio, S
2011-03-01
Understanding precisely how plants disperse their seeds and pollen in their neighbourhood is a central question for both ecologists and evolutionary biologists because seed and pollen dispersal governs both the rate of spread of an expanding population and gene flow within and among populations. The concept of a 'dispersal kernel' has become extremely popular in dispersal ecology as a tool that summarizes how dispersal distributes individuals and genes in space and at a given scale. In this issue of Molecular Ecology, the study by Moran & Clark (2011) (M&C in the following) shows how genotypic and spatial data of established seedlings can be analysed in a Bayesian framework to estimate jointly the pollen and seed dispersal kernels and finally derive a parentage analysis from a full-probability approach. This approach applied to red oak shows important dispersal of seeds (138 m on average) and pollen (178 m on average). For seeds, this estimate contrasts with previous results from inverse modelling on seed trap data (9.3 m). This research gathers several methodological advances made in recent years in two research communities and could become a cornerstone for dispersal ecology.
Fajardo, Alvaro; Soñora, Martín; Moreno, Pilar; Moratorio, Gonzalo; Cristina, Juan
2016-10-01
Zika virus (ZIKV) is a member of the family Flaviviridae. In 2015, ZIKV triggered an epidemic in Brazil and spread across Latin America. By May of 2016, the World Health Organization had warned of the spread of ZIKV beyond this region. Detailed studies on the mode of evolution of ZIKV strains are extremely important for our understanding of the emergence and spread of ZIKV populations. In order to gain insight into these matters, a Bayesian coalescent Markov Chain Monte Carlo analysis of complete genome sequences of recently isolated ZIKV strains was performed. The results of these studies revealed a mean rate of evolution of 1.20 × 10⁻³ nucleotide substitutions per site per year (s/s/y) for ZIKV strains enrolled in this study. Several variants isolated in China are grouped together with all strains isolated in Latin America. Another genetic group composed exclusively of Chinese strains was also observed, suggesting the co-circulation of different genetic lineages in China. These findings indicate a high level of diversification of ZIKV populations. Strains isolated from microcephaly cases do not share amino acid substitutions, suggesting that other factors besides viral genetic differences may play a role in the proposed pathogenesis caused by ZIKV infection. J. Med. Virol. 88:1672-1676, 2016. © 2016 Wiley Periodicals, Inc.
Fardal, Mark A; Babul, Arif; Irwin, Mike J; Guhathakurta, Puragra; Gilbert, Karoline M; Ferguson, Annette M N; Ibata, Rodrigo A; Lewis, Geraint F; Tanvir, Nial R; Huxor, Avon P
2013-01-01
M31 has a giant stream of stars extending far to the south and a great deal of other tidal debris in its halo, much of which is thought to be directly associated with the southern stream. We model this structure by means of Bayesian sampling of parameter space, where each sample uses an N-body simulation of a satellite disrupting in M31's potential. We combine constraints on stellar surface densities from the Isaac Newton Telescope survey of M31 with kinematic data and photometric distances. This combination of data tightly constrains the model, indicating a stellar mass at last pericentric passage of log(M_s / M_sun) = 9.5 ± 0.1, comparable to the LMC. Any existing remnant of the satellite is expected to lie in the NE Shelf region beside M31's disk, at velocities more negative than M31's disk in this region. This rules out the prominent satellites M32 or NGC 205 as the progenitor, but an overdensity recently discovered in M31's NE disk sits at the edge of the progenitor locations found in the model. M31's viri...
Zubillaga, María; Skewes, Oscar; Soto, Nicolás; Rabinovich, Jorge E.; Colchero, Fernando
2014-01-01
Understanding the mechanisms that drive population dynamics is fundamental for management of wild populations. The guanaco (Lama guanicoe) is one of two wild camelid species in South America. We evaluated the effects of density dependence and weather variables on population regulation based on a time series of 36 years of population sampling of guanacos in Tierra del Fuego, Chile. The population density varied between 2.7 and 30.7 guanacos/km², with an apparent monotonic growth during the first 25 years; however, in the last 10 years the population has shown large fluctuations, suggesting that it might have reached its carrying capacity. We used a Bayesian state-space framework and model selection to determine the effect of density and environmental variables on guanaco population dynamics. Our results show that the population is under density-dependent regulation and that it is currently fluctuating around an average carrying capacity of 45,000 guanacos. We also found a significant positive effect of previous winter temperature, while sheep density has a strong negative effect on the guanaco population growth. We conclude that there are significant density-dependent processes and that climate as well as competition with domestic species have important effects in determining the population size of guanacos, with important implications for management and conservation. PMID:25514510
Niwayama, Ritsuya; Nagao, Hiromichi; Kitajima, Tomoya S; Hufnagel, Lars; Shinohara, Kyosuke; Higuchi, Tomoyuki; Ishikawa, Takuji; Kimura, Akatsuki
2016-01-01
Cellular structures are hydrodynamically interconnected, such that force generation in one location can move distal structures. One example of this phenomenon is cytoplasmic streaming, whereby active forces at the cell cortex induce streaming of the entire cytoplasm. However, it is not known how the spatial distribution and magnitude of these forces move distant objects within the cell. To address this issue, we developed a computational method that uses cytoplasm hydrodynamics to infer the spatial distribution of shear stress at the cell cortex induced by active force generators from experimentally obtained flow fields of cytoplasmic streaming. By applying this method, we determined the shear-stress distribution that quantitatively reproduces in vivo flow fields in Caenorhabditis elegans embryos and mouse oocytes during meiosis II. Shear stress in mouse oocytes was predicted to localize to a narrower cortical region than the region with a high cortical flow velocity and corresponded with the localization of the cortical actin cap. The predicted patterns of pressure gradient in both species were consistent with species-specific cytoplasmic streaming functions. The shear-stress distribution inferred by our method can contribute to the characterization of active force generation driving biological streaming. PMID:27472658
WHOOMP! (There It Is) Rapid Bayesian position reconstruction for gravitational-wave transients
Singer, Leo P
2015-01-01
Within the next few years, Advanced LIGO and Virgo should detect gravitational waves (GWs) from binary neutron star and neutron star-black hole mergers. These sources are also predicted to power a broad array of electromagnetic transients. Because the X-ray and optical signatures can be faint and fade rapidly, observing them hinges on rapidly inferring the sky location from the gravitational wave observations. Markov chain Monte Carlo (MCMC) methods for gravitational-wave parameter estimation can take hours or more. We introduce BAYESTAR, a rapid, Bayesian, non-MCMC sky localization algorithm that takes just seconds to produce probability sky maps that are comparable in accuracy to the full analysis. Prompt localizations from BAYESTAR will make it possible to search for electromagnetic counterparts of compact binary mergers.
Aerosol model selection and uncertainty modelling by adaptive MCMC technique
Directory of Open Access Journals (Sweden)
M. Laine
2008-12-01
We present a new technique for the model selection problem in atmospheric remote sensing. The technique is based on Monte Carlo sampling and it allows model selection, calculation of model posterior probabilities and model averaging in a Bayesian way.
The algorithm developed here is called the Adaptive Automatic Reversible Jump Markov chain Monte Carlo method (AARJ). It uses the Markov chain Monte Carlo (MCMC) technique and its extension called Reversible Jump MCMC. Both of these techniques have been used extensively in statistical parameter estimation problems in a wide range of applications since the late 1990s. The novel feature of our algorithm is that it is fully automatic and easy to use.
We show how the AARJ algorithm can be implemented and used for model selection and averaging, and to directly incorporate the model uncertainty. We demonstrate the technique by applying it to the statistical inversion problem of gas profile retrieval for the GOMOS instrument on board the ENVISAT satellite. Four simple models are used simultaneously to describe the dependence of the aerosol cross-sections on wavelength. During the AARJ estimation all the models are used and we obtain a probability distribution characterizing how probable each model is. By using model averaging, the uncertainty related to selecting the aerosol model can be taken into account in assessing the uncertainty of the estimates.
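As a rough illustration of the model-averaging idea described above, the following Python sketch assumes a reversible-jump chain has already been run and that its output is available as two lists: the index of the model visited at each iteration and the value of some retrieved quantity at that iteration. The function names and the example trace are hypothetical and are not taken from the AARJ implementation; the sketch only shows that model posterior probabilities can be estimated as visit frequencies, and that pooling draws across models folds model uncertainty into the spread of the estimate.

```python
from collections import Counter


def model_posterior_probs(model_trace):
    """Estimate P(model | data) as the fraction of iterations spent in each model."""
    counts = Counter(model_trace)
    n = len(model_trace)
    return {model: count / n for model, count in counts.items()}


def model_averaged_mean_var(value_trace):
    """Mean and sample variance of a quantity pooled over all visited models.

    Pooling the draws weights each model by how often the chain visited it,
    so the variance includes the uncertainty about which model is right.
    """
    n = len(value_trace)
    mean = sum(value_trace) / n
    var = sum((v - mean) ** 2 for v in value_trace) / (n - 1)
    return mean, var


# Hypothetical chain output (illustrative numbers only).
model_trace = [0, 0, 1, 0, 2, 1, 0, 0, 1, 0]
value_trace = [1.0, 1.2, 1.6, 0.9, 2.1, 1.5, 1.1, 1.0, 1.7, 0.9]

probs = model_posterior_probs(model_trace)
avg_mean, avg_var = model_averaged_mean_var(value_trace)
```

In a real retrieval the traces would come from the sampler after burn-in, and the quantity of interest would typically be a vector rather than a scalar.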
BayesLine: Bayesian Inference for Spectral Estimation of Gravitational Wave Detector Noise
Littenberg, Tyson B
2014-01-01
Gravitational wave data from ground-based detectors is dominated by instrument noise. Signals will be comparatively weak, and our understanding of the noise will influence detection confidence and signal characterization. Mis-modeled noise can produce large systematic biases in both model selection and parameter estimation. Here we introduce a multi-component, variable dimension, parameterized model to describe the Gaussian-noise power spectrum for data from ground-based gravitational wave interferometers. Called BayesLine, the algorithm models the noise power spectral density using cubic splines for smoothly varying broad-band noise and Lorentzians for narrow-band line features in the spectrum. We describe the algorithm and demonstrate its performance on data from the fifth and sixth LIGO science runs. Once fully integrated into LIGO/Virgo data analysis software, BayesLine will produce accurate spectral estimation and provide a means for marginalizing inferences drawn from the data over all plausible noise s...
Bayesian inference of baseline fertility and treatment effects via a crop yield-fertility model.
Directory of Open Access Journals (Sweden)
Hungyen Chen
To effectively manage soil fertility, knowledge is needed of how a crop uses nutrients from fertilizer applied to the soil. Soil quality is a combination of biological, chemical and physical properties and is hard to assess directly because of collective and multiple functional effects. In this paper, we focus on the application of these concepts to agriculture. We define the baseline fertility of soil as the level of fertility that a crop can acquire for growth from the soil. With this strict definition, we propose a new crop yield-fertility model that enables quantification of the process of improving baseline fertility and the effects of treatments solely from the time series of crop yields. The model was modified from Michaelis-Menten kinetics and measured the additional effects of the treatments given the baseline fertility. Using more than 30 years of experimental data, we used the Bayesian framework to estimate the improvements in baseline fertility and the effects of fertilizer and farmyard manure (FYM) on maize (Zea mays), barley (Hordeum vulgare), and soybean (Glycine max) yields. Fertilizer contributed the most to the barley yield and FYM contributed the most to the soybean yield among the three crops. The baseline fertility of the subsurface soil was very low for maize and barley prior to fertilization. In contrast, the baseline fertility in this soil approximated half-saturated fertility for the soybean crop. The long-term soil fertility was increased by adding FYM, but the effect of FYM addition was reduced by the addition of fertilizer. Our results provide evidence that long-term soil fertility under continuous farming was maintained, or increased, by the application of natural nutrients compared with the application of synthetic fertilizer.
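The abstract does not give the modified model itself, so the following Python sketch only illustrates the standard Michaelis-Menten saturation curve that it starts from, together with one assumed way a treatment could enter (additively, on the fertility scale). The function names, the additive form, and the numbers are all hypothetical. The sketch shows what "half-saturated fertility" means: when fertility equals the half-saturation constant, the yield is half of its maximum.

```python
def saturating_yield(fertility, y_max, k_half):
    """Michaelis-Menten response: yield rises with fertility and saturates at y_max."""
    if fertility < 0 or k_half <= 0:
        raise ValueError("fertility must be >= 0 and k_half must be > 0")
    return y_max * fertility / (k_half + fertility)


def yield_with_treatment(baseline, treatment_effect, y_max, k_half):
    """Assumed additive form: a treatment adds to the fertility available to the crop."""
    return saturating_yield(baseline + treatment_effect, y_max, k_half)


# Illustrative values only: baseline fertility equal to the half-saturation constant.
half_saturated = saturating_yield(2.0, y_max=10.0, k_half=2.0)

# The same treatment gives a smaller yield gain when baseline fertility is already high.
gain_low_baseline = yield_with_treatment(0.5, 1.0, 10.0, 2.0) - saturating_yield(0.5, 10.0, 2.0)
gain_high_baseline = yield_with_treatment(8.0, 1.0, 10.0, 2.0) - saturating_yield(8.0, 10.0, 2.0)
```

Under this assumed form, the diminishing gain at high baseline fertility is a property of the saturating curve alone; the interaction between FYM and fertilizer reported in the abstract would need the paper's actual model.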
Eadie, Gwendolyn; Harris, William E.; Springford, Aaron; Widrow, Larry
2017-01-01
The mass and cumulative mass profile of the Milky Way's dark matter halo are fundamental properties of the Galaxy, and yet these quantities remain poorly constrained and span almost two orders of magnitude in the literature. There are a variety of methods to measure the mass of the Milky Way, and a common way to constrain the mass uses kinematic information of satellite objects (e.g. globular clusters) orbiting the Galaxy. One reason precise estimates of the mass and mass profile remain elusive is that the kinematic data of the globular clusters are incomplete; for some both line-of-sight and proper motion measurements are available (i.e. complete data), and for others there are only line-of-sight velocities (i.e. incomplete data). Furthermore, some proper motion measurements suffer from large measurement uncertainties, and these uncertainties can be difficult to take into account because they propagate in complicated ways. Past methods have dealt with incomplete data by using either only the line-of-sight measurements (and throwing away the proper motions), or only the complete data. In either case, valuable information is not included in the analysis. During my PhD research, I have been developing a coherent hierarchical Bayesian method to estimate the mass and mass profile of the Galaxy that 1) includes both complete and incomplete kinematic data simultaneously in the analysis, and 2) includes measurement uncertainties in a meaningful way. In this presentation, I will introduce our approach in a way that is accessible and clear, and will also present our estimates of the Milky Way's total mass and mass profile using all available kinematic data from the globular cluster population of the Galaxy.
A method of spherical harmonic analysis in the geosciences via hierarchical Bayesian inference
Muir, J. B.; Tkalčić, H.
2015-11-01
The problem of decomposing irregular data on the sphere into a set of spherical harmonics is common in many fields of geosciences where it is necessary to build a quantitative understanding of a globally varying field. For example, in global seismology, a compressional or shear wave speed that emerges from tomographic images is used to interpret current state and composition of the mantle, and in geomagnetism, secular variation of magnetic field intensity measured at the surface is studied to better understand the changes in the Earth's core. Optimization methods are widely used for spherical harmonic analysis of irregular data, but they typically do not treat the dependence of the uncertainty estimates on the imposed regularization. This can cause significant difficulties in interpretation, especially when the best-fit model requires more variables as a result of underestimating data noise. Here, with the above limitations in mind, the problem of spherical harmonic expansion of irregular data is treated within the hierarchical Bayesian framework. The hierarchical approach significantly simplifies the problem by removing the need for regularization terms and user-supplied noise estimates. The use of the corrected Akaike Information Criterion for picking the optimal maximum degree of spherical harmonic expansion and the resulting spherical harmonic analyses are first illustrated on a noisy synthetic data set. Subsequently, the method is applied to two global data sets sensitive to the Earth's inner core and lowermost mantle, consisting of PKPab-df and PcP-P differential traveltime residuals relative to a spherically symmetric Earth model. The posterior probability distributions for each spherical harmonic coefficient are calculated via Markov Chain Monte Carlo sampling; the uncertainty obtained for the coefficients thus reflects the noise present in the real data and the imperfections in the spherical harmonic expansion.
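The degree-selection step mentioned above can be sketched in Python using the standard corrected Akaike Information Criterion, AICc = 2k - 2 ln L + 2k(k + 1)/(n - k - 1), where k is the number of free parameters and n the number of data. The sketch assumes that k equals the number of real spherical harmonic coefficients up to the maximum degree, (L + 1)^2; whether the authors also count noise hyperparameters in k is not stated in the abstract. The function names and the log-likelihood values are hypothetical.

```python
def n_coefficients(max_degree):
    """Number of real spherical harmonic coefficients up to max_degree: (L + 1)^2."""
    return (max_degree + 1) ** 2


def aicc(log_likelihood, k, n):
    """Corrected AIC: the usual AIC plus the small-sample penalty 2k(k+1)/(n-k-1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    return 2 * k - 2 * log_likelihood + 2 * k * (k + 1) / (n - k - 1)


def best_degree(log_likelihoods, n):
    """Pick the maximum degree with the lowest AICc.

    log_likelihoods maps each candidate maximum degree to the maximized
    log-likelihood of the corresponding expansion.
    """
    return min(
        log_likelihoods,
        key=lambda deg: aicc(log_likelihoods[deg], n_coefficients(deg), n),
    )


# Illustrative values only: fit improves quickly up to degree 2, then flattens.
candidate_fits = {0: -400.0, 1: -350.0, 2: -300.0, 3: -298.0, 4: -297.0}
chosen = best_degree(candidate_fits, n=200)
```

With these made-up numbers the penalty outweighs the small gains in fit beyond degree 2, which is the behavior the criterion is meant to produce.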
Bayesian inference of selection in a heterogeneous environment from genetic time-series data.
Gompert, Zachariah
2016-01-01
Evolutionary geneticists have sought to characterize the causes and molecular targets of selection in natural populations for many years. Although this research programme has been somewhat successful, most statistical methods employed were designed to detect consistent, weak to moderate selection. In contrast, phenotypic studies in nature show that selection varies in time and that individual bouts of selection can be strong. Measurements of the genomic consequences of such fluctuating selection could help test and refine hypotheses concerning the causes of ecological specialization and the maintenance of genetic variation in populations. Herein, I propose a Bayesian nonhomogeneous hidden Markov model to estimate effective population sizes and quantify variable selection in heterogeneous environments from genetic time-series data. The model is described and then evaluated using a series of simulated data, including cases where selection occurs on a trait with a simple or polygenic molecular basis. The proposed method accurately distinguished neutral loci from non-neutral loci under strong selection, but not from those under weak selection. Selection coefficients were accurately estimated when selection was constant or when the fitness values of genotypes varied linearly with the environment, but these estimates were less accurate when fitness was polygenic or the relationship between the environment and the fitness of genotypes was nonlinear. Past studies of temporal evolutionary dynamics in laboratory populations have been remarkably successful. The proposed method makes similar analyses of genetic time-series data from natural populations more feasible and thereby could help answer fundamental questions about the causes and consequences of evolution in the wild.
2016-01-01
DNA double-strand breaks are lesions that form during metabolism, DNA replication and exposure to mutagens. When a double-strand break occurs one of a number of repair mechanisms is recruited, all of which have differing propensities for mutational events. Despite DNA repair being of crucial importance, the relative contribution of these mechanisms and their regulatory interactions remain to be fully elucidated. Understanding these mutational processes will have a profound impact on our knowledge of genomic instability, with implications across health, disease and evolution. Here we present a new method to model the combined activation of non-homologous end joining, single strand annealing and alternative end joining, following exposure to ionising radiation. We use Bayesian statistics to integrate eight biological data sets of double-strand break repair curves under varying genetic knockouts and confirm that our model is predictive by re-simulating and comparing to additional data. Analysis of the model suggests that there are at least three disjoint modes of repair, which we assign as fast, slow and intermediate. Our results show that when multiple data sets are combined, the rate for intermediate repair is variable amongst genetic knockouts. Further analysis suggests that the ratio between slow and intermediate repair depends on the presence or absence of DNA-PKcs and Ku70, which implies that non-homologous end joining and alternative end joining are not independent. Finally, we consider the proportion of double-strand breaks within each mechanism as a time series and predict activity as a function of repair rate. We outline how our insights can be directly tested using imaging and sequencing techniques and conclude that there is evidence of variable dynamics in alternative repair pathways. Our approach is an important step towards providing a unifying theoretical framework for the dynamics of DNA repair processes. PMID:27741226
Xing, Chuanhua; Dunson, David B
2011-07-01
Protein-protein interactions (PPIs) are essential to most fundamental cellular processes. There has been increasing interest in reconstructing PPI networks. However, several critical difficulties exist in obtaining reliable predictions. Notably, false positive rates can exceed 80%. Error correction from each generating source can be both time-consuming and inefficient due to the difficulty of covering the errors from multiple levels of data processing procedures within a single test. We propose a novel Bayesian integration method, termed nonparametric Bayes ensemble learning (NBEL), to lower the misclassification rate (both false positives and negatives) by automatically up-weighting the data sources that are most informative, while down-weighting less informative and biased sources. Extensive studies indicate that NBEL is significantly more robust than the classic naïve Bayes to unreliable, error-prone and contaminated data. On a large human data set our NBEL approach predicts many more PPIs than naïve Bayes. This suggests that previous studies may have large numbers of not only false positives but also false negatives. Validation on two high-quality human PPI datasets supports our observations. Our experiments demonstrate that it is feasible to predict high-throughput PPIs computationally with substantially reduced false positives and false negatives. The ability to predict large numbers of PPIs both reliably and automatically may inspire people to use computational approaches to correct data errors in general, and may speed up high-quality PPI prediction. Such reliable prediction may provide a solid platform for other studies, such as protein function prediction and the roles of PPIs in disease susceptibility.
Final Report: Large-Scale Optimization for Bayesian Inference in Complex Systems
Energy Technology Data Exchange (ETDEWEB)
Ghattas, Omar [The University of Texas at Austin
2013-10-15
The SAGUARO (Scalable Algorithms for Groundwater Uncertainty Analysis and Robust Optimization) Project focuses on the development of scalable numerical algorithms for large-scale Bayesian inversion in complex systems that capitalize on advances in large-scale simulation-based optimization and inversion methods. Our research is directed in three complementary areas: efficient approximations of the Hessian operator, reductions in complexity of forward simulations via stochastic spectral approximations and model reduction, and employing large-scale optimization concepts to accelerate sampling. Our efforts are integrated in the context of a challenging testbed problem that considers subsurface reacting flow and transport. The MIT component of the SAGUARO Project addresses the intractability of conventional sampling methods for large-scale statistical inverse problems by devising reduced-order models that are faithful to the full-order model over a wide range of parameter values; sampling then employs the reduced model rather than the full model, resulting in very large computational savings. Results indicate little effect on the computed posterior distribution. On the other hand, in the Texas-Georgia Tech component of the project, we retain the full-order model, but exploit inverse problem structure (adjoint-based gradients and partial Hessian information of the parameter-to-observation map) to implicitly extract lower dimensional information on the posterior distribution; this greatly speeds up sampling methods, so that fewer sampling points are needed. We can think of these two approaches as "reduce then sample" and "sample then reduce." In fact, these two approaches are complementary, and can be used in conjunction with each other. Moreover, they both exploit deterministic inverse problem structure, in the form of adjoint-based gradient and Hessian information of the underlying parameter-to-observation map, to achieve their speedups.
A Bayesian Predictive Discriminant Analysis with Screened Data
Directory of Open Access Journals (Sweden)
Hea-Jung Kim
2015-09-01
In the application of discriminant analysis, a situation sometimes arises where individual measurements are screened by a multidimensional screening scheme. For this situation, a discriminant analysis with screened populations is considered from a Bayesian viewpoint, and an optimal predictive rule for the analysis is proposed. In order to establish a flexible method to incorporate the prior information of the screening mechanism, we propose a hierarchical screened scale mixture of normal (HSSMN) model, which makes provision for flexible modeling of the screened observations. A Markov chain Monte Carlo (MCMC) method using the Gibbs sampler and the Metropolis–Hastings algorithm within the Gibbs sampler is used to perform a Bayesian inference on the HSSMN models and to approximate the optimal predictive rule. A simulation study is given to demonstrate the performance of the proposed predictive discrimination procedure.
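The general pattern of a Metropolis-Hastings step nested inside a Gibbs sampler can be sketched on a much simpler toy problem than the HSSMN model, whose conditionals are not given in the abstract. The Python sketch below assumes normally distributed data with unknown mean and standard deviation, a flat prior on the mean, and a flat prior on log(sigma). The mean is drawn exactly from its conditional (the Gibbs step), while log(sigma) is updated with a random-walk Metropolis-Hastings step. All names, priors and tuning values are illustrative assumptions.

```python
import math
import random


def log_target_log_sigma(log_sigma, sum_sq, n):
    """Log conditional density of log(sigma) given mu, up to a constant.

    Assumes a flat prior on log(sigma); sum_sq is sum((y - mu)^2).
    """
    return -n * log_sigma - sum_sq / (2.0 * math.exp(2.0 * log_sigma))


def mh_within_gibbs(data, n_iter=4000, burn_in=1000, step=0.2, seed=1):
    rng = random.Random(seed)
    n = len(data)
    ybar = sum(data) / n
    log_sigma = 0.0
    mu_draws, sigma_draws = [], []
    for it in range(n_iter):
        # Gibbs step: with a flat prior, mu | sigma, data ~ Normal(ybar, sigma^2 / n).
        sigma = math.exp(log_sigma)
        mu = rng.gauss(ybar, sigma / math.sqrt(n))

        # Metropolis-Hastings step: symmetric random-walk proposal on log(sigma).
        sum_sq = sum((y - mu) ** 2 for y in data)
        proposal = log_sigma + rng.gauss(0.0, step)
        log_ratio = (log_target_log_sigma(proposal, sum_sq, n)
                     - log_target_log_sigma(log_sigma, sum_sq, n))
        if rng.random() < math.exp(min(0.0, log_ratio)):
            log_sigma = proposal

        if it >= burn_in:
            mu_draws.append(mu)
            sigma_draws.append(math.exp(log_sigma))
    return mu_draws, sigma_draws


# Synthetic data for illustration: 50 draws from Normal(5, 1).
generator = random.Random(0)
data = [generator.gauss(5.0, 1.0) for _ in range(50)]
mu_draws, sigma_draws = mh_within_gibbs(data)
```

The same structure carries over to larger models: parameters with tractable conditionals get exact Gibbs draws, and the rest get a Metropolis-Hastings update inside each sweep.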
Kedzierska, Anna; Husmeier, Dirk
2006-01-01
We propose a heuristic approach to the detection of evidence for recombination and gene conversion in multiple DNA sequence alignments. The proposed method consists of two stages. In the first stage, a sliding window is moved along the DNA sequence alignment, and phylogenetic trees are sampled from the conditional posterior distribution with MCMC. To reduce the noise intrinsic to inference from the limited amount of data available in the typically short sliding window, a clustering algorithm based on the Robinson-Foulds distance is applied to the trees thus sampled, and the posterior distribution over tree clusters is obtained for each window position. While changes in this posterior distribution are indicative of recombination or gene conversion events, it is difficult to decide when such a change is statistically significant. This problem is addressed in the second stage of the proposed algorithm, where the distributions obtained in the first stage are post-processed with a Bayesian hidden Markov model (HMM). The emission states of the HMM are associated with posterior distributions over phylogenetic tree topology clusters. The hidden states of the HMM indicate putative recombinant segments. Inference is done in a Bayesian sense, sampling parameters from the posterior distribution with MCMC. Of particular interest is the determination of the number of hidden states as an indication of the number of putative recombinant regions. To this end, we apply reversible jump MCMC, and sample the number of hidden states from the respective posterior distribution.
Directory of Open Access Journals (Sweden)
Mario A Pardo
Full Text Available We inferred the population densities of blue whales (Balaenoptera musculus) and short-beaked common dolphins (Delphinus delphis) in the Northeast Pacific Ocean as functions of the water-column's physical structure by implementing hierarchical models in a Bayesian framework. This approach allowed us to propagate the uncertainty of the field observations into the inference of species-habitat relationships and to generate spatially explicit population density predictions with reduced effects of sampling heterogeneity. Our hypothesis was that the large-scale spatial distributions of these two cetacean species respond primarily to ecological processes resulting from shoaling and outcropping of the pycnocline in regions of wind-forced upwelling and eddy-like circulation. Physically, these processes affect the thermodynamic balance of the water column, decreasing its volume and thus the height of the absolute dynamic topography (ADT). Biologically, they lead to elevated primary productivity and persistent aggregation of low-trophic-level prey. Unlike other remotely sensed variables, ADT provides information about the structure of the entire water column and it is also routinely measured at high spatial-temporal resolution by satellite altimeters with uniform global coverage. Our models provide spatially explicit population density predictions for both species, even in areas where the pycnocline shoals but does not outcrop (e.g. the Costa Rica Dome and the North Equatorial Countercurrent thermocline ridge). Interannual variations in distribution during El Niño anomalies suggest that the population density of both species decreases dramatically in the Equatorial Cold Tongue and the Costa Rica Dome, and that their distributions retract to particular areas that remain productive, such as the more oceanic waters in the central California Current System, the northern Gulf of California, the North Equatorial Countercurrent thermocline ridge, and the more
Pardo, Mario A; Gerrodette, Tim; Beier, Emilio; Gendron, Diane; Forney, Karin A; Chivers, Susan J; Barlow, Jay; Palacios, Daniel M
2015-01-01
We inferred the population densities of blue whales (Balaenoptera musculus) and short-beaked common dolphins (Delphinus delphis) in the Northeast Pacific Ocean as functions of the water-column's physical structure by implementing hierarchical models in a Bayesian framework. This approach allowed us to propagate the uncertainty of the field observations into the inference of species-habitat relationships and to generate spatially explicit population density predictions with reduced effects of sampling heterogeneity. Our hypothesis was that the large-scale spatial distributions of these two cetacean species respond primarily to ecological processes resulting from shoaling and outcropping of the pycnocline in regions of wind-forced upwelling and eddy-like circulation. Physically, these processes affect the thermodynamic balance of the water column, decreasing its volume and thus the height of the absolute dynamic topography (ADT). Biologically, they lead to elevated primary productivity and persistent aggregation of low-trophic-level prey. Unlike other remotely sensed variables, ADT provides information about the structure of the entire water column and it is also routinely measured at high spatial-temporal resolution by satellite altimeters with uniform global coverage. Our models provide spatially explicit population density predictions for both species, even in areas where the pycnocline shoals but does not outcrop (e.g. the Costa Rica Dome and the North Equatorial Countercurrent thermocline ridge). Interannual variations in distribution during El Niño anomalies suggest that the population density of both species decreases dramatically in the Equatorial Cold Tongue and the Costa Rica Dome, and that their distributions retract to particular areas that remain productive, such as the more oceanic waters in the central California Current System, the northern Gulf of California, the North Equatorial Countercurrent thermocline ridge, and the more southern portion of the
Directory of Open Access Journals (Sweden)
Silvan Türkcan
Full Text Available The statistical properties of membrane protein random walks reveal information on the interactions between the proteins and their environments. These interactions can be included in an overdamped Langevin equation framework where they are injected in either or both the friction field and the potential field. Using a Bayesian inference scheme, both the friction and potential fields acting on the ε-toxin receptor in its lipid raft have been measured. Two types of events were used to probe these interactions. First, active events, the removal of cholesterol and sphingolipid molecules, were used to measure the time evolution of confining potentials and diffusion fields. Second, passive rare events, de-confinement of the receptors from one raft and transition to an adjacent one, were used to measure hopping energies. Lipid interactions with the ε-toxin receptor are found to be an essential source of confinement. ε-toxin receptor confinement is due to both the friction and potential field induced by cholesterol and sphingolipids. Finally, the statistics of hopping energies reveal sub-structures of potentials in the rafts, characterized by small hopping energies, and the difference of solubilization energy between the inner and outer raft area, characterized by higher hopping energies.
Clérouin, Jean; Desbiens, Nicolas; Dubois, Vincent; Arnault, Philippe
2016-12-01
We show that the Bayesian inference of recently measured x-ray diffraction spectra from laser-shocked aluminum [L. B. Fletcher et al., Nat. Photon. 9, 274 (2015), 10.1038/nphoton.2015.41] with the one-component-plasma (OCP) model performs remarkably well at estimating the ionic density and temperature. This statistical approach requires many evaluations of the OCP static structure factor, which were done using a recently derived analytic fit. The atomic form factor is approximated by an exponential function in the diffraction window of the first peak. The electronic temperature is then estimated from a comparison of this approximated form factor with the electronic structure of an average atom model. Out-of-equilibrium states, with electrons hotter than ions, are diagnosed for the spectra obtained early after the pump, whereas at a late time delay the plasma is at thermal equilibrium. Apart from the present findings, this OCP-based modeling of warm dense matter has an important role to play in the interpretation of x-ray Thomson scattering measurements currently performed at large laser facilities.
Sobradelo, Rosa; Martí, Joan
2015-01-01
One of the most challenging aspects of managing a volcanic crisis is the interpretation of the monitoring data, so as to anticipate the evolution of the unrest and implement timely mitigation actions. An unrest episode may include different stages or time intervals of increasing activity that may or may not precede a volcanic eruption, depending on the causes of the unrest (magmatic, geothermal or tectonic). Therefore, one of the main goals in monitoring volcanic unrest is to forecast whether or not such an increase in activity will end in an eruption, and if so, how, when, and where this eruption will take place. As an alternative method to expert elicitation for assessing and merging monitoring data and relevant past information, we present a probabilistic method to transform precursory activity into the probability of experiencing a significant variation by the next time interval (i.e. the next step in the unrest), given its preceding evolution, and by further estimating the probability of the occurrence of a particular eruptive scenario combining monitoring and past data. With the 1991 Pinatubo volcanic crisis as a reference, we have developed such a method to assess short-term volcanic hazard using Bayesian inference.
D'Agostini, G
2010-01-01
Triggered by a recent interesting New Scientist article on the too frequent incorrect use of probabilistic evidence in courts, I introduce the basic concepts of probabilistic inference with a toy model, and discuss several important issues that need to be understood in order to extend the basic reasoning to real life cases. In particular, I emphasize the often neglected point that degrees of belief are updated not by 'bare facts' alone, but by all available information pertaining to them, including how they have been acquired. In this light I show that, contrary to what was claimed in that article, there was no "probabilistic pitfall" in the Columbo episode pointed to as an example of "bad mathematics" yielding "rough justice". Instead, such a criticism could have a 'negative reaction' to the article itself and to the use of Bayesian reasoning in courts, as well as in all other places in which probabilities need to be assessed and decisions need to be made. Anyway, besides introductory/recreational aspects, the pape...
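The kind of belief updating discussed in the abstract above can be illustrated with Bayes' rule for a single binary hypothesis. The sketch below is not the paper's toy model: the function names and the numbers in the usage note are hypothetical, and the sequential update additionally assumes that the pieces of evidence are conditionally independent given the hypothesis.

```python
def posterior_probability(prior, likelihood_if_true, likelihood_if_false):
    """Bayes' rule for a binary hypothesis H and one piece of evidence E.

    prior               = P(H)
    likelihood_if_true  = P(E | H)
    likelihood_if_false = P(E | not H)
    """
    numerator = prior * likelihood_if_true
    evidence = numerator + (1.0 - prior) * likelihood_if_false
    return numerator / evidence


def update_sequentially(prior, likelihood_pairs):
    """Apply Bayes' rule once per (P(E|H), P(E|not H)) pair.

    Assumes the pieces of evidence are conditionally independent given H.
    """
    probability = prior
    for likelihood_if_true, likelihood_if_false in likelihood_pairs:
        probability = posterior_probability(
            probability, likelihood_if_true, likelihood_if_false
        )
    return probability
```

For example, with a 1% prior, evidence that is seen 99% of the time when the hypothesis holds but also 5% of the time when it does not raises the probability only to one in six, which is the sort of result that is easy to misjudge without doing the arithmetic.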
Ata, Metin; Müller, Volker
2014-01-01
We present a Bayesian reconstruction algorithm to generate unbiased samples of the underlying dark matter field from galaxy redshift data. Our new contribution consists of implementing a non-Poisson likelihood including a deterministic non-linear and scale-dependent bias. In particular we present the Hamiltonian equations of motion for the negative binomial (NB) probability distribution function. This permits us to efficiently sample the posterior distribution function of density fields given a sample of galaxies using the Hamiltonian Monte Carlo technique implemented in the Argo code. We have tested our algorithm with the Bolshoi N-body simulation, inferring the underlying dark matter density field from a subsample of the halo catalogue. Our method shows that we can draw closely unbiased samples (compatible within 1-$\sigma$) from the posterior distribution up to scales of about k~1 h/Mpc in terms of power-spectra and cell-to-cell correlations. We find that a Poisson likelihood yields reconstructions with p...
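As a small illustration of how a negative binomial likelihood relaxes the Poisson assumption, the sketch below evaluates the NB log-probability of a count in a common mean/dispersion parameterization. That parameterization, and the names `nb_log_pmf`, `mean` and `beta`, are assumptions made here; the abstract does not state which form the Argo code uses. In this form the variance is `mean + mean**2 / beta`, so the Poisson case is recovered as `beta` grows large.

```python
import math


def nb_log_pmf(n, mean, beta):
    """Log-probability of count n under a negative binomial distribution.

    Parameterized by its mean and a dispersion parameter beta (assumed form):
    variance = mean + mean**2 / beta, so beta -> infinity gives Poisson.
    """
    return (
        math.lgamma(beta + n) - math.lgamma(n + 1) - math.lgamma(beta)
        + beta * math.log(beta / (beta + mean))
        + n * math.log(mean / (beta + mean))
    )


def poisson_log_pmf(n, mean):
    """Log-probability of count n under a Poisson distribution."""
    return n * math.log(mean) - mean - math.lgamma(n + 1)
```

Small values of `beta` give counts that are much more scattered than Poisson at the same mean, which is the extra freedom a non-Poisson likelihood of this kind provides.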
Lu, Yu; Katz, Neal; Weinberg, Martin D
2011-01-01
We conduct Bayesian model inferences from the observed K-band luminosity function of galaxies in the local Universe, using the semi-analytic model (SAM) of galaxy formation introduced in Lu et al (2011). The prior distributions for the 14 free parameters include a large range of possible models. We find that some of the free parameters, e.g. the characteristic scales for quenching star formation in both high-mass and low-mass halos, are already tightly constrained by the single data set. The posterior distribution includes the model parameters adopted in other SAMs. By marginalising over the posterior distribution, we make predictions that include the full inferential uncertainties for the colour-magnitude relation, the Tully-Fisher relation, the conditional stellar mass function of galaxies in halos of different masses, the HI mass function, the redshift evolution of the stellar mass function of galaxies, and the global star formation history. Using posterior predictive checking with the available observatio...
Directory of Open Access Journals (Sweden)
Heringstad Bjørg
2010-07-01
Full Text Available Abstract. Background: In the genetic analysis of binary traits with one observation per animal, animal threshold models frequently give biased heritability estimates. In some cases, this problem can be circumvented by fitting sire- or sire-dam models. However, these models are not appropriate in cases where individual records exist on parents. Therefore, the aim of our study was to develop a new Gibbs sampling algorithm for a proper estimation of genetic (co)variance components within an animal threshold model framework. Methods: In the proposed algorithm, individuals are classified as either "informative" or "non-informative" with respect to genetic (co)variance components. The "non-informative" individuals are characterized by their Mendelian sampling deviations (deviance from the mid-parent mean) being completely confounded with a single residual on the underlying liability scale. For threshold models, residual variance on the underlying scale is not identifiable. Hence, variance of fully confounded Mendelian sampling deviations cannot be identified either, but can be inferred from the between-family variation. In the new algorithm, breeding values are sampled as in a standard animal model using the full relationship matrix, but genetic (co)variance components are inferred from the sampled breeding values and relationships between "informative" individuals (usually parents only). The latter is analogous to a sire-dam model (in cases with no individual records on the parents). Results: When applied to simulated data sets, the standard animal threshold model failed to produce useful results since samples of genetic variance always drifted towards infinity, while the new algorithm produced proper parameter estimates essentially identical to the results from a sire-dam model (given the fact that no individual records exist for the parents). Furthermore, the new algorithm showed much faster Markov chain mixing properties for genetic parameters (similar to
DeLannoy, Gabrielle J. M.; Reichle, Rolf H.; Vrugt, Jasper A.
2013-01-01
Uncertainties in L-band (1.4 GHz) radiative transfer modeling (RTM) affect the simulation of brightness temperatures (Tb) over land and the inversion of satellite-observed Tb into soil moisture retrievals. In particular, accurate estimates of the microwave soil roughness, vegetation opacity and scattering albedo for large-scale applications are difficult to obtain from field studies and often lack an uncertainty estimate. Here, a Markov Chain Monte Carlo (MCMC) simulation method is used to determine satellite-scale estimates of RTM parameters and their posterior uncertainty by minimizing the misfit between long-term averages and standard deviations of simulated and observed Tb at a range of incidence angles, at horizontal and vertical polarization, and for morning and evening overpasses. Tb simulations are generated with the Goddard Earth Observing System (GEOS-5) and confronted with Tb observations from the Soil Moisture Ocean Salinity (SMOS) mission. The MCMC algorithm suggests that the relative uncertainty of the RTM parameter estimates is typically less than 25% of the maximum a posteriori density (MAP) parameter value. Furthermore, the actual root-mean-square differences in long-term Tb averages and standard deviations are found to be consistent with the respective estimated total simulation and observation error standard deviations of $\sigma_m = 3.1$ K and $\sigma_s = 2.4$ K. It is also shown that the MAP parameter values estimated through MCMC simulation are in close agreement with those obtained with Particle Swarm Optimization (PSO).
Bayesian Spatial Modelling with R-INLA
Directory of Open Access Journals (Sweden)
Finn Lindgren
2015-02-01
Full Text Available The principles behind the interface to continuous domain spatial models in the R-INLA software package for R are described. The integrated nested Laplace approximation (INLA) approach proposed by Rue, Martino, and Chopin (2009) is a computationally effective alternative to MCMC for Bayesian inference. INLA is designed for latent Gaussian models, a very wide and flexible class of models ranging from (generalized) linear mixed to spatial and spatio-temporal models. Combined with the stochastic partial differential equation approach (SPDE, Lindgren, Rue, and Lindström 2011), one can accommodate all kinds of geographically referenced data, including areal and geostatistical ones, as well as spatial point process data. The implementation interface covers stationary spatial models, non-stationary spatial models, and also spatio-temporal models, and is applicable in epidemiology, ecology, environmental risk assessment, as well as general geostatistics.
Equifinality of formal (DREAM) and informal (GLUE) Bayesian approaches in hydrologic modeling?
Energy Technology Data Exchange (ETDEWEB)
Vrugt, Jasper A [Los Alamos National Laboratory; Robinson, Bruce A [Los Alamos National Laboratory; Ter Braak, Cajo J F [NON LANL; Gupta, Hoshin V [NON LANL
2008-01-01
In recent years, a strong debate has emerged in the hydrologic literature regarding what constitutes an appropriate framework for uncertainty estimation. Particularly, there is strong disagreement whether an uncertainty framework should have its roots within a proper statistical (Bayesian) context, or whether such a framework should be based on a different philosophy and implement informal measures and weaker inference to summarize parameter and predictive distributions. In this paper, we compare a formal Bayesian approach using Markov Chain Monte Carlo (MCMC) with generalized likelihood uncertainty estimation (GLUE) for assessing uncertainty in conceptual watershed modeling. Our formal Bayesian approach is implemented using the recently developed differential evolution adaptive metropolis (DREAM) MCMC scheme with a likelihood function that explicitly considers model structural, input and parameter uncertainty. Our results demonstrate that DREAM and GLUE can generate very similar estimates of total streamflow uncertainty. This suggests that formal and informal Bayesian approaches have more common ground than the hydrologic literature and ongoing debate might suggest. The main advantage of formal approaches is, however, that they attempt to disentangle the effect of forcing, parameter and model structural error on total predictive uncertainty. This is key to improving hydrologic theory and to better understand and predict the flow of water through catchments.
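For readers unfamiliar with the informal side of this comparison, the following is a minimal sketch of a GLUE-style analysis, not the configuration used in the study. It assumes the Nash-Sutcliffe efficiency as the informal likelihood measure and a fixed "behavioural" threshold, both common but by no means the only choices; the function names and arguments are hypothetical. Parameter sets are drawn from the prior, scored against the observations, discarded if they fall below the threshold, and the remaining scores are rescaled into weights.

```python
import random


def nash_sutcliffe(simulated, observed):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((s - o) ** 2 for s, o in zip(simulated, observed))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst


def glue(model, observed, sample_prior, threshold, n_draws, seed=0):
    """Return (parameter, weight) pairs for the behavioural parameter sets.

    model(theta) returns a simulated series; sample_prior(rng) draws theta.
    threshold should be positive so that every retained score is positive.
    """
    rng = random.Random(seed)
    behavioural = []
    for _ in range(n_draws):
        theta = sample_prior(rng)
        score = nash_sutcliffe(model(theta), observed)
        if score > threshold:
            behavioural.append((theta, score))
    total = sum(score for _, score in behavioural)
    return [(theta, score / total) for theta, score in behavioural]
```

The weights play the role that posterior probabilities play in a formal MCMC analysis, but they depend on the subjective choice of measure and threshold, which is the crux of the debate described above.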
Gupta, Cherry; Cobre, Juliana; Polpo, Adriano; Sinha, Debjayoti
2016-09-01
Existing cure-rate survival models are generally not convenient for modeling and estimating the survival quantiles of a patient with specified covariate values. This paper proposes a novel class of cure-rate model, the transform-both-sides cure-rate model (TBSCRM), that can be used to make inferences about both the cure-rate and the survival quantiles. We develop Bayesian inference about the covariate effects on the cure-rate as well as on the survival quantiles via Markov Chain Monte Carlo (MCMC) tools. We also show that the TBSCRM-based Bayesian method outperforms methods based on existing cure-rate models in our simulation studies and in an application to the breast cancer survival data from the National Cancer Institute's Surveillance, Epidemiology, and End Results (SEER) database.
Boers, Niklas; Goswami, Bedartha; Chekroun, Mickael; Svensson, Anders; Rousseau, Denis-Didier; Ghil, Michael
2016-04-01
In the recent past, empirical stochastic models have been successfully applied to model a wide range of climatic phenomena [1,2]. In addition to enhancing our understanding of the geophysical systems under consideration, multilayer stochastic models (MSMs) have been shown to be solidly grounded in the Mori-Zwanzig formalism of statistical physics [3]. They are also well-suited for predictive purposes, e.g., for the El Niño Southern Oscillation [4] and the Madden-Julian Oscillation [5]. In general, these models are trained on a given time series under consideration, and then assumed to reproduce certain dynamical properties of the underlying natural system. Most existing approaches are based on least-squares fitting to determine optimal model parameters, which does not allow for an uncertainty estimation of these parameters. This approach significantly limits the degree to which dynamical characteristics of the time series can be safely inferred from the model. Here, we are specifically interested in fitting low-dimensional stochastic models to time series obtained from paleoclimatic proxy records, such as the oxygen isotope ratio and dust concentration of the NGRIP record [6]. The time series derived from these records exhibit substantial dating uncertainties, in addition to the proxy measurement errors. In particular, for time series of this kind, it is crucial to obtain uncertainty estimates for the final model parameters. Following [7], we first propose a statistical procedure to shift dating uncertainties from the time axis to the proxy axis of layer-counted paleoclimatic records. Thereafter, we show how Maximum Likelihood Estimation in combination with Markov Chain Monte Carlo parameter sampling can be employed to translate all uncertainties present in the original proxy time series to uncertainties of the parameter estimates of the stochastic model. We compare time series simulated by the empirical model to the original time series in terms of standard
The Bayesian group lasso for confounded spatial data
Hefley, Trevor J.; Hooten, Mevin B.; Hanks, Ephraim M.; Russell, Robin E.; Walsh, Daniel P.
2017-01-01
Generalized linear mixed models for spatial processes are widely used in applied statistics. In many applications of the spatial generalized linear mixed model (SGLMM), the goal is to obtain inference about regression coefficients while achieving optimal predictive ability. When implementing the SGLMM, multicollinearity among covariates and the spatial random effects can make computation challenging and influence inference. We present a Bayesian group lasso prior with a single tuning parameter that can be chosen to optimize predictive ability of the SGLMM and jointly regularize the regression coefficients and spatial random effect. We implement the group lasso SGLMM using efficient Markov chain Monte Carlo (MCMC) algorithms and demonstrate how multicollinearity among covariates and the spatial random effect can be monitored as a derived quantity. To test our method, we compared several parameterizations of the SGLMM using simulated data and two examples from plant ecology and disease ecology. In all examples, problematic levels of multicollinearity occurred and influenced sampling efficiency and inference. We found that the group lasso prior resulted in roughly twice the effective sample size for MCMC samples of regression coefficients and can have higher and less variable predictive accuracy based on out-of-sample data when compared to the standard SGLMM.
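The effective sample size mentioned above measures how many independent draws an autocorrelated MCMC chain is worth. A minimal sketch of one simple estimator follows; it is an assumption made for illustration, not the estimator used in the paper. It sums sample autocorrelations up to the first non-positive lag and returns n / (1 + 2 * sum). The helper `ar1_chain` is a hypothetical stand-in for real sampler output.

```python
import random


def effective_sample_size(chain):
    """Estimate ESS as n / (1 + 2 * sum of leading positive autocorrelations)."""
    n = len(chain)
    mean = sum(chain) / n
    centered = [value - mean for value in chain]
    variance = sum(c * c for c in centered) / n
    if variance == 0.0:
        return float(n)
    rho_sum = 0.0
    for lag in range(1, n):
        autocov = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        rho = autocov / variance
        if rho <= 0.0:  # truncate at the first non-positive autocorrelation
            break
        rho_sum += rho
    return n / (1.0 + 2.0 * rho_sum)


def ar1_chain(phi, n, seed=0):
    """Toy autocorrelated chain x[t] = phi * x[t-1] + noise (illustrative only)."""
    rng = random.Random(seed)
    chain = [0.0]
    for _ in range(n - 1):
        chain.append(phi * chain[-1] + rng.gauss(0.0, 1.0))
    return chain
```

Comparing `effective_sample_size(ar1_chain(0.9, 2000))` with the value for `ar1_chain(0.0, 2000)` shows how strongly autocorrelation can shrink the information content of a chain of fixed length.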
Shariati, M M; Korsgaard, I R; Sorensen, D
2009-04-01
Markov chain Monte Carlo (MCMC) enables fitting complex hierarchical models that may adequately reflect the process of data generation. Some of these models may contain more parameters than can be uniquely inferred from the distribution of the data, causing non-identifiability. The reaction norm model with unknown covariates (RNUC) is a model in which unknown environmental effects can be inferred jointly with the remaining parameters. The problem of identifiability of parameters at the level of the likelihood and the associated behaviour of MCMC chains were discussed using the RNUC as an example. It was shown theoretically that when environmental effects (covariates) are considered as random effects, estimable functions of the fixed effects, (co)variance components and genetic effects are identifiable as well as the environmental effects. When the environmental effects are treated as fixed and there are other fixed factors in the model, the contrasts involving environmental effects, the variance of environmental sensitivities (genetic slopes) and the residual variance are the only identifiable parameters. These different identifiability scenarios were generated by changing the formulation of the model and the structure of the data and the models were then implemented via MCMC. The output of MCMC sampling schemes was interpreted in the light of the theoretical findings. The erratic behaviour of the MCMC chains was shown to be associated with identifiability problems in the likelihood, despite propriety of posterior distributions, achieved by arbitrarily chosen uniform (bounded) priors. In some cases, very long chains were needed before the pattern of behaviour of the chain may signal the existence of problems. The paper serves as a warning concerning the implementation of complex models where identifiability problems can be difficult to detect a priori. We conclude that it would be good practice to experiment with a proposed model and to understand its features
Sraj, Ihab
2014-11-01
Tsunami computational models are employed to explore multiple flooding scenarios and to predict water elevations. However, accurate estimation of water elevations requires accurate estimation of many model parameters including the Manning's n friction parameterization. Our objective is to develop an efficient approach for the uncertainty quantification and inference of the Manning's n coefficient which we characterize here by three different parameters set to be constant in the on-shore, near-shore and deep-water regions as defined using iso-baths. We use Polynomial Chaos (PC) to build an inexpensive surrogate for the GeoClaw model and employ Bayesian inference to estimate and quantify uncertainties related to relevant parameters using the DART buoy data collected during the Tōhoku tsunami. The surrogate model significantly reduces the computational burden of the Markov Chain Monte-Carlo (MCMC) sampling of the Bayesian inference. The PC surrogate is also used to perform a sensitivity analysis.
Introduction to Bayesian statistics
Bolstad, William M
2017-01-01
There is a strong upsurge in the use of Bayesian methods in applied statistical analysis, yet most introductory statistics texts only present frequentist methods. Bayesian statistics has many important advantages that students should learn about if they are going into fields where statistics will be used. In this Third Edition, four newly-added chapters address topics that reflect the rapid advances in the field of Bayesian statistics. The author continues to provide a Bayesian treatment of introductory statistical topics, such as scientific data gathering, discrete random variables, robust Bayesian methods, and Bayesian approaches to inference for discrete random variables, binomial proportion, Poisson, normal mean, and simple linear regression. In addition, newly-developing topics in the field are presented in four new chapters: Bayesian inference with unknown mean and variance; Bayesian inference for Multivariate Normal mean vector; Bayesian inference for Multiple Linear Regression Model; and Computati...
Directory of Open Access Journals (Sweden)
Simon Boitard
2016-03-01
Full Text Available Inferring the ancestral dynamics of effective population size is a long-standing question in population genetics, which can now be tackled much more accurately thanks to the massive genomic data available in many species. Several promising methods that take advantage of whole-genome sequences have been recently developed in this context. However, they can only be applied to rather small samples, which limits their ability to estimate recent population size history. Besides, they can be very sensitive to sequencing or phasing errors. Here we introduce a new approximate Bayesian computation approach named PopSizeABC that allows estimating the evolution of the effective population size through time, using a large sample of complete genomes. This sample is summarized using the folded allele frequency spectrum and the average zygotic linkage disequilibrium at different bins of physical distance, two classes of statistics that are widely used in population genetics and can be easily computed from unphased and unpolarized SNP data. Our approach provides accurate estimations of past population sizes, from the very first generations before present back to the expected time to the most recent common ancestor of the sample, as shown by simulations under a wide range of demographic scenarios. When applied to samples of 15 or 25 complete genomes in four cattle breeds (Angus, Fleckvieh, Holstein and Jersey), PopSizeABC revealed a series of population declines, related to historical events such as domestication or modern breed creation. We further highlight that our approach is robust to sequencing errors, provided summary statistics are computed from SNPs with common alleles.
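The basic logic of approximate Bayesian computation can be sketched with plain rejection sampling on a toy problem. This is not the PopSizeABC procedure, and the toy model, prior, and all names below are assumptions for illustration: a parameter is drawn from the prior, data are simulated and reduced to a summary statistic, and the draw is kept only when the simulated summary falls within a tolerance of the observed one.

```python
import random


def abc_rejection(observed_summary, simulate_summary, sample_prior,
                  tolerance, n_draws, seed=0):
    """Keep prior draws whose simulated summary is close to the observed one."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = sample_prior(rng)
        if abs(simulate_summary(theta, rng) - observed_summary) <= tolerance:
            accepted.append(theta)
    return accepted


def uniform_prior(rng):
    """Hypothetical prior: theta ~ Uniform(0, 10)."""
    return rng.uniform(0.0, 10.0)


def simulate_mean(theta, rng, n_obs=20):
    """Hypothetical simulator: mean of n_obs draws from Normal(theta, 1)."""
    return sum(rng.gauss(theta, 1.0) for _ in range(n_obs)) / n_obs
```

A call such as `abc_rejection(3.0, simulate_mean, uniform_prior, 0.2, 20000)` returns draws that approximate the posterior of `theta`. The likelihood is never evaluated, which is what makes the approach usable when only a simulator is available.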
A Very Simple Safe-Bayesian Random Forest.
Quadrianto, Novi; Ghahramani, Zoubin
2015-06-01
Random forests work by averaging several predictions of de-correlated trees. We show a conceptually radical approach to generate a random forest: random sampling of many trees from a prior distribution, and subsequently performing a weighted ensemble of predictive probabilities. Our approach uses priors that allow sampling of decision trees even before looking at the data, and a power likelihood that explores the space spanned by combination of decision trees. While each tree performs Bayesian inference to compute its predictions, our aggregation procedure uses the power likelihood rather than the likelihood and is therefore strictly speaking not Bayesian. Nonetheless, we refer to it as a Bayesian random forest but with a built-in safety. The safeness comes as it has good predictive performance even if the underlying probabilistic model is wrong. We demonstrate empirically that our Safe-Bayesian random forest outperforms MCMC or SMC based Bayesian decision trees in terms of speed and accuracy, and achieves competitive performance to entropy or Gini optimised random forest, yet is very simple to construct.
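One way to read the weighted ensemble described above is that each prior-sampled tree receives a weight proportional to its likelihood raised to a power. The sketch below assumes exactly that form; the abstract does not give the formula, so the function names, the argument `power`, and the weighting rule itself are illustrative assumptions rather than the authors' definition. Weights are computed in log space to avoid underflow.

```python
import math


def power_likelihood_weights(log_likelihoods, power):
    """Normalized weights proportional to likelihood ** power (assumed form).

    power = 1 gives ordinary likelihood weighting of prior samples;
    power = 0 gives a uniform average over the sampled trees.
    """
    scaled = [power * value for value in log_likelihoods]
    largest = max(scaled)
    unnormalized = [math.exp(value - largest) for value in scaled]
    total = sum(unnormalized)
    return [value / total for value in unnormalized]


def ensemble_predict(weights, tree_probabilities):
    """Weighted average of the per-tree predictive probabilities for one class."""
    return sum(w * p for w, p in zip(weights, tree_probabilities))
```

Values of `power` between 0 and 1 temper the likelihood, so a single well-fitting tree cannot dominate the ensemble as easily as it would under full likelihood weighting.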
Gradient-based MCMC samplers for dynamic causal modelling.
Sengupta, Biswa; Friston, Karl J; Penny, Will D
2016-01-15
In this technical note, we derive two MCMC (Markov chain Monte Carlo) samplers for dynamic causal models (DCMs). Specifically, we use (a) Hamiltonian MCMC (HMC-E) where sampling is simulated using Hamilton's equation of motion and (b) Langevin Monte Carlo algorithm (LMC-R and LMC-E) that simulates the Langevin diffusion of samples using gradients either on a Euclidean (E) or on a Riemannian (R) manifold. While LMC-R requires minimal tuning, the implementation of HMC-E is heavily dependent on its tuning parameters. These parameters are therefore optimised by learning a Gaussian process model of the time-normalised sample correlation matrix. This allows one to formulate an objective function that balances tuning parameter exploration and exploitation, furnishing an intervention-free inference scheme. Using neural mass models (NMMs), a class of biophysically motivated DCMs, we find that HMC-E is statistically more efficient than LMC-R (with a Riemannian metric); yet both gradient-based samplers are far superior to the random walk Metropolis algorithm, which proves inadequate to steer away from dynamical instability.
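A minimal one-dimensional Hamiltonian Monte Carlo sampler on a Euclidean space illustrates the gradient-based approach described above. It is a sketch under simplifying assumptions (unit mass, fixed `step_size` and `n_leapfrog`, scalar parameter) and does not include the Gaussian-process tuning or the Riemannian variants from the note; every name in it is hypothetical.

```python
import math
import random


def hmc_sample(log_prob, grad_log_prob, x0, n_samples,
               step_size=0.2, n_leapfrog=10, seed=0):
    """Draw samples from exp(log_prob) with leapfrog-integrated HMC (1-D)."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        p = rng.gauss(0.0, 1.0)  # resample the auxiliary momentum
        x_new, p_new = x, p
        # Leapfrog integration of Hamilton's equations of motion
        p_new += 0.5 * step_size * grad_log_prob(x_new)
        for step in range(n_leapfrog):
            x_new += step_size * p_new
            if step < n_leapfrog - 1:
                p_new += step_size * grad_log_prob(x_new)
        p_new += 0.5 * step_size * grad_log_prob(x_new)
        # Metropolis correction for the integration error
        h_old = -log_prob(x) + 0.5 * p * p
        h_new = -log_prob(x_new) + 0.5 * p_new * p_new
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            x = x_new
        samples.append(x)
    return samples
```

For a normal target with mean 1 and unit variance one would pass `lambda x: -0.5 * (x - 1.0) ** 2` and `lambda x: -(x - 1.0)`. The sensitivity to `step_size` and `n_leapfrog` is the tuning burden that the abstract refers to.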
Harris Recurrence and MCMC: A Simplified Approach
DEFF Research Database (Denmark)
Asmussen, Søren; Glynn, Peter W.
A key result underlying the theory of MCMC is that any η-irreducible Markov chain having a transition density with respect to η and possessing a stationary distribution is automatically positive Harris recurrent. This paper provides a short self-contained proof of this fact....
Vehicle Trajectory Estimation Using Spatio-Temporal MCMC
Directory of Open Access Journals (Sweden)
Francois Bardet
2010-01-01
This paper presents an algorithm for modeling and tracking vehicles in video sequences within one integrated framework. Most existing solutions are based on sequential methods that make inferences according to current information. In contrast, we propose a deferred logical inference method that makes a decision according to a sequence of observations, thus performing a spatio-temporal search over the whole trajectory. One of the drawbacks of deferred logical inference methods is that the solution space of hypotheses grows exponentially with the depth of observation. Our approach takes into account both the kinematic model of the vehicle and a driver behavior model in order to reduce the solution space. The resulting state model explains the trajectory with only 11 parameters. The solution space is then sampled with a Markov chain Monte Carlo (MCMC) method that uses a model-driven proposal distribution in order to control random walk behavior. We demonstrate our method on real video sequences for which ground truth is provided by an RTK GPS (Real-Time Kinematic GPS). Experimental results show that the proposed algorithm outperforms a sequential inference solution (particle filter).
Directory of Open Access Journals (Sweden)
Patrick A Reeves
BACKGROUND: Accurate inference of genetic discontinuities between populations is an essential component of intraspecific biodiversity and evolution studies, as well as associative genetics. The most widely used methods to infer population structure are model-based Bayesian MCMC procedures that minimize Hardy-Weinberg and linkage disequilibrium within subpopulations. These methods are useful, but suffer from large computational requirements and a dependence on modeling assumptions that may not be met in real data sets. Here we describe the development of a new approach, PCO-MC, which couples principal coordinate analysis to a clustering procedure for the inference of population structure from multilocus genotype data. METHODOLOGY/PRINCIPAL FINDINGS: PCO-MC uses data from all principal coordinate axes simultaneously to calculate a multidimensional "density landscape", from which the number of subpopulations, and the membership within subpopulations, is determined using a valley-seeking algorithm. Using extensive simulations, we show that this approach outperforms a Bayesian MCMC procedure when many loci (e.g. 100) are sampled, but that the Bayesian procedure is marginally superior with few loci (e.g. 10). When presented with sufficient data, PCO-MC accurately delineated subpopulations with population F_ST values as low as 0.03 (G'_ST > 0.2), whereas the limit of resolution of the Bayesian approach was F_ST = 0.05 (G'_ST > 0.35). CONCLUSIONS/SIGNIFICANCE: We draw a distinction between population structure inference for describing biodiversity as opposed to Type I error control in associative genetics. We suggest that discrete assignments, like those produced by PCO-MC, are appropriate for circumscribing units of biodiversity, whereas expression of population structure as a continuous variable is more useful for case-control correction in structured association studies.
Energy Technology Data Exchange (ETDEWEB)
Keats, A.; Lien, F.S. [Waterloo Univ., ON (Canada). Dept. of Mechanical Engineering]; Yee, E. [Defence Research and Development Canada, Medicine Hat, AB (Canada)]
2006-07-01
A Bayesian probabilistic inferential framework capable of incorporating errors and prior information was presented. Bayesian inference was used to find the posterior probability density function of the source parameters given a set of concentration measurements. A method of calculating the source-receptor relationship required for the determination of direct probability was provided, which used the adjoint of the transport equation for the scalar concentration. The posterior distribution of the source parameters was sampled using a Markov chain Monte Carlo method. The inverse source determination method was validated against real data sets obtained from a highly disturbed, complex flow field in an urban environment. Data sets included a water-channel simulation of near-field dispersion of contaminant plumes in a large array of building-like obstacles, and a full-scale experiment in Oklahoma City. It was concluded that the two examples validated the proposed approach for inverse source determination.
Vargas Cardona, Hernán Darío; Orozco, Álvaro Ángel; Álvarez, Mauricio A
2013-01-01
Automatic identification of biosignals is one of the most studied fields in biomedical engineering. In this paper, we present an approach for the unsupervised recognition of biomedical signals: microelectrode recordings (MER) and electrocardiography (ECG) signals. The unsupervised learning is based on classical and Bayesian estimation theory. We employ Gaussian mixture models with two estimation methods. The first, derived from frequentist estimation theory, is the Expectation-Maximization (EM) algorithm. The second, obtained from Bayesian probabilistic estimation, is variational inference. In this framework, both methods are used to estimate the parameters of the Gaussian mixtures. The mixture models are used for unsupervised pattern classification through the responsibility matrix. The algorithms are applied to two real databases acquired in Parkinson's disease surgeries and electrocardiograms. The results show an accuracy of over 85% for MER and 90% for ECG in the identification of two classes. These results are statistically equal to or even better than those of parametric (naive Bayes) and nonparametric (K-nearest neighbor) classifiers.
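How a responsibility matrix arises from EM, and how it yields class labels, can be shown with a small example. This is a hedged sketch only: it assumes one-dimensional data and two components, uses a crude initialisation chosen for illustration, and the function names are hypothetical rather than taken from the paper.

```python
import math
import random

def normal_pdf(x, mu, var):
    # Density of a univariate normal distribution.
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def em_gmm_1d(data, n_iter=100):
    """EM for a two-component 1-D Gaussian mixture.

    Returns (weights, means, variances, responsibilities), where
    responsibilities[i][k] is the posterior probability that point i
    belongs to component k.
    """
    n = len(data)
    mean_all = sum(data) / n
    var_all = sum((x - mean_all) ** 2 for x in data) / n
    # Crude initialisation (an assumption): means at the data extremes.
    mu = [min(data), max(data)]
    var = [var_all, var_all]
    weight = [0.5, 0.5]
    resp = []
    for _ in range(n_iter):
        # E-step: compute the responsibility matrix.
        resp = []
        for x in data:
            p = [weight[k] * normal_pdf(x, mu[k], var[k]) for k in range(2)]
            total = p[0] + p[1]
            resp.append([p[0] / total, p[1] / total])
        # M-step: responsibility-weighted parameter updates.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weight[k] = nk / n
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var[k] = max(
                sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, data)) / nk,
                1e-6,  # floor to avoid a collapsing component
            )
    return weight, mu, var, resp
```

Each point is then assigned to the component with the largest responsibility, for example `labels = [0 if r[0] >= r[1] else 1 for r in resp]`.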
Intracluster Moves for Constrained Discrete-Space MCMC
Hamze, Firas
2012-01-01
This paper addresses the problem of sampling from binary distributions with constraints. In particular, it proposes an MCMC method to draw samples from a distribution of the set of all states at a specified distance from some reference state. For example, when the reference state is the vector of zeros, the algorithm can draw samples from a binary distribution with a constraint on the number of active variables, say the number of 1's. We motivate the need for this algorithm with examples from statistical physics and probabilistic inference. Unlike previous algorithms proposed to sample from binary distributions with these constraints, the new algorithm allows for large moves in state space and tends to propose them such that they are energetically favourable. The algorithm is demonstrated on three Boltzmann machines of varying difficulty: A ferromagnetic Ising model (with positive potentials), a restricted Boltzmann machine with learned Gabor-like filters as potentials, and a challenging three-dimensional spi...
Christley, Scott; Emr, Bryanna; Ghosh, Auyon; Satalin, Josh; Gatto, Louis; Vodovotz, Yoram; Nieman, Gary F.; An, Gary
2013-06-01
Acute respiratory distress syndrome (ARDS) is acute lung failure secondary to severe systemic inflammation, resulting in a derangement of alveolar mechanics (i.e. the dynamic change in alveolar size and shape during tidal ventilation), leading to alveolar instability that can cause further damage to the pulmonary parenchyma. Mechanical ventilation is a mainstay in the treatment of ARDS, but may induce mechano-physical stresses on unstable alveoli, which can paradoxically propagate the cellular and molecular processes exacerbating ARDS pathology. This phenomenon is called ventilator induced lung injury (VILI), and plays a significant role in morbidity and mortality associated with ARDS. In order to identify optimal ventilation strategies to limit VILI and treat ARDS, it is necessary to understand the complex interplay between biological and physical mechanisms of VILI, first at the alveolar level, and then in aggregate at the whole-lung level. Since there is no current consensus about the underlying dynamics of alveolar mechanics, as an initial step we investigate the ventilatory dynamics of an alveolar sac (AS) with the lung alveolar spatial model (LASM), a 3D spatial biomechanical representation of the AS and its interaction with airflow pressure and the surface tension effects of pulmonary surfactant. We use the LASM to identify the mechanical ramifications of alveolar dynamics associated with ARDS. Using graphical processing unit parallel algorithms, we perform Bayesian inference on the model parameters using experimental data from rat lung under control and Tween-induced ARDS conditions. Our results provide two plausible models that recapitulate two fundamental hypotheses about volume change at the alveolar level: (1) increase in alveolar size through isotropic volume change, or (2) minimal change in AS radius with primary expansion of the mouth of the AS, with the implication that the majority of change in lung volume during the respiratory cycle occurs in the
MCMC Analysis of biases in the interpretation of disk galaxy kinematics
Aquino-Ortíz, E.; Valenzuela, O.; Cano-Díaz, M.; Sánchez-Sánchez, S. F.; Hernández-Toledo, H.
2016-06-01
The new generation of galaxy surveys like SAMI, CALIFA and MaNGA opens up the possibility of simultaneously studying properties of galaxies such as spiral arms, bars, disk geometry and orientation, stellar and gas mass distribution, 2D kinematics, etc. This task involves exploring a complicated multi-dimensional parameter space. Puglielli et al. (2010) introduced Bayesian statistics and Markov chain Monte Carlo (MCMC) techniques to construct dynamical models of spiral galaxies. In our study we used synthetic velocity fields that include non-circular motions and assume different disk orientations in order to produce mock observations. We apply popular reconstruction techniques in order to estimate the geometrical disk parameters, systemic velocities, rotation curve shape and maximum circular velocity, which are crucial for constructing the scaling relations. We conclude that a detailed analysis of galaxy kinematics using MCMC techniques leads to accurate estimates of galaxy properties and more robust scaling relations; otherwise, physical conclusions may be significantly biased.
On the Bayesian Treed Multivariate Gaussian Process with Linear Model of Coregionalization
Energy Technology Data Exchange (ETDEWEB)
Konomi, Bledar A.; Karagiannis, Georgios; Lin, Guang
2015-02-01
The Bayesian treed Gaussian process (BTGP) has gained popularity in recent years because it provides a straightforward mechanism for modeling non-stationary data and can alleviate computational demands by fitting models to less data. The extension of BTGP to the multivariate setting requires us to model the cross-covariance and to propose efficient algorithms that can deal with trans-dimensional MCMC moves. In this paper we extend the cross-covariance of the Bayesian treed multivariate Gaussian process (BTMGP) to that of the linear model of coregionalization (LMC). Different strategies have been developed to improve the MCMC mixing and to invert smaller matrices in the Bayesian inference. Moreover, we compare the proposed BTMGP with existing multiple BTGP and BTMGP in test cases and in a multiphase flow computer experiment in a full-scale regenerator of a carbon capture unit. The BTMGP with LMC cross-covariance predicted the computer experiments better than the existing competitors. The proposed model has a wide variety of applications, such as computer experiments and environmental data. In the case of computer experiments we also develop an adaptive sampling strategy for the BTMGP with LMC cross-covariance function.
Trajectory averaging for stochastic approximation MCMC algorithms
Liang, Faming
2010-01-01
The subject of stochastic approximation was founded by Robbins and Monro [Ann. Math. Statist. 22 (1951) 400--407]. After five decades of continual development, it has developed into an important area in systems control and optimization, and it has also served as a prototype for the development of adaptive algorithms for on-line estimation and control of stochastic systems. Recently, it has been used in statistics with Markov chain Monte Carlo for solving maximum likelihood estimation problems and for general simulation and optimizations. In this paper, we first show that the trajectory averaging estimator is asymptotically efficient for the stochastic approximation MCMC (SAMCMC) algorithm under mild conditions, and then apply this result to the stochastic approximation Monte Carlo algorithm [Liang, Liu and Carroll J. Amer. Statist. Assoc. 102 (2007) 305--320]. The application of the trajectory averaging estimator to other stochastic approximation MCMC algorithms, for example, a stochastic approximation MLE al...
Bayesian Inference on Structure Change in GARCH Models Based on MCMC
Institute of Scientific and Technical Information of China (English)
朱慧明; 曾惠芳; 郝立亚
2011-01-01
In the GARCH model with structural changes, a simple Gibbs sampler cannot simulate the posterior densities directly, because the conditional posterior densities have no analytical form. By introducing auxiliary variables, the posterior without an analytical form is converted into a series of full conditional distributions on which Gibbs iteration can be run, which yields Bayesian estimates of the parameters of the structural-change GARCH model. An empirical analysis of the volatility of the Chinese foreign exchange market shows that the auxiliary-variable Gibbs sampler effectively resolves the high-dimensional numerical computation problem in the Bayesian structural-change GARCH model, and that the serious pseudo-persistence in volatility is caused by regime switching in the time series.
MCM-C Multichip Module Manufacturing Guide
Energy Technology Data Exchange (ETDEWEB)
Blazek, R.J.; Kautz, D.R.; Galichia, J.V.
2000-11-20
Honeywell Federal Manufacturing & Technologies (FM&T) provides complete microcircuit capabilities from design layout through manufacturing and final electrical testing. Manufacturing and testing capabilities include design layout, electrical and mechanical computer simulation and modeling, circuit analysis, component analysis, network fabrication, microelectronic assembly, electrical tester design, electrical testing, materials analysis, and environmental evaluation. This document provides manufacturing guidelines for multichip module-ceramic (MCM-C) microcircuits. Figure 1 illustrates an example MCM-C configuration with the parts and processes that are available. The MCM-C technology is used to manufacture microcircuits for electronic systems that require increased performance, reduced volume, and higher density that cannot be achieved by the standard hybrid microcircuit or printed wiring board technologies. The guidelines focus on the manufacturability issues that must be considered for low-temperature cofired ceramic (LTCC) network fabrication and MCM assembly and the impact that process capabilities have on the overall MCM design layout and product yield. Prerequisites that are necessary to initiate the MCM design layout include electrical, mechanical, and environmental requirements. Customer design data can be accepted in many standard electronic file formats. Other requirements include schedule, quantity, cost, classification, and quality level. Design considerations include electrical, network, packaging, and producibility; and deliverables include finished product, drawings, documentation, and electronic files.
Directory of Open Access Journals (Sweden)
Wills Rachael A
2009-05-01
Background: The problem of silent multiple comparisons is one of the most difficult statistical problems faced by scientists. It is a particular problem for investigating a one-off cancer cluster reported to a health department because any one of hundreds, or possibly thousands, of neighbourhoods, schools, or workplaces could have reported a cluster, which could have been for any one of several types of cancer or any one of several time periods. Methods: This paper contrasts the frequentist approach with a Bayesian approach for dealing with silent multiple comparisons in the context of a one-off cluster reported to a health department. Two published cluster investigations were re-analysed using the Dunn-Sidak method to adjust frequentist p-values and confidence intervals for silent multiple comparisons. Bayesian methods were based on the Gamma distribution. Results: Bayesian analysis with non-informative priors produced results similar to the frequentist analysis, and suggested that both clusters represented a statistical excess. In the frequentist framework, the statistical significance of both clusters was extremely sensitive to the number of silent multiple comparisons, which can only ever be a subjective "guesstimate". The Bayesian approach is also subjective: whether there is an apparent statistical excess depends on the specified prior. Conclusion: In cluster investigations, the frequentist approach is just as subjective as the Bayesian approach, but the Bayesian approach is less ambitious in that it treats the analysis as a synthesis of data and personal judgements (possibly poor ones), rather than objective reality. Bayesian analysis is (arguably) a useful tool to support complicated decision-making, because it makes the uncertainty associated with silent multiple comparisons explicit.
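The sensitivity of the frequentist result to the assumed number of silent comparisons follows directly from the Dunn-Sidak formula, p_adj = 1 - (1 - p)^m. The sketch below illustrates that formula only; the function names and the example numbers are assumptions for illustration and are not taken from the two re-analysed investigations.

```python
def sidak_adjusted_p(p, m):
    """Dunn-Sidak adjusted p-value for m independent comparisons."""
    return 1.0 - (1.0 - p) ** m

def sidak_per_test_alpha(alpha, m):
    """Per-comparison significance level that keeps the family-wise level at alpha."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m)

if __name__ == "__main__":
    # Hypothetical unadjusted p-value; m is the analyst's guess at the
    # number of silent comparisons.
    p = 0.001
    for m in (1, 10, 100, 1000):
        print(m, round(sidak_adjusted_p(p, m), 4))
```

Because `m` is a guess, the same unadjusted p-value can look highly significant or unremarkable depending on the value chosen, which is the subjectivity the abstract points to.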
Bayesian Concordance Correlation Coefficient with Application to Repeatedly Measured Data
Directory of Open Access Journals (Sweden)
Atanu BHATTACHARJEE
2015-10-01
Objective: In medical research, Lin's classical concordance correlation coefficient (CCC) is frequently applied to evaluate the similarity of measurements produced by different raters or methods on the same subjects. It is particularly useful for continuous data. The objective of this paper is to propose a Bayesian counterpart for computing the CCC for continuous data. Material and Methods: A total of 33 patients with brain astrocytoma treated in the Department of Radiation Oncology at Malabar Cancer Centre were enrolled in this work. The data are continuous measurements of tumor volume and tumor size, taken repeatedly during the baseline pretreatment workup and post-surgery follow-ups for all patients. Tumor volume and tumor size were measured separately by MRI and CT scan. The agreement between MRI and CT measurements was calculated through the CCC. Statistical inference was performed with the Markov chain Monte Carlo (MCMC) technique. Results: The Bayesian CCC was found suitable for obtaining clear evidence from test statistics to explore the relation between concordance measurements. The posterior mean estimates and 95% credible intervals of the CCC for tumor size and tumor volume were 0.96 (0.87, 0.99) and 0.98 (0.95, 0.99), respectively. Conclusion: Bayesian inference is adopted for the development of the computational algorithm. The approach illustrated in this work gives researchers an opportunity to find the most appropriate model for specific data and to apply the CCC to test the desired hypothesis.
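For reference, Lin's classical sample CCC can be computed in a few lines. This sketch shows only the classical point estimate, rho_c = 2 s_xy / (s_x^2 + s_y^2 + (mean_x - mean_y)^2) with 1/n moments; it is not the Bayesian MCMC procedure proposed in the paper, and the function name is hypothetical.

```python
def concordance_ccc(x, y):
    """Lin's concordance correlation coefficient (classical sample estimate)."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Variances and covariance with 1/n normalisation.
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # The (mean_x - mean_y)^2 term penalises a systematic shift between methods.
    return 2.0 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)
```

Unlike the Pearson correlation, the coefficient drops below 1 when one method reads consistently higher than the other, even if the two are perfectly linearly related.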
Inferring Genotype of DNA Molecular Marker by Bayesian Theorem
Institute of Scientific and Technical Information of China (English)
莫惠栋; 姜长鉴
2002-01-01
Bayes' theorem is applied to infer the genotype (DNA origin) of a DNA molecular marker from its phenotype (electrophoresis band pattern). The results indicate that, for a marker with incomplete genetic information, the genotype probability obtained under the assumption of independence among marker loci often differs greatly from the corresponding Bayesian probability computed from the genotypes of the flanking markers with complete genetic information and the recombination fractions among them. Therefore, before the marker information is used, for example in mapping quantitative trait loci (QTL) or in marker-assisted selection (MAS), the Bayesian genotype probabilities should be calculated for all loci with incomplete genetic information over the whole genome of every individual. This study gives the detailed procedure for calculating the Bayesian probability of an unknown genotype and discusses several extensions of the application of Bayes' theorem.
Use of SAMC for Bayesian analysis of statistical models with intractable normalizing constants
Jin, Ick Hoon
2014-03-01
Statistical inference for models with intractable normalizing constants has attracted much attention. During the past two decades, various approximation- or simulation-based methods have been proposed for the problem, such as the Monte Carlo maximum likelihood method and the auxiliary variable Markov chain Monte Carlo methods. The Bayesian stochastic approximation Monte Carlo algorithm specifically addresses this problem: it works by sampling from a sequence of approximate distributions whose average converges to the target posterior distribution, where the approximate distributions can be obtained using the stochastic approximation Monte Carlo algorithm. A strong law of large numbers is established for the Bayesian stochastic approximation Monte Carlo estimator under mild conditions. Compared to the Monte Carlo maximum likelihood method, the Bayesian stochastic approximation Monte Carlo algorithm is more robust to the initial guess of model parameters. Compared to the auxiliary variable MCMC methods, the Bayesian stochastic approximation Monte Carlo algorithm avoids the requirement for perfect samples, and thus can be applied to many models for which perfect sampling is unavailable or very expensive. The Bayesian stochastic approximation Monte Carlo algorithm also provides a general framework for approximate Bayesian analysis.
Olmi, L; Elia, D; Molinari, S; Pestalozzi, M; Pezzuto, S; Schisano, E; Testi, L; Thompson, M
2013-01-01
Context. Stars form in dense, dusty clumps of molecular clouds, but little is known about their origin, their evolution and their detailed physical properties. In particular, the relationship between the mass distribution of these clumps (also known as the "clump mass function", or CMF) and the stellar initial mass function (IMF) is still poorly understood. Aims. In order to better understand how the CMF evolves toward the IMF, and to discern the "true" shape of the CMF, large samples of bona-fide pre- and proto-stellar clumps are required. Two such datasets obtained from the Herschel infrared GALactic Plane Survey (Hi-GAL) have been described in Paper I. Robust statistical methods are needed in order to infer the parameters describing the models used to fit the CMF, and to compare the competing models themselves. Methods. In this paper we apply Bayesian inference to the analysis of the CMF of the two regions discussed in Paper I. First, we determine the Bayesian posterior probability distribution for each of...
Institute of Scientific and Technical Information of China (English)
Druzhinina I S; Kubicek C P
2004-01-01
The Hypocrea lixii/Trichoderma harzianum species aggregate contains a group of taxa (H. lixii/T. harzianum, T. aggressivum, T. tomentosum, T. cerinum, T. velutinum, H. tawa), of which some (e.g. T. harzianum) are important for biocontrol of plant pathogenic fungi in agriculture, whereas others are aggressive pathogens of Agaricus spp. and Pleurotus spp. in mushroom farms (T. aggressivum), or opportunistic pathogens of immunocompromised mammals including humans (T. harzianum). We characterized the evolutionary properties of three genomic regions in Hypocrea/Trichoderma: the internal transcribed spacer regions ITS1 and 2 of rDNA, the large intron of translation elongation factor 1-alpha (tef1a), and a portion of the large exon of the endochitinase 42 gene (ech42). We selected the best model describing the evolution of each fragment, tested the molecular clock hypothesis, and estimated the usability of the combined three-fragment data matrix for phylogenetic analysis of the genus as a whole, as well as at the level of the holomorphic H. lixii/T. harzianum species clade and separate clonal lineages. To this end, we applied Bayesian phylogenetic inference to 124 sequences of ITS1 and 2 and of the large tef1a intron, and to 64 ech42 gene sequences, to resolve the evolution of H. lixii/T. harzianum with respect to the position of other taxa with closely related phenotypes. The resulting phylogram clearly identified T. aggressivum, T. velutinum, H. tawa, T. cerinum and T. tomentosum as phylogenetic species, and in addition identified three new unknown phylogenetic species as members of this clade. The clear distinction between T. tomentosum and T. cerinum was not recognized in all trees, but was supported by multivariate analysis of phenotype microarrays. In contrast, H. lixii/T. harzianum did not form a single phylogenetic species in this study, as its monophyly was not supported in any analysis. Strains morphologically identified as H. lixii
DEFF Research Database (Denmark)
Choi, Sang-Hyeon; Lee, Ikjin; Jeong, Cheol-Ho
2016-01-01
The Sabine absorption coefficient is widely used and is deduced from reverberation time measurements via the Sabine equation. First- and second-level Bayesian analyses are used to estimate the flow resistivity of a sound absorber and the influences of the test chambers from Sabine absorption...
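The deduction of an absorption coefficient from reverberation times can be sketched with the Sabine equation, T = 24 ln(10) V / (c A). The example below is a simplified illustration under stated assumptions: air absorption is neglected, the speed of sound defaults to 343 m/s, and the function names and chamber dimensions are hypothetical, not taken from the study.

```python
import math

def sabine_reverberation_time(volume, absorption_area, c=343.0):
    """Sabine equation: reverberation time (s) for a room of given volume (m^3)
    and equivalent absorption area (m^2)."""
    return 24.0 * math.log(10.0) * volume / (c * absorption_area)

def sabine_absorption_coefficient(volume, sample_area, t_empty, t_with_sample, c=343.0):
    """Absorption coefficient of a test sample from reverberation times measured
    in the chamber without and with the sample (air absorption neglected)."""
    k = 24.0 * math.log(10.0) / c  # about 0.161 s/m at c = 343 m/s
    return k * volume / sample_area * (1.0 / t_with_sample - 1.0 / t_empty)

if __name__ == "__main__":
    # Hypothetical chamber: 200 m^3, 10 m^2 sample, 5.0 s empty, 2.5 s with sample.
    print(round(sabine_absorption_coefficient(200.0, 10.0, 5.0, 2.5), 3))
```

Because the result depends on the chamber volume and on both measured decay times, chamber-specific effects carry straight through to the coefficient.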
Risk simulation of port traffic system optimized based on MCMC method
Institute of Scientific and Technical Information of China (English)
徐广波; 胡甚平
2013-01-01
To obtain a more accurate risk distribution for the port traffic system and improve port risk management, Bayesian probability statistics of accident rate and accident consequence are derived from a quantitative assessment of port traffic risk. A risk simulation model of the port traffic system based on the Markov chain Monte Carlo (MCMC) method is then built. WinBUGS software and the MCMC method are used for parameter inference and optimization of the model. A risk simulation experiment on a port traffic system is carried out, and distribution curves of the risk values are obtained. The example shows that the optimized simulation model better reflects the risk trend of the port traffic system and provides decision support for port safety administration.
Lafrenière-Bérubé, Charles; Chouteau, Michel; Shamsipour, Pejman; Olivo, Gema R.
2016-04-01
Spectral induced polarization (SIP) parameters can be extracted from field or laboratory complex resistivity measurements, and even airborne or ground frequency domain electromagnetic data. With the growing interest in application of complex resistivity measurements to environmental and mineral exploration problems, there is a need for accurate and easy-to-use inversion tools to estimate SIP parameters. These parameters, which often include chargeability and relaxation time may then be studied and related to other rock attributes such as porosity or metallic grain content, in the case of mineral exploration. We present an open source program, available both as a standalone application or Python module, to estimate SIP parameters using Markov-chain Monte Carlo (MCMC) sampling. The Python language is a high level, open source language that is now widely used in scientific computing. Our program allows the user to choose between the more common Cole-Cole (Pelton), Dias, or Debye decomposition models. Simple circuits composed of resistances and constant phase elements may also be used to represent SIP data. Initial guesses are required when using more classic inversion techniques such as the least-squares formulation, and wrong estimates are often the cause of bad curve fitting. In stochastic optimization using MCMC, the effect of the starting values disappears as the simulation proceeds. Our program is then optimized to do batch inversion over large data sets with as little user-interaction as possible. Additionally, the Bayesian formulation allows the user to do quality control by fully propagating the measurement errors in the inversion process, providing an estimation of the SIP parameters uncertainty. This information is valuable when trying to relate chargeability or relaxation time to other physical properties. We test the inversion program on complex resistivity measurements of 12 core samples from the world-class gold deposit of Canadian Malartic. Results show
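As an illustration of the kind of forward model such an inversion evaluates, the Pelton Cole-Cole complex resistivity is rho(w) = rho0 [1 - m (1 - 1 / (1 + (i w tau)^c))]. The sketch below is not the authors' program: the function names, the Gaussian error model and the parameter values are assumptions chosen for illustration. An MCMC sampler would call a log-likelihood like this one at each proposed parameter set.

```python
def pelton_cole_cole(omega, rho0, m, tau, c):
    """Pelton Cole-Cole complex resistivity at angular frequency omega (rad/s).

    rho0: DC resistivity, m: chargeability, tau: relaxation time (s),
    c: Cole-Cole exponent.
    """
    z = (1j * omega * tau) ** c
    return rho0 * (1.0 - m * (1.0 - 1.0 / (1.0 + z)))

def log_likelihood(params, omegas, observed, sigma):
    """Gaussian log-likelihood (up to a constant) of complex data given
    (rho0, m, tau, c); sigma is an assumed common error on real and imaginary parts."""
    rho0, m, tau, c = params
    total = 0.0
    for w, obs in zip(omegas, observed):
        resid = obs - pelton_cole_cole(w, rho0, m, tau, c)
        total += -0.5 * (resid.real ** 2 + resid.imag ** 2) / sigma ** 2
    return total
```

At low frequency the model tends to rho0 and at high frequency to rho0 (1 - m), which gives a quick sanity check on any implementation.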
Gelman, Andrew; Stern, Hal S; Dunson, David B; Vehtari, Aki; Rubin, Donald B
2013-01-01
FUNDAMENTALS OF BAYESIAN INFERENCE: Probability and Inference; Single-Parameter Models; Introduction to Multiparameter Models; Asymptotics and Connections to Non-Bayesian Approaches; Hierarchical Models. FUNDAMENTALS OF BAYESIAN DATA ANALYSIS: Model Checking; Evaluating, Comparing, and Expanding Models; Modeling Accounting for Data Collection; Decision Analysis. ADVANCED COMPUTATION: Introduction to Bayesian Computation; Basics of Markov Chain Simulation; Computationally Efficient Markov Chain Simulation; Modal and Distributional Approximations. REGRESSION MODELS: Introduction to Regression Models; Hierarchical Linear
Bayesian artificial intelligence
Korb, Kevin B
2010-01-01
Updated and expanded, Bayesian Artificial Intelligence, Second Edition provides a practical and accessible introduction to the main concepts, foundation, and applications of Bayesian networks. It focuses on both the causal discovery of networks and Bayesian inference procedures. Adopting a causal interpretation of Bayesian networks, the authors discuss the use of Bayesian networks for causal modeling. They also draw on their own applied research to illustrate various applications of the technology. New to the Second Edition: New chapter on Bayesian network classifiers; New section on object-oriente
Métodos avanzados de muestreo: MCMC [Advanced sampling methods: MCMC]
Pascual Del Olmo, Víctor
2011-01-01
This project sets out to study, analyze and investigate the different methodologies for generating random numbers using advanced, modern Markov chain Monte Carlo (MCMC) techniques. Monte Carlo methods are numerical methods used to compute, approximate and simulate mathematical expressions or systems that are complex and difficult to evaluate. Although these methods began to be developed in the 1940s, until computers became more powerful they remained in a...
Directory of Open Access Journals (Sweden)
Navid Feroze
2016-03-01
Families of mixture distributions have a wide range of applications in fields such as fisheries, agriculture, botany, economics, medicine, psychology, electrophoresis, finance, communication theory, geology and zoology. They provide the flexibility needed to model failure distributions of components with multiple failure modes. The Bayesian procedure for estimating the parameters of a mixture model is mostly described under Type-I censoring. In particular, Bayesian analysis of mixture models under doubly censored samples has not yet been considered in the literature. The main objective of this paper is to develop Bayes estimation for inverse Weibull mixture distributions under double censoring. The posterior estimation has been conducted assuming gamma and inverse Lévy priors, using the precautionary loss function and the weighted squared error loss function. The different estimators have been compared on the basis of simulated and real-life data sets.
Trajectory averaging for stochastic approximation MCMC algorithms
Liang, Faming
2010-10-01
The subject of stochastic approximation was founded by Robbins and Monro [Ann. Math. Statist. 22 (1951) 400-407]. After five decades of continual development, it has developed into an important area in systems control and optimization, and it has also served as a prototype for the development of adaptive algorithms for on-line estimation and control of stochastic systems. Recently, it has been used in statistics with Markov chain Monte Carlo for solving maximum likelihood estimation problems and for general simulation and optimization. In this paper, we first show that the trajectory averaging estimator is asymptotically efficient for the stochastic approximation MCMC (SAMCMC) algorithm under mild conditions, and then apply this result to the stochastic approximation Monte Carlo algorithm [Liang, Liu and Carroll J. Amer. Statist. Assoc. 102 (2007) 305-320]. The application of the trajectory averaging estimator to other stochastic approximation MCMC algorithms, for example, a stochastic approximation MLE algorithm for missing data problems, is also considered in the paper. © Institute of Mathematical Statistics, 2010.
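The idea of trajectory averaging can be illustrated on the plain Robbins-Monro recursion. The following is a minimal sketch, not the SAMCMC algorithm of the paper: it assumes the simplest root-finding problem (estimating a mean from noisy draws), and the function name, the gain exponent 0.7 and the `draw` callback are all illustrative choices. It returns both the last iterate and the average of the whole trajectory.

```python
import random


def robbins_monro_with_averaging(draw, theta0, n_iter, gain_exponent=0.7, seed=0):
    """Robbins-Monro iteration for the root of E[X - theta] = 0.

    Returns (last_iterate, trajectory_average). `draw(rng)` must return one
    noisy observation X; all names here are hypothetical.
    """
    rng = random.Random(seed)
    theta = theta0
    running_sum = 0.0
    for t in range(1, n_iter + 1):
        # Slowly decaying gain; the exponent is assumed to lie in (1/2, 1).
        gain = 1.0 / t ** gain_exponent
        theta += gain * (draw(rng) - theta)
        running_sum += theta
    return theta, running_sum / n_iter


if __name__ == "__main__":
    last, averaged = robbins_monro_with_averaging(
        lambda rng: rng.gauss(2.0, 1.0), theta0=0.0, n_iter=20000
    )
    print("last iterate:", last)
    print("trajectory average:", averaged)
```

With a slowly decaying gain the last iterate keeps fluctuating around the root, while the trajectory average smooths those fluctuations out; this is the effect the efficiency result concerns.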
ABCtoolbox: a versatile toolkit for approximate Bayesian computations
Directory of Open Access Journals (Sweden)
Neuenschwander Samuel
2010-03-01
Background: The estimation of demographic parameters from genetic data often requires the computation of likelihoods. However, the likelihood function is computationally intractable for many realistic evolutionary models, and the use of Bayesian inference has therefore been limited to very simple models. The situation changed recently with the advent of Approximate Bayesian Computation (ABC) algorithms allowing one to obtain parameter posterior distributions based on simulations not requiring likelihood computations. Results: Here we present ABCtoolbox, a series of open source programs to perform Approximate Bayesian Computations (ABC). It implements various ABC algorithms including rejection sampling, MCMC without likelihood, a particle-based sampler and ABC-GLM. ABCtoolbox is bundled with, but not limited to, a program that allows parameter inference in a population genetics context and the simultaneous use of different types of markers with different ploidy levels. In addition, ABCtoolbox can also interact with most simulation and summary statistics computation programs. The usability of the ABCtoolbox is demonstrated by inferring the evolutionary history of two evolutionary lineages of Microtus arvalis. Using nuclear microsatellites and mitochondrial sequence data in the same estimation procedure enabled us to infer sex-specific population sizes and migration rates and to find that males show smaller population sizes but much higher levels of migration than females. Conclusion: ABCtoolbox allows a user to perform all the necessary steps of a full ABC analysis, including parameter sampling from prior distributions, data simulations, computation of summary statistics, estimation of posterior distributions, model choice, validation of the estimation procedure, and visualization of the results.
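Of the algorithms listed, rejection sampling is the simplest to sketch. The code below is a generic illustration and does not use or reproduce the ABCtoolbox interface: the function names, the toy model (a normal mean with a uniform prior), the sample mean as summary statistic and the tolerance are all assumptions made for the example.

```python
import random
import statistics


def abc_rejection(observed, prior_draw, simulate, summary, tolerance, n_draws, seed=0):
    """Keep prior draws whose simulated summary lies within `tolerance`
    of the observed summary. No likelihood is evaluated anywhere."""
    rng = random.Random(seed)
    target = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_draw(rng)          # 1. sample a parameter from the prior
        fake = simulate(theta, rng)      # 2. simulate data under that parameter
        if abs(summary(fake) - target) <= tolerance:
            accepted.append(theta)       # 3. accept if the summaries are close
    return accepted


def make_toy_problem(n_obs=30, true_mean=1.5, seed=1):
    """Hypothetical toy data: n_obs draws from Normal(true_mean, 1)."""
    rng = random.Random(seed)
    return [rng.gauss(true_mean, 1.0) for _ in range(n_obs)]


if __name__ == "__main__":
    data = make_toy_problem()
    posterior = abc_rejection(
        observed=data,
        prior_draw=lambda rng: rng.uniform(-5.0, 5.0),
        simulate=lambda theta, rng: [rng.gauss(theta, 1.0) for _ in range(len(data))],
        summary=statistics.fmean,
        tolerance=0.1,
        n_draws=20000,
    )
    print("accepted draws:", len(posterior))
    print("approximate posterior mean:", statistics.fmean(posterior))
```

The accepted draws approximate the posterior; a smaller tolerance gives a better approximation at the price of a lower acceptance rate.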
McNally, Kevin; Cotton, Richard; Cocker, John; Jones, Kate; Bartels, Mike; Rick, David; Price, Paul; Loizou, George
2012-01-01
There are numerous biomonitoring programs, both recent and ongoing, to evaluate environmental exposure of humans to chemicals. Due to the lack of exposure and kinetic data, the correlation of biomarker levels with exposure concentrations leads to difficulty in utilizing biomonitoring data for biological guidance values. Exposure reconstruction or reverse dosimetry is the retrospective interpretation of external exposure consistent with biomonitoring data. We investigated the integration of physiologically based pharmacokinetic modelling, global sensitivity analysis, Bayesian inference, and Markov chain Monte Carlo simulation to obtain a population estimate of inhalation exposure to m-xylene. We used exhaled breath and venous blood m-xylene and urinary 3-methylhippuric acid measurements from a controlled human volunteer study in order to evaluate the ability of our computational framework to predict known inhalation exposures. We also investigated the importance of model structure and dimensionality with respect to its ability to reconstruct exposure.
Penasa, M; Cecchinato, A; Battagin, M; De Marchi, M; Pretto, D; Cassandro, M
2010-01-01
The aim of the study was to infer (co)variance components for daily milk yield, fat and protein contents, and somatic cell score (SCS) in Burlina cattle (a local breed in northeast Italy). Data consisted of 13,576 monthly test-day records of 666 cows (parities 1 to 8) collected in 10 herds between 1999 and 2009. Repeatability animal models were implemented using Bayesian methods. Flat priors were assumed for systematic effects of herd test date, days in milk, and parity, as well as for permanent environmental, genetic, and residual effects. On average, Burlina cows produced 17.0 kg of milk per day, with 3.66 and 3.33 percent of fat and protein, respectively, and 358,000 cells per mL of milk. Marginal posterior medians (highest posterior density of 95%) of heritability were 0.18 (0.09-0.28), 0.28 (0.21-0.36), 0.35 (0.25-0.49), and 0.05 (0.01-0.11) for milk yield, fat content, protein content, and SCS, respectively. Marginal posterior medians of genetic correlations between the traits were low and a 95 percent Bayesian confidence region included zero, with the exception of the genetic correlation between fat and protein contents. Despite the low number of animals in the population, results suggest that genetic variance for production and quality traits exists in Burlina cattle.
Lucka, Felix; Pursiainen, Sampsa; Burger, Martin; Wolters, Carsten H
2012-07-16
The estimation of the activity-related ion currents by measuring the induced electromagnetic fields at the head surface is a challenging and severely ill-posed inverse problem. This is especially true in the recovery of brain networks involving deep-lying sources by means of EEG/MEG recordings which is still a challenging task for any inverse method. Recently, hierarchical Bayesian modeling (HBM) emerged as a unifying framework for current density reconstruction (CDR) approaches comprising most established methods as well as offering promising new methods. Our work examines the performance of fully-Bayesian inference methods for HBM for source configurations consisting of few, focal sources when used with realistic, high-resolution finite element (FE) head models. The main foci of interest are the correct depth localization, a well-known source of systematic error of many CDR methods, and the separation of single sources in multiple-source scenarios. Both aspects are very important in the analysis of neurophysiological data and in clinical applications. For these tasks, HBM provides a promising framework and is able to improve upon established CDR methods such as minimum norm estimation (MNE) or sLORETA in many aspects. For challenging multiple-source scenarios where the established methods show crucial errors, promising results are attained. Additionally, we introduce Wasserstein distances as performance measures for the validation of inverse methods in complex source scenarios.
Learning Bayesian network classifiers for credit scoring using Markov Chain Monte Carlo search
Baesens, B.; Egmont-Petersen, M.; Castelo, R.; Vanthienen, J.
2002-01-01
In this paper, we will evaluate the power and usefulness of Bayesian network classifiers for credit scoring. Various types of Bayesian network classifiers will be evaluated and contrasted including unrestricted Bayesian network classifiers learnt using Markov Chain Monte Carlo (MCMC) search. The exp
Serang, Oliver
2015-08-01
Observations depending on sums of random variables are common throughout many fields; however, no efficient solution is currently known for performing max-product inference on these sums of general discrete distributions (max-product inference can be used to obtain maximum a posteriori estimates). The limiting step to max-product inference is the max-convolution problem (sometimes presented in log-transformed form and denoted as "infimal convolution," "min-convolution," or "convolution on the tropical semiring"), for which no O(k log(k)) method is currently known. Presented here is an O(k log(k)) numerical method for estimating the max-convolution of two nonnegative vectors (e.g., two probability mass functions), where k is the length of the larger vector. This numerical max-convolution method is then demonstrated by performing fast max-product inference on a convolution tree, a data structure for performing fast inference given information on the sum of n discrete random variables in O(nk log(nk)log(n)) steps (where each random variable has an arbitrary prior distribution on k contiguous possible states). The numerical max-convolution method can be applied to specialized classes of hidden Markov models to reduce the runtime of computing the Viterbi path from nk^2 to nk log(k), and has potential application to the all-pairs shortest paths problem.
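One standard way to estimate a max-convolution numerically is to replace the maximum by a large p-norm, which turns the problem into an ordinary convolution that an FFT computes in O(k log(k)). The sketch below illustrates that idea only; it may differ in detail from the method of the paper, and the function names and the fixed choice p = 16 are assumptions. An exact quadratic-time version is included for comparison.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def fft_convolve(x, y):
    """Ordinary (sum-product) convolution of two nonnegative vectors."""
    size = len(x) + len(y) - 1
    n = 1
    while n < size:
        n *= 2
    fx = fft([complex(v) for v in x] + [0j] * (n - len(x)))
    fy = fft([complex(v) for v in y] + [0j] * (n - len(y)))
    inv = fft([p * q for p, q in zip(fx, fy)], invert=True)
    # Divide by n to finish the inverse transform; clip round-off below zero.
    return [max(inv[i].real / n, 0.0) for i in range(size)]


def max_convolve_exact(u, v):
    """Reference O(k^2) max-convolution: out[m] = max over i+j=m of u[i]*v[j]."""
    out = [0.0] * (len(u) + len(v) - 1)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            out[i + j] = max(out[i + j], ui * vj)
    return out


def max_convolve_pnorm(u, v, p=16):
    """p-norm estimate: (sum over i+j=m of (u[i]*v[j])**p) ** (1/p)."""
    conv = fft_convolve([x ** p for x in u], [y ** p for y in v])
    return [c ** (1.0 / p) for c in conv]


if __name__ == "__main__":
    u = [0.5, 1.0, 0.75]
    v = [1.0, 0.5, 0.6, 0.9]
    print("exact :", max_convolve_exact(u, v))
    print("p-norm:", max_convolve_pnorm(u, v))
```

The p-norm estimate never falls below the exact value and exceeds it by at most a factor of k^(1/p). Raising p tightens the bound, but very small entries raised to a large power can sink below floating-point round-off, so p cannot be increased indefinitely.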
Bernardo, Jose M
2000-01-01
This highly acclaimed text, now available in paperback, provides a thorough account of key concepts and theoretical results, with particular emphasis on viewing statistical inference as a special case of decision theory. Information-theoretic concepts play a central role in the development of the theory, which provides, in particular, a detailed discussion of the problem of specification of so-called "prior ignorance". The work is written from the authors' committed Bayesian perspective, but an overview of non-Bayesian theories is also provided, and each chapter contains a wide-ranging critica
Meseguer, Andrea Sánchez; Aldasoro, Juan Jose; Sanmartín, Isabel
2013-05-01
The genus Hypericum L. ("St. John's wort", Hypericaceae) comprises nearly 500 species of shrubs, trees and herbs distributed mainly in temperate regions of the Northern Hemisphere, but also in high-altitude tropical and subtropical areas. Until now, molecular phylogenetic hypotheses on infra-generic relationships have been based solely on the nuclear marker ITS. Here, we used a full Bayesian approach to simultaneously reconstruct phylogenetic relationships, divergence times, and patterns of morphological and range evolution in Hypericum, using nuclear (ITS) and plastid DNA sequences (psbA-trnH, trnS-trnG, trnL-trnF) of 186 species representing 33 of the 36 described morphological sections. Consistent with other studies, we found that corrections of the branch length prior helped recover more realistic branch lengths in by-gene partitioned Bayesian analyses, but the effect was also seen within single genes if the overall mutation rate differed considerably among sites or regions. Our study confirms that Hypericum is not monophyletic with the genus Triadenum embedded within, and rejects the traditional infrageneric classification, with many sections being para- or polyphyletic. The small Western Palearctic sections Elodes and Adenotrias are the sister-group of a geographic dichotomy between a mainly New World clade and a large Old World clade. Bayesian reconstruction of morphological character states and range evolution show a complex pattern of morphological plasticity and inter-continental movement within the genus. The ancestors of Hypericum were probably tropical shrubs that migrated from Africa to the Palearctic in the Early Tertiary, concurrent with the expansion of tropical climates in northern latitudes. Global climate cooling from the Mid Tertiary onwards might have promoted adaptation to temperate conditions in some lineages, such as the development of the herbaceous habit or unspecialized corollas.
Directory of Open Access Journals (Sweden)
Zoltán Csörnyei
2010-01-01
Genetic parameters of number of piglets born alive (NBA) and gestation length (GL) were analyzed for 39798 Hungarian Landrace (HLA, 141397 records) and 70356 Hungarian Large White (HLW, 246961 records) sows. Bivariate repeatability animal models were used, applying Bayesian statistics. Estimated heritabilities and repeatabilities (in brackets) were low for NBA, 0.07 (0.14) for HLA and 0.08 (0.17) for HLW, but somewhat higher for GL, 0.18 (0.27) for HLA and 0.26 (0.35) for HLW. Estimated genetic correlations between NBA and GL were low, -0.08 for HLA and -0.05 for HLW.
CARBayes: An R Package for Bayesian Spatial Modeling with Conditional Autoregressive Priors
Directory of Open Access Journals (Sweden)
Duncan Lee
2013-11-01
Conditional autoregressive models are commonly used to represent spatial autocorrelation in data relating to a set of non-overlapping areal units, which arise in a wide variety of applications including agriculture, education, epidemiology and image analysis. Such models are typically specified in a hierarchical Bayesian framework, with inference based on Markov chain Monte Carlo (MCMC) simulation. The most widely used software to fit such models is WinBUGS or OpenBUGS, but in this paper we introduce the R package CARBayes. The main advantage of CARBayes compared with the BUGS software is its ease of use, because: (1) the spatial adjacency information is easy to specify as a binary neighbourhood matrix; and (2) given the neighbourhood matrix the models can be implemented by a single function call in R. This paper outlines the general class of Bayesian hierarchical models that can be implemented in the CARBayes software, describes their implementation via MCMC simulation techniques, and illustrates their use with two worked examples in the fields of house price analysis and disease mapping.
mbb_emcee: Modified Blackbody MCMC
Conley, Alexander
2016-02-01
Mbb_emcee fits modified blackbodies to photometry data using an affine invariant MCMC. It has a large number of options which, for example, allow computation of the IR luminosity or dust mass as part of the fit. Carrying out a fit produces an HDF5 output file containing the results, which can either be read directly, or read back into a mbb_results object for analysis. Upper and lower limits can be imposed as well as Gaussian priors on the model parameters. These additions are useful for analyzing poorly constrained data. In addition to standard Python packages scipy, numpy, and cython, mbb_emcee requires emcee (ascl:1303.002), Astropy (ascl:1304.002), h5py, and for unit tests, nose.
Bayesian estimation of generalized exponential distribution under noninformative priors
Moala, Fernando Antonio; Achcar, Jorge Alberto; Tomazella, Vera Lúcia Damasceno
2012-10-01
The generalized exponential distribution, proposed by Gupta and Kundu (1999), is a good alternative to standard lifetime distributions such as the exponential, Weibull or gamma. Several authors have considered the problem of Bayesian estimation of the parameters of the generalized exponential distribution, assuming independent gamma priors and other informative priors. In this paper, we consider a Bayesian analysis of the generalized exponential distribution by assuming the conventional noninformative prior distributions, such as the Jeffreys and reference priors, to estimate the parameters. These priors are compared with independent gamma priors for both parameters. The comparison is carried out by examining the frequentist coverage probabilities of Bayesian credible intervals. We show that the maximal data information prior leads to an improper posterior distribution for the parameters of a generalized exponential distribution. It is also shown that the choice of a parameter of interest is very important for the reference prior. The different choices lead to different reference priors in this case. Numerical inference is illustrated for the parameters by considering data sets of different sizes and using MCMC (Markov Chain Monte Carlo) methods.
A Bayesian Alternative for Multi-objective Ecohydrological Model Specification
Tang, Y.; Marshall, L. A.; Sharma, A.; Ajami, H.
2015-12-01
Process-based ecohydrological models combine the study of hydrological, physical, biogeochemical and ecological processes of the catchments, and are usually more complex and more heavily parameterized than conceptual hydrological models. Thus, appropriate calibration objectives and model uncertainty analysis are essential for ecohydrological modeling. In recent years, Bayesian inference has become one of the most popular tools for quantifying the uncertainties in hydrological modeling with the development of Markov Chain Monte Carlo (MCMC) techniques. Our study aims to develop appropriate prior distributions and likelihood functions that minimize the model uncertainties and bias within a Bayesian ecohydrological framework. In our study, a formal Bayesian approach is implemented in an ecohydrological model which combines a hydrological model (HyMOD) and a dynamic vegetation model (DVM). Simulations focused on one objective likelihood (Streamflow/LAI) and multi-objective likelihoods (Streamflow and LAI) with different weights are compared. Uniform, weakly informative and strongly informative prior distributions are used in different simulations. The Kullback-Leibler divergence (KLD) is used to measure the (dis)similarity between different priors and corresponding posterior distributions to examine the parameter sensitivity. Results show that different prior distributions can strongly influence posterior distributions for parameters, especially when the available data is limited or parameters are insensitive to the available data. We demonstrate differences in optimized parameters and uncertainty limits in different cases based on multi-objective likelihoods vs. single objective likelihoods. We also demonstrate the importance of appropriately defining the weights of objectives in multi-objective calibration according to different data types.
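The abstract does not say how the Kullback-Leibler divergence is computed. As a minimal sketch, assume that the prior and the posterior of a single parameter are both summarized by univariate Gaussians, in which case the divergence has a closed form. The function name, the Gaussian summary and the direction of the divergence (posterior relative to prior) are assumptions made for illustration.

```python
import math


def kl_gaussian(mean_q, sd_q, mean_p, sd_p):
    """KL(q || p) in nats for univariate Gaussians q = N(mean_q, sd_q^2)
    and p = N(mean_p, sd_p^2)."""
    return (
        math.log(sd_p / sd_q)
        + (sd_q ** 2 + (mean_q - mean_p) ** 2) / (2.0 * sd_p ** 2)
        - 0.5
    )


if __name__ == "__main__":
    # Hypothetical numbers: a posterior much narrower than the prior
    # (a parameter the data inform well) versus one that barely moved.
    print("informed parameter   :", kl_gaussian(0.4, 0.1, 0.0, 1.0))
    print("insensitive parameter:", kl_gaussian(0.0, 0.95, 0.0, 1.0))
```

A divergence near zero means the posterior is almost the prior, which is how an insensitive parameter would show up under this reading.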
Bayesian target tracking based on particle filter
Institute of Scientific and Technical Information of China (English)
(no author listed)
2005-01-01
Because they can handle nonlinear and non-Gaussian problems, particle filters have been studied by many researchers. Building on the particle filter, an extended Kalman filter (EKF) proposal function is applied to Bayesian target tracking. Novel techniques such as a Markov chain Monte Carlo (MCMC) step and a resampling step are also introduced into Bayesian target tracking. The simulation results confirm that the particle filter improved with these techniques outperforms the basic one.
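For orientation, the basic filter that serves as the baseline can be sketched as follows. This is a bootstrap particle filter that proposes from the state transition itself; it does not include the EKF proposal or the MCMC step described in the abstract. The one-dimensional random-walk target, the Gaussian observation model and all names and default values are assumptions made for the example.

```python
import math
import random


def bootstrap_particle_filter(observations, n_particles=500,
                              process_sd=1.0, obs_sd=1.0, seed=0):
    """Track a 1-D random-walk state from noisy position measurements.

    Returns the posterior-mean estimate of the state at every time step.
    """
    rng = random.Random(seed)
    # Diffuse initial cloud (an assumption; the true start is unknown to the filter).
    particles = [rng.gauss(0.0, 5.0) for _ in range(n_particles)]
    estimates = []
    for y in observations:
        # Propagate each particle through the motion model.
        particles = [x + rng.gauss(0.0, process_sd) for x in particles]
        # Weight each particle by the Gaussian likelihood of the measurement.
        weights = [math.exp(-0.5 * ((y - x) / obs_sd) ** 2) for x in particles]
        total = sum(weights)
        if total == 0.0:
            # All weights underflowed: fall back to uniform weights.
            weights = [1.0 / n_particles] * n_particles
        else:
            weights = [w / total for w in weights]
        estimates.append(sum(w * x for w, x in zip(weights, particles)))
        # Multinomial resampling to counter weight degeneracy.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


if __name__ == "__main__":
    rng = random.Random(42)
    state, truth, measured = 0.0, [], []
    for _ in range(200):
        state += rng.gauss(0.0, 0.3)
        truth.append(state)
        measured.append(state + rng.gauss(0.0, 1.0))
    tracked = bootstrap_particle_filter(measured, process_sd=0.3, obs_sd=1.0)
    print("final true state:", truth[-1], "final estimate:", tracked[-1])
```

Proposing from the transition ignores the newest measurement, which is the weakness an EKF-based proposal is meant to address.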
Bayesian modelling of compositional heterogeneity in molecular phylogenetics.
Heaps, Sarah E; Nye, Tom M W; Boys, Richard J; Williams, Tom A; Embley, T Martin
2014-10-01
In molecular phylogenetics, standard models of sequence evolution generally assume that sequence composition remains constant over evolutionary time. However, this assumption is violated in many datasets which show substantial heterogeneity in sequence composition across taxa. We propose a model which allows compositional heterogeneity across branches, and formulate the model in a Bayesian framework. Specifically, the root and each branch of the tree is associated with its own composition vector whilst a global matrix of exchangeability parameters applies everywhere on the tree. We encourage borrowing of strength between branches by developing two possible priors for the composition vectors: one in which information can be exchanged equally amongst all branches of the tree and another in which more information is exchanged between neighbouring branches than between distant branches. We also propose a Markov chain Monte Carlo (MCMC) algorithm for posterior inference which uses data augmentation of substitutional histories to yield a simple complete data likelihood function that factorises over branches and allows Gibbs updates for most parameters. Standard phylogenetic models are not informative about the root position. Therefore a significant advantage of the proposed model is that it allows inference about rooted trees. The position of the root is fundamental to the biological interpretation of trees, both for polarising trait evolution and for establishing the order of divergence among lineages. Furthermore, unlike some other related models from the literature, inference in the model we propose can be carried out through a simple MCMC scheme which does not require problematic dimension-changing moves. We investigate the performance of the model and priors in analyses of two alignments for which there is strong biological opinion about the tree topology and root position.
Kim, Seongryong; Tkalčić, Hrvoje; Rhie, Junkee; Chen, Youlin
2016-08-01
Intraplate volcanism adjacent to active continental margins is not simply explained by plate tectonics or plume interaction. Recent volcanoes in northeast (NE) Asia, including NE China and the Korean Peninsula, are characterized by heterogeneous tectonic structures and geochemical compositions. Here we apply a transdimensional Bayesian tomography to estimate high-resolution images of group and phase velocity variations (with periods between 8 and 70 s). The method provides robust estimations of velocity maps, and the reliability of results is tested through carefully designed synthetic recovery experiments. Our maps reveal two sublithospheric low-velocity anomalies that connect back-arc regions (in Japan and Ryukyu Trench) with current margins of continental lithosphere where the volcanoes are distributed. Combined with evidence from previous geochemical and geophysical studies, we argue that the volcanoes are related to the low-velocity structures associated with back-arc processes and preexisting continental lithosphere.
Marshall, L. A.; Smith, T. J.
2008-12-01
The implementation of Bayesian methods, and specifically Markov chain Monte Carlo (MCMC) methods, is becoming much more widespread due to their usefulness in uncertainty assessment of hydrologic models. These methods have the ability to explicitly account for non-stationarities in model errors (via the likelihood), complex parameter interdependence and uncertainty, and multiple sources of data for model conditioning. These properties hold particular importance for hydrologic models where we need to characterize complex model errors (including heteroscedasticity and correlation) and where a full assessment of the uncertainty associated with the modeled results is desirable. Traditional MCMC algorithms can be difficult to implement due to computational constraints for high-dimensional models with complex parameter spaces and expensive model functions. Failure to effectively explore the parameter space can lead to false convergence to a local optimum and a misunderstanding of the model's ability to characterize the system. While past studies have shown adaptive MCMC techniques to be more desirable than traditional MCMC approaches, few hydrologic studies have taken advantage of these new advances, given their varying difficulty in implementation. We investigated three recently developed MCMC algorithms, the Adaptive Metropolis (AM), the Delayed Rejection Adaptive Metropolis (DRAM) and the Differential Evolution Markov Chain (DE-MC). These algorithms are newly devised and intended to better handle issues common to hydrologic modeling including multi-modality of parameter spaces, complex parameter interactions, and the computational cost associated with potentially expensive hydrologic functions. We evaluated each algorithm through application to two case studies: (1) a synthetic Gaussian mixture with five parameters and two modes and (2) a nine-dimensional snowmelt-hydrologic modeling study applied to an experimental watershed. Each of the three algorithms was compared
Yao, Zhewei; Hu, Zixi; Li, Jinglai
2016-07-01
Many scientific and engineering problems require performing Bayesian inference in function spaces, where the unknowns are of infinite dimension. In such problems, choosing an appropriate prior distribution is an important task. In particular, when the function to infer is subject to sharp jumps, the commonly used Gaussian measures become unsuitable. On the other hand, the so-called total variation (TV) prior can only be defined in a finite-dimensional setting, and does not lead to a well-defined posterior measure in function spaces. In this work we present a TV-Gaussian (TG) prior to address such problems, where the TV term is used to detect sharp jumps of the function, and the Gaussian distribution is used as a reference measure so that it results in a well-defined posterior measure in the function space. We also present an efficient Markov Chain Monte Carlo (MCMC) algorithm to draw samples from the posterior distribution of the TG prior. With numerical examples we demonstrate the performance of the TG prior and the efficiency of the proposed MCMC algorithm.
Advances in Bayesian Model Based Clustering Using Particle Learning
Energy Technology Data Exchange (ETDEWEB)
Merl, D M
2009-11-19
Recent work by Carvalho, Johannes, Lopes and Polson and Carvalho, Lopes, Polson and Taddy introduced a sequential Monte Carlo (SMC) alternative to traditional iterative Monte Carlo strategies (e.g. MCMC and EM) for Bayesian inference for a large class of dynamic models. The basis of SMC techniques involves representing the underlying inference problem as one of state space estimation, thus giving way to inference via particle filtering. The key insight of Carvalho et al. was to construct the sequence of filtering distributions so as to make use of the posterior predictive distribution of the observable, a distribution usually only accessible in certain Bayesian settings. Access to this distribution allows a reversal of the usual propagate and resample steps characteristic of many SMC methods, thereby alleviating to a large extent many problems associated with particle degeneration. Furthermore, Carvalho et al. point out that for many conjugate models the posterior distribution of the static variables can be parametrized in terms of [recursively defined] sufficient statistics of the previously observed data. For models where such sufficient statistics exist, particle learning, as it is being called, is especially well suited for the analysis of streaming data due to the relative invariance of its algorithmic complexity with the number of data observations. Through a particle learning approach, a statistical model can be fit to data as the data is arriving, allowing at any instant during the observation process direct quantification of uncertainty surrounding underlying model parameters. Here we describe the use of a particle learning approach for fitting a standard Bayesian semiparametric mixture model as described in Carvalho, Lopes, Polson and Taddy. In Section 2 we briefly review the previously presented particle learning algorithm for the case of a Dirichlet process mixture of multivariate normals. In Section 3 we describe several novel extensions to the original
Click Model Based on Bayesian Inference and Its Implementation
Institute of Scientific and Technical Information of China (English)
孙付伟; 李娟; 杨达
2013-01-01
To better explain user behaviour in the click logs of search engines and sponsored search, we implement a Bayesian click model for analysing the user behaviour recorded in the logs. By analysing about 9.27 million search click log records collected from the largest e-commerce site in China, we find that the click probability of a document is affected by the clicked documents above and below it. Based on this finding, we propose and implement a new click model based on Bayesian inference, together with a parallel implementation of the algorithm. Finally, we validate the model on one month of user search log data, and the results show that the proposed model outperforms existing click models.
Energy Technology Data Exchange (ETDEWEB)
Sigeti, David E. [Los Alamos National Laboratory; Pelak, Robert A. [Los Alamos National Laboratory
2012-09-11
We present a Bayesian statistical methodology for identifying improvement in predictive simulations, including an analysis of the number of (presumably expensive) simulations that will need to be made in order to establish with a given level of confidence that an improvement has been observed. Our analysis assumes the ability to predict (or postdict) the same experiments with legacy and new simulation codes and uses a simple binomial model for the probability, θ, that, in an experiment chosen at random, the new code will provide a better prediction than the old. This model makes it possible to do statistical analysis with an absolute minimum of assumptions about the statistics of the quantities involved, at the price of discarding some potentially important information in the data. In particular, the analysis depends only on whether or not the new code predicts better than the old in any given experiment, and not on the magnitude of the improvement. We show how the posterior distribution for θ may be used, in a kind of Bayesian hypothesis testing, both to decide if an improvement has been observed and to quantify our confidence in that decision. We quantify the predictive probability that should be assigned, prior to taking any data, to the possibility of achieving a given level of confidence, as a function of sample size. We show how this predictive probability depends on the true value of θ and, in particular, how there will always be a region around θ = 1/2 where it is highly improbable that we will be able to identify an improvement in predictive capability, although the width of this region will shrink to zero as the sample size goes to infinity. We show how the posterior standard deviation may be used, as a kind of 'plan B metric' in the case that the analysis shows that θ is close to 1/2 and argue that such a plan B should generally be part of hypothesis testing. All the analysis presented in the paper is done with a
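The binomial model lends itself to a short worked computation. The sketch below assumes a uniform prior on the probability θ that the new code predicts better, so that after `wins` better predictions in `n_experiments` the posterior is Beta(wins + 1, n_experiments - wins + 1). The abstract as given here does not name the prior, and the function names and the numerical integration scheme are illustrative choices.

```python
import math


def beta_pdf(theta, a, b):
    """Density of the Beta(a, b) distribution at theta in (0, 1)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(
        log_norm + (a - 1) * math.log(theta) + (b - 1) * math.log(1 - theta)
    )


def prob_improvement(wins, n_experiments, grid=20000):
    """Posterior probability that theta > 1/2 under a uniform prior,
    computed with the midpoint rule on (1/2, 1)."""
    a = wins + 1
    b = n_experiments - wins + 1
    h = 0.5 / grid
    return h * sum(beta_pdf(0.5 + (i + 0.5) * h, a, b) for i in range(grid))


def posterior_sd(wins, n_experiments):
    """Posterior standard deviation of theta under the same uniform prior."""
    a = wins + 1
    b = n_experiments - wins + 1
    return math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))


if __name__ == "__main__":
    for wins, n in [(5, 10), (8, 10), (55, 100)]:
        print(wins, "of", n, "->", prob_improvement(wins, n), posterior_sd(wins, n))
```

When the wins and losses are evenly split the probability of improvement stays at 1/2 however many experiments are run, while the posterior standard deviation keeps shrinking, which is why it can serve as a fallback measure.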
An MCMC Circumstellar Disks Modeling Tool
Wolff, Schuyler; Perrin, Marshall D.; Mazoyer, Johan; Choquet, Elodie; Soummer, Remi; Ren, Bin; Pueyo, Laurent; Debes, John H.; Duchene, Gaspard; Pinte, Christophe; Menard, Francois
2016-01-01
We present an enhanced software framework for the Markov chain Monte Carlo (MCMC) modeling of circumstellar disk observations, including spectral energy distributions and multi-wavelength images from a variety of instruments (e.g. GPI, NICI, HST, WFIRST). The goal is to self-consistently and simultaneously fit a wide variety of observables in order to place constraints on the physical properties of a given disk, while also rigorously assessing the uncertainties in the derived properties. This modular code is designed to work with a collection of existing modeling tools, ranging from simple scripts to define the geometry for optically thin debris disks, to full radiative transfer modeling of complex grain structures in protoplanetary disks (using the MCFOST radiative transfer modeling code). The MCMC chain relies on direct chi-squared comparison of model images/spectra to observations. We will include a discussion of how best to weight different observations in the modeling of a single disk and how to incorporate forward modeling from PCA PSF subtraction techniques. The code is open source, written in Python, and available from GitHub. Results for several disks at various evolutionary stages will be discussed.
On the Markov Chain Monte Carlo (MCMC) method
Indian Academy of Sciences (India)
Rajeeva L Karandikar
2006-04-01
Markov Chain Monte Carlo (MCMC) is a popular method used to generate samples from arbitrary distributions, which may be specified indirectly. In this article, we give an introduction to this method along with some examples.
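To make the idea of sampling from an indirectly specified distribution concrete, here is a minimal sketch of a random-walk Metropolis sampler. It is a generic textbook construction, not taken from the article; the function name, step size, and target density are all illustrative assumptions. The target is supplied only through its unnormalised log density.

```python
import math
import random


def metropolis(log_density, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis sampler for a 1-D unnormalised log density."""
    rng = random.Random(seed)
    x, logp = x0, log_density(x0)
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)  # symmetric Gaussian proposal
        logp_new = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < logp_new - logp:
            x, logp = proposal, logp_new
        samples.append(x)  # a rejected proposal repeats the current state
    return samples


# Target known only up to a constant: exp(-(x - 3)^2 / 2), i.e. N(3, 1).
draws = metropolis(lambda x: -0.5 * (x - 3.0) ** 2, x0=0.0, n_samples=20000)
burned = draws[2000:]  # discard an initial burn-in segment
mean_estimate = sum(burned) / len(burned)
print(f"estimated mean: {mean_estimate:.2f}")
```

Because the acceptance rule uses only the ratio of densities, the normalising constant of the target never has to be computed.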
Institute of Scientific and Technical Information of China (English)
KUNDU Debasis; PRADHAN Biswabrata
2009-01-01
Recently, the generalized exponential distribution has received considerable attention. In this paper, we deal with Bayesian inference for the unknown parameters of the progressively censored generalized exponential distribution. It is assumed that the scale and shape parameters have independent gamma priors. The Bayes estimates of the unknown parameters cannot be obtained in closed form. Lindley's approximation and an importance sampling technique are suggested to compute the approximate Bayes estimates. A Markov Chain Monte Carlo method is used to compute the approximate Bayes estimates and also to construct the highest posterior density credible intervals. We also provide different criteria to compare two different sampling schemes and hence to find the optimal sampling scheme. Finding the optimum censoring procedure is observed to be computationally expensive, so we recommend using the sub-optimal censoring procedure, which can be obtained very easily. Monte Carlo simulations are performed to compare the performance of the different methods, and one data analysis is performed for illustrative purposes.
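For reference, the two-parameter generalized exponential distribution is commonly written as follows, with shape α > 0 and scale λ > 0. The symbols are an assumption made here, since the abstract gives no formulas:

```latex
F(x;\alpha,\lambda) = \left(1 - e^{-\lambda x}\right)^{\alpha}, \qquad
f(x;\alpha,\lambda) = \alpha\lambda\, e^{-\lambda x}\left(1 - e^{-\lambda x}\right)^{\alpha-1}, \qquad x > 0.
```

Setting α = 1 recovers the ordinary exponential distribution with rate λ.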
Lu, Yu; Lu, Zhankui; Katz, Neal; Weinberg, Martin D
2013-01-01
We infer mechanisms of galaxy formation for a broad family of semi-analytic models (SAMs) constrained by the K-band luminosity function and HI mass function of local galaxies using tools of Bayesian analysis. Even with a broad search in parameter space the whole model family fails to match the constraining data. In the best fitting models, the star formation and feedback parameters in low-mass haloes are tightly constrained by the two data sets, and the analysis reveals several generic failures of models that similarly apply to other existing SAMs. First, based on the assumption that baryon accretion follows the dark matter accretion, large mass-loading factors are required for haloes with circular velocities lower than 200 km/s, and most of the wind mass must be expelled from the haloes. Second, assuming that the feedback is powered by Type-II supernovae with a Chabrier IMF, the outflow requires more than 25% of the available SN kinetic energy. Finally, the posterior predictive distributions for the star form...
Amo de Paz, Guillermo; Cubas, Paloma; Divakar, Pradeep K; Lumbsch, H Thorsten; Crespo, Ana
2011-01-01
There is a long-standing debate on the extent of vicariance and long-distance dispersal events to explain the current distribution of organisms, especially in those with small diaspores potentially prone to long-distance dispersal. Age estimates of clades play a crucial role in evaluating the impact of these processes. The aim of this study is to understand the evolutionary history of the largest clade of macrolichens, the parmelioid lichens (Parmeliaceae, Lecanoromycetes, Ascomycota) by dating the origin of the group and its major lineages. They have a worldwide distribution with centers of distribution in the Neo- and Paleotropics, and semi-arid subtropical regions of the Southern Hemisphere. Phylogenetic analyses were performed using DNA sequences of nuLSU and mtSSU rDNA, and the protein-coding RPB1 gene. The three DNA regions had different evolutionary rates: RPB1 gave a rate two to four times higher than nuLSU and mtSSU. Divergence times of the major clades were estimated with partitioned BEAST analyses allowing different rates for each DNA region and using a relaxed clock model. Three calibration points were used to date the tree: an inferred age at the stem of Lecanoromycetes, and two dated fossils: Parmelia in the parmelioid group, and Alectoria. Palaeoclimatic conditions and the palaeogeological area cladogram were compared to the dated phylogeny of the parmelioids. The parmelioid group diversified around the K/T boundary, and the major clades diverged during the Eocene and Oligocene. The radiation of the genera occurred through globally changing climatic conditions of the early Oligocene, Miocene and early Pliocene. The estimated divergence times are consistent with long-distance dispersal events being the major factor to explain the biogeographical distribution patterns of Southern Hemisphere parmelioids, especially for Africa-Australia disjunctions, because the sequential break-up of Gondwana started much earlier than the origin of these clades. However, our
Directory of Open Access Journals (Sweden)
Tamás Petkovits
Although the fungal order Mortierellales constitutes one of the largest classical groups of Zygomycota, its phylogeny is poorly understood and no modern taxonomic revision is currently available. In the present study, 90 type and reference strains were used to infer a comprehensive phylogeny of Mortierellales from the sequence data of the complete ITS region and the LSU and SSU genes, with special attention to the monophyly of the genus Mortierella. Out of 15 alternative partitioning strategies compared on the basis of Bayes factors, the one with the highest number of partitions was found optimal (with mixture models yielding the best likelihood and tree length values), implying a higher complexity of evolutionary patterns in the ribosomal genes than generally recognized. Modeling the ITS1, 5.8S, and ITS2 loci separately improved model fit significantly as compared to treating all as one and the same partition. Further, within-partition mixture models suggest that not only do the SSU, LSU and ITS regions evolve under qualitatively and/or quantitatively different constraints, but that significant heterogeneity can also be found within these loci. The phylogenetic analysis indicated that the genus Mortierella is paraphyletic with respect to the genera Dissophora, Gamsiella and Lobosporangium, and the resulting phylogeny contradicts the previous, morphology-based sectional classification of Mortierella. Based on tree structure and phenotypic traits, we recognize 12 major clades, for which we attempt to summarize phenotypic similarities. M. longicollis is closely related to the outgroup taxon Rhizopus oryzae, suggesting that it belongs to the Mucorales. Our results demonstrate that traits used in previous classifications of the Mortierellales are highly homoplastic and that the Mortierellales is in need of a reclassification, where new, phylogenetically informative phenotypic traits should be identified, with molecular phylogenies playing a decisive role.
Blind Equalization of a Nonlinear Satellite System Using MCMC Simulation Methods
Directory of Open Access Journals (Sweden)
Sénécal Stéphane
2002-01-01
This paper proposes the use of Markov Chain Monte-Carlo (MCMC) simulation methods for equalizing a satellite communication system. The main difficulties encountered are the nonlinear distortions caused by the amplifier stage in the satellite. Several processing methods manage to take into account the nonlinearity of the system, but they require knowledge of a training/learning input sequence for updating the parameters of the equalizer. Blind equalization methods also exist, but they require a Volterra model of the system. The aim of the paper is also to blindly restore the emitted message. To reach this goal, we adopt a Bayesian point of view. We jointly use the prior knowledge on the emitted symbols and the information available from the received signal. This is done by considering the posterior distribution of the input sequence and the parameters of the model. Such a distribution is very difficult to study and thus motivates the implementation of MCMC methods. The presentation of the method is divided into two parts. The first part solves the problem for a simplified model; the second part deals with the complete model, and a part of the solution uses the algorithm developed for the simplified model. The algorithms are illustrated and their performance is evaluated using bit error rate versus signal-to-noise ratio curves.
Understanding Computational Bayesian Statistics
Bolstad, William M
2011-01-01
A hands-on introduction to computational statistics from a Bayesian point of view. Providing a solid grounding in statistics while uniquely covering the topics from a Bayesian perspective, Understanding Computational Bayesian Statistics successfully guides readers through this new, cutting-edge approach. With its hands-on treatment of the topic, the book shows how samples can be drawn from the posterior distribution when the formula giving its shape is all that is known, and how Bayesian inferences can be based on these samples from the posterior. These ideas are illustrated on common statistic
BEAST: Bayesian evolutionary analysis by sampling trees
Directory of Open Access Journals (Sweden)
Drummond Alexei J
2007-11-01
Background: The evolutionary analysis of molecular sequence variation is a statistical enterprise. This is reflected in the increased use of probabilistic models for phylogenetic inference, multiple sequence alignment, and molecular population genetics. Here we present BEAST: a fast, flexible software architecture for Bayesian analysis of molecular sequences related by an evolutionary tree. A large number of popular stochastic models of sequence evolution are provided and tree-based models suitable for both within- and between-species sequence data are implemented. Results: BEAST version 1.4.6 consists of 81000 lines of Java source code, 779 classes and 81 packages. It provides models for DNA and protein sequence evolution, highly parametric coalescent analysis, relaxed clock phylogenetics, non-contemporaneous sequence data, statistical alignment and a wide range of options for prior distributions. BEAST source code is object-oriented, modular in design and freely available at http://beast-mcmc.googlecode.com/ under the GNU LGPL license. Conclusion: BEAST is a powerful and flexible evolutionary analysis package for molecular sequence variation. It also provides a resource for the further development of new models and statistical methods of evolutionary analysis.
Bayesian object classification of gold nanoparticles
Konomi, Bledar A.
2013-06-01
The properties of materials synthesized with nanoparticles (NPs) are highly correlated to the sizes and shapes of the nanoparticles. The transmission electron microscopy (TEM) imaging technique can be used to measure the morphological characteristics of NPs, which can be simple circles or more complex irregular polygons with varying degrees of scales and sizes. A major difficulty in analyzing the TEM images is the overlapping of objects, having different morphological properties with no specific information about the number of objects present. Furthermore, the objects lying along the boundary render automated image analysis much more difficult. To overcome these challenges, we propose a Bayesian method based on the marked-point process representation of the objects. We derive models, both for the marks which parameterize the morphological aspects and the points which determine the location of the objects. The proposed model is an automatic image segmentation and classification procedure, which simultaneously detects the boundaries and classifies the NPs into one of the predetermined shape families. We execute the inference by sampling the posterior distribution using Markov chain Monte Carlo (MCMC) since the posterior is doubly intractable. We apply our novel method to several TEM imaging samples of gold NPs, producing the needed statistical characterization of their morphology. © Institute of Mathematical Statistics, 2013.
Bayesian Analysis of Inertial Confinement Fusion Experiments at the National Ignition Facility
Gaffney, J A; Sonnad, V; Libby, S B
2012-01-01
We develop a Bayesian inference method that allows the efficient determination of several interesting parameters from complicated high-energy-density experiments performed on the National Ignition Facility (NIF). The model is based on an exploration of phase space using the hydrodynamic code HYDRA. A linear model is used to describe the effect of nuisance parameters on the analysis, allowing an analytic likelihood to be derived that can be determined from a small number of HYDRA runs and then used in existing advanced statistical analysis methods. This approach is applied to a recent experiment in order to determine the carbon opacity and X-ray drive; it is found that the inclusion of prior expert knowledge and fluctuations in capsule dimensions and chemical composition significantly improve the agreement between experiment and theoretical opacity calculations. A parameterisation of HYDRA results is used to test the application of both Markov chain Monte Carlo (MCMC) and genetic algorithm (GA) techniques to e...
Bayesian Reliability Analysis of Non-Stationarity in Multi-agent Systems
Directory of Open Access Journals (Sweden)
TONT Gabriela
2013-05-01
Bayesian methods provide information about the meaningful parameters in a statistical analysis, obtained by combining the prior and sampling distributions to form the posterior distribution of the parameters. The desired inferences are obtained from this joint posterior. An estimation strategy for hierarchical models, where the resulting joint distribution of the associated model parameters cannot be evaluated analytically, is to use sampling algorithms, known as Markov Chain Monte Carlo (MCMC) methods, from which approximate solutions can be obtained. Both serial and parallel configurations of subcomponents are permitted. The capability of the time-dependent method to describe a multi-state system is illustrated with a case study assessing the operational situation of the studied system. The rationality and validity of the presented model are demonstrated via the case study. The effect of randomness of the structural parameters is also examined.
A semiparametric Bayesian proportional hazards model for interval censored data with frailty effects
Directory of Open Access Journals (Sweden)
Hölzel Dieter
2009-02-01
Background: Multivariate analysis of interval censored event data based on classical likelihood methods is notoriously cumbersome. Likelihood inference for models that additionally include random effects is not available at all. Existing algorithms pose problems for practical users, such as matrix inversion, slow convergence, and no assessment of statistical uncertainty. Methods: MCMC procedures combined with imputation are used to implement hierarchical models for interval censored data within a Bayesian framework. Results: Two examples from clinical practice demonstrate the handling of clustered interval censored event times as well as multilayer random effects for inter-institutional quality assessment. The software developed is called survBayes and is freely available at CRAN. Conclusion: The proposed software supports the solution of complex analyses in many fields of clinical epidemiology as well as health services research.
Source inversion of bioaerosol attack based on MCMC method
Institute of Scientific and Technical Information of China (English)
许晴; 祖正虎; 张文斗; 徐致靖; 黄培堂; 郑涛
2012-01-01
The inversion of bioaerosol release source parameters is the inverse problem of hazard assessment for bioaerosol attacks, and is of great significance for hazard assessment and emergency response. Based on Bayesian inference, a likelihood function is constructed from biosensor detection data and a forward atmospheric dispersion model, and Markov chain Monte Carlo (MCMC) sampling with the Metropolis-Hastings algorithm is used to invert the source location, source height, and release dose. Statistical analysis shows that the inversion results agree very well with the initial source parameter settings, which demonstrates the validity of the method.
Hierarchical animal movement models for population-level inference
Hooten, Mevin B.; Buderman, Frances E.; Brost, Brian M.; Hanks, Ephraim M.; Ivans, Jacob S.
2016-01-01
New methods for modeling animal movement based on telemetry data are developed regularly. With advances in telemetry capabilities, animal movement models are becoming increasingly sophisticated. Despite a need for population-level inference, animal movement models are still predominantly developed for individual-level inference. Most efforts to upscale the inference to the population level are either post hoc or complicated enough that only the developer can implement the model. Hierarchical Bayesian models provide an ideal platform for the development of population-level animal movement models but can be challenging to fit due to computational limitations or extensive tuning required. We propose a two-stage procedure for fitting hierarchical animal movement models to telemetry data. The two-stage approach is statistically rigorous and allows one to fit individual-level movement models separately, then resample them using a secondary MCMC algorithm. The primary advantages of the two-stage approach are that the first stage is easily parallelizable and the second stage is completely unsupervised, allowing for an automated fitting procedure in many cases. We demonstrate the two-stage procedure with two applications of animal movement models. The first application involves a spatial point process approach to modeling telemetry data, and the second involves a more complicated continuous-time discrete-space animal movement model. We fit these models to simulated data and real telemetry data arising from a population of monitored Canada lynx in Colorado, USA.
Auxiliary Parameter MCMC for Exponential Random Graph Models
Byshkin, Maksym; Stivala, Alex; Mira, Antonietta; Krause, Rolf; Robins, Garry; Lomi, Alessandro
2016-11-01
Exponential random graph models (ERGMs) are a well-established family of statistical models for analyzing social networks. Computational complexity has so far limited the appeal of ERGMs for the analysis of large social networks. Efficient computational methods are highly desirable in order to extend the empirical scope of ERGMs. In this paper we report results of a research project on the development of snowball sampling methods for ERGMs. We propose an auxiliary parameter Markov chain Monte Carlo (MCMC) algorithm for sampling from the relevant probability distributions. The method is designed to decrease the number of allowed network states without worsening the mixing of the Markov chains, and suggests a new approach for the development of MCMC samplers for ERGMs. We demonstrate the method on both simulated and actual (empirical) network data and show that it reduces CPU time for parameter estimation by an order of magnitude compared to current MCMC methods.
Time-varying nonstationary multivariate risk analysis using a dynamic Bayesian copula
Sarhadi, Ali; Burn, Donald H.; Concepción Ausín, María.; Wiper, Michael P.
2016-03-01
A time-varying risk analysis is proposed for an adaptive design framework in nonstationary conditions arising from climate change. A Bayesian, dynamic conditional copula is developed for modeling the time-varying dependence structure between mixed continuous and discrete multiattributes of multidimensional hydrometeorological phenomena. Joint Bayesian inference is carried out to fit the marginals and copula in an illustrative example using an adaptive, Gibbs Markov Chain Monte Carlo (MCMC) sampler. Posterior mean estimates and credible intervals are provided for the model parameters and the Deviance Information Criterion (DIC) is used to select the model that best captures different forms of nonstationarity over time. This study also introduces a fully Bayesian, time-varying joint return period for multivariate time-dependent risk analysis in nonstationary environments. The results demonstrate that the nature and the risk of extreme-climate multidimensional processes are changed over time under the impact of climate change, and accordingly the long-term decision making strategies should be updated based on the anomalies of the nonstationary environment.
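For readers unfamiliar with the Deviance Information Criterion mentioned above, its standard definition is sketched below. The notation is assumed here rather than taken from the paper: y denotes the data, θ the model parameters, and θ̄ the posterior mean of θ.

```latex
D(\theta) = -2 \log p(y \mid \theta), \qquad
\bar{D} = \mathbb{E}_{\theta \mid y}\!\left[ D(\theta) \right], \qquad
p_D = \bar{D} - D(\bar{\theta}), \qquad
\mathrm{DIC} = \bar{D} + p_D = D(\bar{\theta}) + 2\,p_D .
```

In practice the expectation is approximated by averaging the deviance over the MCMC draws, and the model with the smaller DIC is preferred.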
A Genomic Bayesian Multi-trait and Multi-environment Model.
Montesinos-López, Osval A; Montesinos-López, Abelardo; Crossa, José; Toledo, Fernando H; Pérez-Hernández, Oscar; Eskridge, Kent M; Rutkoski, Jessica
2016-09-08
When information on multiple genotypes evaluated in multiple environments is recorded, a multi-environment single trait model for assessing genotype × environment interaction (G × E) is usually employed. Comprehensive models that simultaneously take into account the correlated traits and trait × genotype × environment interaction (T × G × E) are lacking. In this research, we propose a Bayesian whole-genome prediction (WGP) model for analyzing multiple traits and multiple environments. For this model, we used Half-[Formula: see text] priors on each standard deviation term and uniform priors on each correlation of the covariance matrix. These priors were not informative and led to posterior inferences that were insensitive to the choice of hyper-parameters. We also developed a computationally efficient Markov Chain Monte Carlo (MCMC) scheme under the above priors, which allowed us to obtain all required full conditional distributions of the parameters, leading to an exact Gibbs sampler for the posterior distribution. We used two real data sets to implement and evaluate the proposed Bayesian method and found that when the correlation between traits was high (>0.5), the proposed model (with unstructured variance-covariance) improved prediction accuracy compared to the model with diagonal and standard variance-covariance structures. The R-software package Bayesian Multi-Trait and Multi-Environment (BMTME) offers optimized C++ routines to efficiently perform the analyses.
A Structure Learning Algorithm for Bayesian Network Using Prior Knowledge
Institute of Scientific and Technical Information of China (English)
徐俊刚; 赵越; 陈健; 韩超
2015-01-01
Learning structure from data is one of the most important fundamental tasks of Bayesian network research. In particular, learning the optimal structure of a Bayesian network is a non-deterministic polynomial-time (NP) hard problem. To solve this problem, many heuristic algorithms have been proposed, and some of them learn Bayesian network structure with the help of different types of prior knowledge. However, the existing algorithms place some restrictions on the prior knowledge, such as quality restrictions and use restrictions. This makes it difficult to use the prior knowledge well in these algorithms. In this paper, we introduce the prior knowledge into the Markov chain Monte Carlo (MCMC) algorithm and propose an algorithm called Constrained MCMC (C-MCMC) to learn the structure of the Bayesian network. Three types of prior knowledge are defined: existence of parent node, absence of parent node, and distribution knowledge including the conditional probability distribution (CPD) of edges and the probability distribution (PD) of nodes. All of these types of prior knowledge are easily used in this algorithm. We conduct extensive experiments to demonstrate the feasibility and effectiveness of the proposed C-MCMC method.
Bayesian Inference in Statistical Analysis
Box, George E P
2011-01-01
The Wiley Classics Library consists of selected books that have become recognized classics in their respective fields. With these new unabridged and inexpensive editions, Wiley hopes to extend the life of these important works by making them available to future generations of mathematicians and scientists. Currently available in the Series: T. W. Anderson The Statistical Analysis of Time Series T. S. Arthanari & Yadolah Dodge Mathematical Programming in Statistics Emil Artin Geometric Algebra Norman T. J. Bailey The Elements of Stochastic Processes with Applications to the Natural Sciences Rob
Bayesian Variable Selection via Particle Stochastic Search.
Shi, Minghui; Dunson, David B
2011-02-01
We focus on Bayesian variable selection in regression models. One challenge is to search the huge model space adequately, while identifying high posterior probability regions. In the past decades, the main focus has been on the use of Markov chain Monte Carlo (MCMC) algorithms for these purposes. In this article, we propose a new computational approach based on sequential Monte Carlo (SMC), which we refer to as particle stochastic search (PSS). We illustrate PSS through applications to linear regression and probit models.
von der Linden, Wolfgang; Dose, Volker; von Toussaint, Udo
2014-06-01
Preface; Part I. Introduction: 1. The meaning of probability; 2. Basic definitions; 3. Bayesian inference; 4. Combinatorics; 5. Random walks; 6. Limit theorems; 7. Continuous distributions; 8. The central limit theorem; 9. Poisson processes and waiting times; Part II. Assigning Probabilities: 10. Transformation invariance; 11. Maximum entropy; 12. Qualified maximum entropy; 13. Global smoothness; Part III. Parameter Estimation: 14. Bayesian parameter estimation; 15. Frequentist parameter estimation; 16. The Cramer-Rao inequality; Part IV. Testing Hypotheses: 17. The Bayesian way; 18. The frequentist way; 19. Sampling distributions; 20. Bayesian vs frequentist hypothesis tests; Part V. Real World Applications: 21. Regression; 22. Inconsistent data; 23. Unrecognized signal contributions; 24. Change point problems; 25. Function estimation; 26. Integral equations; 27. Model selection; 28. Bayesian experimental design; Part VI. Probabilistic Numerical Techniques: 29. Numerical integration; 30. Monte Carlo methods; 31. Nested sampling; Appendixes; References; Index.
Bayesian theory and applications
Dellaportas, Petros; Polson, Nicholas G; Stephens, David A
2013-01-01
The development of hierarchical models and Markov chain Monte Carlo (MCMC) techniques forms one of the most profound advances in Bayesian analysis since the 1970s and provides the basis for advances in virtually all areas of applied and theoretical Bayesian statistics. This volume guides the reader along a statistical journey that begins with the basic structure of Bayesian theory, and then provides details on most of the past and present advances in this field. The book has a unique format. There is an explanatory chapter devoted to each conceptual advance followed by journal-style chapters that provide applications or further advances on the concept. Thus, the volume is both a textbook and a compendium of papers covering a vast range of topics. It is appropriate for a well-informed novice interested in understanding the basic approach, methods and recent applications. Because of its advanced chapters and recent work, it is also appropriate for a more mature reader interested in recent applications and devel...
Pascoe, D. J.; Anfinogentov, S.; Nisticò, G.; Goddard, C. R.; Nakariakov, V. M.
2017-04-01
Context. The strong damping of kink oscillations of coronal loops can be explained by mode coupling. The damping envelope depends on the transverse density profile of the loop. Observational measurements of the damping envelope have been used to determine the transverse loop structure which is important for understanding other physical processes such as heating. Aims: The general damping envelope describing the mode coupling of kink waves consists of a Gaussian damping regime followed by an exponential damping regime. Recent observational detection of these damping regimes has been employed as a seismological tool. We extend the description of the damping behaviour to account for additional physical effects, namely a time-dependent period of oscillation, the presence of additional longitudinal harmonics, and the decayless regime of standing kink oscillations. Methods: We examine four examples of standing kink oscillations observed by the Atmospheric Imaging Assembly (AIA) onboard the Solar Dynamics Observatory (SDO). We use forward modelling of the loop position and investigate the dependence on the model parameters using Bayesian inference and Markov chain Monte Carlo (MCMC) sampling. Results: Our improvements to the physical model combined with the use of Bayesian inference and MCMC produce improved estimates of model parameters and their uncertainties. Calculation of the Bayes factor also allows us to compare the suitability of different physical models. We also use a new method based on spline interpolation of the zeroes of the oscillation to accurately describe the background trend of the oscillating loop. Conclusions: This powerful and robust method allows for accurate seismology of coronal loops, in particular the transverse density profile, and potentially reveals additional physical effects.
A Bayesian Analysis of Spectral ARMA Model
Directory of Open Access Journals (Sweden)
Manoel I. Silvestre Bezerra
2012-01-01
Bezerra et al. (2008) proposed a new method, based on Yule-Walker equations, to estimate the ARMA spectral model. In this paper, a Bayesian approach is developed for this model by using the noninformative prior proposed by Jeffreys (1967). The Bayesian computations are carried out by Markov chain Monte Carlo (MCMC) simulation, and characteristics of the marginal posterior distributions, such as the Bayes estimator and confidence interval for the parameters of the ARMA model, are derived. Both methods are also compared with the traditional least squares and maximum likelihood approaches, and a numerical illustration with two examples of the ARMA model is presented to evaluate the performance of the procedures.
GPstuff: Bayesian Modeling with Gaussian Processes
Vanhatalo, J.; Riihimaki, J.; Hartikainen, J.; Jylänki, P.P.; Tolvanen, V.; Vehtari, A.
2013-01-01
The GPstuff toolbox is a versatile collection of Gaussian process models and computational tools required for Bayesian inference. The tools include, among others, various inference methods, sparse approximations and model assessment methods.
Minsley, Burke J.
2011-01-01
A meaningful interpretation of geophysical measurements requires an assessment of the space of models that are consistent with the data, rather than just a single, ‘best’ model which does not convey information about parameter uncertainty. For this purpose, a trans-dimensional Bayesian Markov chain Monte Carlo (MCMC) algorithm is developed for assessing frequency-domain electromagnetic (FDEM) data acquired from airborne or ground-based systems. By sampling the distribution of models that are consistent with measured data and any prior knowledge, valuable inferences can be made about parameter values such as the likely depth to an interface, the distribution of possible resistivity values as a function of depth and non-unique relationships between parameters. The trans-dimensional aspect of the algorithm allows the number of layers to be a free parameter that is controlled by the data, where models with fewer layers are inherently favoured, which provides a natural measure of parsimony and a significant degree of flexibility in parametrization. The MCMC algorithm is used with synthetic examples to illustrate how the distribution of acceptable models is affected by the choice of prior information, the system geometry and configuration and the uncertainty in the measured system elevation. An airborne FDEM data set that was acquired for the purpose of hydrogeological characterization is also studied. The results compare favorably with traditional least-squares analysis, borehole resistivity and lithology logs from the site, and also provide new information about parameter uncertainty necessary for model assessment.
Labarbe, Rudi; Janssens, Guillaume; Sterpin, Edmond
2016-09-01
In proton therapy, quantification of the proton range uncertainty is important to achieve dose distribution compliance. The promising accuracy of prompt gamma imaging (PGI) suggests the development of a mathematical framework using the range measurements to convert population-based estimates of uncertainties into patient-specific estimates with the purpose of plan adaptation. We present here such a framework using Bayesian inference. The sources of uncertainty were modeled by three parameters: setup bias m, random setup precision r and water equivalent path length bias u. The evolution of the expectation values E(m), E(r) and E(u) during the treatment was simulated. The expectation values converged towards the true simulation parameters after 5 and 10 fractions, for E(m) and E(u), respectively. E(r) settled on a constant value slightly lower than the true value after 10 fractions. In conclusion, the simulation showed that there is enough information in the frequency distribution of the range errors measured by PGI to estimate the expectation values and the confidence interval of the model parameters by Bayesian inference. The updated model parameters were used to compute patient-specific lateral and local distal margins for adaptive re-planning.
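The fraction-by-fraction updating described above can be illustrated with a deliberately simplified sketch. It assumes a single unknown bias with a Gaussian prior and Gaussian measurement noise of known variance, so the posterior is available in closed form; the abstract does not state the actual three-parameter model, and every number and name below (`true_bias_mm`, `noise_sd_mm`, the prior) is hypothetical.

```python
import random


def update_normal_mean(prior_mean, prior_var, observation, noise_var):
    """Conjugate update of a Gaussian belief about an unknown mean
    after one observation with known Gaussian noise variance."""
    post_var = 1.0 / (1.0 / prior_var + 1.0 / noise_var)
    post_mean = post_var * (prior_mean / prior_var + observation / noise_var)
    return post_mean, post_var


rng = random.Random(1)
true_bias_mm = 2.0          # hypothetical patient-specific bias
noise_sd_mm = 1.5           # hypothetical per-fraction measurement spread
mean, var = 0.0, 3.0 ** 2   # hypothetical population-based prior

history = []
for fraction in range(1, 31):
    # Simulated range-error measurement for this fraction.
    measured = rng.gauss(true_bias_mm, noise_sd_mm)
    mean, var = update_normal_mean(mean, var, measured, noise_sd_mm ** 2)
    history.append((fraction, mean, var))
```

Each entry of `history` holds the expectation and variance of the bias after that fraction, so the narrowing of the patient-specific estimate can be inspected directly.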
MCMC-based inversion algorithm dedicated to NEMS mass Spectrometry
Pérenon, R.; Mohammad-Djafari, A.; Sage, E.; Duraffourg, L.; Hentz, S.; Brenac, A.; Morel, R.; Grangeat, P.
2013-08-01
Nano Electro Mechanical Systems (NEMS) provide new perspectives in the mass spectrometry field. This new generation of sensors is sensitive enough to detect a single molecule. Thus, it is possible to estimate a concentration profile in a counting-mode which brings a reduced noise and a higher sensitivity. In this paper, first, we briefly describe the measurement system. Then we propose a probabilistic model of the acquisition system in the form of an input-output system from which we can deduce the likelihood of the unknowns in the data and a Bayesian inference approach with a hierarchical Bernoulli-Gamma prior model. To do the computation we propose the use of a Multiple-Try Metropolis Monte-Carlo Markov-Chain algorithm. Multiple-Try Metropolis proposal functions are adapted to the model, especially to the discrete nature of the problem. Our approach provides an automatic robust estimation of mass spectra. We test the proposed algorithm both on experimental and on simulated data. We discuss the performances of the algorithm and the robustness of the estimation.
An MCMC determination of the primordial helium abundance
Aver, Erik; Skillman, Evan D
2011-01-01
Spectroscopic observations of the chemical abundances in metal-poor H II regions provide an independent method for estimating the primordial helium abundance. H II regions are described by several physical parameters such as electron density, electron temperature, and reddening, in addition to y, the ratio of helium to hydrogen. It had been customary to estimate or determine self-consistently these parameters to calculate y. Frequentist analyses of the parameter space have been shown to be successful in these determinations, and Markov Chain Monte Carlo (MCMC) techniques have proven to be very efficient in sampling this parameter space. Nevertheless, accurate determination of the primordial helium abundance from observations of H II regions is constrained by both systematic and statistical uncertainties. In an attempt to better reduce the latter, and better characterize the former, we apply MCMC methods to the large dataset recently compiled by Izotov, Thuan, & Stasinska (2007). To improve the reliability...
Borsboom, D.; Haig, B.D.
2013-01-01
Unlike most other statistical frameworks, Bayesian statistical inference is wedded to a particular approach in the philosophy of science (see Howson & Urbach, 2006); this approach is called Bayesianism. Rather than being concerned with model fitting, this position in the philosophy of science primar
Target tracking in glint noise using a MCMC particle filter
Institute of Scientific and Technical Information of China (English)
Hu Hongtao; Jing Zhongliang; Li Anping; Hu Shiqiang; Tian Hongwei
2005-01-01
In radar target tracking applications, the observation noise is usually non-Gaussian, which is also referred to as glint noise. The performance of conventional trackers degrades severely in the presence of glint noise. An improved particle filter, the Markov chain Monte Carlo particle filter (MCMC-PF), is applied to cope with radar target tracking when the measurements are perturbed by glint noise. The tracking performance of the filter in the presence of glint noise is demonstrated by computer simulation.
Variational level set segmentation for forest based on MCMC sampling
Yang, Tie-Jun; Huang, Lin; Jiang, Chuan-xian; Nong, Jian
2014-11-01
Environmental protection is one of the themes of today's world. The forest is a recycler of carbon dioxide and a natural oxygen bar. Protecting forests and monitoring forest growth are long-term tasks of environmental protection. It is very important to compute the forest coverage rate automatically from optical remote sensing images, so that the status of the forest in an area can be understood in a timely manner without tedious manual statistics. To address the computational complexity of global optimization using convexification, this paper proposes a level set segmentation method based on Markov chain Monte Carlo (MCMC) sampling and applies it to forest segmentation in remote sensing images. The presented method does not need any convexity transformation of the energy functional, and instead uses an MCMC sampling method with global optimization capability. The local minima that can occur with gradient descent are also avoided. There are three major contributions in the paper. Firstly, by using MCMC sampling, the convexity of the energy functional is no longer necessary and global optimization can still be achieved. Secondly, by taking advantage of the data (texture) and knowledge (a priori color) to guide the construction of the Markov chain, the convergence rate of the Markov chain is improved significantly. Finally, a level set segmentation method for forest that integrates a priori color and texture is proposed. The experiments show that our method can efficiently and accurately segment forest in remote sensing images.
Bayesian approach to rough set
Marwala, Tshilidzi
2007-01-01
This paper proposes an approach to training rough set models using a Bayesian framework trained with the Markov Chain Monte Carlo (MCMC) method. The prior probabilities are constructed from the prior knowledge that good rough set models have fewer rules. Markov Chain Monte Carlo sampling is conducted by sampling in the rough set granule space, and the Metropolis algorithm is used as the acceptance criterion. The proposed method is tested by estimating the risk of HIV given demographic data. The results obtained show that the proposed approach is able to achieve an average accuracy of 58%, with the accuracy varying up to 66%. In addition, the Bayesian rough set gives the probabilities of the estimated HIV status as well as the linguistic rules describing how the demographic parameters drive the risk of HIV.
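The Metropolis acceptance rule mentioned above can be sketched generically. The example below is not the rough set sampler from the paper: it assumes a continuous one-dimensional toy target (`log_standard_normal`, hypothetical) and a symmetric Gaussian random-walk proposal, purely to show how the acceptance criterion drives the chain.

```python
import math
import random


def metropolis_accept(log_p_current, log_p_proposal, u):
    """Metropolis rule for a symmetric proposal: accept with
    probability min(1, p_proposal / p_current); u is uniform on [0, 1)."""
    log_ratio = log_p_proposal - log_p_current
    return log_ratio >= 0.0 or u < math.exp(log_ratio)


def metropolis_sample(log_target, x0, step, n_samples, rng):
    """Random-walk Metropolis chain; returns the list of visited states."""
    x, log_p = x0, log_target(x0)
    chain = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)  # symmetric Gaussian proposal
        log_p_new = log_target(proposal)
        if metropolis_accept(log_p, log_p_new, rng.random()):
            x, log_p = proposal, log_p_new   # move to the proposed state
        chain.append(x)                      # a rejection repeats the old state
    return chain


def log_standard_normal(x):
    """Hypothetical toy target: unnormalised log density of N(0, 1)."""
    return -0.5 * x * x


chain = metropolis_sample(log_standard_normal, x0=0.0, step=1.0,
                          n_samples=20000, rng=random.Random(0))
kept = chain[2000:]                          # discard burn-in
mean = sum(kept) / len(kept)
variance = sum((x - mean) ** 2 for x in kept) / len(kept)
```

In a discrete setting such as the granule space described in the abstract, only the proposal step would change; the acceptance test keeps the same form as long as the proposal is symmetric.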
Bayesian methods for measures of agreement
Broemeling, Lyle D
2009-01-01
Using WinBUGS to implement Bayesian inferences of estimation and testing hypotheses, Bayesian Methods for Measures of Agreement presents useful methods for the design and analysis of agreement studies. It focuses on agreement among the various players in the diagnostic process. The author employs a Bayesian approach to provide statistical inferences based on various models of intra- and interrater agreement. He presents many examples that illustrate the Bayesian mode of reasoning and explains elements of a Bayesian application, including prior information, experimental information, the likelihood function, posterior distribution, and predictive distribution. The appendices provide the necessary theoretical foundation to understand Bayesian methods as well as introduce the fundamentals of programming and executing the WinBUGS software. Taking a Bayesian approach to inference, this hands-on book explores numerous measures of agreement, including the Kappa coefficient, the G coefficient, and intraclass correlation...
Single channel signal component separation using Bayesian estimation
Institute of Scientific and Technical Information of China (English)
Cai Quanwei; Wei Ping; Xiao Xianci
2007-01-01
A Bayesian estimation method to separate multicomponent signals from a single-channel observation is presented in this paper. By using the basis function projection, the component separation becomes a problem of limited parameter estimation. Then, a Bayesian model for estimating the parameters is set up. The reversible jump MCMC (Monte Carlo Markov Chain) algorithm is adopted to perform the Bayesian computation. The method can jointly estimate the parameters of each component and the number of components. Simulation results demonstrate that the method has a low SNR threshold and better performance.
Bayesian modeling using WinBUGS
Ntzoufras, Ioannis
2009-01-01
A hands-on introduction to the principles of Bayesian modeling using WinBUGS. Bayesian Modeling Using WinBUGS provides an easily accessible introduction to the use of WinBUGS programming techniques in a variety of Bayesian modeling settings. The author provides an accessible treatment of the topic, offering readers a smooth introduction to the principles of Bayesian modeling with detailed guidance on the practical implementation of key principles. The book begins with a basic introduction to Bayesian inference and the WinBUGS software and goes on to cover key topics, including: Markov Chain Monte Carlo algorithms in Bayesian inference; generalized linear models; Bayesian hierarchical models; predictive distribution and model checking; Bayesian model and variable evaluation. Computational notes and screen captures illustrate the use of both WinBUGS as well as R software to apply the discussed techniques. Exercises at the end of each chapter allow readers to test their understanding of the presented concepts and all ...
Directory of Open Access Journals (Sweden)
Oliver Ratmann
2007-11-01
Gene duplication with subsequent interaction divergence is one of the primary driving forces in the evolution of genetic systems. Yet little is known about the precise mechanisms and the role of duplication divergence in the evolution of protein networks from the prokaryote and eukaryote domains. We developed a novel, model-based approach for Bayesian inference on biological network data that centres on approximate Bayesian computation, or likelihood-free inference. Instead of computing the intractable likelihood of the protein network topology, our method summarizes key features of the network and, based on these, uses an MCMC algorithm to approximate the posterior distribution of the model parameters. This allowed us to reliably fit a flexible mixture model that captures hallmarks of evolution by gene duplication and subfunctionalization to protein interaction network (PIN) data of Helicobacter pylori and Plasmodium falciparum. The 80% credible intervals for the duplication-divergence component are [0.64, 0.98] for H. pylori and [0.87, 0.99] for P. falciparum. The remaining parameter estimates are not inconsistent with sequence data. An extensive sensitivity analysis showed that incompleteness of PIN data does not largely affect the analysis of models of protein network evolution, and that the degree sequence alone barely captures the evolutionary footprints of protein networks relative to other statistics. Our likelihood-free inference approach enables a fully Bayesian analysis of a complex and highly stochastic system that is otherwise intractable at present. Modelling the evolutionary history of PIN data, it transpires that only the simultaneous analysis of several global aspects of protein networks enables credible and consistent inference to be made from available datasets. Our results indicate that gene duplication has played a larger part in the network evolution of the eukaryote than in the prokaryote, and suggest that single gene
Directory of Open Access Journals (Sweden)
Moslem Moradi
2015-06-01
Herein, an application of a new seismic inversion algorithm in one of Iran’s oilfields is described. Stochastic (geostatistical) seismic inversion, as a complementary method to deterministic inversion, is perceived as a combination of geostatistics and seismic inversion algorithms. This method integrates information from different data sources with different scales, as prior information in Bayesian statistics. Data integration leads to a probability density function (named the a posteriori probability) that can yield a model of the subsurface. The Markov Chain Monte Carlo (MCMC) method is used to sample the posterior probability distribution, and the subsurface model characteristics can be extracted by analyzing a set of the samples. In this study, the theory of stochastic seismic inversion in a Bayesian framework was described and applied to infer P-impedance and porosity models. The comparison between the stochastic seismic inversion and the deterministic model-based seismic inversion indicates that the stochastic seismic inversion can provide more detailed information on subsurface character. Since multiple realizations are extracted by this method, an estimate of pore volume and the uncertainty in that estimate were analyzed.
Decoding X-ray observations from centres of galaxy clusters using MCMC
Lakhchaura, Kiran; Sharma, Prateek
2016-01-01
Traditionally the thermodynamic profiles (gas density, temperature, etc.) of galaxy clusters are obtained by assuming spherical symmetry and modeling projected X-ray spectra in each annulus. The outer annuli contribute to the inner ones and their contribution needs to be subtracted to obtain the temperature and density of spherical shells. The usual deprojection methods lead to propagation of errors from outside to in and do not model the covariance of parameters in different radial shells. In this paper we describe a method based on a free-form model of clusters with cluster parameters (density, temperature) given in spherical shells, which we jointly forward fit to the X-ray data by constructing a Bayesian posterior probability distribution that we sample using the MCMC technique. By systematically marginalising over the nuisance outer shells, we estimate the inner entropy profiles of clusters and fit them to various models for a sample of Chandra X-ray observations of 17 clusters. We show that the en...
Methane emission modeling with MCMC calibration for a boreal peatland
Raivonen, Maarit; Smolander, Sampo; Susiluoto, Jouni; Backman, Leif; Li, Xuefei; Markkanen, Tiina; Kleinen, Thomas; Makela, Jarmo; Aalto, Tuula; Rinne, Janne; Brovkin, Victor; Vesala, Timo
2016-04-01
Natural wetlands, particularly peatlands of the boreal latitudes, are a significant source of methane (CH4). At the moment, the emission estimates are highly uncertain. These natural emissions respond to climatic variability, so it is necessary to understand their dynamics in order to be able to predict how they affect the greenhouse gas balance in the future. We have developed a model of CH4 production, oxidation and transport in boreal peatlands. It simulates production of CH4 as a proportion of anaerobic peat respiration, transport of CH4 and oxygen between the soil and the atmosphere via diffusion in aerenchymatous plants and in peat pores (water and air filled), ebullition and oxidation of CH4 by methanotrophic microbes. Ultimately, we aim to add the model functionality to global climate models such as the JSBACH (Reick et al., 2013), the land surface scheme of the MPI Earth System Model. We tested the model with measured methane fluxes (using the eddy covariance technique) from the Siikaneva site, an oligotrophic boreal fen in southern Finland (61°49' N, 24°11' E), over the years 2005-2011. To give the model estimates regional reliability, we calibrated the model using the Markov chain Monte Carlo (MCMC) technique. Although the simulations and the research are still ongoing, preliminary results from the MCMC calibration can be described as very promising considering that the model is still at a relatively early stage. We will present the model and its dynamics as well as results from the MCMC calibration and the comparison with Siikaneva flux data.
Bayesian inversion of seismic attributes for geological facies using a Hidden Markov Model
Nawaz, Muhammad Atif; Curtis, Andrew
2017-02-01
Markov chain Monte-Carlo (McMC) sampling generates correlated random samples such that their distribution would converge to the true distribution only as the number of samples tends to infinity. In practice, McMC is found to be slow to converge, convergence is not guaranteed to be achieved in finite time, and detection of convergence requires the use of subjective criteria. Although McMC has been used for decades as the algorithm of choice for inference in complex probability distributions, there is a need to seek alternative approaches, particularly in high dimensional problems. Walker & Curtis (2014) developed a method for Bayesian inversion of 2-D spatial data using an exact sampling alternative to McMC which always draws independent samples of the target distribution. Their method thus obviates the need for convergence and removes the concomitant bias exhibited by finite sample sets. Their algorithm is nevertheless computationally intensive and requires large memory. We propose a more efficient method for Bayesian inversion of categorical variables, such as geological facies that requires no sampling at all. The method is based on a 2-D Hidden Markov Model (2D-HMM) over a grid of cells where observations represent localized data constraining each cell. The data in our example application are seismic attributes such as P- and S-wave impedances and rock density; our categorical variables are the hidden states and represent the geological rock types in each cell-facies of distinct subsets of lithology and fluid combinations such as shale, brine-sand and gas-sand. The observations at each location are assumed to be generated from a random function of the hidden state (facies) at that location, and to be distributed according to a certain probability distribution that is independent of hidden states at other locations - an assumption referred to as `localized likelihoods'. The hidden state (facies) at a location cannot be determined solely by the observation at that
Molitor, John
2012-03-01
Bayesian methods have seen an increase in popularity in a wide variety of scientific fields, including epidemiology. One of the main reasons for their widespread application is the power of the Markov chain Monte Carlo (MCMC) techniques generally used to fit these models. As a result, researchers often implicitly associate Bayesian models with MCMC estimation procedures. However, Bayesian models do not always require Markov-chain-based methods for parameter estimation. This is important, as MCMC estimation methods, while generally quite powerful, are complex and computationally expensive and suffer from convergence problems related to the manner in which they generate correlated samples used to estimate probability distributions for parameters of interest. In this issue of the Journal, Cole et al. (Am J Epidemiol. 2012;175(5):368-375) present an interesting paper that discusses non-Markov-chain-based approaches to fitting Bayesian models. These methods, though limited, can overcome some of the problems associated with MCMC techniques and promise to provide simpler approaches to fitting Bayesian models. Applied researchers will find these estimation approaches intuitively appealing and will gain a deeper understanding of Bayesian models through their use. However, readers should be aware that other non-Markov-chain-based methods are currently in active development and have been widely published in other fields.
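As a generic illustration of fitting a Bayesian model without any Markov chain, the sketch below evaluates a posterior on a regular grid. It is not necessarily one of the approaches discussed by Cole et al., which the abstract does not describe; it assumes a binomial likelihood with a uniform prior, and the data (7 successes in 20 trials) are hypothetical.

```python
def grid_posterior(successes, trials, n_grid=1001):
    """Posterior over a binomial probability on a regular grid,
    assuming a uniform prior on [0, 1]."""
    grid = [i / (n_grid - 1) for i in range(n_grid)]
    # Unnormalised posterior = binomial likelihood kernel * flat prior.
    weights = [p ** successes * (1.0 - p) ** (trials - successes) for p in grid]
    total = sum(weights)
    return grid, [w / total for w in weights]


grid, post = grid_posterior(successes=7, trials=20)
post_mean = sum(p * w for p, w in zip(grid, post))
```

For this model the exact posterior is Beta(8, 14), with mean 8/22, so the grid estimate `post_mean` can be checked against a known value. Grid evaluation scales poorly with the number of parameters, which is one reason such methods are described as limited.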
Kwon, Hyun-Han; Lall, Upmanu; Kim, Seong-Joon
2016-08-01
Recently, the Korean peninsula faced severe drought for more than 3 years (2013-2015). Drought in this region is characterized by multidecadal variability, as seen from one of the longest systematic records available in Asia from 1770 to 2015. This paper explores how the return period of the 2013-2015 drought varies over this historical period to provide a context for the changing climate and drought severity in the region. A nonstationary, multivariate, Bayesian copula model for drought severity and duration is developed and applied. Given the wetting trend over the last 50 years, the recent drought appears quite extreme, while such droughts were common in the eighteenth and nineteenth centuries.
Bayesian multiple target tracking
Streit, Roy L
2013-01-01
This second edition has undergone substantial revision from the 1999 first edition, recognizing that a lot has changed in the multiple target tracking field. One of the most dramatic changes is in the widespread use of particle filters to implement nonlinear, non-Gaussian Bayesian trackers. This book views multiple target tracking as a Bayesian inference problem. Within this framework it develops the theory of single target tracking, multiple target tracking, and likelihood ratio detection and tracking. In addition to providing a detailed description of a basic particle filter that implements
Directory of Open Access Journals (Sweden)
Thomas Christopher M
2011-07-01
Background IncP-1 plasmids are broad host range plasmids that have been found in clinical and environmental bacteria. They often carry genes for antibiotic resistance or catabolic pathways. The archetypal IncP-1 plasmid RK2 is a well-characterized biological system, with a fully sequenced and annotated genome and a wide range of experimental measurements. Its central control operon, encoding two global regulators KorA and KorB, is a natural example of a negatively self-regulated operon. To increase our understanding of the regulation of this operon, we have constructed a dynamical mathematical model using Ordinary Differential Equations, and employed a Bayesian inference scheme, Markov Chain Monte Carlo (MCMC) using the Metropolis-Hastings algorithm, as a way of integrating experimental measurements and a priori knowledge. We also compared MCMC and Metabolic Control Analysis (MCA) as approaches for determining the sensitivity of model parameters. Results We identified two distinct sets of parameter values, with different biological interpretations, that fit and explain the experimental data. This allowed us to highlight the proportion of repressor protein as dimers as a key experimental measurement defining the dynamics of the system. Analysis of joint posterior distributions led to the identification of correlations between parameters for protein synthesis and partial repression by KorA or KorB dimers, indicating the necessary use of joint posteriors for correct parameter estimation. Using MCA, we demonstrated that the system is highly sensitive to the growth rate but insensitive to repressor monomerization rates in their selected value regions; the latter outcome was also confirmed by MCMC. Finally, by examining a series of different model refinements for partial repression by KorA or KorB dimers alone, we showed that a model including partial repression by KorA and KorB was most compatible with existing experimental data. Conclusions We
Yue, Yu Ryan; Wang, Xiao-Feng
2016-05-10
This paper is motivated from a retrospective study of the impact of vitamin D deficiency on the clinical outcomes for critically ill patients in multi-center critical care units. The primary predictors of interest, vitamin D2 and D3 levels, are censored at a known detection limit. Within the context of generalized linear mixed models, we investigate statistical methods to handle multiple censored predictors in the presence of auxiliary variables. A Bayesian joint modeling approach is proposed to fit the complex heterogeneous multi-center data, in which the data information is fully used to estimate parameters of interest. Efficient Monte Carlo Markov chain algorithms are specifically developed depending on the nature of the response. Simulation studies demonstrate the outperformance of the proposed Bayesian approach over other existing methods. An application to the data set from the vitamin D deficiency study is presented. Possible extensions of the method regarding the absence of auxiliary variables, semiparametric models, as well as the type of censoring are also discussed.
Gomes, Guilherme J. C.; Vrugt, Jasper A.; Vargas, Eurípedes A.
2016-04-01
The depth to bedrock controls a myriad of processes by influencing subsurface flow paths, erosion rates, soil moisture, and water uptake by plant roots. As hillslope interiors are very difficult and costly to illuminate and access, the topography of the bedrock surface is largely unknown. This essay is concerned with the prediction of spatial patterns in the depth to bedrock (DTB) using high-resolution topographic data, numerical modeling, and Bayesian analysis. Our DTB model builds on the bottom-up control on fresh-bedrock topography hypothesis of Rempe and Dietrich (2014) and includes a mass movement and bedrock-valley morphology term to extend the usefulness and general applicability of the model. We reconcile the DTB model with field observations using Bayesian analysis with the DREAM algorithm. We investigate explicitly the benefits of using spatially distributed parameter values to account implicitly, and in a relatively simple way, for rock mass heterogeneities that are very difficult, if not impossible, to characterize adequately in the field. We illustrate our method using an artificial data set of bedrock depth observations and then evaluate our DTB model with real-world data collected at the Papagaio river basin in Rio de Janeiro, Brazil. Our results demonstrate that the DTB model predicts accurately the observed bedrock depth data. The posterior mean DTB simulation is shown to be in good agreement with the measured data. The posterior prediction uncertainty of the DTB model can be propagated forward through hydromechanical models to derive probabilistic estimates of factors of safety.
An Efficient MCMC Algorithm to Sample Binary Matrices with Fixed Marginals
Verhelst, Norman D.
2008-01-01
Uniform sampling of binary matrices with fixed margins is known as a difficult problem. Two classes of algorithms to sample from a distribution not too different from the uniform are studied in the literature: importance sampling and Markov chain Monte Carlo (MCMC). Existing MCMC algorithms converge slowly, require a long burn-in period and yield…
Bayesian Methods for Statistical Analysis
Puza, Borek
2015-01-01
Bayesian methods for statistical analysis is a book on statistical methods for analysing a wide variety of data. The book consists of 12 chapters, starting with basic concepts and covering numerous topics, including Bayesian estimation, decision theory, prediction, hypothesis testing, hierarchical models, Markov chain Monte Carlo methods, finite population inference, biased sampling and nonignorable nonresponse. The book contains many exercises, all with worked solutions, including complete c...
Oware, E. K.
2015-12-01
Modeling aquifer heterogeneities (AH) is a complex, multidimensional problem that mostly requires stochastic imaging strategies for tractability. While the traditional Bayesian Markov chain Monte Carlo (McMC) provides a powerful framework to model AH, the generic McMC is computationally prohibitive and, thus, unappealing for large-scale problems. An innovative variant of the McMC scheme that imposes a priori spatial statistical constraints on model parameter updates, for improved characterization in a computationally efficient manner, is proposed. The proposed algorithm (PA) is based on Markov random field (MRF) modeling, which is an image processing technique that infers the global behavior of a random field from its local properties, making the MRF approach well suited for imaging AH. MRF-based modeling leverages the equivalence of the Gibbs (or Boltzmann) distribution (GD) and the MRF to identify the local properties of an MRF in terms of the easily quantifiable Gibbs energy. The PA employs a two-step approach to model the lithological structure of the aquifer and the hydraulic properties within the identified lithologies simultaneously. It performs local Gibbs energy minimizations along a random path, which requires parameters of the GD (spatial statistics) to be specified. A PA that implicitly infers site-specific GD parameters within a Bayesian framework is also presented. The PA is illustrated with a synthetic binary facies aquifer with a lognormal heterogeneity simulated within each facies. GD parameters of 2.6, 1.2, -0.4, and -0.2 were estimated for the horizontal, vertical, NESW, and NWSE directions, respectively. Most of the high hydraulic conductivity zones (facies 2) were fairly resolved (see results below) with facies identification accuracy rates of 81%, 89%, and 90% for the inversions conditioned on concentration (R1), resistivity (R2), and joint (R3), respectively. The incorporation of the conditioning datasets improved on the root mean square error (RMSE
MCMC with Strings and Branes: The Suburban Algorithm (Extended Version)
Heckman, Jonathan J; Vigoda, Ben
2016-01-01
Motivated by the physics of strings and branes, we develop a class of Markov chain Monte Carlo (MCMC) algorithms involving extended objects. Starting from a collection of parallel Metropolis-Hastings (MH) samplers, we place them on an auxiliary grid, and couple them together via nearest neighbor interactions. This leads to a class of "suburban samplers" (i.e., spread out Metropolis). Coupling the samplers in this way modifies the mixing rate and speed of convergence for the Markov chain, and can in many cases allow a sampler to more easily overcome free energy barriers in a target distribution. We test these general theoretical considerations by performing several numerical experiments. For suburban samplers with a fluctuating grid topology, performance is strongly correlated with the average number of neighbors. Increasing the average number of neighbors above zero initially leads to an increase in performance, though there is a critical connectivity with effective dimension d_eff ~ 1, above which "groupthin...
DEFF Research Database (Denmark)
Mørup, Morten; Schmidt, Mikkel N
2012-01-01
Many networks of scientific interest naturally decompose into clusters or communities with comparatively fewer external than internal links; however, current Bayesian models of network communities do not exert this intuitive notion of communities. We formulate a nonparametric Bayesian model for community detection consistent with an intuitive definition of communities and present a Markov chain Monte Carlo procedure for inferring the community structure. A Matlab toolbox with the proposed inference procedure is available for download. On synthetic and real networks, our model detects communities consistent with ground truth, and on real networks, it outperforms existing approaches in predicting missing links. This suggests that community structure is an important structural property of networks that should be explicitly modeled.
Likelihood-based inference for clustered line transect data
DEFF Research Database (Denmark)
Waagepetersen, Rasmus Plenge; Schweder, Tore
is implemented using Markov Chain Monte Carlo methods to obtain efficient estimates of spatial clustering parameters. Uncertainty is addressed using parametric bootstrap or by consideration of posterior distributions in a Bayesian setting. Maximum likelihood estimation and Bayesian inference is compared...
Bayesian approach to avoiding track seduction
Salmond, David J.; Everett, Nicholas O.
2002-08-01
The problem of maintaining track on a primary target in the presence of spurious objects is addressed. Recursive and batch filtering approaches are developed. For the recursive approach, a Bayesian track splitting filter is derived which spawns candidate tracks if there is a possibility of measurement misassociation. The filter evaluates the probability of each candidate track being associated with the primary target. The batch filter is a Markov-chain Monte Carlo (MCMC) algorithm which fits the observed data sequence to models of target dynamics and measurement-track association. Simulation results are presented.
Application of the Bayesian dynamic survival model in medicine.
He, Jianghua; McGee, Daniel L; Niu, Xufeng
2010-02-10
The Bayesian dynamic survival model (BDSM), a time-varying coefficient survival model from the Bayesian perspective, was proposed in the early 1990s but has not been widely used or discussed. In this paper, we describe the model structure of the BDSM and introduce two estimation approaches for BDSMs: the Markov Chain Monte Carlo (MCMC) approach and the linear Bayesian (LB) method. The MCMC approach estimates model parameters through sampling and is computationally intensive. With the newly developed geoadditive survival models and software BayesX, the BDSM is available for general applications. The LB approach is easier in terms of computations but it requires the prespecification of some unknown smoothing parameters. In a simulation study, we use the LB approach to show the effects of smoothing parameters on the performance of the BDSM and propose an ad hoc method for identifying appropriate values for those parameters. We also demonstrate the performance of the MCMC approach compared with the LB approach and a penalized partial likelihood method available in software R packages. A gastric cancer trial is utilized to illustrate the application of the BDSM.
Institute of Scientific and Technical Information of China (English)
陈亚军; 刘丁; 梁军利
2012-01-01
To address the difficulty of describing non-Gaussian signals, this paper proposes a Markov chain Monte Carlo method for Bayesian inference on the parameters of mixtures of α-stable distributions. A hierarchical Bayesian graphical model is constructed. A Gibbs sampling algorithm is used to estimate the mixing weights and the allocation parameter z, and the four parameters of each distribution component are estimated with the Metropolis algorithm. The simulation results show that the method can accurately estimate the parameters of a mixture of α-stable distributions and that it is robust and flexible, so it can be used to model non-Gaussian signals or data.
A Bayesian approach to multiscale inverse problems with on-the-fly scale determination
Ellam, Louis; Zabaras, Nicholas; Girolami, Mark
2016-12-01
A Bayesian computational approach is presented to provide a multi-resolution estimate of an unknown spatially varying parameter from indirect measurement data. In particular, we are interested in spatially varying parameters with multiscale characteristics. In our work, we consider the challenge of not knowing the characteristic length scale(s) of the unknown a priori, and present an algorithm for on-the-fly scale determination. Our approach is based on representing the spatial field with a wavelet expansion. Wavelet basis functions are hierarchically structured, localized in both spatial and frequency domains and tend to provide sparse representations in that a large number of wavelet coefficients are approximately zero. For these reasons, wavelet bases are suitable for representing permeability fields with non-trivial correlation structures. Moreover, the intra-scale correlations between wavelet coefficients form a quadtree, and this structure is exploited to identify additional basis functions to refine the model. Bayesian inference is performed using a sequential Monte Carlo (SMC) sampler with a Markov Chain Monte Carlo (MCMC) transition kernel. The SMC sampler is used to move between posterior densities defined on different scales, thereby providing a computationally efficient method for adaptive refinement of the wavelet representation. We gain insight from the marginal likelihoods, by computing Bayes factors, for model comparison and model selection. The marginal likelihoods provide a termination criterion for our scale determination algorithm. The Bayesian computational approach is rather general and applicable to several inverse problems concerning the estimation of a spatially varying parameter. The approach is demonstrated with permeability estimation for groundwater flow using pressure sensor measurements.
Bayesian spatial semi-parametric modeling of HIV variation in Kenya.
Directory of Open Access Journals (Sweden)
Oscar Ngesa
Spatial statistics has seen rapid application in many fields, especially epidemiology and public health. Many studies, nonetheless, make limited use of the geographical location information and also usually assume that the covariates, which are related to the response variable, have linear effects. We develop a Bayesian semi-parametric regression model for HIV prevalence data. Model estimation and inference are based on a fully Bayesian approach via Markov Chain Monte Carlo (McMC). The model is applied to HIV prevalence data among men in Kenya, derived from the Kenya AIDS indicator survey, with n = 3,662. Past studies have concluded that HIV infection has a nonlinear association with age. In this study a smooth function based on penalized regression splines is used to estimate this nonlinear effect. Other covariates were assumed to have a linear effect. Spatial references to the counties were modeled as both structured and unstructured spatial effects. We observe that circumcision reduces the risk of HIV infection. The results also indicate that men in the urban areas were more likely to be infected by HIV as compared to their rural counterparts. Men with higher education had the lowest risk of HIV infection. A nonlinear relationship between HIV infection and age was established. Risk of HIV infection increases with age up to the age of 40 then declines with increase in age. Men who had STI in the last 12 months were more likely to be infected with HIV. Also men who had ever used a condom were found to have higher likelihood to be infected by HIV. A significant spatial variation of HIV infection in Kenya was also established. The study shows the practicality and flexibility of Bayesian semi-parametric regression model in analyzing epidemiological data.
Elvira, Clément; Dobigeon, Nicolas
2015-01-01
Sparse representations have proven their efficiency in solving a wide class of inverse problems encountered in signal and image processing. Conversely, enforcing the information to be spread uniformly over representation coefficients exhibits relevant properties in various applications such as digital communications. Anti-sparse regularization can be naturally expressed through an $\ell_{\infty}$-norm penalty. This paper derives a probabilistic formulation of such representations. A new probability distribution, referred to as the democratic prior, is first introduced. Its main properties as well as three random variate generators for this distribution are derived. Then this probability distribution is used as a prior to promote anti-sparsity in a Gaussian linear inverse problem, yielding a fully Bayesian formulation of anti-sparse coding. Two Markov chain Monte Carlo (MCMC) algorithms are proposed to generate samples according to the posterior distribution. The first one is a standard Gibbs sampler. The seco...
DEFF Research Database (Denmark)
Shariati, M M; Korsgaard, I R; Sorensen, D
2009-01-01
the formulation of the model and the structure of the data and the models were then implemented via MCMC. The output of MCMC sampling schemes was interpreted in the light of the theoretical findings. The erratic behaviour of the MCMC chains was shown to be associated with identifiability problems...
A new approach for Bayesian model averaging
Institute of Scientific and Technical Information of China (English)
TIAN XiangJun; XIE ZhengHui; WANG AiHui; YANG XiaoChun
2012-01-01
Bayesian model averaging (BMA) is a recently proposed statistical method for calibrating forecast ensembles from numerical weather models. However, successful implementation of BMA requires accurate estimates of the weights and variances of the individual competing models in the ensemble. Two methods, namely the Expectation-Maximization (EM) and the Markov Chain Monte Carlo (MCMC) algorithms, are widely used for BMA model training. Both methods have their own respective strengths and weaknesses. In this paper, we first modify the BMA log-likelihood function with the aim of removing the additional limitation that requires that the BMA weights add to one, and then use a limited memory quasi-Newtonian algorithm for solving the nonlinear optimization problem, thereby formulating a new approach for BMA (referred to as BMA-BFGS). Several groups of multi-model soil moisture simulation experiments from three land surface models show that the performance of BMA-BFGS is similar to the MCMC method in terms of simulation accuracy, and that both are superior to the EM algorithm. On the other hand, the computational cost of the BMA-BFGS algorithm is substantially less than for MCMC and is almost equivalent to that for EM.
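For orientation, the BMA log-likelihood mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' BMA-BFGS code: it assumes Gaussian component densities centred on each model's forecast, and it removes the sum-to-one constraint with a softmax reparameterization, which is one common device and is an assumption here (the abstract does not say how the authors modify the likelihood). The names `softmax`, `normal_pdf` and `bma_loglik` are hypothetical.

```python
import math


def softmax(z):
    """Map unconstrained reals to positive weights that sum to one."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def normal_pdf(y, mean, sigma):
    """Density of a normal distribution with the given mean and standard deviation."""
    u = (y - mean) / sigma
    return math.exp(-0.5 * u * u) / (sigma * math.sqrt(2.0 * math.pi))


def bma_loglik(z, sigmas, forecasts, obs):
    """BMA log-likelihood with weights w = softmax(z).

    forecasts[t][k] is the forecast of model k at time t; obs[t] is the observation.
    Because z is unconstrained, an unconstrained optimizer can be applied to it
    (assumption: this is only one possible way to drop the constraint).
    """
    w = softmax(z)
    total = 0.0
    for f_t, y in zip(forecasts, obs):
        dens = sum(wk * normal_pdf(y, fk, sk) for wk, fk, sk in zip(w, f_t, sigmas))
        total += math.log(dens)
    return total
```

A function of this shape could be handed to any general-purpose optimizer over `z` and `sigmas`; which optimizer and parameterization the paper actually uses is described only at the level of the abstract.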
Chen, Xi; Jung, Jin-Gyoung; Shajahan-Haq, Ayesha N; Clarke, Robert; Shih, Ie-Ming; Wang, Yue; Magnani, Luca; Wang, Tian-Li; Xuan, Jianhua
2016-04-20
Chromatin immunoprecipitation with massively parallel DNA sequencing (ChIP-seq) has greatly improved the reliability with which transcription factor binding sites (TFBSs) can be identified from genome-wide profiling studies. Many computational tools have been developed to detect binding events or peaks; however, the robust detection of weak binding events remains a challenge for current peak calling tools. We have developed a novel Bayesian approach (ChIP-BIT) to reliably detect TFBSs and their target genes by jointly modeling binding signal intensities and binding locations of TFBSs. Specifically, a Gaussian mixture model is used to capture both binding and background signals in sample data. As a unique feature of ChIP-BIT, background signals are modeled by a local Gaussian distribution that is accurately estimated from the input data. Extensive simulation studies showed a significantly improved performance of ChIP-BIT in target gene prediction, particularly for detecting weak binding signals at gene promoter regions. We applied ChIP-BIT to find target genes from NOTCH3 and PBX1 ChIP-seq data acquired from MCF-7 breast cancer cells. TF knockdown experiments have initially validated about 30% of co-regulated target genes identified by ChIP-BIT as being differentially expressed in MCF-7 cells. Functional analysis on these genes further revealed the existence of crosstalk between Notch and Wnt signaling pathways.
Gautier, Mathieu
2014-11-01
The recent democratization of next-generation-sequencing-based approaches towards nonmodel species has made it cost-effective to produce large genotyping data sets for a wider range of species. However, when no detailed genome assembly is available, poor knowledge about the organization of the markers within the genome might hamper the optimal use of this abundant information. At the most basic level of genomic organization, the type of chromosome (autosomes, sex chromosomes, mitochondria or chloroplast in plants) may remain unknown for most markers, which might be limiting or even misleading in some applications, particularly in population genetics. Conversely, the characterization of sex-linked markers allows molecular sexing of the individuals. In this study, we propose a Bayesian model-based classifier named detsex, to assign markers to their chromosome type and/or to perform sexing of individuals based on genotyping data. The performance of detsex is further evaluated by a comprehensive simulation study and by the analysis of real data sets from various origins (microsatellite and SNP data derived from genotyping assay designs and NGS experiments). Irrespective of the origin of the markers or the size of the data set, detsex proved efficient (i) to identify the sex-linked markers, (ii) to perform molecular sexing of the individuals and (iii) to perform basic quality check of the genotyping data sets. The underlying structure of the model also allows each of these potential applications to be considered either separately or jointly.
Institute of Scientific and Technical Information of China (English)
章栋恩
2002-01-01
A Bayesian approach to the parameters and other quantities of interest of the Dirichlet likelihood is proposed. A uniform prior is placed on a meaningful function of the parameters. After transforming the parameters, the Metropolis algorithm is used to draw posterior samples, from which the Bayesian inferences follow. Acknowledgements: The authors are grateful to the referees for their helpful comments and suggestions to improve this work.
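The sampling step described in the abstract above can be illustrated with a random-walk Metropolis sketch. It is not the paper's algorithm: the transformation (log of each Dirichlet parameter), the prior (uniform on the log scale within a box of half-width `bound`), the step size, and the names `dirichlet_loglik` and `metropolis_dirichlet` are all assumptions chosen to keep the example short.

```python
import math
import random


def dirichlet_loglik(alpha, data):
    """Log-likelihood of i.i.d. Dirichlet(alpha) observations (each row sums to 1)."""
    n = len(data)
    norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    kernel = sum((a - 1.0) * math.log(x) for row in data for a, x in zip(alpha, row))
    return n * norm + kernel


def metropolis_dirichlet(data, n_iter=5000, step=0.15, bound=5.0, seed=0):
    """Random-walk Metropolis on theta = log(alpha).

    Assumed prior: uniform on [-bound, bound] for each theta, so inside the box
    the acceptance ratio reduces to the likelihood ratio.
    Returns the draws of alpha and the acceptance rate.
    """
    rng = random.Random(seed)
    k = len(data[0])
    theta = [0.0] * k
    cur = dirichlet_loglik([math.exp(t) for t in theta], data)
    draws, accepted = [], 0
    for _ in range(n_iter):
        prop = [t + rng.gauss(0.0, step) for t in theta]
        # Proposals outside the prior support have zero posterior density: reject.
        if all(abs(p) <= bound for p in prop):
            new = dirichlet_loglik([math.exp(p) for p in prop], data)
            if rng.random() < math.exp(min(0.0, new - cur)):
                theta, cur = prop, new
                accepted += 1
        draws.append([math.exp(t) for t in theta])
    return draws, accepted / n_iter
```

Posterior summaries of any function of the parameters (for example the mean proportions `alpha[i] / sum(alpha)`) can then be computed draw by draw, after discarding an initial burn-in portion of the chain.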
Jafarzadeh, S Reza; Johnson, Wesley O; Gardner, Ian A
2016-03-15
The area under the receiver operating characteristic (ROC) curve (AUC) is used as a performance metric for quantitative tests. Although multiple biomarkers may be available for diagnostic or screening purposes, diagnostic accuracy is often assessed individually rather than in combination. In this paper, we consider the interesting problem of combining multiple biomarkers for use in a single diagnostic criterion with the goal of improving the diagnostic accuracy above that of an individual biomarker. The diagnostic criterion created from multiple biomarkers is based on the predictive probability of disease, conditional on given multiple biomarker outcomes. If the computed predictive probability exceeds a specified cutoff, the corresponding subject is allocated as 'diseased'. This defines a standard diagnostic criterion that has its own ROC curve, namely, the combined ROC (cROC). The AUC metric for cROC, namely, the combined AUC (cAUC), is used to compare the predictive criterion based on multiple biomarkers to one based on fewer biomarkers. A multivariate random-effects model is proposed for modeling multiple normally distributed dependent scores. Bayesian methods for estimating ROC curves and corresponding (marginal) AUCs are developed when a perfect reference standard is not available. In addition, cAUCs are computed to compare the accuracy of different combinations of biomarkers for diagnosis. The methods are evaluated using simulations and are applied to data for Johne's disease (paratuberculosis) in cattle.
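As a point of reference for the AUC metric discussed in the abstract above, the empirical AUC of a single score equals the probability that a randomly chosen diseased subject scores higher than a randomly chosen non-diseased one, with ties counted as one half. The sketch below computes that quantity directly; it assumes true disease status is known and is not the Bayesian estimator developed in the paper for the case without a perfect reference standard. The function name `empirical_auc` is hypothetical.

```python
def empirical_auc(diseased, healthy):
    """Empirical AUC: fraction of (diseased, healthy) pairs ranked correctly.

    Ties contribute one half, which makes this the normalized Mann-Whitney U statistic.
    """
    wins = 0.0
    for d in diseased:
        for h in healthy:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(diseased) * len(healthy))
```

Applied to a combined score, such as a predictive probability of disease computed from several biomarkers, the same function gives an empirical analogue of the combined AUC, which can be compared with the value obtained from each biomarker alone.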
Bayesian and maximum likelihood estimation of genetic maps
DEFF Research Database (Denmark)
York, Thomas L.; Durrett, Richard T.; Tanksley, Steven;
2005-01-01
There has recently been increased interest in the use of Markov Chain Monte Carlo (MCMC)-based Bayesian methods for estimating genetic maps. The advantage of these methods is that they can deal accurately with missing data and genotyping errors. Here we present an extension of the previous methods that makes the Bayesian method applicable to large data sets. We present an extensive simulation study examining the statistical properties of the method and comparing it with the likelihood method implemented in Mapmaker. We show that the Maximum A Posteriori (MAP) estimator of the genetic distances...
DEFF Research Database (Denmark)
Ødegård, Jørgen; Meuwissen, Theo HE; Heringstad, Bjørg
2010-01-01
... individual records exist on parents. Therefore, the aim of our study was to develop a new Gibbs sampling algorithm for a proper estimation of genetic (co)variance components within an animal threshold model framework. Methods: In the proposed algorithm, individuals are classified as either "informative" or "non-informative" with respect to genetic (co)variance components. The "non-informative" individuals are characterized by their Mendelian sampling deviations (deviance from the mid-parent mean) being completely confounded with a single residual on the underlying liability scale. For threshold models... relationship matrix, but genetic (co)variance components are inferred from the sampled breeding values and relationships between "informative" individuals (usually parents) only. The latter is analogous to a sire-dam model (in cases with no individual records on the parents). Results: When applied to simulated...
da Silva, Arlindo M.; Norris, Peter M.
2013-01-01
Part I presented a Monte Carlo Bayesian method for constraining a complex statistical model of GCM sub-gridcolumn moisture variability using high-resolution MODIS cloud data, thereby permitting large-scale model parameter estimation and cloud data assimilation. This part performs some basic testing of this new approach, verifying that it does indeed significantly reduce mean and standard deviation biases with respect to the assimilated MODIS cloud optical depth, brightness temperature and cloud top pressure, and that it also improves the simulated rotational-Raman scattering cloud optical centroid pressure (OCP) against independent (non-assimilated) retrievals from the OMI instrument. Of particular interest, the Monte Carlo method does show skill in the especially difficult case where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach allows finite jumps into regions of non-zero cloud probability. In the example provided, the method is able to restore marine stratocumulus near the Californian coast where the background state has a clear swath. This paper also examines a number of algorithmic and physical sensitivities of the new method and provides guidance for its cost-effective implementation. One obvious difficulty for the method, and other cloud data assimilation methods as well, is the lack of information content in the cloud observables on cloud vertical structure, beyond cloud top pressure and optical thickness, thus necessitating strong dependence on the background vertical moisture structure. It is found that a simple flow-dependent correlation modification due to Riishojgaard (1998) provides some help in this respect, by better honoring inversion structures in the background state.
Advances in Bayesian Modeling in Educational Research
Levy, Roy
2016-01-01
In this article, I provide a conceptually oriented overview of Bayesian approaches to statistical inference and contrast them with frequentist approaches that currently dominate conventional practice in educational research. The features and advantages of Bayesian approaches are illustrated with examples spanning several statistical modeling…