Markov Model of Wind Power Time Series Using Bayesian Inference of Transition Matrix
DEFF Research Database (Denmark)
Chen, Peiyuan; Berthelsen, Kasper Klitgaard; Bak-Jensen, Birgitte
2009-01-01
This paper proposes to use Bayesian inference of the transition matrix when developing a discrete Markov model of a wind speed/power time series, and a 95% credible interval for model verification. The Dirichlet distribution is used as a conjugate prior for the transition matrix. Three discrete Markov models are compared, i.e. the basic Markov model, the Bayesian Markov model and the birth-and-death Markov model. The proposed Bayesian Markov model shows the best accuracy in modeling the autocorrelation of the wind power time series.
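The conjugacy the abstract relies on can be sketched in a few lines: with an independent Dirichlet prior on each row of the transition matrix, the posterior for each row is again Dirichlet, with the observed transition counts added to the prior parameters. The sketch below is a hedged illustration on toy data, not the paper's actual implementation; the state count, prior strength `alpha`, and series are all assumptions.

```python
import numpy as np

def transition_counts(states, n_states):
    """Count observed transitions i -> j in a discretized time series."""
    counts = np.zeros((n_states, n_states))
    for a, b in zip(states[:-1], states[1:]):
        counts[a, b] += 1
    return counts

def sample_transition_matrices(counts, alpha=1.0, n_draws=1000, seed=None):
    """Draw transition matrices from the row-wise posterior: a
    Dirichlet(alpha) prior on each row is conjugate to the multinomial
    likelihood, so row i has posterior Dirichlet(alpha + counts[i])."""
    rng = np.random.default_rng(seed)
    return np.array([[rng.dirichlet(alpha + row) for row in counts]
                     for _ in range(n_draws)])

# Toy 3-state series standing in for a discretized wind power signal
rng = np.random.default_rng(0)
series = rng.integers(0, 3, size=500)
counts = transition_counts(series, 3)
draws = sample_transition_matrices(counts, n_draws=200, seed=1)
lo, hi = np.percentile(draws, [2.5, 97.5], axis=0)  # 95% credible intervals
```

Row-wise conjugacy is what makes the credible intervals cheap here: this part of the model needs no MCMC at all.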
Non-parametric Bayesian inference for inhomogeneous Markov point processes
DEFF Research Database (Denmark)
Berthelsen, Kasper Klitgaard; Møller, Jesper
With reference to a specific data set, we consider how to perform a flexible non-parametric Bayesian analysis of an inhomogeneous point pattern modelled by a Markov point process, with a location dependent first order term and pairwise interaction only. A priori we assume that the first order term...
Strelioff, Christopher C; Crutchfield, James P; Hübler, Alfred W
2007-07-01
Markov chains are a natural and well understood tool for describing one-dimensional patterns in time or space. We show how to infer kth order Markov chains, for arbitrary k, from finite data by applying Bayesian methods to both parameter estimation and model-order selection. Extending existing results for multinomial models of discrete data, we connect inference to statistical mechanics through information-theoretic (type theory) techniques. We establish a direct relationship between Bayesian evidence and the partition function which allows for straightforward calculation of the expectation and variance of the conditional relative entropy and the source entropy rate. Finally, we introduce a method that uses finite data-size scaling with model-order comparison to infer the structure of out-of-class processes.
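The evidence mentioned above is available in closed form for Markov chains with Dirichlet priors: integrating out the transition probabilities leaves one Beta-function ratio per context. A minimal sketch of model-order selection by evidence comparison (not the authors' code; a symmetric Dirichlet(alpha) prior per context is assumed):

```python
import numpy as np
from math import lgamma
from collections import Counter

def log_evidence(seq, k, n_symbols, alpha=1.0):
    """Log marginal likelihood of a k-th order Markov chain: symmetric
    Dirichlet(alpha) prior on each context's next-symbol distribution,
    with the transition probabilities integrated out analytically."""
    grams = Counter(tuple(seq[i:i + k + 1]) for i in range(len(seq) - k))
    contexts = {}
    for word, n in grams.items():
        contexts.setdefault(word[:k], Counter())[word[k]] = n
    a0 = alpha * n_symbols
    logz = 0.0
    for nxt in contexts.values():
        n = sum(nxt.values())
        logz += lgamma(a0) - lgamma(a0 + n)
        for j in range(n_symbols):
            logz += lgamma(alpha + nxt.get(j, 0)) - lgamma(alpha)
    return logz

# Data from a true first-order chain: the evidence should pick k = 1
rng = np.random.default_rng(0)
P = np.array([[0.9, 0.1], [0.2, 0.8]])
seq = [0]
for _ in range(2000):
    seq.append(int(rng.choice(2, p=P[seq[-1]])))
scores = {k: log_evidence(seq, k, 2) for k in (0, 1, 2)}
```

The evidence penalizes the extra contexts of an over-large k automatically, which is the Occam effect the order-selection procedure exploits.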
O'Neill, Philip D
2002-01-01
Recent Bayesian methods for the analysis of infectious disease outbreak data using stochastic epidemic models are reviewed. These methods rely on Markov chain Monte Carlo methods. Both temporal and non-temporal data are considered. The methods are illustrated with a number of examples featuring different models and datasets.
Murakami, Yohei; Takada, Shoji
2013-01-01
When model parameters in systems biology are not available from experiments, they need to be inferred so that the resulting simulation reproduces the experimentally known phenomena. For the purpose, Bayesian statistics with Markov chain Monte Carlo (MCMC) is a useful method. Conventional MCMC needs likelihood to evaluate a posterior distribution of acceptable parameters, while the approximate Bayesian computation (ABC) MCMC evaluates posterior distribution with use of qualitative fitness measure. However, none of these algorithms can deal with mixture of quantitative, i.e., likelihood, and qualitative fitness measures simultaneously. Here, to deal with this mixture, we formulated Bayesian formula for hybrid fitness measures (HFM). Then we implemented it to MCMC (MCMC-HFM). We tested MCMC-HFM first for a kinetic toy model with a positive feedback. Inferring kinetic parameters mainly related to the positive feedback, we found that MCMC-HFM reliably infer them using both qualitative and quantitative fitness measures. Then, we applied the MCMC-HFM to an apoptosis signal transduction network previously proposed. For kinetic parameters related to implicit positive feedbacks, which are important for bistability and irreversibility of the output, the MCMC-HFM reliably inferred these kinetic parameters. In particular, some kinetic parameters that have experimental estimates were inferred without using these data and the results were consistent with experiments. Moreover, for some parameters, the mixed use of quantitative and qualitative fitness measures narrowed down the acceptable range of parameters.
Bayesian Inference for LISA Pathfinder using Markov Chain Monte Carlo Methods
Ferraioli, Luigi; Plagnol, Eric
2012-01-01
We present a parameter estimation procedure based on a Bayesian framework by applying a Markov Chain Monte Carlo algorithm to the calibration of the dynamical parameters of a space-based gravitational wave detector. The method is based on the Metropolis-Hastings algorithm and a two-stage annealing treatment in order to ensure an effective exploration of the parameter space at the beginning of the chain. We compare two versions of the algorithm with an application to a LISA Pathfinder data analysis problem. The two algorithms share the same heating strategy, but one moves in coordinate directions using proposals from a multivariate Gaussian distribution, while the other uses the natural logarithm of some parameters and proposes jumps in the eigen-space of the Fisher Information matrix. The algorithm proposing jumps in the eigen-space of the Fisher Information matrix demonstrates a higher acceptance rate and a slightly better convergence towards the equilibrium parameter distributions in the application to...
Bayesian analysis of Markov point processes
DEFF Research Database (Denmark)
Berthelsen, Kasper Klitgaard; Møller, Jesper
2006-01-01
Recently Møller, Pettitt, Berthelsen and Reeves introduced a new MCMC methodology for drawing samples from a posterior distribution when the likelihood function is only specified up to a normalising constant. We illustrate the method in the setting of Bayesian inference for Markov point processes...
Directory of Open Access Journals (Sweden)
McNally, Kevin; Cotton, Richard; Cocker, John; Jones, Kate; Bartels, Mike; Rick, David; Price, Paul; Loizou, George
2012-01-01
Full Text Available There are numerous biomonitoring programs, both recent and ongoing, to evaluate environmental exposure of humans to chemicals. Due to the lack of exposure and kinetic data, the correlation of biomarker levels with exposure concentrations leads to difficulty in utilizing biomonitoring data for biological guidance values. Exposure reconstruction or reverse dosimetry is the retrospective interpretation of external exposure consistent with biomonitoring data. We investigated the integration of physiologically based pharmacokinetic modelling, global sensitivity analysis, Bayesian inference, and Markov chain Monte Carlo simulation to obtain a population estimate of inhalation exposure to m-xylene. We used exhaled breath and venous blood m-xylene and urinary 3-methylhippuric acid measurements from a controlled human volunteer study in order to evaluate the ability of our computational framework to predict known inhalation exposures. We also investigated the importance of model structure and dimensionality with respect to its ability to reconstruct exposure.
Bayesian Inference with Optimal Maps
Moselhy, Tarek A El
2011-01-01
We present a new approach to Bayesian inference that entirely avoids Markov chain simulation, by constructing a map that pushes forward the prior measure to the posterior measure. Existence and uniqueness of a suitable measure-preserving map is established by formulating the problem in the context of optimal transport theory. We discuss various means of explicitly parameterizing the map and computing it efficiently through solution of an optimization problem, exploiting gradient information from the forward model when possible. The resulting algorithm overcomes many of the computational bottlenecks associated with Markov chain Monte Carlo. Advantages of a map-based representation of the posterior include analytical expressions for posterior moments and the ability to generate arbitrary numbers of independent posterior samples without additional likelihood evaluations or forward solves. The optimization approach also provides clear convergence criteria for posterior approximation and facilitates model selectio...
Probabilistic Inferences in Bayesian Networks
Ding, Jianguo
2010-01-01
This chapter summarizes the popular inference methods in Bayesian networks. The results demonstrate that evidence can be propagated across a Bayesian network along any link, whether in a forward, backward or intercausal style. The belief updating of Bayesian networks can be obtained by various available inference techniques. Theoretically, exact inference in Bayesian networks is feasible and manageable. However, the computation and inference are NP-hard. That means, in applications, in ...
Bayesian posterior distributions without Markov chains.
Cole, Stephen R; Chu, Haitao; Greenland, Sander; Hamra, Ghassan; Richardson, David B
2012-03-01
Bayesian posterior parameter distributions are often simulated using Markov chain Monte Carlo (MCMC) methods. However, MCMC methods are not always necessary and do not help the uninitiated understand Bayesian inference. As a bridge to understanding Bayesian inference, the authors illustrate a transparent rejection sampling method. In example 1, they illustrate rejection sampling using 36 cases and 198 controls from a case-control study (1976-1983) assessing the relation between residential exposure to magnetic fields and the development of childhood cancer. Results from rejection sampling (odds ratio (OR) = 1.69, 95% posterior interval (PI): 0.57, 5.00) were similar to MCMC results (OR = 1.69, 95% PI: 0.58, 4.95) and approximations from data-augmentation priors (OR = 1.74, 95% PI: 0.60, 5.06). In example 2, the authors apply rejection sampling to a cohort study of 315 human immunodeficiency virus seroconverters (1984-1998) to assess the relation between viral load after infection and 5-year incidence of acquired immunodeficiency syndrome, adjusting for (continuous) age at seroconversion and race. In this more complex example, rejection sampling required a notably longer run time than MCMC sampling but remained feasible and again yielded similar results. The transparency of the proposed approach comes at a price of being less broadly applicable than MCMC.
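The rejection-sampling idea the authors use as a teaching bridge can be shown on an even simpler binomial example than their case-control study: draw a parameter from the prior and accept it with probability proportional to the likelihood. The sketch below is illustrative only (uniform prior, 7 successes in 20 trials; not the authors' analysis), chosen so the exact Beta posterior is known for comparison.

```python
import numpy as np

def rejection_sample(log_lik, prior_draw, log_lik_max, n, seed=None):
    """Draw from the posterior by accepting prior draws with
    probability exp(log_lik(theta) - log_lik_max)."""
    rng = np.random.default_rng(seed)
    out = []
    while len(out) < n:
        theta = prior_draw(rng)
        if np.log(rng.uniform()) < log_lik(theta) - log_lik_max:
            out.append(theta)
    return np.array(out)

# 7 successes in 20 trials, uniform prior on the success probability
k, m = 7, 20
log_lik = lambda t: k * np.log(t) + (m - k) * np.log(1.0 - t)
post = rejection_sample(log_lik, lambda r: r.uniform(),
                        log_lik(k / m), 2000, seed=0)
# The exact posterior is Beta(8, 14), with mean 8/22
```

The transparency the paper argues for is visible here: every accepted draw is an exact, independent posterior sample, at the cost of an acceptance rate that collapses as models grow.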
Computationally efficient Bayesian inference for inverse problems.
Energy Technology Data Exchange (ETDEWEB)
Marzouk, Youssef M.; Najm, Habib N.; Rahn, Larry A.
2007-10-01
Bayesian statistics provides a foundation for inference from noisy and incomplete data, a natural mechanism for regularization in the form of prior information, and a quantitative assessment of uncertainty in the inferred results. Inverse problems - representing indirect estimation of model parameters, inputs, or structural components - can be fruitfully cast in this framework. Complex and computationally intensive forward models arising in physical applications, however, can render a Bayesian approach prohibitive. This difficulty is compounded by high-dimensional model spaces, as when the unknown is a spatiotemporal field. We present new algorithmic developments for Bayesian inference in this context, showing strong connections with the forward propagation of uncertainty. In particular, we introduce a stochastic spectral formulation that dramatically accelerates the Bayesian solution of inverse problems via rapid evaluation of a surrogate posterior. We also explore dimensionality reduction for the inference of spatiotemporal fields, using truncated spectral representations of Gaussian process priors. These new approaches are demonstrated on scalar transport problems arising in contaminant source inversion and in the inference of inhomogeneous material or transport properties. We also present a Bayesian framework for parameter estimation in stochastic models, where intrinsic stochasticity may be intermingled with observational noise. Evaluation of a likelihood function may not be analytically tractable in these cases, and thus several alternative Markov chain Monte Carlo (MCMC) schemes, operating on the product space of the observations and the parameters, are introduced.
Interactive Instruction in Bayesian Inference
DEFF Research Database (Denmark)
Khan, Azam; Breslav, Simon; Hornbæk, Kasper
2017-01-01
An instructional approach is presented to improve human performance in solving Bayesian inference problems. Starting from the original text of the classic Mammography Problem, the textual expression is modified and visualizations are added according to Mayer's principles of instruction... that an instructional approach to improving human performance in Bayesian inference is a promising direction...
Bayesian inference for OPC modeling
Burbine, Andrew; Sturtevant, John; Fryer, David; Smith, Bruce W.
2016-03-01
The use of optical proximity correction (OPC) demands increasingly accurate models of the photolithographic process. Model building and inference techniques in the data science community have seen great strides in the past two decades which make better use of available information. This paper aims to demonstrate the predictive power of Bayesian inference as a method for parameter selection in lithographic models by quantifying the uncertainty associated with model inputs and wafer data. Specifically, the method combines the model builder's prior information about each modelling assumption with the maximization of each observation's likelihood as a Student's t-distributed random variable. Through the use of a Markov chain Monte Carlo (MCMC) algorithm, a model's parameter space is explored to find the most credible parameter values. During parameter exploration, the parameters' posterior distributions are generated by applying Bayes' rule, using a likelihood function and the a priori knowledge supplied. The MCMC algorithm used, an affine invariant ensemble sampler (AIES), is implemented by initializing many walkers which semi-independently explore the space. The convergence of these walkers to global maxima of the likelihood volume determines the parameter values' highest density intervals (HDI) to reveal champion models. We show that this method of parameter selection provides insights into the data that traditional methods do not and outline continued experiments to vet the method.
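A hedged sketch of the general recipe: a plain random-walk Metropolis sampler stands in for the affine invariant ensemble sampler the paper uses, and a toy linear model stands in for the lithographic model. Only the Student's t likelihood is taken from the abstract; the noise scale, degrees of freedom, and flat priors are assumptions.

```python
import numpy as np

def log_post(theta, x, y, nu=4.0, sigma=0.1):
    """Log-posterior for a line y = a + b*x with Student-t residuals
    (flat priors; the heavy tails down-weight outlying measurements)."""
    a, b = theta
    r = (y - (a + b * x)) / sigma
    return -0.5 * (nu + 1) * np.sum(np.log1p(r ** 2 / nu))

def metropolis(target, theta0, steps, scale, seed=None, **kw):
    """Plain random-walk Metropolis-Hastings."""
    rng = np.random.default_rng(seed)
    chain = [np.asarray(theta0, float)]
    lp = target(chain[-1], **kw)
    for _ in range(steps):
        prop = chain[-1] + scale * rng.standard_normal(len(chain[-1]))
        lp_prop = target(prop, **kw)
        if np.log(rng.uniform()) < lp_prop - lp:
            chain.append(prop)
            lp = lp_prop
        else:
            chain.append(chain[-1])
    return np.array(chain)

# Synthetic "measurements": a line plus heavy-tailed noise
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = 1.0 + 2.0 * x + 0.1 * rng.standard_t(4, size=50)
chain = metropolis(log_post, [0.0, 0.0], 5000, 0.05, seed=1, x=x, y=y)
a_hat, b_hat = chain[2500:].mean(axis=0)  # discard burn-in
```

The posterior samples after burn-in give the highest density intervals directly, which is how the abstract's "champion models" are read off.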
Bayesian inference in geomagnetism
Backus, George E.
1988-01-01
The inverse problem in empirical geomagnetic modeling is investigated, with critical examination of recently published studies. Particular attention is given to the use of Bayesian inference (BI) to select the damping parameter lambda in the uniqueness portion of the inverse problem. The mathematical bases of BI and stochastic inversion are explored, with consideration of bound-softening problems and resolution in linear Gaussian BI. The problem of estimating the radial magnetic field B(r) at the earth core-mantle boundary from surface and satellite measurements is then analyzed in detail, with specific attention to the selection of lambda in the studies of Gubbins (1983) and Gubbins and Bloxham (1985). It is argued that the selection method is inappropriate and leads to lambda values much larger than those that would result if a reasonable bound on the heat flow at the CMB were assumed.
Markov chain Monte Carlo simulation for Bayesian Hidden Markov Models
Chan, Lay Guat; Ibrahim, Adriana Irawati Nur Binti
2016-10-01
A hidden Markov model (HMM) is a mixture model which has a Markov chain with finite states as its mixing distribution. HMMs have been applied to a variety of fields, such as speech and face recognition. The main purpose of this study is to investigate the Bayesian approach to HMMs. Using this approach, we can simulate from the parameters' posterior distribution using some Markov chain Monte Carlo (MCMC) sampling methods. HMMs seem to be useful, but there are some limitations. Therefore, by using the Mixture of Dirichlet processes Hidden Markov Model (MDPHMM) based on Yau et al. (2011), we hope to overcome these limitations. We shall conduct a simulation study using MCMC methods to investigate the performance of this model.
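Whatever MCMC scheme is wrapped around an HMM, the likelihood evaluated inside it is typically computed with the forward algorithm. A minimal normalized version is sketched below; the two-state matrices are illustrative toys, not the study's model.

```python
import numpy as np

def forward_filter(obs, pi, A, B):
    """Normalized forward algorithm: returns the filtered state
    probabilities at each step and the log-likelihood of the sequence."""
    alpha = pi * B[:, obs[0]]
    ll = np.log(alpha.sum())
    alpha = alpha / alpha.sum()
    out = [alpha]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]   # predict, then weight by emission
        ll += np.log(alpha.sum())
        alpha = alpha / alpha.sum()
        out.append(alpha)
    return np.array(out), ll

A = np.array([[0.95, 0.05], [0.10, 0.90]])  # transition matrix
B = np.array([[0.8, 0.2], [0.3, 0.7]])      # emission probabilities
pi = np.array([0.5, 0.5])                   # initial distribution
obs = [0, 0, 1, 1, 1, 0]
filt, ll = forward_filter(obs, pi, A, B)
```

Normalizing at every step keeps the recursion numerically stable while accumulating the exact log-likelihood, which is what an MCMC acceptance ratio needs.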
Inference in hybrid Bayesian networks
DEFF Research Database (Denmark)
Langseth, Helge; Nielsen, Thomas Dyhre; Rumí, Rafael
2009-01-01
Since the 1980s, Bayesian Networks (BNs) have become increasingly popular for building statistical models of complex systems. This is particularly true for Boolean systems, where BNs often prove to be a more efficient modelling framework than traditional reliability techniques (like fault trees... decade's research on inference in hybrid Bayesian networks. The discussions are linked to an example model for estimating human reliability...
Bayesian Inference: with ecological applications
Link, William A.; Barker, Richard J.
2010-01-01
This text provides a mathematically rigorous yet accessible and engaging introduction to Bayesian inference, with relevant examples that will be of interest to biologists working in the fields of ecology, wildlife management and environmental studies, as well as students in advanced undergraduate statistics. This text opens the door to Bayesian inference, taking advantage of modern computational efficiencies and easily accessible software to evaluate complex hierarchical models.
Bayesian structural inference for hidden processes.
Strelioff, Christopher C; Crutchfield, James P
2014-04-01
We introduce a Bayesian approach to discovering patterns in structurally complex processes. The proposed method of Bayesian structural inference (BSI) relies on a set of candidate unifilar hidden Markov model (uHMM) topologies for inference of process structure from a data series. We employ a recently developed exact enumeration of topological ε-machines. (A sequel then removes the topological restriction.) This subset of the uHMM topologies has the added benefit that inferred models are guaranteed to be ε-machines, irrespective of estimated transition probabilities. Properties of ε-machines and uHMMs allow for the derivation of analytic expressions for estimating transition probabilities, inferring start states, and comparing the posterior probability of candidate model topologies, despite process internal structure being only indirectly present in data. We demonstrate BSI's effectiveness in estimating a process's randomness, as reflected by the Shannon entropy rate, and its structure, as quantified by the statistical complexity. We also compare using the posterior distribution over candidate models and the single, maximum a posteriori model for point estimation and show that the former more accurately reflects uncertainty in estimated values. We apply BSI to in-class examples of finite- and infinite-order Markov processes, as well as to an out-of-class, infinite-state hidden process.
Bayesian inference with ecological applications
Link, William A
2009-01-01
This text is written to provide a mathematically sound but accessible and engaging introduction to Bayesian inference specifically for environmental scientists, ecologists and wildlife biologists. It emphasizes the power and usefulness of Bayesian methods in an ecological context. The advent of fast personal computers and easily available software has simplified the use of Bayesian and hierarchical models. One obstacle remains for ecologists and wildlife biologists, namely the near absence of Bayesian texts written specifically for them. The book includes many relevant examples, is supported by software and examples on a companion website and will become an essential grounding in this approach for students and research ecologists. Engagingly written text specifically designed to demystify a complex subject. Examples drawn from ecology and wildlife research. An essential grounding for graduate and research ecologists in the increasingly prevalent Bayesian approach to inference. Companion website with analyt...
Kim, Daesang
2016-01-06
A new Bayesian inference method has been developed and applied to Furan shock tube experimental data for efficient statistical inference of the Arrhenius parameters of two OH radical consumption reactions. The collected experimental data, which consist of time series signals of OH radical concentrations from 14 shock tube experiments, may require several days of MCMC computation even with the support of a fast surrogate of the combustion simulation model, while the new method reduces this to several hours by splitting the process into two MCMC steps: a first inference of the rate constants and a second inference of the Arrhenius parameters. Each step has a low-dimensional parameter space, and the second step does not need executions of the combustion simulation. Furthermore, the new approach has more flexibility in choosing the ranges of the inference parameters, and the higher speed and flexibility enable more accurate inferences and analyses of the propagation of errors in the measured temperatures and of the alignment of the experimental time to the inference results.
Polynomial Chaos Surrogates for Bayesian Inference
Le Maitre, Olivier
2016-01-06
Bayesian inference is a popular probabilistic method for solving inverse problems, such as the identification of a field parameter in a PDE model. The inference relies on Bayes' rule to update the prior density of the sought field from observations and to derive its posterior distribution. In most cases the posterior distribution has no explicit form and has to be sampled, for instance using a Markov chain Monte Carlo method. In practice the prior field parameter is decomposed and truncated (e.g. by means of a Karhunen-Loève decomposition) to recast the inference problem into the inference of a finite number of coordinates. Although proved effective in many situations, Bayesian inference as sketched above faces several difficulties requiring improvements. First, sampling the posterior can be an extremely costly task, as it requires multiple resolutions of the PDE model for different values of the field parameter. Second, when the observations are not very informative, the inferred parameter field can depend strongly on its prior, which can be somewhat arbitrary. These issues have motivated the introduction of reduced models or surrogates for the (approximate) determination of the parametrized PDE solution and of hyperparameters in the description of the prior field. Our contribution focuses on recent developments in these two directions: the acceleration of posterior sampling by means of polynomial chaos expansions, and the efficient treatment of parametrized covariance functions for the prior field. We also discuss the possibility of making such an approach adaptive to further improve its efficiency.
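The surrogate idea can be illustrated with an ordinary least-squares polynomial standing in for a proper polynomial chaos expansion: fit once on samples from the prior, then evaluate the cheap surrogate instead of the expensive model inside the sampler. The "forward model", prior range, and degree below are all illustrative assumptions, not taken from the talk.

```python
import numpy as np

def build_surrogate(model, prior_draws, degree=7):
    """Least-squares polynomial surrogate of a scalar forward model,
    fitted once on draws from the prior (a crude stand-in for a
    polynomial chaos expansion)."""
    theta = np.asarray(prior_draws)
    return np.polynomial.Polynomial.fit(theta, model(theta), degree)

# Illustrative "expensive" forward model, uniform prior on [-3, 3]
model = lambda t: np.sin(t) + 0.5 * t
rng = np.random.default_rng(0)
surr = build_surrogate(model, rng.uniform(-3.0, 3.0, 200))

# Surrogate accuracy over the prior support
grid = np.linspace(-3.0, 3.0, 101)
max_err = np.max(np.abs(model(grid) - surr(grid)))
```

Once `surr` is accurate over the prior support, every likelihood evaluation in an MCMC loop calls the polynomial instead of the PDE solver, which is where the claimed acceleration comes from.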
Perception, illusions and Bayesian inference.
Nour, Matthew M; Nour, Joseph M
2015-01-01
Descriptive psychopathology makes a distinction between veridical perception and illusory perception. In both cases a perception is tied to a sensory stimulus, but in illusions the perception is of a false object. This article re-examines this distinction in light of new work in theoretical and computational neurobiology, which views all perception as a form of Bayesian statistical inference that combines sensory signals with prior expectations. Bayesian perceptual inference can solve the 'inverse optics' problem of veridical perception and provides a biologically plausible account of a number of illusory phenomena, suggesting that veridical and illusory perceptions are generated by precisely the same inferential mechanisms.
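The standard textbook instance of such perceptual inference is Gaussian prior-likelihood combination: the percept is a precision-weighted average of the prior expectation and the sensory signal, so a noisy signal pulls the percept toward the prior, one common way of modelling illusory bias. A minimal sketch (not from the article; the numbers are arbitrary):

```python
def combine(mu_prior, var_prior, mu_sense, var_sense):
    """Posterior for Gaussian prior x Gaussian likelihood:
    a precision-weighted average, with precisions adding."""
    w = var_sense / (var_prior + var_sense)   # weight on the prior
    mu = w * mu_prior + (1 - w) * mu_sense
    var = 1.0 / (1.0 / var_prior + 1.0 / var_sense)
    return mu, var

# Strong prior, noisy signal: the percept is drawn toward expectation
mu, var = combine(mu_prior=0.0, var_prior=1.0, mu_sense=2.0, var_sense=4.0)
```

Here the signal at 2.0 is perceived near 0.4: the more uncertain the sensory evidence, the more the prior dominates, which is the inferential mechanism the article invokes for illusions.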
Probability biases as Bayesian inference
Directory of Open Access Journals (Sweden)
André C. R. Martins
2006-11-01
Full Text Available In this article, I will show how several observed biases in human probabilistic reasoning can be partially explained as good heuristics for making inferences in an environment where probabilities have uncertainties associated to them. Previous results show that the weight functions and the observed violations of coalescing and stochastic dominance can be understood from a Bayesian point of view. We will review those results and see that Bayesian methods should also be used as part of the explanation behind other known biases. That means that, although the observed errors are still errors, they can be understood as adaptations to the solution of real life problems. Heuristics that allow fast evaluations and mimic a Bayesian inference would be an evolutionary advantage, since they would give us an efficient way of making decisions. In that sense, it should be no surprise that humans reason with probability as it has been observed.
Bayesian inference for Hawkes processes
DEFF Research Database (Denmark)
Rasmussen, Jakob Gulddahl
2013-01-01
The Hawkes process is a practically and theoretically important class of point processes, but parameter-estimation for such a process can pose various problems. In this paper we explore and compare two approaches to Bayesian inference. The first approach is based on the so-called conditional...
Markov chain Monte Carlo inference for Markov jump processes via the linear noise approximation.
Stathopoulos, Vassilios; Girolami, Mark A
2013-02-13
Bayesian analysis for Markov jump processes (MJPs) is a non-trivial and challenging problem. Although exact inference is theoretically possible, it is computationally demanding, thus its applicability is limited to a small class of problems. In this paper, we describe the application of Riemann manifold Markov chain Monte Carlo (MCMC) methods using an approximation to the likelihood of the MJP that is valid when the system modelled is near its thermodynamic limit. The proposed approach is both statistically and computationally efficient whereas the convergence rate and mixing of the chains allow for fast MCMC inference. The methodology is evaluated using numerical simulations on two problems from chemical kinetics and one from systems biology.
An Integrated Procedure for Bayesian Reliability Inference Using MCMC
Directory of Open Access Journals (Sweden)
Jing Lin
2014-01-01
Full Text Available The recent proliferation of Markov chain Monte Carlo (MCMC) approaches has led to the use of Bayesian inference in a wide variety of fields. To facilitate MCMC applications, this paper proposes an integrated procedure for Bayesian inference using MCMC methods, from a reliability perspective. The goal is to build a framework for related academic research and engineering applications to implement modern computational-based Bayesian approaches, especially for reliability inferences. The procedure developed here is a continuous improvement process with four stages (Plan, Do, Study, and Action) and 11 steps, including: (1) data preparation; (2) prior inspection and integration; (3) prior selection; (4) model selection; (5) posterior sampling; (6) MCMC convergence diagnostic; (7) Monte Carlo error diagnostic; (8) model improvement; (9) model comparison; (10) inference making; (11) data updating and inference improvement. The paper illustrates the proposed procedure using a case study.
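The MCMC convergence diagnostic step in such a procedure is commonly implemented with the Gelman-Rubin potential scale reduction factor, which compares between-chain and within-chain variance. A minimal single-parameter version (an illustrative standard diagnostic, not the paper's code):

```python
import numpy as np

def gelman_rubin(chains):
    """Potential scale reduction factor R-hat for m chains of length n
    (one scalar parameter); values near 1 suggest convergence."""
    chains = np.asarray(chains, float)
    m, n = chains.shape
    means = chains.mean(axis=1)
    B = n * means.var(ddof=1)               # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()   # within-chain variance
    var_hat = (n - 1) / n * W + B / n
    return np.sqrt(var_hat / W)

rng = np.random.default_rng(0)
mixed = rng.standard_normal((4, 1000))               # four well-mixed chains
stuck = mixed + np.array([[0.0], [0.0], [0.0], [5.0]])  # one chain far away
```

Running the diagnostic on `mixed` gives a value near 1, while the offset chain in `stuck` inflates the between-chain variance and flags non-convergence.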
Bayesian Estimation and Inference Using Stochastic Electronics.
Thakur, Chetan Singh; Afshar, Saeed; Wang, Runchun M; Hamilton, Tara J; Tapson, Jonathan; van Schaik, André
2016-01-01
In this paper, we present the implementation of two types of Bayesian inference problems to demonstrate the potential of building probabilistic algorithms in hardware using a single set of building blocks, with the ability to perform these computations in real time. The first implementation, referred to as the BEAST (Bayesian Estimation and Stochastic Tracker), demonstrates a simple problem where an observer uses an underlying Hidden Markov Model (HMM) to track a target in one dimension. In this implementation, sensors make noisy observations of the target position at discrete time steps. The tracker learns the transition model for target movement, and the observation model for the noisy sensors, and uses these to estimate the target position by solving the Bayesian recursive equation online. We show the tracking performance of the system and demonstrate how it can learn the observation model, the transition model, and the external distractor (noise) probability interfering with the observations. In the second implementation, referred to as the Bayesian INference in DAG (BIND), we show how inference can be performed in a Directed Acyclic Graph (DAG) using stochastic circuits. We show how these building blocks can be easily implemented using simple digital logic gates. An advantage of the stochastic electronic implementation is that it is robust to certain types of noise, which may become an issue in integrated circuit (IC) technology with feature sizes on the order of tens of nanometers due to their low noise margin, the effect of high-energy cosmic rays and the low supply voltage. In our framework, the flipping of random individual bits would not affect the system performance because information is encoded in a bit stream.
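The stochastic-computing primitive behind such circuits is easy to simulate in software: encode each probability as a Bernoulli bit stream, and an AND gate applied to independent streams multiplies the encoded probabilities. A sketch of that single primitive (the hardware uses logic gates on physical bit streams, not NumPy arrays):

```python
import numpy as np

def to_stream(p, n, rng):
    """Encode probability p as a length-n Bernoulli bit stream."""
    return rng.uniform(size=n) < p

rng = np.random.default_rng(0)
n = 100_000
a = to_stream(0.8, n, rng)
b = to_stream(0.5, n, rng)
product = (a & b).mean()  # AND gate multiplies independent probabilities
```

Because the value lives in the stream's average rather than in any single bit, flipping a few random bits barely perturbs the result, which is the noise robustness the abstract highlights.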
Bayesian Inference for Radio Observations
Lochner, Michelle; Zwart, Jonathan T L; Smirnov, Oleg; Bassett, Bruce A; Oozeer, Nadeem; Kunz, Martin
2015-01-01
(Abridged) New telescopes like the Square Kilometre Array (SKA) will push into a new sensitivity regime and expose systematics, such as direction-dependent effects, that could previously be ignored. Current methods for handling such systematics rely on alternating best estimates of instrumental calibration and models of the underlying sky, which can lead to inaccurate uncertainty estimates and biased results because such methods ignore any correlations between parameters. These deconvolution algorithms produce a single image that is assumed to be a true representation of the sky, when in fact it is just one realisation of an infinite ensemble of images compatible with the noise in the data. In contrast, here we report a Bayesian formalism that simultaneously infers both systematics and science. Our technique, Bayesian Inference for Radio Observations (BIRO), determines all parameters directly from the raw data, bypassing image-making entirely, by sampling from the joint posterior probability distribution. Thi...
Bayesian inference on proportional elections.
Directory of Open Access Journals (Sweden)
Gabriel Hideki Vatanabe Brunello
Polls for majoritarian voting systems usually show estimates of the percentage of votes for each candidate. However, proportional voting systems do not guarantee that the candidate with the highest percentage of votes will be elected. Thus, traditional methods used in majoritarian elections cannot be applied to proportional elections. In this context, the purpose of this paper was to perform Bayesian inference on proportional elections considering the Brazilian system of seat distribution. More specifically, a methodology was developed to estimate the probability that a given party will gain representation in the Chamber of Deputies. Inferences were made in a Bayesian framework using Monte Carlo simulation, and the developed methodology was applied to data from the 2010 Brazilian elections for Members of the Legislative Assembly and the Federal Chamber of Deputies. A performance rate was also presented to evaluate the efficiency of the methodology. Calculations and simulations were carried out using the free R statistical software.
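As a rough illustration of the kind of Monte Carlo computation involved, the sketch below estimates the probability that a small party wins at least one seat under a D'Hondt highest-averages allocation. This is a simplified stand-in for the actual Brazilian seat-distribution rules (which use an electoral quotient), and all vote shares are invented:

```python
import random

def dhondt(votes, seats):
    """Allocate seats by the D'Hondt highest-averages method."""
    alloc = {p: 0 for p in votes}
    for _ in range(seats):
        # next seat goes to the party with the highest quotient v / (s + 1)
        best = max(votes, key=lambda p: votes[p] / (alloc[p] + 1))
        alloc[best] += 1
    return alloc

def seat_probability(vote_shares, n_voters, seats, party, trials=200, seed=1):
    """Monte Carlo estimate of P(party wins >= 1 seat), resampling each
    poll as n_voters independent ballots drawn from the vote shares."""
    rng = random.Random(seed)
    parties = list(vote_shares)
    hits = 0
    for _ in range(trials):
        counts = {p: 0 for p in parties}
        for _ in range(n_voters):
            r, acc, choice = rng.random(), 0.0, parties[-1]
            for p in parties:
                acc += vote_shares[p]
                if r < acc:
                    choice = p
                    break
            counts[choice] += 1
        if dhondt(counts, seats)[party] >= 1:
            hits += 1
    return hits / trials

p_c = seat_probability({'A': 0.5, 'B': 0.3, 'C': 0.2},
                       n_voters=200, seats=5, party='C', trials=200)
```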
Bayesian Inference and Online Learning in Poisson Neuronal Networks.
Huang, Yanping; Rao, Rajesh P N
2016-08-01
Motivated by the growing evidence for Bayesian computation in the brain, we show how a two-layer recurrent network of Poisson neurons can perform both approximate Bayesian inference and learning for any hidden Markov model. The lower-layer sensory neurons receive noisy measurements of hidden world states. The higher-layer neurons infer a posterior distribution over world states via Bayesian inference from inputs generated by sensory neurons. We demonstrate how such a neuronal network with synaptic plasticity can implement a form of Bayesian inference similar to Monte Carlo methods such as particle filtering. Each spike in a higher-layer neuron represents a sample of a particular hidden world state. The spiking activity across the neural population approximates the posterior distribution over hidden states. In this model, variability in spiking is regarded not as a nuisance but as an integral feature that provides the variability necessary for sampling during inference. We demonstrate how the network can learn the likelihood model, as well as the transition probabilities underlying the dynamics, using a Hebbian learning rule. We present results illustrating the ability of the network to perform inference and learning for arbitrary hidden Markov models.
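The sampling view of inference described here, where each spike represents a sample of a hidden world state, is closely related to a standard particle filter. A minimal software sketch of that idea for a two-state hidden Markov model (with made-up transition and emission probabilities, not the paper's neural implementation) might look like:

```python
import random

def particle_filter(obs, T, E, init, n_particles=1000, seed=0):
    """Sampling-based filtering for a discrete HMM: each particle is one
    hypothesised hidden state, weighted by the emission likelihood."""
    rng = random.Random(seed)
    states = list(range(len(init)))
    particles = rng.choices(states, weights=init, k=n_particles)
    for y in obs:
        # propagate every particle through the transition model
        particles = [rng.choices(states, weights=T[s])[0] for s in particles]
        # weight by the observation likelihood and resample
        weights = [E[s][y] for s in particles]
        particles = rng.choices(particles, weights=weights, k=n_particles)
    # the empirical particle distribution approximates the posterior
    return [particles.count(s) / n_particles for s in states]

# two hidden states with sticky transitions and informative observations
T = [[0.9, 0.1], [0.1, 0.9]]
E = [[0.9, 0.1], [0.1, 0.9]]
post = particle_filter(obs=[0, 0, 0], T=T, E=E, init=[0.5, 0.5])
```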
Bayesian Methods for Hackers: Probabilistic Programming and Bayesian Inference
Davidson-Pilon, Cameron
2016-01-01
Bayesian methods of inference are deeply natural and extremely powerful. However, most discussions of Bayesian inference rely on intensely complex mathematical analyses and artificial examples, making the subject inaccessible to anyone without a strong mathematical background. Now, though, Cameron Davidson-Pilon introduces Bayesian inference from a computational perspective, bridging theory to practice and freeing you to get results using computing power. Bayesian Methods for Hackers illuminates Bayesian inference through probabilistic programming with the powerful PyMC library and the closely related Python tools NumPy, SciPy, and Matplotlib. Using this approach, you can reach effective solutions in small increments, without extensive mathematical intervention. Davidson-Pilon begins by introducing the concepts underlying Bayesian inference, comparing it with other techniques and guiding you through building and training your first Bayesian model. Next, he introduces PyMC through a series of detailed examples a...
Bayesian Inference of a Multivariate Regression Model
Directory of Open Access Journals (Sweden)
Marick S. Sinay
2014-01-01
We explore Bayesian inference of a multivariate linear regression model using a flexible prior for the covariance structure. The commonly adopted Bayesian setup involves the conjugate prior: a multivariate normal distribution for the regression coefficients and an inverse Wishart specification for the covariance matrix. Here we depart from this approach and propose a novel Bayesian estimator for the covariance. A multivariate normal prior for the unique elements of the matrix logarithm of the covariance matrix is considered. Such a structure allows for a richer class of prior distributions for the covariance, with respect to the strength of beliefs in the prior location hyperparameters, as well as the added ability to model potential correlation within the covariance structure. The posterior moments of all relevant parameters of interest are calculated via a Markov chain Monte Carlo procedure. The Metropolis-Hastings-within-Gibbs algorithm is used, with a proposal density constructed to closely match the shape of the target posterior distribution. As an application of the proposed technique, we investigate a multiple regression based upon the 1980 High School and Beyond Survey.
A Full Bayesian Approach for Boolean Genetic Network Inference
Han, Shengtong; Wong, Raymond K. W.; Lee, Thomas C. M.; Shen, Linghao; Li, Shuo-Yen R.; Fan, Xiaodan
2014-01-01
Boolean networks are a simple but efficient model for describing gene regulatory systems. A number of algorithms have been proposed to infer Boolean networks. However, these methods do not take full consideration of the effects of noise and model uncertainty. In this paper, we propose a full Bayesian approach to infer Boolean genetic networks. Markov chain Monte Carlo algorithms are used to obtain the posterior samples of both the network structure and the related parameters. In addition to regular link addition and removal moves, which can guarantee the irreducibility of the Markov chain for traversing the whole network space, carefully constructed mixture proposals are used to improve the Markov chain Monte Carlo convergence. Both simulations and a real application on cell-cycle data show that our method is more powerful than existing methods for the inference of both the topology and logic relations of the Boolean network from observed data. PMID:25551820
Compiling Relational Bayesian Networks for Exact Inference
DEFF Research Database (Denmark)
Jaeger, Manfred; Chavira, Mark; Darwiche, Adnan
2004-01-01
We describe a system for exact inference with relational Bayesian networks as defined in the publicly available PRIMULA tool. The system is based on compiling propositional instances of relational Bayesian networks into arithmetic circuits and then performing online inference by evaluating and differentiating these circuits in time linear in their size. We report on experimental results showing the successful compilation, and efficient inference, on relational Bayesian networks whose PRIMULA-generated propositional instances have thousands of variables, and whose jointrees have clusters...
Likelihood free inference for Markov processes: a comparison.
Owen, Jamie; Wilkinson, Darren J; Gillespie, Colin S
2015-04-01
Approaches to Bayesian inference for problems with intractable likelihoods have become increasingly important in recent years. Approximate Bayesian computation (ABC) and "likelihood free" Markov chain Monte Carlo techniques are popular methods for tackling inference in these scenarios but such techniques are computationally expensive. In this paper we compare the two approaches to inference, with a particular focus on parameter inference for stochastic kinetic models, widely used in systems biology. Discrete time transition kernels for models of this type are intractable for all but the most trivial systems yet forward simulation is usually straightforward. We discuss the relative merits and drawbacks of each approach whilst considering the computational cost implications and efficiency of these techniques. In order to explore the properties of each approach we examine a range of observation regimes using two example models. We use a Lotka-Volterra predator-prey model to explore the impact of full or partial species observations using various time course observations under the assumption of known and unknown measurement error. Further investigation into the impact of observation error is then made using a Schlögl system, a test case which exhibits bi-modal state stability in some regions of parameter space.
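A rejection-ABC sketch of the "likelihood free" idea compared in this paper: parameters are drawn from the prior, the model is simulated forward, and a draw is kept only if the simulated summary lands close to the observed one. The binomial toy model below stands in for a stochastic kinetic model (its likelihood is actually tractable, which the Lotka-Volterra and Schlögl systems are not); everything here is illustrative:

```python
import random

def abc_rejection(y_obs, prior_draw, simulate, eps,
                  n_samples=500, seed=0, max_tries=200000):
    """Likelihood-free rejection ABC: keep parameter draws whose forward
    simulation lands within eps of the observed summary statistic."""
    rng = random.Random(seed)
    accepted, tries = [], 0
    while len(accepted) < n_samples and tries < max_tries:
        theta = prior_draw(rng)
        if abs(simulate(theta, rng) - y_obs) <= eps:
            accepted.append(theta)
        tries += 1
    return accepted

# toy model: observe a count from Binomial(100, p); infer p by simulation only
post = abc_rejection(
    y_obs=30,
    prior_draw=lambda rng: rng.random(),                        # p ~ Uniform(0, 1)
    simulate=lambda p, rng: sum(rng.random() < p for _ in range(100)),
    eps=2)
```

Shrinking `eps` tightens the approximation to the true posterior at the cost of a lower acceptance rate, which is exactly the computational trade-off the paper examines.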
Nonparametric Bayesian inference in biostatistics
Müller, Peter
2015-01-01
Nonparametric Bayesian approaches (BNP) play an ever-expanding role in biostatistical inference, from use in proteomics to clinical trials. As chapters in this book demonstrate, BNP has important uses in the clinical sciences and in inference for issues like unknown partitions in genomics. Many research problems involve an abundance of data and require flexible and complex probability models beyond the traditional parametric approaches. As this book's expert contributors show, BNP approaches can be the answer. Survival analysis, in particular survival regression, has traditionally used BNP, but BNP's potential is now very broad. This applies to important tasks like the arrangement of patients into clinically meaningful subpopulations and segmenting the genome into functionally distinct regions. This book is designed to both review and introduce application areas for BNP. While existing books provide theoretical foundations, this book connects theory to practice through engaging examples and research questions. Chapters c...
Bayesian analysis of non-homogeneous Markov chains: application to mental health data.
Sung, Minje; Soyer, Refik; Nhan, Nguyen
2007-07-10
In this paper we present a formal treatment of non-homogeneous Markov chains by introducing a hierarchical Bayesian framework. Our work is motivated by the analysis of correlated categorical data which arise in assessment of psychiatric treatment programs. In our development, we introduce a Markovian structure to describe the non-homogeneity of transition patterns. In doing so, we introduce a logistic regression set-up for Markov chains and incorporate covariates in our model. We present a Bayesian model using Markov chain Monte Carlo methods and develop inference procedures to address issues encountered in the analyses of data from psychiatric treatment programs. Our model and inference procedures are applied to real data from a psychiatric treatment study.
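The logistic-regression set-up for Markov chains can be sketched as a self-transition probability that drifts with time (or any covariate) through a logistic link. The two-state simulation below is a minimal illustration with invented coefficients, not the authors' hierarchical model:

```python
import math, random

def p_stay(t, beta0, beta1):
    """Logistic link: probability of remaining in the current state at time t,
    so the transition matrix is non-homogeneous in t."""
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * t)))

def simulate_chain(beta0, beta1, steps, seed=0):
    """Simulate a two-state chain whose self-transition probability drifts
    over time through the logistic link above."""
    rng = random.Random(seed)
    state, path = 0, [0]
    for t in range(steps):
        state = state if rng.random() < p_stay(t, beta0, beta1) else 1 - state
        path.append(state)
    return path

path = simulate_chain(2.0, 0.05, steps=50)
```

In the Bayesian treatment, the regression coefficients (here `beta0`, `beta1`) would themselves receive priors and be sampled by MCMC rather than fixed.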
Institute of Scientific and Technical Information of China (English)
陈亚军; 刘丁; 梁军利
2012-01-01
To address the difficulty of describing non-Gaussian signals, this paper proposes a method for Bayesian inference of the parameters of mixtures of α-stable distributions based on Markov chain Monte Carlo. A hierarchical Bayesian graphical model is constructed. A Gibbs sampling algorithm estimates the mixing weights and the allocation parameter z, while the four parameters of each mixture component are estimated using the Metropolis algorithm. Simulation results show that the method accurately estimates the parameters of the α-stable mixture with good robustness and flexibility, so it can be used to model non-Gaussian signals or data.
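The sampling structure the abstract describes, alternately drawing allocations z, component parameters, and mixing weights, can be sketched on a simpler stand-in model: a two-component Gaussian mixture with unit variance, where every full conditional is available in closed form (unlike the α-stable case, whose intractable density forces Metropolis steps for the component parameters). The data below are invented:

```python
import math, random

def gibbs_mixture(data, iters=300, seed=0):
    """Gibbs sampler for a two-component Gaussian mixture with unit variance:
    alternates allocations z, component means mu, and mixing weight w."""
    rng = random.Random(seed)
    mu = [min(data), max(data)]          # crude but separated initialisation
    w = 0.5
    for _ in range(iters):
        # 1. sample each allocation z_i | mu, w
        z = []
        for x in data:
            p0 = w * math.exp(-0.5 * (x - mu[0]) ** 2)
            p1 = (1 - w) * math.exp(-0.5 * (x - mu[1]) ** 2)
            z.append(0 if rng.random() < p0 / (p0 + p1) else 1)
        # 2. sample mu_k | z from its Gaussian full conditional (flat prior)
        for k in (0, 1):
            xs = [x for x, zi in zip(data, z) if zi == k] or [mu[k]]
            mu[k] = rng.gauss(sum(xs) / len(xs), 1 / math.sqrt(len(xs)))
        # 3. sample w | z from its Beta full conditional (uniform prior)
        n0 = z.count(0)
        w = rng.betavariate(n0 + 1, len(z) - n0 + 1)
    return sorted(mu), w

data = [-3.1, -2.8, -3.3, -2.9, -3.0, 3.0, 3.2, 2.8, 3.1, 2.9]
mu, w = gibbs_mixture(data)
```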
Towards Bayesian Inference of the Spatial Distribution of Proteins
DEFF Research Database (Denmark)
Hooghoudt, Jan Otto; Waagepetersen, Rasmus Plenge; Barroso, Margarida
2017-01-01
In this paper we propose a new likelihood-based approach to statistical inference for FRET microscopic data. The likelihood function is obtained from a detailed modeling of the FRET data generating mechanism conditional on a protein configuration. We next follow a Bayesian approach and introduce a spatial point process prior model for the protein configurations depending on hyper parameters quantifying the intensity of the point process. Posterior distributions are evaluated using Markov chain Monte Carlo. We propose to infer microscope related parameters in an initial step from reference data without...
Compiling Relational Bayesian Networks for Exact Inference
DEFF Research Database (Denmark)
Jaeger, Manfred; Darwiche, Adnan; Chavira, Mark
2006-01-01
We describe in this paper a system for exact inference with relational Bayesian networks as defined in the publicly available PRIMULA tool. The system is based on compiling propositional instances of relational Bayesian networks into arithmetic circuits and then performing online inference by evaluating and differentiating these circuits in time linear in their size. We report on experimental results showing successful compilation and efficient inference on relational Bayesian networks, whose PRIMULA-generated propositional instances have thousands of variables, and whose jointrees have clusters...
Bayesian multimodel inference for geostatistical regression models.
Directory of Open Access Journals (Sweden)
Devin S Johnson
The problem of simultaneous covariate selection and parameter inference for spatial regression models is considered. Previous research has shown that failure to take spatial correlation into account can influence the outcome of standard model selection methods. A Markov chain Monte Carlo (MCMC) method is investigated for the calculation of parameter estimates and posterior model probabilities for spatial regression models. The method can accommodate normal and non-normal response data and a large number of covariates. Thus the method is very flexible and can be used to fit spatial linear models, spatial linear mixed models, and spatial generalized linear mixed models (GLMMs). The Bayesian MCMC method also allows a priori unequal weighting of covariates, which is not possible with many model selection methods such as Akaike's information criterion (AIC). The proposed method is demonstrated on two data sets. The first is the whiptail lizard data set, which has been previously analyzed by other researchers investigating model selection methods. Our results confirmed the previous analysis, suggesting that sandy soil and ant abundance were strongly associated with lizard abundance. The second data set concerned pollution-tolerant fish abundance in relation to several environmental factors. Results indicate that abundance is positively related to Strahler stream order and a habitat quality index, and negatively related to percent watershed disturbance.
Bayesian Inference Methods for Sparse Channel Estimation
DEFF Research Database (Denmark)
Pedersen, Niels Lovmand
2013-01-01
This thesis deals with sparse Bayesian learning (SBL) with application to radio channel estimation. As opposed to the classical approach for sparse signal representation, we focus on the problem of inferring complex signals. Our investigations within SBL constitute the basis for the development of Bayesian inference algorithms for sparse channel estimation. Sparse inference methods aim at finding the sparse representation of a signal given in some overcomplete dictionary of basis vectors. Within this context, one of our main contributions to the field of SBL is a hierarchical representation... analysis of the complex prior representation, where we show that the ability to induce sparse estimates of a given prior heavily depends on the inference method used and, interestingly, whether real or complex variables are inferred. We also show that the Bayesian estimators derived from the proposed...
Bayesian Inference on Gravitational Waves
Directory of Open Access Journals (Sweden)
Asad Ali
2015-12-01
The Bayesian approach is increasingly popular among the astrophysics data analysis communities. However, the statistics community in Pakistan is largely unaware of this fertile interaction between the two disciplines. Bayesian methods have been used to address astronomical problems since the very birth of Bayesian probability in the eighteenth century. Today, Bayesian methods for the detection and parameter estimation of gravitational waves have solid theoretical grounds and strong promise for realistic applications. This article aims to introduce the statistics community in Pakistan to the applications of Bayesian Monte Carlo methods in the analysis of gravitational wave data, with an overview of Bayesian signal detection and estimation methods and a demonstration with a couple of simplified examples.
Picturing classical and quantum Bayesian inference
Coecke, Bob
2011-01-01
We introduce a graphical framework for Bayesian inference that is sufficiently general to accommodate not just the standard case but also recent proposals for a theory of quantum Bayesian inference wherein one considers density operators rather than probability distributions as representative of degrees of belief. The diagrammatic framework is stated in the graphical language of symmetric monoidal categories and of compact structures and Frobenius structures therein, in which Bayesian inversion boils down to transposition with respect to an appropriate compact structure. We characterize classical Bayesian inference in terms of a graphical property and demonstrate that our approach eliminates some purely conventional elements that appear in common representations thereof, such as whether degrees of belief are represented by probabilities or entropic quantities. We also introduce a quantum-like calculus wherein the Frobenius structure is noncommutative and show that it can accommodate Leifer's calculus of `cond...
Bayesian inference of chemical kinetic models from proposed reactions
Galagali, Nikhil
2015-02-01
Bayesian inference provides a natural framework for combining experimental data with prior knowledge to develop chemical kinetic models and quantify the associated uncertainties, not only in parameter values but also in model structure. Most existing applications of Bayesian model selection methods to chemical kinetics have been limited to comparisons among a small set of models, however. The significant computational cost of evaluating posterior model probabilities renders traditional Bayesian methods infeasible when the model space becomes large. We present a new framework for tractable Bayesian model inference and uncertainty quantification using a large number of systematically generated model hypotheses. The approach involves imposing point-mass mixture priors over rate constants and exploring the resulting posterior distribution using an adaptive Markov chain Monte Carlo method. The posterior samples are used to identify plausible models, to quantify rate constant uncertainties, and to extract key diagnostic information about model structure, such as the reactions and operating pathways most strongly supported by the data. We provide numerical demonstrations of the proposed framework by inferring kinetic models for catalytic steam and dry reforming of methane using available experimental data.
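The exploration step relies on Markov chain Monte Carlo over rate constants. The sketch below shows a plain (non-adaptive) random-walk Metropolis sampler applied to a toy first-order decay model with invented observations; the point-mass mixture priors and adaptivity of the actual framework are omitted:

```python
import math, random

def metropolis(log_post, x0, step, n, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional log posterior."""
    rng = random.Random(seed)
    x, samples = x0, []
    for _ in range(n):
        prop = x + rng.gauss(0, step)
        # accept with probability min(1, posterior ratio); tiny offset guards log(0)
        if math.log(rng.random() + 1e-300) < log_post(prop) - log_post(x):
            x = prop
        samples.append(x)
    return samples

# hypothetical kinetics data: first-order decay y = exp(-k t) observed with
# Gaussian noise (sigma = 0.05); infer the rate constant k, flat positive prior
t_obs = [0.5, 1.0, 2.0]
y_obs = [0.62, 0.36, 0.14]

def log_post(k):
    if k <= 0:
        return -math.inf
    sse = sum((y - math.exp(-k * t)) ** 2 for t, y in zip(t_obs, y_obs))
    return -sse / (2 * 0.05 ** 2)

samples = metropolis(log_post, x0=1.0, step=0.2, n=5000)
```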
Bayesian online algorithms for learning in discrete Hidden Markov Models
Alamino, Roberto C.; Caticha, Nestor
2008-01-01
We propose and analyze two different Bayesian online algorithms for learning in discrete Hidden Markov Models and compare their performance with the already known Baldi-Chauvin Algorithm. Using the Kullback-Leibler divergence as a measure of generalization we draw learning curves in simplified situations for these algorithms and compare their performances.
An Intuitive Dashboard for Bayesian Network Inference
Reddy, Vikas; Charisse Farr, Anna; Wu, Paul; Mengersen, Kerrie; Yarlagadda, Prasad K. D. V.
2014-03-01
Current Bayesian network software packages provide a good graphical interface for users who design and develop Bayesian networks for various applications. However, the intended end-users of these networks may not necessarily find such an interface appealing, and at times it can be overwhelming, particularly when the number of nodes in the network is large. To circumvent this problem, this paper presents an intuitive dashboard that provides an additional layer of abstraction, enabling end-users to easily perform inferences over Bayesian networks. Unlike most software packages, which display the nodes and arcs of the network, the developed tool organises the nodes based on cause-and-effect relationships, making the user interaction more intuitive and friendly. In addition to performing various types of inferences, users can conveniently use the tool to verify the behaviour of the developed Bayesian network. The tool has been developed using the Qt and SMILE libraries in C++.
Tactile length contraction as Bayesian inference.
Tong, Jonathan; Ngo, Vy; Goldreich, Daniel
2016-08-01
To perceive, the brain must interpret stimulus-evoked neural activity. This is challenging: The stochastic nature of the neural response renders its interpretation inherently uncertain. Perception would be optimized if the brain used Bayesian inference to interpret inputs in light of expectations derived from experience. Bayesian inference would improve perception on average but cause illusions when stimuli violate expectation. Intriguingly, tactile, auditory, and visual perception are all prone to length contraction illusions, characterized by the dramatic underestimation of the distance between punctate stimuli delivered in rapid succession; the origin of these illusions has been mysterious. We previously proposed that length contraction illusions occur because the brain interprets punctate stimulus sequences using Bayesian inference with a low-velocity expectation. A novel prediction of our Bayesian observer model is that length contraction should intensify if stimuli are made more difficult to localize. Here we report a tactile psychophysical study that tested this prediction. Twenty humans compared two distances on the forearm: a fixed reference distance defined by two taps with 1-s temporal separation and an adjustable comparison distance defined by two taps with temporal separation t ≤ 1 s. We observed significant length contraction: As t was decreased, participants perceived the two distances as equal only when the comparison distance was made progressively greater than the reference distance. Furthermore, the use of weaker taps significantly enhanced participants' length contraction. These findings confirm the model's predictions, supporting the view that the spatiotemporal percept is a best estimate resulting from a Bayesian inference process.
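The core Bayesian-observer prediction, that noisier localization produces stronger contraction, falls out of simple Gaussian shrinkage: the posterior mean is the observation scaled toward the prior mean, and the scaling depends on the sensory noise. The calculation below is a deliberately simplified one-dimensional stand-in for the paper's spatiotemporal model, with invented numbers:

```python
def posterior_length(l_obs, sigma_obs, sigma_prior):
    """Posterior mean when a Gaussian likelihood centred on the measured
    length combines with a zero-mean Gaussian prior (standing in for the
    low-velocity expectation): the estimate shrinks toward zero."""
    shrink = sigma_prior ** 2 / (sigma_prior ** 2 + sigma_obs ** 2)
    return shrink * l_obs

# same physical length, two levels of localization noise (e.g. strong vs weak taps)
precise = posterior_length(10.0, sigma_obs=1.0, sigma_prior=3.0)
noisy = posterior_length(10.0, sigma_obs=3.0, sigma_prior=3.0)
```

Here the noisier observation yields the shorter perceived length, mirroring the finding that weaker taps enhanced participants' length contraction.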
Variational Bayesian Inference of Line Spectra
DEFF Research Database (Denmark)
Badiu, Mihai Alin; Hansen, Thomas Lundgaard; Fleury, Bernard Henri
2017-01-01
In this paper, we address the fundamental problem of line spectral estimation in a Bayesian framework. We target model order and parameter estimation via variational inference in a probabilistic model in which the frequencies are continuous-valued, i.e., not restricted to a grid, and the coefficients...
Bayesian Cosmological inference beyond statistical isotropy
Souradeep, Tarun; Das, Santanu; Wandelt, Benjamin
2016-10-01
With the advent of rich data sets, the computational challenge of inference in cosmology has been met with stochastic sampling methods. First, I review the widely used MCMC approach for inferring cosmological parameters and present SCoPE, an improved adaptive implementation developed by our group. Next, I present a general method for Bayesian inference of the underlying covariance structure of random fields on a sphere. We employ the Bipolar Spherical Harmonic (BipoSH) representation of general covariance structure on the sphere. We illustrate the efficacy of the method with a principled approach to assess violation of statistical isotropy (SI) in sky maps of Cosmic Microwave Background (CMB) fluctuations. The general, principled approach to Bayesian inference of the covariance structure of a random field on a sphere presented here has huge potential for application to many other aspects of cosmology and astronomy, as well as more distant areas of research like geosciences and climate modelling.
Variational Bayesian Inference of Line Spectra
DEFF Research Database (Denmark)
Badiu, Mihai Alin; Hansen, Thomas Lundgaard; Fleury, Bernard Henri
2016-01-01
In this paper, we address the fundamental problem of line spectral estimation in a Bayesian framework. We target model order and parameter estimation via variational inference in a probabilistic model in which the frequencies are continuous-valued, i.e., not restricted to a grid, and the coefficients are governed by a Bernoulli-Gaussian prior model turning model order selection into binary sequence detection. Unlike earlier works which retain only point estimates of the frequencies, we undertake a more complete Bayesian treatment by estimating the posterior probability density functions (pdfs...
Decision generation tools and Bayesian inference
Jannson, Tomasz; Wang, Wenjian; Forrester, Thomas; Kostrzewski, Andrew; Veeris, Christian; Nielsen, Thomas
2014-05-01
Digital Decision Generation (DDG) tools are important software sub-systems of Command and Control (C2) systems and technologies. In this paper, we present a special type of DDG based on Bayesian inference, related to adverse (hostile) networks, including such important applications as terrorism-related and organized-crime networks.
Bayesian Inference in Queueing Networks
Sutton, Charles
2010-01-01
Modern Web services, such as those at Google, Yahoo!, and Amazon, handle billions of requests per day on clusters of thousands of computers. Because these services operate under strict performance requirements, a statistical understanding of their performance is of great practical interest. Such services are modeled by networks of queues, where one queue models each of the individual computers in the system. A key challenge is that the data is incomplete, because recording detailed information about every request to a heavily used system can require unacceptable overhead. In this paper we develop a Bayesian perspective on queueing models in which the arrival and departure times that are not observed are treated as latent variables. Underlying this viewpoint is the observation that a queueing model defines a deterministic transformation between the data and a set of independent variables called the service times. With this viewpoint in hand, we sample from the posterior distribution over missing data and model...
DEFF Research Database (Denmark)
Møller, Jesper
2010-01-01
Chapter 9: This contribution concerns statistical inference for parametric models used in stochastic geometry, based on quick and simple simulation-free procedures as well as more comprehensive methods using a maximum likelihood or Bayesian approach combined with Markov chain Monte Carlo...
Bayesianism and inference to the best explanation
Directory of Open Access Journals (Sweden)
Valeriano IRANZO
2008-01-01
Bayesianism and Inference to the Best Explanation (IBE) are two different models of inference. Recently there has been some debate about the possibility of “bayesianizing” IBE. Firstly I explore several alternatives for including explanatory considerations in Bayes’s Theorem. Then I distinguish two different interpretations of prior probabilities: “IBE-Bayesianism” (IBE-Bay) and “frequentist-Bayesianism” (Freq-Bay). After detailing the content of the latter, I propose a rule for assessing the priors. I also argue that Freq-Bay: (i) endorses a role for explanatory value in the assessment of scientific hypotheses; (ii) avoids a purely subjectivist reading of prior probabilities; and (iii) fits better than IBE-Bayesianism with two basic facts about science, i.e., the prominent role played by empirical testing and the existence of many scientific theories in the past that failed to fulfil their promises and were subsequently abandoned.
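The options for "bayesianizing" IBE all amount to deciding where explanatory value enters Bayes's Theorem. A minimal numeric illustration: boosting the prior of the more explanatory hypothesis (one reading of IBE-Bayesianism) raises its posterior even when the likelihoods are unchanged. The numbers are purely illustrative:

```python
def posterior(priors, likelihoods):
    """Bayes's Theorem over a finite set of rival hypotheses."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    z = sum(joint)  # the normalising evidence term
    return [j / z for j in joint]

# two rival hypotheses with fixed likelihoods for the evidence
flat = posterior([0.5, 0.5], [0.6, 0.4])   # indifferent priors
ibe = posterior([0.7, 0.3], [0.6, 0.4])    # prior boosted for the more explanatory H1
```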
The NIFTY way of Bayesian signal inference
Energy Technology Data Exchange (ETDEWEB)
Selig, Marco, E-mail: mselig@mpa-Garching.mpg.de [Max Planck Institut für Astrophysik, Karl-Schwarzschild-Straße 1, D-85748 Garching, Germany, and Ludwig-Maximilians-Universität München, Geschwister-Scholl-Platz 1, D-80539 München (Germany)
2014-12-05
We introduce NIFTY, 'Numerical Information Field Theory', a software package for the development of Bayesian signal inference algorithms that operate independently from any underlying spatial grid and its resolution. A large number of Bayesian and Maximum Entropy methods for 1D signal reconstruction, 2D imaging, as well as 3D tomography, appear formally similar, but one often finds individualized implementations that are neither flexible nor easily transferable. Signal inference in the framework of NIFTY can be done in an abstract way, such that algorithms prototyped in 1D can be applied to real-world problems in higher-dimensional settings. NIFTY is a versatile library that is applicable and has already been applied in 1D, 2D, 3D and spherical settings. A recent application is the D³PO algorithm, targeting the non-trivial task of denoising, deconvolving, and decomposing photon observations in high-energy astronomy.
Using Alien Coins to Test Whether Simple Inference Is Bayesian
Cassey, Peter; Hawkins, Guy E.; Donkin, Chris; Brown, Scott D.
2016-01-01
Reasoning and inference are well-studied aspects of basic cognition that have been explained as statistically optimal Bayesian inference. Using a simplified experimental design, we conducted quantitative comparisons between Bayesian inference and human inference at the level of individuals. In 3 experiments, with more than 13,000 participants, we…
Analysis of KATRIN data using Bayesian inference
DEFF Research Database (Denmark)
Riis, Anna Sejersen; Hannestad, Steen; Weinheimer, Christian
2011-01-01
The KATRIN (KArlsruhe TRItium Neutrino) experiment will be analyzing the tritium beta-spectrum to determine the mass of the neutrino with a sensitivity of 0.2 eV (90% C.L.). This approach to a measurement of the absolute value of the neutrino mass relies only on the principle of energy conservation... The KATRIN chi-squared function in the COSMOMC package (an MCMC code using Bayesian parameter inference) solved the task at hand very nicely.
Inference with constrained hidden Markov models in PRISM
DEFF Research Database (Denmark)
Christiansen, Henning; Have, Christian Theil; Lassen, Ole Torp
2010-01-01
A Hidden Markov Model (HMM) is a common statistical model which is widely used for analysis of biological sequence data and other sequential phenomena. In the present paper we show how HMMs can be extended with side-constraints and present constraint solving techniques for efficient inference. De...
Multisensory oddity detection as bayesian inference.
Directory of Open Access Journals (Sweden)
Timothy Hospedales
Full Text Available A key goal for the perceptual system is to optimally combine information from all the senses that may be available in order to develop the most accurate and unified picture possible of the outside world. The contemporary theoretical framework of ideal observer maximum likelihood integration (MLI) has been highly successful in modelling how the human brain combines information from a variety of different sensory modalities. However, in various recent experiments involving multisensory stimuli of uncertain correspondence, MLI breaks down as a successful model of sensory combination. Within the paradigm of direct stimulus estimation, perceptual models which use Bayesian inference to resolve correspondence have recently been shown to generalize successfully to these cases where MLI fails. This approach has been known variously as model inference, causal inference or structure inference. In this paper, we examine causal uncertainty in another important class of multi-sensory perception paradigm--that of oddity detection--and demonstrate how a Bayesian ideal observer also treats oddity detection as a structure inference problem. We validate this approach by showing that it provides an intuitive and quantitative explanation of an important pair of multi-sensory oddity detection experiments--involving cues across and within modalities--for which MLI previously failed dramatically, allowing a novel unifying treatment of within and cross modal multisensory perception. Our successful application of structure inference models to the new 'oddity detection' paradigm, and the resultant unified explanation of across and within modality cases provide further evidence to suggest that structure inference may be a commonly evolved principle for combining perceptual information in the brain.
Multisensory oddity detection as bayesian inference.
Hospedales, Timothy; Vijayakumar, Sethu
2009-01-01
Fancher, Chris M.; Han, Zhen; Levin, Igor; Page, Katharine; Reich, Brian J.; Smith, Ralph C.; Wilson, Alyson G.; Jones, Jacob L.
2016-01-01
A Bayesian inference method for refining crystallographic structures is presented. The distribution of model parameters is stochastically sampled using Markov chain Monte Carlo. Posterior probability distributions are constructed for all model parameters to properly quantify uncertainty by appropriately modeling the heteroskedasticity and correlation of the error structure. The proposed method is demonstrated by analyzing a National Institute of Standards and Technology silicon standard reference material. The results obtained by Bayesian inference are compared with those determined by Rietveld refinement. Posterior probability distributions of model parameters provide both estimates and uncertainties. The new method better estimates the true uncertainties in the model as compared to the Rietveld method. PMID:27550221
Analyzing single-molecule time series via nonparametric Bayesian inference.
Hines, Keegan E; Bankston, John R; Aldrich, Richard W
2015-02-03
The ability to measure the properties of proteins at the single-molecule level offers an unparalleled glimpse into biological systems at the molecular scale. The interpretation of single-molecule time series has often been rooted in statistical mechanics and the theory of Markov processes. While existing analysis methods have been useful, they are not without significant limitations including problems of model selection and parameter nonidentifiability. To address these challenges, we introduce the use of nonparametric Bayesian inference for the analysis of single-molecule time series. These methods provide a flexible way to extract structure from data instead of assuming models beforehand. We demonstrate these methods with applications to several diverse settings in single-molecule biophysics. This approach provides a well-constrained and rigorously grounded method for determining the number of biophysical states underlying single-molecule data. Copyright © 2015 Biophysical Society. Published by Elsevier Inc. All rights reserved.
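Nonparametric Bayesian methods of this kind commonly place a Dirichlet process prior over states, so the number of biophysical states need not be fixed in advance. A sketch of the stick-breaking construction of such a prior (an illustration of the general tool, not necessarily the authors' exact model):

```python
import random

def stick_breaking(alpha, n_atoms, rng):
    """Draw the first n_atoms weights of a Dirichlet process via
    stick-breaking: v_k ~ Beta(1, alpha), w_k = v_k * prod_{j<k}(1 - v_j).
    Small alpha concentrates mass on few states; large alpha spreads it."""
    weights, remaining = [], 1.0
    for _ in range(n_atoms):
        v = rng.betavariate(1.0, alpha)
        weights.append(v * remaining)
        remaining *= 1.0 - v  # the unbroken remainder of the stick
    return weights

w = stick_breaking(alpha=2.0, n_atoms=20, rng=random.Random(0))
# Weights are non-negative and sum to less than 1 (the rest of the stick
# belongs to the infinitely many remaining atoms).
print(sum(w) < 1.0, all(x >= 0 for x in w))
```

In a state-inference setting, each atom corresponds to a candidate state, and the data decide how many atoms receive appreciable posterior weight.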
An application of Bayesian inference for solar-like pulsators
Benomar, O.
2008-12-01
As the amount of data collected by space-borne asteroseismic instruments (such as CoRoT and Kepler) increases drastically, it will be useful to have automated processes to extract a maximum of information from these data. The use of a Bayesian approach could be very helpful for this goal. Only a few attempts have been made in this way (e.g. Brewer et al. 2007). We propose to use Markov Chain Monte Carlo simulations (MCMC) with Metropolis-Hastings (MH) based algorithms to infer the main stellar oscillation parameters from the power spectrum, in the case of solar-like pulsators. Given a number of modes to be fitted, the algorithm is able to give the best set of parameters (frequency, linewidth, amplitude, rotational splitting) corresponding to a chosen input model. We illustrate this algorithm with one of the first CoRoT targets: HD 49933.
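The core Metropolis-Hastings machinery referred to here is simple to state. A minimal random-walk sampler on a toy one-dimensional target (the target, step size, and chain length are illustrative, not the paper's power-spectrum model):

```python
import math, random

def metropolis_hastings(log_post, x0, n_steps, step, rng):
    """Random-walk Metropolis-Hastings: propose x' = x + N(0, step**2),
    accept with probability min(1, post(x') / post(x))."""
    x, lp = x0, log_post(x0)
    samples = []
    for _ in range(n_steps):
        xp = x + rng.gauss(0.0, step)
        lpp = log_post(xp)
        if math.log(rng.random()) < lpp - lp:  # accept/reject in log space
            x, lp = xp, lpp
        samples.append(x)
    return samples

# Toy target: a standard normal, specified only up to its normalizing constant.
rng = random.Random(42)
chain = metropolis_hastings(lambda x: -0.5 * x * x, 0.0, 20000, 1.0, rng)
mean = sum(chain) / len(chain)
var = sum((x - mean) ** 2 for x in chain) / len(chain)
```

Because only the ratio of posterior densities enters the acceptance step, the normalizing constant of the posterior never needs to be computed, which is what makes MCMC practical for power-spectrum fitting.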
Bayesian inference on EMRI signals using low frequency approximations
Ali, Asad; Meyer, Renate; Röver, Christian (doi:10.1088/0264-9381/29/14/145014)
2013-01-01
Extreme mass ratio inspirals (EMRIs) are thought to be one of the most exciting gravitational wave sources to be detected with LISA. Due to their complicated nature and weak amplitudes the detection and parameter estimation of such sources is a challenging task. In this paper we present a statistical methodology based on Bayesian inference in which the estimation of parameters is carried out by advanced Markov chain Monte Carlo (MCMC) algorithms such as parallel tempering MCMC. We analysed high and medium mass EMRI systems that fall well inside the low frequency range of LISA. In the context of the Mock LISA Data Challenges, our investigation and results are also the first instance in which a fully Markovian algorithm is applied for EMRI searches. Results show that our algorithm worked well in recovering EMRI signals from different (simulated) LISA data sets having single and multiple EMRI sources and holds great promise for posterior computation under more realistic conditions. The search and estimation meth...
Inference in Hidden Markov Models with Explicit State Duration Distributions
Dewar, Michael; Wood, Frank
2012-01-01
In this letter we borrow from the inference techniques developed for unbounded state-cardinality (nonparametric) variants of the HMM and use them to develop a tuning-parameter-free, black-box inference procedure for explicit-state-duration hidden Markov models (EDHMMs). EDHMMs are HMMs that have latent states consisting of both discrete state-indicator and discrete state-duration random variables. In contrast to the implicit geometric state duration distribution possessed by the standard HMM, EDHMMs allow the direct parameterisation and estimation of per-state duration distributions. As most duration distributions are defined over the positive integers, truncation or other approximations are usually required to perform EDHMM inference.
Bayesian inference for pulsar timing models
Vigeland, Sarah J
2013-01-01
The extremely regular, periodic radio emission from millisecond pulsars makes them useful tools for studying neutron star astrophysics, general relativity, and low-frequency gravitational waves. These studies require that the observed pulse times of arrival be fit to complicated timing models that describe numerous effects such as the astrometry of the source, the evolution of the pulsar's spin, the presence of a binary companion, and the propagation of the pulses through the interstellar medium. In this paper, we discuss the benefits of using Bayesian inference to obtain these timing solutions. These include the validation of linearized least-squares model fits when they are correct, and the proper characterization of parameter uncertainties when they are not; the incorporation of prior parameter information and of models of correlated noise; and the Bayesian comparison of alternative timing models. We describe our computational setup, which combines the timing models of tempo2 with the nested-sampling integ...
Human collective intelligence as distributed Bayesian inference
Krafft, Peter M; Pan, Wei; Della Penna, Nicolás; Altshuler, Yaniv; Shmueli, Erez; Tenenbaum, Joshua B; Pentland, Alex
2016-01-01
Collective intelligence is believed to underlie the remarkable success of human society. The formation of accurate shared beliefs is one of the key components of human collective intelligence. How are accurate shared beliefs formed in groups of fallible individuals? Answering this question requires a multiscale analysis. We must understand both the individual decision mechanisms people use, and the properties and dynamics of those mechanisms in the aggregate. As of yet, mathematical tools for such an approach have been lacking. To address this gap, we introduce a new analytical framework: We propose that groups arrive at accurate shared beliefs via distributed Bayesian inference. Distributed inference occurs through information processing at the individual level, and yields rational belief formation at the group level. We instantiate this framework in a new model of human social decision-making, which we validate using a dataset we collected of over 50,000 users of an online social trading platform where inves...
Universal Darwinism as a process of Bayesian inference
Campbell, John O
2016-01-01
Many of the mathematical frameworks describing natural selection are equivalent to Bayes Theorem, also known as Bayesian updating. By definition, a process of Bayesian Inference is one which involves a Bayesian update, so we may conclude that these frameworks describe natural selection as a process of Bayesian inference. Thus natural selection serves as a counter example to a widely-held interpretation that restricts Bayesian Inference to human mental processes (including the endeavors of statisticians). As Bayesian inference can always be cast in terms of (variational) free energy minimization, natural selection can be viewed as comprising two components: a generative model of an "experiment" in the external world environment, and the results of that "experiment" or the "surprise" entailed by predicted and actual outcomes of the "experiment". Minimization of free energy implies that the implicit measure of "surprise" experienced serves to update the generative model in a Bayesian manner. This description clo...
Inference with Constrained Hidden Markov Models in PRISM
Christiansen, Henning; Lassen, Ole Torp; Petit, Matthieu
2010-01-01
A Hidden Markov Model (HMM) is a common statistical model which is widely used for analysis of biological sequence data and other sequential phenomena. In the present paper we show how HMMs can be extended with side-constraints and present constraint solving techniques for efficient inference. Defining HMMs with side-constraints in Constraint Logic Programming has advantages in terms of more compact expression and pruning opportunities during inference. We present a PRISM-based framework for extending HMMs with side-constraints and show how well-known constraints such as cardinality and all different are integrated. We experimentally validate our approach on the biologically motivated problem of global pairwise alignment.
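PRISM is a probabilistic logic-programming system, so the paper's actual encoding is declarative; the following imperative sketch only illustrates the underlying idea of enforcing a cardinality side-constraint during decoding, by augmenting each Viterbi state with a visit counter so violating paths are pruned early (all model numbers hypothetical):

```python
import math

def constrained_viterbi(obs, states, log_init, log_trans, log_emit,
                        capped_state, max_visits):
    """Viterbi decoding with a cardinality side-constraint: the decoded
    path may visit `capped_state` at most `max_visits` times."""
    def bump(c, s):
        return c + 1 if s == capped_state else c

    # best[(state, visits)] = (log-probability, best path reaching it)
    best = {}
    for s in states:
        c = bump(0, s)
        if c <= max_visits:
            best[(s, c)] = (log_init[s] + log_emit[s][obs[0]], [s])
    for o in obs[1:]:
        nxt = {}
        for (s, c), (lp, path) in best.items():
            for s2 in states:
                c2 = bump(c, s2)
                if c2 > max_visits:
                    continue  # constraint violated: prune this extension
                cand = (lp + log_trans[s][s2] + log_emit[s2][o], path + [s2])
                if (s2, c2) not in nxt or cand[0] > nxt[(s2, c2)][0]:
                    nxt[(s2, c2)] = cand
        best = nxt
    return max(best.values(), key=lambda t: t[0])[1] if best else None

# Toy 2-state HMM (all numbers illustrative).
lg = math.log
init = {0: lg(0.5), 1: lg(0.5)}
trans = {0: {0: lg(0.5), 1: lg(0.5)}, 1: {0: lg(0.5), 1: lg(0.5)}}
emit = {0: {'a': lg(0.9), 'b': lg(0.1)}, 1: {'a': lg(0.1), 'b': lg(0.9)}}
path = constrained_viterbi("abbb", [0, 1], init, trans, emit,
                           capped_state=1, max_visits=1)
print(path, path.count(1) <= 1)
```

Unconstrained decoding of "abbb" would choose state 1 for every 'b'; with the cap, the best admissible path spends exactly one step there, which is the kind of pruning-during-inference the constraint-logic formulation exploits.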
Hierarchical Bayesian inference in the visual cortex
Lee, Tai Sing; Mumford, David
2003-07-01
Traditional views of visual processing suggest that early visual neurons in areas V1 and V2 are static spatiotemporal filters that extract local features from a visual scene. The extracted information is then channeled through a feedforward chain of modules in successively higher visual areas for further analysis. Recent electrophysiological recordings from early visual neurons in awake behaving monkeys reveal that there are many levels of complexity in the information processing of the early visual cortex, as seen in the long-latency responses of its neurons. These new findings suggest that activity in the early visual cortex is tightly coupled and highly interactive with the rest of the visual system. They lead us to propose a new theoretical setting based on the mathematical framework of hierarchical Bayesian inference for reasoning about the visual system. In this framework, the recurrent feedforward/feedback loops in the cortex serve to integrate top-down contextual priors and bottom-up observations so as to implement concurrent probabilistic inference along the visual hierarchy. We suggest that the algorithms of particle filtering and Bayesian-belief propagation might model these interactive cortical computations. We review some recent neurophysiological evidence that supports the plausibility of these ideas. © 2003 Optical Society of America
Learning Bayesian network classifiers for credit scoring using Markov Chain Monte Carlo search
Baesens, B.; Egmont-Petersen, M.; Castelo, R.; Vanthienen, J.
2002-01-01
In this paper, we will evaluate the power and usefulness of Bayesian network classifiers for credit scoring. Various types of Bayesian network classifiers will be evaluated and contrasted including unrestricted Bayesian network classifiers learnt using Markov Chain Monte Carlo (MCMC) search. The exp
Dorazio, R.M.; Johnson, F.A.
2003-01-01
Bayesian inference and decision theory may be used in the solution of relatively complex problems of natural resource management, owing to recent advances in statistical theory and computing. In particular, Markov chain Monte Carlo algorithms provide a computational framework for fitting models of adequate complexity and for evaluating the expected consequences of alternative management actions. We illustrate these features using an example based on management of waterfowl habitat.
Markov Chain Monte Carlo Bayesian Learning for Neural Networks
Goodrich, Michael S.
2011-01-01
Conventional training methods for neural networks involve starting at a random location in the solution space of the network weights, navigating an error hypersurface to reach a minimum, and sometimes using stochastic techniques (e.g., genetic algorithms) to avoid entrapment in a local minimum. It is further typically necessary to preprocess the data (e.g., normalization) to keep the training algorithm on course. Conversely, Bayesian-based learning is an epistemological approach concerned with formally updating the plausibility of competing candidate hypotheses, thereby obtaining a posterior distribution for the network weights conditioned on the available data and a prior distribution. In this paper, we developed a powerful methodology for estimating the full residual uncertainty in network weights and therefore network predictions by using a modified Jeffreys prior combined with a Metropolis Markov Chain Monte Carlo method.
Geometric ergodicity of a hybrid sampler for Bayesian inference of phylogenetic branch lengths.
Spade, David A; Herbei, Radu; Kubatko, Laura S
2015-10-01
One of the fundamental goals in phylogenetics is to make inferences about the evolutionary pattern among a group of individuals, such as genes or species, using present-day genetic material. This pattern is represented by a phylogenetic tree, and as computational methods have caught up to the statistical theory, Bayesian methods of making inferences about phylogenetic trees have become increasingly popular. Bayesian inference of phylogenetic trees requires sampling from intractable probability distributions. Common methods of sampling from these distributions include Markov chain Monte Carlo (MCMC) and Sequential Monte Carlo (SMC) methods, and one way that both of these methods can proceed is by first simulating a tree topology and then taking a sample from the posterior distribution of the branch lengths given the tree topology and the data set. In many MCMC methods, it is difficult to verify that the underlying Markov chain is geometrically ergodic, and thus, it is necessary to rely on output-based convergence diagnostics in order to assess convergence on an ad hoc basis. These diagnostics suffer from several important limitations, so in an effort to circumvent these limitations, this work establishes geometric convergence for a particular Markov chain that is used to sample branch lengths under a fairly general class of nucleotide substitution models and provides a numerical method for estimating the time this Markov chain takes to converge.
Bayesian inference for a wavefront model of the Neolithisation of Europe
Baggaley, Andrew W; Shukurov, Anvar; Boys, Richard J; Golightly, Andrew
2012-01-01
We consider a wavefront model for the spread of Neolithic culture across Europe, and use Bayesian inference techniques to provide estimates for the parameters within this model, as constrained by radiocarbon data from Southern and Western Europe. Our wavefront model allows for both an isotropic background spread (incorporating the effects of local geography), and a localized anisotropic spread associated with major waterways. We introduce an innovative numerical scheme to track the wavefront, allowing us to simulate the times of the first arrival at any site orders of magnitude more efficiently than traditional PDE approaches. We adopt a Bayesian approach to inference and use Gaussian process emulators to facilitate further increases in efficiency in the inference scheme, thereby making Markov chain Monte Carlo methods practical. We allow for uncertainty in the fit of our model, and also infer a parameter specifying the magnitude of this uncertainty. We obtain a magnitude for the background spread of order 1 ...
A Fast Iterative Bayesian Inference Algorithm for Sparse Channel Estimation
DEFF Research Database (Denmark)
Pedersen, Niels Lovmand; Manchón, Carles Navarro; Fleury, Bernard Henri
2013-01-01
representation of the Bessel K probability density function; a highly efficient, fast iterative Bayesian inference method is then applied to the proposed model. The resulting estimator outperforms other state-of-the-art Bayesian and non-Bayesian estimators, either by yielding lower mean squared estimation error...
Bayesian seismic tomography by parallel interacting Markov chains
Gesret, Alexandrine; Bottero, Alexis; Romary, Thomas; Noble, Mark; Desassis, Nicolas
2014-05-01
The velocity field estimated by first arrival traveltime tomography is commonly used as a starting point for further seismological, mineralogical, tectonic or similar analysis. In order to interpret quantitatively the results, the tomography uncertainty values as well as their spatial distribution are required. The estimated velocity model is obtained through inverse modeling by minimizing an objective function that compares observed and computed traveltimes. This step is often performed by gradient-based optimization algorithms. The major drawback of such local optimization schemes, beyond the possibility of being trapped in a local minimum, is that they do not account for the multiple possible solutions of the inverse problem. They are therefore unable to assess the uncertainties linked to the solution. Within a Bayesian (probabilistic) framework, solving the tomography inverse problem aims at estimating the posterior probability density function of the velocity model using a global sampling algorithm. Markov chain Monte Carlo (MCMC) methods are known to produce samples of virtually any distribution. In such a Bayesian inversion, the total number of simulations we can afford is highly related to the computational cost of the forward model. Although fast algorithms have been recently developed for computing first arrival traveltimes of seismic waves, complete exploration of the posterior distribution of the velocity model is hardly feasible, especially when it is high dimensional and/or multimodal. In the latter case, the chain may even stay stuck in one of the modes. In order to improve the mixing properties of a classical single MCMC chain, we propose to make several Markov chains at different temperatures interact. This method can make efficient use of large CPU clusters, without increasing the global computational cost with respect to classical MCMC, and is therefore particularly suited for Bayesian inversion. The exchanges between the chains allow a precise sampling of the
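The interacting-chains scheme described above is commonly known as parallel tempering (replica exchange). A compact sketch on a bimodal toy target, with illustrative inverse temperatures (not the authors' tomography settings):

```python
import math, random

def mh_step(x, log_post, beta, step, rng):
    """One Metropolis move targeting the 'tempered' density post(x)**beta."""
    xp = x + rng.gauss(0.0, step)
    if math.log(rng.random()) < beta * (log_post(xp) - log_post(x)):
        return xp
    return x

def parallel_tempering(log_post, betas, n_steps, step, rng):
    """Chains at inverse temperatures `betas` (betas[0] = 1 is the cold,
    target chain) evolve in parallel; neighbouring chains attempt state
    swaps, letting hot chains ferry the cold chain between distant modes."""
    xs = [0.0] * len(betas)
    cold_trace = []
    for _ in range(n_steps):
        xs = [mh_step(x, log_post, b, step, rng) for x, b in zip(xs, betas)]
        i = rng.randrange(len(betas) - 1)  # pick one neighbouring pair
        delta = (betas[i] - betas[i + 1]) * (log_post(xs[i + 1]) - log_post(xs[i]))
        if math.log(rng.random()) < delta:
            xs[i], xs[i + 1] = xs[i + 1], xs[i]  # accepted swap
        cold_trace.append(xs[0])
    return cold_trace

# Bimodal toy target: equal mixture of N(-5, 1) and N(5, 1), in log space.
def log_post(x):
    a, b = -0.5 * (x - 5.0) ** 2, -0.5 * (x + 5.0) ** 2
    m = max(a, b)  # log-sum-exp trick for numerical stability
    return m + math.log(math.exp(a - m) + math.exp(b - m))

trace = parallel_tempering(log_post, [1.0, 0.3, 0.1], 5000, 1.0, random.Random(1))
```

The swap acceptance rule preserves the product of the tempered distributions, so the cold chain still targets the true posterior while borrowing the hot chains' ability to cross between modes.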
Towards Bayesian Inference of the Fast-Ion Distribution Function
DEFF Research Database (Denmark)
Stagner, L.; Heidbrink, W.W.; Salewski, Mirko
2012-01-01
. However, when theory and experiment disagree (for one or more diagnostics), it is unclear how to proceed. Bayesian statistics provides a framework to infer the DF, quantify errors, and reconcile discrepant diagnostic measurements. Diagnostic errors and ``weight functions" that describe the phase space...... sensitivity of the measurements are incorporated into Bayesian likelihood probabilities, while prior probabilities enforce physical constraints. As an initial step, this poster uses Bayesian statistics to infer the DIII-D electron density profile from multiple diagnostic measurements. Likelihood functions...
Attention as a Bayesian inference process
Chikkerur, Sharat; Serre, Thomas; Tan, Cheston; Poggio, Tomaso
2011-03-01
David Marr famously defined vision as "knowing what is where by seeing". In the framework described here, attention is the inference process that solves the visual recognition problem of what is where. The theory proposes a computational role for attention and leads to a model that performs well in recognition tasks and that predicts some of the main properties of attention at the level of psychophysics and physiology. We propose an algorithmic implementation, a Bayesian network, that can be mapped into the basic functional anatomy of attention involving the ventral stream and the dorsal stream. This description integrates bottom-up, feature-based as well as spatial (context-based) attentional mechanisms. We show that the Bayesian model predicts human eye fixations (considered as a proxy for shifts of attention) in natural scenes well, and can improve accuracy in object recognition tasks involving cluttered real world images. In both cases, we found that the proposed model can predict human performance better than existing bottom-up and top-down computational models.
DEFF Research Database (Denmark)
Iglesias, Juan Eugenio; Sabuncu, Mert Rory; Van Leemput, Koen
2013-01-01
Many segmentation algorithms in medical image analysis use Bayesian modeling to augment local image appearance with prior anatomical knowledge. Such methods often contain a large number of free parameters that are first estimated and then kept fixed during the actual segmentation process. However......, a faithful Bayesian analysis would marginalize over such parameters, accounting for their uncertainty by considering all possible values they may take. Here we propose to incorporate this uncertainty into Bayesian segmentation methods in order to improve the inference process. In particular, we approximate...... the required marginalization over model parameters using computationally efficient Markov chain Monte Carlo techniques. We illustrate the proposed approach using a recently developed Bayesian method for the segmentation of hippocampal subfields in brain MRI scans, showing a significant improvement...
Iglesias, Juan Eugenio; Sabuncu, Mert Rory; Van Leemput, Koen
2013-10-01
Many segmentation algorithms in medical image analysis use Bayesian modeling to augment local image appearance with prior anatomical knowledge. Such methods often contain a large number of free parameters that are first estimated and then kept fixed during the actual segmentation process. However, a faithful Bayesian analysis would marginalize over such parameters, accounting for their uncertainty by considering all possible values they may take. Here we propose to incorporate this uncertainty into Bayesian segmentation methods in order to improve the inference process. In particular, we approximate the required marginalization over model parameters using computationally efficient Markov chain Monte Carlo techniques. We illustrate the proposed approach using a recently developed Bayesian method for the segmentation of hippocampal subfields in brain MRI scans, showing a significant improvement in an Alzheimer's disease classification task. As an additional benefit, the technique also allows one to compute informative "error bars" on the volume estimates of individual structures.
Quantum-Like Representation of Non-Bayesian Inference
Asano, M.; Basieva, I.; Khrennikov, A.; Ohya, M.; Tanaka, Y.
2013-01-01
This research is related to the problem of "irrational decision making or inference" that has been discussed in cognitive psychology. There are some experimental studies, and these statistical data cannot be described by classical probability theory. The process of decision making generating these data cannot be reduced to the classical Bayesian inference. For this problem, a number of quantum-like cognitive models of decision making were proposed. Our previous work represented in a natural way the classical Bayesian inference in the framework of quantum mechanics. By using this representation, in this paper, we try to discuss the non-Bayesian (irrational) inference that is biased by effects like quantum interference. Further, we describe a "psychological factor" disturbing "rationality" as an "environment" correlating with the "main system" of usual Bayesian inference.
Classical and Bayesian aspects of robust unit root inference
H. Hoek (Henk); H.K. van Dijk (Herman)
1995-01-01
This paper has two themes. First, we classify some effects which outliers in the data have on unit root inference. We show that, both in a classical and a Bayesian framework, the presence of additive outliers moves ‘standard’ inference towards stationarity. Second, we base inference on a
A Bayesian Approach for Structural Learning with Hidden Markov Models
Directory of Open Access Journals (Sweden)
Cen Li
2002-01-01
Full Text Available Hidden Markov Models (HMMs) have proved to be a successful modeling paradigm for dynamic and spatial processes in many domains, such as speech recognition, genomics, and general sequence alignment. Typically, in these applications, the model structures are predefined by domain experts. Therefore, the HMM learning problem focuses on the learning of the parameter values of the model to fit the given data sequences. However, when one considers other domains, such as economics and physiology, a model structure capturing the system's dynamic behavior is not available. In order to successfully apply the HMM methodology in these domains, it is important that a mechanism is available for automatically deriving the model structure from the data. This paper presents an HMM learning procedure that simultaneously learns the model structure and the maximum likelihood parameter values of an HMM from data. The HMM model structures are derived based on the Bayesian model selection methodology. In addition, we introduce a new initialization procedure for HMM parameter value estimation based on the K-means clustering method. Experimental results with artificially generated data show the effectiveness of the approach.
Universal Darwinism as a process of Bayesian inference
Directory of Open Access Journals (Sweden)
John Oberon Campbell
2016-06-01
Full Text Available Many of the mathematical frameworks describing natural selection are equivalent to Bayes' Theorem, also known as Bayesian updating. By definition, a process of Bayesian Inference is one which involves a Bayesian update, so we may conclude that these frameworks describe natural selection as a process of Bayesian inference. Thus natural selection serves as a counterexample to a widely-held interpretation that restricts Bayesian Inference to human mental processes (including the endeavors of statisticians). As Bayesian inference can always be cast in terms of (variational) free energy minimization, natural selection can be viewed as comprising two components: a generative model of an 'experiment' in the external world environment, and the results of that 'experiment' or the 'surprise' entailed by predicted and actual outcomes of the 'experiment'. Minimization of free energy implies that the implicit measure of 'surprise' experienced serves to update the generative model in a Bayesian manner. This description closely accords with the mechanisms of generalized Darwinian process proposed both by Dawkins, in terms of replicators and vehicles, and Campbell, in terms of inferential systems. Bayesian inference is an algorithm for the accumulation of evidence-based knowledge. This algorithm is now seen to operate over a wide range of evolutionary processes, including natural selection, the evolution of mental models and cultural evolutionary processes, notably including science itself. The variational principle of free energy minimization may thus serve as a unifying mathematical framework for universal Darwinism, the study of evolutionary processes operating throughout nature.
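The claimed equivalence is easy to verify in the discrete case: Bayes' rule reweights hypotheses by likelihood exactly as one step of discrete replicator dynamics reweights types by fitness. A toy numerical check (all numbers illustrative):

```python
def bayes_update(prior, likelihood):
    """Posterior ∝ prior × likelihood, renormalized."""
    unnorm = [p * l for p, l in zip(prior, likelihood)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

def replicator_step(freqs, fitness):
    """Discrete replicator dynamics: x_i' = x_i * f_i / (mean fitness)."""
    mean_fit = sum(x * f for x, f in zip(freqs, fitness))
    return [x * f / mean_fit for x, f in zip(freqs, fitness)]

prior = [0.5, 0.3, 0.2]
likelihood = [0.9, 0.5, 0.1]  # likelihood plays the role of fitness
posterior = bayes_update(prior, likelihood)
next_gen = replicator_step(prior, likelihood)
print(posterior == next_gen)  # the two updates coincide: True
```

The normalizing constant in Bayes' rule is precisely the mean fitness in the replicator equation, which is the algebraic core of the equivalence the paper builds on.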
Schmidt, Paul; Schmid, Volker J; Gaser, Christian; Buck, Dorothea; Bührlen, Susanne; Förschler, Annette; Mühlau, Mark
2013-01-01
Aiming at iron-related T2-hypointensity, which is related to normal aging and neurodegenerative processes, we here present two practicable approaches, based on Bayesian inference, for preprocessing and statistical analysis of a complex set of structural MRI data. In particular, Markov chain Monte Carlo methods were used to simulate posterior distributions. First, we developed a segmentation algorithm that uses outlier detection based on model-checking techniques within a Bayesian mixture model. Second, we developed an analytical tool comprising a Bayesian regression model with smoothness priors (in the form of Gaussian Markov random fields), mitigating the necessity to smooth data prior to statistical analysis. For validation, we used simulated data and MRI data of 27 healthy controls (age: [Formula: see text]; range, [Formula: see text]). We first observed robust segmentation of both simulated T2-hypointensities and gray-matter regions known to be T2-hypointense. Second, simulated data and images of segmented T2-hypointensity were analyzed. We found not only robust identification of simulated effects but also a biologically plausible age-related increase of T2-hypointensity, primarily within the dentate nucleus, and also within the globus pallidus, substantia nigra, and red nucleus. Our results indicate that fully Bayesian inference can successfully be applied for preprocessing and statistical analysis of structural MRI data.
Trans-Dimensional Bayesian Inference for Gravitational Lens Substructures
Brewer, Brendon J; Lewis, Geraint F
2015-01-01
We introduce a Bayesian solution to the problem of inferring the density profile of strong gravitational lenses when the lens galaxy may contain multiple dark or faint substructures. The source and lens models are based on a superposition of an unknown number of non-negative basis functions (or "blobs") whose form was chosen with speed as a primary criterion. The prior distribution for the blobs' properties is specified hierarchically, so the mass function of substructures is a natural output of the method. We use reversible jump Markov Chain Monte Carlo (MCMC) within Diffusive Nested Sampling (DNS) to sample the posterior distribution and evaluate the marginal likelihood of the model, including the summation over the unknown number of blobs in the source and the lens. We demonstrate the method on a simulated data set with a single substructure, which is recovered well with moderate uncertainties. We also apply the method to the g-band image of the "Cosmic Horseshoe" system, and find some hints of potential s...
Bayesian inference and life testing plans for generalized exponential distribution
Institute of Scientific and Technical Information of China (English)
Kundu, Debasis; Pradhan, Biswabrata
2009-01-01
Recently, the generalized exponential distribution has received considerable attention. In this paper, we deal with the Bayesian inference of the unknown parameters of the progressively censored generalized exponential distribution. It is assumed that the scale and the shape parameters have independent gamma priors. The Bayes estimates of the unknown parameters cannot be obtained in closed form. Lindley's approximation and the importance sampling technique have been suggested to compute the approximate Bayes estimates. The Markov chain Monte Carlo method has been used to compute the approximate Bayes estimates and also to construct the highest posterior density credible intervals. We also provide different criteria to compare two different sampling schemes and hence to find the optimal sampling scheme. It is observed that finding the optimum censoring procedure is a computationally expensive process, and we have recommended using the sub-optimal censoring procedure, which can be obtained very easily. Monte Carlo simulations are performed to compare the performances of the different methods, and one data analysis has been performed for illustrative purposes.
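As a rough sketch of the MCMC route described above (simplified to complete, uncensored samples rather than the paper's progressively censored data, and with arbitrary hyperparameter values), one can run a random-walk Metropolis sampler on the log-parameters of the generalized exponential distribution with independent gamma priors:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic lifetimes from a generalized exponential GE(alpha, lam) with
# CDF F(x) = (1 - exp(-lam*x))**alpha, sampled by inverse-CDF.
alpha_true, lam_true, n = 2.0, 1.0, 400
u = rng.uniform(size=n)
x = -np.log(1.0 - u ** (1.0 / alpha_true)) / lam_true

a, b = 1.0, 1.0  # hypothetical Gamma(a, b) hyperparameters for both priors

def log_post(th):
    """Log posterior of (log alpha, log lambda); log-Jacobian included."""
    alpha, lam = np.exp(th)
    loglik = (n * np.log(alpha * lam) - lam * x.sum()
              + (alpha - 1.0) * np.log1p(-np.exp(-lam * x)).sum())
    logprior = (a - 1.0) * np.log(alpha) - b * alpha \
             + (a - 1.0) * np.log(lam) - b * lam
    return loglik + logprior + th.sum()  # th.sum() = log-Jacobian of exp

# Random-walk Metropolis on the log scale
th, lp = np.zeros(2), log_post(np.zeros(2))
samples = []
for i in range(6000):
    prop = th + 0.08 * rng.standard_normal(2)
    lp_prop = log_post(prop)
    if np.log(rng.uniform()) < lp_prop - lp:
        th, lp = prop, lp_prop
    if i >= 1000:                      # discard burn-in
        samples.append(np.exp(th))

alpha_hat, lam_hat = np.mean(samples, axis=0)
print(alpha_hat, lam_hat)
```

Posterior quantiles of the retained samples would give the highest-posterior-density-style credible intervals mentioned in the abstract.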
Bayesian Information Criterion as an Alternative way of Statistical Inference
Directory of Open Access Journals (Sweden)
Nadejda Yu. Gubanova
2012-05-01
The article treats the Bayesian information criterion as an alternative to traditional methods of statistical inference based on NHST. A comparison of ANOVA and BIC results for a psychological experiment is discussed.
Bayesian Information Criterion as an Alternative way of Statistical Inference
Nadejda Yu. Gubanova; Simon Zh. Simavoryan
2012-01-01
The article treats the Bayesian information criterion as an alternative to traditional methods of statistical inference based on NHST. A comparison of ANOVA and BIC results for a psychological experiment is discussed.
VIGoR: Variational Bayesian Inference for Genome-Wide Regression
Directory of Open Access Journals (Sweden)
Akio Onogi
2016-04-01
Genome-wide regression, using a number of genome-wide markers as predictors, is now widely used for genome-wide association mapping and genomic prediction. We developed novel software for genome-wide regression, which we named VIGoR (variational Bayesian inference for genome-wide regression). Variational Bayesian inference is computationally much faster than the widely used Markov chain Monte Carlo algorithms. VIGoR implements seven regression methods and is provided as a command-line program package for Linux/Mac and as a cross-platform R package. In addition to model fitting, cross-validation and hyperparameter tuning using cross-validation can be performed automatically by modifying a single argument. VIGoR is available at https://github.com/Onogi/VIGoR. The R package is also available at https://cran.r-project.org/web/packages/VIGoR/index.html.
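Variational Bayesian inference of the kind VIGoR implements replaces sampling with deterministic coordinate-ascent updates. The following is a minimal, generic illustration (not VIGoR's regression models): mean-field variational Bayes for a normal sample with unknown mean and precision, using the standard conjugate update formulas; the hyperparameter values are arbitrary weak choices.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(2.0, 1.0, size=500)      # synthetic observations
N, xbar = len(x), x.mean()

# Weak hyperparameters: mu ~ N(mu0, 1/(kappa0*tau)), tau ~ Gamma(a0, b0)
mu0, kappa0, a0, b0 = 0.0, 1e-3, 1e-3, 1e-3

# Mean-field factorization q(mu)q(tau): q(mu) Gaussian, q(tau) Gamma.
mu_N = (kappa0 * mu0 + N * xbar) / (kappa0 + N)  # fixed by conjugacy
E_tau = 1.0                                      # initial guess for E_q[tau]
for _ in range(50):                              # coordinate ascent to convergence
    lam_N = (kappa0 + N) * E_tau                 # precision of q(mu)
    a_N = a0 + 0.5 * (N + 1)
    # E_q[(x_i - mu)^2] = (x_i - mu_N)^2 + 1/lam_N, similarly for the prior term
    b_N = b0 + 0.5 * (((x - mu_N) ** 2).sum() + N / lam_N
                      + kappa0 * ((mu_N - mu0) ** 2 + 1.0 / lam_N))
    E_tau = a_N / b_N                            # update E_q[tau] and repeat

print(mu_N, E_tau)   # variational posterior means of mu and tau
```

Each sweep is a cheap closed-form update, which is why this style of inference scales so much better than MCMC for large regression problems.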
Dorn, C; Khan, A; Heng, K; Alibert, Y; Helled, R; Rivoldini, A; Benz, W
2016-01-01
We aim to present a generalized Bayesian inference method for constraining interiors of super Earths and sub-Neptunes. Our methodology succeeds in quantifying the degeneracy and correlation of structural parameters for high dimensional parameter spaces. Specifically, we identify what constraints can be placed on composition and thickness of core, mantle, ice, ocean, and atmospheric layers given observations of mass, radius, and bulk refractory abundance constraints (Fe, Mg, Si) from observations of the host star's photospheric composition. We employed a full probabilistic Bayesian inference analysis that formally accounts for observational and model uncertainties. Using a Markov chain Monte Carlo technique, we computed joint and marginal posterior probability distributions for all structural parameters of interest. We included state-of-the-art structural models based on self-consistent thermodynamics of core, mantle, high-pressure ice, and liquid water. Furthermore, we tested and compared two different atmosp...
Institute of Scientific and Technical Information of China (English)
Liu, Jianfeng; Zhang, Yuan; Zhang, Qin; Wang, Lixian; Zhang, Jigang
2006-01-01
It is a challenging issue to map quantitative trait loci (QTL) underlying complex discrete traits, which usually show discontinuous distributions and carry less information, using conventional statistical methods. The Bayesian-Markov chain Monte Carlo (Bayesian-MCMC) approach is the key procedure for mapping QTL for complex binary traits, as it provides a complete posterior distribution for the QTL parameters using all prior information. As a consequence, Bayesian estimates of all variables of interest can be obtained straightforwardly from their posterior samples simulated by the MCMC algorithm. In our study, the utility of Bayesian-MCMC is demonstrated using several simulated outbred animal full-sib families with different family structures for a complex binary trait underlain by both a QTL and polygenes. Under an identity-by-descent-based variance component random model, three MCMC-based samplers, namely Gibbs sampling, the Metropolis algorithm, and reversible jump MCMC, were implemented to generate the joint posterior distribution of all unknowns, so that the QTL parameters were obtained by Bayesian statistical inference. The results showed that the Bayesian-MCMC approach works well and is robust under different family structures and QTL effects. As family size increases and the number of families decreases, the accuracy of the parameter estimates improves. When the true QTL has a small effect, an outbred population experimental design with large family size is the optimal mapping strategy.
Bayesian Statistical Inference in Ion-Channel Models with Exact Missed Event Correction.
Epstein, Michael; Calderhead, Ben; Girolami, Mark A; Sivilotti, Lucia G
2016-07-26
The stochastic behavior of single ion channels is most often described as an aggregated continuous-time Markov process with discrete states. For ligand-gated channels each state can represent a different conformation of the channel protein or a different number of bound ligands. Single-channel recordings show only whether the channel is open or shut: states of equal conductance are aggregated, so transitions between them have to be inferred indirectly. The requirement to filter noise from the raw signal further complicates the modeling process, as it limits the time resolution of the data. The consequence of the reduced bandwidth is that openings or shuttings that are shorter than the resolution cannot be observed; these are known as missed events. Postulated models fitted using filtered data must therefore explicitly account for missed events to avoid bias in the estimation of rate parameters and therefore assess parameter identifiability accurately. In this article, we present the first, to our knowledge, Bayesian modeling of ion-channels with exact missed events correction. Bayesian analysis represents uncertain knowledge of the true value of model parameters by considering these parameters as random variables. This allows us to gain a full appreciation of parameter identifiability and uncertainty when estimating values for model parameters. However, Bayesian inference is particularly challenging in this context as the correction for missed events increases the computational complexity of the model likelihood. Nonetheless, we successfully implemented a two-step Markov chain Monte Carlo method that we called "BICME", which performs Bayesian inference in models of realistic complexity. The method is demonstrated on synthetic and real single-channel data from muscle nicotinic acetylcholine channels. We show that parameter uncertainty can be characterized more accurately than with maximum-likelihood methods. Our code for performing inference in these ion channel
Bayesian multimodel inference for dose-response studies
Link, W.A.; Albers, P.H.
2007-01-01
Statistical inference in dose–response studies is model-based: the analyst posits a mathematical model of the relation between exposure and response, estimates parameters of the model, and reports conclusions conditional on the model. Such analyses rarely include any accounting for the uncertainties associated with model selection. The Bayesian inferential system provides a convenient framework for model selection and multimodel inference. In this paper we briefly describe the Bayesian paradigm and Bayesian multimodel inference. We then present a family of models for multinomial dose–response data and apply Bayesian multimodel inferential methods to the analysis of data on the reproductive success of American kestrels (Falco sparverius) exposed to various sublethal dietary concentrations of methylmercury.
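Bayesian multimodel inference rests on marginal likelihoods (evidences) and posterior model probabilities. A toy sketch with binomial data, where both evidences are available in closed form; the counts are hypothetical and unrelated to the kestrel data:

```python
from math import comb

y, n = 17, 20  # hypothetical: y successes in n trials

# Marginal likelihoods under two candidate models:
# M1: p = 0.5 exactly;  M2: p ~ Beta(1, 1), giving a Beta-binomial evidence.
ev1 = comb(n, y) * 0.5 ** n
ev2 = 1.0 / (n + 1)   # integral of comb(n,y) * p**y * (1-p)**(n-y) over [0, 1]

# Posterior model probabilities under equal prior odds
p1 = ev1 / (ev1 + ev2)
p2 = 1.0 - p1

# Model-averaged predictive probability of success on the next trial:
# weighted mixture of each model's posterior predictive.
pred = p1 * 0.5 + p2 * (y + 1) / (n + 2)
print(p1, p2, pred)
```

The model-averaged prediction automatically propagates model-selection uncertainty instead of conditioning on a single chosen model.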
Bayesian inference of multiscale structures in porous media
Lefantzi, S.; McKenna, S. A.; Ray, J.; Van Bloemen Waanders, B.
2011-12-01
We demonstrate how to probabilistically infer properties of a porous medium, in particular their spatial variations, with a model that only partially resolves them. We consider a binary porous medium with a spatially varying proportion of the high-permeability phase, such that inclusions of either phase can be embedded within each other. The inclusions are too small to be resolved with a mesh and are distributed in an uneven fashion over the entire domain. Available data include measurements of upscaled permeability at a handful of locations in the domain, as well as breakthrough times from a tracer test. We use these observations to reconstruct the spatial distribution of the proportions of the phases in the domain and to estimate the size of the unresolved inclusions. We overlay a coarse 30 × 20 Cartesian mesh on the region of interest and use it to impose a separation of scales. The inclusions, which are about ten times smaller than the mesh resolution, form the fine scale. Their spatial distribution can be resolved by the mesh and is the coarse-scale variable. The proportionality field and the inclusion size are the targets of the inversion. The key to this multiscale inversion lies in constructing a parametric subgrid model that links the coarse and fine scales together. We do so by exploiting elements of truncated Gaussian random fields and Poisson point processes to represent inclusions geometrically. Existing distance-based upscaling theory of binary media is used to create a model for the effective permeability of a mesh gridblock. The inference is performed by solving a Bayesian inverse problem, predicated on sparse observations. The high-permeability proportionality field is modeled as a multi-Gaussian field and approximated by a 30-term Karhunen-Loève (KL) expansion. Darcy flow is used to estimate breakthrough times, given an upscaled permeability field. The Bayesian inverse problem is solved using an adaptive Markov chain Monte Carlo method for the 30 KL mode weights.
Bayesian inference for multivariate point processes observed at sparsely distributed times
DEFF Research Database (Denmark)
Rasmussen, Jakob Gulddahl; Møller, Jesper; Aukema, B.H.
We consider statistical and computational aspects of simulation-based Bayesian inference for a multivariate point process which is only observed at sparsely distributed times. For specificity we consider a particular data set which has earlier been analyzed by a discrete time model involving unknown normalizing constants. We discuss the advantages and disadvantages of using continuous time processes compared to discrete time processes in the setting of the present paper as well as in other spatial-temporal situations. Keywords: bark beetle, conditional intensity, forest entomology, Markov chain Monte Carlo.
DEFF Research Database (Denmark)
Møller, Jesper; Jacobsen, Robert Dahl
We introduce a promising alternative to the usual hidden Markov tree model for Gaussian wavelet coefficients, where their variances are specified by the hidden states and take values in a finite set. In our new model, the hidden states have a similar dependence structure but they are jointly Gaussian, and the wavelet coefficients have log-variances equal to the hidden states. We argue why this provides a flexible model where frequentist and Bayesian inference procedures become tractable for estimation of parameters and hidden states. Our methodology is illustrated for denoising and edge...
An empirical Bayesian approach for model-based inference of cellular signaling networks
Directory of Open Access Journals (Sweden)
Klinke David J
2009-11-01
Background: A common challenge in systems biology is to infer mechanistic descriptions of a biological process given limited observations of the biological system. Mathematical models are frequently used to represent a belief about the causal relationships among proteins within a signaling network. Bayesian methods provide an attractive framework for inferring the validity of those beliefs in the context of the available data. However, efficient sampling of high-dimensional parameter spaces and appropriate convergence criteria provide barriers to implementing an empirical Bayesian approach. The objective of this study was to apply an adaptive Markov chain Monte Carlo technique to a typical study of cellular signaling pathways. Results: As an illustrative example, a kinetic model for the early signaling events associated with the epidermal growth factor (EGF) signaling network was calibrated against dynamic measurements observed in primary rat hepatocytes. A convergence criterion, based upon the Gelman-Rubin potential scale reduction factor, was applied to the model predictions. The posterior distributions of the parameters exhibited complicated structure, including significant covariance between specific parameters and a broad range of variances among the parameters. The model predictions, in contrast, were narrowly distributed and were used to identify areas of agreement among a collection of experimental studies. Conclusion: In summary, an empirical Bayesian approach was developed for inferring the confidence that one can place in a particular model that describes signal transduction mechanisms and for inferring inconsistencies in experimental measurements.
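The Gelman-Rubin potential scale reduction factor used as the convergence criterion above is straightforward to compute from parallel chains. A minimal implementation of the standard formula (illustrative, not the authors' code):

```python
import numpy as np

def gelman_rubin(chains):
    """Potential scale reduction factor R-hat for an (m, n) array of m chains."""
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    W = chains.var(axis=1, ddof=1).mean()       # mean within-chain variance
    B = n * chains.mean(axis=1).var(ddof=1)     # between-chain variance
    var_hat = (n - 1) / n * W + B / n           # pooled variance estimate
    return np.sqrt(var_hat / W)                 # ~1 when chains have mixed

rng = np.random.default_rng(0)
mixed = rng.normal(0.0, 1.0, size=(4, 2000))    # four well-mixed chains
stuck = mixed + np.array([[0.0], [0.0], [0.0], [3.0]])  # one chain off target

print(gelman_rubin(mixed), gelman_rubin(stuck))
```

Values close to 1 indicate that the between-chain and within-chain variability agree; a chain stuck in a different region inflates the statistic.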
Gelman, Andrew; Stern, Hal S; Dunson, David B; Vehtari, Aki; Rubin, Donald B
2013-01-01
Fundamentals of Bayesian Inference: Probability and Inference; Single-Parameter Models; Introduction to Multiparameter Models; Asymptotics and Connections to Non-Bayesian Approaches; Hierarchical Models. Fundamentals of Bayesian Data Analysis: Model Checking; Evaluating, Comparing, and Expanding Models; Modeling Accounting for Data Collection; Decision Analysis. Advanced Computation: Introduction to Bayesian Computation; Basics of Markov Chain Simulation; Computationally Efficient Markov Chain Simulation; Modal and Distributional Approximations. Regression Models: Introduction to Regression Models; Hierarchical Linear...
Bayesian Networks: Aspects of Approximate Inference
Bolt, J.H.
2008-01-01
A Bayesian network can be used to concisely model the probabilistic knowledge with respect to a given problem domain. Such a network consists of an acyclic directed graph in which the nodes represent stochastic variables, supplemented with probabilities indicating the strength of the influences between...
Directory of Open Access Journals (Sweden)
Hea-Jung Kim
2017-06-01
This paper develops Bayesian inference for the reliability of a class of scale mixtures of log-normal failure time (SMLNFT) models with stochastic (or uncertain) constraints on their reliability measures. The class is comprehensive and includes existing failure time (FT) models (such as log-normal, log-Cauchy, and log-logistic FT models) as well as new models that are robust in terms of heavy-tailed FT observations. Since classical frequentist approaches to reliability analysis based on the SMLNFT model with stochastic constraint are intractable, the Bayesian method is pursued utilizing a Markov chain Monte Carlo (MCMC) sampling-based approach. This paper introduces a two-stage maximum entropy (MaxEnt) prior, which elicits the a priori uncertain constraint, and develops a Bayesian hierarchical SMLNFT model using the prior. The paper also proposes an MCMC method for Bayesian inference in the SMLNFT model reliability and calls attention to properties of the MaxEnt prior that are useful for method development. Finally, two data sets are used to illustrate how the proposed methodology works.
Improving Bayesian population dynamics inference: a coalescent-based model for multiple loci.
Gill, Mandev S; Lemey, Philippe; Faria, Nuno R; Rambaut, Andrew; Shapiro, Beth; Suchard, Marc A
2013-03-01
Effective population size is fundamental in population genetics and characterizes genetic diversity. To infer past population dynamics from molecular sequence data, coalescent-based models have been developed for Bayesian nonparametric estimation of effective population size over time. Among the most successful is a Gaussian Markov random field (GMRF) model for a single gene locus. Here, we present a generalization of the GMRF model that allows for the analysis of multilocus sequence data. Using simulated data, we demonstrate the improved performance of our method to recover true population trajectories and the time to the most recent common ancestor (TMRCA). We analyze a multilocus alignment of HIV-1 CRF02_AG gene sequences sampled from Cameroon. Our results are consistent with HIV prevalence data and uncover some aspects of the population history that go undetected in Bayesian parametric estimation. Finally, we recover an older and more reconcilable TMRCA for a classic ancient DNA data set.
PHAISTOS: a framework for Markov chain Monte Carlo simulation and inference of protein structure.
Boomsma, Wouter; Frellsen, Jes; Harder, Tim; Bottaro, Sandro; Johansson, Kristoffer E; Tian, Pengfei; Stovgaard, Kasper; Andreetta, Christian; Olsson, Simon; Valentin, Jan B; Antonov, Lubomir D; Christensen, Anders S; Borg, Mikael; Jensen, Jan H; Lindorff-Larsen, Kresten; Ferkinghoff-Borg, Jesper; Hamelryck, Thomas
2013-07-15
We present a new software framework for Markov chain Monte Carlo sampling for simulation, prediction, and inference of protein structure. The software package contains implementations of recent advances in Monte Carlo methodology, such as efficient local updates and sampling from probabilistic models of local protein structure. These models form a probabilistic alternative to the widely used fragment and rotamer libraries. Combined with an easily extendible software architecture, this makes PHAISTOS well suited for Bayesian inference of protein structure from sequence and/or experimental data. Currently, two force-fields are available within the framework: PROFASI and OPLS-AA/L, the latter including the generalized Born surface area solvent model. A flexible command-line and configuration-file interface allows users quickly to set up simulations with the desired configuration. PHAISTOS is released under the GNU General Public License v3.0. Source code and documentation are freely available from http://phaistos.sourceforge.net. The software is implemented in C++ and has been tested on Linux and OSX platforms.
Rottman, Benjamin M; Hastie, Reid
2016-06-01
Making judgments by relying on beliefs about the causal relationships between events is a fundamental capacity of everyday cognition. In the last decade, Causal Bayesian Networks have been proposed as a framework for modeling causal reasoning. Two experiments were conducted to provide comprehensive data sets with which to evaluate a variety of different types of judgments in comparison to the standard Bayesian networks calculations. Participants were introduced to a fictional system of three events and observed a set of learning trials that instantiated the multivariate distribution relating the three variables. We tested inferences on chains X1→Y→X2, common cause structures X1←Y→X2, and common effect structures X1→Y←X2, on binary and numerical variables, and with high and intermediate causal strengths. We tested transitive inferences, inferences when one variable is irrelevant because it is blocked by an intervening variable (Markov Assumption), inferences from two variables to a middle variable, and inferences about the presence of one cause when the alternative cause was known to have occurred (the normative "explaining away" pattern). Compared to the normative account, in general, when the judgments should change, they change in the normative direction. However, we also discuss a few persistent violations of the standard normative model. In addition, we evaluate the relative success of 12 theoretical explanations for these deviations.
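The Markov (screening-off) property tested in these experiments can be checked exactly by enumeration on a small network. A sketch with hypothetical CPTs for a chain X1→Y→X2: conditional on Y, learning X1 should not change beliefs about X2.

```python
from itertools import product

# Hypothetical CPTs for a chain X1 -> Y -> X2 over binary variables.
p_x1 = {1: 0.4, 0: 0.6}
p_y_given_x1 = {1: 0.9, 0: 0.2}    # P(Y=1 | X1)
p_x2_given_y = {1: 0.8, 0: 0.3}    # P(X2=1 | Y)

def joint(x1, y, x2):
    """Joint probability factorized along the chain."""
    py = p_y_given_x1[x1] if y else 1 - p_y_given_x1[x1]
    px2 = p_x2_given_y[y] if x2 else 1 - p_x2_given_y[y]
    return p_x1[x1] * py * px2

def prob(query, **given):
    """P(query | given) by brute-force enumeration of the joint."""
    num = den = 0.0
    for x1, y, x2 in product([0, 1], repeat=3):
        state = {"x1": x1, "y": y, "x2": x2}
        if all(state[k] == v for k, v in given.items()):
            den += joint(x1, y, x2)
            if all(state[k] == v for k, v in query.items()):
                num += joint(x1, y, x2)
    return num / den

# Markov assumption: given Y, X1 is irrelevant to X2 on a chain.
p_blocked = prob({"x2": 1}, y=1, x1=1)
p_direct = prob({"x2": 1}, y=1)
print(p_blocked, p_direct)  # both 0.8
```

The normative benchmark against which participants' judgments are compared is exactly this kind of calculation.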
Nonparametric Bayesian inference of the microcanonical stochastic block model
Peixoto, Tiago P
2016-01-01
A principled approach to characterize the hidden modular structure of networks is to formulate generative models, and then infer their parameters from data. When the desired structure is composed of modules or "communities", a suitable choice for this task is the stochastic block model (SBM), where nodes are divided into groups, and the placement of edges is conditioned on the group memberships. Here, we present a nonparametric Bayesian method to infer the modular structure of empirical networks, including the number of modules and their hierarchical organization. We focus on a microcanonical variant of the SBM, where the structure is imposed via hard constraints. We show how this simple model variation allows simultaneously for two important improvements over more traditional inference approaches: 1. Deeper Bayesian hierarchies, with noninformative priors replaced by sequences of priors and hyperpriors, that not only remove limitations that seriously degrade the inference on large networks, but also reveal s...
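For intuition about SBM inference generally, the following sketch uses a simplified *canonical* Bernoulli SBM with maximum-likelihood block densities, not the paper's microcanonical formulation or its hierarchical priors: a planted two-group structure is recovered by greedy likelihood ascent over node labels.

```python
import numpy as np

rng = np.random.default_rng(3)

# Planted two-group network: dense within groups, sparse between.
n, p_in, p_out = 40, 0.8, 0.05
truth = np.array([0] * 20 + [1] * 20)
A = np.zeros((n, n), dtype=int)
for i in range(n):
    for j in range(i + 1, n):
        p = p_in if truth[i] == truth[j] else p_out
        A[i, j] = A[j, i] = int(rng.uniform() < p)

def log_lik(labels):
    """Profile log-likelihood: block edge probabilities set to their MLEs."""
    ll = 0.0
    for r in range(2):
        for s in range(r, 2):
            in_r, in_s = labels == r, labels == s
            if r == s:
                edges = A[np.ix_(in_r, in_r)].sum() / 2
                pairs = in_r.sum() * (in_r.sum() - 1) / 2
            else:
                edges = A[np.ix_(in_r, in_s)].sum()
                pairs = in_r.sum() * in_s.sum()
            if pairs > 0 and 0 < edges < pairs:   # boundary terms contribute 0
                ph = edges / pairs
                ll += edges * np.log(ph) + (pairs - edges) * np.log(1 - ph)
    return ll

# Greedy ascent over single-node label flips, with random restarts.
best, best_ll = None, -np.inf
for _ in range(5):
    labels = rng.integers(0, 2, size=n)
    improved = True
    while improved:
        improved = False
        for i in range(n):
            flipped = labels.copy()
            flipped[i] ^= 1
            if log_lik(flipped) > log_lik(labels):
                labels, improved = flipped, True
    if log_lik(labels) > best_ll:
        best, best_ll = labels, log_lik(labels)

accuracy = max((best == truth).mean(), (best != truth).mean())  # up to label swap
print(accuracy)
```

A full Bayesian treatment would additionally place priors on the partition and block densities and sample or marginalize over them, rather than maximizing.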
Variations on Bayesian Prediction and Inference
2016-05-09
Martin, Ryan
On efficient Bayesian inference for models with stochastic volatility
Griffin, Jim E.; Sakaria, Dhirendra Kumar
2016-01-01
An efficient method for Bayesian inference in stochastic volatility models uses a linear state space representation to define a Gibbs sampler in which the volatilities are jointly updated. This method involves the choice of an offset parameter and we illustrate how its choice can have an important effect on the posterior inference. A Metropolis-Hastings algorithm is developed to robustify this approach to choice of the offset parameter. The method is illustrated on simulated data with known p...
Bayesian Inference in Monte-Carlo Tree Search
Tesauro, Gerald; Segal, Richard
2012-01-01
Monte-Carlo Tree Search (MCTS) methods are drawing great interest after yielding breakthrough results in computer Go. This paper proposes a Bayesian approach to MCTS that is inspired by distribution-free approaches such as UCT [13], yet significantly differs in important respects. The Bayesian framework allows potentially much more accurate (Bayes-optimal) estimation of node values and node uncertainties from a limited number of simulation trials. We further propose propagating inference in the tree via fast analytic Gaussian approximation methods: this can make the overhead of Bayesian inference manageable in domains such as Go, while preserving high accuracy of expected-value estimates. We find substantial empirical outperformance of UCT in an idealized bandit-tree test environment, where we can obtain valuable insights by comparing with known ground truth. Additionally we rigorously prove on-policy and off-policy convergence of the proposed methods.
Fast Bayesian inference of optical trap stiffness and particle diffusion
Bera, Sudipta; Paul, Shuvojit; Singh, Rajesh; Ghosh, Dipanjan; Kundu, Avijit; Banerjee, Ayan; Adhikari, R.
2017-01-01
Bayesian inference provides a principled way of estimating the parameters of a stochastic process that is observed discretely in time. The overdamped Brownian motion of a particle confined in an optical trap is generally modelled by the Ornstein-Uhlenbeck process and can be observed directly in experiment. Here we present Bayesian methods for inferring the parameters of this process, the trap stiffness and the particle diffusion coefficient, that use exact likelihoods and sufficient statistics to arrive at simple expressions for the maximum a posteriori estimates. This obviates the need for Monte Carlo sampling and yields methods that are both fast and accurate. We apply these to experimental data and demonstrate their advantage over commonly used non-Bayesian fitting methods.
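The closed-form estimates the authors describe follow from the AR(1) form of the discretely observed Ornstein-Uhlenbeck process. A sketch with simulated data (arbitrary parameter values; with flat priors the MAP estimates reduce to the maximum-likelihood expressions used here):

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulate an Ornstein-Uhlenbeck process dx = -k*x dt + sqrt(2D) dW via its
# exact discrete-time transition: x' = lam*x + noise, with lam = exp(-k*dt).
k_true, D_true, dt, n = 5.0, 0.5, 1e-3, 100_000
lam = np.exp(-k_true * dt)
sig = np.sqrt(D_true / k_true * (1 - lam ** 2))   # transition noise std
noise = sig * rng.standard_normal(n)
x = np.empty(n)
x[0] = 0.0
for t in range(n - 1):
    x[t + 1] = lam * x[t] + noise[t]

# Point estimates from the sufficient statistics of the AR(1) representation:
lam_hat = (x[:-1] * x[1:]).sum() / (x[:-1] ** 2).sum()   # lag-1 regression
k_hat = -np.log(lam_hat) / dt                            # trap stiffness (scaled)
resid_var = ((x[1:] - lam_hat * x[:-1]) ** 2).mean()
D_hat = k_hat * resid_var / (1 - lam_hat ** 2)           # diffusion coefficient

print(k_hat, D_hat)
```

Because the likelihood depends on the data only through these sums, no Monte Carlo sampling is needed, which is the speed advantage the abstract emphasizes.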
Bayesian inference model for fatigue life of laminated composites
DEFF Research Database (Denmark)
Dimitrov, Nikolay Krasimirov; Kiureghian, Armen Der; Berggreen, Christian
2016-01-01
A probabilistic model for estimating the fatigue life of laminated composite plates is developed. The model is based on lamina-level input data, making it possible to predict fatigue properties for a wide range of laminate configurations. Model parameters are estimated by Bayesian inference...
Stan: A Probabilistic Programming Language for Bayesian Inference and Optimization
Gelman, Andrew; Lee, Daniel; Guo, Jiqiang
2015-01-01
Stan is a free and open-source C++ program that performs Bayesian inference or optimization for arbitrary user-specified models and can be called from the command line, R, Python, Matlab, or Julia and has great promise for fitting large and complex statistical models in many areas of application. We discuss Stan from users' and developers'…
Bayesian electron density inference from JET lithium beam emission spectra using Gaussian processes
Kwak, Sehyun; Brix, M.; Ghim, Y.-C.
2016-01-01
A Bayesian model to infer edge electron density profiles is developed for the JET lithium beam emission spectroscopy system, measuring Li I line radiation using 26 channels with ~1 cm spatial resolution and 10-20 ms temporal resolution. The density profile is modelled using a Gaussian process prior, and the uncertainty of the density profile is calculated by a Markov chain Monte Carlo (MCMC) scheme. From the spectra measured by the transmission grating spectrometer, the Li line intensities are extracted and modelled as a function of the plasma density by a multi-state model which describes the relevant processes between neutral lithium beam atoms and plasma particles. The spectral model fully takes into account interference filter and instrument effects, which are separately estimated, again using Gaussian processes. The line intensities are inferred based on a spectral model consistent with the measured spectra within their uncertainties, which include photon statistics and electronic noise. Our newly devel...
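A Gaussian process prior over profiles works as in standard GP regression; the following is a generic 1D sketch on synthetic data (not the lithium-beam forward model), showing the posterior mean and pointwise uncertainty that such a prior yields.

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy measurements of a smooth 1D profile (stand-in for a density profile).
X = np.linspace(0.0, 1.0, 26)            # e.g. 26 channel positions
y = np.sin(2 * np.pi * X) + 0.05 * rng.standard_normal(X.size)

def rbf(a, b, length=0.15, var=1.0):
    """Squared-exponential covariance between point sets a and b."""
    return var * np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length ** 2)

noise = 0.05 ** 2
K = rbf(X, X) + noise * np.eye(X.size)   # prior covariance + measurement noise
Xs = np.linspace(0.0, 1.0, 101)          # prediction grid

alpha = np.linalg.solve(K, y)
mean = rbf(Xs, X) @ alpha                # GP posterior mean profile
cov = rbf(Xs, Xs) - rbf(Xs, X) @ np.linalg.solve(K, rbf(X, Xs))
std = np.sqrt(np.clip(np.diag(cov), 0.0, None))   # pointwise uncertainty

print(mean[:3], std[:3])                 # mean +/- 2*std gives a credible band
```

In the paper's setting the mapping from profile to data is a nonlinear spectral model, so the posterior is explored by MCMC rather than in closed form as here.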
Energy Technology Data Exchange (ETDEWEB)
Kelly, Dana L.; Malkhasyan, Albert
2010-06-01
There is a nearly ubiquitous assumption in PSA that parameter values are at least piecewise-constant in time. As a result, Bayesian inference tends to incorporate many years of plant operation, over which there have been significant changes in plant operational and maintenance practices, plant management, etc. These changes can cause significant changes in parameter values over time; however, failure to perform Bayesian inference in the proper time-dependent framework can mask these changes. Failure to question the assumption of constant parameter values, and failure to perform Bayesian inference in the proper time-dependent framework were noted as important issues in NUREG/CR-6813, performed for the U. S. Nuclear Regulatory Commission’s Advisory Committee on Reactor Safeguards in 2003. That report noted that “industry lacks tools to perform time-trend analysis with Bayesian updating.” This paper describes an application of time-dependent Bayesian inference methods developed for the European Commission Ageing PSA Network. These methods utilize open-source software, implementing Markov chain Monte Carlo sampling. The paper also illustrates the development of a generic prior distribution, which incorporates multiple sources of generic data via weighting factors that address differences in key influences, such as vendor, component boundaries, conditions of the operating environment, etc.
Bayesian inference of structural brain networks.
Hinne, Max; Heskes, Tom; Beckmann, Christian F; van Gerven, Marcel A J
2013-02-01
Structural brain networks are used to model white-matter connectivity between spatially segregated brain regions. The presence, location and orientation of these white matter tracts can be derived using diffusion-weighted magnetic resonance imaging in combination with probabilistic tractography. Unfortunately, as of yet, none of the existing approaches provide an undisputed way of inferring brain networks from the streamline distributions which tractography produces. State-of-the-art methods rely on an arbitrary threshold or, alternatively, yield weighted results that are difficult to interpret. In this paper, we provide a generative model that explicitly describes how structural brain networks lead to observed streamline distributions. This allows us to draw principled conclusions about brain networks, which we validate using simultaneously acquired resting-state functional MRI data. Inference may be further informed by means of a prior which combines connectivity estimates from multiple subjects. Based on this prior, we obtain networks that significantly improve on the conventional approach.
Ensemble Bayesian model averaging using Markov Chain Monte Carlo sampling
Vrugt, J.A.; Diks, C.G.H.; Clark, M.
2008-01-01
Bayesian model averaging (BMA) has recently been proposed as a statistical method to calibrate forecast ensembles from numerical weather models. Successful implementation of BMA however, requires accurate estimates of the weights and variances of the individual competing models in the ensemble. In t
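For context, the BMA predictive density is a weighted mixture of the member forecast densities; the standard baseline for estimating the weights and variances is EM (the abstract's point is that MCMC sampling is an alternative route to these estimates). A self-contained EM sketch on synthetic forecasts, not the paper's algorithm:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic verification period: observations and two member forecasts
# (illustrative stand-ins, not output of a real weather model).
T = 400
y = rng.normal(20.0, 3.0, T)                  # observations
f = np.stack([y + rng.normal(0.0, 1.0, T),    # member 0: accurate
              y + rng.normal(4.0, 2.0, T)])   # member 1: biased, noisier

K = f.shape[0]
w = np.full(K, 1.0 / K)                       # initial BMA weights
sig2 = np.full(K, 1.0)                        # per-member variances

def normpdf(x, m, v):
    return np.exp(-0.5 * (x - m) ** 2 / v) / np.sqrt(2 * np.pi * v)

for _ in range(100):                          # EM iterations
    dens = w[:, None] * normpdf(y, f, sig2[:, None])    # (K, T)
    z = dens / dens.sum(axis=0)               # E-step: responsibilities
    w = z.mean(axis=1)                        # M-step: weights
    sig2 = (z * (y - f) ** 2).sum(axis=1) / z.sum(axis=1)
    sig2 = np.maximum(sig2, 0.1)              # floor guards against EM degeneracy

print(w)  # weight concentrates on the unbiased member
```

The weights shift almost entirely to the accurate member, illustrating why accurate weight/variance estimates matter for a calibrated ensemble.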
Overstall, Antony M; Woods, David C
2013-06-01
Bayesian inference is considered for statistical models that depend on the evaluation of a computationally expensive computer code or simulator. For such situations, the number of evaluations of the likelihood function, and hence of the unnormalized posterior probability density function, is determined by the available computational resource and may be extremely limited. We present a new example of such a simulator that describes the properties of human embryonic stem cells using data from optical trapping experiments. This application is used to motivate a novel strategy for Bayesian inference which exploits a Gaussian process approximation of the simulator and allows computationally efficient Markov chain Monte Carlo inference. The advantages of this strategy over previous methodology are that it is less reliant on the determination of tuning parameters and allows the application of model diagnostic procedures that require no additional evaluations of the simulator. We show the advantages of our method on synthetic examples and demonstrate its application on stem cell experiments.
Bayesian Inference in the Modern Design of Experiments
DeLoach, Richard
2008-01-01
This paper provides an elementary tutorial overview of Bayesian inference and its potential for application in aerospace experimentation in general and wind tunnel testing in particular. Bayes Theorem is reviewed and examples are provided to illustrate how it can be applied to objectively revise prior knowledge by incorporating insights subsequently obtained from additional observations, resulting in new (posterior) knowledge that combines information from both sources. A logical merger of Bayesian methods and certain aspects of Response Surface Modeling is explored. Specific applications to wind tunnel testing, computational code validation, and instrumentation calibration are discussed.
Bayesian Inference of Giant Exoplanet Physics
Thorngren, Daniel; Fortney, Jonathan J.
2017-01-01
The physical processes within a giant planet directly set its observed radius for a given mass, age, and insolation. The important aspects are the planet’s bulk composition and its interior thermal evolution. By studying many giant planets as an ensemble, we can gain insight into this physics. We demonstrate two novel examples here. We examine 50 cooler transiting giant planets, whose insolation is sufficiently low (T_eff < 1000 K) that they are not affected by the hot Jupiter radius inflation effect. For these planets, the thermal evolution is relatively well understood, and we show that the bulk planet metallicity increases with the total planet mass, which directly impacts plans for future atmospheric studies. We also examine the relation with stellar metallicity and discuss how these relations place new constraints on the core accretion model of planet formation. Our newest work seeks to quantify the flow of energy into hot Jupiters needed to explain their enlarged radii, in addition to their bulk composition. Because the former is related to stellar insolation and the latter is related to mass, we are able to create a hierarchical Bayesian model to disentangle the two effects in our sample of ~300 transiting giant planets. Our results show conclusively that the inflation power is not a simple fraction of stellar insolation: instead, the power increases with incident flux at a much higher rate. We use these results to test published models of giant planet inflation and to provide accurate empirical mass-radius relations for giant planets.
Bayesian inference data evaluation and decisions
Harney, Hanns Ludwig
2016-01-01
This new edition offers a comprehensive introduction to the analysis of data using Bayes' rule. It generalizes Gaussian error intervals to situations in which the data follow distributions other than Gaussian. This is particularly useful when the observed parameter is barely above the background or the histogram of multiparametric data contains many empty bins, so that the determination of the validity of a theory cannot be based on the chi-squared criterion. In addition to the solutions of practical problems, this approach provides an epistemic insight: the logic of quantum mechanics is obtained as the logic of unbiased inference from counting data. New sections feature factorizing parameters, commuting parameters, observables in quantum mechanics, the art of fitting with coherent and with incoherent alternatives and fitting with multinomial distribution. Additional problems and examples help deepen the knowledge. Requiring no knowledge of quantum mechanics, the book is written at an introductory level, with man...
Evolution in Mind: Evolutionary Dynamics, Cognitive Processes, and Bayesian Inference.
Suchow, Jordan W; Bourgin, David D; Griffiths, Thomas L
2017-07-01
Evolutionary theory describes the dynamics of population change in settings affected by reproduction, selection, mutation, and drift. In the context of human cognition, evolutionary theory is most often invoked to explain the origins of capacities such as language, metacognition, and spatial reasoning, framing them as functional adaptations to an ancestral environment. However, evolutionary theory is useful for understanding the mind in a second way: as a mathematical framework for describing evolving populations of thoughts, ideas, and memories within a single mind. In fact, deep correspondences exist between the mathematics of evolution and of learning, with perhaps the deepest being an equivalence between certain evolutionary dynamics and Bayesian inference. This equivalence permits reinterpretation of evolutionary processes as algorithms for Bayesian inference and has relevance for understanding diverse cognitive capacities, including memory and creativity. Copyright © 2017 Elsevier Ltd. All rights reserved.
Bayesian inference from count data using discrete uniform priors.
Comoglio, Federico; Fracchia, Letizia; Rinaldi, Maurizio
2013-01-01
We consider a set of sample counts obtained by sampling arbitrary fractions of a finite volume containing a homogeneously dispersed population of identical objects. We report a Bayesian derivation of the posterior probability distribution of the population size using a binomial likelihood and non-conjugate, discrete uniform priors under sampling with or without replacement. Our derivation yields a computationally feasible formula that can prove useful in a variety of statistical problems involving absolute quantification under uncertainty. We implemented our algorithm in the R package dupiR and compared it with a previously proposed Bayesian method based on a Gamma prior. As a showcase, we demonstrate that our inference framework can be used to estimate bacterial survival curves from measurements characterized by extremely low or zero counts and rather high sampling fractions. All in all, we provide a versatile, general-purpose algorithm to infer population sizes from count data, which can find application in a broad spectrum of biological and physical problems.
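The core computation can be sketched directly for the binomial (with-replacement-style) case: with a discrete uniform prior, the posterior over the population size n is simply the normalised binomial likelihood. This is a from-scratch sketch, not the dupiR package; the values of k, q, and n_max are illustrative.

```python
import numpy as np

def posterior_n(k, q, n_max):
    """Posterior over population size n given count k from sampling
    fraction q, with a discrete uniform prior on n (flat prior cancels)."""
    n = np.arange(k, n_max + 1)          # likelihood is zero for n < k
    # log C(n, k) via cumulative log-factorials (pure numpy, no scipy)
    logfact = np.concatenate(([0.0], np.cumsum(np.log(np.arange(1, n_max + 1)))))
    loglike = (logfact[n] - logfact[k] - logfact[n - k]
               + k * np.log(q) + (n - k) * np.log(1 - q))
    post = np.exp(loglike - loglike.max())
    post /= post.sum()
    return n, post

n, post = posterior_n(k=7, q=0.1, n_max=500)
mean_n = np.sum(n * post)
print(mean_n)  # analytically (k+1)(1-q)/q + k = 79 for these values
```

This illustrates how low counts and known sampling fractions combine into a full posterior rather than a point estimate.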
Granger causality vs. dynamic Bayesian network inference: a comparative study
Directory of Open Access Journals (Sweden)
Feng Jianfeng
2009-04-01
Full Text Available Abstract Background In computational biology, one often faces the problem of deriving the causal relationship among different elements such as genes, proteins, metabolites, neurons and so on, based upon multi-dimensional temporal data. Currently, there are two common approaches used to explore the network structure among elements. One is the Granger causality approach, and the other is the dynamic Bayesian network inference approach. Both have at least a few thousand publications reported in the literature. A key issue is to choose which approach is used to tackle the data, in particular when they give rise to contradictory results. Results In this paper, we provide an answer by focusing on a systematic and computationally intensive comparison between the two approaches on both synthesized and experimental data. For synthesized data, a critical point of the data length is found: the dynamic Bayesian network outperforms the Granger causality approach when the data length is short, and vice versa. We then test our results in experimental data of short length which is a common scenario in current biological experiments: it is again confirmed that the dynamic Bayesian network works better. Conclusion When the data size is short, the dynamic Bayesian network inference performs better than the Granger causality approach; otherwise the Granger causality approach is better.
Braak, ter C.J.F.
2004-01-01
Differential Evolution (DE) is a simple genetic algorithm for numerical optimization in real parameter spaces. In a statistical context one would not just want the optimum but also its uncertainty. The uncertainty distribution can be obtained by a Bayesian analysis (after specifying prior and likeli
Inference of gene pathways using mixture Bayesian networks
Directory of Open Access Journals (Sweden)
Ko Younhee
2009-05-01
Full Text Available Abstract Background Inference of gene networks typically relies on measurements across a wide range of conditions or treatments. Although one network structure is predicted, the relationship between genes could vary across conditions. A comprehensive approach to infer general and condition-dependent gene networks was evaluated. This approach integrated Bayesian network and Gaussian mixture models to describe continuous microarray gene expression measurements, and three gene networks were predicted. Results The first reconstructions of a circadian rhythm pathway in honey bees and an adherens junction pathway in mouse embryos were obtained. In addition, general and condition-specific gene relationships, some unexpected, were detected in these two pathways and in a yeast cell-cycle pathway. The mixture Bayesian network approach identified all (honey bee circadian rhythm and mouse adherens junction pathways) or the vast majority (yeast cell-cycle pathway) of the gene relationships reported in empirical studies. Findings across the three pathways and data sets indicate that the mixture Bayesian network approach is well-suited to infer gene pathways based on microarray data. Furthermore, the interpretation of model estimates provided a broader understanding of the relationships between genes. The mixture models offered a comprehensive description of the relationships among genes in complex biological processes or across a wide range of conditions. The mixture parameter estimates and corresponding odds that the gene network inferred for a sample pertained to each mixture component allowed the uncovering of both general and condition-dependent gene relationships and patterns of expression. Conclusion This study demonstrated the two main benefits of learning gene pathways using mixture Bayesian networks. First, the identification of the optimal number of mixture components supported by the data offered a robust approach to infer gene relationships and
Super-Resolution Using Hidden Markov Model and Bayesian Detection Estimation Framework
Directory of Open Access Journals (Sweden)
Humblot Fabrice
2006-01-01
Full Text Available This paper presents a new method for super-resolution (SR) reconstruction of a high-resolution (HR) image from several low-resolution (LR) images. The HR image is assumed to be composed of homogeneous regions. Thus, the a priori distribution of the pixels is modeled by a finite mixture model (FMM) and a Potts Markov model (PMM) for the labels. The whole a priori model is then a hierarchical Markov model. The LR images are assumed to be obtained from the HR image by lowpass filtering, arbitrary translation, decimation, and finally corruption by random noise. The problem is then put in a Bayesian detection and estimation framework, and appropriate algorithms are developed based on Markov chain Monte Carlo (MCMC) Gibbs sampling. At the end, we have not only an estimate of the HR image but also an estimate of the classification labels which leads to a segmentation result.
Bayesian inference in the numerical solution of Laplace's equation
Mendes, Fábio Macêdo; da Costa Júnior, Edson Alves
2012-05-01
Inference is not unrelated to numerical analysis: given partial information about a mathematical problem, one has to estimate the unknown "true solution" and uncertainties. Many methods of interpolation (least squares, Kriging, Tikhonov regularization, etc.) also have a probabilistic interpretation. O'Hagan showed that quadratures can also be constructed explicitly as a form of Bayesian inference (O'Hagan, A., BAYESIAN STATISTICS (1992) 4, pp. 345-363). In his framework, the integrand is modeled as a Gaussian process. It is then possible to build a reliable estimate for the value of the integral by conditioning the stochastic process to the known values of the integrand in a finite set of points. The present work applies a similar method for the problem of solving Laplace's equation inside a closed boundary. First, one needs a Gaussian process that yields arbitrary harmonic functions. Second, the boundaries (Dirichlet or Neumann conditions) are used to update these probabilities and to estimate the solution in the whole domain. This procedure is similar to the widely used Boundary Element Method, but differs from it in the treatment of the boundaries. The language of Bayesian inference gives more flexibility on how the boundary conditions and conservation laws can be handled. This flexibility can be used to attain greater accuracy using a coarser discretization of the boundary and can open doors to more efficient implementations.
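The key ingredient, a Gaussian process whose samples are harmonic, can be mimicked in miniature with a finite harmonic feature basis: the real and imaginary parts of z^m all satisfy Laplace's equation, so Bayesian linear regression on these features is a degenerate GP supported on harmonic functions, and conditioning on Dirichlet boundary data yields the interior solution. This is a hedged toy version; the paper's actual GP construction differs.

```python
import numpy as np

def features(x, y, M=6):
    """Harmonic feature map: 1, Re(z^m), Im(z^m) for m = 1..M."""
    z = x + 1j * y
    cols = [np.ones_like(x)]
    for m in range(1, M + 1):
        cols += [(z ** m).real, (z ** m).imag]
    return np.stack(cols, axis=-1)

# Dirichlet data on the unit circle for the harmonic u(x, y) = x^2 - y^2.
theta = np.linspace(0, 2 * np.pi, 40, endpoint=False)
xb, yb = np.cos(theta), np.sin(theta)
ub = xb ** 2 - yb ** 2

# Posterior mean of the weights under a Gaussian prior (ridge form with a
# tiny nugget for numerical stability); every posterior sample is harmonic.
Phi = features(xb, yb)
A = Phi.T @ Phi + 1e-8 * np.eye(Phi.shape[1])
w_mean = np.linalg.solve(A, Phi.T @ ub)

# Posterior mean of the solution at an interior point.
x0, y0 = 0.5, 0.0
u0 = features(np.array(x0), np.array(y0)) @ w_mean
print(u0)  # ≈ 0.25, the exact value of x^2 - y^2 at (0.5, 0)
```

Because the true solution lies in the feature span, conditioning on the boundary recovers the interior value almost exactly, which is the mechanism the abstract describes.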
Halo detection via large-scale Bayesian inference
Merson, Alexander I.; Jasche, Jens; Abdalla, Filipe B.; Lahav, Ofer; Wandelt, Benjamin; Jones, D. Heath; Colless, Matthew
2016-08-01
We present a proof-of-concept of a novel and fully Bayesian methodology designed to detect haloes of different masses in cosmological observations subject to noise and systematic uncertainties. Our methodology combines the previously published Bayesian large-scale structure inference algorithm, HAmiltonian Density Estimation and Sampling algorithm (HADES), and a Bayesian chain rule (the Blackwell-Rao estimator), which we use to connect the inferred density field to the properties of dark matter haloes. To demonstrate the capability of our approach, we construct a realistic galaxy mock catalogue emulating the wide-area 6-degree Field Galaxy Survey, which has a median redshift of approximately 0.05. Application of HADES to the catalogue provides us with accurately inferred three-dimensional density fields and corresponding quantification of uncertainties inherent to any cosmological observation. We then use a cosmological simulation to relate the amplitude of the density field to the probability of detecting a halo with mass above a specified threshold. With this information, we can sum over the HADES density field realisations to construct maps of detection probabilities and demonstrate the validity of this approach within our mock scenario. We find that the probability of successful detection of haloes in the mock catalogue increases as a function of the signal to noise of the local galaxy observations. Our proposed methodology can easily be extended to account for more complex scientific questions and is a promising novel tool to analyse the cosmic large-scale structure in observations.
Andrade, Daniel
2012-01-01
We present a new method to propagate lower bounds on conditional probability distributions in conventional Bayesian networks. Our method guarantees to provide outer approximations of the exact lower bounds. A key advantage is that we can use any available algorithms and tools for Bayesian networks in order to represent and infer lower bounds. This new method yields results that are provably exact for trees with binary variables, and results which are competitive to existing approximations in credal networks for all other network structures. Our method is not limited to a specific kind of network structure. It is also not restricted to a specific kind of inference, although we restrict our analysis to prognostic inference in this article. The computational complexity is superior to that of other existing approaches.
Free will in Bayesian and inverse Bayesian inference-driven endo-consciousness.
Gunji, Yukio-Pegio; Minoura, Mai; Kojima, Kei; Horry, Yoichi
2017-06-27
How can we link challenging issues related to consciousness and/or qualia with natural science? The introduction of endo-perspective, instead of exo-perspective, as proposed by Matsuno, Rössler, and Gunji, is considered one of the most promising candidate approaches. Here, we distinguish the endo- from the exo-perspective in terms of whether the external is or is not directly operated. In the endo-perspective, the external can be neither perceived nor recognized directly; rather, one can only indirectly summon something outside of the perspective, which can be illustrated by a causation-reversal pair. On one hand, causation logically proceeds from the cause to the effect. On the other hand, a reversal from the effect to the cause is non-logical and is equipped with a metaphorical structure. We argue that the differences in exo- and endo-perspectives result not from the difference between Western and Eastern cultures, but from differences between modernism and animism. Here, a causation-reversal pair is described using a pair of upward (from premise to consequence) and downward (from consequence to premise) causation and a pair of Bayesian and inverse Bayesian inference (BIB inference). Accordingly, the notion of endo-consciousness is proposed as an agent equipped with BIB inference. We also argue that BIB inference can yield both highly efficient computations through Bayesian inference and robust computations through inverse Bayesian inference. By adapting a logical model of the free will theorem to the BIB inference, we show that endo-consciousness can explain free will as a regression of the controllability of voluntary action. Copyright © 2017. Published by Elsevier Ltd.
Raue, Andreas; Theis, Fabian Joachim; Timmer, Jens
2012-01-01
Increasingly complex applications involve large datasets in combination with non-linear and high dimensional mathematical models. In this context, statistical inference is a challenging issue that calls for pragmatic approaches that take advantage of both Bayesian and frequentist methods. The elegance of Bayesian methodology is founded in the propagation of information content provided by experimental data and prior assumptions to the posterior probability distribution of model predictions. However, for complex applications experimental data and prior assumptions potentially constrain the posterior probability distribution insufficiently. In these situations Bayesian Markov chain Monte Carlo sampling can be infeasible. From a frequentist point of view insufficient experimental data and prior assumptions can be interpreted as non-identifiability. The profile likelihood approach offers a way to detect and resolve non-identifiability iteratively through experimental design. Therefore, it allows one to better constrain t...
Bayesian adaptive Markov chain Monte Carlo estimation of genetic parameters.
Mathew, B; Bauer, A M; Koistinen, P; Reetz, T C; Léon, J; Sillanpää, M J
2012-10-01
Accurate and fast estimation of genetic parameters that underlie quantitative traits using mixed linear models with additive and dominance effects is of great importance in both natural and breeding populations. Here, we propose a new fast adaptive Markov chain Monte Carlo (MCMC) sampling algorithm for the estimation of genetic parameters in the linear mixed model with several random effects. In the learning phase of our algorithm, we use the hybrid Gibbs sampler to learn the covariance structure of the variance components. In the second phase of the algorithm, we use this covariance structure to formulate an effective proposal distribution for a Metropolis-Hastings algorithm, which uses a likelihood function in which the random effects have been integrated out. Compared with the hybrid Gibbs sampler, the new algorithm had better mixing properties and was approximately twice as fast to run. Our new algorithm was able to detect different modes in the posterior distribution. In addition, the posterior mode estimates from the adaptive MCMC method were close to the REML (residual maximum likelihood) estimates. Moreover, our exponential prior for inverse variance components was vague and enabled the estimated mode of the posterior variance to be practically zero, which was in agreement with the support from the likelihood (in the case of no dominance). The method performance is illustrated using simulated data sets with replicates and field data in barley.
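The two-phase idea — learn a covariance structure with a simple sampler, then reuse it to shape Metropolis-Hastings proposals — can be sketched on a toy correlated target. This is a generic adaptive-Metropolis sketch standing in for the paper's hybrid Gibbs learning phase and integrated-likelihood M-H; the target here is a 2-D Gaussian, not a genetic mixed model.

```python
import numpy as np

rng = np.random.default_rng(3)

# Strongly correlated 2-D Gaussian standing in for a variance-component
# posterior with correlated parameters.
Sigma = np.array([[1.0, 0.9], [0.9, 1.0]])
Prec = np.linalg.inv(Sigma)

def logp(x):
    return -0.5 * x @ Prec @ x

def metropolis(n, x0, prop_chol):
    """Random-walk Metropolis with proposal covariance prop_chol @ prop_chol.T"""
    x, lp = x0, logp(x0)
    out = np.empty((n, 2))
    for i in range(n):
        xp = x + prop_chol @ rng.normal(size=2)
        lpp = logp(xp)
        if np.log(rng.uniform()) < lpp - lp:
            x, lp = xp, lpp
        out[i] = x
    return out

# Phase 1: learn the covariance structure with a crude isotropic proposal.
learn = metropolis(5000, np.zeros(2), 0.5 * np.eye(2))
C = np.cov(learn.T)

# Phase 2: proposals shaped by the learned covariance, with the classic
# 2.38^2 / d scaling for random-walk Metropolis.
chol = np.linalg.cholesky((2.38 ** 2 / 2) * C + 1e-9 * np.eye(2))
draws = metropolis(20000, learn[-1], chol)

print(np.cov(draws.T))  # close to the target covariance Sigma
```

Shaping the proposal with the learned covariance is what gives the better mixing the abstract reports relative to the plain sampler.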
Simulation based bayesian econometric inference: principles and some recent computational advances.
L.F. Hoogerheide (Lennart); H.K. van Dijk (Herman); R.D. van Oest (Rutger)
2007-01-01
textabstractIn this paper we discuss several aspects of simulation based Bayesian econometric inference. We start at an elementary level on basic concepts of Bayesian analysis; evaluating integrals by simulation methods is a crucial ingredient in Bayesian inference. Next, the most popular and well-
Inferring animal densities from tracking data using Markov chains.
Directory of Open Access Journals (Sweden)
Whitehead, Hal; Jonsen, Ian D
2013-01-01
Full Text Available The distributions and relative densities of species are keys to ecology. Large amounts of tracking data are being collected on a wide variety of animal species using several methods, especially electronic tags that record location. These tracking data are effectively used for many purposes, but generally provide biased measures of distribution, because the starts of the tracks are not randomly distributed among the locations used by the animals. We introduce a simple Markov-chain method that produces unbiased measures of relative density from tracking data. The density estimates can be over a geographical grid, and/or relative to environmental measures. The method assumes that the tracked animals are a random subset of the population in respect to how they move through the habitat cells, and that the movements of the animals among the habitat cells form a time-homogeneous Markov chain. We illustrate the method using simulated data as well as real data on the movements of sperm whales. The simulations illustrate the bias introduced when the initial tracking locations are not randomly distributed, as well as the lack of bias when the Markov method is used. We believe that this method will be important in giving unbiased estimates of density from the growing corpus of animal tracking data.
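The method can be sketched in a few lines: pool transition counts between habitat cells across all tracks, normalise rows into a transition matrix, and take its stationary distribution as the start-location-unbiased relative density. The tracks and cells below are hypothetical toy data.

```python
import numpy as np

# Toy tracks over 3 habitat cells; note the starts are deliberately
# biased towards cell 0, which would bias a raw location count.
tracks = [
    [0, 0, 1, 2, 2, 1, 2, 2],
    [0, 1, 2, 2, 0, 1, 1, 2],
    [0, 0, 1, 1, 2, 2, 2, 2],
]

K = 3
counts = np.zeros((K, K))
for tr in tracks:
    for a, b in zip(tr[:-1], tr[1:]):
        counts[a, b] += 1                # pooled transition counts

P = counts / counts.sum(axis=1, keepdims=True)   # row-stochastic matrix

# Stationary distribution: left eigenvector of P for eigenvalue 1,
# i.e. long-run relative use of each cell regardless of track starts.
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmax(np.real(vals))])
pi = pi / pi.sum()

print(pi)   # relative density per habitat cell
```

The stationary distribution discounts the over-represented starting cell, which is exactly the bias correction the abstract describes.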
Bayesian inference for kinetic models of biotransformation using a generalized rate equation.
Ying, Shanshan; Zhang, Jiangjiang; Zeng, Lingzao; Shi, Jiachun; Wu, Laosheng
2017-03-06
Selecting proper rate equations for the kinetic models is essential to quantify biotransformation processes in the environment. Bayesian model selection method can be used to evaluate the candidate models. However, comparisons of all plausible models can result in high computational cost, while limiting the number of candidate models may lead to biased results. In this work, we developed an integrated Bayesian method to simultaneously perform model selection and parameter estimation by using a generalized rate equation. In the approach, the model hypotheses were represented by discrete parameters and the rate constants were represented by continuous parameters. Then Bayesian inference of the kinetic models was solved by implementing Markov Chain Monte Carlo simulation for parameter estimation with the mixed (i.e., discrete and continuous) priors. The validity of this approach was illustrated through a synthetic case and a nitrogen transformation experimental study. It showed that our method can successfully identify the plausible models and parameters, as well as uncertainties therein. Thus this method can provide a powerful tool to reveal more insightful information for the complex biotransformation processes.
Hu, Zixi; Yao, Zhewei; Li, Jinglai
2017-03-01
Many scientific and engineering problems require Bayesian inference for unknowns of infinite dimension. In such problems, many standard Markov Chain Monte Carlo (MCMC) algorithms become arbitrarily slow under mesh refinement, a behaviour referred to as dimension dependence. To address this, a family of dimension-independent MCMC algorithms, known as the preconditioned Crank-Nicolson (pCN) methods, was proposed to sample the infinite dimensional parameters. In this work we develop an adaptive version of the pCN algorithm, where the covariance operator of the proposal distribution is adjusted based on sampling history to improve the simulation efficiency. We show that the proposed algorithm satisfies an important ergodicity condition under some mild assumptions. Finally, we provide numerical examples to demonstrate the performance of the proposed method.
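A sketch of the plain (non-adaptive) pCN proposal on a truncated spectral discretisation: the proposal preserves the Gaussian prior exactly, so the acceptance ratio involves only the likelihood and does not degenerate as the discretisation dimension grows. The prior and likelihood are toy choices; the paper's contribution, adapting the proposal covariance from sampling history, is not shown here.

```python
import numpy as np

rng = np.random.default_rng(4)

# Discretised infinite-dimensional setting: prior N(0, C) with decaying
# spectrum, likelihood touching only the first coefficient.
d = 100
lam = 1.0 / np.arange(1, d + 1) ** 2       # prior eigenvalues (diag of C)
y, sig2 = 1.0, 1.0

def Phi(u):                                 # negative log-likelihood
    return 0.5 * (y - u[0]) ** 2 / sig2

# pCN proposal: u' = sqrt(1 - b^2) u + b xi, xi ~ N(0, C). The prior part
# cancels in the accept ratio, which depends on Phi alone.
beta = 0.5
u = np.sqrt(lam) * rng.normal(size=d)       # start from a prior draw
n, acc = 40000, 0
trace = np.empty(n)
for i in range(n):
    xi = np.sqrt(lam) * rng.normal(size=d)
    up = np.sqrt(1 - beta ** 2) * u + beta * xi
    if np.log(rng.uniform()) < Phi(u) - Phi(up):
        u = up
        acc += 1
    trace[i] = u[0]

print(trace[5000:].mean())  # ≈ lam[0]*y/(lam[0]+sig2) = 0.5 analytically
print(acc / n)              # healthy acceptance despite d = 100
```

A standard random-walk proposal would need a vanishing step size as d grows; the prior-preserving pCN proposal keeps a fixed beta workable at any resolution.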
A Bayesian method for inferring transmission chains in a partially observed epidemic.
Energy Technology Data Exchange (ETDEWEB)
Marzouk, Youssef M.; Ray, Jaideep
2008-10-01
We present a Bayesian approach for estimating transmission chains and rates in the Abakaliki smallpox epidemic of 1967. The epidemic affected 30 individuals in a community of 74; only the dates of appearance of symptoms were recorded. Our model assumes stochastic transmission of the infections over a social network. Distinct binomial random graphs model intra- and inter-compound social connections, while disease transmission over each link is treated as a Poisson process. Link probabilities and rate parameters are objects of inference. Dates of infection and recovery comprise the remaining unknowns. Distributions for smallpox incubation and recovery periods are obtained from historical data. Using Markov chain Monte Carlo, we explore the joint posterior distribution of the scalar parameters and provide an expected connectivity pattern for the social graph and infection pathway.
Chakraborty, Shubhankar; Roy Chaudhuri, Partha; Das, Prasanta Kr
2016-07-01
In this communication, a novel optical technique has been proposed for the reconstruction of the shape of a Taylor bubble using measurements from multiple arrays of optical sensors. The deviation of an optical beam passing through the bubble depends on the contour of the bubble surface. A theoretical model of the deviation of a beam during the traverse of a Taylor bubble through it has been developed. Using this model and the time history of the deviation captured by the sensor array, the bubble shape has been reconstructed. The reconstruction has been performed using an inverse algorithm based on Bayesian inference and Markov chain Monte Carlo sampling. The reconstructed nose shape has been compared with the true shape, extracted through image processing of high speed images. Finally, an error analysis has been performed to pinpoint the sources of the errors.
NetDiff - Bayesian model selection for differential gene regulatory network inference.
Thorne, Thomas
2016-12-16
Differential networks allow us to better understand the changes in cellular processes that are exhibited in conditions of interest, identifying variations in gene regulation or protein interaction between, for example, cases and controls, or in response to external stimuli. Here we present a novel methodology for the inference of differential gene regulatory networks from gene expression microarray data. Specifically we apply a Bayesian model selection approach to compare models of conserved and varying network structure, and use Gaussian graphical models to represent the network structures. We apply a variational inference approach to the learning of Gaussian graphical models of gene regulatory networks, which enables us to perform Bayesian model selection significantly more efficiently than Markov Chain Monte Carlo approaches. Our method is demonstrated to be more robust than independent analysis of data from multiple conditions when applied to synthetic network data, generating fewer false positive predictions of differential edges. We demonstrate the utility of our approach on real world gene expression microarray data by applying it to existing data from amyotrophic lateral sclerosis cases with and without mutations in C9orf72, and controls, where we are able to identify differential network interactions for further investigation.
Bayesian inference of earthquake parameters from buoy data using a polynomial chaos-based surrogate
Giraldi, Loic
2017-04-07
This work addresses the estimation of the parameters of an earthquake model by the consequent tsunami, with an application to the Chile 2010 event. We are particularly interested in the Bayesian inference of the location, the orientation, and the slip of an Okada-based model of the earthquake ocean floor displacement. The tsunami numerical model is based on the GeoClaw software while the observational data is provided by a single DART® buoy. We propose in this paper a methodology based on polynomial chaos expansion to construct a surrogate model of the wave height at the buoy location. A correlated noise model is first proposed in order to represent the discrepancy between the computational model and the data. This step is necessary, as a classical independent Gaussian noise is shown to be unsuitable for modeling the error, and to prevent convergence of the Markov Chain Monte Carlo sampler. Second, the polynomial chaos model is subsequently improved to handle the variability of the arrival time of the wave, using a preconditioned non-intrusive spectral method. Finally, the construction of a reduced model dedicated to Bayesian inference is proposed. Numerical results are presented and discussed.
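The surrogate-based workflow can be sketched end to end with a cheap stand-in forward model: fit a polynomial chaos expansion (Legendre basis, least squares) to a handful of model runs, then run MCMC against the surrogate alone. The forward model here is a toy cubic rather than GeoClaw, and the simple i.i.d. noise ignores the paper's correlated-noise and arrival-time treatments.

```python
import numpy as np

rng = np.random.default_rng(5)

# Stand-in for an expensive forward model (e.g. a tsunami simulation);
# a cheap cubic so the sketch is self-checking.
def forward(theta):
    return theta ** 3 + theta

# Offline stage: a few "expensive" runs on [-1, 1], then a Legendre
# polynomial chaos surrogate fitted by least squares.
nodes = np.linspace(-1, 1, 15)
runs = forward(nodes)
surrogate = np.polynomial.legendre.Legendre.fit(nodes, runs, deg=5)

# Online stage: Metropolis sampling of theta using only the surrogate.
y_obs, sig = forward(0.3), 0.05     # synthetic buoy-style observation

def logpost(th):
    if abs(th) > 1.0:               # uniform prior on [-1, 1]
        return -np.inf
    return -0.5 * (y_obs - surrogate(th)) ** 2 / sig ** 2

th, lp = 0.0, logpost(0.0)
draws = []
for i in range(30000):
    thp = th + 0.1 * rng.normal()
    lpp = logpost(thp)
    if np.log(rng.uniform()) < lpp - lp:
        th, lp = thp, lpp
    if i >= 5000:
        draws.append(th)

print(np.mean(draws))  # posterior concentrates near the true theta = 0.3
```

Every MCMC step evaluates only the polynomial surrogate, which is the point of the construction: the expensive simulator is run only during the offline fitting stage.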
Bayesian Inference of Synaptic Quantal Parameters from Correlated Vesicle Release
Bird, Alex D.; Wall, Mark J.; Richardson, Magnus J. E.
2016-01-01
Synaptic transmission is both history-dependent and stochastic, resulting in varying responses to presentations of the same presynaptic stimulus. This complicates attempts to infer synaptic parameters and has led to the proposal of a number of different strategies for their quantification. Recently Bayesian approaches have been applied to make more efficient use of the data collected in paired intracellular recordings. Methods have been developed that either provide a complete model of the distribution of amplitudes for isolated responses or approximate the amplitude distributions of a train of post-synaptic potentials, with correct short-term synaptic dynamics but neglecting correlations. In both cases the methods provided significantly improved inference of model parameters as compared to existing mean-variance fitting approaches. However, for synapses with high release probability, low vesicle number or relatively low restock rate and for data in which only one or few repeats of the same pattern are available, correlations between serial events can allow for the extraction of significantly more information from experiment: a more complete Bayesian approach would take this into account also. This has not been possible previously because of the technical difficulty in calculating the likelihood of amplitudes seen in correlated post-synaptic potential trains; however, recent theoretical advances have now rendered the likelihood calculation tractable for a broad class of synaptic dynamics models. Here we present a compact mathematical form for the likelihood in terms of a matrix product and demonstrate how marginals of the posterior provide information on covariance of parameter distributions. The associated computer code for Bayesian parameter inference for a variety of models of synaptic dynamics is provided in the Supplementary Material allowing for quantal and dynamical parameters to be readily inferred from experimental data sets. PMID:27932970
Bayesian inversion of seismic attributes for geological facies using a Hidden Markov Model
Nawaz, Muhammad Atif; Curtis, Andrew
2017-02-01
Markov chain Monte-Carlo (McMC) sampling generates correlated random samples such that their distribution would converge to the true distribution only as the number of samples tends to infinity. In practice, McMC is found to be slow to converge, convergence is not guaranteed to be achieved in finite time, and detection of convergence requires the use of subjective criteria. Although McMC has been used for decades as the algorithm of choice for inference in complex probability distributions, there is a need to seek alternative approaches, particularly in high dimensional problems. Walker & Curtis (2014) developed a method for Bayesian inversion of 2-D spatial data using an exact sampling alternative to McMC which always draws independent samples of the target distribution. Their method thus obviates the need for convergence and removes the concomitant bias exhibited by finite sample sets. Their algorithm is nevertheless computationally intensive and requires large memory. We propose a more efficient method for Bayesian inversion of categorical variables, such as geological facies, that requires no sampling at all. The method is based on a 2-D Hidden Markov Model (2D-HMM) over a grid of cells where observations represent localized data constraining each cell. The data in our example application are seismic attributes such as P- and S-wave impedances and rock density; our categorical variables are the hidden states and represent the geological rock types in each cell: facies defined by distinct lithology and fluid combinations such as shale, brine-sand and gas-sand. The observations at each location are assumed to be generated from a random function of the hidden state (facies) at that location, and to be distributed according to a certain probability distribution that is independent of hidden states at other locations, an assumption referred to as 'localized likelihoods'. The hidden state (facies) at a location cannot be determined solely by the observation at that
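In one dimension, the sampling-free inference that a hidden Markov model affords reduces to the classical forward recursion, which computes exact filtered facies probabilities cell by cell. The sketch below uses made-up transition probabilities and localized likelihoods for three hypothetical facies (shale, brine-sand, gas-sand); the paper's 2D-HMM generalizes this kind of recursion to a grid.

```python
import numpy as np

# Hypothetical 3-facies example: shale, brine-sand, gas-sand.
# The transition matrix encodes spatial continuity of facies; the
# emission likelihoods would come from a rock-physics model of the
# seismic attributes (placeholder values here).
T = np.array([[0.8, 0.15, 0.05],
              [0.2, 0.7,  0.1],
              [0.1, 0.2,  0.7]])
prior = np.array([0.5, 0.3, 0.2])

def forward_posteriors(obs_lik):
    """Filtered facies probabilities p(state_t | obs_1..t) per cell.

    obs_lik: (n_cells, n_states) array of localized likelihoods
    p(observation_t | state_t).
    """
    n, k = obs_lik.shape
    alpha = np.zeros((n, k))
    a = prior * obs_lik[0]
    alpha[0] = a / a.sum()
    for t in range(1, n):
        a = (alpha[t - 1] @ T) * obs_lik[t]
        alpha[t] = a / a.sum()          # normalize to avoid underflow
    return alpha

# Four cells with placeholder localized likelihoods.
lik = np.array([[0.9,  0.05, 0.05],
                [0.4,  0.5,  0.1],
                [0.1,  0.6,  0.3],
                [0.05, 0.15, 0.8]])
post = forward_posteriors(lik)
most_probable = post.argmax(axis=1)     # most probable facies per cell
```

The recursion is exact and deterministic, which is the sense in which such models "require no sampling at all".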
Pooley, C M; Bishop, S C; Marion, G
2015-06-06
Bayesian statistics provides a framework for the integration of dynamic models with incomplete data to enable inference of model parameters and unobserved aspects of the system under study. An important class of dynamic models is discrete state space, continuous-time Markov processes (DCTMPs). Simulated via the Doob-Gillespie algorithm, these have been used to model systems ranging from chemistry to ecology to epidemiology. A new type of proposal, termed 'model-based proposal' (MBP), is developed for the efficient implementation of Bayesian inference in DCTMPs using Markov chain Monte Carlo (MCMC). This new method, which in principle can be applied to any DCTMP, is compared (using simple epidemiological SIS and SIR models as easy to follow exemplars) to a standard MCMC approach and a recently proposed particle MCMC (PMCMC) technique. When measurements are made on a single-state variable (e.g. the number of infected individuals in a population during an epidemic), model-based proposal MCMC (MBP-MCMC) is marginally faster than PMCMC (by a factor of 2-8 for the tests performed), and significantly faster than the standard MCMC scheme (by a factor of 400 at least). However, when model complexity increases and measurements are made on more than one state variable (e.g. simultaneously on the number of infected individuals in spatially separated subpopulations), MBP-MCMC is significantly faster than PMCMC (more than 100-fold for just four subpopulations) and this difference becomes increasingly large. © 2015 The Author(s) Published by the Royal Society. All rights reserved.
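The Doob-Gillespie algorithm mentioned above simulates a discrete state space, continuous-time Markov process exactly by drawing an exponential waiting time from the total event rate and then choosing an event in proportion to its rate. A minimal SIS sketch, with illustrative rates rather than those of the paper's experiments:

```python
import random

def gillespie_sis(beta, gamma, s0, i0, t_end, seed=1):
    """Doob-Gillespie simulation of a stochastic SIS epidemic.

    Events: infection (S, I) -> (S-1, I+1) at rate beta*S*I/N,
            recovery  (S, I) -> (S+1, I-1) at rate gamma*I.
    Returns the trajectory as a list of (time, infected) tuples.
    """
    rng = random.Random(seed)
    s, i, t, n = s0, i0, 0.0, s0 + i0
    traj = [(t, i)]
    while t < t_end and i > 0:
        r_inf = beta * s * i / n
        r_rec = gamma * i
        total = r_inf + r_rec
        t += rng.expovariate(total)          # exponential waiting time
        if rng.random() < r_inf / total:     # pick event by relative rate
            s, i = s - 1, i + 1
        else:
            s, i = s + 1, i - 1
        traj.append((t, i))
    return traj

traj = gillespie_sis(beta=0.3, gamma=0.1, s0=99, i0=1, t_end=50.0)
```

In inference schemes such as MBP-MCMC or particle MCMC, many trajectories of this kind are generated (or perturbed) inside the sampler, which is why the relative cost per proposal matters so much.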
Bayesian Inference for Radio Observations - Going beyond deconvolution
Lochner, Michelle; Kunz, Martin; Natarajan, Iniyan; Oozeer, Nadeem; Smirnov, Oleg; Zwart, Jon
2015-01-01
Radio interferometers suffer from the problem of missing information in their data, due to the gaps between the antennas. This results in artifacts, such as bright rings around sources, in the images obtained. Multiple deconvolution algorithms have been proposed to solve this problem and produce cleaner radio images. However, these algorithms are unable to correctly estimate uncertainties in derived scientific parameters or to always include the effects of instrumental errors. We propose an alternative technique called Bayesian Inference for Radio Observations (BIRO) which uses a Bayesian statistical framework to determine the scientific parameters and instrumental errors simultaneously directly from the raw data, without making an image. We use a simple simulation of Westerbork Synthesis Radio Telescope data including pointing errors and beam parameters as instrumental effects, to demonstrate the use of BIRO.
ANUBIS: artificial neuromodulation using a Bayesian inference system.
Smith, Benjamin J H; Saaj, Chakravarthini M; Allouis, Elie
2013-01-01
Gain tuning is a crucial part of controller design and depends not only on an accurate understanding of the system in question, but also on the designer's ability to predict what disturbances and other perturbations the system will encounter throughout its operation. This letter presents ANUBIS (artificial neuromodulation using a Bayesian inference system), a novel biologically inspired technique for automatically tuning controller parameters in real time. ANUBIS is based on the Bayesian brain concept and modifies it by incorporating a model of the neuromodulatory system comprising four artificial neuromodulators. It has been applied to the controller of EchinoBot, a prototype walking rover for Martian exploration. ANUBIS has been implemented at three levels of the controller: gait generation, foot trajectory planning using Bézier curves, and foot trajectory tracking using a terminal sliding mode controller. We compare the results to a similar system that has been tuned using a multilayer perceptron (MLP). The use of Bayesian inference means that the system retains mathematical interpretability, unlike other intelligent tuning techniques, which use neural networks, fuzzy logic, or evolutionary algorithms. The simulation results show that ANUBIS provides significant improvements in efficiency and adaptability of the three controller components; it allows the robot to react to obstacles and uncertainties faster than the system tuned with the MLP, while maintaining stability and accuracy. As well as advancing rover autonomy, ANUBIS could also be applied to other situations where operating conditions are likely to change or cannot be accurately modeled in advance, such as process control. In addition, it demonstrates one way in which neuromodulation could fit into the Bayesian brain framework.
Nonparametric Bayesian inference of the microcanonical stochastic block model
Peixoto, Tiago P.
2017-01-01
A principled approach to characterize the hidden modular structure of networks is to formulate generative models and then infer their parameters from data. When the desired structure is composed of modules or "communities," a suitable choice for this task is the stochastic block model (SBM), where nodes are divided into groups, and the placement of edges is conditioned on the group memberships. Here, we present a nonparametric Bayesian method to infer the modular structure of empirical networks, including the number of modules and their hierarchical organization. We focus on a microcanonical variant of the SBM, where the structure is imposed via hard constraints, i.e., the generated networks are not allowed to violate the patterns imposed by the model. We show how this simple model variation allows simultaneously for two important improvements over more traditional inference approaches: (1) deeper Bayesian hierarchies, with noninformative priors replaced by sequences of priors and hyperpriors, which not only remove limitations that seriously degrade the inference on large networks but also reveal structures at multiple scales; (2) a very efficient inference algorithm that scales well not only for networks with a large number of nodes and edges but also with an unlimited number of modules. We show also how this approach can be used to sample modular hierarchies from the posterior distribution, as well as to perform model selection. We discuss and analyze the differences between sampling from the posterior and simply finding the single parameter estimate that maximizes it. Furthermore, we expose a direct equivalence between our microcanonical approach and alternative derivations based on the canonical SBM.
Inference of Gene Regulatory Network Based on Local Bayesian Networks.
Liu, Fei; Zhang, Shao-Wu; Guo, Wei-Feng; Wei, Ze-Gang; Chen, Luonan
2016-08-01
The inference of gene regulatory networks (GRNs) from expression data can mine the direct regulations among genes and gain deep insights into biological processes at a network level. During past decades, numerous computational approaches have been introduced for inferring the GRNs. However, many of them still suffer from various problems, e.g., Bayesian network (BN) methods cannot handle large-scale networks due to their high computational complexity, while information theory-based methods cannot identify the directions of regulatory interactions and also suffer from false positive/negative problems. To overcome the limitations, in this work we present a novel algorithm, namely local Bayesian network (LBN), to infer GRNs from gene expression data by using the network decomposition strategy and false-positive edge elimination scheme. Specifically, the LBN algorithm first uses conditional mutual information (CMI) to construct an initial network or GRN, which is decomposed into a number of local networks or GRNs. Then, the BN method is employed to generate a series of local BNs by selecting the k-nearest neighbors of each gene as its candidate regulatory genes, which significantly reduces the exponential search space from all possible GRN structures. Integrating these local BNs forms a tentative network or GRN by performing CMI, which reduces redundant regulations in the GRN and thus alleviates the false positive problem. The final network or GRN can be obtained by iteratively performing CMI and local BN on the tentative network. In the iterative process, the false or redundant regulations are gradually removed. When tested on the benchmark GRN datasets from the DREAM challenge as well as the SOS DNA repair network in E. coli, our results suggest that LBN outperforms other state-of-the-art methods (ARACNE, GENIE3 and NARROMI) significantly, with more accurate and robust performance. In particular, the decomposition strategy with local Bayesian networks not only effectively reduce
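The CMI screening step that LBN relies on can be sketched under a joint-Gaussian assumption, where conditional mutual information reduces to log-determinants of covariance matrices. The toy regulatory chain below (hypothetical simulated data, not the DREAM benchmarks) shows how conditioning on the intermediate gene suppresses the indirect g1-g3 association while the direct g1-g2 association survives:

```python
import numpy as np

rng = np.random.default_rng(7)

def gaussian_cmi(x, y, z):
    """Conditional mutual information I(x; y | z) under a joint-Gaussian
    assumption, via log-determinants of sample covariance matrices."""
    def ld(*cols):
        c = np.atleast_2d(np.cov(np.vstack(cols)))
        return np.linalg.slogdet(c)[1]
    return 0.5 * (ld(x, z) + ld(y, z) - ld(z) - ld(x, y, z))

# Toy regulatory chain g1 -> g2 -> g3: g1 and g3 are correlated, but
# conditionally independent given g2, so CMI(g1, g3 | g2) should be ~0.
g1 = rng.standard_normal(2000)
g2 = 0.9 * g1 + 0.3 * rng.standard_normal(2000)
g3 = 0.9 * g2 + 0.3 * rng.standard_normal(2000)

direct = gaussian_cmi(g1, g2, g3)     # true edge, conditioned on g3
indirect = gaussian_cmi(g1, g3, g2)   # false edge, explained away by g2
```

Thresholding such CMI scores is one simple way to eliminate false-positive edges before any local Bayesian network scoring is attempted.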
Modelling heterotachy in phylogenetic inference by reversible-jump Markov chain Monte Carlo.
Pagel, Mark; Meade, Andrew
2008-12-27
The rate at which a given site in a gene sequence alignment evolves over time may vary. This phenomenon--known as heterotachy--can bias or distort phylogenetic trees inferred from models of sequence evolution that assume rates of evolution are constant. Here, we describe a phylogenetic mixture model designed to accommodate heterotachy. The method sums the likelihood of the data at each site over more than one set of branch lengths on the same tree topology. A branch-length set that is best for one site may differ from the branch-length set that is best for some other site, thereby allowing different sites to have different rates of change throughout the tree. Because rate variation may not be present in all branches, we use a reversible-jump Markov chain Monte Carlo algorithm to identify those branches in which reliable amounts of heterotachy occur. We implement the method in combination with our 'pattern-heterogeneity' mixture model, applying it to simulated data and five published datasets. We find that complex evolutionary signals of heterotachy are routinely present over and above variation in the rate or pattern of evolution across sites, that the reversible-jump method requires far fewer parameters than conventional mixture models to describe it, and serves to identify the regions of the tree in which heterotachy is most pronounced. The reversible-jump procedure also removes the need for a posteriori tests of 'significance' such as the Akaike or Bayesian information criterion tests, or Bayes factors. Heterotachy has important consequences for the correct reconstruction of phylogenies as well as for tests of hypotheses that rely on accurate branch-length information. These include molecular clocks, analyses of tempo and mode of evolution, comparative studies and ancestral state reconstruction. The model is available from the authors' website, and can be used for the analysis of both nucleotide and morphological data.
Bayesian inference for inverse problems occurring in uncertainty analysis
Fu, Shuai; Celeux, Gilles; Bousquet, Nicolas; Couplet, Mathieu
2012-01-01
The inverse problem considered here is to estimate the distribution of a non-observed random variable $X$ from some noisy observed data $Y$ linked to $X$ through a time-consuming physical model $H$. Bayesian inference is considered to take into account prior expert knowledge on $X$ in a small sample size setting. A Metropolis-Hastings within Gibbs algorithm is proposed to compute the posterior distribution of the parameters of $X$ through a data augmentation process. Since calls to $H$ are qu...
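The Metropolis-Hastings-within-Gibbs idea, stripped of the data augmentation and expensive physical model $H$ of the paper, amounts to sweeping over the parameters and applying a one-dimensional MH update to each in turn. A self-contained toy version for a Gaussian model with unknown mean and log standard deviation, with made-up data and illustrative priors:

```python
import math
import random

rng = random.Random(0)
data = [2.1, 1.9, 2.4, 2.0, 1.8, 2.2]  # made-up observations

def log_post(mu, log_sig):
    """Unnormalized log-posterior: Gaussian likelihood with weakly
    informative Gaussian priors on mu and log_sig (illustrative choice)."""
    sig = math.exp(log_sig)
    ll = sum(-0.5 * ((y - mu) / sig) ** 2 - math.log(sig) for y in data)
    lp = -0.5 * (mu / 10) ** 2 - 0.5 * (log_sig / 2) ** 2
    return ll + lp

def mh_within_gibbs(n_iter=5000, step=0.3):
    mu, log_sig = 0.0, 0.0
    samples = []
    for _ in range(n_iter):
        # One Metropolis-Hastings update per coordinate (the "Gibbs" sweep).
        for idx in (0, 1):
            prop = [mu, log_sig]
            prop[idx] += rng.gauss(0, step)
            if math.log(rng.random()) < log_post(*prop) - log_post(mu, log_sig):
                mu, log_sig = prop
        samples.append((mu, log_sig))
    return samples

samples = mh_within_gibbs()
post_mu = sum(m for m, _ in samples[1000:]) / len(samples[1000:])
```

In the paper's setting each evaluation of the likelihood would require a call to the time-consuming model $H$, which is exactly why the number and design of such updates matters.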
Bayesian Inference for Structured Spike and Slab Priors
DEFF Research Database (Denmark)
Andersen, Michael Riis; Winther, Ole; Hansen, Lars Kai
2014-01-01
Sparse signal recovery addresses the problem of solving underdetermined linear inverse problems subject to a sparsity constraint. We propose a novel prior formulation, the structured spike and slab prior, which allows one to incorporate a priori knowledge of the sparsity pattern by imposing a spatial...... Gaussian process on the spike and slab probabilities. Thus, prior information on the structure of the sparsity pattern can be encoded using generic covariance functions. Furthermore, we provide a Bayesian inference scheme for the proposed model based on the expectation propagation framework. Using......
Applying Bayesian Inference to Galileon Solutions of the Muon Problem
Lamm, Henry
2016-01-01
We derive corrections to atomic energy levels from disformal couplings in Galileon theories. Through Bayesian inference, we constrain the cut-off radii and Galileon scale via these corrections. To connect different atomic systems, we assume the various cut-off radii are related by a 1-parameter family of solutions. This introduces a new parameter $\alpha$, which is also constrained. In this model, we predict shifts to muonic helium of $\delta E_{He^3}=1.97^{+9.28}_{-1.87}$ meV and $\delta E_{He^4}=1.69^{+9.25}_{-1.61}$ meV.
Inference-less Density Estimation using Copula Bayesian Networks
Elidan, Gal
2012-01-01
We consider learning continuous probabilistic graphical models in the face of missing data. For non-Gaussian models, learning the parameters and structure of such models depends on our ability to perform efficient inference, and can be prohibitive even for relatively modest domains. Recently, we introduced the Copula Bayesian Network (CBN) density model, a flexible framework that captures complex high-dimensional dependency structures while offering direct control over the univariate marginals, leading to improved generalization. In this work we show that the CBN model also offers significant computational advantages when training data is partially observed. Concretely, we leverage the specialized form of the model to derive a computationally amenable learning objective that is a lower bound on the log-likelihood function. Importantly, our energy-like bound circumvents the need for costly inference of an auxiliary distribution, thus facilitating practical learning of high-dimensional densities. We demonstr...
Inferring on the intentions of others by hierarchical Bayesian learning.
Diaconescu, Andreea O; Mathys, Christoph; Weber, Lilian A E; Daunizeau, Jean; Kasper, Lars; Lomakina, Ekaterina I; Fehr, Ernst; Stephan, Klaas E
2014-09-01
Inferring on others' (potentially time-varying) intentions is a fundamental problem during many social transactions. To investigate the underlying mechanisms, we applied computational modeling to behavioral data from an economic game in which 16 pairs of volunteers (randomly assigned to "player" or "adviser" roles) interacted. The player performed a probabilistic reinforcement learning task, receiving information about a binary lottery from a visual pie chart. The adviser, who received more predictive information, issued an additional recommendation. Critically, the game was structured such that the adviser's incentives to provide helpful or misleading information varied in time. Using a meta-Bayesian modeling framework, we found that the players' behavior was best explained by the deployment of hierarchical learning: they inferred upon the volatility of the advisers' intentions in order to optimize their predictions about the validity of their advice. Beyond learning, volatility estimates also affected the trial-by-trial variability of decisions: participants were more likely to rely on their estimates of advice accuracy for making choices when they believed that the adviser's intentions were presently stable. Finally, our model of the players' inference predicted the players' interpersonal reactivity index (IRI) scores, explicit ratings of the advisers' helpfulness and the advisers' self-reports on their chosen strategy. Overall, our results suggest that humans (i) employ hierarchical generative models to infer on the changing intentions of others, (ii) use volatility estimates to inform decision-making in social interactions, and (iii) integrate estimates of advice accuracy with non-social sources of information. The Bayesian framework presented here can quantify individual differences in these mechanisms from simple behavioral readouts and may prove useful in future clinical studies of maladaptive social cognition.
Molitor, John
2012-03-01
Bayesian methods have seen an increase in popularity in a wide variety of scientific fields, including epidemiology. One of the main reasons for their widespread application is the power of the Markov chain Monte Carlo (MCMC) techniques generally used to fit these models. As a result, researchers often implicitly associate Bayesian models with MCMC estimation procedures. However, Bayesian models do not always require Markov-chain-based methods for parameter estimation. This is important, as MCMC estimation methods, while generally quite powerful, are complex and computationally expensive and suffer from convergence problems related to the manner in which they generate correlated samples used to estimate probability distributions for parameters of interest. In this issue of the Journal, Cole et al. (Am J Epidemiol. 2012;175(5):368-375) present an interesting paper that discusses non-Markov-chain-based approaches to fitting Bayesian models. These methods, though limited, can overcome some of the problems associated with MCMC techniques and promise to provide simpler approaches to fitting Bayesian models. Applied researchers will find these estimation approaches intuitively appealing and will gain a deeper understanding of Bayesian models through their use. However, readers should be aware that other non-Markov-chain-based methods are currently in active development and have been widely published in other fields.
Sparse kernel learning with LASSO and Bayesian inference algorithm.
Gao, Junbin; Kwan, Paul W; Shi, Daming
2010-03-01
Kernelized LASSO (Least Absolute Selection and Shrinkage Operator) has been investigated in two separate recent papers [Gao, J., Antolovich, M., & Kwan, P. H. (2008). L1 LASSO and its Bayesian inference. In W. Wobcke, & M. Zhang (Eds.), Lecture notes in computer science: Vol. 5360 (pp. 318-324); Wang, G., Yeung, D. Y., & Lochovsky, F. (2007). The kernel path in kernelized LASSO. In International conference on artificial intelligence and statistics (pp. 580-587). San Juan, Puerto Rico: MIT Press]. This paper is concerned with learning kernels under the LASSO formulation by adopting a generative Bayesian learning and inference approach. A new robust learning algorithm is proposed which produces a sparse kernel model with the capability of learning regularized parameters and kernel hyperparameters. A comparison with state-of-the-art methods for constructing sparse regression models such as the relevance vector machine (RVM) and the local regularization assisted orthogonal least squares regression (LROLS) is given. The new algorithm is also demonstrated to possess considerable computational advantages. Copyright 2009 Elsevier Ltd. All rights reserved.
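Kernelized LASSO itself (without the paper's Bayesian treatment of hyperparameters) can be sketched as an L1-penalized regression on the columns of a kernel matrix, solved by coordinate descent with soft thresholding. The kernel width and penalty below are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf_kernel(X1, X2, width=1.0):
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * width ** 2))

def kernel_lasso(K, y, lam=0.5, n_iter=200):
    """Coordinate-descent LASSO on kernel columns:
    minimize 0.5*||y - K a||^2 + lam*||a||_1,
    yielding a sparse coefficient vector a, i.e. a sparse kernel model."""
    n = K.shape[1]
    a = np.zeros(n)
    col_sq = (K ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(n):
            r = y - K @ a + K[:, j] * a[j]   # residual excluding column j
            rho = K[:, j] @ r
            # soft-thresholding update for coordinate j
            a[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
    return a

# Toy 1-D regression: y = sin(x) + noise.
X = np.linspace(0, 2 * np.pi, 40)[:, None]
y = np.sin(X[:, 0]) + 0.05 * rng.standard_normal(40)
K = rbf_kernel(X, X)
a = kernel_lasso(K, y)
pred = K @ a
```

The sparsity of `a` plays the same role as the relevance vectors of an RVM: only a subset of training points contribute kernel basis functions to the final model.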
Bayesian inference for generalized linear models for spiking neurons
Directory of Open Access Journals (Sweden)
Sebastian Gerwinn
2010-05-01
Generalized Linear Models (GLMs) are commonly used statistical methods for modelling the relationship between neural population activity and presented stimuli. When the dimension of the parameter space is large, strong regularization has to be used in order to fit GLMs to datasets of realistic size without overfitting. By imposing properly chosen priors over parameters, Bayesian inference provides an effective and principled approach for achieving regularization. Here we show how the posterior distribution over model parameters of GLMs can be approximated by a Gaussian using the Expectation Propagation algorithm. In this way, we obtain an estimate of the posterior mean and posterior covariance, allowing us to calculate Bayesian confidence intervals that characterize the uncertainty about the optimal solution. From the posterior we also obtain a different point estimate, namely the posterior mean as opposed to the commonly used maximum a posteriori estimate. We systematically compare the different inference techniques on simulated as well as on multi-electrode recordings of retinal ganglion cells, and explore the effects of the chosen prior and the performance measure used. We find that good performance can be achieved by choosing a Laplace prior together with the posterior mean estimate.
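Expectation Propagation as used in the abstract is involved to reproduce briefly; the simpler Laplace approximation below illustrates the same underlying idea for a GLM, namely approximating the weight posterior by a Gaussian whose mean and covariance yield approximate credible intervals. The example uses logistic regression on synthetic data; the filter values, prior variance and sample size are made up:

```python
import numpy as np

rng = np.random.default_rng(42)

def laplace_logistic(X, y, prior_var=4.0, n_newton=25):
    """Gaussian (Laplace) approximation to the posterior over
    logistic-regression weights: mean = MAP estimate found by Newton's
    method, covariance = inverse negative Hessian at the MAP."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_newton):
        p = 1 / (1 + np.exp(-X @ w))
        grad = X.T @ (y - p) - w / prior_var            # gradient of log-posterior
        H = -(X.T * (p * (1 - p))) @ X - np.eye(d) / prior_var  # Hessian
        w = w - np.linalg.solve(H, grad)                # Newton step
    cov = np.linalg.inv(-H)
    return w, cov

# Synthetic "stimulus filter" with true weights (1.5, -1.0).
w_true = np.array([1.5, -1.0])
X = rng.standard_normal((500, 2))
y = (rng.random(500) < 1 / (1 + np.exp(-X @ w_true))).astype(float)

w_map, cov = laplace_logistic(X, y)
ci = 1.96 * np.sqrt(np.diag(cov))   # approximate 95% credible half-widths
```

As in the abstract, the Gaussian form gives both a point estimate and a full covariance, so uncertainty statements come for free once the approximation is fit.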
Bayesian inference from count data using discrete uniform priors.
Directory of Open Access Journals (Sweden)
Federico Comoglio
We consider a set of sample counts obtained by sampling arbitrary fractions of a finite volume containing a homogeneously dispersed population of identical objects. We report a Bayesian derivation of the posterior probability distribution of the population size using a binomial likelihood and non-conjugate, discrete uniform priors under sampling with or without replacement. Our derivation yields a computationally feasible formula that can prove useful in a variety of statistical problems involving absolute quantification under uncertainty. We implemented our algorithm in the R package dupiR and compared it with a previously proposed Bayesian method based on a Gamma prior. As a showcase, we demonstrate that our inference framework can be used to estimate bacterial survival curves from measurements characterized by extremely low or zero counts and rather high sampling fractions. All in all, we provide a versatile, general purpose algorithm to infer population sizes from count data, which can find application in a broad spectrum of biological and physical problems.
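The simplest version of the posterior described above is easy to reproduce: if each of the N objects lands in the sampled fraction f independently, the observed count k has a binomial likelihood, which is evaluated across a discrete uniform prior on N and normalized. The count, sampling fraction and prior range below are illustrative, and this is a sketch rather than the dupiR implementation:

```python
from math import comb

def posterior_N(k, f, n_max=500):
    """Posterior over population size N given an observed count k from
    sampling a fraction f of the volume: binomial likelihood
    k ~ Bin(N, f), discrete uniform prior on N in {0, ..., n_max}."""
    lik = [comb(n, k) * f ** k * (1 - f) ** (n - k) if n >= k else 0.0
           for n in range(n_max + 1)]
    z = sum(lik)                      # normalizing constant
    return [l / z for l in lik]

post = posterior_N(k=12, f=0.1)
mean_N = sum(n * p for n, p in enumerate(post))
map_N = max(range(len(post)), key=post.__getitem__)
```

For k = 12 and f = 0.1, the posterior concentrates around N ≈ k/f = 120, with a mean slightly above the mode, as the negative-binomial shape of the posterior tail would suggest.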
Bayesian Inference for Signal-Based Seismic Monitoring
Moore, D.
2015-12-01
Traditional seismic monitoring systems rely on discrete detections produced by station processing software, discarding significant information present in the original recorded signal. SIG-VISA (Signal-based Vertically Integrated Seismic Analysis) is a system for global seismic monitoring through Bayesian inference on seismic signals. By modeling signals directly, our forward model is able to incorporate a rich representation of the physics underlying the signal generation process, including source mechanisms, wave propagation, and station response. This allows inference in the model to recover the qualitative behavior of recent geophysical methods including waveform matching and double-differencing, all as part of a unified Bayesian monitoring system that simultaneously detects and locates events from a global network of stations. We demonstrate recent progress in scaling up SIG-VISA to efficiently process the data stream of global signals recorded by the International Monitoring System (IMS), including comparisons against existing processing methods that show increased sensitivity from our signal-based model and in particular the ability to locate events (including aftershock sequences that can tax analyst processing) precisely from waveform correlation effects. We also provide a Bayesian analysis of an alleged low-magnitude event near the DPRK test site in May 2010 [1] [2], investigating whether such an event could plausibly be detected through automated processing in a signal-based monitoring system. [1] Zhang, Miao and Wen, Lianxing. "Seismological Evidence for a Low-Yield Nuclear Test on 12 May 2010 in North Korea". Seismological Research Letters, January/February 2015. [2] Richards, Paul. "A Seismic Event in North Korea on 12 May 2010". CTBTO SnT 2015 oral presentation, video at https://video-archive.ctbto.org/index.php/kmc/preview/partner_id/103/uiconf_id/4421629/entry_id/0_ymmtpps0/delivery/http
Comparing variational Bayes with Markov chain Monte Carlo for Bayesian computation in neuroimaging.
Nathoo, F S; Lesperance, M L; Lawson, A B; Dean, C B
2013-08-01
In this article, we consider methods for Bayesian computation within the context of brain imaging studies. In such studies, the complexity of the resulting data often necessitates the use of sophisticated statistical models; however, the large size of these data can pose significant challenges for model fitting. We focus specifically on the neuroelectromagnetic inverse problem in electroencephalography, which involves estimating the neural activity within the brain from electrode-level data measured across the scalp. The relationship between the observed scalp-level data and the unobserved neural activity can be represented through an underdetermined dynamic linear model, and we discuss Bayesian computation for such models, where parameters represent the unknown neural sources of interest. We review the inverse problem and discuss variational approximations for fitting hierarchical models in this context. While variational methods have been widely adopted for model fitting in neuroimaging, they have received very little attention in the statistical literature, where Markov chain Monte Carlo is often used. We derive variational approximations for fitting two models: a simple distributed source model and a more complex spatiotemporal mixture model. We compare the approximations to Markov chain Monte Carlo using both synthetic data and a real electroencephalography dataset examining the evoked response related to face perception. The computational advantages of the variational method are demonstrated, and the accuracy of the resulting approximations is clarified.
A tutorial on time-evolving dynamical Bayesian inference
Stankovski, Tomislav; Duggento, Andrea; McClintock, Peter V. E.; Stefanovska, Aneta
2014-12-01
In view of the current availability and variety of measured data, there is an increasing demand for powerful signal processing tools that can cope successfully with the associated problems that often arise when data are being analysed. In practice many of the data-generating systems are not only time-variable, but also influenced by neighbouring systems and subject to random fluctuations (noise) from their environments. To encompass problems of this kind, we present a tutorial on dynamical Bayesian inference of time-evolving coupled systems in the presence of noise. It includes the necessary theoretical description and the algorithms for its implementation. For general programming purposes, a pseudocode description is also given. Examples based on coupled phase and limit-cycle oscillators illustrate the salient features of phase dynamics inference. State domain inference is illustrated with an example of coupled chaotic oscillators. The applicability of the latter example to secure communications based on the modulation of coupling functions is outlined. MATLAB codes for implementation of the method, as well as for the explicit examples, accompany the tutorial.
Bayesian inference for identifying interaction rules in moving animal groups.
Directory of Open Access Journals (Sweden)
Richard P Mann
The emergence of similar collective patterns from different self-propelled particle models of animal groups points to a restricted set of "universal" classes for these patterns. While universality is interesting, it is often the fine details of animal interactions that are of biological importance. Universality thus presents a challenge to inferring such interactions from macroscopic group dynamics since these can be consistent with many underlying interaction models. We present a Bayesian framework for learning animal interaction rules from fine scale recordings of animal movements in swarms. We apply these techniques to the inverse problem of inferring interaction rules from simulation models, showing that parameters can often be inferred from a small number of observations. Our methodology allows us to quantify our confidence in parameter fitting. For example, we show that attraction and alignment terms can be reliably estimated when animals are milling in a torus shape, while interaction radius cannot be reliably measured in such a situation. We assess the importance of rate of data collection and show how to test different models, such as topological and metric neighbourhood models. Taken together our results both inform the design of experiments on animal interactions and suggest how these data should be best analysed.
Inference of Gene Regulatory Network Based on Local Bayesian Networks.
Directory of Open Access Journals (Sweden)
Fei Liu
2016-08-01
The inference of gene regulatory networks (GRNs) from expression data can mine the direct regulations among genes and gain deep insights into biological processes at a network level. During past decades, numerous computational approaches have been introduced for inferring GRNs. However, many of them still suffer from various problems, e.g., Bayesian network (BN) methods cannot handle large-scale networks due to their high computational complexity, while information-theory-based methods cannot identify the directions of regulatory interactions and also suffer from false positive/negative problems. To overcome these limitations, in this work we present a novel algorithm, namely local Bayesian network (LBN), to infer GRNs from gene expression data by using a network decomposition strategy and a false-positive edge elimination scheme. Specifically, the LBN algorithm first uses conditional mutual information (CMI) to construct an initial network or GRN, which is decomposed into a number of local networks or GRNs. Then, the BN method is employed to generate a series of local BNs by selecting the k-nearest neighbors of each gene as its candidate regulatory genes, which significantly reduces the exponential search space over all possible GRN structures. Integrating these local BNs forms a tentative network or GRN by performing CMI, which reduces redundant regulations in the GRN and thus alleviates the false positive problem. The final network or GRN can be obtained by iteratively performing CMI and local BN on the tentative network. In the iterative process, the false or redundant regulations are gradually removed. When tested on the benchmark GRN datasets from the DREAM challenge as well as the SOS DNA repair network in E. coli, our results suggest that LBN significantly outperforms other state-of-the-art methods (ARACNE, GENIE3 and NARROMI), with more accurate and robust performance. In particular, the decomposition strategy with local Bayesian networks not only
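The screening step of the LBN pipeline rests on conditional mutual information for discrete data. As a minimal plug-in-estimator sketch (not the authors' implementation, and without their neighbour-selection or iteration machinery):

```python
import numpy as np

def conditional_mutual_information(x, y, z):
    """Plug-in estimate of I(X; Y | Z) in nats from three discrete sample vectors."""
    x, y, z = (np.asarray(v) for v in (x, y, z))
    cmi = 0.0
    for zv in np.unique(z):
        mask = z == zv
        pz = mask.mean()                       # P(Z = zv)
        xs, ys = x[mask], y[mask]
        for xv in np.unique(xs):
            for yv in np.unique(ys):
                pxy = np.mean((xs == xv) & (ys == yv))   # P(x, y | z)
                px = np.mean(xs == xv)                   # P(x | z)
                py = np.mean(ys == yv)                   # P(y | z)
                if pxy > 0:
                    cmi += pz * pxy * np.log(pxy / (px * py))
    return cmi
```

In a GRN setting, an edge between two genes would be kept only if the estimated CMI, conditioned on candidate co-regulators, exceeds a threshold.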
Bayesian inference for a wave-front model of the neolithization of Europe.
Baggaley, Andrew W; Sarson, Graeme R; Shukurov, Anvar; Boys, Richard J; Golightly, Andrew
2012-07-01
We consider a wave-front model for the spread of neolithic culture across Europe, and use Bayesian inference techniques to provide estimates for the parameters within this model, as constrained by radiocarbon data from southern and western Europe. Our wave-front model allows for both an isotropic background spread (incorporating the effects of local geography) and a localized anisotropic spread associated with major waterways. We introduce an innovative numerical scheme to track the wave front, and use Gaussian process emulators to further increase the efficiency of our model, thereby making Markov chain Monte Carlo methods practical. We allow for uncertainty in the fit of our model, and discuss the inferred distribution of the parameter specifying this uncertainty, along with the distributions of the parameters of our wave-front model. We subsequently use predictive distributions, taking account of parameter uncertainty, to identify radiocarbon sites which do not agree well with our model. These sites may warrant further archaeological study or motivate refinements to the model.
Ancestry inference in complex admixtures via variable-length Markov chain linkage models.
Rodriguez, Jesse M; Bercovici, Sivan; Elmore, Megan; Batzoglou, Serafim
2013-03-01
Inferring the ancestral origin of chromosomal segments in admixed individuals is key for genetic applications, ranging from analyzing population demographics and history, to mapping disease genes. Previous methods addressed ancestry inference by using either weak models of linkage disequilibrium, or large models that make explicit use of ancestral haplotypes. In this paper we introduce ALLOY, an efficient method that incorporates generalized, but highly expressive, linkage disequilibrium models. ALLOY applies a factorial hidden Markov model to capture the parallel process producing the maternal and paternal admixed haplotypes, and models the background linkage disequilibrium in the ancestral populations via an inhomogeneous variable-length Markov chain. We test ALLOY in a broad range of scenarios ranging from recent to ancient admixtures with up to four ancestral populations. We show that ALLOY outperforms the previous state of the art, and is robust to uncertainties in model parameters.
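ALLOY's factorial, variable-length-Markov-chain model is considerably richer than a plain HMM, but its core likelihood computation reduces to the standard forward recursion over hidden ancestry states. A generic, rescaled forward pass (a sketch with hypothetical inputs, not the ALLOY model itself) looks like:

```python
import numpy as np

def forward_log_likelihood(pi, A, B, obs):
    """Log-likelihood of an observation sequence under a discrete HMM.

    pi : (K,) initial state distribution (e.g. ancestral-population proportions)
    A  : (K, K) transition matrix between hidden ancestry states
    B  : (K, M) emission matrix, B[k, o] = P(observed allele o | ancestry k)
    obs: sequence of observed symbols (integers in [0, M))
    """
    alpha = pi * B[:, obs[0]]
    log_lik = 0.0
    for o in obs[1:]:
        c = alpha.sum()                    # rescale to avoid underflow
        log_lik += np.log(c)
        alpha = (alpha / c) @ A * B[:, o]  # propagate, then emit
    return log_lik + np.log(alpha.sum())
```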
Bayesian large-scale structure inference and cosmic web analysis
Leclercq, Florent
2015-01-01
Surveys of the cosmic large-scale structure carry opportunities for building and testing cosmological theories about the origin and evolution of the Universe. This endeavor requires appropriate data assimilation tools, for establishing the contact between survey catalogs and models of structure formation. In this thesis, we present an innovative statistical approach for the ab initio simultaneous analysis of the formation history and morphology of the cosmic web: the BORG algorithm infers the primordial density fluctuations and produces physical reconstructions of the dark matter distribution that underlies observed galaxies, by assimilating the survey data into a cosmological structure formation model. The method, based on Bayesian probability theory, provides accurate means of uncertainty quantification. We demonstrate the application of BORG to the Sloan Digital Sky Survey data and describe the primordial and late-time large-scale structure in the observed volume. We show how the approach has led to the fi...
Metainference: A Bayesian Inference Method for Heterogeneous Systems
Bonomi, Massimiliano; Cavalli, Andrea; Vendruscolo, Michele
2015-01-01
Modelling a complex system is almost invariably a challenging task. The incorporation of experimental observations can be used to improve the quality of a model, and thus to obtain better predictions about the behavior of the corresponding system. This approach, however, is affected by a variety of different errors, especially when a system populates simultaneously an ensemble of different states and experimental data are measured as averages over such states. To address this problem we present a method, called metainference, that combines Bayesian inference, which is a powerful strategy to deal with errors in experimental measurements, with the maximum entropy principle, which represents a rigorous approach to deal with experimental measurements averaged over multiple states. To illustrate the method we present its application to the determination of an ensemble of structures corresponding to the thermal fluctuations of a protein molecule. Metainference thus provides an approach to model complex systems with...
Applying Bayesian inference to Galileon solutions of the muon problem
Lamm, Henry
2016-12-01
We derive corrections to atomic energy levels from disformal couplings in Galileon theories. Through Bayesian inference, we constrain the cutoff radii and Galileon scale via these corrections. To connect different atomic systems, we assume the various cutoff radii are related by a one-parameter family of solutions. This introduces a new parameter α, which is also constrained. In this model, we predict shifts to muonic helium of δE_He3 = 1.97 (+9.28/−1.87) meV and δE_He4 = 1.69 (+9.25/−1.61) meV, as well as for true muonium, δE_TM = 0.06 (+0.46/−0.05) meV.
Bayesian Inference Applied to the Electromagnetic Inverse Problem
Schmidt, D M; Wood, C C; Schmidt, David M.; George, John S.
1998-01-01
We present a new approach to the electromagnetic inverse problem that explicitly addresses the ambiguity associated with its ill-posed character. Rather than calculating a single ``best'' solution according to some criterion, our approach produces a large number of likely solutions that both fit the data and any prior information that is used. While the range of the different likely results is representative of the ambiguity in the inverse problem even with prior information present, features that are common across a large number of the different solutions can be identified and are associated with a high degree of probability. This approach is implemented and quantified within the formalism of Bayesian inference which combines prior information with that from measurement in a common framework using a single measure. To demonstrate this approach, a general neural activation model is constructed that includes a variable number of extended regions of activation and can incorporate a great deal of prior informati...
Unsupervised Transient Light Curve Analysis Via Hierarchical Bayesian Inference
Sanders, Nathan; Soderberg, Alicia
2014-01-01
Historically, light curve studies of supernovae (SNe) and other transient classes have focused on individual objects with copious and high signal-to-noise observations. In the nascent era of wide field transient searches, objects with detailed observations are decreasing as a fraction of the overall known SN population, and this strategy sacrifices the majority of the information contained in the data about the underlying population of transients. A population level modeling approach, simultaneously fitting all available observations of objects in a transient sub-class of interest, fully mines the data to infer the properties of the population and avoids certain systematic biases. We present a novel hierarchical Bayesian statistical model for population level modeling of transient light curves, and discuss its implementation using an efficient Hamiltonian Monte Carlo technique. As a test case, we apply this model to the Type IIP SN sample from the Pan-STARRS1 Medium Deep Survey, consisting of 18,837 photometr...
Bayesian inference on the sphere beyond statistical isotropy
Das, Santanu; Souradeep, Tarun
2015-01-01
We present a general method for Bayesian inference of the underlying covariance structure of random fields on a sphere. We employ the Bipolar Spherical Harmonic (BipoSH) representation of general covariance structure on the sphere. We illustrate the efficacy of the method as a principled approach to assess violation of statistical isotropy (SI) in the sky maps of Cosmic Microwave Background (CMB) fluctuations. SI violation in observed CMB maps can arise from known physical effects such as the Doppler boost and weak lensing; from as-yet-unknown theoretical possibilities such as cosmic topology and subtle violations of the cosmological principle; and from expected observational artefacts of scanning the sky with a non-circular beam, masking, foreground residuals, anisotropic noise, etc. We explicitly demonstrate the recovery of the input SI violation signals with their full statistics in simulated CMB maps. Our formalism easily adapts to exploring parametric physical models with non-SI covariance, as we illustrate for the in...
Bayesian inference underlies the contraction bias in delayed comparison tasks.
Directory of Open Access Journals (Sweden)
Paymon Ashourian
Delayed comparison tasks are widely used in the study of working memory and perception in psychology and neuroscience. It has long been known, however, that decisions in these tasks are biased. When the two stimuli in a delayed comparison trial are small in magnitude, subjects tend to report that the first stimulus is larger than the second stimulus. In contrast, subjects tend to report that the second stimulus is larger than the first when the stimuli are relatively large. Here we study the computational principles underlying this bias, also known as the contraction bias. We propose that the contraction bias results from a Bayesian computation in which a noisy representation of a magnitude is combined with a priori information about the distribution of magnitudes to optimize performance. We test our hypothesis on choice behavior in a visual delayed comparison experiment by studying the effect of (i) changing the prior distribution and (ii) changing the uncertainty in the memorized stimulus. We show that choice behavior in both manipulations is consistent with the performance of an observer who uses Bayesian inference in order to improve performance. Moreover, our results suggest that the contraction bias arises during memory retrieval/decision making and not during memory encoding. These results support the notion that the contraction bias illusion can be understood as resulting from optimality considerations.
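The Bayesian account of the contraction bias can be made concrete with a conjugate-Gaussian sketch: the remembered first stimulus is a noisy encoding of its true magnitude, and the optimal estimate shrinks that memory trace toward the prior mean. The parameter values below are illustrative, not fitted to the paper's data:

```python
# Prior over stimulus magnitudes and memory/encoding noise (illustrative values)
mu_prior, var_prior = 5.0, 4.0
var_noise = 2.0

def posterior_mean(m1):
    """Bayes-optimal estimate of the first stimulus s1 given noisy memory trace m1."""
    w = var_prior / (var_prior + var_noise)   # weight on the observation
    return w * m1 + (1 - w) * mu_prior

# Small stimuli are over-estimated, large ones under-estimated:
print(posterior_mean(2.0))   # → 3.0, pulled up toward the prior mean 5
print(posterior_mean(8.0))   # → 7.0, pulled down toward the prior mean 5
```

The shrinkage weight depends on the memory noise, which is why increasing the delay (and hence the uncertainty in the memorized stimulus) should strengthen the bias, as the experiment manipulates.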
Utilizing Gaze Behavior for Inferring Task Transitions Using Abstract Hidden Markov Models
Directory of Open Access Journals (Sweden)
Daniel Fernando Tello Gamarra
2016-12-01
We demonstrate an improved method for utilizing observed gaze behavior and show that it is useful in inferring hand movement intent during goal-directed tasks. The task dynamics and the relationship between hand and gaze behavior are learned using an Abstract Hidden Markov Model (AHMM). We show that the predicted hand movement transitions occur consistently earlier in AHMM models that include gaze than in models that do not.
Bayesian electron density inference from JET lithium beam emission spectra using Gaussian processes
Kwak, Sehyun; Svensson, J.; Brix, M.; Ghim, Y.-C.; Contributors, JET
2017-03-01
A Bayesian model to infer edge electron density profiles is developed for the JET lithium beam emission spectroscopy (Li-BES) system, measuring Li I (2p-2s) line radiation using 26 channels with ∼1 cm spatial resolution and 10-20 ms temporal resolution. The density profile is modelled using a Gaussian process prior, and the uncertainty of the density profile is calculated by a Markov chain Monte Carlo (MCMC) scheme. From the spectra measured by the transmission grating spectrometer, the Li I line intensities are extracted and modelled as a function of the plasma density by a multi-state model which describes the relevant processes between neutral lithium beam atoms and plasma particles. The spectral model fully takes into account interference filter and instrument effects, which are separately estimated, again using Gaussian processes. The line intensities are inferred based on a spectral model consistent with the measured spectra within their uncertainties, which include photon statistics and electronic noise. Our newly developed method to infer JET edge electron density profiles has the following advantages over the conventional method: (i) it provides full posterior distributions of edge density profiles, including their associated uncertainties; (ii) the available radial range for density profiles is increased to the full observation range (∼26 cm); (iii) an assumption of a monotonic electron density profile is not necessary; (iv) the absolute calibration factor of the diagnostic system is automatically estimated, overcoming the limitation of the conventional technique and allowing us to infer the electron density profiles for all pulses without preprocessing the data or an additional boundary condition; and (v) since the full spectrum is modelled, the procedure of modulating the beam to measure the background signal is only necessary in the case of overlap of the Li I line with impurity lines.
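The Gaussian process prior at the heart of this approach can be illustrated with generic GP regression: given noisy observations, the posterior mean interpolates the data and the posterior variance reverts to the prior away from it. This is only a sketch with a squared-exponential kernel and hypothetical hyperparameters; the actual Li-BES model works through a multi-state spectral forward model, not direct observations of the density:

```python
import numpy as np

def gp_posterior(x_obs, y_obs, x_star, noise_var, amp=1.0, ell=1.0):
    """GP regression posterior mean and variance at x_star, given noisy
    observations y_obs at x_obs, with a squared-exponential kernel."""
    def k(a, b):
        return amp**2 * np.exp(-0.5 * (a[:, None] - b[None, :])**2 / ell**2)
    K = k(x_obs, x_obs) + noise_var * np.eye(len(x_obs))   # data covariance
    Ks = k(x_star, x_obs)                                  # cross-covariance
    Kss = k(x_star, x_star)                                # prior at x_star
    mean = Ks @ np.linalg.solve(K, y_obs)
    cov = Kss - Ks @ np.linalg.solve(K, Ks.T)
    return mean, np.diag(cov)
```

In the full inference, hyperparameters and the calibration factor would themselves be sampled by MCMC rather than fixed.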
Energy Technology Data Exchange (ETDEWEB)
La Russa, D [The Ottawa Hospital Cancer Centre, Ottawa, ON (Canada)
2015-06-15
Purpose: The purpose of this project is to develop a robust method of parameter estimation for a Poisson-based TCP model using Bayesian inference. Methods: Bayesian inference was performed using the PyMC3 probabilistic programming framework written in Python. A Poisson-based TCP regression model that accounts for clonogen proliferation was fit to observed rates of local relapse as a function of equivalent dose in 2 Gy fractions for a population of 623 stage-I non-small-cell lung cancer patients. The Slice Markov Chain Monte Carlo sampling algorithm was used to sample the posterior distributions, and was initiated using the maximum of the posterior distributions found by optimization. The calculation of TCP with each sample step required integration over the free parameter α, which was performed using an adaptive 24-point Gauss-Legendre quadrature. Convergence was verified via inspection of the trace plot and posterior distribution for each of the fit parameters, as well as with comparisons of the most probable parameter values with their respective maximum likelihood estimates. Results: Posterior distributions for α, the standard deviation of α (σ), the average tumour cell-doubling time (Td), and the repopulation delay time (Tk), were generated assuming α/β = 10 Gy, and a fixed clonogen density of 10⁷ cm⁻³. Posterior predictive plots generated from samples from these posterior distributions are in excellent agreement with the observed rates of local relapse used in the Bayesian inference. The most probable values of the model parameters also agree well with maximum likelihood estimates. Conclusion: A robust method of performing Bayesian inference of TCP data using a complex TCP model has been established.
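The α-marginalization step described above can be sketched with numpy's Gauss-Legendre rule. The TCP form and parameter values below are illustrative stand-ins (a simple Poisson TCP without the proliferation terms of the study's fitted model), mapping the 24 nodes onto ±4σ of a normal distribution of radiosensitivity α:

```python
import numpy as np

def tcp_mean(dose, mu_alpha=0.3, sigma_alpha=0.07, n0=1e7, npts=24):
    """Population-average TCP: integrate a Poisson TCP over a normal
    distribution of alpha using npts-point Gauss-Legendre quadrature."""
    nodes, weights = np.polynomial.legendre.leggauss(npts)
    lo, hi = mu_alpha - 4 * sigma_alpha, mu_alpha + 4 * sigma_alpha
    a = 0.5 * (hi - lo) * nodes + 0.5 * (hi + lo)     # map [-1, 1] -> [lo, hi]
    w = 0.5 * (hi - lo) * weights
    pdf = np.exp(-0.5 * ((a - mu_alpha) / sigma_alpha)**2) / (
        sigma_alpha * np.sqrt(2 * np.pi))
    tcp = np.exp(-n0 * np.exp(-a * dose))   # Poisson TCP: P(no surviving clonogens)
    return np.sum(w * pdf * tcp)
```

In the actual analysis this integral is evaluated at every posterior sample step, which is why an efficient fixed-node quadrature matters.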
Empirical Markov Chain Monte Carlo Bayesian analysis of fMRI data.
de Pasquale, F; Del Gratta, C; Romani, G L
2008-08-01
In this work an Empirical Markov Chain Monte Carlo Bayesian approach to analyse fMRI data is proposed. The Bayesian framework is appealing since complex models can be adopted in the analysis both for the image and noise model. Here, the noise autocorrelation is taken into account by adopting an AutoRegressive model of order one and a versatile non-linear model is assumed for the task-related activation. Model parameters include the noise variance and autocorrelation, activation amplitudes and the hemodynamic response function parameters. These are estimated at each voxel from samples of the Posterior Distribution. Prior information is included by means of a 4D spatio-temporal model for the interaction between neighbouring voxels in space and time. The results show that this model can provide smooth estimates from low SNR data while important spatial structures in the data can be preserved. A simulation study is presented in which the accuracy and bias of the estimates are addressed. Furthermore, some results on convergence diagnostic of the adopted algorithm are presented. To validate the proposed approach a comparison of the results with those from a standard GLM analysis, spatial filtering techniques and a Variational Bayes approach is provided. This comparison shows that our approach outperforms the classical analysis and is consistent with other Bayesian techniques. This is investigated further by means of the Bayes Factors and the analysis of the residuals. The proposed approach applied to Blocked Design and Event Related datasets produced reliable maps of activation.
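The AR(1) noise component adopted above has a simple exact likelihood, which is what the sampler evaluates voxel-wise. As a generic sketch (a single stationary AR(1) residual series, not the paper's full 4D spatio-temporal model):

```python
import numpy as np

def ar1_loglik(resid, rho, sigma2):
    """Exact log-likelihood of residuals under a stationary AR(1) noise model:
    e_t = rho * e_{t-1} + w_t,  w_t ~ N(0, sigma2),  |rho| < 1."""
    resid = np.asarray(resid, dtype=float)
    # stationary marginal for the first sample
    var0 = sigma2 / (1 - rho**2)
    ll = -0.5 * (np.log(2 * np.pi * var0) + resid[0]**2 / var0)
    # conditional Gaussian terms for the innovations
    innov = resid[1:] - rho * resid[:-1]
    ll += -0.5 * np.sum(np.log(2 * np.pi * sigma2) + innov**2 / sigma2)
    return ll
```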
Bayesian Inference for Linear Parabolic PDEs with Noisy Boundary Conditions
Ruggeri, Fabrizio
2015-01-07
In this work we develop a hierarchical Bayesian setting to infer unknown parameters in initial-boundary value problems (IBVPs) for one-dimensional linear parabolic partial differential equations. Noisy boundary data and known initial condition are assumed. We derive the likelihood function associated with the forward problem, given some measurements of the solution field subject to Gaussian noise. Such function is then analytically marginalized using the linearity of the equation. Gaussian priors have been assumed for the time-dependent Dirichlet boundary values. Our approach is applied to synthetic data for the one-dimensional heat equation model, where the thermal diffusivity is the unknown parameter. We show how to infer the thermal diffusivity parameter when its prior distribution is lognormal or modeled by means of a space-dependent stationary lognormal random field. We use the Laplace method to provide approximated Gaussian posterior distributions for the thermal diffusivity. Expected information gains and predictive posterior densities for observable quantities are numerically estimated for different experimental setups.
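The Laplace step used above can be sketched generically: locate the posterior mode and match a Gaussian whose variance is the inverse curvature of the negative log-posterior there. A 1-D toy (crude Newton search with finite differences; the quadratic test posterior in the usage note is hypothetical, standing in for the log-diffusivity posterior):

```python
def laplace_approx(neg_log_post, x0, h=1e-4, iters=50):
    """1-D Laplace approximation: return (MAP estimate, variance), where
    variance = 1 / (second derivative of neg_log_post at the MAP)."""
    x = x0
    for _ in range(iters):
        # central finite differences for gradient and curvature
        g = (neg_log_post(x + h) - neg_log_post(x - h)) / (2 * h)
        c = (neg_log_post(x + h) - 2 * neg_log_post(x)
             + neg_log_post(x - h)) / h**2
        x -= g / c                     # Newton step toward the mode
    return x, 1.0 / c

# Toy posterior: Gaussian with mean 2 and variance 0.25 is recovered exactly
m, v = laplace_approx(lambda x: 0.5 * (x - 2.0)**2 / 0.25, x0=0.0)
```

This assumes a unimodal, log-concave posterior near the starting point; otherwise the Newton iteration can fail, which is part of why the paper restricts itself to lognormal-type priors.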
Wu, Chieh-Hsi; Drummond, Alexei J
2011-05-01
We provide a framework for Bayesian coalescent inference from microsatellite data that enables inference of population history parameters averaged over microsatellite mutation models. To achieve this we first implemented a rich family of microsatellite mutation models and related components in the software package BEAST. BEAST is a powerful tool that performs Bayesian MCMC analysis on molecular data to make coalescent and evolutionary inferences. Our implementation permits the application of existing nonparametric methods to microsatellite data. The implemented microsatellite models are based on the replication slippage mechanism and focus on three properties of microsatellite mutation: length dependency of mutation rate, mutational bias toward expansion or contraction, and number of repeat units changed in a single mutation event. We develop a new model that facilitates microsatellite model averaging and Bayesian model selection by transdimensional MCMC. With Bayesian model averaging, the posterior distributions of population history parameters are integrated across a set of microsatellite models and thus account for model uncertainty. Simulated data are used to evaluate our method in terms of accuracy and precision of estimation and also identification of the true mutation model. Finally we apply our method to a red colobus monkey data set as an example.
Rajabi, Mohammad Mahdi; Ataie-Ashtiani, Behzad
2016-05-01
Bayesian inference has traditionally been conceived as the proper framework for the formal incorporation of expert knowledge in parameter estimation of groundwater models. However, conventional Bayesian inference is incapable of taking into account the imprecision essentially embedded in expert-provided information. In order to solve this problem, a number of extensions to conventional Bayesian inference have been introduced in recent years. One of these extensions is 'fuzzy Bayesian inference', which is the result of integrating fuzzy techniques into Bayesian statistics. Fuzzy Bayesian inference has a number of desirable features which make it an attractive approach for incorporating expert knowledge in the parameter estimation process of groundwater models: (1) it is well adapted to the nature of expert-provided information, (2) it allows uncertainty and imprecision to be modelled separately, and (3) it presents a framework for fusing expert-provided information regarding the various inputs of the Bayesian inference algorithm. An important obstacle to employing fuzzy Bayesian inference in groundwater numerical modeling applications, however, is the computational burden, as the required number of numerical model simulations often becomes prohibitively large and computationally infeasible. In this paper, a novel approach to accelerating the fuzzy Bayesian inference algorithm is proposed, based on using approximate posterior distributions derived from surrogate modeling as a screening tool in the computations. The proposed approach is first applied to a synthetic test case of seawater intrusion (SWI) in a coastal aquifer. It is shown that for this synthetic test case, the proposed approach decreases the number of required numerical simulations by an order of magnitude. Then the proposed approach is applied to a real-world test case involving three-dimensional numerical modeling of SWI in Kish Island, located in the Persian Gulf. An expert
Bayesian inference for partially identified models exploring the limits of limited data
Gustafson, Paul
2015-01-01
Contents: Introduction; Identification; What Is against Us?; What Is for Us?; Some Simple Examples of Partially Identified Models; The Road Ahead; The Structure of Inference in Partially Identified Models; Bayesian Inference; The Structure of Posterior Distributions in PIMs; Computational Strategies; Strength of Bayesian Updating, Revisited; Posterior Moments; Credible Intervals; Evaluating the Worth of Inference; Partial Identification versus Model Misspecification; The Siren Call of Identification; Comp
Minsley, Burke J.
2011-01-01
A meaningful interpretation of geophysical measurements requires an assessment of the space of models that are consistent with the data, rather than just a single, ‘best’ model which does not convey information about parameter uncertainty. For this purpose, a trans-dimensional Bayesian Markov chain Monte Carlo (MCMC) algorithm is developed for assessing frequency-domain electromagnetic (FDEM) data acquired from airborne or ground-based systems. By sampling the distribution of models that are consistent with measured data and any prior knowledge, valuable inferences can be made about parameter values such as the likely depth to an interface, the distribution of possible resistivity values as a function of depth and non-unique relationships between parameters. The trans-dimensional aspect of the algorithm allows the number of layers to be a free parameter that is controlled by the data, where models with fewer layers are inherently favoured, which provides a natural measure of parsimony and a significant degree of flexibility in parametrization. The MCMC algorithm is used with synthetic examples to illustrate how the distribution of acceptable models is affected by the choice of prior information, the system geometry and configuration and the uncertainty in the measured system elevation. An airborne FDEM data set that was acquired for the purpose of hydrogeological characterization is also studied. The results compare favorably with traditional least-squares analysis, borehole resistivity and lithology logs from the site, and also provide new information about parameter uncertainty necessary for model assessment.
Bayesian Inference for Functional Dynamics Exploring in fMRI Data.
Guo, Xuan; Liu, Bing; Chen, Le; Chen, Guantao; Pan, Yi; Zhang, Jing
2016-01-01
This paper aims to review state-of-the-art Bayesian-inference-based methods applied to functional magnetic resonance imaging (fMRI) data. Particularly, we focus on one specific long-standing challenge in the computational modeling of fMRI datasets: how to effectively explore typical functional interactions from fMRI time series and the corresponding boundaries of temporal segments. Bayesian inference is a method of statistical inference which has been shown to be a powerful tool to encode dependence relationships among the variables with uncertainty. Here we provide an introduction to a group of Bayesian-inference-based methods for fMRI data analysis, which were designed to detect magnitude or functional connectivity change points and to infer their functional interaction patterns based on corresponding temporal boundaries. We also provide a comparison of three popular Bayesian models, that is, Bayesian Magnitude Change Point Model (BMCPM), Bayesian Connectivity Change Point Model (BCCPM), and Dynamic Bayesian Variable Partition Model (DBVPM), and give a summary of their applications. We envision that more delicate Bayesian inference models will be emerging and play increasingly important roles in modeling brain functions in the years to come.
Rahmati, Vahid; Kirmse, Knut; Marković, Dimitrije; Holthoff, Knut; Kiebel, Stefan J
2016-02-01
Calcium imaging has been used as a promising technique to monitor the dynamic activity of neuronal populations. However, the calcium trace is temporally smeared, which restricts the extraction of quantities of interest such as spike trains of individual neurons. To address this issue, spike reconstruction algorithms have been introduced. One limitation of such reconstructions is that the underlying models are not informed about the biophysics of spike and burst generation. Such prior knowledge could be used to constrain the space of possible spike solutions. Here we describe, in a novel Bayesian approach, how principled knowledge about neuronal dynamics can be employed to infer biophysical variables and parameters from fluorescence traces. By using both synthetic and in vitro recorded fluorescence traces, we demonstrate that the new approach is able to reconstruct different repetitive spiking and/or bursting patterns with accurate single spike resolution. Furthermore, we show that the high inference precision of the new approach is preserved even if the fluorescence trace is rather noisy or if the fluorescence transients show slow rise kinetics lasting several hundred milliseconds, and inhomogeneous rise and decay times. In addition, we discuss the use of the new approach for inferring parameter changes, e.g. due to a pharmacological intervention, as well as for inferring complex characteristics of immature neuronal circuits.
Bayesian networks precipitation model based on hidden Markov analysis and its application
Institute of Scientific and Technical Information of China (English)
Anonymous
2010-01-01
Surface precipitation estimation is very important in hydrologic forecasting. To account for the influence of neighbors on the precipitation of an arbitrary grid cell in the network, Bayesian networks and Markov random fields were adopted to estimate surface precipitation. Spherical coordinates and the expectation-maximization (EM) algorithm were used for region interpolation and to estimate precipitation at an arbitrary point in the region. Surface precipitation was estimated for seven precipitation stations in the Qinghai Lake region. Comparison with other surface precipitation methods, such as the Thiessen polygon method, the distance-weighted mean method and the arithmetic mean method, shows that the proposed method can judge the relationship of precipitation among different points in the area under complicated circumstances, and that its simulation results are more accurate and rational.
Bayesian texture segmentation based on wavelet domain hidden markov tree and the SMAP rule
Institute of Scientific and Technical Information of China (English)
SUN Jun-xi; ZHANG Su; ZHAO Yong-ming; CHEN Ya-zhu
2005-01-01
According to the sequential maximum a posteriori probability (SMAP) rule, this paper proposes a novel multi-scale Bayesian texture segmentation algorithm based on the wavelet domain Hidden Markov Tree (HMT) model. In the proposed scheme, the interscale label transition probability is directly defined and resolved by an EM algorithm. In order to smooth out the variations in the homogeneous regions, intrascale context information is considered. A Gaussian mixture model (GMM) in the redundant wavelet domain is also exploited to formulate the pixel-level statistical features of the texture pattern so as to avoid the influence of variance in pixel brightness. The performance of the proposed method is compared with the state-of-the-art HMTSeg method and evaluated with experimental results.
Wind Farm Reliability Modelling Using Bayesian Networks and Semi-Markov Processes
Directory of Open Access Journals (Sweden)
Robert Adam Sobolewski
2015-09-01
Full Text Available Technical reliability plays an important role among factors affecting the power output of a wind farm. The reliability is determined by an internal collection grid topology and the reliability of its electrical components, e.g. generators, transformers, cables, switch breakers, protective relays, and busbars. A quantitative measure of wind farm reliability can be the probability distribution of combinations of operating and failed states of the farm’s wind turbines. The operating state of a wind turbine is its ability to generate power and to transfer it to an external power grid, which means the availability of the wind turbine and other equipment necessary for the power transfer to the external grid. This measure can be used for quantitative analysis of the impact of various wind farm topologies and the reliability of individual farm components on the farm reliability, and for determining the expected farm output power with consideration of the reliability. This knowledge may be useful in an analysis of power generation reliability in power systems. The paper presents probabilistic models that quantify the wind farm reliability taking into account the above-mentioned technical factors. To formulate the reliability models, Bayesian networks and semi-Markov processes were used. Using Bayesian networks, the wind farm structural reliability was mapped, as well as quantitative characteristics describing equipment reliability. To determine the characteristics, semi-Markov processes were used. The paper presents an example calculation of: (i) the probability distribution of the combination of both operating and failed states of four wind turbines included in the wind farm, and (ii) the expected wind farm output power with consideration of its reliability.
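The example calculation described above — a probability distribution over operating/failed combinations of four turbines, and the resulting expected farm output — can be sketched under the simplifying assumption of independent turbine failures (the paper itself uses Bayesian networks and semi-Markov processes to relax exactly this kind of assumption; the availabilities and rated power below are hypothetical):

```python
from itertools import product

def state_distribution(avail):
    """Probability of each operating(1)/failed(0) combination,
    assuming independent turbine availabilities."""
    dist = {}
    for states in product([0, 1], repeat=len(avail)):
        p = 1.0
        for s, a in zip(states, avail):
            p *= a if s == 1 else (1.0 - a)
        dist[states] = p
    return dist

avail = [0.95, 0.95, 0.90, 0.90]  # hypothetical per-turbine availabilities
rated = 2.0                       # hypothetical rated power per turbine, MW

dist = state_distribution(avail)
expected_power = sum(p * rated * sum(states) for states, p in dist.items())
# expected_power == 2.0 * (0.95 + 0.95 + 0.90 + 0.90) = 7.4 MW
```

A Bayesian-network formulation replaces the independence product with conditional probability tables reflecting the collection grid topology, so that a busbar or cable failure can take down several turbines at once.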
Sandoval-Castellanos, Edson; Palkopoulou, Eleftheria; Dalén, Love
2014-01-01
Inference of population demographic history has vastly improved in recent years due to a number of technological and theoretical advances, including the use of ancient DNA. Approximate Bayesian computation (ABC) stands among the most promising methods due to its simple theoretical foundation and exceptional flexibility. However, the limited availability of user-friendly programs that perform ABC analysis renders it difficult to implement, and hence programming skills are frequently required. In addition, there is limited availability of programs able to deal with heterochronous data. Here we present the software BaySICS: Bayesian Statistical Inference of Coalescent Simulations. BaySICS provides an integrated and user-friendly platform that performs ABC analyses by means of coalescent simulations from DNA sequence data. It estimates historical demographic population parameters and performs hypothesis testing by means of Bayes factors obtained from model comparisons. Although providing specific features that improve inference from datasets with heterochronous data, BaySICS also has several capabilities making it a suitable tool for analysing contemporary genetic datasets. Those capabilities include joint analysis of independent tables, a graphical interface and the implementation of Markov chain Monte Carlo without likelihoods.
Felix: Scaling Inference for Markov Logic with an Operator-based Approach
Niu, Feng; Ré, Christopher; Shavlik, Jude
2011-01-01
We examine how to scale up text-processing applications that are expressed in a language, Markov Logic, that allows one to express both logical and statistical rules. Our idea is to exploit the observation that to build text-processing applications one must solve a host of common subtasks, e.g., named-entity extraction, relationship discovery, coreference resolution. For some subtasks, there are specialized algorithms that achieve both high quality and high performance. But current general-purpose statistical inference approaches are oblivious to these subtasks and so use a single algorithm independent of the subtasks that they are performing. The result is that general-purpose approaches have either lower quality, performance, or both compared to the specialized approaches. To combat this, we present Felix. In Felix, programs are expressed in Markov Logic but are executed using a handful of predefined operators that encapsulate the specialized algorithms for each subtask. Key challenges are that Felix (1) mus...
Bayesian Inference in Polling Technique: 1992 Presidential Polls.
Satake, Eiki
1994-01-01
Explores the potential utility of Bayesian statistical methods in determining the predictability of multiple polls. Compares Bayesian techniques to the classical statistical method employed by pollsters. Considers these questions in the context of the 1992 presidential elections. (HB)
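The Bayesian treatment of a polled proportion alluded to above is the textbook Beta-Binomial conjugate update; the poll numbers below are hypothetical, not the 1992 results discussed in the record:

```python
from math import sqrt

def beta_update(alpha, beta, successes, failures):
    """Conjugate Beta update for a polled proportion:
    Beta(a, b) prior + Binomial data -> Beta(a + s, b + f) posterior."""
    return alpha + successes, beta + failures

# Flat Beta(1, 1) prior, then a hypothetical poll: 520 of 1000 favour candidate A.
a, b = beta_update(1.0, 1.0, 520, 480)
post_mean = a / (a + b)                                   # ~0.520
post_sd = sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))      # posterior std. dev.
```

A second poll is folded in by calling `beta_update` again on the posterior, which is the sense in which Bayesian methods naturally pool multiple polls, in contrast to analysing each with a separate classical confidence interval.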
Kypraios, Theodore; Neal, Peter; Prangle, Dennis
2017-05-01
Likelihood-based inference for disease outbreak data can be very challenging due to the inherent dependence of the data and the fact that they are usually incomplete. In this paper we review recent Approximate Bayesian Computation (ABC) methods for the analysis of such data by fitting to them stochastic epidemic models without having to calculate the likelihood of the observed data. We consider both non-temporal and temporal data and illustrate the methods with a number of examples featuring different models and datasets. In addition, we present extensions to existing algorithms which are easy to implement and provide an improvement to the existing methodology. Finally, R code to implement the algorithms presented in the paper is available on https://github.com/kypraios/epiABC. Copyright © 2016 Elsevier Inc. All rights reserved.
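The core ABC idea in the abstract above — simulate from the epidemic model, keep parameter draws whose simulated data resemble the observations, never evaluate a likelihood — can be sketched with a rejection sampler for a chain-binomial (Reed-Frost) epidemic. Population size, prior range, observed final size, and tolerance are all hypothetical choices here, not values from the paper:

```python
import random

def simulate_final_size(beta, n=50, i0=1, rng=random):
    """Chain-binomial (Reed-Frost) epidemic: return the final number ever infected,
    where beta is the per-contact infection probability."""
    s, i, total = n - i0, i0, i0
    while i > 0 and s > 0:
        p_inf = 1.0 - (1.0 - beta) ** i  # prob. a susceptible is infected this step
        new = sum(1 for _ in range(s) if rng.random() < p_inf)
        s, i, total = s - new, new, total + new
    return total

def abc_rejection(observed, n_sims=2000, tol=3, seed=1):
    """Rejection ABC: keep beta draws whose simulated final size is within tol."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_sims):
        beta = rng.uniform(0.0, 0.2)  # uniform prior on infection probability
        if abs(simulate_final_size(beta, rng=rng) - observed) <= tol:
            accepted.append(beta)
    return accepted

post = abc_rejection(observed=30)  # approximate posterior sample for beta
```

The accepted draws approximate the posterior; shrinking `tol` trades acceptance rate for accuracy, which is the knob the more sophisticated ABC variants reviewed in the paper tune adaptively.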
Hidden Markov induced Dynamic Bayesian Network for recovering time evolving gene regulatory networks
Zhu, Shijia; Wang, Yadong
2015-12-01
Dynamic Bayesian Networks (DBN) have been widely used to recover gene regulatory relationships from time-series data in computational systems biology. Their standard assumption is ‘stationarity’, and therefore, several research efforts have recently been proposed to relax this restriction. However, those methods suffer from three challenges: long running time, low accuracy and reliance on parameter settings. To address these problems, we propose a novel non-stationary DBN model by extending each hidden node of a Hidden Markov Model into a DBN (called HMDBN), which properly handles the underlying time-evolving networks. Correspondingly, an improved structural EM algorithm is proposed to learn the HMDBN. It dramatically reduces the search space, thereby substantially improving computational efficiency. Additionally, we derived a novel generalized Bayesian Information Criterion under the non-stationary assumption (called BWBIC), which can help significantly improve the reconstruction accuracy and largely reduce over-fitting. Moreover, the re-estimation formulas for all parameters of our model are derived, enabling us to avoid reliance on parameter settings. Compared to the state-of-the-art methods, the experimental evaluation of our proposed method on both synthetic and real biological data demonstrates consistently high prediction accuracy and significantly improved computational efficiency, even with no prior knowledge and parameter settings.
Self-associations influence task-performance through Bayesian inference
Directory of Open Access Journals (Sweden)
Sara L Bengtsson
2013-08-01
Full Text Available The way we think about ourselves impacts greatly on our behaviour. This paper describes a behavioural study and a computational model that shed new light on this important area. Participants were primed 'clever' and 'stupid' using a scrambled sentence task, and we measured the effect on response time and error-rate on a rule-association task. First, we observed a confirmation bias effect in that associations to being 'stupid' led to a gradual decrease in performance, whereas associations to being 'clever' did not. Second, we observed that the activated self-concepts selectively modified attention towards one's performance. There was an early-to-late double dissociation in RTs in that priming 'clever' resulted in an RT increase following error responses, whereas priming 'stupid' resulted in an RT increase following correct responses. We propose a computational model of subjects' behaviour based on the logic of the experimental task that involves two processes: memory for rules and the integration of rules with subsequent visual cues. The model also incorporates an adaptive decision threshold based on Bayes' rule, whereby decision thresholds are increased if integration was inferred to be faulty. Fitting the computational model to experimental data confirmed our hypothesis that priming affects the memory process. This model explains both the confirmation bias and double dissociation effects and demonstrates that Bayesian inferential principles can be used to study the effect of self-concepts on behaviour.
Bayesian inference of ice thickness from remote-sensing data
Werder, Mauro A.; Huss, Matthias
2017-04-01
Knowledge about ice thickness and volume is indispensable for studying ice dynamics, future sea-level rise due to glacier melt or their contribution to regional hydrology. Accurate measurements of glacier thickness require on-site work, usually employing radar techniques. However, these field measurements are time consuming, expensive and sometimes downright impossible. Conversely, measurements of the ice surface, namely elevation and flow velocity, are becoming available world-wide through remote sensing. The model of Farinotti et al. (2009) calculates ice thicknesses based on a mass conservation approach paired with shallow ice physics using estimates of the surface mass balance. The presented work applies a Bayesian inference approach to estimate the parameters of a modified version of this forward model by fitting it to both measurements of surface flow speed and of ice thickness. The inverse model outputs ice thickness as well as the distribution of the error. We fit the model to ten test glaciers and ice caps and quantify the improvements of thickness estimates through the usage of surface ice flow measurements.
Bayesian Inference for NASA Probabilistic Risk and Reliability Analysis
Dezfuli, Homayoon; Kelly, Dana; Smith, Curtis; Vedros, Kurt; Galyean, William
2009-01-01
This document, Bayesian Inference for NASA Probabilistic Risk and Reliability Analysis, is intended to provide guidelines for the collection and evaluation of risk and reliability-related data. It is aimed at scientists and engineers familiar with risk and reliability methods and provides a hands-on approach to the investigation and application of a variety of risk and reliability data assessment methods, tools, and techniques. This document provides both a broad perspective on data collection and evaluation issues and a narrow focus on the methods to implement a comprehensive information repository. The topics addressed herein cover the fundamentals of how data and information are to be used in risk and reliability analysis models and their potential role in decision making. Understanding these topics is essential to attaining the risk-informed decision making environment that is being sought by NASA requirements and procedures such as NPR 8000.4 (Agency Risk Management Procedural Requirements), NPR 8705.05 (Probabilistic Risk Assessment Procedures for NASA Programs and Projects), and the System Safety requirements of NPR 8715.3 (NASA General Safety Program Requirements).
Remaining useful tool life predictions in turning using Bayesian inference
Directory of Open Access Journals (Sweden)
Jaydeep M. Karandikar
2013-01-01
Full Text Available Tool wear is an important factor in determining machining productivity. In this paper, tool wear is characterized by remaining useful tool life in a turning operation and is predicted using spindle power and a random sample path method of Bayesian inference. Turning tests are performed at different speeds and feed rates using a carbide tool and MS309 steel work material. The spindle power and the tool flank wear are monitored during cutting; the root mean square of the time domain power is found to be sensitive to tool wear. Sample root mean square power growth curves are generated and the probability of each curve being the true growth curve is updated using Bayes’ rule. The updated probabilities are used to determine the remaining useful tool life. Results show good agreement between the predicted tool life and the empirically-determined true remaining life. The proposed method takes into account the uncertainty in tool life and the growth of the root mean square power at the end of tool life and is, therefore, robust and reliable.
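The sample-path updating scheme described above — assign each candidate power-growth curve a probability, update those probabilities with Bayes' rule as readings arrive, then take an expectation for remaining useful life — can be sketched as a discrete Bayes update. The curves, likelihoods, and remaining-life values below are made-up numbers for illustration only:

```python
def bayes_update(priors, likelihoods):
    """Discrete Bayes' rule over candidate wear-growth curves:
    posterior_i proportional to prior_i * likelihood_i."""
    unnorm = [p * l for p, l in zip(priors, likelihoods)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

# Three hypothetical sample RMS-power growth curves, equally likely a priori.
priors = [1 / 3, 1 / 3, 1 / 3]
# Likelihood of the latest RMS-power reading under each curve (hypothetical).
likelihoods = [0.10, 0.50, 0.40]
posteriors = bayes_update(priors, likelihoods)

# Each curve implies a remaining useful life (minutes, hypothetical);
# the prediction is the posterior expectation.
remaining_life = [12.0, 8.0, 5.0]
expected_rul = sum(p * r for p, r in zip(posteriors, remaining_life))
# expected_rul == 0.1*12 + 0.5*8 + 0.4*5 = 7.2
```

Repeating the update at each measurement concentrates the posterior on the curve closest to the true wear trajectory, which is how the method's remaining-life predictions tighten as the tool wears.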
Palacios, Julia A; Minin, Vladimir N
2013-03-01
Changes in population size influence genetic diversity of the population and, as a result, leave a signature of these changes in individual genomes in the population. We are interested in the inverse problem of reconstructing past population dynamics from genomic data. We start with a standard framework based on the coalescent, a stochastic process that generates genealogies connecting randomly sampled individuals from the population of interest. These genealogies serve as a glue between the population demographic history and genomic sequences. It turns out that only the times of genealogical lineage coalescences contain information about population size dynamics. Viewing these coalescent times as a point process, estimating population size trajectories is equivalent to estimating a conditional intensity of this point process. Therefore, our inverse problem is similar to estimating an inhomogeneous Poisson process intensity function. We demonstrate how recent advances in Gaussian process-based nonparametric inference for Poisson processes can be extended to Bayesian nonparametric estimation of population size dynamics under the coalescent. We compare our Gaussian process (GP) approach to one of the state-of-the-art Gaussian Markov random field (GMRF) methods for estimating population trajectories. Using simulated data, we demonstrate that our method has better accuracy and precision. Next, we analyze two genealogies reconstructed from real sequences of hepatitis C and human Influenza A viruses. In both cases, we recover more believed aspects of the viral demographic histories than the GMRF approach. We also find that our GP method produces more reasonable uncertainty estimates than the GMRF method.
Energy Technology Data Exchange (ETDEWEB)
George, J.S.; Schmidt, D.M.; Wood, C.C.
1999-02-01
We have developed a Bayesian approach to the analysis of neural electromagnetic (MEG/EEG) data that can incorporate or fuse information from other imaging modalities and addresses the ill-posed inverse problem by sampling the many different solutions which could have produced the given data. From these samples one can draw probabilistic inferences about regions of activation. Our source model assumes a variable number of variable-size cortical regions of stimulus-correlated activity. An active region consists of locations on the cortical surface, within a sphere centered on some location in cortex. The number and radii of active regions can vary up to defined maximum values. The goal of the analysis is to determine the posterior probability distribution for the set of parameters that govern the number, location, and extent of active regions. Markov Chain Monte Carlo is used to generate a large sample of sets of parameters distributed according to the posterior distribution. This sample is representative of the many different source distributions that could account for given data, and allows identification of probable (i.e. consistent) features across solutions. Examples of the use of this analysis technique with both simulated and empirical MEG data are presented.
Directory of Open Access Journals (Sweden)
Oliver Serang
Full Text Available Exact Bayesian inference can sometimes be performed efficiently for special cases where a function has commutative and associative symmetry of its inputs (called "causal independence"). For this reason, it is desirable to exploit such symmetry on big data sets. Here we present a method to exploit a general form of this symmetry on probabilistic adder nodes by transforming those probabilistic adder nodes into a probabilistic convolution tree with which dynamic programming computes exact probabilities. A substantial speedup is demonstrated using an illustrative example that can arise when identifying splice forms with bottom-up mass spectrometry-based proteomics. On this example, even state-of-the-art exact inference algorithms require a runtime more than exponential in the number of splice forms considered. By using the probabilistic convolution tree, we reduce the runtime to O(k log(k)^2) and the space to O(k log(k)), where k is the number of variables joined by an additive or cardinal operator. This approach, which can also be used with junction tree inference, is applicable to graphs with arbitrary dependency on counting variables or cardinalities and can be used on diverse problems and fields like forward error correcting codes, elemental decomposition, and spectral demixing. The approach also trivially generalizes to multiple dimensions.
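The primitive underneath the convolution tree described above is exact convolution of two probability mass functions: the PMF of a sum of independent discrete variables is the convolution of their PMFs, computable by a simple double loop. A minimal sketch:

```python
def convolve_pmf(p, q):
    """Exact PMF of X + Y for independent X ~ p, Y ~ q
    (p[i] = P(X = i), q[j] = P(Y = j))."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

# Number of heads in one fair coin flip: P(0) = P(1) = 0.5.
coin = [0.5, 0.5]
two_flips = convolve_pmf(coin, coin)  # [0.25, 0.5, 0.25]
```

The convolution tree applies this pairwise step in a balanced tree over the k inputs (and propagates messages back down), which is where the O(k log(k)^2) runtime quoted in the abstract comes from, versus the exponential blow-up of enumerating all input combinations.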
A Bayesian hierarchical nonhomogeneous hidden Markov model for multisite streamflow reconstructions
Bracken, C.; Rajagopalan, B.; Woodhouse, C.
2016-10-01
In many complex water supply systems, the next generation of water resources planning models will require simultaneous probabilistic streamflow inputs at multiple locations on an interconnected network. To make use of the valuable multicentury records provided by tree-ring data, reconstruction models must be able to produce appropriate multisite inputs. Existing streamflow reconstruction models typically focus on one site at a time, not addressing intersite dependencies and potentially misrepresenting uncertainty. To this end, we develop a model for multisite streamflow reconstruction with the ability to capture intersite correlations. The proposed model is a hierarchical Bayesian nonhomogeneous hidden Markov model (NHMM). A NHMM is fit to contemporary streamflow at each location using lognormal component distributions. Leading principal components of tree rings are used as covariates to model nonstationary transition probabilities and the parameters of the lognormal component distributions. Spatial dependence between sites is captured with a Gaussian elliptical copula. Parameters of the model are estimated in a fully Bayesian framework, in that marginal posterior distributions of all the parameters are obtained. The model is applied to reconstruct flows at 20 sites in the Upper Colorado River Basin (UCRB) from 1473 to 1906. Many previous reconstructions are available for this basin, making it ideal for testing this new method. The results show some improvements over regression-based methods in terms of validation statistics. Key advantages of the Bayesian NHMM over traditional approaches are a dynamic representation of uncertainty and the ability to make long multisite simulations that capture at-site statistics and spatial correlations between sites.
Trans-dimensional Bayesian inference for large sequential data sets
Mandolesi, E.; Dettmer, J.; Dosso, S. E.; Holland, C. W.
2015-12-01
This work develops a sequential Monte Carlo method to infer seismic parameters of layered seabeds from large sequential reflection-coefficient data sets. The approach provides parameter estimates and uncertainties along survey tracks with the goal to aid in the detection of unexploded ordnance in shallow water. The sequential data are acquired by a moving platform with source and receiver array towed close to the seabed. This geometry requires consideration of spherical reflection coefficients, computed efficiently by massively parallel implementation of the Sommerfeld integral via Levin integration on a graphics processing unit. The seabed is parametrized with a trans-dimensional model to account for changes in the environment (i.e. changes in layering) along the track. The method combines advanced Markov chain Monte Carlo methods (annealing) with particle filtering (resampling). Since data from closely-spaced source transmissions (pings) often sample similar environments, the solution from one ping can be utilized to efficiently estimate the posterior for data from subsequent pings. Since reflection-coefficient data are highly informative, the likelihood function can be extremely peaked, resulting in little overlap between posteriors of adjacent pings. This is addressed by adding bridging distributions (via annealed importance sampling) between pings for more efficient transitions. The approach assumes the environment to be changing slowly enough to justify the local 1D parametrization. However, bridging allows rapid changes between pings to be addressed and we demonstrate the method to be stable in such situations. Results are in terms of trans-D parameter estimates and uncertainties along the track. The algorithm is examined for realistic simulated data along a track and applied to a dataset collected by an autonomous underwater vehicle on the Malta Plateau, Mediterranean Sea. [Work supported by the SERDP, DoD.]
Mocapy++ - a toolkit for inference and learning in dynamic Bayesian networks
DEFF Research Database (Denmark)
Paluszewski, Martin; Hamelryck, Thomas Wim
2010-01-01
Background Mocapy++ is a toolkit for parameter learning and inference in dynamic Bayesian networks (DBNs). It supports a wide range of DBN architectures and probability distributions, including distributions from directional statistics (the statistics of angles, directions and orientations...
Bayesian Inference of Seismic Sources Using a 3-D Earth Model for the Japanese Islands Region
Simutė, Saulė; Fichtner, Andreas
2017-04-01
Earthquake source inversion is an established problem in seismology. Nevertheless, one-dimensional Earth models are commonly used to compute synthetic data in point- as well as finite-fault inversions. Reliance on simplified Earth models limits the exploitable information to longer periods and, as such, contributes to the notorious non-uniqueness of finite-fault models. Failure to properly account for Earth structure means that inaccuracies in the Earth model can map into and pollute the earthquake source solutions. To tackle these problems we construct a full-waveform 3-D Earth model for the Japanese Islands region and infer earthquake source parameters in a probabilistic way using numerically computed 3-D Green's functions. Our model explains data from earthquakes not used in the inversion significantly better than the initial model in the period range of 20-80 s. This indicates that the model is not over-fit and may thus be used for improved earthquake source inversion. To solve the forward problem, we pre-compute and store Green's functions with the spectral element solver SES3D for all potential source-receiver pairs. The exploitation of the Green's function database means that the forward problem of obtaining displacements is merely a linear combination of strain Green's tensors scaled by the moment tensor elements. We invert for ten model parameters: six moment tensor elements, three location parameters, and the time of the event. A feasible number of model parameters and the fast forward problem allow us to infer the unknowns using Bayesian Markov chain Monte Carlo, which results in marginal posterior distributions for every model parameter. The Monte Carlo algorithm is validated against analytical solutions for the linear test case. We perform the inversions using real data in the Japanese Islands region and assess the quality of the solutions by comparing the obtained results with those from the existing 1-D catalogues.
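The linearity exploited above — synthetic displacement as a linear combination of pre-computed Green's-function traces scaled by the moment-tensor elements — is what makes each forward evaluation cheap enough for MCMC. A minimal sketch with toy arrays standing in for the stored database (the traces and coefficients are hypothetical, not SES3D output):

```python
def synthetic_displacement(strain_green, moment):
    """Forward problem as a linear combination:
    u(t) = sum_k G_k(t) * m_k, with G_k pre-computed Green's-function traces
    and m_k the source (moment-tensor) coefficients."""
    nt = len(strain_green[0])
    return [sum(g[t] * m for g, m in zip(strain_green, moment)) for t in range(nt)]

# Toy database: two source parameters, three time samples (hypothetical values).
green = [[1.0, 0.0, 2.0],
         [0.0, 1.0, 1.0]]
moment = [2.0, 3.0]
u = synthetic_displacement(green, moment)  # [2.0, 3.0, 7.0]
```

Because each MCMC proposal only changes `moment` (and, via interpolation in the database, the location), no wave simulation is run inside the sampling loop; the expensive 3-D solves happen once, up front.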
UNSUPERVISED TRANSIENT LIGHT CURVE ANALYSIS VIA HIERARCHICAL BAYESIAN INFERENCE
Energy Technology Data Exchange (ETDEWEB)
Sanders, N. E.; Soderberg, A. M. [Harvard-Smithsonian Center for Astrophysics, 60 Garden Street, Cambridge, MA 02138 (United States); Betancourt, M., E-mail: nsanders@cfa.harvard.edu [Department of Statistics, University of Warwick, Coventry CV4 7AL (United Kingdom)
2015-02-10
Historically, light curve studies of supernovae (SNe) and other transient classes have focused on individual objects with copious and high signal-to-noise observations. In the nascent era of wide field transient searches, objects with detailed observations are decreasing as a fraction of the overall known SN population, and this strategy sacrifices the majority of the information contained in the data about the underlying population of transients. A population level modeling approach, simultaneously fitting all available observations of objects in a transient sub-class of interest, fully mines the data to infer the properties of the population and avoids certain systematic biases. We present a novel hierarchical Bayesian statistical model for population level modeling of transient light curves, and discuss its implementation using an efficient Hamiltonian Monte Carlo technique. As a test case, we apply this model to the Type IIP SN sample from the Pan-STARRS1 Medium Deep Survey, consisting of 18,837 photometric observations of 76 SNe, corresponding to a joint posterior distribution with 9176 parameters under our model. Our hierarchical model fits provide improved constraints on light curve parameters relevant to the physical properties of their progenitor stars relative to modeling individual light curves alone. Moreover, we directly evaluate the probability for occurrence rates of unseen light curve characteristics from the model hyperparameters, addressing observational biases in survey methodology. We view this modeling framework as an unsupervised machine learning technique with the ability to maximize scientific returns from data to be collected by future wide field transient searches like LSST.
Bayesian phylogenetic model selection using reversible jump Markov chain Monte Carlo.
Huelsenbeck, John P; Larget, Bret; Alfaro, Michael E
2004-06-01
A common problem in molecular phylogenetics is choosing a model of DNA substitution that does a good job of explaining the DNA sequence alignment without introducing superfluous parameters. A number of methods have been used to choose among a small set of candidate substitution models, such as the likelihood ratio test, the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), and Bayes factors. Current implementations of any of these criteria suffer from the limitation that only a small set of models are examined, or that the test does not allow easy comparison of non-nested models. In this article, we expand the pool of candidate substitution models to include all possible time-reversible models. This set includes seven models that have already been described. We show how Bayes factors can be calculated for these models using reversible jump Markov chain Monte Carlo, and apply the method to 16 DNA sequence alignments. For each data set, we compare the model with the best Bayes factor to the best models chosen using AIC and BIC. We find that the best model under any of these criteria is not necessarily the most complicated one; models with an intermediate number of substitution types typically do best. Moreover, almost all of the models that are chosen as best do not constrain a transition rate to be the same as a transversion rate, suggesting that it is the transition/transversion rate bias that plays the largest role in determining which models are selected. Importantly, the reversible jump Markov chain Monte Carlo algorithm described here allows estimation of phylogeny (and other phylogenetic model parameters) to be performed while accounting for uncertainty in the model of DNA substitution.
Directory of Open Access Journals (Sweden)
Carlos Alejandro De Luna Ortega
2006-01-01
Full Text Available This article addresses the design of a speech recognizer for Mexican Spanish as spoken in the state of Aguascalientes, for isolated words, speaker-dependent and with a small vocabulary, using Artificial Neural Networks (ANN), Dynamic Time Warping (DTW), and Hidden Markov Models (HMM) to implement the recognition algorithm.
Jin, Ick Hoon; Yuan, Ying; Bandyopadhyay, Dipankar
2016-01-01
Research in dental caries generates data with two levels of hierarchy: that of a tooth overall and that of the different surfaces of the tooth. The outcomes often exhibit spatial referencing among neighboring teeth and surfaces, i.e., the disease status of a tooth or surface might be influenced by the status of a set of proximal teeth/surfaces. Assessments of dental caries (tooth decay) at the tooth level yield binary outcomes indicating the presence/absence of teeth, and trinary outcomes at the surface level indicating healthy, decayed, or filled surfaces. The presence of these mixed discrete responses complicates the data analysis under a unified framework. To mitigate complications, we develop a Bayesian two-level hierarchical model under suitable (spatial) Markov random field assumptions that accommodates the natural hierarchy within the mixed responses. At the first level, we utilize an autologistic model to accommodate the spatial dependence for the tooth-level binary outcomes. For the second level and conditioned on a tooth being non-missing, we utilize a Potts model to accommodate the spatial referencing for the surface-level trinary outcomes. The regression models at both levels are controlled for plausible covariates (risk factors) of caries and remain connected through shared parameters. To tackle the computational challenges in our Bayesian estimation scheme caused by the doubly intractable normalizing constant, we employ a double Metropolis-Hastings sampler. We compare our model's performance to that of the standard non-spatial (naive) model using a small simulation study, and illustrate via an application to a clinical dataset on dental caries. PMID:27807470
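The autologistic first level can be summarized by its full conditional: the log-odds of a tooth being affected are a covariate score plus a spatial term proportional to the number of affected neighbours. A minimal sketch (function and parameter names are illustrative, not the paper's):

```python
import math

def autologistic_conditional(xb, n_affected, eta):
    """P(y_i = 1 | neighbours) under an autologistic model: a logistic
    regression score xb (covariate effects) plus a spatial term
    eta * (number of affected neighbouring teeth)."""
    z = xb + eta * n_affected
    return 1.0 / (1.0 + math.exp(-z))

# With positive spatial dependence (eta > 0), more affected neighbours
# raise the conditional probability of disease at the index tooth.
p0 = autologistic_conditional(-1.0, 0, 0.5)   # no affected neighbours
p3 = autologistic_conditional(-1.0, 3, 0.5)   # three affected neighbours
```

The doubly intractable constant arises because the joint distribution built from these conditionals has a normalizer that sums over all lattice configurations, which is what the double Metropolis-Hastings sampler avoids evaluating.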
Dorn, Caroline; Venturini, Julia; Khan, Amir; Heng, Kevin; Alibert, Yann; Helled, Ravit; Rivoldini, Attilio; Benz, Willy
2017-01-01
Aims: We aim to present a generalized Bayesian inference method for constraining interiors of super Earths and sub-Neptunes. Our methodology succeeds in quantifying the degeneracy and correlation of structural parameters for high dimensional parameter spaces. Specifically, we identify what constraints can be placed on composition and thickness of core, mantle, ice, ocean, and atmospheric layers given observations of mass, radius, and bulk refractory abundance constraints (Fe, Mg, Si) from observations of the host star's photospheric composition. Methods: We employed a full probabilistic Bayesian inference analysis that formally accounts for observational and model uncertainties. Using a Markov chain Monte Carlo technique, we computed joint and marginal posterior probability distributions for all structural parameters of interest. We included state-of-the-art structural models based on self-consistent thermodynamics of core, mantle, high-pressure ice, and liquid water. Furthermore, we tested and compared two different atmospheric models that are tailored for modeling thick and thin atmospheres, respectively. Results: First, we validate our method against Neptune. Second, we apply it to synthetic exoplanets of fixed mass and determine the effect on interior structure and composition when (1) radius; (2) atmospheric model; (3) data uncertainties; (4) semi-major axes; (5) atmospheric composition (i.e., a priori assumption of enriched envelopes versus pure H/He envelopes); and (6) prior distributions are varied. Conclusions: Our main conclusions are: (1) given available data, the range of possible interior structures is large; quantification of the degeneracy of possible interiors is therefore indispensable for meaningful planet characterization. (2) Our method predicts models that agree with independent estimates of Neptune's interior. (3) Increasing the precision in mass and radius leads to much improved constraints on ice mass fraction, size of rocky interior, but
Bayesian Inference Networks and Spreading Activation in Hypertext Systems.
Savoy, Jacques
1992-01-01
Describes a method based on Bayesian networks for searching hypertext systems. Discussion covers the use of Bayesian networks for structuring index terms and representing user information needs; use of link semantics based on constrained spreading activation to find starting points for browsing; and evaluation of a prototype system. (64…
DEFF Research Database (Denmark)
Møller, Jesper
.1 with the title ‘Inference'.) This contribution concerns statistical inference for parametric models used in stochastic geometry and based on quick and simple simulation free procedures as well as more comprehensive methods using Markov chain Monte Carlo (MCMC) simulations. Due to space limitations the focus...
Directory of Open Access Journals (Sweden)
Eils Roland
2006-06-01
Full Text Available Abstract Background The subcellular location of a protein is closely related to its function. It would be worthwhile to develop a method to predict the subcellular location for a given protein when only the amino acid sequence of the protein is known. Although many efforts have been made to predict subcellular location from sequence information only, there is a need for further research to improve the accuracy of prediction. Results A novel method called HensBC is introduced to predict protein subcellular location. HensBC is a recursive algorithm which constructs a hierarchical ensemble of classifiers. The classifiers used are Bayesian classifiers based on Markov chain models. We tested our method on six different datasets, among them a Gram-negative bacteria dataset, a dataset for discriminating outer membrane proteins, and an apoptosis proteins dataset. We observed that our method can predict the subcellular location with high accuracy. Another advantage of the proposed method is that it can improve the accuracy of the prediction of some classes with few sequences in training and is therefore useful for datasets with imbalanced distribution of classes. Conclusion This study introduces an algorithm which uses only the primary sequence of a protein to predict its subcellular location. The proposed recursive scheme represents an interesting methodology for learning and combining classifiers. The method is computationally efficient and competitive with previously reported approaches in terms of prediction accuracy, as empirical results indicate. The code for the software is available upon request.
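A Bayesian classifier built on Markov chain models, as in HensBC, scores a sequence under a per-class transition matrix and applies Bayes' rule. A minimal sketch with a toy two-letter alphabet standing in for amino acids (the smoothing scheme and example data are assumptions for illustration):

```python
import math
from collections import defaultdict

def train_markov(seqs, alphabet, alpha=1.0):
    # First-order Markov chain over symbols with add-alpha smoothing.
    counts = defaultdict(lambda: defaultdict(float))
    for s in seqs:
        for a, b in zip(s, s[1:]):
            counts[a][b] += 1.0
    probs = {}
    for a in alphabet:
        total = sum(counts[a].values()) + alpha * len(alphabet)
        probs[a] = {b: (counts[a][b] + alpha) / total for b in alphabet}
    return probs

def log_likelihood(seq, probs):
    return sum(math.log(probs[a][b]) for a, b in zip(seq, seq[1:]))

def classify(seq, models, priors):
    # Bayes classifier: pick the class maximising log prior + log likelihood.
    return max(models, key=lambda c: math.log(priors[c]) + log_likelihood(seq, models[c]))

alphabet = "AB"
models = {
    "alternating": train_markov(["ABABABABAB"] * 5, alphabet),
    "blocky":      train_markov(["AAAAABBBBB"] * 5, alphabet),
}
priors = {"alternating": 0.5, "blocky": 0.5}
label = classify("ABABAB", models, priors)
```

Real protein applications use the 20-letter amino acid alphabet and one trained chain per subcellular location class, but the scoring logic is the same.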
Bulashevska, Alla; Stein, Martin; Jackson, David; Eils, Roland
2009-12-01
Accurate computational methods that can help to predict the biological function of a protein from its sequence are of great interest to research biologists and pharmaceutical companies. One approach to inferring the function of a protein is to predict its interactions with other molecules. In this work, we propose a machine learning method that uses the primary sequence of a domain to predict its propensity for interaction with small molecules. By curating the Pfam database with respect to the small molecule binding ability of its component domains, we have constructed a dataset of small molecule binding and non-binding domains. This dataset was then used as a training set to learn a Bayesian classifier that distinguishes members of each class. The domain sequences of both classes are modelled with Markov chains. In a jack-knife test, our classification procedure achieved predictive accuracies of 77.2% and 66.7% for the binding and non-binding classes, respectively. We demonstrate the applicability of our classifier by using it to identify previously unknown small molecule binding domains. Our predictions are available as supplementary material and can provide very useful information to drug discovery specialists. Given the ubiquitous and essential role small molecules play in biological processes, our method is important for identifying pharmaceutically relevant components of complete proteomes. The software is available from the author upon request.
Holan, S.H.; Davis, G.M.; Wildhaber, M.L.; DeLonay, A.J.; Papoulias, D.M.
2009-01-01
The timing of spawning in fish is tightly linked to environmental factors; however, these factors are not very well understood for many species. Specifically, little information is available to guide recruitment efforts for endangered species such as the sturgeon. Therefore, we propose a Bayesian hierarchical model for predicting the success of spawning of the shovelnose sturgeon which uses both biological and behavioural (longitudinal) data. In particular, we use data that were produced from a tracking study that was conducted in the Lower Missouri River. The data that were produced from this study consist of biological variables associated with readiness to spawn along with longitudinal behavioural data collected by using telemetry and archival data storage tags. These high frequency data are complex both biologically and in the underlying behavioural process. To accommodate such complexity we developed a hierarchical linear regression model that uses an eigenvalue predictor, derived from the transition probability matrix of a two-state Markov switching model with generalized auto-regressive conditional heteroscedastic dynamics. Finally, to minimize the computational burden that is associated with estimation of this model, a parallel computing approach is proposed. © Journal compilation 2009 Royal Statistical Society.
Bayesian edge detector for SAR imagery using discontinuity-adaptive Markov random field modeling
Institute of Scientific and Technical Information of China (English)
Yuan Zhan; He You; Cai Fuqing
2013-01-01
Synthetic aperture radar (SAR) images are severely affected by multiplicative speckle noise, which greatly complicates edge detection. In this paper, by incorporating the discontinuity-adaptive Markov random field (DAMRF) and the maximum a posteriori (MAP) estimation criterion into edge detection, a Bayesian edge detector for SAR imagery is developed. In the proposed detector, the DAMRF is used as the a priori distribution of the local mean reflectivity, and a maximum a posteriori estimate of it is obtained by maximizing the posterior energy using a gradient-descent method. Four normalized ratios constructed in different directions are computed, based on which two edge strength maps (ESMs) are formed. The final edge detection result is achieved by fusing the results of the two thresholded ESMs. The experimental results with synthetic and real SAR images show that the proposed detector can efficiently detect edges in SAR images, and achieves better performance than two popular detectors in terms of Pratt's figure of merit and visual evaluation in most cases.
Laloy, Eric; Beerten, Koen; Vanacker, Veerle; Christl, Marcus; Rogiers, Bart; Wouters, Laurent
2017-07-01
The rate at which low-lying sandy areas in temperate regions, such as the Campine Plateau (NE Belgium), have been eroding during the Quaternary is a matter of debate. Current knowledge on the average pace of landscape evolution in the Campine area is largely based on geological inferences and modern analogies. We performed a Bayesian inversion of an in situ-produced 10Be concentration depth profile to infer the average long-term erosion rate together with two other parameters: the surface exposure age and the inherited 10Be concentration. Compared to the latest advances in probabilistic inversion of cosmogenic radionuclide (CRN) data, our approach has the following two innovative components: it (1) uses Markov chain Monte Carlo (MCMC) sampling and (2) accounts (under certain assumptions) for the contribution of model errors to posterior uncertainty. To investigate to what extent our approach differs from the state of the art in practice, a comparison against the Bayesian inversion method implemented in the CRONUScalc program is made. Both approaches identify similar maximum a posteriori (MAP) parameter values, but posterior parameter and predictive uncertainty derived using the method taken in CRONUScalc are moderately underestimated. A simple way for producing more consistent uncertainty estimates with the CRONUScalc-like method in the presence of model errors is therefore suggested. Our inferred erosion rate of 39 ± 8.9 mm kyr-1 (1σ) is relatively large in comparison with landforms that erode under comparable (paleo-)climates elsewhere in the world. We evaluate this value in the light of the erodibility of the substrate and sudden base level lowering during the Middle Pleistocene. A denser sampling scheme of a two-nuclide concentration depth profile would allow for better resolution of the inferred erosion rate and for including more uncertain parameters in the MCMC inversion.
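The inversion described above can be sketched with a deliberately simplified depth-profile model: surface production attenuated exponentially with depth, balanced at steady state by decay and erosion, plus an inherited component. All constants, the noise model, and the tuning below are nominal, illustrative values, not those of the study:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simplified steady-state 10Be depth-profile model (illustrative constants).
P0, lam = 5.0, 4.99e-7      # surface production rate (at g^-1 yr^-1), decay constant (yr^-1)
rho, Lam = 1.8, 160.0       # bulk density (g cm^-3), attenuation length (g cm^-2)

def model(z, eps, n_inh):
    # Concentration at depth z (cm) for erosion rate eps (cm/yr) plus inheritance.
    return n_inh + P0 * np.exp(-rho * z / Lam) / (lam + rho * eps / Lam)

z_obs = np.array([0.0, 50.0, 100.0, 150.0, 200.0])   # sample depths (cm)
eps_true, inh_true = 0.004, 2.0e4                    # ~40 mm/kyr; atoms/g
n_obs = model(z_obs, eps_true, inh_true) * (1 + 0.03 * rng.normal(size=z_obs.size))

def log_post(theta):
    eps, n_inh = np.exp(theta)                       # sample in log space
    resid = (n_obs - model(z_obs, eps, n_inh)) / (0.03 * n_obs)
    return -0.5 * np.sum(resid ** 2)

# Random-walk Metropolis on (log eps, log n_inh).
theta = np.log([0.003, 1.5e4])
lp = log_post(theta)
chain = []
for _ in range(8000):
    prop = theta + 0.05 * rng.normal(size=2)
    lp_prop = log_post(prop)
    if np.log(rng.uniform()) < lp_prop - lp:
        theta, lp = prop, lp_prop
    chain.append(theta.copy())

eps_mean = float(np.exp(np.mean([t[0] for t in chain[2000:]])))
```

The full study additionally samples the exposure age and inflates the likelihood covariance to propagate model error into the posterior, which this sketch omits.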
Bayesian Modelling of fMRI Time Series
DEFF Research Database (Denmark)
Højen-Sørensen, Pedro; Hansen, Lars Kai; Rasmussen, Carl Edward
2000-01-01
We present a Hidden Markov Model (HMM) for inferring the hidden psychological state (or neural activity) during single trial fMRI activation experiments with blocked task paradigms. Inference is based on Bayesian methodology, using a combination of analytical and a variety of Markov Chain Monte...
Bayesian approaches to spatial inference: Modelling and computational challenges and solutions
Moores, Matthew; Mengersen, Kerrie
2014-12-01
We discuss a range of Bayesian modelling approaches for spatial data and investigate some of the associated computational challenges. This paper commences with a brief review of Bayesian mixture models and Markov random fields, with enabling computational algorithms including Markov chain Monte Carlo (MCMC) and integrated nested Laplace approximation (INLA). Following this, we focus on the Potts model as a canonical approach, and discuss the challenge of estimating the inverse temperature parameter that controls the degree of spatial smoothing. We compare three approaches to addressing the doubly intractable nature of the likelihood, namely pseudo-likelihood, path sampling and the exchange algorithm. These techniques are applied to satellite data used to analyse water quality in the Great Barrier Reef.
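Of the three approaches compared, pseudo-likelihood is the simplest: the intractable likelihood is replaced by the product of each pixel's full conditional given its neighbours, which requires no normalizing constant over the whole field. A minimal sketch for a q-state Potts model on a torus (lattice size, sweep counts, and the grid search are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(2)
q, n = 3, 24                      # number of labels, lattice side (toy sizes)

def gibbs_sweep(z, beta):
    # One sweep of single-site Gibbs updates on a torus.
    for i in range(n):
        for j in range(n):
            cnt = np.zeros(q)
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                cnt[z[(i + di) % n, (j + dj) % n]] += 1
            p = np.exp(beta * cnt)
            z[i, j] = rng.choice(q, p=p / p.sum())

def log_pseudolikelihood(z, beta):
    # Product of full conditionals P(z_i | neighbours); torus boundary.
    cnt = np.zeros((q, n, n))
    for k in range(q):
        mk = (z == k).astype(float)
        cnt[k] = (np.roll(mk, 1, 0) + np.roll(mk, -1, 0) +
                  np.roll(mk, 1, 1) + np.roll(mk, -1, 1))
    rows, cols = np.indices(z.shape)
    own = cnt[z, rows, cols]                  # neighbours agreeing with z_i
    logZ = np.log(np.sum(np.exp(beta * cnt), axis=0))
    return np.sum(beta * own - logZ)

# Simulate a Potts field at a known inverse temperature ...
beta_true = 0.8
z = rng.integers(q, size=(n, n))
for _ in range(60):
    gibbs_sweep(z, beta_true)

# ... then recover it by maximising the pseudo-likelihood on a grid.
grid = np.linspace(0.0, 1.6, 33)
beta_hat = grid[np.argmax([log_pseudolikelihood(z, b) for b in grid])]
```

Path sampling and the exchange algorithm target the true likelihood instead, at the cost of auxiliary simulations of the Potts field at each candidate β.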
Beigy, Hamid; Ahmad, Ashar; Masoudi-Nejad, Ali; Fröhlich, Holger
2017-01-01
Inferring the structure of molecular networks from time series protein or gene expression data provides valuable information about the complex biological processes of the cell. Causal network structure inference has been approached using different methods in the past. Most causal network inference techniques, such as Dynamic Bayesian Networks and ordinary differential equations, are limited by their computational complexity and thus make large scale inference infeasible. This is specifically true if a Bayesian framework is applied in order to deal with the unavoidable uncertainty about the correct model. We devise a novel Bayesian network reverse engineering approach using ordinary differential equations with the ability to include non-linearity. Besides modeling arbitrary, possibly combinatorial and time dependent perturbations with unknown targets, one of our main contributions is the use of Expectation Propagation, an algorithm for approximate Bayesian inference over large scale network structures in short computation time. We further explore the possibility of integrating prior knowledge into network inference. We evaluate the proposed model on DREAM4 and DREAM8 data and find it competitive against several state-of-the-art existing network inference methods. PMID:28166542
Bayesian parameter inference and model selection by population annealing in systems biology.
Murakami, Yohei
2014-01-01
Parameter inference and model selection are very important for mathematical modeling in systems biology. Bayesian statistics can be used to conduct both parameter inference and model selection. In particular, the framework known as approximate Bayesian computation is often used for parameter inference and model selection in systems biology. However, Monte Carlo methods need to be used to compute Bayesian posterior distributions. In addition, the posterior distributions of parameters are sometimes almost uniform or very similar to their prior distributions. In such cases, it is difficult to choose one specific parameter value with high credibility as the representative value of the distribution. To overcome these problems, we introduced one of the population Monte Carlo algorithms, population annealing. Although population annealing is usually used in statistical mechanics, we showed that population annealing can be used to compute Bayesian posterior distributions in the approximate Bayesian computation framework. To deal with the non-identifiability of representative parameter values, we proposed to run the simulations with the parameter ensemble sampled from the posterior distribution, named the "posterior parameter ensemble". We showed that population annealing is an efficient and convenient algorithm for generating a posterior parameter ensemble. We also showed that simulations with the posterior parameter ensemble can not only reproduce the data used for parameter inference but also capture and predict data that were not used for parameter inference. Lastly, we introduced the marginal likelihood in the approximate Bayesian computation framework for Bayesian model selection. We showed that population annealing enables us to compute the marginal likelihood in the approximate Bayesian computation framework and to conduct model selection based on the Bayes factor.
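The posterior parameter ensemble idea can be illustrated with the most basic ABC scheme, rejection sampling (population annealing itself is more involved; the toy decay model, prior, and tolerance below are assumptions for illustration only):

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy dynamical model standing in for a systems-biology simulator:
# exponential decay x(t) = x0 * exp(-k t) with unknown rate k.
t = np.linspace(0.0, 5.0, 20)
x0, k_true = 10.0, 0.7
data = x0 * np.exp(-k_true * t) + 0.1 * rng.normal(size=t.size)

def simulate(k):
    return x0 * np.exp(-k * t)

# ABC rejection: draw from the prior, keep parameters whose simulated
# data fall within a tolerance of the observations.
prior_draws = rng.uniform(0.0, 2.0, size=20000)
dist = np.array([np.sqrt(np.mean((simulate(k) - data) ** 2)) for k in prior_draws])
ensemble = prior_draws[dist < 0.15]          # the "posterior parameter ensemble"

# Predict with the whole ensemble rather than a single representative value.
pred_band = np.array([simulate(k) for k in ensemble])
```

When the posterior is flat or multimodal, propagating the whole ensemble through the simulator, as in the last line, gives predictive bands instead of a single possibly misleading point prediction.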
Granum, E; Thomason, M G
1990-01-01
A structural pattern recognition approach to the analysis and classification of metaphase chromosome band patterns is presented. An operational method of representing band pattern profiles as sharp-edged idealized profiles is outlined. These profiles are nonlinearly scaled to a small but fixed number of "density" levels. Previous experience has shown that profiles of six levels are appropriate and that the differences between successive bands in these profiles are suitable for classification. String representations, which focus on the sequences of transitions between local band pattern levels, are derived from such "difference profiles." A method of syntactic analysis of the band transition sequences by dynamic programming for optimal (maximal probability) string-to-network alignments is described. It develops automatic data-driven inference of band pattern models (Markov networks) per class, and uses these models for classification. The method does not use centromere information, but assumes the p-q-orientation of the band pattern profiles to be known a priori. It is experimentally established that the method can build Markov network models, which, when used for classification, show a recognition rate of about 92% on test data. The experiments used 200 samples (chromosome profiles) for each of the 22 autosome chromosome types and are designed to also investigate various classifier design problems. It is found that the use of a priori knowledge of Denver Group assignment only improved classification by 1 or 2%. A scheme for typewise normalization of the class relationship measures proves useful, partly through improvements on average results and partly through a more evenly distributed error pattern. The choice of reference of the p-q-orientation of the band patterns is found to be unimportant, and timing results show that recent and efficient implementations can process one cell in less than 1 min on current standard
Energy Technology Data Exchange (ETDEWEB)
Zhang, Guannan [ORNL; Webster, Clayton G [ORNL; Gunzburger, Max D [ORNL
2012-09-01
Although Bayesian analysis has become vital to the quantification of prediction uncertainty in groundwater modeling, its application has been hindered due to the computational cost associated with numerous model executions needed for exploring the posterior probability density function (PPDF) of model parameters. This is particularly the case when the PPDF is estimated using Markov Chain Monte Carlo (MCMC) sampling. In this study, we develop a new approach that improves computational efficiency of Bayesian inference by constructing a surrogate system based on an adaptive sparse-grid high-order stochastic collocation (aSG-hSC) method. Unlike previous works using first-order hierarchical basis, we utilize a compactly supported higher-order hierarchical basis to construct the surrogate system, resulting in a significant reduction in the number of computational simulations required. In addition, we use hierarchical surplus as an error indicator to determine adaptive sparse grids. This allows local refinement in the uncertain domain and/or anisotropic detection with respect to the random model parameters, which further improves computational efficiency. Finally, we incorporate a global optimization technique and propose an iterative algorithm for building the surrogate system for the PPDF with multiple significant modes. Once the surrogate system is determined, the PPDF can be evaluated by sampling the surrogate system directly with very little computational cost. The developed method is evaluated first using a simple analytical density function with multiple modes and then using two synthetic groundwater reactive transport models. The groundwater models represent different levels of complexity; the first example involves coupled linear reactions and the second example simulates nonlinear uranium surface complexation. The results show that the aSG-hSC is an effective and efficient tool for Bayesian inference in groundwater modeling in comparison with conventional
Inferring Sequential Order of Somatic Mutations during Tumorgenesis based on Markov Chain Model.
Kang, Hao; Cho, Kwang-Hyun; Zhang, Xiaohua Douglas; Zeng, Tao; Chen, Luonan
2015-01-01
Tumors develop and worsen as mutations accumulate in DNA sequences during tumorigenesis. Identifying the temporal order of gene mutations in cancer initiation and development is a challenging topic. It not only provides new insight into the study of tumorigenesis at the level of genome sequences but also is an effective tool for early diagnosis of tumors and preventive medicine. In this paper, we develop a novel method to accurately estimate the sequential order of gene mutations during tumorigenesis from genome sequencing data based on a Markov chain model, called TOMC (Temporal Order based on Markov Chain), and also provide a new criterion to further infer the order of samples or patients, which can characterize the severity or stage of the disease. We applied our method to the analysis of tumors based on several high-throughput datasets. Specifically, first, we revealed that tumor suppressor genes (TSG) tend to be mutated ahead of oncogenes, which are considered as important events for key functional loss and gain during tumorigenesis. Second, comparisons of various methods demonstrated that our approach has clear advantages over the existing methods due to its consideration of mutation dependence among genes, such as co-mutation. Third and most important, our method is able to deduce the ordinal sequence of patients or samples to quantitatively characterize their severity of tumors. Therefore, our work provides a new way to quantitatively understand the development and progression of tumorigenesis based on high-throughput sequencing data.
Travel cost inference from sparse, spatio-temporally correlated time series using markov models
DEFF Research Database (Denmark)
Yang, B.; Guo, C.; Jensen, C.S.
2013-01-01
of such time series offers insight into the underlying system and enables prediction of system behavior. While the techniques presented in the paper apply more generally, we consider the case of transportation systems and aim to predict travel cost from GPS tracking data from probe vehicles. Specifically, each...... road segment has an associated travel-cost time series, which is derived from GPS data. We use spatio-temporal hidden Markov models (STHMM) to model correlations among different traffic time series. We provide algorithms that are able to learn the parameters of an STHMM while contending...... with the sparsity, spatio-temporal correlation, and heterogeneity of the time series. Using the resulting STHMM, near future travel costs in the transportation network, e.g., travel time or greenhouse gas emissions, can be inferred, enabling a variety of routing services, e.g., eco-routing. Empirical studies...
Jannson, Tomasz; Wang, Wenjian; Hodelin, Juan; Forrester, Thomas; Romanov, Volodymyr; Kostrzewski, Andrew
2016-05-01
In this paper, Bayesian Binary Sensing (BBS) is discussed as an effective tool for Bayesian Inference (BI) evaluation in interdisciplinary areas such as ISR (and C3I), Homeland Security, QC, medicine, defense, and many others. In particular, the Hilbertian Sine (HS), an absolute measure of BI, is introduced; it avoids the relativity of decision-threshold identification found in traditional measures of BI based on false positives and false negatives.
DEFF Research Database (Denmark)
Picchini, Umberto; Forman, Julie Lyng
2016-01-01
In recent years, dynamical modelling has been provided with a range of breakthrough methods to perform exact Bayesian inference. However, it is often computationally infeasible to apply exact statistical methodologies in the context of large data sets and complex models. This paper considers...... a nonlinear stochastic differential equation model observed with correlated measurement errors and an application to protein folding modelling. An approximate Bayesian computation (ABC)-MCMC algorithm is suggested to allow inference for model parameters within reasonable time constraints. The ABC algorithm...... applications. A simulation study is conducted to compare our strategy with exact Bayesian inference, the latter being two orders of magnitude slower than ABC-MCMC for the considered set-up. Finally, the ABC algorithm is applied to a large protein data set. The suggested methodology is fairly general......
Sparse Bayesian Inference and the Temperature Structure of the Solar Corona
Warren, Harry P; Crump, Nicholas A
2016-01-01
Measuring the temperature structure of the solar atmosphere is critical to understanding how it is heated to high temperatures. Unfortunately, the temperature of the upper atmosphere cannot be observed directly, but must be inferred from spectrally resolved observations of individual emission lines that span a wide range of temperatures. Such observations are "inverted" to determine the distribution of plasma temperatures along the line of sight. This inversion is ill-posed and, in the absence of regularization, tends to produce wildly oscillatory solutions. We introduce the application of sparse Bayesian inference to the problem of inferring the temperature structure of the solar corona. Within a Bayesian framework a preference for solutions that utilize a minimum number of basis functions can be encoded into the prior and many ad hoc assumptions can be avoided. We demonstrate the efficacy of the Bayesian approach by considering a test library of 40 assumed temperature distributions.
Sparse Bayesian Inference and the Temperature Structure of the Solar Corona
Warren, Harry P.; Byers, Jeff M.; Crump, Nicholas A.
2017-02-01
Measuring the temperature structure of the solar atmosphere is critical to understanding how it is heated to high temperatures. Unfortunately, the temperature of the upper atmosphere cannot be observed directly, but must be inferred from spectrally resolved observations of individual emission lines that span a wide range of temperatures. Such observations are “inverted” to determine the distribution of plasma temperatures along the line of sight. This inversion is ill posed and, in the absence of regularization, tends to produce wildly oscillatory solutions. We introduce the application of sparse Bayesian inference to the problem of inferring the temperature structure of the solar corona. Within a Bayesian framework a preference for solutions that utilize a minimum number of basis functions can be encoded into the prior and many ad hoc assumptions can be avoided. We demonstrate the efficacy of the Bayesian approach by considering a test library of 40 assumed temperature distributions.
Soliman, Ahmed A.; Al Sobhi, Mashail M.
2015-02-01
This article deals with the problem of estimating parameters of the Gompertz distribution (GD) based on progressive first-failure censored data using Bayesian and non-Bayesian approaches. The two-sample prediction problem is considered to derive Bayesian prediction bounds for both future order statistics and future record values based on progressive first-failure censored informative samples from the GD. Sampling schemes such as first-failure censoring, progressive type II censoring, type II censoring and the complete sample can be obtained as special cases of the progressive first-failure censoring scheme. A Markov chain Monte Carlo (MCMC) method with a Gibbs sampling procedure is used to compute the Bayes estimates and to construct the corresponding credible intervals of the parameters. A simulation study has been conducted to compare the proposed Bayes estimators with the maximum likelihood estimators (MLEs). Finally, some numerical computations with a real data set are presented to illustrate all the proposed inferential procedures.
Bayesian inference in dynamic domains using logical OR gates
CSIR Research Space (South Africa)
Claessens, R
2016-04-01
Full Text Available. de Waal, A., Marnewick, K., Cilliers, D., Houser, A. M., and Boast, L. (2010). Modelling cheetah relocation success in southern Africa using an Iterative Bayesian Network Development Cycle. Ecological Modelling, 221(4):641–651. Koen, H., de...
Nonparametric Bayesian inference for multidimensional compound Poisson processes
S. Gugushvili; F. van der Meulen; P. Spreij
2015-01-01
Given a sample from a discretely observed multidimensional compound Poisson process, we study the problem of nonparametric estimation of its jump size density r0 and intensity λ0. We take a nonparametric Bayesian approach to the problem and determine posterior contraction rates in this context, whic
Learning a Markov Logic network for supervised gene regulatory network inference.
Brouard, Céline; Vrain, Christel; Dubois, Julie; Castel, David; Debily, Marie-Anne; d'Alché-Buc, Florence
2013-09-12
Gene regulatory network inference remains a challenging problem in systems biology despite the numerous approaches that have been proposed. When substantial knowledge on a gene regulatory network is already available, supervised network inference is appropriate. Such a method builds a binary classifier able to assign a class (Regulation/No regulation) to an ordered pair of genes. Once learnt, the pairwise classifier can be used to predict new regulations. In this work, we explore the framework of Markov Logic Networks (MLN), which combine features of probabilistic graphical models with the expressivity of first-order logic rules. We propose to learn a Markov Logic Network, i.e., a set of weighted rules that conclude on the predicate "regulates", starting from a known gene regulatory network involved in the switch between proliferation and differentiation of keratinocyte cells, a set of experimental transcriptomic data and various descriptions of genes, all encoded into first-order logic. As the training data are unbalanced, we use asymmetric bagging to learn a set of MLNs. The prediction of a new regulation can then be obtained by averaging the predictions of individual MLNs. As a side contribution, we propose three in silico tests to assess the performance of any pairwise classifier in various network inference tasks on real datasets. The first test measures the average performance on a balanced edge-prediction problem; the second assesses the ability of the classifier, once enhanced by asymmetric bagging, to update a given network. Finally, our main result concerns a third test that measures the ability of the method to predict regulations with a new set of genes. As expected, MLN, when provided with only numerical discretized gene expression data, does not perform as well as a pairwise SVM in terms of AUPR. However, when a more complete description of gene properties is provided by heterogeneous sources, MLN achieves the same performance as a black-box model such as a
DEFF Research Database (Denmark)
Kristensen, Anders Ringgaard; Søllested, Thomas Algot
2004-01-01
Several replacement models have been presented in the literature. In other applicational areas like dairy cow replacement, various methodological improvements like hierarchical Markov processes and Bayesian updating have been implemented, but not in sow models. Furthermore, there are methodological improvements like multi-level hierarchical Markov processes with decisions on multiple time scales, efficient methods for parameter estimation at herd level and standard software that have hardly been implemented at all in any replacement model. The aim of this study is to present a sow replacement model that really uses all these methodological improvements. In this paper, the biological model describing the performance and feed intake of sows is presented. In particular, estimation of herd-specific parameters is emphasized. The optimization model is described in a subsequent paper.
Porter, Edward K
2014-01-01
With the advance in computational resources, Bayesian inference is increasingly becoming the standard tool of practice in GW astronomy. However, algorithms such as Markov Chain Monte Carlo (MCMC) require a large number of iterations to guarantee convergence to the target density. Each chain demands a large number of evaluations of the likelihood function and, in the case of a Hessian MCMC, calculations of the Fisher information matrix for use as a proposal distribution. As each iteration requires the generation of at least one gravitational waveform, we very quickly reach a point of exclusion for current Bayesian algorithms, especially for low-mass systems where the length of the waveforms is large and the waveform generation time is on the order of seconds. This suddenly demands a timescale of many weeks for a single MCMC. As each likelihood and Fisher information matrix calculation requires the evaluation of noise-weighted scalar products, we demonstrate that by using the linearity of integration, and the f...
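The cost argument is easy to make concrete: a textbook Metropolis-Hastings loop calls the likelihood once per iteration, so a slow model evaluation (seconds per waveform) multiplies directly into the total runtime. The Gaussian target below is a stand-in, not a GW likelihood:

```python
import math
import random

random.seed(42)

calls = 0

def log_likelihood(theta):
    """Stand-in for an expensive model evaluation (e.g. one
    waveform generation per call in the GW setting)."""
    global calls
    calls += 1
    return -0.5 * theta ** 2  # unit Gaussian target

n_iter = 20000
theta = 0.0
log_l = log_likelihood(theta)
samples = []
for _ in range(n_iter):
    prop = theta + random.gauss(0.0, 1.0)
    log_l_prop = log_likelihood(prop)
    if math.log(random.random()) < log_l_prop - log_l:
        theta, log_l = prop, log_l_prop
    samples.append(theta)

mean = sum(samples) / len(samples)
# One likelihood call per iteration (plus the initial one): this is
# why slow model evaluations dominate the total MCMC runtime.
print(calls, round(mean, 2))
```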
Declarative Modeling and Bayesian Inference of Dark Matter Halos
Kronberger, Gabriel
2013-01-01
Probabilistic programming allows specification of probabilistic models in a declarative manner. Recently, several new software systems and languages for probabilistic programming have been developed on the basis of newly developed and improved methods for approximate inference in probabilistic models. In this contribution a probabilistic model for an idealized dark matter localization problem is described. We first derive the probabilistic model for the inference of dark matter locations and masses, and then show how this model can be implemented using BUGS and Infer.NET, two software systems for probabilistic programming. Finally, the different capabilities of both systems are discussed. The presented dark matter model includes mainly non-conjugate factors, thus, it is difficult to implement this model with Infer.NET.
Clustered nested sampling: efficient Bayesian inference for cosmology
Shaw, R; Hobson, M P
2007-01-01
Bayesian model selection provides the cosmologist with an exacting tool to distinguish between competing models based purely on the data, via the Bayesian evidence. Previous methods to calculate this quantity either lacked general applicability or were computationally demanding. However, nested sampling (Skilling 2004), which was recently applied successfully to cosmology by Mukherjee et al. 2006, overcomes both of these impediments. Their implementation restricts the parameter space sampled, and thus improves the efficiency, using a decreasing ellipsoidal bound in the $n$-dimensional parameter space centred on the maximum likelihood point. However, if the likelihood function contains any multi-modality, then the ellipse is prevented from constraining the sampling region efficiently. In this paper we introduce a method of clustered ellipsoidal nested sampling which can form multiple ellipses around each individual peak in the likelihood. In addition we have implemented a method for determining the expectation...
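A toy version of nested sampling in one dimension shows the core bookkeeping (prior-volume shrinkage and evidence accumulation); the uniform prior, Gaussian likelihood, and point counts are illustrative, and the naive rejection replacement here is exactly the step that ellipsoidal bounding is designed to accelerate:

```python
import math
import random

random.seed(1)

LO, HI = -5.0, 5.0             # uniform prior support
def likelihood(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

N = 100                        # live points
live = [random.uniform(LO, HI) for _ in range(N)]
Z, X_prev = 0.0, 1.0
for i in range(1, 701):
    worst = min(live, key=likelihood)
    L_min = likelihood(worst)
    X = math.exp(-i / N)       # expected prior-volume shrinkage
    Z += L_min * (X_prev - X)
    X_prev = X
    # Replace the worst point by a prior draw with L > L_min.
    # (Real implementations use ellipsoidal bounds instead of this
    # naive rejection step, which stalls as the volume shrinks.)
    while True:
        cand = random.uniform(LO, HI)
        if likelihood(cand) > L_min:
            live.remove(worst)
            live.append(cand)
            break
Z += X_prev * sum(likelihood(p) for p in live) / N   # leftover mass
print(round(Z, 3))  # analytic evidence is 0.1
```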
A localization model to localize multiple sources using Bayesian inference
Dunham, Joshua Rolv
Accurate localization of a sound source in a room setting is important in both psychoacoustics and architectural acoustics. Binaural models have been proposed to explain how the brain processes and utilizes the interaural time differences (ITDs) and interaural level differences (ILDs) of sound waves arriving at the ears of a listener in determining source location. Recent work shows that applying Bayesian methods to this problem is proving fruitful. In this thesis, pink noise samples are convolved with head-related transfer functions (HRTFs) and compared to combinations of one and two anechoic speech signals convolved with different HRTFs or binaural room impulse responses (BRIRs) to simulate room positions. Through exhaustive calculation of Bayesian posterior probabilities and using a maximum likelihood approach, model selection will determine the number of sources present, and parameter estimation will result in the azimuthal direction of the source(s).
Operational modal analysis modeling, Bayesian inference, uncertainty laws
Au, Siu-Kui
2017-01-01
This book presents operational modal analysis (OMA), employing a coherent and comprehensive Bayesian framework for modal identification and covering stochastic modeling, theoretical formulations, computational algorithms, and practical applications. Mathematical similarities and philosophical differences between Bayesian and classical statistical approaches to system identification are discussed, allowing their mathematical tools to be shared and their results correctly interpreted. Many chapters can be used as lecture notes for the general topic they cover beyond the OMA context. After an introductory chapter (1), Chapters 2–7 present the general theory of stochastic modeling and analysis of ambient vibrations. Readers are first introduced to the spectral analysis of deterministic time series (2) and structural dynamics (3), which do not require the use of probability concepts. The concepts and techniques in these chapters are subsequently extended to a probabilistic context in Chapter 4 (on stochastic pro...
Sraj, Ihab
2016-08-26
The authors present a polynomial chaos (PC)-based Bayesian inference method for quantifying the uncertainties of the K-profile parameterization (KPP) within the MIT general circulation model (MITgcm) of the tropical Pacific. The inference of the uncertain parameters is based on a Markov chain Monte Carlo (MCMC) scheme that utilizes a newly formulated test statistic taking into account the different components representing the structures of turbulent mixing on both daily and seasonal time scales in addition to the data quality, and filters for the effects of parameter perturbations over those as a result of changes in the wind. To avoid the prohibitive computational cost of integrating the MITgcm model at each MCMC iteration, a surrogate model for the test statistic using the PC method is built. Because of the noise in the model predictions, a basis-pursuit-denoising (BPDN) compressed sensing approach is employed to determine the PC coefficients of a representative surrogate model. The PC surrogate is then used to evaluate the test statistic in the MCMC step for sampling the posterior of the uncertain parameters. Results of the posteriors indicate good agreement with the default values for two parameters of the KPP model, namely the critical bulk and gradient Richardson numbers; while the posteriors of the remaining parameters were barely informative. © 2016 American Meteorological Society.
Sraj, Ihab; Zedler, Sarah E.; Knio, Omar M.; Jackson, Charles S.; Hoteit, Ibrahim
2016-12-01
The authors present a Polynomial Chaos (PC)-based Bayesian inference method for quantifying the uncertainties of the K-Profile Parametrization (KPP) within the MIT General Circulation Model (MITgcm) of the tropical Pacific. The inference of the uncertain parameters is based on a Markov Chain Monte Carlo (MCMC) scheme that utilizes a newly formulated test statistic taking into account the different components representing the structures of turbulent mixing on both daily and seasonal timescales in addition to the data quality, and filters for the effects of parameter perturbations over those due to changes in the wind. To avoid the prohibitive computational cost of integrating the MITgcm model at each MCMC iteration, we build a surrogate model for the test statistic using the PC method. To filter out the noise in the model predictions and avoid related convergence issues, we resort to a Basis-Pursuit-DeNoising (BPDN) compressed sensing approach to determine the PC coefficients of a representative surrogate model. The PC surrogate is then used to evaluate the test statistic in the MCMC step for sampling the posterior of the uncertain parameters. Results of the posteriors indicate good agreement with the default values for two parameters of the KPP model, namely the critical bulk and gradient Richardson numbers; while the posteriors of the remaining parameters were barely informative.
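Non-intrusive PC surrogates of this kind project model outputs onto orthogonal polynomials of the random inputs. A toy projection with probabilists' Hermite polynomials (the quadratic "model" and its coefficients are invented; the study uses BPDN rather than Monte Carlo projection):

```python
import random

random.seed(0)

def model(xi):
    # Stand-in for an expensive simulation; quadratic in the
    # standard-normal input, so a degree-2 PC basis is exact.
    return 1.0 + 2.0 * xi + 0.5 * (xi * xi - 1.0)

# Probabilists' Hermite polynomials He_0..He_2, orthogonal under
# the standard normal weight with norms E[He_k^2] = k!.
hermite = [lambda x: 1.0, lambda x: x, lambda x: x * x - 1.0]
norms = [1.0, 1.0, 2.0]

# Monte Carlo projection: c_k = E[model(xi) * He_k(xi)] / E[He_k^2].
n = 200_000
draws = [random.gauss(0.0, 1.0) for _ in range(n)]
coeffs = [sum(model(x) * He(x) for x in draws) / (n * norm)
          for He, norm in zip(hermite, norms)]
print([round(c, 2) for c in coeffs])  # ≈ [1.0, 2.0, 0.5]
```

Once the coefficients are in hand, evaluating the cheap polynomial surrogate inside the MCMC loop replaces the expensive model integration.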
Energy Technology Data Exchange (ETDEWEB)
Kim, Joo Yeon; Lee, Seung Hyun; Park, Tai Jin [Korean Association for Radiation Application, Seoul (Korea, Republic of)
2016-06-15
Any real application of Bayesian inference must acknowledge that both the prior distribution and the likelihood function have only been specified as more or less convenient approximations to whatever the analyzer's true belief might be. If the inferences from the Bayesian analysis are to be trusted, it is important to determine that they are robust to such variations of prior and likelihood as might also be consistent with the analyzer's stated beliefs. Robust Bayesian inference was applied to atmospheric dispersion assessment using a Gaussian plume model. The scope of contamination was specified as uncertainty in distribution type and parametric variability. The probability distributions of the model parameters were assumed to be contaminated by symmetric unimodal and unimodal distributions. The distribution of the sector-averaged relative concentrations was then calculated by applying the contaminated priors to the model parameters. The sector-averaged concentrations for each stability class were compared by applying the symmetric unimodal and unimodal priors, respectively, as the contamination, based on the class of ε-contamination. Although ε was assumed to be 10%, the medians reflecting the symmetric unimodal priors were approximated within 10% of those reflecting the plausible priors. However, the medians reflecting the unimodal priors were approximated within 20% for a few downwind distances. The robustness question was answered by estimating how robust the results of the Bayesian inferences are to reasonable variations of the plausible priors. From these robust inferences, it is reasonable to apply symmetric unimodal priors when analyzing the robustness of the Bayesian inferences.
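The ε-contamination class can be explored numerically: mix the base prior with candidate contaminations at ε = 10% and track how far a posterior summary moves. The priors, likelihood, and grid below are illustrative, not the Gaussian plume setup of the study:

```python
import math

EPS = 0.10  # contamination fraction (10%, matching the study)

xs = [i * 0.01 for i in range(-300, 301)]   # parameter grid

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

base_prior = [normal_pdf(x, 0.0, 1.0) for x in xs]
# Two candidate contaminations from a symmetric, unimodal class.
contaminations = [
    [normal_pdf(x, 0.0, 0.5) for x in xs],
    [normal_pdf(x, 0.0, 2.0) for x in xs],
]
likelihood = [normal_pdf(1.0, x, 0.5) for x in xs]  # one observation y = 1.0

def posterior_mean(prior):
    post = [p * l for p, l in zip(prior, likelihood)]
    z = sum(post)
    return sum(x * p for x, p in zip(xs, post)) / z

means = []
for q in contaminations:
    mixed = [(1 - EPS) * p + EPS * c for p, c in zip(base_prior, q)]
    means.append(posterior_mean(mixed))

base_mean = posterior_mean(base_prior)
# A small spread over the contamination class means the inference
# is robust to the stated prior uncertainty.
spread = max(means + [base_mean]) - min(means + [base_mean])
print(round(base_mean, 3), round(spread, 3))
```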
Directory of Open Access Journals (Sweden)
Moritz eBoos
2016-05-01
Full Text Available Cognitive determinants of probabilistic inference were examined using hierarchical Bayesian modelling techniques. A classic urn-ball paradigm served as experimental strategy, involving a factorial two (prior probabilities) by two (likelihoods) design. Five computational models of cognitive processes were compared with the observed behaviour. Parameter-free Bayesian posterior probabilities and parameter-free base rate neglect provided inadequate models of probabilistic inference. The introduction of distorted subjective probabilities yielded more robust and generalizable results. A general class of (inverted) S-shaped probability weighting functions had been proposed; however, the possibility of large differences in probability distortions not only across experimental conditions, but also across individuals, seems critical for the model’s success. It also seems advantageous to consider individual differences in parameters of probability weighting as being sampled from weakly informative prior distributions of individual parameter values. Thus, the results from hierarchical Bayesian modelling converge with previous results in revealing that probability weighting parameters show considerable task dependency and individual differences. Methodologically, this work exemplifies the usefulness of hierarchical Bayesian modelling techniques for cognitive psychology. Theoretically, human probabilistic inference might be best described as the application of individualized strategic policies for Bayesian belief revision.
Boos, Moritz; Seer, Caroline; Lange, Florian; Kopp, Bruno
2016-01-01
Cognitive determinants of probabilistic inference were examined using hierarchical Bayesian modeling techniques. A classic urn-ball paradigm served as experimental strategy, involving a factorial two (prior probabilities) by two (likelihoods) design. Five computational models of cognitive processes were compared with the observed behavior. Parameter-free Bayesian posterior probabilities and parameter-free base rate neglect provided inadequate models of probabilistic inference. The introduction of distorted subjective probabilities yielded more robust and generalizable results. A general class of (inverted) S-shaped probability weighting functions had been proposed; however, the possibility of large differences in probability distortions not only across experimental conditions, but also across individuals, seems critical for the model's success. It also seems advantageous to consider individual differences in parameters of probability weighting as being sampled from weakly informative prior distributions of individual parameter values. Thus, the results from hierarchical Bayesian modeling converge with previous results in revealing that probability weighting parameters show considerable task dependency and individual differences. Methodologically, this work exemplifies the usefulness of hierarchical Bayesian modeling techniques for cognitive psychology. Theoretically, human probabilistic inference might be best described as the application of individualized strategic policies for Bayesian belief revision.
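Distorted subjective probabilities of the kind found here are often modeled with an inverted-S weighting function such as Prelec's w(p) = exp(-(-ln p)^γ); the urn probabilities and γ value below are illustrative:

```python
import math

def prelec(p, gamma=0.5):
    """Inverted-S probability weighting: overweights small p and
    underweights large p (gamma < 1 controls the distortion)."""
    return math.exp(-((-math.log(p)) ** gamma))

# Urn-ball setup: two urns with prior 0.7 / 0.3 and likelihoods of
# drawing a red ball 0.6 / 0.4 (values chosen for illustration).
priors = [0.7, 0.3]
likes = [0.6, 0.4]

def posterior(priors, likes, weight=lambda p: p):
    # Bayesian update, with priors and likelihoods passed through
    # the (possibly distorting) weighting function first.
    nums = [weight(p) * weight(l) for p, l in zip(priors, likes)]
    z = sum(nums)
    return [x / z for x in nums]

ideal = posterior(priors, likes)
distorted = posterior(priors, likes, prelec)
print(round(ideal[0], 3), round(distorted[0], 3))
```

With γ < 1 the distorted posterior is compressed toward 0.5 relative to the ideal Bayesian posterior, the qualitative pattern the hierarchical modeling captures via individual γ parameters.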
Perkins, Simon; Zwart, Jonathan; Natarajan, Iniyan; Smirnov, Oleg
2015-01-01
We present Montblanc, a GPU implementation of the Radio interferometer measurement equation (RIME) in support of the Bayesian inference for radio observations (BIRO) technique. BIRO uses Bayesian inference to select sky models that best match the visibilities observed by a radio interferometer. To accomplish this, BIRO evaluates the RIME multiple times, varying sky model parameters to produce multiple model visibilities. Chi-squared values computed from the model and observed visibilities are used as likelihood values to drive the Bayesian sampling process and select the best sky model. As most of the elements of the RIME and chi-squared calculation are independent of one another, they are highly amenable to parallel computation. Additionally, Montblanc caters for iterative RIME evaluation to produce multiple chi-squared values. Only modified model parameters are transferred to the GPU between each iteration. We implemented Montblanc as a Python package based upon NVIDIA's CUDA architecture. As such, it is ea...
Explaining Inference on a Population of Independent Agents Using Bayesian Networks
Sutovsky, Peter
2013-01-01
The main goal of this research is to design, implement, and evaluate a novel explanation method, the hierarchical explanation method (HEM), for explaining Bayesian network (BN) inference when the network is modeling a population of conditionally independent agents, each of which is modeled as a subnetwork. For example, consider disease-outbreak…
Bayesian Inference and Prediction in an M/G/1 with Optional Second Service
Mohammadi, A.; Salehi-Rad, M. R.
2012-01-01
In this article, we exploit the Bayesian inference and prediction for an M/G/1 queuing model with optional second re-service. In this model, a service unit attends customers arriving following a Poisson process and demanding service according to a general distribution and some of customers need to r
Bayesian inference of protein structure from chemical shift data
DEFF Research Database (Denmark)
Bratholm, Lars Andersen; Christensen, Anders Steen; Hamelryck, Thomas Wim
2015-01-01
Protein chemical shifts are routinely used to augment molecular mechanics force fields in protein structure simulations, with weights of the chemical shift restraints determined empirically. These weights, however, might not be an optimal descriptor of a given protein structure and predictive model...... content of the data. Here, we present the formulation of such a probability distribution where the error in chemical shift prediction is described by either a Gaussian or Cauchy distribution. The methodology is demonstrated and compared to a set of empirically weighted potentials through Markov chain...... the simulations suggest that sampling both the structure and the uncertainties in chemical shift prediction leads to more accurate structures compared to conventional methods using empirically determined weights. The Cauchy distribution, using either sampled uncertainties or predetermined weights, did, however......
Bayesian Inference of Empirical Coefficient for Foundation Settlement
Institute of Scientific and Technical Information of China (English)
LI Zhen-yu; WANG Yong-he; YANG Guo-lin
2009-01-01
A new approach based on Bayesian theory is proposed to determine the empirical coefficient in soil settlement calculation. The prior distribution is assumed to be uniform on [0.2, 1.4]. The posterior density function is developed by combining the prior distribution with the information from observed samples at four locations on a passenger dedicated line. The results show that the posterior distribution of the empirical coefficient obeys a Gaussian distribution. The mean value of the empirical coefficient decreases gradually as the load on the ground increases, while the variation of the variance shows no regularity.
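The reported behavior (uniform prior, approximately Gaussian posterior) is easy to reproduce with a grid-based update; the observed settlement ratios and noise level below are hypothetical:

```python
import math

# Uniform prior on the empirical coefficient, as in the study.
LO, HI = 0.2, 1.4
grid = [LO + i * (HI - LO) / 1000 for i in range(1001)]

# Hypothetical observed settlement ratios (measured / predicted)
# at a few monitoring points, with an assumed observation noise.
obs = [0.85, 0.92, 0.78, 0.88]
SIGMA = 0.1

def log_like(m):
    return sum(-0.5 * ((y - m) / SIGMA) ** 2 for y in obs)

# Posterior on the grid: flat prior, so posterior ∝ likelihood.
weights = [math.exp(log_like(m)) for m in grid]
z = sum(weights)
post = [w / z for w in weights]

mean = sum(m * p for m, p in zip(grid, post))
var = sum((m - mean) ** 2 * p for m, p in zip(grid, post))
print(round(mean, 3), round(math.sqrt(var), 3))
```

Because the Gaussian likelihood dominates a flat prior, the posterior is approximately Gaussian and centered on the sample mean, consistent with the reported finding.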
Progress on Bayesian Inference of the Fast Ion Distribution Function
DEFF Research Database (Denmark)
Stagner, L.; Heidbrink, W.W.; Chen, X.
2013-01-01
The fast-ion distribution function (DF) has a complicated dependence on several phase-space variables. The standard analysis procedure in energetic particle research is to compute the DF theoretically, use that DF in forward modeling to predict diagnostic signals, then compare with measured data...... sensitivity of the measurements is incorporated into Bayesian likelihood probabilities. Prior probabilities describe physical constraints. This poster will show reconstructions of classically described, low-power, MHD-quiescent distribution functions from actual FIDA measurements. A description of the full...
Bayesian inference analyses of the polygenic architecture of rheumatoid arthritis
Stahl, Eli A.; Wegmann, Daniel; Trynka, Gosia; Gutierrez-Achury, Javier; Do, Ron; Voight, Benjamin F.; Kraft, Peter; Chen, Robert; Kallberg, Henrik J.; Kurreeman, Fina A. S.; Kathiresan, Sekar; Wijmenga, Cisca; Gregersen, Peter K.; Alfredsson, Lars; Siminovitch, Katherine A.; Worthington, Jane; de Bakker, Paul I. W.; Raychaudhuri, Soumya; Plenge, Robert M.
2012-01-01
The genetic architectures of common, complex diseases are largely uncharacterized. We modeled the genetic architecture underlying genome-wide association study (GWAS) data for rheumatoid arthritis and developed a new method using polygenic risk-score analyses to infer the total liability-scale varia
Bayesian inference and model comparison for metallic fatigue data
Babuška, Ivo
2016-02-23
In this work, we present a statistical treatment of stress-life (S-N) data drawn from a collection of records of fatigue experiments that were performed on 75S-T6 aluminum alloys. Our main objective is to predict the fatigue life of materials by providing a systematic approach to model calibration, model selection and model ranking with reference to S-N data. To this purpose, we consider fatigue-limit models and random fatigue-limit models that are specially designed to allow the treatment of the run-outs (right-censored data). We first fit the models to the data by maximum likelihood methods and estimate the quantiles of the life distribution of the alloy specimen. To assess the robustness of the estimation of the quantile functions, we obtain bootstrap confidence bands by stratified resampling with respect to the cycle ratio. We then compare and rank the models by classical measures of fit based on information criteria. We also consider a Bayesian approach that provides, under the prior distribution of the model parameters selected by the user, their simulation-based posterior distributions. We implement and apply Bayesian model comparison methods, such as Bayes factor ranking and predictive information criteria based on cross-validation techniques under various a priori scenarios.
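Classical information-criterion ranking of the kind used here can be sketched with two nested models fitted by maximum likelihood; for Gaussian errors, BIC reduces to a function of the residual sum of squares. The data are synthetic, not the 75S-T6 records:

```python
import math
import random

random.seed(7)

# Synthetic "stress vs log-life" style data with a real linear trend.
n = 40
xs = [i / 10 for i in range(n)]
ys = [2.0 - 0.5 * x + random.gauss(0.0, 0.2) for x in xs]

def bic(rss, k):
    # Gaussian-error BIC up to an additive constant.
    return n * math.log(rss / n) + k * math.log(n)

# Model 1: constant mean (1 parameter).
mean_y = sum(ys) / n
rss1 = sum((y - mean_y) ** 2 for y in ys)

# Model 2: straight line fitted by least squares (2 parameters).
mean_x = sum(xs) / n
sxx = sum((x - mean_x) ** 2 for x in xs)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
intercept = mean_y - slope * mean_x
rss2 = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

ranking = sorted([("constant", bic(rss1, 1)), ("linear", bic(rss2, 2))],
                 key=lambda t: t[1])
print(ranking[0][0])  # lower BIC wins
```

A fully Bayesian comparison would integrate over the parameter priors to obtain evidences and Bayes factors, as the abstract describes; BIC is the large-sample shortcut.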
Directory of Open Access Journals (Sweden)
Giulia Carreras
2012-09-01
Full Text Available
Background: parameter uncertainty in the Markov model’s description of a disease course was addressed. Probabilistic sensitivity analysis (PSA) is now considered the only tool that properly permits the examination of parameter uncertainty. It consists of sampling values from the parameters’ probability distributions.
Methods: Markov models fitted with microsimulation were considered and methods for carrying out a PSA on transition probabilities were studied. Two Bayesian solutions were developed: for each row of the modeled transition matrix the prior distribution was assumed to be either a product of Beta distributions or a Dirichlet distribution. The two solutions differ in the source of information: several different sources for each transition in the Beta approach, and a single source for each transition from a given health state in the Dirichlet approach. The two methods were applied to a simple cervical cancer model.
Results: differences between posterior estimates from the two methods were negligible. The results showed that prior variability strongly influences the posterior distribution.
Conclusions: the novelty of this work is the Bayesian approach that integrates the two prior distributions with a product-of-Binomials likelihood. Such methods could also be applied to cohort data, and their application to more complex models could be useful and unique in the cervical cancer context, as well as in other disease modeling.
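The Dirichlet row prior makes the PSA sampling step concrete: each row of the transition matrix is drawn from a Dirichlet whose parameters are prior pseudo-counts plus observed transition counts. The three-state counts below are invented; Dirichlet draws are built from normalized Gamma variates:

```python
import random

random.seed(3)

# Hypothetical observed transition counts between three health states.
counts = [
    [80, 15, 5],
    [10, 70, 20],
    [0, 5, 95],
]

def sample_dirichlet(alpha):
    # A Dirichlet draw is a vector of Gamma(alpha_i, 1) variates
    # normalized to sum to one.
    gams = [random.gammavariate(a, 1.0) for a in alpha]
    s = sum(gams)
    return [g / s for g in gams]

def sample_transition_matrix(counts, prior=1.0):
    # Posterior for each row: Dirichlet(prior pseudo-count + counts).
    return [sample_dirichlet([prior + c for c in row]) for row in counts]

# One PSA run: propagate a cohort through 10 cycles under one draw;
# repeating this over many draws yields the output distribution.
P = sample_transition_matrix(counts)
state = [1.0, 0.0, 0.0]
for _ in range(10):
    state = [sum(state[i] * P[i][j] for i in range(3)) for j in range(3)]
print([round(s, 3) for s in state])
```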
Oware, E. K.
2015-12-01
Modeling aquifer heterogeneities (AH) is a complex, multidimensional problem that mostly requires stochastic imaging strategies for tractability. While the traditional Bayesian Markov chain Monte Carlo (McMC) framework provides a powerful way to model AH, generic McMC is computationally prohibitive and, thus, unappealing for large-scale problems. An innovative variant of the McMC scheme that imposes a priori spatial statistical constraints on model parameter updates, for improved characterization in a computationally efficient manner, is proposed. The proposed algorithm (PA) is based on Markov random field (MRF) modeling, an image processing technique that infers the global behavior of a random field from its local properties, making the MRF approach well suited for imaging AH. MRF-based modeling leverages the equivalence of the Gibbs (or Boltzmann) distribution (GD) and the MRF to identify the local properties of an MRF in terms of the easily quantifiable Gibbs energy. The PA employs a two-step approach to model the lithological structure of the aquifer and the hydraulic properties within the identified lithologies simultaneously. It performs local Gibbs energy minimizations along a random path, which requires the parameters of the GD (spatial statistics) to be specified. A PA that implicitly infers site-specific GD parameters within a Bayesian framework is also presented. The PA is illustrated with a synthetic binary-facies aquifer with lognormal heterogeneity simulated within each facies. GD parameters of 2.6, 1.2, -0.4, and -0.2 were estimated for the horizontal, vertical, NE-SW, and NW-SE directions, respectively. Most of the high hydraulic conductivity zones (facies 2) were fairly well resolved, with facies identification accuracy rates of 81%, 89%, and 90% for the inversions conditioned on concentration (R1), resistivity (R2), and joint data (R3), respectively. The incorporation of the conditioning datasets improved the root mean square error (RMSE
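The local Gibbs-energy minimization along a random path can be illustrated with an iterated-conditional-modes sweep on a binary facies grid; a single isotropic coupling is assumed here in place of the four directional GD parameters estimated in the study:

```python
import random

random.seed(5)

N = 20
BETA = 1.0  # isotropic coupling (the study estimates directional values)

# Random initial binary facies field.
grid = [[random.randint(0, 1) for _ in range(N)] for _ in range(N)]

def local_energy(g, i, j, value):
    # Gibbs energy of one cell: -BETA per agreeing 4-neighbor.
    e = 0.0
    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        ni, nj = i + di, j + dj
        if 0 <= ni < N and 0 <= nj < N:
            e -= BETA if g[ni][nj] == value else 0.0
    return e

def total_energy(g):
    # Each neighbor pair is counted twice in the cell sum, hence / 2.
    return sum(local_energy(g, i, j, g[i][j])
               for i in range(N) for j in range(N)) / 2.0

before = total_energy(grid)
# One sweep along a random path: set each cell to its locally
# energy-minimizing value, which can never increase the total energy.
path = [(i, j) for i in range(N) for j in range(N)]
random.shuffle(path)
for i, j in path:
    grid[i][j] = min((0, 1), key=lambda v: local_energy(grid, i, j, v))
after = total_energy(grid)
print(before > after)
```

The Bayesian PA replaces this deterministic step with probabilistic updates conditioned on data, but the random-path local-energy structure is the same.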
Bayesian Computation Methods for Inferring Regulatory Network Models Using Biomedical Data.
Tian, Tianhai
2016-01-01
The rapid advancement of high-throughput technologies provides huge amounts of information on gene expression and protein activity at the genome-wide scale. The availability of genomics, transcriptomics, proteomics, and metabolomics datasets gives an unprecedented opportunity to study the detailed molecular regulation that is very important to precision medicine. However, it is still a significant challenge to design effective and efficient methods to infer the network structure and dynamic properties of regulatory networks. In recent years a number of computational methods have been designed to explore the regulatory mechanisms as well as estimate unknown model parameters. Among them, the Bayesian inference method can combine both prior knowledge and experimental data to generate updated information regarding the regulatory mechanisms. This chapter gives a brief review of Bayesian statistical methods that are used to infer the network structure and estimate model parameters based on experimental data.
Learning an Astronomical Catalog of the Visible Universe through Scalable Bayesian Inference
Regier, Jeffrey; Giordano, Ryan; Thomas, Rollin; Schlegel, David; McAuliffe, Jon; Prabhat,
2016-01-01
Celeste is a procedure for inferring astronomical catalogs that attains state-of-the-art scientific results. To date, Celeste has been scaled to at most hundreds of megabytes of astronomical images: Bayesian posterior inference is notoriously demanding computationally. In this paper, we report on a scalable, parallel version of Celeste, suitable for learning catalogs from modern large-scale astronomical datasets. Our algorithmic innovations include a fast numerical optimization routine for Bayesian posterior inference and a statistically efficient scheme for decomposing astronomical optimization problems into subproblems. Our scalable implementation is written entirely in Julia, a new high-level dynamic programming language designed for scientific and numerical computing. We use Julia's high-level constructs for shared and distributed memory parallelism, and demonstrate effective load balancing and efficient scaling on up to 8192 Xeon cores on the NERSC Cori supercomputer.
Hierarchical Bayesian inference of galaxy redshift distributions from photometric surveys
Leistedt, Boris; Peiris, Hiranya V
2016-01-01
Accurately characterizing the redshift distributions of galaxies is essential for analysing deep photometric surveys and testing cosmological models. We present a technique to simultaneously infer redshift distributions and individual redshifts from photometric galaxy catalogues. Our model constructs a piecewise constant representation (effectively a histogram) of the distribution of galaxy types and redshifts, the parameters of which are efficiently inferred from noisy photometric flux measurements. This approach can be seen as a generalization of template-fitting photometric redshift methods and relies on a library of spectral templates to relate the photometric fluxes of individual galaxies to their redshifts. We illustrate this technique on simulated galaxy survey data, and demonstrate that it delivers correct posterior distributions on the underlying type and redshift distributions, as well as on the individual types and redshifts of galaxies. We show that even with uninformative priors, large photometri...
Practical Bayesian inference a primer for physical scientists
Bailer-Jones, Coryn A L
2017-01-01
Science is fundamentally about learning from data, and doing so in the presence of uncertainty. This volume is an introduction to the major concepts of probability and statistics, and the computational tools for analysing and interpreting data. It describes the Bayesian approach, and explains how this can be used to fit and compare models in a range of problems. Topics covered include regression, parameter estimation, model assessment, and Monte Carlo methods, as well as widely used classical methods such as regularization and hypothesis testing. The emphasis throughout is on the principles, the unifying probabilistic approach, and showing how the methods can be implemented in practice. R code (with explanations) is included and is available online, so readers can reproduce the plots and results for themselves. Aimed primarily at undergraduate and graduate students, these techniques can be applied to a wide range of data analysis problems beyond the scope of this work.
Shah, Abhik; Woolf, Peter
2009-06-01
In this paper, we introduce pebl, a Python library and application for learning Bayesian network structure from data and prior knowledge that provides features unmatched by alternative software packages: the ability to use interventional data, flexible specification of structural priors, modeling with hidden variables and exploitation of parallel processing.
Directory of Open Access Journals (Sweden)
A. A. Zolotin
2015-07-01
Full Text Available A posteriori inference is one of the three kinds of probabilistic-logic inference in the theory of probabilistic graphical models, and the basis for processing knowledge patterns with probabilistic uncertainty using Bayesian networks. The paper describes local a posteriori inference in algebraic Bayesian networks, a class of probabilistic graphical models, by means of matrix-vector equations. The latter are essentially based on the tensor (Kronecker) product of matrices, the Kronecker power and the Hadamard product. Matrix equations are obtained for calculating vectors of a posteriori probabilities within a posteriori inference in knowledge patterns over quanta propositions. Equations of the same type have already been discussed within the theory of algebraic Bayesian networks, but only for a posteriori inference in knowledge patterns over ideals of conjuncts. During the synthesis and development of the matrix-vector equations over quanta-proposition probability vectors, a number of earlier results concerning normalizing factors in a posteriori inference and the specification of a linear projection operator with a selector vector were adapted. We consider all three types of incoming evidence (deterministic, stochastic and inaccurate), combined with scalar and interval estimates of the probability of truth of propositional formulas in the knowledge patterns. Linear programming problems are formulated; their solution gives the desired interval values of posterior probabilities in the case of inaccurate evidence or interval estimates in a knowledge pattern. This description of a posteriori inference makes it possible to extend the set of knowledge-pattern types usable in local and global a posteriori inference, as well as to simplify complex software implementations by using existing third-party libraries that effectively support the representation and processing of matrices and vectors...
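The matrix-vector style of posterior computation described above can be illustrated with a toy two-proposition network; the numbers, and the use of NumPy's `kron` to build the selector vector, are purely illustrative and not taken from the paper:

```python
import numpy as np

# Two binary propositions A and B; all numbers are illustrative.
p_a = np.array([0.7, 0.3])                    # [P(A=0), P(A=1)]
p_b_given_a = np.array([[0.9, 0.1],           # row a: [P(B=0|A=a), P(B=1|A=a)]
                        [0.2, 0.8]])

# Joint probability vector over (A, B), entries ordered
# (A=0,B=0), (A=0,B=1), (A=1,B=0), (A=1,B=1).
joint = np.concatenate([p_a[a] * p_b_given_a[a] for a in (0, 1)])

# Deterministic evidence B=1: a selector vector, built with a Kronecker
# product, projects onto the consistent entries; renormalizing gives the
# posterior over A.
selector = np.kron(np.ones(2), np.array([0.0, 1.0]))   # [0, 1, 0, 1]
unnorm = joint * selector
posterior_a1 = unnorm[3] / unnorm.sum()       # P(A=1 | B=1)
print(round(posterior_a1, 4))                 # → 0.7742
```

Renormalizing after applying the selector corresponds to the normalizing-factor step the abstract mentions.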
Bayesian inference of protein structure from chemical shift data
Bratholm, Lars A.; Christensen, Anders S.; Hamelryck, Thomas
2015-01-01
Protein chemical shifts are routinely used to augment molecular mechanics force fields in protein structure simulations, with the weights of the chemical shift restraints determined empirically. These weights, however, might not be an optimal descriptor of a given protein structure and predictive model, and the bias they introduce might result in incorrect structures. In the inferential structure determination framework, both the unknown structure and the disagreement between experimental and back-calculated data are formulated as a joint probability distribution, thus utilizing the full information content of the data. Here, we present the formulation of such a probability distribution where the error in chemical shift prediction is described by either a Gaussian or a Cauchy distribution. The methodology is demonstrated and compared to a set of empirically weighted potentials through Markov chain Monte Carlo simulations of three small proteins (ENHD, Protein G and the SMN Tudor Domain) using the PROFASI force field and the chemical shift predictor CamShift. Using a clustering criterion for identifying the best structure, together with the addition of a solvent-exposure scoring term, the simulations suggest that sampling both the structure and the uncertainties in chemical shift prediction leads to more accurate structures than conventional methods using empirically determined weights. The Cauchy distribution, using either sampled uncertainties or predetermined weights, did, however, result in overall better convergence to the native fold, suggesting that both types of distribution might be useful in different aspects of protein structure prediction. PMID:25825683
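The Gaussian versus Cauchy error models the abstract compares can be sketched via their log-likelihoods; the residual values below are invented to show the qualitative effect of a single outlier:

```python
import math

def gauss_loglik(residuals, sigma):
    # Independent Gaussian errors with scale sigma
    return sum(-0.5 * (r / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))
               for r in residuals)

def cauchy_loglik(residuals, gamma):
    # Independent Cauchy errors with scale gamma: much heavier tails
    return sum(-math.log(math.pi * gamma * (1 + (r / gamma) ** 2))
               for r in residuals)

# Residuals between back-calculated and experimental shifts (ppm, invented);
# the last value is an outlier of the kind a predictor occasionally produces.
residuals = [0.1, -0.2, 0.15, 3.0]

# The outlier penalizes the Gaussian model far more than the Cauchy, which is
# why a Cauchy error model tolerates occasional large prediction errors.
print(gauss_loglik(residuals, 0.3) < cauchy_loglik(residuals, 0.3))  # → True
```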
Johnson, Eric D; Tubau, Elisabet
2016-09-27
Presenting natural frequencies facilitates Bayesian inferences relative to using percentages. Nevertheless, many people, including highly educated and skilled reasoners, still fail to provide Bayesian responses to these computationally simple problems. We show that the complexity of relational reasoning (e.g., the structural mapping between the presented and requested relations) can help explain the remaining difficulties. With a non-Bayesian inference that required identical arithmetic but afforded a more direct structural mapping, performance was universally high. Furthermore, reducing the relational demands of the task through questions that directed reasoners to use the presented statistics, as compared with questions that prompted the representation of a second, similar sample, also significantly improved reasoning. Distinct error patterns were also observed between these presented- and similar-sample scenarios, which suggested differences in relational-reasoning strategies. On the other hand, while higher numeracy was associated with better Bayesian reasoning, higher-numerate reasoners were not immune to the relational complexity of the task. Together, these findings validate the relational-reasoning view of Bayesian problem solving and highlight the importance of considering not only the presented task structure, but also the complexity of the structural alignment between the presented and requested relations.
Tail paradox, partial identifiability, and influential priors in Bayesian branch length inference.
Rannala, Bruce; Zhu, Tianqi; Yang, Ziheng
2012-01-01
Recent studies have observed that Bayesian analyses of sequence data sets using the program MrBayes sometimes generate extremely large branch lengths, with posterior credibility intervals for the tree length (sum of branch lengths) excluding the maximum likelihood estimates. Suggested explanations for this phenomenon include the existence of multiple local peaks in the posterior, lack of convergence of the chain in the tail of the posterior, mixing problems, and misspecified priors on branch lengths. Here, we analyze the behavior of Bayesian Markov chain Monte Carlo algorithms when the chain is in the tail of the posterior distribution and note that all these phenomena can occur. In Bayesian phylogenetics, the likelihood function approaches a constant instead of zero when the branch lengths increase to infinity. The flat tail of the likelihood can cause poor mixing and undue influence of the prior. We suggest that the main cause of the extreme branch length estimates produced in many Bayesian analyses is the poor choice of a default prior on branch lengths in current Bayesian phylogenetic programs. The default prior in MrBayes assigns independent and identical distributions to branch lengths, imposing strong (and unreasonable) assumptions about the tree length. The problem is exacerbated by the strong correlation between the branch lengths and parameters in models of variable rates among sites or among site partitions. To resolve the problem, we suggest two multivariate priors for the branch lengths (called compound Dirichlet priors) that are fairly diffuse and demonstrate their utility in the special case of branch length estimation on a star phylogeny. Our analysis highlights the need for careful thought in the specification of high-dimensional priors in Bayesian analyses.
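A compound Dirichlet prior of the kind proposed can be sketched as a two-stage draw (hyperparameters here are hypothetical; the paper's actual parameterization may differ): a diffuse Gamma prior on the tree length, then a symmetric Dirichlet split of that length among the branches, so the prior constrains the total rather than treating branch lengths as i.i.d.:

```python
import random

def sample_compound_dirichlet(n_branches, shape=1.0, rate=0.1, alpha=1.0, rng=random):
    # Stage 1: draw the tree length from a diffuse Gamma prior.
    tree_length = rng.gammavariate(shape, 1.0 / rate)
    # Stage 2: partition it with a symmetric Dirichlet (via normalized Gammas).
    parts = [rng.gammavariate(alpha, 1.0) for _ in range(n_branches)]
    total = sum(parts)
    return [tree_length * p / total for p in parts]

random.seed(0)
branches = sample_compound_dirichlet(5)
# The branch lengths sum to the sampled tree length by construction, unlike
# independent i.i.d. priors whose implied tree length grows with tree size.
print(len(branches), all(b >= 0 for b in branches))
```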
A gene frequency model for QTL mapping using Bayesian inference
Directory of Open Access Journals (Sweden)
Dekkers Jack CM
2010-06-01
Full Text Available Abstract Background Information for mapping quantitative trait loci (QTL) comes from two sources: linkage disequilibrium (LD, non-random association of allele states) and cosegregation (non-random association of allele origin). Information from LD can be captured by modeling conditional means and variances at the QTL given marker information. Similarly, information from cosegregation can be captured by modeling conditional covariances. Here, we consider a Bayesian model based on gene frequency (BGF) where both conditional means and variances are modeled as a function of the conditional gene frequencies at the QTL. The parameters in this model include these gene frequencies, the additive effect of the QTL, its location, and the residual variance. Bayesian methodology was used to estimate these parameters. The priors used were: logit-normal for gene frequencies, normal for the additive effect, uniform for location, and inverse chi-square for the residual variance. Computer simulation was used to compare the power to detect and the accuracy to map QTL by this method with those from a least squares analysis using a regression model (LSR). Results To simplify the analysis, data from unrelated individuals in a purebred population were simulated, where only LD information contributes to mapping the QTL. LD was simulated in a chromosomal segment of 1 cM with one QTL by random mating in a population of size 500 for 1000 generations and in a population of size 100 for 50 generations. The comparison was studied under a range of conditions, which included SNP density of 0.1, 0.05 or 0.02 cM, sample size of 500 or 1000, and phenotypic variance explained by the QTL of 2 or 5%. Both 1- and 2-SNP models were considered. Power to detect the QTL for BGF ranged from 0.4 to 0.99, close or equal to the power of the least squares regression (LSR). Precision in mapping the QTL position, quantified by the mean absolute error, ranged from 0.11 to 0.21 cM for BGF, and was better...
Protein NMR Structure Refinement based on Bayesian Inference
Ikeya, Teppei; Ikeda, Shiro; Kigawa, Takanori; Ito, Yutaka; Güntert, Peter
2016-03-01
Nuclear Magnetic Resonance (NMR) spectroscopy is a tool to investigate three-dimensional (3D) structures and dynamics of biomacromolecules at atomic resolution in solution or in more natural environments such as living cells. Since NMR data are in principle only spectra with peak signals, structural information must be properly deduced from the sparse experimental data, with their imperfections and uncertainty, and 3D conformations visualized by NMR structure calculation. In order to analyse the data efficiently, Rieping et al. proposed a new structure calculation method based on Bayes' theorem. We implemented a similar approach in the program CYANA with some modifications. It allows us to handle automatic NOE cross-peak assignments in unambiguous and ambiguous usages, and to create a prior distribution based on a physical force field with the generalized Born implicit water model. Sampling of the posterior is performed by a hybrid Monte Carlo algorithm combining Markov chain Monte Carlo (MCMC) via the Gibbs sampler with molecular dynamics simulation (MD) to obtain a canonical ensemble of conformations. Since it is not trivial to search the entire function space, particularly when exploring the conformational prior, due to the extraordinarily large conformation space of proteins, the replica exchange method is used, in which several MCMC calculations with different temperatures run in parallel as replicas. It is shown with simulated data and with randomly deleted experimental peaks that the new structure calculation method can provide accurate structures even with fewer peaks, especially compared with the conventional method. In particular, it dramatically improves in-cell structures of the proteins GB1 and TTHA1718 determined exclusively from information obtained in living Escherichia coli (E. coli) cells.
Bayesian inference and model comparison for metallic fatigue data
Babuska, Ivo
2016-01-06
In this work, we present a statistical treatment of stress-life (S-N) data drawn from a collection of records of fatigue experiments that were performed on 75S-T6 aluminum alloys. Our main objective is to predict the fatigue life of materials by providing a systematic approach to model calibration, model selection and model ranking with reference to S-N data. To this purpose, we consider fatigue-limit models and random fatigue-limit models that are specially designed to allow the treatment of the run-outs (right-censored data). We first fit the models to the data by maximum likelihood methods and estimate the quantiles of the life distribution of the alloy specimen. We then compare and rank the models by classical measures of fit based on information criteria. We also consider a Bayesian approach that provides, under the prior distribution of the model parameters selected by the user, their simulation-based posterior distributions.
Natural frequencies improve Bayesian reasoning in simple and complex inference tasks.
Hoffrage, Ulrich; Krauss, Stefan; Martignon, Laura; Gigerenzer, Gerd
2015-01-01
Representing statistical information in terms of natural frequencies rather than probabilities improves performance in Bayesian inference tasks. This beneficial effect of natural frequencies has been demonstrated in a variety of applied domains such as medicine, law, and education. Yet all the research and applications so far have been limited to situations where one dichotomous cue is used to infer which of two hypotheses is true. Real-life applications, however, often involve situations where cues (e.g., medical tests) have more than one value, where more than two hypotheses (e.g., diseases) are considered, or where more than one cue is available. In Study 1, we show that natural frequencies, compared to information stated in terms of probabilities, consistently increase the proportion of Bayesian inferences made by medical students in four conditions-three cue values, three hypotheses, two cues, or three cues-by an average of 37 percentage points. In Study 2, we show that teaching natural frequencies for simple tasks with one dichotomous cue and two hypotheses leads to a transfer of learning to complex tasks with three cue values and two cues, with a proportion of 40 and 81% correct inferences, respectively. Thus, natural frequencies facilitate Bayesian reasoning in a much broader class of situations than previously thought.
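The arithmetic advantage of natural frequencies can be made concrete with a toy diagnostic problem (illustrative counts, not taken from either study):

```python
# Natural-frequency framing: of 1000 people, 10 have the disease; 8 of those
# 10 test positive, and 95 of the 990 without the disease also test positive.
with_disease_pos = 8
without_disease_pos = 95

# The Bayesian posterior P(disease | positive) is just a ratio of counts:
posterior = with_disease_pos / (with_disease_pos + without_disease_pos)
print(round(posterior, 3))  # → 0.078

# The probability framing requires explicit application of Bayes' theorem,
# which is where untrained reasoners typically fail:
p_d, p_pos_d, p_pos_nd = 0.01, 0.8, 95 / 990
bayes = (p_pos_d * p_d) / (p_pos_d * p_d + p_pos_nd * (1 - p_d))
assert abs(posterior - bayes) < 1e-12  # both framings agree exactly
```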
Bayesian Modelling of fMRI Time Series
DEFF Research Database (Denmark)
Højen-Sørensen, Pedro; Hansen, Lars Kai; Rasmussen, Carl Edward
2000-01-01
We present a Hidden Markov Model (HMM) for inferring the hidden psychological state (or neural activity) during single trial fMRI activation experiments with blocked task paradigms. Inference is based on Bayesian methodology, using a combination of analytical and a variety of Markov Chain Monte Carlo (MCMC) sampling techniques. The advantage of this method is that detection of short time learning effects between repeated trials is possible since inference is based only on single trial experiments.
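The filtered-state computation behind such an HMM can be sketched with the standard forward algorithm; the two-state matrices below are illustrative stand-ins, not parameters from the fMRI study:

```python
import numpy as np

# A two-state HMM as a stand-in for "rest"/"task" psychological states.
init = np.array([0.9, 0.1])                 # initial state probabilities
trans = np.array([[0.95, 0.05],             # state transition matrix
                  [0.10, 0.90]])
# Emission likelihoods p(y_t | state) for four time points (one row per t).
emis = np.array([[0.8, 0.2],
                 [0.7, 0.3],
                 [0.1, 0.9],
                 [0.2, 0.8]])

# Forward algorithm with per-step normalization: filtered p(state_t | y_1..t).
alpha = init * emis[0]
alpha /= alpha.sum()
for e in emis[1:]:
    alpha = (alpha @ trans) * e
    alpha /= alpha.sum()

print(alpha.argmax())  # most probable current state after all observations
```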
Elsheikh, Ahmed H.
2014-02-01
An efficient Bayesian calibration method based on the nested sampling (NS) algorithm and non-intrusive polynomial chaos method is presented. Nested sampling is a Bayesian sampling algorithm that builds a discrete representation of the posterior distributions by iteratively re-focusing a set of samples to high likelihood regions. NS allows representing the posterior probability density function (PDF) with a smaller number of samples and reduces the curse of dimensionality effects. The main difficulty of the NS algorithm is in the constrained sampling step which is commonly performed using a random walk Markov Chain Monte-Carlo (MCMC) algorithm. In this work, we perform a two-stage sampling using a polynomial chaos response surface to filter out rejected samples in the Markov Chain Monte-Carlo method. The combined use of nested sampling and the two-stage MCMC based on approximate response surfaces provides significant computational gains in terms of the number of simulation runs. The proposed algorithm is applied for calibration and model selection of subsurface flow models. © 2013.
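The basic nested-sampling loop the abstract builds on can be sketched for a one-dimensional toy problem (uniform prior on [0, 1], Gaussian likelihood); the constrained sampling step here uses naive rejection, where the paper substitutes a two-stage, surrogate-filtered MCMC:

```python
import math, random

random.seed(1)

def loglik(x):
    # Toy Gaussian log-likelihood centered at 0.5 with width 0.1
    return -0.5 * ((x - 0.5) / 0.1) ** 2

n_live = 50
live = [random.random() for _ in range(n_live)]   # prior samples
log_terms = []
log_width = math.log(1 - math.exp(-1 / n_live))   # per-step shrinkage width
for it in range(300):
    worst = min(live, key=loglik)                 # lowest-likelihood live point
    threshold = loglik(worst)
    log_terms.append(threshold + log_width - it / n_live)
    # Constrained prior sampling above the threshold (rejection for simplicity;
    # this is the step the paper accelerates with two-stage MCMC).
    while True:
        cand = random.random()
        if loglik(cand) > threshold:
            live[live.index(worst)] = cand
            break

# Log-sum-exp accumulation of the evidence Z = sum_i L_i * w_i
m = max(log_terms)
log_z = m + math.log(sum(math.exp(t - m) for t in log_terms))
print(round(math.exp(log_z), 2))  # analytic value is ≈ 0.1 * sqrt(2π) ≈ 0.25
```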
Large-Scale Optimization for Bayesian Inference in Complex Systems
Energy Technology Data Exchange (ETDEWEB)
Willcox, Karen [MIT]; Marzouk, Youssef [MIT]
2013-11-12
The SAGUARO (Scalable Algorithms for Groundwater Uncertainty Analysis and Robust Optimization) Project focused on the development of scalable numerical algorithms for large-scale Bayesian inversion in complex systems that capitalize on advances in large-scale simulation-based optimization and inversion methods. The project was a collaborative effort among MIT, the University of Texas at Austin, Georgia Institute of Technology, and Sandia National Laboratories. The research was directed in three complementary areas: efficient approximations of the Hessian operator, reductions in complexity of forward simulations via stochastic spectral approximations and model reduction, and employing large-scale optimization concepts to accelerate sampling. The MIT-Sandia component of the SAGUARO Project addressed the intractability of conventional sampling methods for large-scale statistical inverse problems by devising reduced-order models that are faithful to the full-order model over a wide range of parameter values; sampling then employs the reduced model rather than the full model, resulting in very large computational savings. Results indicate little effect on the computed posterior distribution. On the other hand, in the Texas-Georgia Tech component of the project, we retain the full-order model, but exploit inverse problem structure (adjoint-based gradients and partial Hessian information of the parameter-to-observation map) to implicitly extract lower dimensional information on the posterior distribution; this greatly speeds up sampling methods, so that fewer sampling points are needed. We can think of these two approaches as "reduce then sample" and "sample then reduce." In fact, these two approaches are complementary, and can be used in conjunction with each other. Moreover, they both exploit deterministic inverse problem structure, in the form of adjoint-based gradient and Hessian information of the underlying parameter-to-observation map, to
Directory of Open Access Journals (Sweden)
Ali Ahmed
2011-01-01
Full Text Available Problem statement: Similarity-based Virtual Screening (VS) deals with a large amount of data containing irrelevant and/or redundant fragments or features. The recent use of Bayesian networks as an alternative to existing tools for similarity-based VS has received noticeable attention from researchers in the field of chemoinformatics. Approach: To this end, different models of Bayesian network have been developed. In this study, we enhance the Bayesian Inference Network (BIN) using a subset of selected molecular features. Results: In this approach, a few features were filtered from the molecular fingerprint features based on a feature selection approach. Conclusion: Simulated virtual screening experiments with MDL Drug Data Report (MDDR) data sets showed that the proposed method provides simple ways of enhancing the cost effectiveness of ligand-based virtual screening searches, especially for higher-diversity data sets.
Wavelet-Bayesian inference of cosmic strings embedded in the cosmic microwave background
McEwen, J D; Peiris, H V; Wiaux, Y; Ringeval, C; Bouchet, F R
2016-01-01
Cosmic strings are a well-motivated extension to the standard cosmological model and could induce a subdominant component in the anisotropies of the cosmic microwave background (CMB), in addition to the standard inflationary component. The detection of strings, while observationally challenging, would provide a direct probe of physics at very high energy scales. We develop a new framework for cosmic string inference, constructing a Bayesian analysis in wavelet space where the string-induced CMB component has distinct statistical properties to the standard inflationary component. Our wavelet-Bayesian framework provides a principled approach to compute the posterior distribution of the string tension $G\\mu$ and the Bayesian evidence ratio comparing the string model to the standard inflationary model. Furthermore, we present a technique to recover an estimate of any string-induced CMB map embedded in observational data. Using Planck-like simulations we demonstrate the application of our framework and evaluate it...
Ball, William T; Egerton, Jack S; Haigh, Joanna D
2014-01-01
We investigate the relationship between spectral solar irradiance (SSI) and ozone in the tropical upper stratosphere. We find that solar cycle (SC) changes in ozone can be well approximated by considering the ozone response to SSI changes in a small number of individual wavelength bands between 176 and 310 nm, operating independently of each other. Additionally, we find that the ozone varies approximately linearly with changes in the SSI. Using these facts, we present a Bayesian formalism for inferring SC SSI changes and uncertainties from measured SC ozone profiles. Bayesian inference is a powerful, mathematically self-consistent method of considering both the uncertainties of the data and additional external information to provide the best estimate of the parameters. Using this method, we show that, given measurement uncertainties in both ozone and SSI datasets, it is not currently possible to distinguish between observed or modelled SSI datasets using available estimates of ozone change profiles, ...
BAYESIAN INFERENCE OF HIDDEN GAMMA WEAR PROCESS MODEL FOR SURVIVAL DATA WITH TIES.
Sinha, Arijit; Chi, Zhiyi; Chen, Ming-Hui
2015-10-01
Survival data often contain tied event times. Inference without careful treatment of the ties can lead to biased estimates. This paper develops the Bayesian analysis of a stochastic wear process model to fit survival data that might have a large number of ties. Under a general wear process model, we derive the likelihood of parameters. When the wear process is a Gamma process, the likelihood has a semi-closed form that allows posterior sampling to be carried out for the parameters, hence achieving model selection using Bayesian deviance information criterion. An innovative simulation algorithm via direct forward sampling and Gibbs sampling is developed to sample event times that may have ties in the presence of arbitrary covariates; this provides a tool to assess the precision of inference. An extensive simulation study is reported and a data set is used to further illustrate the proposed methodology.
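A minimal simulation of a Gamma wear process, and of how recording on a discrete grid produces tied event times, might look as follows (threshold and rates are hypothetical):

```python
import random

random.seed(42)

def gamma_wear_failure_time(rate, scale, threshold, dt=1.0, t_max=100.0):
    # Wear accumulates by independent Gamma increments on a discrete time grid;
    # failure is recorded at the first grid time where wear crosses the
    # threshold. Recording on a grid is what produces tied event times.
    wear, t = 0.0, 0.0
    while t < t_max:
        t += dt
        wear += random.gammavariate(rate * dt, scale)
        if wear >= threshold:
            return t
    return t_max  # right-censored at t_max

times = [gamma_wear_failure_time(0.5, 1.0, 10.0) for _ in range(200)]
ties = len(times) - len(set(times))
print(ties > 0)  # → True: 200 events on ≤100 grid times must collide
```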
Directory of Open Access Journals (Sweden)
Yue Fan
2017-01-01
Full Text Available Gene regulatory networks (GRNs) play an important role in cellular systems and are important for understanding biological processes. Many algorithms have been developed to infer the GRNs. However, most algorithms only pay attention to the gene expression data but do not consider the topology information in their inference process, while incorporating this information can partially compensate for the lack of reliable expression data. Here we develop a Bayesian group lasso with spike and slab priors to perform gene selection and estimation for nonparametric models. B-spline basis functions are used to capture the nonlinear relationships flexibly and penalties are used to avoid overfitting. Further, we incorporate the topology information into the Bayesian method as a prior. We present the application of our method on DREAM3 and DREAM4 datasets and two real biological datasets. The results show that our method performs better than existing methods and the topology information prior can improve the result. PMID:28133490
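A much-simplified analogue of spike-and-slab selection, for a single coefficient in a conjugate linear model rather than the paper's group-lasso setting, can be written in closed form:

```python
import math

# Spike-and-slab prior for one coefficient, conjugate normal case:
# beta ~ (1-p)*delta_0 + p*N(0, tau2), y_i = beta*x_i + N(0, sigma2).
# The posterior inclusion probability follows from the marginal likelihoods.
def inclusion_probability(x, y, p=0.5, tau2=1.0, sigma2=1.0):
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    # Log marginal-likelihood ratio of slab (beta free) vs spike (beta = 0),
    # via the matrix determinant lemma and Sherman-Morrison.
    log_ratio = (0.5 * sxy ** 2 * tau2 / (sigma2 * (sigma2 + tau2 * sxx))
                 - 0.5 * math.log(1 + tau2 * sxx / sigma2))
    odds = (p / (1 - p)) * math.exp(log_ratio)
    return odds / (1 + odds)

x = [0.5, -1.0, 1.5, 2.0, -0.5]
signal = [2.0 * xi for xi in x]              # strong linear effect
noise_only = [0.1, -0.05, 0.02, 0.0, 0.08]   # no real effect

# A real regulator is confidently included; a spurious one is shrunk out.
print(inclusion_probability(x, signal) > 0.9,
      inclusion_probability(x, noise_only) < 0.5)  # → True True
```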
Prudhomme, Serge
2015-09-17
Parameter estimation for complex models using Bayesian inference is usually a very costly process as it requires a large number of solves of the forward problem. We show here how the construction of adaptive surrogate models using a posteriori error estimates for quantities of interest can significantly reduce the computational cost in problems of statistical inference. As surrogate models provide only approximations of the true solutions of the forward problem, it is nevertheless necessary to control these errors in order to construct an accurate reduced model with respect to the observables utilized in the identification of the model parameters. Effectiveness of the proposed approach is demonstrated on a numerical example dealing with the Spalart–Allmaras model for the simulation of turbulent channel flows. In particular, we illustrate how Bayesian model selection using the adapted surrogate model in place of solving the coupled nonlinear equations leads to the same quality of results while requiring fewer nonlinear PDE solves.
Bayesian Inference using Neural Net Likelihood Models for Protein Secondary Structure Prediction
Directory of Open Access Journals (Sweden)
Seong-Gon Kim
2011-06-01
Full Text Available Several techniques such as Neural Networks, Genetic Algorithms, Decision Trees and other statistical or heuristic methods have been used in the past to approach the complex non-linear task of predicting the Alpha-helices, Beta-sheets and Turns of a protein's secondary structure. This project introduces a new machine learning method that uses offline-trained Multilayered Perceptrons (MLPs) as the likelihood models within a Bayesian Inference framework to predict the secondary structure of proteins. Varying window sizes are used to extract neighboring amino acid information, which is passed back and forth between the Neural Net models and the Bayesian Inference process until the posterior secondary structure probabilities converge.
Lo, Benjamin W Y; Macdonald, R Loch; Baker, Andrew; Levine, Mitchell A H
2013-01-01
The novel clinical prediction approach of Bayesian neural networks with fuzzy logic inferences is created and applied to derive prognostic decision rules in cerebral aneurysmal subarachnoid hemorrhage (aSAH). The approach was applied to data from five trials of Tirilazad for aneurysmal subarachnoid hemorrhage (3551 patients). Bayesian meta-analyses of observational studies on aSAH prognostic factors gave generalizable posterior distributions of population mean log odds ratios (ORs). Similar trends were noted in Bayesian and linear regression ORs. Significant outcome predictors include normal motor response, cerebral infarction, history of myocardial infarction, cerebral edema, history of diabetes mellitus, fever on day 8, prior subarachnoid hemorrhage, admission angiographic vasospasm, neurological grade, intraventricular hemorrhage, ruptured aneurysm size, history of hypertension, vasospasm day, age and mean arterial pressure. Heteroscedasticity was present in the nontransformed dataset. Artificial neural networks found nonlinear relationships with 11 hidden variables in 1 layer, using the multilayer perceptron model. Fuzzy logic decision rules (centroid defuzzification technique) denoted cut-off points for poor prognosis at greater than 2.5 clusters. This aSAH prognostic system makes use of existing knowledge, recognizes unknown areas, incorporates one's clinical reasoning, and compensates for uncertainty in prognostication.
Spatial attention, precision, and Bayesian inference: a study of saccadic response speed.
Vossel, Simone; Mathys, Christoph; Daunizeau, Jean; Bauer, Markus; Driver, Jon; Friston, Karl J; Stephan, Klaas E
2014-06-01
Inferring the environment's statistical structure and adapting behavior accordingly is a fundamental modus operandi of the brain. A simple form of this faculty based on spatial attentional orienting can be studied with Posner's location-cueing paradigm in which a cue indicates the target location with a known probability. The present study focuses on a more complex version of this task, where probabilistic context (percentage of cue validity) changes unpredictably over time, thereby creating a volatile environment. Saccadic response speed (RS) was recorded in 15 subjects and used to estimate subject-specific parameters of a Bayesian learning scheme modeling the subjects' trial-by-trial updates of beliefs. Different response models-specifying how computational states translate into observable behavior-were compared using Bayesian model selection. Saccadic RS was most plausibly explained as a function of the precision of the belief about the causes of sensory input. This finding is in accordance with current Bayesian theories of brain function, and specifically with the proposal that spatial attention is mediated by a precision-dependent gain modulation of sensory input. Our results provide empirical support for precision-dependent changes in beliefs about saccade target locations and motivate future neuroimaging and neuropharmacological studies of how Bayesian inference may determine spatial attention.
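A deliberately simplified stand-in for the trial-by-trial belief updating described above is a Beta-Bernoulli learner whose belief precision grows as cue validity is learned (the study itself uses a hierarchical Bayesian scheme for volatile environments, not this model):

```python
# Beta-Bernoulli learner tracking cue validity trial by trial; "precision"
# here is the inverse variance of the Beta belief about validity.
def run_trials(outcomes, a=1.0, b=1.0):
    precisions = []
    for valid in outcomes:          # valid is 1 (cue correct) or 0 (invalid)
        a, b = a + valid, b + (1 - valid)
        var = a * b / ((a + b) ** 2 * (a + b + 1))   # Beta variance
        precisions.append(1.0 / var)
    return a, b, precisions

# Mostly valid cues: the belief about validity sharpens, so precision grows;
# under a precision-dependent response model, responses to cued targets
# would speed up accordingly.
a, b, prec = run_trials([1, 1, 1, 0, 1, 1, 1, 1])
print(a / (a + b) > 0.5, prec[-1] > prec[0])  # → True True
```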
Interactions between Eurozone and US Booms and Busts: A Bayesian Panel Markov-switching VAR Model
M. Billio (Monica); R. Casarin (Roberto); F. Ravazzolo (Francesco); H.K. van Dijk (Herman)
2013-01-01
Interactions between the eurozone and US booms and busts and among major eurozone economies are analyzed by introducing a panel Markov-switching VAR model well suited to a multi-country cyclical analysis. The model accommodates changes in low and high data frequencies...
Statistical detection of EEG synchrony using empirical bayesian inference.
Directory of Open Access Journals (Sweden)
Archana K Singh
Full Text Available There is growing interest in understanding how the brain utilizes synchronized oscillatory activity to integrate information across functionally connected regions. Computing phase-locking values (PLV) between EEG signals is a popular method for quantifying such synchronizations and elucidating their role in cognitive tasks. However, high-dimensionality in PLV data incurs a serious multiple testing problem. Standard multiple testing methods in neuroimaging research (e.g., false discovery rate, FDR) suffer severe loss of power, because they fail to exploit the complex dependence structure between hypotheses that vary in the spectral, temporal and spatial dimensions. Previously, we showed that a hierarchical FDR and optimal discovery procedures could be effectively applied for PLV analysis to provide better power than FDR. In this article, we revisit the multiple comparison problem from a new Empirical Bayes perspective and propose the application of the local FDR method (locFDR; Efron, 2001) for PLV synchrony analysis to compute FDR as a posterior probability that an observed statistic belongs to a null hypothesis. We demonstrate the application of Efron's Empirical Bayes approach for PLV synchrony analysis for the first time. We use simulations to validate the specificity and sensitivity of locFDR and a real EEG dataset from a visual search study for experimental validation. We also compare locFDR with hierarchical FDR and optimal discovery procedures in both simulation and experimental analyses. Our simulation results showed that locFDR can effectively control false positives without compromising on the power of PLV synchrony inference. Our results from the application of locFDR to experimental data detected more significant discoveries than our previously proposed methods, whereas the standard FDR method failed to detect any significant discoveries.
Statistical detection of EEG synchrony using empirical bayesian inference.
Singh, Archana K; Asoh, Hideki; Takeda, Yuji; Phillips, Steven
2015-01-01
There is growing interest in understanding how the brain utilizes synchronized oscillatory activity to integrate information across functionally connected regions. Computing phase-locking values (PLV) between EEG signals is a popular method for quantifying such synchronizations and elucidating their role in cognitive tasks. However, high-dimensionality in PLV data incurs a serious multiple testing problem. Standard multiple testing methods in neuroimaging research (e.g., false discovery rate, FDR) suffer severe loss of power, because they fail to exploit the complex dependence structure between hypotheses that vary in the spectral, temporal and spatial dimensions. Previously, we showed that a hierarchical FDR and optimal discovery procedures could be effectively applied for PLV analysis to provide better power than FDR. In this article, we revisit the multiple comparison problem from a new Empirical Bayes perspective and propose the application of the local FDR method (locFDR; Efron, 2001) for PLV synchrony analysis to compute FDR as a posterior probability that an observed statistic belongs to a null hypothesis. We demonstrate the application of Efron's Empirical Bayes approach for PLV synchrony analysis for the first time. We use simulations to validate the specificity and sensitivity of locFDR and a real EEG dataset from a visual search study for experimental validation. We also compare locFDR with hierarchical FDR and optimal discovery procedures in both simulation and experimental analyses. Our simulation results showed that locFDR can effectively control false positives without compromising on the power of PLV synchrony inference. Our results from the application of locFDR to experimental data detected more significant discoveries than our previously proposed methods, whereas the standard FDR method failed to detect any significant discoveries.
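The core quantity in Efron's local FDR approach is easy to make concrete: locfdr(z) = p0·f0(z)/f(z), the posterior probability that a test statistic came from the null component of a two-group mixture. The sketch below assumes a known mixture for illustration; in the actual method, the marginal density f and null proportion p0 are estimated empirically from the full set of statistics.

```python
# Illustrative sketch of the local FDR: the posterior probability that an
# observed statistic z belongs to the null, locfdr(z) = p0 * f0(z) / f(z).
# The mixture components here are assumed known; in practice f and p0 are
# estimated from the data (the "empirical" part of Empirical Bayes).
import math

def norm_pdf(z, mu=0.0, sd=1.0):
    return math.exp(-0.5 * ((z - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def loc_fdr(z, p0=0.9, mu_alt=3.0):
    f0 = norm_pdf(z)             # null density N(0, 1)
    f1 = norm_pdf(z, mu=mu_alt)  # non-null density (an assumed alternative)
    f = p0 * f0 + (1 - p0) * f1  # marginal (mixture) density
    return p0 * f0 / f           # posterior probability of the null

# Small |z| is almost surely null; large z is almost surely a real effect.
print(round(loc_fdr(0.0), 3), round(loc_fdr(4.0), 3))
```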
Estimating uncertainty and reliability of social network data using Bayesian inference.
Farine, Damien R; Strandburg-Peshkin, Ariana
2015-09-01
Social network analysis provides a useful lens through which to view the structure of animal societies, and as a result its use is increasingly widespread. One challenge that many studies of animal social networks face is dealing with limited sample sizes, which introduces the potential for a high level of uncertainty in estimating the rates of association or interaction between individuals. We present a method based on Bayesian inference to incorporate uncertainty into network analyses. We test the reliability of this method at capturing both local and global properties of simulated networks, and compare it to a recently suggested method based on bootstrapping. Our results suggest that Bayesian inference can provide useful information about the underlying certainty in an observed network. When networks are well sampled, observed networks approach the real underlying social structure. However, when sampling is sparse, Bayesian inferred networks can provide realistic uncertainty estimates around edge weights. We also suggest a potential method for estimating the reliability of an observed network given the amount of sampling performed. This paper highlights how relatively simple procedures can be used to estimate uncertainty and reliability in studies using animal social network analysis.
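The general idea of the abstract, that sparse sampling should yield wide posteriors on edge weights and dense sampling narrow ones, falls out naturally from a conjugate Beta-binomial treatment of each dyad's association rate. The sketch below is an illustrative reduction, not the paper's method; the Beta(1, 1) prior and the function names are assumptions.

```python
# Minimal sketch: model each dyad's association rate as a probability with a
# Beta prior, so the posterior width reflects how much sampling was done.
# The uniform Beta(1, 1) prior is an illustrative choice.

def edge_posterior(times_together, times_observed, a=1.0, b=1.0):
    """Beta-binomial posterior (mean, variance) for an association rate."""
    a_post = a + times_together
    b_post = b + (times_observed - times_together)
    mean = a_post / (a_post + b_post)
    var = a_post * b_post / ((a_post + b_post) ** 2 * (a_post + b_post + 1))
    return mean, var

# Same observed association rate (50%), very different certainty:
sparse = edge_posterior(1, 2)     # seen together once in two sightings
dense = edge_posterior(50, 100)   # seen together 50 times in 100 sightings
print(sparse, dense)
```

The two dyads have identical point estimates, but the sparsely sampled edge carries far more posterior variance, which is exactly the uncertainty a bootstrap over a two-observation edge struggles to represent.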
Energy Technology Data Exchange (ETDEWEB)
Kang, Seong Keun; Seong, Poong Hyun [KAIST, Daejeon (Korea, Republic of)]
2014-08-15
Bayesian methodology has been widely used in various research fields. It is a method of inference that uses Bayes' rule to update the estimated probability of a hypothesis as additional evidence is acquired. According to current research, malfunctions of a nuclear power plant can be detected using this Bayesian inference, which consistently accumulates newly incoming data and updates its estimate. However, those studies are based on the assumption that people perform like a computer, perfectly, which can be criticized and may cause problems in real-world applications. Studies in cognitive psychology indicate that when the amount of information becomes large, people cannot retain all of the data, because they have a limited memory capacity, well known as working memory, and they also have limited attention. The purpose of this paper is to consider these psychological factors and to confirm how much working memory and attention affect the resulting estimates based on Bayesian inference. To confirm this, an experiment on humans is needed, and the experimental tool is the Compact Nuclear Simulator (CNS).
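The sequential updating this record builds on is Bayes' rule applied repeatedly: each new piece of evidence multiplies the prior by a likelihood and renormalizes. The hypotheses and likelihood values below are invented for illustration, not taken from the paper or from any plant model.

```python
# Minimal sketch of sequential Bayesian updating: the posterior after one
# evidence item becomes the prior for the next. All numbers are illustrative.

def bayes_update(prior, likelihoods):
    """prior, likelihoods: dicts mapping hypothesis -> probability."""
    unnorm = {h: prior[h] * likelihoods[h] for h in prior}
    z = sum(unnorm.values())            # normalizing constant (evidence)
    return {h: p / z for h, p in unnorm.items()}

belief = {"normal": 0.9, "malfunction": 0.1}
# Two successive alarms, each 4x more likely under a malfunction:
for _ in range(2):
    belief = bayes_update(belief, {"normal": 0.2, "malfunction": 0.8})

print({h: round(p, 3) for h, p in belief.items()})
```

A perfect Bayesian observer integrates every evidence item this way; the paper's point is that a human operator, limited by working memory and attention, may effectively drop terms from this product.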
Inferring cellular regulatory networks with Bayesian model averaging for linear regression (BMALR).
Huang, Xun; Zi, Zhike
2014-08-01
Bayesian network and linear regression methods have been widely applied to reconstruct cellular regulatory networks. In this work, we propose a Bayesian model averaging for linear regression (BMALR) method to infer molecular interactions in biological systems. This method uses a new closed form solution to compute the posterior probabilities of the edges from regulators to the target gene within a hybrid framework of Bayesian model averaging and linear regression methods. We have assessed the performance of BMALR by benchmarking on both in silico DREAM datasets and real experimental datasets. The results show that BMALR achieves both high prediction accuracy and high computational efficiency across different benchmarks. A pre-processing of the datasets with the log transformation can further improve the performance of BMALR, leading to a new top overall performance. In addition, BMALR can achieve robust high performance in community predictions when it is combined with other competing methods. The proposed method BMALR is competitive compared to the existing network inference methods. Therefore, BMALR will be useful to infer regulatory interactions in biological networks. A free open source software tool for the BMALR algorithm is available at https://sites.google.com/site/bmalr4netinfer/.
Implementing relevance feedback in ligand-based virtual screening using Bayesian inference network.
Abdo, Ammar; Salim, Naomie; Ahmed, Ali
2011-10-01
Recently, the use of the Bayesian network as an alternative to existing tools for similarity-based virtual screening has received noticeable attention from researchers in the chemoinformatics field. The main aim of the Bayesian network model is to improve the retrieval effectiveness of similarity-based virtual screening. To this end, different models of the Bayesian network have been developed. In our previous works, the retrieval performance of the Bayesian network was observed to improve significantly when multiple reference structures or fragment weightings were used. In this article, the authors enhance the Bayesian inference network (BIN) using the relevance feedback information. In this approach, a few high-ranking structures of unknown activity were filtered from the outputs of BIN, based on a single active reference structure, to form a set of active reference structures. This set of active reference structures was used in two distinct techniques for carrying out such BIN searching: reweighting the fragments in the reference structures and group fusion techniques. Simulated virtual screening experiments with three MDL Drug Data Report data sets showed that the proposed techniques provide simple ways of enhancing the cost-effectiveness of ligand-based virtual screening searches, especially for higher diversity data sets.
Li, Shi; Mukherjee, Bhramar; Batterman, Stuart; Ghosh, Malay
2013-12-01
Case-crossover designs are widely used to study short-term exposure effects on the risk of acute adverse health events. While the frequentist literature on this topic is vast, there is no Bayesian work in this general area. The contribution of this paper is twofold. First, the paper establishes Bayesian equivalence results that require characterization of the set of priors under which the posterior distributions of the risk ratio parameters based on a case-crossover and time-series analysis are identical. Second, the paper studies inferential issues under case-crossover designs in a Bayesian framework. Traditionally, a conditional logistic regression is used for inference on risk-ratio parameters in case-crossover studies. We consider instead a more general full likelihood-based approach which makes less restrictive assumptions on the risk functions. Formulation of a full likelihood leads to growth in the number of parameters proportional to the sample size. We propose a semi-parametric Bayesian approach using a Dirichlet process prior to handle the random nuisance parameters that appear in a full likelihood formulation. We carry out a simulation study to compare the Bayesian methods based on full and conditional likelihood with the standard frequentist approaches for case-crossover and time-series analysis. The proposed methods are illustrated through the Detroit Asthma Morbidity, Air Quality and Traffic study, which examines the association between acute asthma risk and ambient air pollutant concentrations.
Bayesian modeling using WinBUGS
Ntzoufras, Ioannis
2009-01-01
A hands-on introduction to the principles of Bayesian modeling using WinBUGS Bayesian Modeling Using WinBUGS provides an easily accessible introduction to the use of WinBUGS programming techniques in a variety of Bayesian modeling settings. The author provides an accessible treatment of the topic, offering readers a smooth introduction to the principles of Bayesian modeling with detailed guidance on the practical implementation of key principles. The book begins with a basic introduction to Bayesian inference and the WinBUGS software and goes on to cover key topics, including: Markov Chain Monte Carlo algorithms in Bayesian inference Generalized linear models Bayesian hierarchical models Predictive distribution and model checking Bayesian model and variable evaluation Computational notes and screen captures illustrate the use of both WinBUGS as well as R software to apply the discussed techniques. Exercises at the end of each chapter allow readers to test their understanding of the presented concepts and all ...
Gelfond, Jonathan A L; Gupta, Mayetri; Ibrahim, Joseph G
2009-12-01
We propose a unified framework for the analysis of chromatin (Ch) immunoprecipitation (IP) microarray (ChIP-chip) data for detecting transcription factor binding sites (TFBSs) or motifs. ChIP-chip assays are used to focus the genome-wide search for TFBSs by isolating a sample of DNA fragments with TFBSs and applying this sample to a microarray with probes corresponding to tiled segments across the genome. Present analytical methods use a two-step approach: (i) analyze array data to estimate IP-enrichment peaks then (ii) analyze the corresponding sequences independently of intensity information. The proposed model integrates peak finding and motif discovery through a unified Bayesian hidden Markov model (HMM) framework that accommodates the inherent uncertainty in both measurements. A Markov chain Monte Carlo algorithm is formulated for parameter estimation, adapting recursive techniques used for HMMs. In simulations and applications to a yeast RAP1 dataset, the proposed method has favorable TFBS discovery performance compared to currently available two-stage procedures in terms of both sensitivity and specificity.
Bayesian Methods for Statistical Analysis
Puza, Borek
2015-01-01
Bayesian methods for statistical analysis is a book on statistical methods for analysing a wide variety of data. The book consists of 12 chapters, starting with basic concepts and covering numerous topics, including Bayesian estimation, decision theory, prediction, hypothesis testing, hierarchical models, Markov chain Monte Carlo methods, finite population inference, biased sampling and nonignorable nonresponse. The book contains many exercises, all with worked solutions, including complete c...
Unraveling multiple changes in complex climate time series using Bayesian inference
Berner, Nadine; Trauth, Martin H.; Holschneider, Matthias
2016-04-01
Change points in time series are perceived as heterogeneities in the statistical or dynamical characteristics of observations. Unraveling such transitions yields essential information for the understanding of the observed system. The precise detection and basic characterization of underlying changes is therefore of particular importance in the environmental sciences. We present a kernel-based Bayesian inference approach to investigate direct as well as indirect climate observations for multiple generic transition events. In order to develop a diagnostic approach designed to capture a variety of natural processes, the basic statistical features of central tendency and dispersion are used to locally approximate a complex time series by a generic transition model. A Bayesian inversion approach is developed to robustly infer the location and the generic patterns of such a transition. To systematically investigate time series for multiple changes occurring at different temporal scales, the Bayesian inversion is extended to a kernel-based inference approach. By introducing basic kernel measures, the kernel inference results are composed into a proxy for a posterior distribution of multiple transitions. Thus, based on a generic transition model, a probability expression is derived that is capable of indicating multiple changes within a complex time series. We discuss the method's performance by investigating direct and indirect climate observations. The approach is applied to environmental time series (about 100 a) from the weather station in Tuscaloosa, Alabama, and confirms documented instrumentation changes. Moreover, the approach is used to investigate a set of complex terrigenous dust records from the ODP sites 659, 721/722 and 967, interpreted as climate indicators of the African region of the Plio-Pleistocene period (about 5 Ma). The detailed inference unravels multiple transitions underlying the indirect climate observations coinciding with established
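The core Bayesian inversion for a transition location can be shown in miniature. The toy below handles only a single mean shift in a Gaussian series with known variance, profiling out the segment means rather than integrating them, and uses a uniform prior over locations; the paper's kernel-based, multiple-change machinery is far richer than this.

```python
# Toy Bayesian change-point inference: posterior over the location of a
# single mean shift in a Gaussian series with known noise level. This is a
# deliberately simplified stand-in for the paper's generic transition model.
import math

def changepoint_posterior(x, sigma=1.0):
    n = len(x)
    log_post = []
    for k in range(1, n):                  # change between index k-1 and k
        ll = 0.0
        for seg in (x[:k], x[k:]):
            mu = sum(seg) / len(seg)       # profile out each segment's mean
            ll += sum(-((v - mu) ** 2) / (2 * sigma ** 2) for v in seg)
        log_post.append(ll)
    m = max(log_post)                      # stabilize before exponentiating
    w = [math.exp(l - m) for l in log_post]
    z = sum(w)
    return [v / z for v in w]              # normalized posterior over k = 1..n-1

x = [0.1, -0.2, 0.0, 0.1, 2.1, 1.9, 2.2, 2.0]   # shift after the 4th value
post = changepoint_posterior(x)
print(post.index(max(post)) + 1)                 # most probable change location
```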
BiomeNet: a Bayesian model for inference of metabolic divergence among microbial communities.
Directory of Open Access Journals (Sweden)
Mahdi Shafiei
2014-11-01
Full Text Available Metagenomics yields enormous numbers of microbial sequences that can be assigned a metabolic function. Using such data to infer community-level metabolic divergence is hindered by the lack of a suitable statistical framework. Here, we describe a novel hierarchical Bayesian model, called BiomeNet (Bayesian inference of metabolic networks), for inferring differential prevalence of metabolic subnetworks among microbial communities. To infer the structure of community-level metabolic interactions, BiomeNet applies a mixed-membership modelling framework to enzyme abundance information. The basic idea is that the mixture components of the model (metabolic reactions, subnetworks, and networks) are shared across all groups (microbiome samples), but the mixture proportions vary from group to group. Through this framework, the model can capture nested structures within the data. BiomeNet is unique in modeling each metagenome sample as a mixture of complex metabolic systems (metabosystems). The metabosystems are composed of mixtures of tightly connected metabolic subnetworks. BiomeNet differs from other unsupervised methods by allowing researchers to discriminate groups of samples through the metabolic patterns it discovers in the data, and by providing a framework for interpreting them. We describe a collapsed Gibbs sampler for inference of the mixture weights under BiomeNet, and we use simulation to validate the inference algorithm. Application of BiomeNet to human gut metagenomes revealed a metabosystem with greater prevalence among inflammatory bowel disease (IBD) patients. Based on the discriminatory subnetworks for this metabosystem, we inferred that the community is likely to be closely associated with the human gut epithelium, resistant to dietary interventions, and interfere with human uptake of an antioxidant connected to IBD. Because this metabosystem has a greater capacity to exploit host-associated glycans, we speculate that IBD
A bayesian framework that integrates heterogeneous data for inferring gene regulatory networks.
Santra, Tapesh
2014-01-01
Reconstruction of gene regulatory networks (GRNs) from experimental data is a fundamental challenge in systems biology. A number of computational approaches have been developed to infer GRNs from mRNA expression profiles. However, expression profiles alone are proving to be insufficient for inferring GRN topologies with reasonable accuracy. Recently, it has been shown that integration of external data sources (such as gene and protein sequence information, gene ontology data, protein-protein interactions) with mRNA expression profiles may increase the reliability of the inference process. Here, I propose a new approach that incorporates transcription factor binding sites (TFBS) and physical protein interactions (PPI) among transcription factors (TFs) in a Bayesian variable selection (BVS) algorithm which can infer GRNs from mRNA expression profiles subjected to genetic perturbations. Using real experimental data, I show that the integration of TFBS and PPI data with mRNA expression profiles leads to significantly more accurate networks than those inferred from expression profiles alone. Additionally, the performance of the proposed algorithm is compared with a series of least absolute shrinkage and selection operator (LASSO) regression-based network inference methods that can also incorporate prior knowledge in the inference framework. The results of this comparison suggest that BVS can outperform LASSO regression-based method in some circumstances.
A Bayesian Framework that integrates heterogeneous data for inferring gene regulatory networks
Directory of Open Access Journals (Sweden)
Tapesh eSantra
2014-05-01
Full Text Available Reconstruction of gene regulatory networks (GRNs) from experimental data is a fundamental challenge in systems biology. A number of computational approaches have been developed to infer GRNs from mRNA expression profiles. However, expression profiles alone are proving to be insufficient for inferring GRN topologies with reasonable accuracy. Recently, it has been shown that integration of external data sources (such as gene and protein sequence information, gene ontology data, protein-protein interactions) with mRNA expression profiles may increase the reliability of the inference process. Here, I propose a new approach that incorporates transcription factor binding sites (TFBS) and physical protein interactions (PPI) among transcription factors (TFs) in a Bayesian Variable Selection (BVS) algorithm which can infer GRNs from mRNA expression profiles subjected to genetic perturbations. Using real experimental data, I show that the integration of TFBS and PPI data with mRNA expression profiles leads to significantly more accurate networks than those inferred from expression profiles alone. Additionally, the performance of the proposed algorithm is compared with a series of LASSO regression-based network inference methods that can also incorporate prior knowledge in the inference framework. The results of this comparison suggest that BVS can outperform LASSO regression-based methods in some circumstances.
Truth, models, model sets, AIC, and multimodel inference: a Bayesian perspective
Barker, Richard J.; Link, William A.
2015-01-01
Statistical inference begins with viewing data as realizations of stochastic processes. Mathematical models provide partial descriptions of these processes; inference is the process of using the data to obtain a more complete description of the stochastic processes. Wildlife and ecological scientists have become increasingly concerned with the conditional nature of model-based inference: what if the model is wrong? Over the last 2 decades, Akaike's Information Criterion (AIC) has been widely and increasingly used in wildlife statistics for 2 related purposes, first for model choice and second to quantify model uncertainty. We argue that for the second of these purposes, the Bayesian paradigm provides the natural framework for describing uncertainty associated with model choice and provides the most easily communicated basis for model weighting. Moreover, Bayesian arguments provide the sole justification for interpreting model weights (including AIC weights) as coherent (mathematically self-consistent) model probabilities. This interpretation requires treating the model as an exact description of the data-generating mechanism. We discuss the implications of this assumption, and conclude that more emphasis is needed on model checking to provide confidence in the quality of inference.
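The model weights under discussion are simple to compute: Akaike weights are w_i ∝ exp(-Δ_i/2), where Δ_i is each model's AIC minus the minimum AIC, and the authors' point is that reading these as model probabilities is a Bayesian move. The AIC values below are invented for illustration.

```python
# Akaike weights: w_i proportional to exp(-delta_i / 2), normalized to sum
# to one. Whether these are "model probabilities" is exactly the paper's
# Bayesian question. The AIC values are illustrative.
import math

def aic_weights(aics):
    best = min(aics)
    raw = [math.exp(-(a - best) / 2) for a in aics]
    z = sum(raw)
    return [r / z for r in raw]

w = aic_weights([100.0, 102.0, 110.0])
print([round(v, 3) for v in w])
```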
Jennen, Danyel G J; van Leeuwen, Danitsja M; Hendrickx, Diana M; Gottschalk, Ralph W H; van Delft, Joost H M; Kleinjans, Jos C S
2015-10-19
Microarray-based transcriptomic analysis has been demonstrated to hold the opportunity to study the effects of human exposure to, e.g., chemical carcinogens at the whole genome level, thus yielding broad-ranging molecular information on possible carcinogenic effects. Since genes do not operate individually but rather through concerted interactions, analyzing and visualizing networks of genes should provide important mechanistic information, especially upon connecting them to functional parameters, such as those derived from measurements of biomarkers for exposure and carcinogenic risk. Conventional methods such as hierarchical clustering and correlation analyses are frequently used to address these complex interactions but are limited as they do not provide directional causal dependence relationships. Therefore, our aim was to apply Bayesian network inference with the purpose of phenotypic anchoring of modified gene expressions. We investigated a use case on transcriptomic responses to cigarette smoking in humans, in association with plasma cotinine levels as biomarkers of exposure and aromatic DNA-adducts in blood cells as biomarkers of carcinogenic risk. Many of the genes that appear in the Bayesian networks surrounding plasma cotinine, and to a lesser extent around aromatic DNA-adducts, hold biologically relevant functions in inducing severe adverse effects of smoking. In conclusion, this study shows that Bayesian network inference enables unbiased phenotypic anchoring of transcriptomics responses. Furthermore, in all inferred Bayesian networks several dependencies are found which point to known but also to new relationships between the expression of specific genes, cigarette smoke exposure, DNA damaging-effects, and smoking-related diseases, in particular associated with apoptosis, DNA repair, and tumor suppression, as well as with autoimmunity.
Bayesian clustering of DNA sequences using Markov chains and a stochastic partition model.
Jääskinen, Väinö; Parkkinen, Ville; Cheng, Lu; Corander, Jukka
2014-02-01
In many biological applications it is necessary to cluster DNA sequences into groups that represent underlying organismal units, such as named species or genera. In metagenomics this grouping typically needs to be achieved on the basis of relatively short sequences which contain different types of errors, making the use of a statistical modeling approach desirable. Here we introduce a novel method for this purpose by developing a stochastic partition model that clusters Markov chains of a given order. The model is based on a Dirichlet process prior and we use conjugate priors for the Markov chain parameters, which enables an analytical expression for comparing the marginal likelihoods of any two partitions. To find a good candidate for the posterior mode in the partition space, we use a hybrid computational approach which combines the EM algorithm with a greedy search. This is demonstrated to be faster and to yield highly accurate results compared to earlier suggested clustering methods for the metagenomics application. Our model is fairly generic and could also be used for clustering of other types of sequence data for which Markov chains provide a reasonable way to compress information, as illustrated by experiments on shotgun sequence type data from an Escherichia coli strain.
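The analytical tractability the abstract relies on comes from Dirichlet-multinomial conjugacy: with a Dirichlet prior on each row of a Markov transition matrix, the marginal likelihood of observed transition counts has a closed form in Gamma functions, so partitions can be compared without sampling the parameters. The sketch below shows that computation on an invented two-state example; the symmetric Dirichlet(1) prior and the counts are illustrative assumptions, not the paper's data.

```python
# Closed-form log marginal likelihood of Markov transition counts under a
# symmetric Dirichlet(alpha) prior on each row of the transition matrix.
# Counts and prior are illustrative; the formula is the standard
# Dirichlet-multinomial marginal, applied row by row.
import math

def log_marginal_likelihood(counts, alpha=1.0):
    """counts[i][j] = number of observed transitions from state i to state j."""
    k = len(counts)
    lml = 0.0
    for row in counts:
        n = sum(row)
        lml += math.lgamma(k * alpha) - math.lgamma(k * alpha + n)
        for c in row:
            lml += math.lgamma(alpha + c) - math.lgamma(alpha)
    return lml

# Transition counts from two sequences over states {A, B}:
seq1 = [[8, 2], [3, 7]]
seq2 = [[7, 3], [2, 8]]          # similar "sticky" dynamics
pooled = [[15, 5], [5, 15]]      # counts if the two are one cluster

# Log marginal-likelihood ratio for "one cluster" vs "two clusters":
log_bayes_factor = log_marginal_likelihood(pooled) - (
    log_marginal_likelihood(seq1) + log_marginal_likelihood(seq2))
print(round(log_bayes_factor, 3))
```

Here the two sequences have similar dynamics, so the one-cluster marginal likelihood wins; dissimilar counts would flip the sign, which is the comparison driving the partition search.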
MultiNest: an efficient and robust Bayesian inference tool for cosmology and particle physics
Feroz, F; Bridges, M
2008-01-01
We present further development and the first public release of our multimodal nested sampling algorithm, called MultiNest. This Bayesian inference tool calculates the evidence, with an associated error estimate, and produces posterior samples from distributions that may contain multiple modes and pronounced (curving) degeneracies in high dimensions. The developments presented here lead to further substantial improvements in sampling efficiency and robustness, as compared to the original algorithm presented in Feroz & Hobson (2008), which itself significantly outperformed existing MCMC techniques in a wide range of astrophysical inference problems. The accuracy and economy of the MultiNest algorithm is demonstrated by application to two toy problems and to a cosmological inference problem focussing on the extension of the vanilla $\\Lambda$CDM model to include spatial curvature and a varying equation of state for dark energy. The MultiNest software, which is fully parallelized using MPI and includes an inte...
Hernández, Mario R.; Francés, Félix
2015-04-01
One phase of the hydrological model implementation process that contributes significantly to the uncertainty of hydrological predictions is the calibration phase, in which values of the unknown model parameters are tuned by optimizing an objective function. An unsuitable error model (e.g. Standard Least Squares, SLS) introduces noise into the estimation of the parameters. The main sources of this noise are input errors and structural deficiencies of the hydrological model. The biased calibrated parameters thus cause the model divergence phenomenon, in which the error variance of the (spatially and temporally) forecasted flows far exceeds the error variance in the fitting period, and provoke the loss of part or all of the physical meaning of the modeled processes; in other words, they yield a calibrated hydrological model that works well, but not for the right reasons. In addition, an unsuitable error model yields an unreliable assessment of predictive uncertainty. Hence, with the aim of preventing all these undesirable effects, this research focuses on the Bayesian joint inference (BJI) of both the hydrological and error model parameters, considering a general additive (GA) error model that allows for correlation, non-stationarity (in variance and bias) and non-normality of model residuals. As the hydrological model, a conceptual distributed model called TETIS has been used, with a particular split structure of the effective model parameters. Bayesian inference has been performed with the aid of a Markov Chain Monte Carlo (MCMC) algorithm called DREAM-ZS. The MCMC algorithm quantifies the uncertainty of the hydrological and error model parameters by obtaining the joint posterior probability distribution, conditioned on the observed flows. The BJI methodology is a very powerful and reliable tool, but it must be used correctly; that is, if non-stationarity in error variance and bias is modeled, the Total Laws must be taken into account. The results of this research show that the
Anderson, Eric C; Ng, Thomas C
2016-02-01
We develop a computational framework for addressing pedigree inference problems using small numbers (80-400) of single nucleotide polymorphisms (SNPs). Our approach relaxes the assumptions, which are commonly made, that sampling is complete with respect to the pedigree and that there is no genotyping error. It relies on representing the inferred pedigree as a factor graph and invoking the Sum-Product algorithm to compute and store quantities that allow the joint probability of the data to be rapidly computed under a large class of rearrangements of the pedigree structure. This allows efficient MCMC sampling over the space of pedigrees, and, hence, Bayesian inference of pedigree structure. In this paper we restrict ourselves to inference of pedigrees without loops using SNPs assumed to be unlinked. We present the methodology in general for multigenerational inference, and we illustrate the method by applying it to the inference of full sibling groups in a large sample (n=1157) of Chinook salmon typed at 95 SNPs. The results show that our method provides a better point estimate and estimate of uncertainty than the currently best-available maximum-likelihood sibling reconstruction method. Extensions of this work to more complex scenarios are briefly discussed. Published by Elsevier Inc.
Inference of emission rates from multiple sources using Bayesian probability theory.
Yee, Eugene; Flesch, Thomas K
2010-03-01
The determination of atmospheric emission rates from multiple sources using inversion (regularized least-squares or best-fit technique) is known to be very susceptible to measurement and model errors in the problem, rendering the solution unusable. In this paper, a new perspective is offered for this problem: namely, it is argued that the problem should be addressed as one of inference rather than inversion. Towards this objective, Bayesian probability theory is used to estimate the emission rates from multiple sources. The posterior probability distribution for the emission rates is derived, accounting fully for the measurement errors in the concentration data and the model errors in the dispersion model used to interpret the data. The Bayesian inferential methodology for emission rate recovery is validated against real dispersion data, obtained from a field experiment involving various source-sensor geometries (scenarios) consisting of four synthetic area sources and eight concentration sensors. The recovery of discrete emission rates from three different scenarios obtained using Bayesian inference and singular value decomposition inversion are compared and contrasted.
McGeachie, Michael J; Chang, Hsun-Hsien; Weiss, Scott T
2014-06-01
Bayesian Networks (BN) have been a popular predictive modeling formalism in bioinformatics, but their application in modern genomics has been slowed by an inability to cleanly handle domains with mixed discrete and continuous variables. Existing free BN software packages either discretize continuous variables, which can lead to information loss, or do not include inference routines, which makes prediction with the BN impossible. We present CGBayesNets, a BN package focused around prediction of a clinical phenotype from mixed discrete and continuous variables, which fills these gaps. CGBayesNets implements Bayesian likelihood and inference algorithms for the conditional Gaussian Bayesian network (CGBNs) formalism, one appropriate for predicting an outcome of interest from, e.g., multimodal genomic data. We provide four different network learning algorithms, each making a different tradeoff between computational cost and network likelihood. CGBayesNets provides a full suite of functions for model exploration and verification, including cross validation, bootstrapping, and AUC manipulation. We highlight several results obtained previously with CGBayesNets, including predictive models of wood properties from tree genomics, leukemia subtype classification from mixed genomic data, and robust prediction of intensive care unit mortality outcomes from metabolomic profiles. We also provide detailed example analysis on public metabolomic and gene expression datasets. CGBayesNets is implemented in MATLAB and available as MATLAB source code, under an Open Source license and anonymous download at http://www.cgbayesnets.com.
Directory of Open Access Journals (Sweden)
Michael J McGeachie
2014-06-01
Bayesian Networks (BN) have been a popular predictive modeling formalism in bioinformatics, but their application in modern genomics has been slowed by an inability to cleanly handle domains with mixed discrete and continuous variables. Existing free BN software packages either discretize continuous variables, which can lead to information loss, or do not include inference routines, which makes prediction with the BN impossible. We present CGBayesNets, a BN package focused around prediction of a clinical phenotype from mixed discrete and continuous variables, which fills these gaps. CGBayesNets implements Bayesian likelihood and inference algorithms for the conditional Gaussian Bayesian network (CGBNs) formalism, one appropriate for predicting an outcome of interest from, e.g., multimodal genomic data. We provide four different network learning algorithms, each making a different tradeoff between computational cost and network likelihood. CGBayesNets provides a full suite of functions for model exploration and verification, including cross validation, bootstrapping, and AUC manipulation. We highlight several results obtained previously with CGBayesNets, including predictive models of wood properties from tree genomics, leukemia subtype classification from mixed genomic data, and robust prediction of intensive care unit mortality outcomes from metabolomic profiles. We also provide detailed example analysis on public metabolomic and gene expression datasets. CGBayesNets is implemented in MATLAB and available as MATLAB source code, under an Open Source license and anonymous download at http://www.cgbayesnets.com.
Planetary micro-rover operations on Mars using a Bayesian framework for inference and control
Post, Mark A.; Li, Junquan; Quine, Brendan M.
2016-03-01
With the recent progress toward the application of commercially-available hardware to small-scale space missions, it is now becoming feasible for groups of small, efficient robots based on low-power embedded hardware to perform simple tasks on other planets in the place of large-scale, heavy and expensive robots. In this paper, we describe the design and programming of the Beaver micro-rover developed for Northern Light, a Canadian initiative to send a small lander and rover to Mars to study the Martian surface and subsurface. For a small, hardware-limited rover to handle an uncertain and mostly unknown environment without constant management by human operators, we use a Bayesian network of discrete random variables as an abstraction of expert knowledge about the rover and its environment, and inference operations for control. A framework for efficient construction of and inference in a Bayesian network, using only the C language and fixed-point mathematics on embedded hardware, has been developed for the Beaver to make intelligent decisions with minimal sensor data. We study the performance of the Beaver as it probabilistically maps a simple outdoor environment with sensor models that include uncertainty. Results indicate that the Beaver and other small and simple robotic platforms can make use of a Bayesian network to make intelligent decisions in uncertain planetary environments.
Wang, Xiaoxiao; Wang, Huan; Huang, Jinfeng; Zhou, Yifeng; Tzvetanov, Tzvetomir
2017-01-01
The contrast sensitivity function that spans the two dimensions of contrast and spatial frequency is crucial in predicting functional vision both in research and clinical applications. In this study, the use of Bayesian inference was proposed to determine the parameters of the two-dimensional contrast sensitivity function. Two-dimensional Bayesian inference was extensively simulated in comparison to classical one-dimensional measures. Its performance on two-dimensional data gathered with different sampling algorithms was also investigated. The results showed that the two-dimensional Bayesian inference method significantly improved the accuracy and precision of the contrast sensitivity function, as compared to the more common one-dimensional estimates. In addition, applying two-dimensional Bayesian estimation to the final data set showed similar levels of reliability and efficiency across widely disparate and established sampling methods (from classical one-dimensional sampling, such as Ψ or staircase, to more novel multi-dimensional sampling methods, such as quick contrast sensitivity function and Fisher information gain). Furthermore, the improvements observed following the application of Bayesian inference were maintained even when the prior poorly matched the subject's contrast sensitivity function. Simulation results were confirmed in a psychophysical experiment. The results indicated that two-dimensional Bayesian inference of contrast sensitivity function data provides similar estimates across a wide range of sampling methods. The present study likely has implications for the measurement of contrast sensitivity function in various settings (including research and clinical settings) and would facilitate the comparison of existing data from previous studies. PMID:28119563
Bayesian techniques for fatigue life prediction and for inference in linear time dependent PDEs
Scavino, Marco
2016-01-08
In this talk we introduce first the main characteristics of a systematic statistical approach to model calibration, model selection and model ranking when stress-life data are drawn from a collection of records of fatigue experiments. Focusing on Bayesian prediction assessment, we consider fatigue-limit models and random fatigue-limit models under different a priori assumptions. In the second part of the talk, we present a hierarchical Bayesian technique for the inference of the coefficients of time dependent linear PDEs, under the assumption that noisy measurements are available in both the interior of a domain of interest and from boundary conditions. We present a computational technique based on the marginalization of the contribution of the boundary parameters and apply it to inverse heat conduction problems.
BioEM: GPU-accelerated computing of Bayesian inference of electron microscopy images
Cossio, Pilar; Baruffa, Fabio; Rampp, Markus; Lindenstruth, Volker; Hummer, Gerhard
2016-01-01
In cryo-electron microscopy (EM), molecular structures are determined from large numbers of projection images of individual particles. To harness the full power of this single-molecule information, we use the Bayesian inference of EM (BioEM) formalism. By ranking structural models using posterior probabilities calculated for individual images, BioEM in principle addresses the challenge of working with highly dynamic or heterogeneous systems not easily handled in traditional EM reconstruction. However, the calculation of these posteriors for large numbers of particles and models is computationally demanding. Here we present highly parallelized, GPU-accelerated computer software that performs this task efficiently. Our flexible formulation employs CUDA, OpenMP, and MPI parallelization combined with both CPU and GPU computing. The resulting BioEM software scales nearly ideally both on pure CPU and on CPU+GPU architectures, thus enabling Bayesian analysis of tens of thousands of images in a reasonable time. The g...
Moving in time: Bayesian causal inference explains movement coordination to auditory beats.
Elliott, Mark T; Wing, Alan M; Welchman, Andrew E
2014-07-07
Many everyday skilled actions depend on moving in time with signals that are embedded in complex auditory streams (e.g. musical performance, dancing or simply holding a conversation). Such behaviour is apparently effortless; however, it is not known how humans combine auditory signals to support movement production and coordination. Here, we test how participants synchronize their movements when there are potentially conflicting auditory targets to guide their actions. Participants tapped their fingers in time with two simultaneously presented metronomes of equal tempo, but differing in phase and temporal regularity. Synchronization therefore depended on integrating the two timing cues into a single-event estimate or treating the cues as independent and thereby selecting one signal over the other. We show that a Bayesian inference process explains the situations in which participants choose to integrate or separate signals, and predicts motor timing errors. Simulations of this causal inference process demonstrate that this model provides a better description of the data than other plausible models. Our findings suggest that humans exploit a Bayesian inference process to control movement timing in situations where the origin of auditory signals needs to be resolved.
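The integrate-versus-segregate decision described above is the standard Bayesian causal-inference model for cue combination (in the style of Körding et al.); a minimal sketch, with all noise parameters invented for illustration:

```python
import numpy as np

def causal_inference(x1, x2, s1, s2, sp, p_common=0.5):
    """Posterior probability that two timing cues share a common cause,
    plus the precision-weighted fusion estimate.  Illustrative only:
    s1, s2 are sensory noise stds, sp is the prior std over cue position."""
    # Marginal likelihood under a single shared cause s ~ N(0, sp^2)
    var_c = s1**2 * s2**2 + s1**2 * sp**2 + s2**2 * sp**2
    like_c = np.exp(-((x1 - x2)**2 * sp**2 + x1**2 * s2**2 + x2**2 * s1**2)
                    / (2.0 * var_c)) / (2.0 * np.pi * np.sqrt(var_c))
    # Marginal likelihood under two independent causes
    like_i = (np.exp(-x1**2 / (2.0 * (s1**2 + sp**2)))
              / np.sqrt(2.0 * np.pi * (s1**2 + sp**2))
              * np.exp(-x2**2 / (2.0 * (s2**2 + sp**2)))
              / np.sqrt(2.0 * np.pi * (s2**2 + sp**2)))
    post_c = like_c * p_common / (like_c * p_common + like_i * (1 - p_common))
    # If integrating, the optimal estimate is the precision-weighted average
    fused = (x1 / s1**2 + x2 / s2**2) / (1.0 / s1**2 + 1.0 / s2**2)
    return post_c, fused

p_common, fused = causal_inference(0.05, 0.07, 0.3, 0.3, 2.0)
print(p_common)  # near-coincident cues favor a common cause
```

Near-coincident cues push the posterior toward integration; widely separated cues push it toward treating the metronomes as independent and selecting one.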
Data-based inference of generators for Markov jump processes using convex optimization
Crommelin, D.T.; Vanden-Eijnden, E.
2009-01-01
A variational approach to the estimation of generators for Markov jump processes from discretely sampled data is discussed and generalized. In this approach, one first calculates the spectrum of the discrete maximum likelihood estimator for the transition matrix consistent with the discrete data. Th
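The spectral step referred to here, recovering a generator from a discretely sampled transition matrix, can be illustrated with the naive matrix-logarithm estimator. The 3-state generator below is invented for the sketch; the paper's variational machinery addresses the realistic case where this naive estimate of an *estimated* transition matrix fails to be a valid generator:

```python
import numpy as np
from scipy.linalg import expm, logm

# Hypothetical 3-state generator: rows sum to zero, off-diagonals non-negative.
L = np.array([[-0.5,  0.3,  0.2],
              [ 0.2, -0.4,  0.2],
              [ 0.1,  0.4, -0.5]])
dt = 0.1  # sampling interval of the discretely observed chain

# Transition matrix of the chain over one sampling interval
P = expm(dt * L)

# Naive spectral estimate: principal matrix logarithm divided by dt.
# With a noisy empirical P this can produce negative off-diagonal entries,
# the failure mode that the convex-optimization step is designed to repair.
L_hat = np.real(logm(P)) / dt
print(np.max(np.abs(L_hat - L)))
```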
Bayesian inference of nonlinear unsteady aerodynamics from aeroelastic limit cycle oscillations
Sandhu, Rimple; Poirel, Dominique; Pettit, Chris; Khalil, Mohammad; Sarkar, Abhijit
2016-07-01
A Bayesian model selection and parameter estimation algorithm is applied to investigate the influence of nonlinear and unsteady aerodynamic loads on the limit cycle oscillation (LCO) of a pitching airfoil in the transitional Reynolds number regime. At small angles of attack, laminar boundary layer trailing edge separation causes negative aerodynamic damping leading to the LCO. The fluid-structure interaction of the rigid, but elastically mounted, airfoil and nonlinear unsteady aerodynamics is represented by two coupled nonlinear stochastic ordinary differential equations containing uncertain parameters and model approximation errors. Several plausible aerodynamic models with increasing complexity are proposed to describe the aeroelastic system leading to LCO. The likelihood in the posterior parameter probability density function (pdf) is available semi-analytically using the extended Kalman filter for the state estimation of the coupled nonlinear structural and unsteady aerodynamic model. The posterior parameter pdf is sampled using a parallel and adaptive Markov Chain Monte Carlo (MCMC) algorithm. The posterior probability of each model is estimated using the Chib-Jeliazkov method that directly uses the posterior MCMC samples for evidence (marginal likelihood) computation. The Bayesian algorithm is validated through a numerical study and then applied to model the nonlinear unsteady aerodynamic loads using wind-tunnel test data at various Reynolds numbers.
Bayesian inference of nonlinear unsteady aerodynamics from aeroelastic limit cycle oscillations
Energy Technology Data Exchange (ETDEWEB)
Sandhu, Rimple [Department of Civil and Environmental Engineering, Carleton University, Ottawa, Ontario (Canada); Poirel, Dominique [Department of Mechanical and Aerospace Engineering, Royal Military College of Canada, Kingston, Ontario (Canada); Pettit, Chris [Department of Aerospace Engineering, United States Naval Academy, Annapolis, MD (United States); Khalil, Mohammad [Department of Civil and Environmental Engineering, Carleton University, Ottawa, Ontario (Canada); Sarkar, Abhijit, E-mail: abhijit.sarkar@carleton.ca [Department of Civil and Environmental Engineering, Carleton University, Ottawa, Ontario (Canada)
2016-07-01
A Bayesian model selection and parameter estimation algorithm is applied to investigate the influence of nonlinear and unsteady aerodynamic loads on the limit cycle oscillation (LCO) of a pitching airfoil in the transitional Reynolds number regime. At small angles of attack, laminar boundary layer trailing edge separation causes negative aerodynamic damping leading to the LCO. The fluid–structure interaction of the rigid, but elastically mounted, airfoil and nonlinear unsteady aerodynamics is represented by two coupled nonlinear stochastic ordinary differential equations containing uncertain parameters and model approximation errors. Several plausible aerodynamic models with increasing complexity are proposed to describe the aeroelastic system leading to LCO. The likelihood in the posterior parameter probability density function (pdf) is available semi-analytically using the extended Kalman filter for the state estimation of the coupled nonlinear structural and unsteady aerodynamic model. The posterior parameter pdf is sampled using a parallel and adaptive Markov Chain Monte Carlo (MCMC) algorithm. The posterior probability of each model is estimated using the Chib–Jeliazkov method that directly uses the posterior MCMC samples for evidence (marginal likelihood) computation. The Bayesian algorithm is validated through a numerical study and then applied to model the nonlinear unsteady aerodynamic loads using wind-tunnel test data at various Reynolds numbers.
Bayesian inference in mass flow simulations - from back calculation to prediction
Kofler, Andreas; Fischer, Jan-Thomas; Hellweger, Valentin; Huber, Andreas; Mergili, Martin; Pudasaini, Shiva; Fellin, Wolfgang; Oberguggenberger, Michael
2017-04-01
Mass flow simulations are an integral part of hazard assessment. Determining the hazard potential requires a multidisciplinary approach, including different scientific fields such as geomorphology, meteorology, physics, civil engineering and mathematics. An important task in snow avalanche simulation is to predict process intensities (runout, flow velocity and depth, ...). The application of probabilistic methods allows one to develop a comprehensive simulation concept, ranging from back to forward calculation and finally to prediction of mass flow events. In this context optimized parameter sets for the used simulation model or intensities of the modeled mass flow process (e.g. runout distances) are represented by probability distributions. Existing deterministic flow models, in particular with respect to snow avalanche dynamics, contain several parameters (e.g. friction). Some of these parameters are more conceptual than physical and their direct measurement in the field is hardly possible. Hence, parameters have to be optimized by matching simulation results to field observations. This inverse problem can be solved by a Bayesian approach (Markov chain Monte Carlo). The optimization process yields parameter distributions, that can be utilized for probabilistic reconstruction and prediction of avalanche events. Arising challenges include the limited amount of observations, correlations appearing in model parameters or observed avalanche characteristics (e.g. velocity and runout) and the accurate handling of ensemble simulations, always taking into account the related uncertainties. Here we present an operational Bayesian simulation framework with r.avaflow, the open source GIS simulation model for granular avalanches and debris flows.
Paul, Sudeshna; Friedman, Alan M; Bailey-Kellogg, Chris; Craig, Bruce A
2013-04-01
The interatomic distance distribution, P(r), is a valuable tool for evaluating the structure of a molecule in solution and represents the maximum structural information that can be derived from solution scattering data without further assumptions. Most current instrumentation for scattering experiments (typically CCD detectors) generates a finely pixelated two-dimensional image. In continuation of the standard practice with earlier one-dimensional detectors, these images are typically reduced to a one-dimensional profile of scattering intensities, I(q), by circular averaging of the two-dimensional image. Indirect Fourier transformation methods are then used to reconstruct P(r) from I(q). Substantial advantages in data analysis, however, could be achieved by directly estimating the P(r) curve from the two-dimensional images. This article describes a Bayesian framework, using a Markov chain Monte Carlo method, for estimating the parameters of the indirect transform, and thus P(r), directly from the two-dimensional images. Using simulated detector images, it is demonstrated that this method yields P(r) curves nearly identical to the reference P(r). Furthermore, an approach for evaluating spatially correlated errors (such as those that arise from a detector point spread function) is evaluated. Accounting for these errors further improves the precision of the P(r) estimation. Experimental scattering data, where no ground truth reference P(r) is available, are used to demonstrate that this method yields a scattering and detector model that more closely reflects the two-dimensional data, as judged by smaller residuals in cross-validation, than P(r) obtained by indirect transformation of a one-dimensional profile. Finally, the method allows concurrent estimation of the beam center and Dmax, the longest interatomic distance in P(r), as part of the Bayesian Markov chain Monte Carlo method, reducing experimental effort and providing a well defined protocol for these
Directory of Open Access Journals (Sweden)
Y. Paudel
2013-03-01
This study applies Bayesian inference to estimate flood risk for 53 dyke ring areas in the Netherlands, and focuses particularly on the data scarcity and extreme behaviour of catastrophe risk. The probability density curves of flood damage are estimated through Monte Carlo simulations. Based on these results, flood insurance premiums are estimated using two different practical methods that each account in different ways for an insurer's risk aversion and the dispersion rate of loss data. This study is of practical relevance because insurers have been considering the introduction of flood insurance in the Netherlands, which is currently not generally available.
DIP -- Diagnostics for Insufficiencies of Posterior calculations in Bayesian signal inference
Dorn, Sebastian; Enßlin, Torsten A
2013-01-01
We present an error-diagnostic validation method for posterior distributions in Bayesian signal inference. It transfers deviations from the correct posterior into characteristic deviations from a uniform distribution of a quantity constructed for this purpose. We show that this method is able to reveal and discriminate several kinds of numerical and approximation errors. For this we present a number of analytical examples of posteriors with incorrect variance, skewness, position of the maximum, or normalization. We show further how this test can be applied to multidimensional signals.
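The idea of mapping posterior errors onto deviations from uniformity can be sketched in a conjugate Gaussian toy model: if the computed posterior is correct, its CDF evaluated at the true signal is uniform on [0, 1], and a wrong variance, skewness, or position of the maximum would show up as systematic non-uniformity. This is an illustrative analogue of the diagnostic, not the paper's exact construction:

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(1)

# Conjugate toy problem: signal s ~ N(0, 1), data d = s + noise, noise var 0.25.
prior_var, noise_var = 1.0, 0.25
post_var = 1.0 / (1.0 / prior_var + 1.0 / noise_var)   # posterior variance = 0.2
shrink = prior_var / (prior_var + noise_var)           # posterior mean = shrink * d

def norm_cdf(x, mu, var):
    return 0.5 * (1.0 + erf((x - mu) / sqrt(2.0 * var)))

u = []
for _ in range(2000):
    s = rng.normal(0.0, np.sqrt(prior_var))       # draw a signal from the prior
    d = s + rng.normal(0.0, np.sqrt(noise_var))   # simulate data
    u.append(norm_cdf(s, shrink * d, post_var))   # posterior CDF at the truth
u = np.array(u)

# For the correct posterior these values are ~Uniform(0, 1); a mis-specified
# posterior produces a characteristic departure from mean 1/2 and var 1/12.
print(u.mean(), u.var())
```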
Hinsen, Konrad; Kneller, Gerald R
2016-10-21
Anomalous diffusion is characterized by its asymptotic behavior for t → ∞. This makes it difficult to detect and describe in particle trajectories from experiments or computer simulations, which are necessarily of finite length. We propose a new approach using Bayesian inference applied directly to the observed trajectories sampled at different time scales. We illustrate the performance of this approach using random trajectories with known statistical properties and then use it for analyzing the motion of lipid molecules in the plane of a lipid bilayer.
Bayesian inference of the initial conditions from large-scale structure surveys
Leclercq, Florent
2016-10-01
Analysis of three-dimensional cosmological surveys has the potential to answer outstanding questions on the initial conditions from which structure appeared, and therefore on the very high energy physics at play in the early Universe. We report on recently proposed statistical data analysis methods designed to study the primordial large-scale structure via physical inference of the initial conditions in a fully Bayesian framework, and applications to the Sloan Digital Sky Survey data release 7. We illustrate how this approach led to a detailed characterization of the dynamic cosmic web underlying the observed galaxy distribution, based on the tidal environment.
Hinsen, Konrad; Kneller, Gerald R.
2016-10-01
Anomalous diffusion is characterized by its asymptotic behavior for t → ∞. This makes it difficult to detect and describe in particle trajectories from experiments or computer simulations, which are necessarily of finite length. We propose a new approach using Bayesian inference applied directly to the observed trajectories sampled at different time scales. We illustrate the performance of this approach using random trajectories with known statistical properties and then use it for analyzing the motion of lipid molecules in the plane of a lipid bilayer.
Energy Technology Data Exchange (ETDEWEB)
Nicoulaud-Gouin, V.; Giacalone, M.; Gonze, M.A. [Institut de Radioprotection et de Surete Nucleaire-PRP-ENV/SERIS/LM2E (France); Martin-Garin, A.; Garcia-Sanchez, L. [IRSN-PRP-ENV/SERIS/L2BT (France)
2014-07-01
Calibration of transfer models against observation data is a challenge, especially if parameter uncertainty is required and if a choice must be made between competing models. Generally two main calibration methods are used: the frequentist approach, in which the unknown parameter of interest is supposed fixed and its estimation is based on the data only (in this category, the least-squares method has many restrictions for nonlinear models, and competing models need to be nested in order to be compared); and Bayesian inference, in which the unknown parameter of interest is supposed random and its estimation is based on the data and on prior information. Compared to the frequentist method, it provides probability density functions and therefore pointwise estimation with credible intervals. However, in practical cases, Bayesian inference is a complex problem of numerical integration, which explains its low use in operational modeling, including radioecology. This study aims to illustrate the interest and feasibility of the Bayesian approach in radioecology, particularly in the case of ordinary differential equation models with non-constant coefficients, which cover most radiological risk assessment models, notably those implemented in the Symbiose platform (Gonze et al, 2010). The Markov Chain Monte Carlo (MCMC) method (Metropolis et al., 1953) was used because the posterior expectations are intractable integrals. The invariant distribution of the parameters was obtained by the Metropolis-Hastings algorithm (Hastings, 1970). The GNU MCSim software (Bois and Maszle, 2011), a Bayesian hierarchical framework, was used to deal with nonlinear differential models. Two case studies including this type of model were investigated: an equilibrium-kinetic sorption model (EK) (e.g. van Genuchten et al, 1974), with experimental data concerning 137Cs and 85Sr sorption and desorption in different soils studied in stirred flow-through reactors. This model, generalizing the Kd approach
Efficient Bayesian estimation of Markov model transition matrices with given stationary distribution
Trendelkamp-Schroer, Benjamin
2013-01-01
Direct simulation of biomolecular dynamics in thermal equilibrium is challenging due to the metastable nature of conformation dynamics and the computational cost of molecular dynamics. Biased or enhanced sampling methods may significantly improve the convergence of expectation values of equilibrium probabilities and of other stationary quantities. Unfortunately, the convergence of dynamic observables such as correlation functions or timescales of conformational transitions relies on direct equilibrium simulations. Markov state models are well suited to describe both stationary properties and properties of slow dynamical processes of a molecular system, in terms of a transition matrix for a jump process on a suitable discretization of continuous conformation space. Here, we introduce statistical estimation methods that allow a priori knowledge of equilibrium probabilities to be incorporated into the estimation of dynamical observables. Both maximum likelihood methods and an improved Monte Carlo...
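The unconstrained version of this estimation problem has a closed-form conjugate solution worth seeing first: with independent Dirichlet priors on the rows of the transition matrix, the posterior of each row given observed transition counts is again Dirichlet. The count matrix below is hypothetical, and note that enforcing a *given* stationary distribution, the paper's actual contribution, requires sampling over a constrained set instead of this simple row-wise update:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical transition counts C[i, j] of a 3-state chain
C = np.array([[90,  8,  2],
              [10, 70, 20],
              [ 5, 25, 70]])

# Row-wise conjugate update: row i ~ Dirichlet(alpha + C[i]) a posteriori.
alpha = np.ones(3)                     # uniform Dirichlet prior per row
samples = np.stack([rng.dirichlet(alpha + C[i], size=1000) for i in range(3)],
                   axis=1)             # samples[k] = k-th sampled matrix

T_mean = samples.mean(axis=0)          # posterior-mean transition matrix
print(T_mean)                          # rows sum to 1 by construction
```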
Bayesian inference of the sites of perturbations in metabolic pathways via Markov chain Monte Carlo
Jayawardhana, Bayu; Kell, Douglas B.; Rattray, Magnus
2008-01-01
Motivation: Genetic modifications or pharmaceutical interventions can influence multiple sites in metabolic pathways, and often these are ‘distant’ from the primary effect. In this regard, the ability to identify target and off-target effects of a specific compound or gene therapy is both a major ch
Statistical Inference for Partially Observed Diffusion Processes
DEFF Research Database (Denmark)
Jensen, Anders Christian
-dimensional Ornstein-Uhlenbeck where one coordinate is completely unobserved. This model does not have the Markov property, which makes parameter inference more complicated. Next we take a Bayesian approach and introduce some basic Markov chain Monte Carlo methods. In chapters five and six we describe a Bayesian method...... to perform parameter inference in multivariate diffusion models that may be only partially observed. The methodology is applied to the stochastic FitzHugh-Nagumo model and the two-dimensional Ornstein-Uhlenbeck process. Chapter seven focuses on parameter identifiability in the partially observed Ornstein...
Bayesian inference of local geomagnetic secular variation curves: application to archaeomagnetism
Lanos, Philippe
2014-05-01
The errors that occur at different stages of the archaeomagnetic calibration process are combined using Bayesian hierarchical modelling. The archaeomagnetic data obtained from archaeological structures such as hearths, kilns or sets of bricks and tiles exhibit considerable experimental errors and are generally more or less well dated by archaeological context, history or chronometric methods (14C, TL, dendrochronology, etc.). They can also be associated with stratigraphic observations which provide prior relative chronological information. The modelling we propose allows all these observations and errors to be linked together thanks to appropriate prior probability densities. The model also includes penalized cubic splines for estimating the univariate, spherical or three-dimensional curves for the secular variation of the geomagnetic field (inclination, declination, intensity) over time at a local place. The mean smooth curve we obtain, with its posterior Bayesian envelope, adapts to the effects of variability in the density of reference points over time. Moreover, the hierarchical modelling provides an efficient way to penalize outliers automatically. With this new posterior estimate of the curve, the Bayesian statistical framework then allows the calendar dates of undated archaeological features (such as kilns) to be estimated from one, two or three geomagnetic parameters (inclination, declination and/or intensity). Date estimates are presented in the same way as those that arise from radiocarbon dating. In order to illustrate the model and the inference method used, we present results based on recently published French, Bulgarian and Austrian datasets.
Recognizing recurrent neural networks (rRNN): Bayesian inference for recurrent neural networks.
Bitzer, Sebastian; Kiebel, Stefan J
2012-07-01
Recurrent neural networks (RNNs) are widely used in computational neuroscience and machine learning applications. In an RNN, each neuron computes its output as a nonlinear function of its integrated input. While the importance of RNNs, especially as models of brain processing, is undisputed, it is also widely acknowledged that the computations in standard RNN models may be an over-simplification of what real neuronal networks compute. Here, we suggest that the RNN approach may be made computationally more powerful by its fusion with Bayesian inference techniques for nonlinear dynamical systems. In this scheme, we use an RNN as a generative model of dynamic input caused by the environment, e.g. of speech or kinematics. Given this generative RNN model, we derive Bayesian update equations that can decode its output. Critically, these updates define a 'recognizing RNN' (rRNN), in which neurons compute and exchange prediction and prediction error messages. The rRNN has several desirable features that a conventional RNN does not have, e.g. fast decoding of dynamic stimuli and robustness to initial conditions and noise. Furthermore, it implements a predictive coding scheme for dynamic inputs. We suggest that the Bayesian inversion of RNNs may be useful both as a model of brain function and as a machine learning tool. We illustrate the use of the rRNN by an application to the online decoding (i.e. recognition) of human kinematics.
A simple introduction to Markov Chain Monte-Carlo sampling
van Ravenzwaaij, Don; Cassey, Pete; Brown, Scott D.
2016-01-01
Markov Chain Monte–Carlo (MCMC) is an increasingly popular method for obtaining information about distributions, especially for estimating posterior distributions in Bayesian inference. This article provides a very basic introduction to MCMC sampling. It describes what MCMC is, and what it can be us
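A minimal random-walk Metropolis sampler of the kind such introductions build up to, here targeting a standard normal so the result is easy to check (target, proposal scale, and chain length are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(3)

def log_post(theta):
    # Unnormalized log posterior: standard normal target for illustration
    return -0.5 * theta**2

theta, chain = 0.0, []
for _ in range(20000):
    prop = theta + rng.normal(0.0, 1.0)    # symmetric random-walk proposal
    # Accept with probability min(1, posterior ratio); work in log space
    if np.log(rng.random()) < log_post(prop) - log_post(theta):
        theta = prop                       # accept the proposal
    chain.append(theta)                    # record the state either way

chain = np.array(chain[2000:])             # discard burn-in
print(chain.mean(), chain.std())           # should approach 0 and 1
```

Recording the current state on rejection, not just on acceptance, is the detail newcomers most often get wrong; it is what makes the chain's stationary distribution match the target.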
Multi-model polynomial chaos surrogate dictionary for Bayesian inference in elasticity problems
Contreras, Andres A.
2016-09-19
A method is presented for inferring the presence of an inclusion inside a domain; the proposed approach is suitable to be used in a diagnostic device with low computational power. Specifically, we use the Bayesian framework for the inference of stiff inclusions embedded in a soft matrix, mimicking tumors in soft tissues. We rely on a polynomial chaos (PC) surrogate to accelerate the inference process. The PC surrogate predicts the dependence of the displacement field on the random elastic moduli of the materials, and is computed by means of the stochastic Galerkin (SG) projection method. Moreover, the inclusion's geometry is assumed to be unknown, and this is addressed by using a dictionary consisting of several geometrical models with different configurations. A model selection approach based on the evidence provided by the data (Bayes factors) is used to discriminate among the different geometrical models and select the most suitable one. The idea of using a dictionary of pre-computed geometrical models helps to keep the computational cost of the inference process very low, as most of the computational burden is carried out off-line in the resolution of the SG problems. Numerical tests are used to validate the methodology, assess its performance, and analyze the robustness to model errors. © 2016 Elsevier Ltd
Sraj, Ihab
2015-10-22
This paper addresses model dimensionality reduction for Bayesian inference based on prior Gaussian fields with uncertainty in the covariance function hyper-parameters. The dimensionality reduction is traditionally achieved using the Karhunen-Loève expansion of a prior Gaussian process assuming covariance function with fixed hyper-parameters, despite the fact that these are uncertain in nature. The posterior distribution of the Karhunen-Loève coordinates is then inferred using available observations. The resulting inferred field is therefore dependent on the assumed hyper-parameters. Here, we seek to efficiently estimate both the field and covariance hyper-parameters using Bayesian inference. To this end, a generalized Karhunen-Loève expansion is derived using a coordinate transformation to account for the dependence with respect to the covariance hyper-parameters. Polynomial Chaos expansions are employed for the acceleration of the Bayesian inference using similar coordinate transformations, enabling us to avoid expanding explicitly the solution dependence on the uncertain hyper-parameters. We demonstrate the feasibility of the proposed method on a transient diffusion equation by inferring spatially-varying log-diffusivity fields from noisy data. The inferred profiles were found closer to the true profiles when including the hyper-parameters’ uncertainty in the inference formulation.
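The Karhunen-Loève dimensionality reduction that this abstract builds on can be illustrated with a discrete sketch: eigendecompose the covariance matrix on a grid and keep the dominant modes. This is a generic illustration with an assumed squared-exponential covariance, grid, and correlation length; it does not reproduce the paper's hyper-parameter coordinate transformation:

```python
import numpy as np

def kl_modes(x, corr_len, var=1.0, n_modes=5):
    """Leading discrete Karhunen-Loeve modes of a squared-exponential covariance."""
    C = var * np.exp(-0.5 * ((x[:, None] - x[None, :]) / corr_len) ** 2)
    vals, vecs = np.linalg.eigh(C)           # eigh returns ascending eigenvalues
    idx = np.argsort(vals)[::-1][:n_modes]   # keep the largest n_modes
    return vals[idx], vecs[:, idx]

x = np.linspace(0.0, 1.0, 200)
lam, phi = kl_modes(x, corr_len=0.2)

# Sample a prior field: sum_k sqrt(lam_k) * xi_k * phi_k, with xi_k ~ N(0, 1)
rng = np.random.default_rng(0)
field = phi @ (np.sqrt(lam) * rng.standard_normal(len(lam)))
```

In the paper's setting, `corr_len` would itself be an uncertain hyper-parameter, which is exactly why the generalized, transformation-based expansion is needed instead of this fixed-hyper-parameter version.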
Hierarchical Bayesian modeling and Markov chain Monte Carlo sampling for tuning-curve analysis.
Cronin, Beau; Stevenson, Ian H; Sur, Mriganka; Körding, Konrad P
2010-01-01
A central theme of systems neuroscience is to characterize the tuning of neural responses to sensory stimuli or the production of movement. Statistically, we often want to estimate the parameters of the tuning curve, such as preferred direction, as well as the associated degree of uncertainty, characterized by error bars. Here we present a new sampling-based, Bayesian method that allows the estimation of tuning-curve parameters, the estimation of error bars, and hypothesis testing. This method also provides a useful way of visualizing which tuning curves are compatible with the recorded data. We demonstrate the utility of this approach using recordings of orientation and direction tuning in primary visual cortex, direction of motion tuning in primary motor cortex, and simulated data.
Directory of Open Access Journals (Sweden)
William A Griffin
Full Text Available Sequential affect dynamics generated during the interaction of intimate dyads, such as married couples, are associated with a cascade of effects, some good and some bad, on each partner, close family members, and other social contacts. Although the effects are well documented, the probabilistic structures associated with micro-social processes connected to the varied outcomes remain enigmatic. Using extant data we developed a method of classifying and subsequently generating couple dynamics using a Hierarchical Dirichlet Process Hidden semi-Markov Model (HDP-HSMM). Our findings indicate that several key aspects of existing models of marital interaction are inadequate: affect state emissions and their durations, along with the expected variability differences between distressed and nondistressed couples, are present but highly nuanced; and most surprisingly, heterogeneity among highly satisfied couples necessitates that they be divided into subgroups. We review how this unsupervised learning technique generates plausible dyadic sequences that are sensitive to relationship quality and provides a natural mechanism for computational models of behavioral and affective micro-social processes.
Directory of Open Access Journals (Sweden)
R.J. Boys
2002-01-01
Full Text Available This paper describes a Bayesian approach to determining the order of a finite state Markov chain whose transition probabilities are themselves governed by a homogeneous finite state Markov chain. It extends previous work on homogeneous Markov chains to more general and applicable hidden Markov models. The method we describe uses a Markov chain Monte Carlo algorithm to obtain samples from the posterior distribution for both the order of Markov dependence in the observed sequence and the other governing model parameters. These samples allow coherent inferences to be made straightforwardly, in contrast to those which use information criteria. The methods are illustrated by their application to both simulated and real data sets.
Bickel, David R
2011-01-01
In statistical practice, whether a Bayesian or frequentist approach is used in inference depends not only on the availability of prior information but also on the attitude taken toward partial prior information, with frequentists tending to be more cautious than Bayesians. The proposed framework defines that attitude in terms of a specified amount of caution, thereby enabling data analysis at the level of caution desired and on the basis of any prior information. The caution parameter represents the attitude toward partial prior information in much the same way as a loss function represents the attitude toward risk. When there is very little prior information and nonzero caution, the resulting inferences correspond to those of the candidate confidence intervals and p-values that are most similar to the credible intervals and hypothesis probabilities of the specified Bayesian posterior. On the other hand, in the presence of a known physical distribution of the parameter, inferences are based only on the corres...
Understanding the Scalability of Bayesian Network Inference Using Clique Tree Growth Curves
Mengshoel, Ole J.
2010-01-01
One of the main approaches to performing computation in Bayesian networks (BNs) is clique tree clustering and propagation. The clique tree approach consists of propagation in a clique tree compiled from a Bayesian network, and while it was introduced in the 1980s, there is still a lack of understanding of how clique tree computation time depends on variations in BN size and structure. In this article, we improve this understanding by developing an approach to characterizing clique tree growth as a function of parameters that can be computed in polynomial time from BNs, specifically: (i) the ratio of the number of a BN's non-root nodes to the number of root nodes, and (ii) the expected number of moral edges in their moral graphs. Analytically, we partition the set of cliques in a clique tree into different sets, and introduce a growth curve for the total size of each set. For the special case of bipartite BNs, there are two sets and two growth curves, a mixed clique growth curve and a root clique growth curve. In experiments, where random bipartite BNs generated using the BPART algorithm are studied, we systematically increase the out-degree of the root nodes in bipartite Bayesian networks, by increasing the number of leaf nodes. Surprisingly, root clique growth is well-approximated by Gompertz growth curves, an S-shaped family of curves that has previously been used to describe growth processes in biology, medicine, and neuroscience. We believe that this research improves the understanding of the scaling behavior of clique tree clustering for a certain class of Bayesian networks; presents an aid for trade-off studies of clique tree clustering using growth curves; and ultimately provides a foundation for benchmarking and developing improved BN inference and machine learning algorithms.
Bayesian state space models for inferring and predicting temporal gene expression profiles.
Liang, Yulan; Kelemen, Arpad
2007-12-01
Prediction of gene dynamic behavior is a challenging and important problem in genomic research while estimating the temporal correlations and non-stationarity are the keys in this process. Unfortunately, most existing techniques used for the inclusion of the temporal correlations treat the time course as evenly distributed time intervals and use stationary models with time-invariant settings. This is an assumption that is often violated in microarray time course data since the time course expression data are at unequal time points, where the difference in sampling times varies from minutes to days. Furthermore, the unevenly spaced short time courses with sudden changes make the prediction of genetic dynamics difficult. In this paper, we develop two types of Bayesian state space models to tackle this challenge for inferring and predicting the gene expression profiles associated with diseases. In the univariate time-varying Bayesian state space models we treat both the stochastic transition matrix and the observation matrix time-variant with linear setting and point out that this can easily be extended to nonlinear setting. In the multivariate Bayesian state space model we include temporal correlation structures in the covariance matrix estimations. In both models, the unevenly spaced short time courses with unseen time points are treated as hidden state variables. Bayesian approaches with various prior and hyper-prior models with MCMC algorithms are used to estimate the model parameters and hidden variables. We apply our models to multiple tissue polygenetic affymetrix data sets. Results show that the predictions of the genomic dynamic behavior can be well captured by the proposed models. (c) 2007 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim
Mengersen, Kerrie
2017-01-01
Objectives In recent years, large-scale longitudinal neuroimaging studies have improved our understanding of healthy ageing and pathologies including Alzheimer's disease (AD). A particular focus of these studies is group differences and identification of participants at risk of deteriorating to a worse diagnosis. For this, statistical analysis using linear mixed-effects (LME) models are used to account for correlated observations from individuals measured over time. A Bayesian framework for LME models in AD is introduced in this paper to provide additional insight often not found in current LME volumetric analyses. Setting and participants Longitudinal neuroimaging case study of ageing was analysed in this research on 260 participants diagnosed as either healthy controls (HC), mild cognitive impaired (MCI) or AD. Bayesian LME models for the ventricle and hippocampus regions were used to: (1) estimate how the volumes of these regions change over time by diagnosis, (2) identify high-risk non-AD individuals with AD like degeneration and (3) determine probabilistic trajectories of diagnosis groups over age. Results We observed (1) large differences in the average rate of change of volume for the ventricle and hippocampus regions between diagnosis groups, (2) high-risk individuals who had progressed from HC to MCI and displayed similar rates of deterioration as AD counterparts, and (3) critical time points which indicate where deterioration of regions begins to diverge between the diagnosis groups. Conclusions To the best of our knowledge, this is the first application of Bayesian LME models to neuroimaging data which provides inference on a population and individual level in the AD field. The application of a Bayesian LME framework allows for additional information to be extracted from longitudinal studies. This provides health professionals with valuable information of neurodegeneration stages, and a potential to provide a better understanding of disease pathology
A Bayesian MCMC method for point process models with intractable normalising constants
DEFF Research Database (Denmark)
Berthelsen, Kasper Klitgaard; Møller, Jesper
2004-01-01
to simulate from the "unknown distribution", perfect simulation algorithms become useful. We illustrate the method in cases where the likelihood is given by a Markov point process model. Particularly, we consider semi-parametric Bayesian inference in connection to both inhomogeneous Markov point process models...
Le Maitre, Olivier
2015-01-07
We address model dimensionality reduction in the Bayesian inference of Gaussian fields, considering a prior covariance function with unknown hyper-parameters. The Karhunen-Loeve (KL) expansion of a prior Gaussian process is traditionally derived assuming a fixed covariance function with pre-assigned hyper-parameter values. Thus, the mode strengths of the Karhunen-Loeve expansion inferred using available observations, as well as the resulting inferred process, depend on the pre-assigned values for the covariance hyper-parameters. Here, we seek to infer the process and its covariance hyper-parameters in a single Bayesian inference. To this end, the uncertainty in the hyper-parameters is treated by means of a coordinate transformation, leading to a KL-type expansion on a fixed reference basis of spatial modes, but with random coordinates conditioned on the hyper-parameters. A Polynomial Chaos (PC) expansion of the model prediction is also introduced to accelerate the Bayesian inference and the sampling of the posterior distribution with an MCMC method. The PC expansion of the model prediction also relies on a coordinate transformation, enabling us to avoid expanding the dependence of the prediction with respect to the covariance hyper-parameters. We demonstrate the efficiency of the proposed method on a transient diffusion equation by inferring spatially-varying log-diffusivity fields from noisy data.
Observation uncertainty in reversible Markov chains.
Metzner, Philipp; Weber, Marcus; Schütte, Christof
2010-09-01
In many applications one is interested in finding a simplified model which captures the essential dynamical behavior of a real life process. If the essential dynamics can be assumed to be (approximately) memoryless then a reasonable choice for a model is a Markov model whose parameters are estimated by means of Bayesian inference from an observed time series. We propose an efficient Markov chain Monte Carlo framework to assess the uncertainty of the Markov model and related observables. The derived Gibbs sampler allows for sampling distributions of transition matrices subject to reversibility and/or sparsity constraints. The performance of the suggested sampling scheme is demonstrated and discussed for a variety of model examples. The uncertainty analysis of functions of the Markov model under investigation is discussed in application to the identification of conformations of the trialanine molecule via Robust Perron Cluster Analysis (PCCA+).
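The unconstrained core of this idea, which also underlies the wind-power transition-matrix paper above, is that with an independent Dirichlet conjugate prior on each row, the posterior over a transition matrix is again row-wise Dirichlet with the observed transition counts added. The sketch below shows only that unconstrained case; the reversibility and sparsity constraints handled by the paper's Gibbs sampler are omitted, and the toy series and prior strength are illustrative:

```python
import numpy as np

def transition_counts(states, n_states):
    """Count observed transitions i -> j in a discrete time series."""
    C = np.zeros((n_states, n_states))
    for i, j in zip(states[:-1], states[1:]):
        C[i, j] += 1
    return C

def sample_transition_matrices(counts, n_draws, alpha=1.0, seed=0):
    """Posterior draws of row-stochastic matrices: row_i ~ Dirichlet(alpha + counts_i)."""
    rng = np.random.default_rng(seed)
    return np.array([
        [rng.dirichlet(alpha + row) for row in counts]
        for _ in range(n_draws)
    ])

series = [0, 1, 1, 2, 0, 0, 1, 2, 2, 1, 0, 1]   # toy 3-state observation sequence
C = transition_counts(series, 3)
draws = sample_transition_matrices(C, 1000)
P_mean = draws.mean(axis=0)   # posterior-mean transition matrix
```

The spread of `draws` around `P_mean` is exactly the kind of transition-matrix uncertainty that both papers propagate to derived observables.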
Directory of Open Access Journals (Sweden)
Márcio das Chagas Moura
2008-08-01
Full Text Available In this work, a model is proposed for assessing the availability of fault tolerant systems, based on the integration of continuous time semi-Markov processes and Bayesian belief networks. This integration results in a hybrid stochastic model that is able to represent the dynamic characteristics of a system as well as to deal with cause-effect relationships among external factors such as environmental and operational conditions. The hybrid model also allows for uncertainty propagation on the system availability. A numerical procedure is also proposed for the solution of the state probability equations of semi-Markov processes described in terms of transition rates. The numerical procedure is based on the application of Laplace transforms that are inverted by the Gauss quadrature method known as Gauss Legendre. The hybrid model and numerical procedure are illustrated by means of an example of application in the context of fault tolerant systems.
Albert, Carlo; Ulzega, Simone; Stoop, Ruedi
2016-04-01
Parameter inference is a fundamental problem in data-driven modeling. Given observed data that is believed to be a realization of some parameterized model, the aim is to find parameter values that are able to explain the observed data. In many situations, the dominant sources of uncertainty must be included into the model for making reliable predictions. This naturally leads to stochastic models. Stochastic models render parameter inference much harder, as the aim then is to find a distribution of likely parameter values. In Bayesian statistics, which is a consistent framework for data-driven learning, this so-called posterior distribution can be used to make probabilistic predictions. We propose a novel, exact, and very efficient approach for generating posterior parameter distributions for stochastic differential equation models calibrated to measured time series. The algorithm is inspired by reinterpreting the posterior distribution as a statistical mechanics partition function of an object akin to a polymer, where the measurements are mapped on heavier beads compared to those of the simulated data. To arrive at distribution samples, we employ a Hamiltonian Monte Carlo approach combined with a multiple time-scale integration. A separation of time scales naturally arises if either the number of measurement points or the number of simulation points becomes large. Furthermore, at least for one-dimensional problems, we can decouple the harmonic modes between measurement points and solve the fastest part of their dynamics analytically. Our approach is applicable to a wide range of inference problems and is highly parallelizable.
Alsing, Justin; Jaffe, Andrew H
2016-01-01
We apply two Bayesian hierarchical inference schemes to infer shear power spectra, shear maps and cosmological parameters from the CFHTLenS weak lensing survey - the first application of this method to data. In the first approach, we sample the joint posterior distribution of the shear maps and power spectra by Gibbs sampling, with minimal model assumptions. In the second approach, we sample the joint posterior of the shear maps and cosmological parameters, providing a new, accurate and principled approach to cosmological parameter inference from cosmic shear data. As a first demonstration on data we perform a 2-bin tomographic analysis to constrain cosmological parameters and investigate the possibility of photometric redshift bias in the CFHTLenS data. Under the baseline $\Lambda$CDM model we constrain $S_8 = \sigma_8(\Omega_\mathrm{m}/0.3)^{0.5} = 0.67^{+0.03}_{-0.03}$ (68%), consistent with previous CFHTLenS analysis but in tension with Planck. Adding neutrino m...
Bayesian inference for joint modelling of longitudinal continuous, binary and ordinal events.
Li, Qiuju; Pan, Jianxin; Belcher, John
2016-12-01
In medical studies, repeated measurements of continuous, binary and ordinal outcomes are routinely collected from the same patient. Instead of modelling each outcome separately, in this study we propose to jointly model the trivariate longitudinal responses, so as to take account of the inherent association between the different outcomes and thus improve statistical inferences. This work is motivated by a large cohort study in the North West of England, involving trivariate responses from each patient: Body Mass Index, Depression (Yes/No) ascertained with cut-off score not less than 8 at the Hospital Anxiety and Depression Scale, and Pain Interference generated from the Medical Outcomes Study 36-item short-form health survey with values returned on an ordinal scale 1-5. There are some well-established methods for combined continuous and binary, or even continuous and ordinal responses, but little work was done on the joint analysis of continuous, binary and ordinal responses. We propose conditional joint random-effects models, which take into account the inherent association between the continuous, binary and ordinal outcomes. Bayesian analysis methods are used to make statistical inferences. Simulation studies show that, by jointly modelling the trivariate outcomes, standard deviations of the estimates of parameters in the models are smaller and much more stable, leading to more efficient parameter estimates and reliable statistical inferences. In the real data analysis, the proposed joint analysis yields a much smaller deviance information criterion value than the separate analysis, and shows other good statistical properties too.
Fajardo, Alvaro; Soñora, Martín; Moreno, Pilar; Moratorio, Gonzalo; Cristina, Juan
2016-10-01
Zika virus (ZIKV) is a member of the family Flaviviridae. In 2015, ZIKV triggered an epidemic in Brazil and spread across Latin America. By May of 2016, the World Health Organization had warned of the spread of ZIKV beyond this region. Detailed studies on the mode of evolution of ZIKV strains are extremely important for our understanding of the emergence and spread of ZIKV populations. In order to gain insight into these matters, a Bayesian coalescent Markov chain Monte Carlo analysis of complete genome sequences of recently isolated ZIKV strains was performed. The results of these studies revealed a mean rate of evolution of 1.20 × 10^-3 nucleotide substitutions per site per year (s/s/y) for the ZIKV strains enrolled in this study. Several variants isolated in China are grouped together with all strains isolated in Latin America. Another genetic group, composed exclusively of Chinese strains, was also observed, suggesting the co-circulation of different genetic lineages in China. These findings indicate a high level of diversification of ZIKV populations. Strains isolated from microcephaly cases do not share amino acid substitutions, suggesting that factors other than viral genetic differences may play a role in the proposed pathogenesis caused by ZIKV infection. J. Med. Virol. 88:1672-1676, 2016. © 2016 Wiley Periodicals, Inc.
Statistical Inference in Hidden Markov Models Using k-Segment Constraints.
Titsias, Michalis K; Holmes, Christopher C; Yau, Christopher
2016-01-02
Hidden Markov models (HMMs) are one of the most widely used statistical methods for analyzing sequence data. However, the reporting of output from HMMs has largely been restricted to the presentation of the most-probable (MAP) hidden state sequence, found via the Viterbi algorithm, or the sequence of most probable marginals using the forward-backward algorithm. In this article, we expand the amount of information we could obtain from the posterior distribution of an HMM by introducing linear-time dynamic programming recursions that, conditional on a user-specified constraint in the number of segments, allow us to (i) find MAP sequences, (ii) compute posterior probabilities, and (iii) simulate sample paths. We collectively call these recursions k-segment algorithms and illustrate their utility using simulated and real examples. We also highlight the prospective and retrospective use of k-segment constraints for fitting HMMs or exploring existing model fits. Supplementary materials for this article are available online.
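For reference, the standard Viterbi recursion that produces the MAP hidden state sequence mentioned in this abstract can be written compactly in log space. This is a generic textbook implementation with a toy two-state HMM, not the k-segment variant the article introduces:

```python
import numpy as np

def viterbi(log_pi, log_A, log_B, obs):
    """MAP hidden state path of an HMM, all inputs in log space.
    log_pi: (K,) initial probs, log_A: (K, K) transitions, log_B: (K, M) emissions."""
    K, T = log_A.shape[0], len(obs)
    delta = log_pi + log_B[:, obs[0]]          # best log-prob of paths ending in each state
    back = np.zeros((T, K), dtype=int)         # backpointers (row 0 unused)
    for t in range(1, T):
        scores = delta[:, None] + log_A        # scores[i, j]: best path into j via i
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + log_B[:, obs[t]]
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):              # follow backpointers from the end
        path.append(int(back[t][path[-1]]))
    return path[::-1]

# Toy HMM: state k strongly prefers emitting symbol k and staying put
log_pi = np.log([0.5, 0.5])
log_A = np.log([[0.9, 0.1], [0.1, 0.9]])
log_B = np.log([[0.9, 0.1], [0.1, 0.9]])
path = viterbi(log_pi, log_A, log_B, [0, 0, 0, 1, 1, 1])
# path: [0, 0, 0, 1, 1, 1]
```

The article's k-segment algorithms extend exactly this kind of dynamic-programming recursion with a constraint on the number of segments in the decoded path.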
Travel cost inference from sparse, spatio-temporally correlated time series using markov models
DEFF Research Database (Denmark)
Yang, B.; Guo, C.; Jensen, C.S.
2013-01-01
The monitoring of a system can yield a set of measurements that can be modeled as a collection of time series. These time series are often sparse, due to missing measurements, and spatio-temporally correlated, meaning that spatially close time series exhibit temporal correlation. The analysis of such time series offers insight into the underlying system and enables prediction of system behavior. While the techniques presented in the paper apply more generally, we consider the case of transportation systems and aim to predict travel cost from GPS tracking data from probe vehicles. Specifically, each road segment has an associated travel-cost time series, which is derived from GPS data. We use spatio-temporal hidden Markov models (STHMM) to model correlations among different traffic time series. We provide algorithms that are able to learn the parameters of an STHMM while contending...
Energy Technology Data Exchange (ETDEWEB)
KENNETH M. HANSON; JANE M. BOOKER
2000-09-08
The authors present an uncertainty analysis of data taken using the Rossi technique, in which the horizontal oscilloscope sweep is driven sinusoidally in time, while the vertical axis follows the signal amplitude. The analysis is done within a Bayesian framework. Complete inferences are obtained using the Markov chain Monte Carlo technique, which produces random samples from the posterior probability distribution expressed in terms of the parameters.
Improving PWR core simulations by Monte Carlo uncertainty analysis and Bayesian inference
Castro, Emilio; Buss, Oliver; Garcia-Herranz, Nuria; Hoefer, Axel; Porsch, Dieter
2016-01-01
A Monte Carlo-based Bayesian inference model is applied to the prediction of reactor operation parameters of a PWR nuclear power plant. In this non-perturbative framework, high-dimensional covariance information describing the uncertainty of microscopic nuclear data is combined with measured reactor operation data in order to provide statistically sound, well founded uncertainty estimates of integral parameters, such as the boron letdown curve and the burnup-dependent reactor power distribution. The performance of this methodology is assessed in a blind test approach, where we use measurements of a given reactor cycle to improve the prediction of the subsequent cycle. As it turns out, the resulting improvement of the prediction quality is impressive. In particular, the prediction uncertainty of the boron letdown curve, which is of utmost importance for the planning of the reactor cycle length, can be reduced by one order of magnitude by including the boron concentration measurement information of the previous...
Mocapy++ - A toolkit for inference and learning in dynamic Bayesian networks
Directory of Open Access Journals (Sweden)
Hamelryck Thomas
2010-03-01
Full Text Available Abstract Background Mocapy++ is a toolkit for parameter learning and inference in dynamic Bayesian networks (DBNs). It supports a wide range of DBN architectures and probability distributions, including distributions from directional statistics (the statistics of angles, directions and orientations). Results The program package is freely available under the GNU General Public Licence (GPL) from SourceForge http://sourceforge.net/projects/mocapy. The package contains the source for building the Mocapy++ library, several usage examples and the user manual. Conclusions Mocapy++ is especially suitable for constructing probabilistic models of biomolecular structure, due to its support for directional statistics. In particular, it supports the Kent distribution on the sphere and the bivariate von Mises distribution on the torus. These distributions have proven useful to formulate probabilistic models of protein and RNA structure in atomic detail.
cosmoabc: Likelihood-free inference via Population Monte Carlo Approximate Bayesian Computation
Ishida, E E O; Penna-Lima, M; Cisewski, J; de Souza, R S; Trindade, A M M; Cameron, E
2015-01-01
Approximate Bayesian Computation (ABC) enables parameter inference for complex physical systems in cases where the true likelihood function is unknown, unavailable, or computationally too expensive. It relies on the forward simulation of mock data and comparison between observed and synthetic catalogues. Here we present cosmoabc, a Python ABC sampler featuring a Population Monte Carlo (PMC) variation of the original ABC algorithm, which uses an adaptive importance sampling scheme. The code is very flexible and can be easily coupled to an external simulator, while allowing the user to incorporate arbitrary distance and prior functions. As an example of practical application, we coupled cosmoabc with the numcosmo library and demonstrate how it can be used to estimate posterior probability distributions over cosmological parameters based on measurements of galaxy cluster number counts without computing the likelihood function. cosmoabc is published under the GPLv3 license on PyPI and GitHub and documentation is availabl...
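The basic ABC idea, before the PMC refinement that cosmoabc implements, is plain rejection sampling: draw a parameter from the prior, simulate, and keep the draw only if the simulated summary is close enough to the observed one. The sketch below uses a toy Gaussian-mean problem; all function names and tolerances are illustrative and unrelated to cosmoabc's actual API:

```python
import random

def abc_rejection(simulate, observed_stat, prior_draw, distance, eps, n_accept, seed=0):
    """Basic rejection ABC: keep parameters whose simulated summary lies within eps."""
    rng = random.Random(seed)
    accepted = []
    while len(accepted) < n_accept:
        theta = prior_draw(rng)                  # draw parameter from the prior
        stat = simulate(theta, rng)              # forward-simulate a summary statistic
        if distance(stat, observed_stat) < eps:  # accept if close to the observation
            accepted.append(theta)
    return accepted

# Toy problem: infer the mean mu of a Gaussian from an observed sample mean of 2.0
def simulate(mu, rng):
    return sum(rng.gauss(mu, 1.0) for _ in range(50)) / 50

post = abc_rejection(
    simulate, observed_stat=2.0,
    prior_draw=lambda rng: rng.uniform(-5, 5),
    distance=lambda a, b: abs(a - b),
    eps=0.1, n_accept=500,
)
post_mean = sum(post) / len(post)   # should land near 2.0
```

Rejection ABC wastes most draws when the prior is broad, which is precisely the inefficiency the PMC importance-sampling variation is designed to reduce.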
Wu, Dongfeng; Rosner, Gary L; Broemeling, Lyle
2005-12-01
This article extends previous probability models for periodic breast cancer screening examinations. The specific aim is to provide statistical inference for age dependence of sensitivity and the transition probability from the disease free to the preclinical state. The setting is a periodic screening program in which a cohort of initially asymptomatic women undergo a sequence of breast cancer screening exams. We use age as a covariate in the estimation of screening sensitivity and the transition probability simultaneously, both from a frequentist point of view and within a Bayesian framework. We apply our method to the Health Insurance Plan of Greater New York study of female breast cancer and give age-dependent sensitivity and transition probability density estimates. The inferential methodology we develop is also applicable when analyzing studies of modalities for early detection of other types of progressive chronic diseases.
Bayesian approaches to infer the physical properties of star-forming galaxies at cosmic dawn
Salmon, Brett Weston Killebrew
In this thesis, I seek to advance our understanding of galaxy formation and evolution in the early universe. Using the largest single project ever conducted by the Hubble Space Telescope (the Cosmic Assembly Near-infrared Deep Extragalactic Legacy Survey, CANDELS) I use deep and wide broadband photometric imaging to infer the physical properties of galaxies from z=8.5 to z=1.5. First, I will present a study that extends the relationship between the star-formation rates (SFRs) and stellar masses (M⋆) of galaxies to 3.5 [...] attenuated in galaxies. I calculate the Bayesian evidence for galaxies under different assumptions of their underlying dust-attenuation law. By modeling galaxy ultraviolet-to-near-IR broadband CANDELS data I produce Bayesian evidence towards the dust law in individual galaxies that is confirmed by their observed IR luminosities. Moreover, I find a tight correlation between the strength of attenuation in galaxies and their dust law, a relation reinforced by the results from radiative transfer simulations. Finally, I use the Bayesian methods developed in this thesis to study the number density of SFR in galaxies from z=8 to z=4, and resolve the current disconnect between its evolution and that of the stellar mass function. In doing so, I place the first constraints on the dust law of z>4 galaxies, finding it obeys a similar relation as found at z~2. I find a clear excess in number density at high SFRs. This new SFR function is in better agreement with the observed stellar mass functions, the few to-date infrared detections at high redshifts, and the connection to the observed distribution of lower redshift infrared sources. Together, these studies greatly improve our understanding of galaxy star-formation histories, the nature of their dust attenuation, and the distribution of SFR among some of the most distant galaxies in the universe.
Directory of Open Access Journals (Sweden)
Haseeb A. Khan
2008-01-01
Full Text Available This investigation was aimed at comparing the inference of antelope phylogenies resulting from the 16S rRNA, cytochrome-b (cyt-b) and d-loop segments of mitochondrial DNA using three different computational models including Bayesian (BA), maximum parsimony (MP) and unweighted pair group method with arithmetic mean (UPGMA). The respective nucleotide sequences of three Oryx species (Oryx leucoryx, Oryx dammah and Oryx gazella) and an out-group (Addax nasomaculatus) were aligned and subjected to BA, MP and UPGMA models for comparing the topologies of the respective phylogenetic trees. The 16S rRNA region possessed the highest frequency of conserved sequences (97.65%), followed by cyt-b (94.22%) and d-loop (87.29%). There were few transitions (2.35%) and no transversions in 16S rRNA as compared to cyt-b (5.61% transitions and 0.17% transversions) and d-loop (11.57% transitions and 1.14% transversions) while comparing the four taxa. All three mitochondrial segments clearly differentiated the genus Addax from Oryx using the BA or UPGMA models. The topologies of all the gamma-corrected Bayesian trees were identical irrespective of the marker type. The UPGMA trees resulting from 16S rRNA and d-loop sequences were also identical (Oryx dammah grouped with Oryx leucoryx) to the Bayesian trees, except that the UPGMA tree based on cyt-b showed a slightly different phylogeny (Oryx dammah grouped with Oryx gazella) with a low bootstrap support. However, the MP model failed to differentiate the genus Addax from Oryx. These findings demonstrate the efficiency and robustness of the BA and UPGMA methods for phylogenetic analysis of antelopes using mitochondrial markers.
Ursino, Mauro; Cuppini, Cristiano; Magosso, Elisa
2017-03-01
Recent theoretical and experimental studies suggest that in multisensory conditions, the brain performs a near-optimal Bayesian estimate of external events, giving more weight to the more reliable stimuli. However, the neural mechanisms responsible for this behavior, and its progressive maturation in a multisensory environment, are still insufficiently understood. The aim of this letter is to analyze this problem with a neural network model of audiovisual integration, based on probabilistic population coding, the idea that a population of neurons can encode probability functions to perform Bayesian inference. The model consists of two topologically organized chains of unisensory neurons (auditory and visual). They receive the corresponding input through a plastic receptive field and reciprocally exchange plastic cross-modal synapses, which encode the spatial co-occurrence of visual-auditory inputs. A third chain of multisensory neurons performs a simple sum of auditory and visual excitations. The work includes a theoretical part and a computer simulation study. We show how a simple rule for synapse learning (consisting of Hebbian reinforcement and a decay term) can be used during training to shrink the receptive fields and encode the unisensory likelihood functions. Hence, after training, each unisensory area realizes a maximum likelihood estimate of stimulus position (auditory or visual). In cross-modal conditions, the same learning rule can encode information on prior probability into the cross-modal synapses. Computer simulations confirm the theoretical results and show that the proposed network can realize a maximum likelihood estimate of auditory (or visual) positions in unimodal conditions and a Bayesian estimate, with moderate deviations from optimality, in cross-modal conditions. Furthermore, the model explains the ventriloquism illusion and, looking at the activity in the multimodal neurons, explains the automatic reweighting of auditory and visual inputs.
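The "near-optimal Bayesian estimate" that weights cues by their reliability reduces, for independent Gaussian noise, to inverse-variance weighting. A minimal sketch with hypothetical noise levels (this is the normative computation the network approximates, not the network model itself):

```python
import numpy as np

def fuse(x_a, var_a, x_v, var_v):
    """Reliability-weighted (inverse-variance) fusion of auditory and
    visual position estimates - the optimal combination for independent
    Gaussian noise."""
    w_a = 1.0 / var_a
    w_v = 1.0 / var_v
    x_hat = (w_a * x_a + w_v * x_v) / (w_a + w_v)
    var_hat = 1.0 / (w_a + w_v)
    return x_hat, var_hat

# Vision is usually more reliable in space, so the fused estimate sits
# closer to the visual cue - the basis of the ventriloquism effect.
x, v = fuse(x_a=10.0, var_a=4.0, x_v=0.0, var_v=1.0)
print(x, v)
```

With these numbers the fused position lands at 2.0 degrees, much nearer the visual cue, and the fused variance (0.8) is smaller than either cue's alone.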
MacNeilage, Paul R; Ganesan, Narayan; Angelaki, Dora E
2008-12-01
Spatial orientation is the sense of body orientation and self-motion relative to the stationary environment, fundamental to normal waking behavior and control of everyday motor actions including eye movements, postural control, and locomotion. The brain achieves spatial orientation by integrating visual, vestibular, and somatosensory signals. Over the past years, considerable progress has been made toward understanding how these signals are processed by the brain using multiple computational approaches that include frequency domain analysis, the concept of internal models, observer theory, Bayesian theory, and Kalman filtering. Here we put these approaches in context by examining the specific questions that can be addressed by each technique and some of the scientific insights that have resulted. We conclude with a recent application of particle filtering, a probabilistic simulation technique that aims to generate the most likely state estimates by incorporating internal models of sensor dynamics and physical laws and noise associated with sensory processing as well as prior knowledge or experience. In this framework, priors for low angular velocity and linear acceleration can explain the phenomena of velocity storage and frequency segregation, both of which have been modeled previously using arbitrary low-pass filtering. How Kalman and particle filters may be implemented by the brain is an emerging field. Unlike past neurophysiological research that has aimed to characterize mean responses of single neurons, investigations of dynamic Bayesian inference should attempt to characterize population activities that constitute probabilistic representations of sensory and prior information.
Bayesian inference in camera trapping studies for a class of spatial capture-recapture models.
Royle, J Andrew; Karanth, K Ullas; Gopalaswamy, Arjun M; Kumar, N Samba
2009-11-01
We develop a class of models for inference about abundance or density using spatial capture-recapture data from studies based on camera trapping and related methods. The model is hierarchical, with two components: a point process model describing the distribution of individuals in space (or their home range centers) and a model describing the observation of individuals in traps. We suppose that trap- and individual-specific capture probabilities are a function of the distance between individual home range centers and trap locations. We show that the models can be regarded as generalized linear mixed models, where the individual home range centers are random effects. We adopt a Bayesian framework for inference under these models using a formulation based on data augmentation. We apply the models to camera trapping data on tigers from the Nagarahole Reserve, India, collected over 48 nights in 2006. For this study, 120 camera locations were used, but cameras were only operational at 30 locations during any given sample occasion. Movement of traps is common in many camera-trapping studies and represents an important feature of the observation model that we address explicitly in our application.
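The distance-dependent capture probability can be illustrated with the half-normal detection function commonly used in spatial capture-recapture (the paper's exact functional form may differ; the values of p0 and sigma below are hypothetical):

```python
import math

def p_capture(center, trap, p0=0.3, sigma=1.5):
    """Half-normal detection model: the capture probability decays with
    the distance between a home-range center and a trap location.
    p0 (baseline capture probability) and sigma (spatial scale) are
    hypothetical illustration values."""
    d2 = (center[0] - trap[0]) ** 2 + (center[1] - trap[1]) ** 2
    return p0 * math.exp(-d2 / (2.0 * sigma ** 2))

# An individual centered at (0, 0) versus a near and a far trap.
near = p_capture((0.0, 0.0), (0.5, 0.0))
far = p_capture((0.0, 0.0), (4.0, 0.0))
print(round(near, 3), round(far, 4))
```

Traps near an individual's activity center dominate its encounter history, which is what lets the model treat the unobserved centers as random effects and still recover density.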
Duforet-Frebourg, Nicolas; Blum, Michael G B
2014-04-01
Patterns of isolation-by-distance (IBD) arise when population differentiation increases with increasing geographic distances. Patterns of IBD are usually caused by local spatial dispersal, which explains why differences of allele frequencies between populations accumulate with distance. However, spatial variations of demographic parameters such as migration rate or population density can generate nonstationary patterns of IBD where the rate at which genetic differentiation accumulates varies across space. To characterize nonstationary patterns of IBD, we infer local genetic differentiation based on Bayesian kriging. Local genetic differentiation for a sampled population is defined as the average genetic differentiation between the sampled population and fictive neighboring populations. To avoid defining populations in advance, the method can also be applied at the scale of individuals making it relevant for landscape genetics. Inference of local genetic differentiation relies on a matrix of pairwise similarity or dissimilarity between populations or individuals such as matrices of FST between pairs of populations. Simulation studies show that maps of local genetic differentiation can reveal barriers to gene flow but also other patterns such as continuous variations of gene flow across habitat. The potential of the method is illustrated with two datasets: single nucleotide polymorphisms from human Swedish populations and dominant markers for alpine plant species.
Bayesian Inference of the Composition and Inflation Power of Hot Jupiters
Thorngren, Daniel Peter; Fortney, Jonathan J.
2016-10-01
The radius of a planet for a given mass is the result of its composition and thermal evolutionary history. For cooler giants, where thermal evolution is relatively well-understood, we can infer a planet's bulk composition from its mass, radius, stellar insolation and age, since, all else being equal, more metal-rich planets are smaller and denser. For inflated hot giants, there is a degeneracy between inferred composition and inflation power. Within a Bayesian framework we examine both groups, beginning with the cool giant planets. Among these, we observe that the internal heavy-element mass correlates well with the total planet mass, while the metal enrichment relative to the parent star correlates negatively with planet mass. However, it appears that there is not a simple relation between the planet heavy-element mass and stellar metallicity. These fundamental "mass-metallicity" results are consistent with the core accretion model of planet formation. For the hotter inflated gas giants, we estimate the functional dependence of inflation power on stellar insolation by demanding that the same metal to mass relation applies to both cold and hot gas giants. We consider various forms for this relation and the resulting outliers. This inflation power result is robust to assumptions about metal placement within the planet and equation of state because it relies only on matching the two groups of planets. These results serve as a new way to connect models of planet inflation to existing observations of giant planets.
Maximizing Entropy over Markov Processes
DEFF Research Database (Denmark)
Biondi, Fabrizio; Legay, Axel; Nielsen, Bo Friis
2013-01-01
computation reduces to finding a model of a specification with highest entropy. Entropy maximization for probabilistic process specifications has not been studied before, even though it is well known in Bayesian inference for discrete distributions. We give a characterization of global entropy of a process...... as a reward function, a polynomial algorithm to verify the existence of a system maximizing entropy among those respecting a specification, a procedure for the maximization of reward functions over Interval Markov Chains and its application to synthesize an implementation maximizing entropy. We show how......
Maximizing entropy over Markov processes
DEFF Research Database (Denmark)
Biondi, Fabrizio; Legay, Axel; Nielsen, Bo Friis
2014-01-01
computation reduces to finding a model of a specification with highest entropy. Entropy maximization for probabilistic process specifications has not been studied before, even though it is well known in Bayesian inference for discrete distributions. We give a characterization of global entropy of a process...... as a reward function, a polynomial algorithm to verify the existence of a system maximizing entropy among those respecting a specification, a procedure for the maximization of reward functions over Interval Markov Chains and its application to synthesize an implementation maximizing entropy. We show how...
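The quantity maximized in the two records above is the entropy of a Markov process. For a fixed, ergodic transition matrix P the entropy rate is H = Σ_i π_i Σ_j P_ij log(1/P_ij), with π the stationary distribution. A sketch of its computation (illustrative only; the papers' contribution is synthesizing a maximizing P over Interval Markov Chains, which this does not attempt):

```python
import numpy as np

def entropy_rate(P):
    """Entropy rate (bits/step) of an ergodic Markov chain:
    H = sum_i pi_i * H(P[i, :]), with pi the stationary distribution."""
    # Stationary distribution: left eigenvector of P for eigenvalue 1.
    vals, vecs = np.linalg.eig(P.T)
    pi = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
    pi = pi / pi.sum()
    # Row entropies, with the convention 0 * log 0 = 0.
    with np.errstate(divide="ignore", invalid="ignore"):
        logP = np.where(P > 0, np.log2(P), 0.0)
    return float(-(pi[:, None] * P * logP).sum())

# A fair-coin chain attains the binary maximum of 1 bit per step;
# a "sticky" chain leaks less information per step.
fair = np.array([[0.5, 0.5], [0.5, 0.5]])
sticky = np.array([[0.9, 0.1], [0.1, 0.9]])
er_fair = entropy_rate(fair)
er_sticky = entropy_rate(sticky)
print(er_fair, er_sticky)
```

In the papers' setting, entropy measures information leakage, so the specification-respecting model with the highest entropy rate is the worst case a verifier must report.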
Alsing, Justin; Heavens, Alan; Jaffe, Andrew H.
2017-04-01
We apply two Bayesian hierarchical inference schemes to infer shear power spectra, shear maps and cosmological parameters from the Canada-France-Hawaii Telescope (CFHTLenS) weak lensing survey - the first application of this method to data. In the first approach, we sample the joint posterior distribution of the shear maps and power spectra by Gibbs sampling, with minimal model assumptions. In the second approach, we sample the joint posterior of the shear maps and cosmological parameters, providing a new, accurate and principled approach to cosmological parameter inference from cosmic shear data. As a first demonstration on data, we perform a two-bin tomographic analysis to constrain cosmological parameters and investigate the possibility of photometric redshift bias in the CFHTLenS data. Under the baseline ΛCDM (Λ cold dark matter) model, we constrain S_8 = σ_8(Ω_m/0.3)^{0.5} = 0.67 ± 0.03 (68 per cent), consistent with previous CFHTLenS analyses but in tension with Planck. Adding neutrino mass as a free parameter, we are able to constrain ∑mν […]. Including a linear redshift-dependent photo-z bias Δz = p2(z - p1), we find p_1 = -0.25 (+0.53, -0.60) and p_2 = -0.15 (+0.17, -0.15), and tension with Planck is only alleviated under very conservative prior assumptions. Neither the non-minimal neutrino mass nor the photo-z bias model is significantly preferred by the CFHTLenS (two-bin tomography) data.
Lin, Yen-Jen; Chen, Yu-Tin; Hsu, Shu-Ni; Peng, Chien-Hua; Tang, Chuan-Yi; Yen, Tzu-Chen; Hsieh, Wen-Ping
2014-01-01
Copy number variation (CNV) has been reported to be associated with disease and various cancers. Hence, identifying the accurate position and type of a CNV is currently a critical issue. Many tools target detecting CNV regions, constructing haplotype phases within CNV regions, or estimating numerical copy numbers, but none of them can perform all three tasks at the same time. This paper presents a method based on a hidden Markov model to detect parent-specific copy number changes on both chromosomes using signals from SNP arrays. A haplotype tree is constructed with dynamic branch merging to model the transition of the copy number status of the two alleles assessed at each SNP locus. Emission models are constructed for the genotypes formed by the two haplotypes. The proposed method provides the segmentation points of the CNV regions as well as the haplotype phasing of the allelic status on each chromosome. The estimated copy numbers are provided as fractional numbers, which can accommodate the somatic mutations in cancer specimens that usually consist of heterogeneous cell populations. The algorithm is evaluated on simulated data and on the previously published CNV regions of the 270 HapMap individuals. The results were compared with five popular methods: PennCNV, genoCN, COKGEN, QuantiSNP and cnvHap. An application to oral cancer samples demonstrates how the proposed method can facilitate clinical association studies. The proposed algorithm exhibits sensitivity for CNV regions comparable to the best algorithm in our genome-wide study and demonstrates the highest detection rate in SNP-dense regions. In addition, it provides better haplotype phasing accuracy than similar approaches. The clinical association carried out with our fractional estimates of copy numbers in the cancer samples provides better detection power than that with integer copy number states. PMID:24849202
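The segmentation step of HMM-based CNV callers like this one can be illustrated with a generic Viterbi decoder over copy-number states. This is a simplified stand-in for the paper's haplotype-tree model; all emission and transition probabilities below are hypothetical:

```python
import numpy as np

def viterbi(obs_loglik, log_trans, log_init):
    """Most probable hidden-state path (Viterbi algorithm).
    obs_loglik: (T, K) per-locus log-likelihood of each copy-number state.
    log_trans: (K, K) log transition matrix; log_init: (K,) log priors."""
    T, K = obs_loglik.shape
    score = log_init + obs_loglik[0]
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + log_trans        # (from-state, to-state)
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + obs_loglik[t]
    states = [int(score.argmax())]
    for t in range(T - 1, 0, -1):                # backtrack
        states.append(int(back[t][states[-1]]))
    return states[::-1]

# Toy example: states 0 = deletion, 1 = normal, 2 = duplication.
# Emission likelihoods favour "normal", then a run of "duplication".
obs = np.log(np.array([
    [0.10, 0.80, 0.10],
    [0.10, 0.80, 0.10],
    [0.05, 0.10, 0.85],
    [0.05, 0.10, 0.85],
    [0.05, 0.10, 0.85],
    [0.10, 0.80, 0.10],
    [0.10, 0.80, 0.10],
]))
trans = np.log(np.full((3, 3), 0.05) + np.eye(3) * 0.85)  # sticky states
init = np.log(np.full(3, 1 / 3))
path = viterbi(obs, trans, init)
print(path)
```

The sticky transition matrix penalizes state switches, so isolated noisy loci are absorbed and only a sustained run of duplication-like signal produces a CNV segment; the segment boundaries are exactly the state changes in the decoded path.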
A method of spherical harmonic analysis in the geosciences via hierarchical Bayesian inference
Muir, J. B.; Tkalčić, H.
2015-11-01
The problem of decomposing irregular data on the sphere into a set of spherical harmonics is common in many fields of geosciences where it is necessary to build a quantitative understanding of a globally varying field. For example, in global seismology, a compressional or shear wave speed that emerges from tomographic images is used to interpret current state and composition of the mantle, and in geomagnetism, secular variation of magnetic field intensity measured at the surface is studied to better understand the changes in the Earth's core. Optimization methods are widely used for spherical harmonic analysis of irregular data, but they typically do not treat the dependence of the uncertainty estimates on the imposed regularization. This can cause significant difficulties in interpretation, especially when the best-fit model requires more variables as a result of underestimating data noise. Here, with the above limitations in mind, the problem of spherical harmonic expansion of irregular data is treated within the hierarchical Bayesian framework. The hierarchical approach significantly simplifies the problem by removing the need for regularization terms and user-supplied noise estimates. The use of the corrected Akaike Information Criterion for picking the optimal maximum degree of spherical harmonic expansion and the resulting spherical harmonic analyses are first illustrated on a noisy synthetic data set. Subsequently, the method is applied to two global data sets sensitive to the Earth's inner core and lowermost mantle, consisting of PKPab-df and PcP-P differential traveltime residuals relative to a spherically symmetric Earth model. The posterior probability distributions for each spherical harmonic coefficient are calculated via Markov Chain Monte Carlo sampling; the uncertainty obtained for the coefficients thus reflects the noise present in the real data and the imperfections in the spherical harmonic expansion.
Genetic parameters for buffalo milk yield and milk quality traits using Bayesian inference.
Aspilcueta-Borquis, R R; Araujo Neto, F R; Baldi, F; Bignardi, A B; Albuquerque, L G; Tonhati, H
2010-05-01
The availability of accurate genetic parameters for important economic traits in milking buffaloes is critical for implementation of a genetic evaluation program. In the present study, heritabilities and genetic correlations for fat (FY305), protein (PY305), and milk (MY305) yields, milk fat (%F) and protein (%P) percentages, and SCS were estimated using Bayesian methodology. A total of 4,907 lactations from 1,985 cows were used. The (co)variance components were estimated in a multiple-trait analysis by Bayesian inference, applying an animal model through Gibbs sampling. The model included the fixed effects of contemporary groups (herd-year and calving season), number of milkings (2 levels), and age of cow at calving as a covariable (linear and quadratic effects). The additive genetic, permanent environmental, and residual effects were included as random effects in the model. The posterior means of the heritability distributions for MY305, FY305, PY305, %F, %P, and SCS were 0.22, 0.21, 0.23, 0.33, 0.39, and 0.26, respectively. The genetic correlation estimates ranged from -0.13 (between %P and SCS) to 0.94 (between MY305 and PY305). The permanent environmental correlation estimates ranged from -0.38 (between MY305 and %P) to 0.97 (between MY305 and PY305). Residual and phenotypic correlation estimates ranged from -0.26 (between PY305 and SCS) to 0.97 (between MY305 and PY305) and from -0.26 (between MY305 and SCS) to 0.97 (between MY305 and PY305), respectively. Milk yield, milk components, and milk somatic cell counts have enough genetic variation for selection purposes. The genetic correlation estimates suggest that milk components and milk somatic cell counts would be only slightly affected if increasing milk yield were the selection goal. Selecting to increase FY305 or PY305 will also increase MY305, %P, and %F.
The R Package MitISEM: Efficient and Robust Simulation Procedures for Bayesian Inference
Directory of Open Access Journals (Sweden)
Nalan Baştürk
2017-07-01
Full Text Available This paper presents the R package MitISEM (mixture of t by importance sampling weighted expectation maximization), which provides an automatic and flexible two-stage method to approximate a non-elliptical target density kernel - typically a posterior density kernel - using an adaptive mixture of Student t densities as the approximating density. In the first stage, a mixture of Student t densities is fitted to the target using an expectation maximization algorithm in which each step of the optimization procedure is weighted using importance sampling. In the second stage, this mixture density serves as a candidate density for efficient and robust application of importance sampling or the Metropolis-Hastings (MH) method to estimate properties of the target distribution. The package enables Bayesian inference and prediction on model parameters and probabilities, in particular for models whose densities have multimodal or other non-elliptical shapes, such as curved ridges. These shapes occur in research topics in several scientific fields, for instance the analysis of DNA data in bioinformatics, loan acquisition by heterogeneous groups in financial economics, and the effect of education on earned income in labor economics. The package MitISEM also provides an extended algorithm, 'sequential MitISEM', which substantially decreases computation time when the target density has to be approximated for increasing data samples. This occurs when the posterior or predictive density is updated with new observations and/or when one computes model probabilities using predictive likelihoods. We illustrate the MitISEM algorithm using three canonical statistical and econometric models that are characterized by several types of non-elliptical posterior shapes and that describe well-known data patterns in econometrics and finance. We show that MH using the candidate density obtained by MitISEM outperforms, in terms of numerical efficiency, MH using a simpler […]
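The second-stage use of a Student t candidate for importance sampling can be sketched with a single t component (MitISEM fits an adaptive mixture; the bimodal target and all tuning values below are hypothetical):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def log_target(x):
    """Hypothetical non-elliptical (bimodal) target kernel:
    a two-component normal mixture."""
    return np.logaddexp(stats.norm.logpdf(x, -1.0, 0.5) + np.log(0.3),
                        stats.norm.logpdf(x, 2.0, 1.0) + np.log(0.7))

# Fat-tailed Student t candidate roughly covering both modes.
df, loc, scale = 3, 1.0, 2.0
x = stats.t.rvs(df, loc, scale, size=50_000, random_state=rng)

# Self-normalized importance weights (computed on the log scale).
logw = log_target(x) - stats.t.logpdf(x, df, loc, scale)
w = np.exp(logw - logw.max())
w /= w.sum()

post_mean = np.sum(w * x)      # IS estimate of the posterior mean
ess = 1.0 / np.sum(w ** 2)     # effective sample size
print(post_mean, ess)
```

The true mean of this mixture is 0.3(-1) + 0.7(2) = 1.1; the heavy t tails keep the weights stable, which is the same robustness argument behind using t mixtures as candidates in MitISEM.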
Bayesian inference reveals ancient origin of simian foamy virus in orangutans.
Reid, Michael J C; Switzer, William M; Schillaci, Michael A; Klegarth, Amy R; Campbell, Ellsworth; Ragonnet, Manon; Joanisse, Isabelle; Caminiti, Kyna; Lowenberger, Carl A; Galdikas, Birute Mary F; Hollocher, Hope; Sandstrom, Paul A; Brooks, James I
2017-03-05
Simian foamy viruses (SFVs) infect most nonhuman primate species and appear to co-evolve with their hosts. This co-evolutionary signal is particularly strong among great apes, including orangutans (genus Pongo). Previous studies have identified three distinct orangutan SFV clades. The first of these three clades is composed of SFV from P. abelii from Sumatra, the second consists of SFV from P. pygmaeus from Borneo, while the third clade is mixed, comprising an SFV strain found in both species of orangutan. The existence of the mixed clade has been attributed to an expansion of P. pygmaeus into Sumatra following the Mount Toba super-volcanic eruption about 73,000 years ago. Divergence dating, however, has yet to be performed to establish a temporal association with the Toba eruption. Here, we use a Bayesian framework and a relaxed molecular clock model with fossil calibrations to test the Toba hypothesis and to gain a more complete understanding of the evolutionary history of orangutan SFV. As with previous studies, our results show a similar three-clade orangutan SFV phylogeny, along with strong statistical support for SFV-host co-evolution in orangutans. Using Bayesian inference, we date the origin of orangutan SFV to >4.7 million years ago (mya), while the mixed species clade dates to approximately 1.7 mya, >1.6 million years older than the Toba super-eruption. These results, combined with fossil and paleogeographic evidence, suggest that the origin of SFV in Sumatran and Bornean orangutans, including the mixed species clade, likely occurred on the mainland of Indo-China during the Late Pliocene and Calabrian stage of the Pleistocene, respectively.
qPR: An adaptive partial-report procedure based on Bayesian inference.
Baek, Jongsoo; Lesmes, Luis Andres; Lu, Zhong-Lin
2016-08-01
Iconic memory is best assessed with the partial-report procedure, in which an array of letters appears briefly on the screen and a poststimulus cue directs the observer to report the identity of the cued letter(s). Typically, 6-8 cue delays or 600-800 trials are tested to measure the iconic memory decay function. Here we develop a quick partial-report, or qPR, procedure based on a Bayesian adaptive framework to estimate the iconic memory decay function with much reduced testing time. The iconic memory decay function is characterized by an exponential function and a joint probability distribution of its three parameters. Starting with a prior on the parameters, the method selects the stimulus that maximizes the expected information gain in the next test trial. It then updates the posterior probability distribution of the parameters based on the observer's response using Bayesian inference. The procedure is reiterated until either the total number of trials or the precision of the parameter estimates reaches a criterion. Simulation studies showed that only 100 trials were necessary to reach an average absolute bias of 0.026 and a precision of 0.070 (both in terms of probability correct). A psychophysical validation experiment showed that estimates of the iconic memory decay function obtained with 100 qPR trials exhibited good precision (the half width of the 68.2% credible interval = 0.055) and excellent agreement with those obtained with 1,600 trials of the conventional method of constant stimuli (RMSE = 0.063). The qPR procedure relieves the data-collection burden of characterizing iconic memory and makes it possible to assess iconic memory in clinical populations.
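The core loop of such Bayesian adaptive procedures - pick the stimulus that minimizes expected posterior entropy (equivalently, maximizes expected information gain), observe the response, and update by Bayes' rule - can be sketched with a one-parameter toy version of the decay function. The real qPR estimates three parameters jointly; the decay form and every number below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy decay: probability correct at cue delay t, with unknown tau.
def p_correct(t, tau):
    return 0.1 + 0.7 * np.exp(-t / tau)

taus = np.linspace(0.05, 2.0, 200)        # parameter grid
post = np.full(taus.size, 1 / taus.size)  # flat prior on tau
delays = np.array([0.05, 0.1, 0.2, 0.4, 0.8, 1.6])
true_tau = 0.5                            # simulated observer

def entropy(p):
    p = p[p > 0]
    return -(p * np.log(p)).sum()

for _ in range(300):
    # Choose the delay whose outcome minimizes expected posterior entropy.
    exp_H = []
    for t in delays:
        pc = p_correct(t, taus)
        m1 = (post * pc).sum()                 # predictive P(correct)
        post1 = post * pc / m1                 # posterior if correct
        post0 = post * (1 - pc) / (1 - m1)     # posterior if wrong
        exp_H.append(m1 * entropy(post1) + (1 - m1) * entropy(post0))
    t = delays[int(np.argmin(exp_H))]
    # Simulate the observer's response and apply Bayes' rule.
    correct = rng.random() < p_correct(t, true_tau)
    like = p_correct(t, taus) if correct else 1 - p_correct(t, taus)
    post = post * like
    post /= post.sum()

tau_hat = (post * taus).sum()
print(tau_hat)
```

Because each trial is placed where it is most informative about the current posterior, the estimate homes in on the simulated time constant with far fewer trials than a fixed-delay design would need, which is the efficiency argument of the abstract.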
Xu, Chengcheng; Wang, Wei; Liu, Pan; Li, Zhibin
2015-12-01
This study aimed to develop a real-time crash risk model with limited data in China by using a Bayesian meta-analysis and Bayesian inference approach. A systematic review was first conducted using three different Bayesian meta-analyses: a fixed effect meta-analysis, a random effect meta-analysis, and a meta-regression. The meta-analyses provided a numerical summary of the effects of traffic variables on crash risk by quantitatively synthesizing results from previous studies. The random effect meta-analysis and the meta-regression produced more conservative estimates of the effects of traffic variables than the fixed effect meta-analysis. The meta-analysis results were then used as informative priors for developing crash risk models with the limited data. The three meta-analyses significantly affected model fit and prediction accuracy. The model based on the meta-regression increased prediction accuracy by about 15% compared with the model developed directly from the limited data. Finally, Bayesian predictive density analysis was used to identify outliers in the limited data, which further improved prediction accuracy by 5.0%.
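Using a meta-analysis summary as an informative prior for limited local data can be sketched with a conjugate normal-normal update. The numbers are hypothetical; the paper's crash-risk models are more elaborate:

```python
def posterior_normal(prior_mean, prior_var, data_mean, data_var, n):
    """Conjugate normal-normal update: a meta-analysis estimate of a
    traffic variable's effect serves as the prior; a limited local sample
    (mean of n observations with known variance) updates it."""
    lik_var = data_var / n            # variance of the sample mean
    w_prior = 1.0 / prior_var
    w_data = 1.0 / lik_var
    post_var = 1.0 / (w_prior + w_data)
    post_mean = post_var * (w_prior * prior_mean + w_data * data_mean)
    return post_mean, post_var

# Hypothetical numbers: meta-analysis says the effect is ~0.30 (var 0.01);
# 25 local observations average 0.50 with unit variance.
m, v = posterior_normal(0.30, 0.01, 0.50, 1.0, 25)
print(m, v)
```

The posterior mean (0.34) sits between the meta-analytic prior and the sparse local data, weighted by their precisions - exactly why an informative prior stabilizes models built from limited data.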
Comparing rates of springtail predation by web-building spiders using Bayesian inference.
Welch, Kelton D; Schofield, Matthew R; Chapman, Eric G; Harwood, James D
2014-08-01
A major goal of gut-content analysis is to quantify predation rates by predators in the field, which could provide insights into the mechanisms behind ecosystem structure and function, as well as quantification of ecosystem services provided. However, percentage-positive results from molecular assays are strongly influenced by factors other than predation rate, and thus can only be reliably used to quantify predation rates under very restrictive conditions. Here, we develop two statistical approaches, one using a parametric bootstrap and the other in terms of Bayesian inference, to build upon previous techniques that use DNA decay rates to rank predators by their rate of prey consumption, by allowing a statistical assessment of confidence in the inferred ranking. To demonstrate the utility of this technique in evaluating ecological data, we test web-building spiders for predation on a primary prey item, springtails. Using these approaches we found that an orb-weaving spider consumes springtail prey at a higher rate than a syntopic sheet-weaving spider, despite occupying microhabitats where springtails are less frequently encountered. We suggest that spider-web architecture (orb web vs. sheet web) is a primary determinant of prey-consumption rates within this assemblage of predators, which demonstrates the potential influence of predator foraging behaviour on trophic web structure. We also discuss how additional assumptions can be incorporated into the same analysis to allow broader application of the technique beyond the specific example presented. We believe that such modelling techniques can greatly advance the field of molecular gut-content analysis.
Aberer, Andre J; Stamatakis, Alexandros; Ronquist, Fredrik
2016-01-01
Sampling tree space is the most challenging aspect of Bayesian phylogenetic inference. The sheer number of alternative topologies is problematic by itself. In addition, the complex dependency between branch lengths and topology increases the difficulty of moving efficiently among topologies. Current tree proposals are fast but sample new trees using primitive transformations or re-mappings of old branch lengths. This reduces acceptance rates and presumably slows down convergence and mixing. Here, we explore branch proposals that do not rely on old branch lengths but instead are based on approximations of the conditional posterior. Using a diverse set of empirical data sets, we show that most conditional branch posteriors can be accurately approximated via a Γ distribution. We empirically determine the relationship between the logarithmic conditional posterior density, its derivatives, and the characteristics of the branch posterior. We use these relationships to derive an independence sampler for proposing branches with an acceptance ratio of ~90% on most data sets. This proposal samples branches between 2× and 3× more efficiently than traditional proposals with respect to the effective sample size per unit of runtime. We also compare the performance of standard topology proposals with hybrid proposals that use the new independence sampler to update those branches that are most affected by the topological change. Our results show that hybrid proposals can sometimes noticeably decrease the number of generations necessary for topological convergence. Inconsistent performance gains indicate that branch updates are not the limiting factor in improving topological convergence for the currently employed set of proposals. However, our independence sampler might be essential for the construction of novel tree proposals that apply more radical topology changes.
The Application of Bayesian Inference to Gravitational Waves from Core-Collapse Supernovae
Gossan, Sarah; Ott, Christian; Kalmus, Peter; Logue, Joshua; Heng, Siong
2013-04-01
The gravitational wave (GW) signature of core-collapse supernovae (CCSNe) encodes important information on the supernova explosion mechanism, the workings of which cannot be explored via observations in the electromagnetic spectrum. Recent research has shown that the CCSNe explosion mechanism can be inferred through the application of Bayesian model selection to gravitational wave signals from supernova explosions powered by the neutrino, magnetorotational and acoustic mechanisms. Extending this work, we apply Principal Component Analysis to the GW spectrograms from CCSNe to also take into account the time-frequency evolution of the emitted signals. We do so in the context of Advanced LIGO, to establish whether any improvement in distinguishing between the various explosion mechanisms can be obtained. Further to this, we consider a five-detector network of interferometers (composed of the two Advanced LIGO detectors, Advanced Virgo, LIGO India and KAGRA) and generalize the aforementioned analysis for a source of known position but unknown distance, using realistic, re-colored detector data (as opposed to Gaussian noise), in order to make more reliable statements regarding our ability to distinguish between various explosion mechanisms on the basis of their GW signatures.
Bayesian inference on earthquake size distribution: a case study in Italy
Licia, Faenza; Carlo, Meletti; Laura, Sandri
2010-05-01
This paper focuses on the statistical distribution of earthquake size using Bayesian inference. The strategy consists in defining an a priori distribution based on instrumental seismicity, modeled as a power law. Using the observed historical data, the power law is then updated to obtain the posterior distribution. The aim of this paper is to define the earthquake size distribution using all the available seismic databases (i.e., instrumental and historical catalogs) and a robust statistical technique. We apply this methodology to Italian seismicity, dividing the territory into source zones as done for the seismic hazard assessment, taken here as a reference model. The results suggest that each area has its own peculiar trend: while the power law captures the mean aspect of the earthquake size distribution, the posterior emphasizes different slopes in different areas. Our results are in general agreement with the ones used in the seismic hazard assessment in Italy. However, some areas show a flattening of the curve, indicating a significant departure from power law behavior and implying that there are local aspects that a power law distribution is not able to capture.
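The prior-to-posterior update described here can be sketched with the conjugate exponential-magnitude (continuous Gutenberg-Richter) form: a gamma prior on the rate encodes the instrumental catalog, and historical events update it analytically. All magnitudes and prior parameters below are illustrative, not values from the Italian catalogs:

```python
# Conjugate sketch: magnitudes above a cutoff m0 are taken to follow an
# exponential law with rate beta (beta = b-value * ln 10). The gamma
# prior Gamma(a, b) stands in for the instrumental-catalog estimate.
m0 = 4.0
a, b = 50.0, 25.0                       # prior mean beta = a/b = 2.0
historical = [4.3, 5.1, 4.6, 6.2, 4.8]  # hypothetical historical magnitudes
n = len(historical)
excess = sum(m - m0 for m in historical)

# Conjugate update: Gamma(a + n, b + sum of magnitude excesses).
a_post, b_post = a + n, b + excess
beta_mean = a_post / b_post             # posterior mean rate
```

The posterior mean shrinks the historical-data estimate toward the instrumental prior, exactly the "power law modified by observed historical data" step of the abstract.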
Bayesian Predictive Inference of a Proportion Under a Twofold Small-Area Model
Directory of Open Access Journals (Sweden)
Nandram Balgobin
2016-03-01
Full Text Available We extend the twofold small-area model of Stukel and Rao (1997; 1999) to accommodate binary data. An example is the Third International Mathematics and Science Study (TIMSS), in which pass-fail data for mathematics of students from US schools (clusters) are available at the third grade by regions and communities (small areas). We compare the finite population proportions of these small areas. We present a hierarchical Bayesian model in which the first-stage binary responses have independent Bernoulli distributions, and each subsequent stage is modeled using a beta distribution, which is parameterized by its mean and a correlation coefficient. This twofold small-area model has an intracluster correlation at the first stage and an intercluster correlation at the second stage. The final-stage mean and all correlations are assumed to be noninformative independent random variables. We show how to infer the finite population proportion of each area. We have applied our models to synthetic TIMSS data to show that the twofold model is preferred over a onefold small-area model that ignores the clustering within areas. We further compare these models using a simulation study, which shows that the intracluster correlation is particularly important.
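A one-stage simplification of the beta-Bernoulli hierarchy can be sketched as follows (the paper's twofold model adds intra- and intercluster correlations; the pass/total counts below are invented):

```python
# Beta-Bernoulli sketch: each small area's pass proportion gets a Beta
# prior, and the posterior mean is a shrinkage estimate of the finite
# population proportion.
def posterior_mean(passes, trials, alpha=1.0, beta=1.0):
    # Posterior is Beta(alpha + passes, beta + trials - passes);
    # its mean shrinks the raw proportion toward alpha/(alpha+beta).
    return (alpha + passes) / (alpha + beta + trials)

areas = {"A": (37, 50), "B": (8, 20)}   # hypothetical (passes, total) counts
est = {k: posterior_mean(p, n) for k, (p, n) in areas.items()}
```

Small areas with few students are pulled more strongly toward the prior mean, which is the basic benefit a small-area model provides over raw proportions.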
Characteristics of SiC neutron sensor spectrum unfolding process based on Bayesian inference
Energy Technology Data Exchange (ETDEWEB)
Cetnar, Jerzy; Krolikowski, Igor [Faculty of Energy and Fuels AGH - University of Science and Technology, Al. Mickiewicza 30, 30-059 Krakow (Poland); Ottaviani, L. [IM2NP, UMR CNRS 7334, Aix-Marseille University, Case 231 -13397 Marseille Cedex 20 (France); Lyoussi, A. [CEA, DEN, DER, Instrumentation Sensors and Dosimetry Laboratory, Cadarache, F-13108 St-Paul-Lez-Durance (France)
2015-07-01
This paper deals with SiC detector signal interpretation in neutron radiation measurements in mixed neutron-gamma radiation fields, known as the detector inverse problem or spectrum unfolding, which aims at finding a representation of the primary radiation based on the measured detector signals. In our novel methodology we adopt a Bayesian inference approach. In the developed procedure the resultant spectrum is unfolded from the detector channel readings, and the estimated neutron fluence in a group structure is obtained together with its statistical characteristics, comprising the standard deviation and the correlation matrix. In the paper we present results of the unfolding process for the case of a D-T neutron source in a neutron-moderating environment. We discuss the statistical properties of the obtained results as well as the physical meaning of the correlation matrix of the estimated group fluence. The presented work has been carried out within the I-SMART project, which is part of the KIC InnoEnergy R&D program. (authors)
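A toy version of such an unfolding, assuming a linear detector response and Gaussian noise (the response matrix and every number below are invented, not the SiC detector's actual characteristics), shows how the estimated group fluence comes with a posterior covariance and hence a correlation matrix:

```python
import numpy as np

# Toy linear unfolding: counts c = R @ f + noise, Gaussian prior on the
# two-group fluence f. Illustrative numbers only.
R = np.array([[0.8, 0.1],
              [0.2, 0.9]])            # hypothetical response matrix
f_true = np.array([100.0, 50.0])      # "true" group fluence for the demo
c = R @ f_true                        # noise-free counts

sigma2, tau2 = 4.0, 1e4               # noise variance, (weak) prior variance
A = R.T @ R / sigma2 + np.eye(2) / tau2
cov = np.linalg.inv(A)                # posterior covariance of f
f_hat = cov @ (R.T @ c) / sigma2      # posterior mean of f

# Off-diagonal correlation between the estimated group fluences.
corr = cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])
```

Because the two detector channels respond to both energy groups, the posterior correlation between the groups is negative: overestimating one group must be compensated by underestimating the other to match the same counts. This is the kind of structure the abstract's correlation matrix conveys.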
Cuevas Rivera, Dario; Bitzer, Sebastian; Kiebel, Stefan J.
2015-01-01
The olfactory information that is received by the insect brain is encoded in the form of spatiotemporal patterns in the projection neurons of the antennal lobe. These dense and overlapping patterns are transformed into a sparse code in Kenyon cells in the mushroom body. Although it is clear that this sparse code is the basis for rapid categorization of odors, it is yet unclear how the sparse code in Kenyon cells is computed and what information it represents. Here we show that this computation can be modeled by sequential firing rate patterns using Lotka-Volterra equations and Bayesian online inference. This new model can be understood as an ‘intelligent coincidence detector’, which robustly and dynamically encodes the presence of specific odor features. We found that the model is able to qualitatively reproduce experimentally observed activity in both the projection neurons and the Kenyon cells. In particular, the model explains mechanistically how sparse activity in the Kenyon cells arises from the dense code in the projection neurons. The odor classification performance of the model proved to be robust against noise and time jitter in the observed input sequences. As in recent experimental results, we found that recognition of an odor happened very early during stimulus presentation in the model. Critically, by using the model, we found surprising but simple computational explanations for several experimental phenomena. PMID:26451888
Bayesian inference of the heat transfer properties of a wall using experimental data
Iglesias, Marco
2016-01-06
A hierarchical Bayesian inference method is developed to estimate the thermal resistance and volumetric heat capacity of a wall. We apply our methodology to a real case study in which measurements are recorded each minute from two temperature probes and two heat flux sensors placed on both sides of a solid brick wall over a period of almost five days. We model the heat transfer through the wall by means of the one-dimensional heat equation with Dirichlet boundary conditions. The initial/boundary conditions for the temperature are approximated by piecewise linear functions. We assume that temperature and heat flux measurements have independent Gaussian noise and derive the joint likelihood of the wall parameters and the initial/boundary conditions. Under the model assumptions, the boundary conditions are marginalized analytically from the joint likelihood. Approximated Gaussian posterior distributions for the wall parameters and the initial condition parameter are obtained using the Laplace method, after incorporating the available prior information. The information gain is estimated under different experimental setups, to determine the best allocation of resources.
Genetic parameters for five traits in Africanized honeybees using Bayesian inference
Padilha, Alessandro Haiduck; Sattler, Aroni; Cobuci, Jaime Araújo; McManus, Concepta Margaret
2013-01-01
Heritability and genetic correlations for honey production (HP), propolis production (PP), hygienic behavior (HB), syrup-collection rate (SCR) and percentage of mites on adult bees (PMAB) were estimated for a population of Africanized honeybees. Data from 110 queen bees over three generations were evaluated. Single- and multi-trait models were analyzed by Bayesian inference using MTGSAM. The localization of the hive was significant for SCR and HB and highly significant for PP. Season-year was highly significant only for SCR. The number of frames with bees was significant for HP, PP and SCR. The heritability estimates were 0.16 for HP, 0.23 for SCR, 0.52 for HB, 0.66 for PP, and 0.13 for PMAB. The genetic correlations were positive among productive traits (PP, HP and SCR) and negative between productive traits and HB, except between PP and HB. Genetic correlations between PMAB and the other traits were in general negative, except with PP. The study made it possible to identify honeybees with improved propolis and honey production. Hygienic behavior may be improved as a consequence of selecting for improved propolis production. The rate of syrup consumption and propolis production may be included in a selection index to enhance honeybee traits. PMID:23885203
Directory of Open Access Journals (Sweden)
Antonio Canale
2017-06-01
Full Text Available msBP is an R package that implements a new method to perform Bayesian multiscale nonparametric inference introduced by Canale and Dunson (2016). The method, based on mixtures of multiscale beta dictionary densities, overcomes the drawbacks of Pólya trees and inherits many of the advantages of Dirichlet process mixture models. The key idea is that an infinitely-deep binary tree is introduced, with a beta dictionary density assigned to each node of the tree. Using a multiscale stick-breaking characterization, stochastically decreasing weights are assigned to each node. The result is an infinite mixture model. The package msBP implements a series of basic functions to deal with this family of priors such as random densities and numbers generation, creation and manipulation of binary tree objects, and generic functions to plot and print the results. In addition, it implements the Gibbs samplers for posterior computation to perform multiscale density estimation and multiscale testing of group differences described in Canale and Dunson (2016).
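The multiscale stick-breaking construction can be sketched as follows (a simplification of the construction behind msBP: the equal split between children here stands in for a separate Beta draw, and the tree is truncated at a finite depth):

```python
import random

random.seed(0)

# Each node of a binary tree draws a Beta(1, a) stopping probability;
# its weight is the chance of reaching the node and then stopping there.
# Mass not stopped by the truncation depth remains below the tree.
def tree_weights(a=5.0, depth=4):
    weights = {}
    reach = {(0, 0): 1.0}               # (scale, node index) -> prob of reaching
    for _ in range(depth):
        nxt = {}
        for (s, h), p in reach.items():
            stop = random.betavariate(1.0, a)
            weights[(s, h)] = p * stop
            # leftover mass splits between the two children (equal split
            # here; msBP uses a separate Beta draw for the split)
            nxt[(s + 1, 2 * h)] = p * (1.0 - stop) / 2.0
            nxt[(s + 1, 2 * h + 1)] = p * (1.0 - stop) / 2.0
        reach = nxt
    # assigned weights plus mass still below the truncation depth
    total = sum(weights.values()) + sum(reach.values())
    return weights, total

w, total = tree_weights()
```

The stopping parameter `a` controls how fast the weights decay with scale: small `a` concentrates mass near the root (smooth densities), large `a` pushes mass to deeper nodes (finer multiscale detail).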
Revision of HCM Delay Model Based on Bayesian Inference
Institute of Scientific and Technical Information of China (English)
张惠玲; 孙剑; 邵海鹏
2011-01-01
This paper validates the HCM2000 delay model under normal traffic conditions, using one cycle length as the analysis period, and proposes a method for revising the parameters used to derive signalized intersection delay. The precision of the parameter extraction is verified with a t-test. The degree of saturation, the intersection geometry correction factors, and the start-up lost time in the delay model are analyzed, and Bayesian inference with Markov chain Monte Carlo (MCMC) simulation is used to revise these parameters. The results show that the method improves the precision of delay parameters extracted on a per-cycle basis.
Institute of Scientific and Technical Information of China (English)
KUNDU Debasis; PRADHAN Biswabrata
2009-01-01
Recently the generalized exponential distribution has received considerable attention. In this paper, we deal with Bayesian inference of the unknown parameters of the progressively censored generalized exponential distribution. It is assumed that the scale and the shape parameters have independent gamma priors. The Bayes estimates of the unknown parameters cannot be obtained in closed form. Lindley's approximation and an importance sampling technique are suggested to compute approximate Bayes estimates. A Markov chain Monte Carlo method is used to compute the approximate Bayes estimates and also to construct highest posterior density credible intervals. We also provide different criteria to compare two sampling schemes and hence to find optimal sampling schemes. Since finding the optimal censoring procedure is computationally expensive, we recommend using a sub-optimal censoring procedure, which can be obtained very easily. Monte Carlo simulations are performed to compare the performances of the different methods, and one data analysis is performed for illustrative purposes.
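The importance-sampling step can be sketched generically: draw from a tractable proposal, weight by posterior over proposal, and self-normalize. The "posterior" below is a plain gamma density standing in for the intractable progressively censored posterior; all shapes and rates are illustrative:

```python
import math, random

random.seed(2)

# log density of Gamma(k, rate): rate^k / Gamma(k) * x^(k-1) * exp(-rate x)
def log_gamma_pdf(x, k, rate):
    return k * math.log(rate) - math.lgamma(k) + (k - 1) * math.log(x) - rate * x

# Self-normalized importance sampling of a posterior mean. The target is
# a stand-in Gamma(5, 2) "posterior"; the proposal is Gamma(4, 1.5).
def bayes_estimate(n=50000):
    num = den = 0.0
    for _ in range(n):
        x = random.gammavariate(4.0, 1.0 / 1.5)        # draw from proposal
        w = math.exp(log_gamma_pdf(x, 5.0, 2.0)        # weight = target / proposal
                     - log_gamma_pdf(x, 4.0, 1.5))
        num += w * x
        den += w
    return num / den

est = bayes_estimate()   # true mean of the stand-in posterior is 5/2 = 2.5
```

The same machinery applies when the target density is only known up to a constant, which is exactly the situation with the censored-data posterior in the paper.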
Zhou, X.; Albertson, J. D.
2016-12-01
Natural gas is considered a bridge fuel towards clean energy due to its potentially lower greenhouse gas emissions compared with other fossil fuels. Despite numerous efforts, an efficient and cost-effective approach to monitor fugitive methane emissions along the natural gas production-supply chain has not yet been developed. Recently, mobile methane measurement has been introduced, which applies a Bayesian approach to probabilistically infer methane emission rates and update estimates recursively as new measurements become available. However, the likelihood function, especially the error term, which determines the shape of the estimate uncertainty, is not rigorously defined and evaluated with field data. To address this issue, we performed a series of near-source experiments: measurements were taken around controlled methane sources, and concurrent wind and temperature data were recorded by nearby 3-D sonic anemometers. With known methane release rates, the measurements were used to determine the functional form and the parameterization of the likelihood function in the Bayesian inference scheme under different meteorological conditions.
Directory of Open Access Journals (Sweden)
Mateus José Sudano
2011-01-01
Full Text Available The objective of this experiment was to test in vitro embryo production (IVP) as a tool to estimate fertility performance in Zebu bulls using Bayesian inference. Oocytes were matured and fertilized in vitro using sperm cells from three different Zebu bulls (V, T, and G). The three bulls presented similar results with regard to pronuclear formation and blastocyst formation rates. However, the cleavage rates were different between bulls. The estimated conception rates based on combined data of cleavage and blastocyst formation were very similar to the true conception rates observed for the same bulls after a fixed-time artificial insemination program. Moreover, even when we used cleavage rate data only or blastocyst formation data only, the estimated conception rates were still close to the true conception rates. We conclude that Bayesian inference is an effective statistical procedure to estimate in vivo bull fertility using data from IVP.
Alfaro, Michael E; Zoller, Stefan; Lutzoni, François
2003-02-01
Bayesian Markov chain Monte Carlo sampling has become increasingly popular in phylogenetics as a method for both estimating the maximum likelihood topology and for assessing nodal confidence. Despite the growing use of posterior probabilities, the relationship between the Bayesian measure of confidence and the most commonly used confidence measure in phylogenetics, the nonparametric bootstrap proportion, is poorly understood. We used computer simulation to investigate the behavior of three phylogenetic confidence methods: Bayesian posterior probabilities calculated via Markov chain Monte Carlo sampling (BMCMC-PP), maximum likelihood bootstrap proportion (ML-BP), and maximum parsimony bootstrap proportion (MP-BP). We simulated the evolution of DNA sequences on 17-taxon topologies under 18 evolutionary scenarios and examined the performance of these methods in assigning confidence to correct and incorrect monophyletic groups, and we examined the effects of increasing character number on support values. BMCMC-PP and ML-BP were often strongly correlated with one another but could provide substantially different estimates of support on short internodes. In contrast, BMCMC-PP correlated poorly with MP-BP across most of the simulation conditions that we examined. For a given threshold value, more correct monophyletic groups were supported by BMCMC-PP than by either ML-BP or MP-BP. When threshold values were chosen that fixed the rate of accepting incorrect monophyletic relationships as true at 5%, all three methods recovered most of the correct relationships on the simulated topologies, although BMCMC-PP and ML-BP performed better than MP-BP. BMCMC-PP was usually a less biased predictor of phylogenetic accuracy than either bootstrapping method. BMCMC-PP provided high support values for correct topological bipartitions with fewer characters than were needed for the nonparametric bootstrap.
Godsey, Brian
2013-01-01
Inferring gene regulatory networks from expression data is difficult, but it is common and often useful. Most network problems are under-determined--there are more parameters than data points--and therefore data or parameter set reduction is often necessary. Correlation between variables in the model also contributes to confound network coefficient inference. In this paper, we present an algorithm that uses integrated, probabilistic clustering to ease the problems of under-determination and correlated variables within a fully Bayesian framework. Specifically, ours is a dynamic Bayesian network with integrated Gaussian mixture clustering, which we fit using variational Bayesian methods. We show, using public, simulated time-course data sets from the DREAM4 Challenge, that our algorithm outperforms non-clustering methods in many cases (7 out of 25) with fewer samples, rarely underperforming (1 out of 25), and often selects a non-clustering model if it better describes the data. Source code (GNU Octave) for BAyesian Clustering Over Networks (BACON) and sample data are available at: http://code.google.com/p/bacon-for-genetic-networks.
DEFF Research Database (Denmark)
Heller, Rasmus; Lorenzen, Eline D.; Okello, J.B.A
2008-01-01
pandemic in the late 1800s, but little is known about the earlier demographic history of the species. We analysed genetic variation at 17 microsatellite loci and a 302-bp fragment of the mitochondrial DNA control region to infer past demographic changes in buffalo populations from East Africa. Two Bayesian......-Holocene aridification of East Africa caused a major decline in the effective population size of the buffalo, a species reliant on moist savannah habitat for its existence....
DEFF Research Database (Denmark)
Mørup, Morten; Schmidt, Mikkel N
2012-01-01
Many networks of scientific interest naturally decompose into clusters or communities with comparatively fewer external than internal links; however, current Bayesian models of network communities do not capture this intuitive notion of communities. We formulate a nonparametric Bayesian model...... for community detection consistent with an intuitive definition of communities and present a Markov chain Monte Carlo procedure for inferring the community structure. A Matlab toolbox with the proposed inference procedure is available for download. On synthetic and real networks, our model detects communities...... consistent with ground truth, and on real networks, it outperforms existing approaches in predicting missing links. This suggests that community structure is an important structural property of networks that should be explicitly modeled....
Directory of Open Access Journals (Sweden)
Heringstad Bjørg
2010-07-01
Full Text Available Abstract. Background: In the genetic analysis of binary traits with one observation per animal, animal threshold models frequently give biased heritability estimates. In some cases, this problem can be circumvented by fitting sire- or sire-dam models. However, these models are not appropriate in cases where individual records exist on parents. Therefore, the aim of our study was to develop a new Gibbs sampling algorithm for a proper estimation of genetic (co)variance components within an animal threshold model framework. Methods: In the proposed algorithm, individuals are classified as either "informative" or "non-informative" with respect to genetic (co)variance components. The "non-informative" individuals are characterized by their Mendelian sampling deviations (deviation from the mid-parent mean) being completely confounded with a single residual on the underlying liability scale. For threshold models, residual variance on the underlying scale is not identifiable. Hence, the variance of fully confounded Mendelian sampling deviations cannot be identified either, but can be inferred from the between-family variation. In the new algorithm, breeding values are sampled as in a standard animal model using the full relationship matrix, but genetic (co)variance components are inferred from the sampled breeding values and relationships between "informative" individuals (usually parents) only. The latter is analogous to a sire-dam model (in cases with no individual records on the parents). Results: When applied to simulated data sets, the standard animal threshold model failed to produce useful results since samples of genetic variance always drifted towards infinity, while the new algorithm produced proper parameter estimates essentially identical to the results from a sire-dam model (given the fact that no individual records exist for the parents). Furthermore, the new algorithm showed much faster Markov chain mixing properties for genetic parameters (similar to
Learning Markov Network Based on the Boundary
Institute of Scientific and Technical Information of China (English)
何盈捷; 刘惟一
2001-01-01
Markov networks are another powerful tool, besides Bayesian networks, for uncertain inference. This paper discusses a method of learning a Markov network automatically from mass data based on the boundary. Taking advantage of an important conclusion in information theory, we present an efficient boundary-based Markov network learning algorithm. The algorithm requires only O(N²) CI (conditional independence) tests. We prove that if the joint probability is strictly positive, the learned Markov network is the minimal I-map of the sample.
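The CI test that the O(N²) bound counts can be sketched as a conditional mutual information estimate from data, declaring independence when the estimate falls below a threshold (a simplification; the paper's exact information-theoretic criterion is not given in the abstract, and the data below are synthetic):

```python
import math
from collections import Counter

# Plug-in estimate of conditional mutual information I(X; Y | Z) from
# samples of discrete triples (x, y, z). I = 0 iff X and Y are
# conditionally independent given Z (in the empirical distribution).
def cmi(samples):
    n = len(samples)
    cxyz, cxz, cyz, cz = Counter(), Counter(), Counter(), Counter()
    for x, y, z in samples:
        cxyz[(x, y, z)] += 1
        cxz[(x, z)] += 1
        cyz[(y, z)] += 1
        cz[z] += 1
    total = 0.0
    for (x, y, z), c in cxyz.items():
        total += (c / n) * math.log((c * cz[z]) / (cxz[(x, z)] * cyz[(y, z)]))
    return total

# X and Y both copy Z: marginally dependent, but independent given Z,
# so the conditional mutual information is zero.
data = [(z, z, z) for z in (0, 1) for _ in range(50)]
val = cmi(data)
```

In a learning loop, `cmi(...) < threshold` would be the boundary-membership test applied to each candidate pair of variables.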
Institute of Scientific and Technical Information of China (English)
陈亮; 程汉文; 吴乐南
2009-01-01
A nonparametric Bayesian method is presented to classify M-ary phase shift keying (MPSK) signals from their constellations. MPSK signals with unknown signal-to-noise ratios (SNRs) are modeled as a mixture of Gaussian components with unknown means and covariances in the constellation plane, and a clustering method is proposed to estimate the probability density of the MPSK signals. The method is based on nonparametric Bayesian inference: a Dirichlet process is introduced as the prior on the mixture proportions, and a normal-inverse-Wishart (NIW) distribution serves as the prior on the unknown means and covariances. Given the received signal, the mixture proportions, means and covariances are adjusted by a Gibbs-sampling Markov chain Monte Carlo (MCMC) algorithm; through iteration, a density estimate of the modulated signal is obtained. Simulations show that the correct recognition rate for 2/4/8PSK exceeds 95% when SNR > 5 dB and more than 1,600 symbols are used.
Directory of Open Access Journals (Sweden)
In-Ho Choi
2016-05-01
Full Text Available This study presents a new method to track a driver's facial state, such as head pose and eye-blinking, in real time. Since a driver in natural driving conditions moves his head in diverse ways and his face is often occluded by his hand or the wheel, this is a great challenge for standard face models. Among the many face models, the Active Appearance Model (AAM) and the Active Shape Model (ASM) are two of the most favored. We have extended the Discriminative Bayesian ASM by incorporating extreme pose cases, calling it the Pose-Extended Active Shape Model (PE-ASM). Two face databases (DBs) are used for comparison: one is the Boston University face DB and the other is our custom-made driving DB. Our evaluation indicates that PE-ASM outperforms ASM and AAM in terms of face fitting against extreme poses. Using this model, we can estimate the driver's head pose, as well as eye-blinking, by adding the respective processes. Two HMMs are trained to model the temporal behavior of these two facial features, and the system can then infer from these HMM states whether the driver is drowsy or not. Results suggest that it can be used as a driver drowsiness detector in commercial cars, where the visual conditions are very diverse and often tough to deal with.
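The HMM inference step can be sketched with forward-algorithm filtering over two driver states given a sequence of discrete eye observations. The transition and emission probabilities below are invented for illustration, not trained values from the study:

```python
# Forward-algorithm filtering over two hidden driver states.
# Observations: 0 = eyes open, 1 = eyes closed (per video frame).
STATES = ("alert", "drowsy")
TRANS = {"alert":  {"alert": 0.95, "drowsy": 0.05},
         "drowsy": {"alert": 0.10, "drowsy": 0.90}}
EMIT = {"alert": {0: 0.9, 1: 0.1},
        "drowsy": {0: 0.4, 1: 0.6}}

def filter_states(obs, prior=None):
    belief = dict(prior or {"alert": 0.99, "drowsy": 0.01})
    for o in obs:
        # predict: propagate belief through the transition model
        pred = {s: sum(belief[p] * TRANS[p][s] for p in STATES) for s in STATES}
        # update: weight by the emission likelihood and renormalize
        unnorm = {s: pred[s] * EMIT[s][o] for s in STATES}
        z = sum(unnorm.values())
        belief = {s: v / z for s, v in unnorm.items()}
    return belief

b = filter_states([1, 1, 1, 1, 1])   # sustained eye closure
```

Sustained eye closure shifts the posterior from the strongly alert prior toward the drowsy state, which is the enumeration-over-HMM-states decision the abstract describes.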
Gelman, Andrew; Robert, Christian P.; Rousseau, Judith
2010-01-01
For many decades, statisticians have made attempts to prepare the Bayesian omelette without breaking the Bayesian eggs; that is, to obtain probabilistic likelihood-based inferences without relying on informative prior distributions. A recent example is Murray Aitkin's book, Statistical Inference, which presents an approach to statistical hypothesis testing based on comparisons of posterior distributions of likelihoods under competing models. Aitkin develops and illustrates his me...
Kim, Jee-Seon; Bolt, Daniel M.
2007-01-01
The purpose of this ITEMS module is to provide an introduction to Markov chain Monte Carlo (MCMC) estimation for item response models. A brief description of Bayesian inference is followed by an overview of the various facets of MCMC algorithms, including discussion of prior specification, sampling procedures, and methods for evaluating chain…
Vrugt, J.A.; Braak, ter C.J.F.; Diks, C.G.H.; Robinson, B.A.; Hyman, J.M.; Higdon, D.
2009-01-01
Markov chain Monte Carlo (MCMC) methods have found widespread use in many fields of study to estimate the average properties of complex systems, and for posterior inference in a Bayesian framework. Existing theory and experiments prove convergence of well-constructed MCMC schemes to the appropriate
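A standard way to check the convergence this record refers to is the Gelman-Rubin potential scale reduction factor, which compares between-chain and within-chain variance. A minimal sketch (not the DREAM implementation; the chains below are synthetic draws from one distribution):

```python
import random

random.seed(3)

# Gelman-Rubin R-hat for m chains of length n: values near 1 indicate
# that the chains have mixed into the same distribution.
def r_hat(chains):
    m, n = len(chains), len(chains[0])
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    b = n / (m - 1) * sum((mu - grand) ** 2 for mu in means)   # between-chain
    w = sum(sum((x - mu) ** 2 for x in c) / (n - 1)
            for c, mu in zip(chains, means)) / m               # within-chain
    var_plus = (n - 1) / n * w + b / n                         # pooled estimate
    return (var_plus / w) ** 0.5

# Two chains sampling the same N(0, 1) target should give R-hat close to 1.
chains = [[random.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(2)]
rh = r_hat(chains)
```

Chains stuck in different modes inflate the between-chain term `b`, pushing R-hat well above 1, which is the practical convergence alarm.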
Dumitru, Mircea; Mohammad-Djafari, Ali; Sain, Simona Baghai
2016-12-01
The toxicity and efficacy of more than 30 anticancer agents show very high variations depending on the dosing time. Therefore, biologists studying the circadian rhythm require a very precise method for estimating the periodic component (PC) vector of chronobiological signals. Moreover, in recent developments, not only the dominant period or the PC vector is of crucial interest but also their stability or variability. In cancer treatment experiments, the recorded signals corresponding to different phases of treatment are short: from 7 days for the synchronization segment to 2 or 3 days for the after-treatment segment. When studying the stability of the dominant period, we have to consider very short signals relative to the prior knowledge of the dominant period, placed in the circadian domain. The classical approaches, based on Fourier transform (FT) methods, are inefficient (i.e., lack precision) given the particularities of the data (i.e., their short length). Another particularity of the signals considered in such experiments is the level of noise: such signals are very noisy, and establishing the periodic components associated with the biological phenomena while distinguishing them from those associated with the noise is a difficult task. In this paper, we propose a new method for estimating the PC vector of biomedical signals, using prior biological information and considering a model that accounts for the noise. The experiments developed in the cancer treatment context record signals expressing a limited number of periods. This prior information can be translated as sparsity of the PC vector. The proposed method treats PC vector estimation as an inverse problem (IP), using general Bayesian inference to infer the unknowns of the model, i.e., the PC vector but also the hyperparameters (i.e., the variances). The sparsity prior information is modeled using a sparsity-enforcing prior law
Hallo, Miroslav; Asano, Kimiyuki; Gallovič, František
2017-09-01
On April 16, 2016, Kumamoto prefecture in the Kyushu region, Japan, was devastated by a shallow M_JMA 7.3 earthquake. The series of foreshocks started with an M_JMA 6.5 foreshock 28 h before the mainshock. The foreshocks originated in the Hinagu fault zone, which intersects the mainshock's Futagawa fault zone; hence, the tectonic background for this earthquake sequence is rather complex. Here we infer centroid moment tensors (CMTs) for 11 events with M_JMA between 4.8 and 6.5, using strong motion records of the K-NET, KiK-net and F-net networks. We use the upgraded Bayesian full-waveform inversion code ISOLA-ObsPy, which takes into account the uncertainty of the velocity model. Such an approach allows us to reliably assess the uncertainty of the CMT parameters, including the centroid position. The solutions show significant systematic spatial and temporal variations throughout the sequence. Foreshocks are right-lateral, steeply dipping strike-slip events connected to the NE-SW shear zone. Those located close to the intersection of the Hinagu and Futagawa fault zones dip slightly to the ESE, while those in the southern area dip to the WNW. Contrarily, aftershocks are mostly normal dip-slip events related to the N-S extensional tectonic regime. Most of the deviatoric moment tensors contain only a minor CLVD component, which can be attributed to the velocity model uncertainty. Nevertheless, two of the CMTs involve a significant CLVD component, which may reflect a complex rupture process. Decomposition of those moment tensors into two pure shear moment tensors suggests combined right-lateral strike-slip and normal dip-slip mechanisms, consistent with the tectonic settings of the intersection of the Hinagu and Futagawa fault zones.
Condition monitoring of distributed systems using two-stage Bayesian inference data fusion
Jaramillo, Víctor H.; Ottewill, James R.; Dudek, Rafał; Lepiarczyk, Dariusz; Pawlik, Paweł
2017-03-01
In industrial practice, condition monitoring is typically applied to critical machinery. A particular piece of machinery may have its own condition monitoring system that allows the health of said piece of equipment to be assessed independently of any connected assets. However, industrial machines are typically complex sets of components that continuously interact with one another. In some cases, dynamics resulting from the inception and development of a fault can propagate between individual components. For example, a fault in one component may lead to an increased vibration level both in the faulty component and in connected healthy components. In such cases, a condition monitoring system focusing on a specific element in a connected set of components may either incorrectly indicate a fault or, conversely, a fault might be missed or masked due to the interaction of a piece of equipment with neighboring machines. A more holistic condition monitoring approach that can not only account for such interactions but utilize them to provide a more complete and definitive diagnostic picture of the health of the machinery is therefore highly desirable. In this paper, a Two-Stage Bayesian Inference approach allowing data from separate condition monitoring systems to be combined is presented. Data from distributed condition monitoring systems are combined in two stages, the first data fusion occurring at a local, or component, level, and the second fusion combining data at a global level. Data obtained from an experimental rig consisting of an electric motor, two gearboxes, and a load, operating under a range of fault conditions, are used to illustrate the efficacy of the method at pinpointing the root cause of a problem. The obtained results suggest that the approach is adept at refining the diagnostic information obtained from each of the different machine components monitored, therefore improving the reliability of the health assessment of the machinery.
Bayesian inference of genetic parameters for ultrasound scanning traits of Kivircik lambs.
Cemal, I; Karaman, E; Firat, M Z; Yilmaz, O; Ata, N; Karaca, O
2017-03-01
Ultrasound scanning traits have been adopted in selection programs in many countries to improve carcass traits for lean meat production. As the genetic parameters of the traits of interest are important for breeding programs, the present investigation aimed at estimating these parameters. The estimated parameters were direct and maternal heritability as well as genetic correlations between the studied traits. The traits were backfat thickness (BFT), skin+backfat thickness (SBFT), eye muscle depth (MD) and live weight on the day of scanning (LW). The breed investigated was Kivircik, which has high meat quality. Six different multi-trait animal models were fitted to determine the most suitable model for the data using a Bayesian approach. Based on the deviance information criterion, a model that includes direct additive genetic effects, maternal additive genetic effects, direct-maternal genetic covariance and maternal permanent environmental effects proved to be the most appropriate for the data, and therefore inferences were based on the results of that model. The direct heritability estimates for BFT, SBFT, MD and LW were 0.26, 0.26, 0.23 and 0.09, whereas the maternal heritability estimates were 0.27, 0.27, 0.24 and 0.20, respectively. Negative genetic correlations were obtained between direct and maternal effects for BFT, SBFT and MD. Both direct and maternal genetic correlations between traits were favorable, although BFT-MD and SBFT-MD had negligible direct genetic correlations. The highest direct and maternal genetic correlations were between BFT and SBFT (0.39) and between MD and LW (0.48), respectively. Our results, in general, indicate that maternal effects should be accounted for in the estimation of genetic parameters of ultrasound scanning traits in Kivircik lambs, and that SBFT can be used as a selection criterion to improve BFT.
Multi-scale inference of interaction rules in animal groups using Bayesian model selection.
Mann, Richard P; Perna, Andrea; Strömbom, Daniel; Garnett, Roman; Herbert-Read, James E; Sumpter, David J T; Ward, Ashley J W
2012-01-01
Inference of interaction rules of animals moving in groups usually relies on an analysis of large scale system behaviour. Models are tuned through repeated simulation until they match the observed behaviour. More recent work has used the fine scale motions of animals to validate and fit the rules of interaction of animals in groups. Here, we use a Bayesian methodology to compare a variety of models to the collective motion of glass prawns (Paratya australiensis). We show that these exhibit a stereotypical 'phase transition', whereby an increase in density leads to the onset of collective motion in one direction. We fit models to this data, which range from: a mean-field model where all prawns interact globally; to a spatial Markovian model where prawns are self-propelled particles influenced only by the current positions and directions of their neighbours; up to non-Markovian models where prawns have 'memory' of previous interactions, integrating their experiences over time when deciding to change behaviour. We show that the mean-field model fits the large scale behaviour of the system, but does not capture fine scale rules of interaction, which are primarily mediated by physical contact. Conversely, the Markovian self-propelled particle model captures the fine scale rules of interaction but fails to reproduce global dynamics. The most sophisticated model, the non-Markovian model, provides a good match to the data at both the fine scale and in terms of reproducing global dynamics. We conclude that prawns' movements are influenced by not just the current direction of nearby conspecifics, but also those encountered in the recent past. Given the simplicity of prawns as a study system our research suggests that self-propelled particle models of collective motion should, if they are to be realistic at multiple biological scales, include memory of previous interactions and other non-Markovian effects.
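The Markovian self-propelled particle model referred to above can be sketched with a minimal Vicsek-style update rule, in which each agent aligns with the mean heading of its neighbours plus angular noise. The parameter values and function below are illustrative assumptions, not the fitted model from the study:

```python
import numpy as np

def vicsek_step(pos, theta, rng, speed=0.05, radius=0.5, eta=0.3, box=5.0):
    """One Markovian self-propelled-particle update: each agent adopts the
    mean heading of neighbours within `radius`, plus uniform angular noise."""
    n = len(theta)
    new_theta = np.empty(n)
    for i in range(n):
        d = pos - pos[i]
        d -= box * np.round(d / box)                 # periodic boundaries
        nb = np.hypot(d[:, 0], d[:, 1]) < radius     # neighbours (incl. self)
        new_theta[i] = np.arctan2(np.sin(theta[nb]).mean(),
                                  np.cos(theta[nb]).mean())
    new_theta += rng.uniform(-eta / 2, eta / 2, n)   # angular noise
    pos = (pos + speed * np.column_stack((np.cos(new_theta),
                                          np.sin(new_theta)))) % box
    return pos, new_theta

# Noise-free sanity check: a perfectly aligned group stays aligned.
rng = np.random.default_rng(1)
pos = rng.uniform(0.0, 5.0, size=(50, 2))
pos2, theta2 = vicsek_step(pos, np.zeros(50), rng, eta=0.0)
order = np.abs(np.exp(1j * theta2).mean())           # polar order parameter
```

A non-Markovian variant of the kind favoured by the model comparison would additionally carry a per-agent memory of past neighbour headings into the update, rather than using only the current positions and directions.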
Energy Technology Data Exchange (ETDEWEB)
Kang, Seongkeun; Seong, Poong Hyun [Korea Advanced Institute of Science and Technology, Daejeon (Korea, Republic of)
2014-05-15
The purpose of this paper is to confirm whether Bayesian inference can properly reflect the situation awareness of real human operators, to find the differences between ideal and practical operators, and to investigate the factors that contribute to those differences. The results show that humans cannot think like a computer. If humans could memorize all the information, and their thinking process were the same as a computer's CPU, the success rate in these two experiments would exceed 99%. However, the probability of humans finding the right malfunction was only 64.52% in the simple experiment and 51.61% in the complex experiment. Cognition is the mental processing that includes the attention of working memory, comprehending and producing language, calculating, reasoning, problem solving, and decision making. There are many reasons why the human thinking process differs from a computer's, but in this experiment we suggest that working memory is the most important factor. Humans have a limited working memory with a capacity of only seven chunks; these seven chunks are called the magic number. If there are more than seven sequential pieces of information, people start to forget the earlier information because their working memory capacity is exceeded. The simple experiment shows how much working memory affects the result. What if we neglect the effect of working memory? The total number of subjects with incorrect memory is 7 (subjects 3, 5, 6, 7, 8, 15, 25). They could have found the right malfunction had their memory not been corrupted by the limits of working memory. The probability of finding the correct malfunction would then increase from 64.52% to 87.10%. The complex experiment shows a similar result: eight subjects (1, 5, 8, 9, 15, 17, 18, 30) had altered memories, which affected their ability to find the right malfunction. Accounting for this, the probability would be (16 + 8)/31 = 77.42%.
Serang, Oliver
2015-08-01
Observations depending on sums of random variables are common throughout many fields; however, no efficient solution is currently known for performing max-product inference on these sums of general discrete distributions (max-product inference can be used to obtain maximum a posteriori estimates). The limiting step to max-product inference is the max-convolution problem (sometimes presented in log-transformed form and denoted as "infimal convolution," "min-convolution," or "convolution on the tropical semiring"), for which no O(k log(k)) method is currently known. Presented here is an O(k log(k)) numerical method for estimating the max-convolution of two nonnegative vectors (e.g., two probability mass functions), where k is the length of the larger vector. This numerical max-convolution method is then demonstrated by performing fast max-product inference on a convolution tree, a data structure for performing fast inference given information on the sum of n discrete random variables in O(nk log(nk)log(n)) steps (where each random variable has an arbitrary prior distribution on k contiguous possible states). The numerical max-convolution method can be applied to specialized classes of hidden Markov models to reduce the runtime of computing the Viterbi path from nk(2) to nk log(k), and has potential application to the all-pairs shortest paths problem.
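The max-convolution at the heart of this method can be contrasted with its naive O(k²) form in a short sketch. The p-norm trick below (raising both vectors to a large power p so that an ordinary convolution of the powered vectors approximates the elementwise maxima) is a simplified reading of the numerical approach; the power p and helper names are chosen here for illustration:

```python
import numpy as np

def max_convolve_naive(x, y):
    # Exact max-convolution: z[m] = max over i+j=m of x[i] * y[j].  O(k^2).
    z = np.zeros(len(x) + len(y) - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            z[i + j] = max(z[i + j], xi * yj)
    return z

def max_convolve_pnorm(x, y, p=64.0):
    # For large p, the p-norm of {x[i]*y[j] : i+j=m} approaches the max.
    # Summing p-th powers over i+j=m is an ordinary convolution, which an
    # FFT-based routine would evaluate in O(k log k); np.convolve is used
    # here only for simplicity.
    return np.convolve(x ** p, y ** p) ** (1.0 / p)

x = np.array([0.5, 0.3, 0.2])
y = np.array([0.4, 0.6])
exact = max_convolve_naive(x, y)
approx = max_convolve_pnorm(x, y)
```

With at most n terms contributing to each output entry, the p-norm overestimates the true maximum by a factor of at most n^(1/p), roughly 1% here, so the approximation tracks the exact result closely.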
Chiang, Sharon; Guindani, Michele; Yeh, Hsiang J; Haneef, Zulfi; Stern, John M; Vannucci, Marina
2017-03-01
In this article, a multi-subject vector autoregressive (VAR) modeling approach is proposed for inference on effective connectivity based on resting-state functional MRI data. The framework uses a Bayesian variable selection approach to allow for simultaneous inference on effective connectivity at both the subject and group level. Furthermore, it accounts for multi-modal data by integrating structural imaging information into the prior model, encouraging effective connectivity between structurally connected regions. Simulation studies demonstrate that the approach results in improved inference on effective connectivity at both the subject and group level, compared with currently used methods. The method is illustrated on temporal lobe epilepsy data, using both resting-state functional MRI and structural MRI. Hum Brain Mapp 38:1311-1332, 2017. © 2016 Wiley Periodicals, Inc.
Boers, Niklas; Goswami, Bedartha; Chekroun, Mickael; Svensson, Anders; Rousseau, Denis-Didier; Ghil, Michael
2016-04-01
In the recent past, empirical stochastic models have been successfully applied to model a wide range of climatic phenomena [1,2]. In addition to enhancing our understanding of the geophysical systems under consideration, multilayer stochastic models (MSMs) have been shown to be solidly grounded in the Mori-Zwanzig formalism of statistical physics [3]. They are also well-suited for predictive purposes, e.g., for the El Niño Southern Oscillation [4] and the Madden-Julian Oscillation [5]. In general, these models are trained on a given time series under consideration, and then assumed to reproduce certain dynamical properties of the underlying natural system. Most existing approaches are based on least-squares fitting to determine optimal model parameters, which does not allow for an uncertainty estimation of these parameters. This approach significantly limits the degree to which dynamical characteristics of the time series can be safely inferred from the model. Here, we are specifically interested in fitting low-dimensional stochastic models to time series obtained from paleoclimatic proxy records, such as the oxygen isotope ratio and dust concentration of the NGRIP record [6]. The time series derived from these records exhibit substantial dating uncertainties, in addition to the proxy measurement errors. In particular, for time series of this kind, it is crucial to obtain uncertainty estimates for the final model parameters. Following [7], we first propose a statistical procedure to shift dating uncertainties from the time axis to the proxy axis of layer-counted paleoclimatic records. Thereafter, we show how Maximum Likelihood Estimation in combination with Markov Chain Monte Carlo parameter sampling can be employed to translate all uncertainties present in the original proxy time series to uncertainties of the parameter estimates of the stochastic model. We compare time series simulated by the empirical model to the original time series in terms of standard
DEFF Research Database (Denmark)
Scholer, Marie; Irving, James; Zibar, Majken Caroline Looms
2012-01-01
We examined to what extent time-lapse crosshole ground-penetrating radar traveltimes, measured during a forced infiltration experiment at the Arreneas field site in Denmark, could help to quantify vadose zone hydraulic properties and their corresponding uncertainties using a Bayesian Markov-chain-Monte-Carlo inversion approach with different priors. The ground-penetrating radar (GPR) geophysical method has the potential to provide valuable information on the hydraulic properties of the vadose zone because of its strong sensitivity to soil water content. In particular, recent evidence has suggested that the stochastic inversion of crosshole GPR traveltime data can allow for a significant reduction in uncertainty regarding subsurface van Genuchten–Mualem (VGM) parameters. Much of the previous work on the stochastic estimation of VGM parameters from crosshole GPR data has considered the case of steady...
Energy Technology Data Exchange (ETDEWEB)
Chin, George; Choudhury, Sutanay; Kangas, Lars J.; McFarlane, Sally A.; Marquez, Andres
2011-09-01
Long viewed as a strong statistical inference technique, Bayesian networks have emerged as an important class of applications for high-performance computing. We have applied an architecture-conscious approach to parallelizing the Lauritzen-Spiegelhalter Junction Tree algorithm for exact inference in Bayesian networks. In optimizing the Junction Tree algorithm, we have implemented both in-clique and topological parallelism strategies to best leverage the fine-grained synchronization and massive-scale multithreading of the Cray XMT architecture. Two topological techniques were developed to parallelize the evidence propagation process through the Bayesian network. The first technique performs intelligent scheduling of junction tree nodes based on their topology and relative sizes. The second technique decomposes the junction tree into a much finer tree-like representation to offer many more opportunities for parallelism. We evaluate these optimizations on five different Bayesian networks and report our findings and observations. Another important contribution of this paper is to demonstrate the application of massive-scale multithreading for load balancing and the use of implicit parallelism-based compiler optimizations in designing scalable inference algorithms.
A simple introduction to Markov Chain Monte-Carlo sampling.
van Ravenzwaaij, Don; Cassey, Pete; Brown, Scott D
2016-03-11
Markov Chain Monte-Carlo (MCMC) is an increasingly popular method for obtaining information about distributions, especially for estimating posterior distributions in Bayesian inference. This article provides a very basic introduction to MCMC sampling. It describes what MCMC is, and what it can be used for, with simple illustrative examples. Highlighted are some of the benefits and limitations of MCMC sampling, as well as different approaches to circumventing the limitations most likely to trouble cognitive scientists.
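A minimal random-walk Metropolis sampler, the simplest instance of the MCMC methods this tutorial introduces, can be sketched as follows; the step size, seed, and standard-normal target are illustrative choices, not prescriptions from the article:

```python
import math
import random

def metropolis(log_target, n_samples, x0=0.0, step=1.0, seed=42):
    # Random-walk Metropolis: propose x' ~ Normal(x, step) and accept
    # with probability min(1, target(x') / target(x)).
    rng = random.Random(seed)
    x, chain = x0, []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        if math.log(rng.random()) < log_target(proposal) - log_target(x):
            x = proposal                    # accept; otherwise keep x
        chain.append(x)
    return chain

# Sample a standard normal "posterior" (log-density up to a constant),
# dropping the first 1000 draws as burn-in.
draws = metropolis(lambda v: -0.5 * v * v, 20000)[1000:]
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / len(draws)
```

The empirical mean and variance of the chain approach 0 and 1, the moments of the target, which is exactly the property that makes MCMC useful for summarizing posterior distributions.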
Directory of Open Access Journals (Sweden)
Fang-Rong Yan
This article provides a fully Bayesian approach for modeling single-dose and complete pharmacokinetic data in a population pharmacokinetic (PK) model. To overcome the impact of outliers and the difficulty of computation, a generalized linear model is chosen with the hypothesis that the errors follow a multivariate Student t distribution, which is a heavy-tailed distribution. The aim of this study is to investigate and implement the performance of the multivariate t distribution in analyzing population pharmacokinetic data. Bayesian predictive inferences and Metropolis-Hastings algorithm schemes are used to process the intractable posterior integration. The precision and accuracy of the proposed model are illustrated using simulated data and a real example of theophylline data.
Fontanazza, C M; Freni, G; Notaro, V
2012-01-01
Flood damage in urbanized watersheds may be assessed by combining flood depth-damage curves with the outputs of urban flood models. The complexity of the physical processes that must be simulated and the limited amount of data available for model calibration may lead to high uncertainty in the model results and, consequently, in the damage estimates. Moreover, depth-damage functions are usually affected by significant uncertainty related to the collected data and to the simplified structure of the regression law that is used. The present paper analyzes the uncertainty connected to flood damage estimates obtained by combining hydraulic models and depth-damage curves. A Bayesian inference analysis is proposed along with a probabilistic approach for parameter estimation. The analysis demonstrates that the Bayesian approach is very effective, considering that the available databases are usually short.
Sankararaman, Shankar
2016-01-01
This paper presents a computational framework for uncertainty characterization and propagation, and sensitivity analysis under the presence of aleatory and epistemic uncertainty, and develops a rigorous methodology for efficient refinement of epistemic uncertainty by identifying important epistemic variables that significantly affect the overall performance of an engineering system. The proposed methodology is illustrated using the NASA Langley Uncertainty Quantification Challenge (NASA-LUQC) problem that deals with uncertainty analysis of a generic transport model (GTM). First, Bayesian inference is used to infer subsystem-level epistemic quantities using the subsystem-level model and corresponding data. Second, tools of variance-based global sensitivity analysis are used to identify four important epistemic variables (this limitation specified in the NASA-LUQC is reflective of practical engineering situations where not all epistemic variables can be refined due to time/budget constraints) that significantly affect system-level performance. The most significant contribution of this paper is the development of the sequential refinement methodology, where epistemic variables for refinement are not identified all-at-once. Instead, only one variable is first identified, and then, Bayesian inference and global sensitivity calculations are repeated to identify the next important variable. This procedure is continued until all 4 variables are identified and the refinement in the system-level performance is computed. The advantages of the proposed sequential refinement methodology over the all-at-once uncertainty refinement approach are explained, and then applied to the NASA Langley Uncertainty Quantification Challenge problem.
Nitrate vulnerability projections from Bayesian inference of multiple groundwater age tracers
Alikhani, Jamal; Deinhart, Amanda L.; Visser, Ate; Bibby, Richard K.; Purtschert, Roland; Moran, Jean E.; Massoudieh, Arash; Esser, Bradley K.
2016-12-01
Nitrate is a major source of contamination of groundwater in the United States and around the world. We tested the applicability of multiple groundwater age tracers (3H, 3He, 4He, 14C, 13C, and 85Kr) in projecting future trends of nitrate concentration in 9 long-screened, public drinking water wells in Turlock, California, where nitrate concentrations are increasing toward the regulatory limit. Very low 85Kr concentrations and apparent 3H/3He ages point to a relatively old modern fraction (40-50 years), diluted with pre-modern groundwater, corroborated by the onset and slope of increasing nitrate concentrations. An inverse Gaussian-Dirac model was chosen to represent the age distribution of the sampled groundwater at each well. Model parameters were estimated using a Bayesian inference, resulting in the posterior probability distribution - including the associated uncertainty - of the parameters and projected nitrate concentrations. Three scenarios were considered, including combined historic nitrate and age tracer data, the sole use of nitrate and the sole use of age tracer data. Each scenario was evaluated based on the ability of the model to reproduce the data and the level of reliability of the nitrate projections. The tracer-only scenario closely reproduced tracer concentrations, but not observed trends in the nitrate concentration. Both cases that included nitrate data resulted in good agreement with historical nitrate trends. Use of combined tracers and nitrate data resulted in a narrower range of projections of future nitrate levels. However, use of combined tracer and nitrate resulted in a larger discrepancy between modeled and measured tracers for some of the tracers. Despite nitrate trend slopes between 0.56 and 1.73 mg/L/year in 7 of the 9 wells, the probability that concentrations will increase to levels above the MCL by 2040 are over 95% for only two of the wells, and below 15% in the other wells, due to a leveling off of reconstructed historical
Raithel, Carolyn A.; Özel, Feryal; Psaltis, Dimitrios
2017-08-01
One of the key goals of observing neutron stars is to infer the equation of state (EoS) of the cold, ultradense matter in their interiors. Here, we present a Bayesian statistical method of inferring the pressures at five fixed densities, from a sample of mock neutron star masses and radii. We show that while five polytropic segments are needed for maximum flexibility in the absence of any prior knowledge of the EoS, regularizers are also necessary to ensure that simple underlying EoS are not over-parameterized. For ideal data with small measurement uncertainties, we show that the pressure at roughly twice the nuclear saturation density, ρ_sat, can be inferred to within 0.3 dex for many realizations of potential sources of uncertainties. The pressures of more complicated EoS with significant phase transitions can also be inferred to within ~30%. We also find that marginalizing the multi-dimensional parameter space of pressure to infer a mass-radius relation can lead to biases of nearly 1 km in radius, toward larger radii. Using the full, five-dimensional posterior likelihoods avoids this bias.
Analogical and Category-Based Inference: A Theoretical Integration with Bayesian Causal Models
Holyoak, Keith J.; Lee, Hee Seung; Lu, Hongjing
2010-01-01
A fundamental issue for theories of human induction is to specify constraints on potential inferences. For inferences based on shared category membership, an analogy, and/or a relational schema, it appears that the basic goal of induction is to make accurate and goal-relevant inferences that are sensitive to uncertainty. People can use source…
Analysis of simulated data for the KArlsruhe TRItium Neutrino experiment using Bayesian inference
DEFF Research Database (Denmark)
Riis, Anna Sejersen; Hannestad, Steen; Weinheimer, C.
2011-01-01
neutrinos. As an alternative to the frequentist minimization methods used in the analysis of the earlier experiments in Mainz and Troitsk we have been investigating Markov chain Monte Carlo (MCMC) methods which are very well suited for probing multiparameter spaces. We found that implementing the KATRIN χ2...
Yang, Yuqing; Chen, Ning; Chen, Ting
2017-01-25
The inference of associations between environmental factors and microbes and among microbes is critical to interpreting metagenomic data, but compositional bias, indirect associations resulting from common factors, and variance within metagenomic sequencing data limit the discovery of associations. To account for these problems, we propose metagenomic Lognormal-Dirichlet-Multinomial (mLDM), a hierarchical Bayesian model with sparsity constraints, to estimate absolute microbial abundance and simultaneously infer both conditionally dependent associations among microbes and direct associations between microbes and environmental factors. We empirically show the effectiveness of the mLDM model using synthetic data, data from the TARA Oceans project, and a colorectal cancer dataset. Finally, we apply mLDM to 16S sequencing data from the western English Channel and report several associations. Our model can be used on both natural environmental and human metagenomic datasets, promoting the understanding of associations in the microbial community.
Morrissey, Edward R; Juárez, Miguel A; Denby, Katherine J; Burroughs, Nigel J
2011-10-01
We propose a semiparametric Bayesian model, based on penalized splines, for the recovery of the time-invariant topology of a causal interaction network from longitudinal data. Our motivation is inference of gene regulatory networks from low-resolution microarray time series, where existence of nonlinear interactions is well known. Parenthood relations are mapped by augmenting the model with kinship indicators and providing these with either an overall or gene-wise hierarchical structure. Appropriate specification of the prior is crucial to control the flexibility of the splines, especially under circumstances of scarce data; thus, we provide an informative, proper prior. Substantive improvement in network inference over a linear model is demonstrated using synthetic data drawn from ordinary differential equation models and gene expression from an experimental data set of the Arabidopsis thaliana circadian rhythm.
Bayesian inference – a way to combine statistical data and semantic analysis meaningfully
Directory of Open Access Journals (Sweden)
Eila Lindfors
2011-11-01
This article focuses on presenting the possibilities of Bayesian modelling (Finite Mixture Modelling) in the semantic analysis of statistically modelled data. The probability of a hypothesis in relation to the data available is an important question in inductive reasoning. Bayesian modelling allows the researcher to use many models at a time and provides tools to evaluate the goodness of different models. The researcher should always be aware that there is no such thing as the exact probability of an exact event. This is the reason for using probabilistic models. Each model presents a different perspective on the phenomenon in focus, and the researcher has to choose the most probable model with a view to previous research and the knowledge available. The idea of Bayesian modelling is illustrated here by presenting two different sets of data, one from craft science research (n=167) and the other (n=63) from educational research (Lindfors, 2007, 2002). The principles of how to build models and how to combine different profiles are described in the light of the research mentioned. Bayesian modelling is an analysis based on calculating probabilities in relation to a specific set of quantitative data. It is a tool for handling data and interpreting it semantically. The reliability of the analysis arises from an argumentation of which model can be selected from the model space as the basis for an interpretation, and on which arguments. Keywords: method, sloyd, Bayesian modelling, student teachers. URN:NBN:no-29959
DEFF Research Database (Denmark)
Kristensen, Anders Ringgaard; Søllested, Thomas Algot
2004-01-01
herds. It is concluded that the Bayesian updating technique and the hierarchical structure decrease the size of the state space dramatically. Since parameter estimates vary considerably among herds, it is concluded that decision support concerning sow replacement only makes sense with parameters estimated at herd level. It is argued that the multi-level formulation and the standard software comprise a flexible tool and a shortcut to working prototypes...
Tang, An-Min; Tang, Nian-Sheng
2015-02-28
We propose a semiparametric multivariate skew-normal joint model for multivariate longitudinal and multivariate survival data. One main feature of the posited model is that we relax the commonly used normality assumption for random effects and within-subject error by using a centered Dirichlet process prior to specify the random effects distribution and using a multivariate skew-normal distribution to specify the within-subject error distribution and model trajectory functions of longitudinal responses semiparametrically. A Bayesian approach is proposed to simultaneously obtain Bayesian estimates of unknown parameters, random effects and nonparametric functions by combining the Gibbs sampler and the Metropolis-Hastings algorithm. Particularly, a Bayesian local influence approach is developed to assess the effect of minor perturbations to within-subject measurement error and random effects. Several simulation studies and an example are presented to illustrate the proposed methodologies.
Subbiah, M.; Rajeswaran, V.
Extensive statistical practice has shown the importance and relevance of the inferential problem of estimating probability parameters in a binomial experiment; especially on the issues of competing intervals from frequentist, Bayesian, and Bootstrap approaches. The package written in the free R environment and presented in this paper tries to take care of the issues just highlighted, by pooling a number of widely available and well-performing methods and apporting on them essential variations. A wide range of functions helps users with differing skills to estimate, evaluate, summarize, numerically and graphically, various measures adopting either the frequentist or the Bayesian paradigm.
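Two of the frequentist intervals such a package typically pools, Wald and Wilson, and the Bayesian Jeffreys credible interval can be sketched as follows (the original package is written in R; the Python function names here are illustrative):

```python
import math
from scipy.stats import beta

def wald_interval(x, n, z=1.96):
    # Frequentist Wald interval: p-hat +/- z * standard error, clipped to [0, 1].
    p = x / n
    half = z * math.sqrt(p * (1.0 - p) / n)
    return max(0.0, p - half), min(1.0, p + half)

def wilson_interval(x, n, z=1.96):
    # Frequentist Wilson score interval (better coverage near 0 and 1).
    p, z2 = x / n, z * z
    center = (p + z2 / (2 * n)) / (1 + z2 / n)
    half = (z / (1 + z2 / n)) * math.sqrt(p * (1 - p) / n + z2 / (4 * n * n))
    return center - half, center + half

def jeffreys_interval(x, n, level=0.95):
    # Bayesian credible interval under the Jeffreys prior Beta(1/2, 1/2):
    # the posterior is Beta(x + 1/2, n - x + 1/2).
    a = (1.0 - level) / 2.0
    return (beta.ppf(a, x + 0.5, n - x + 0.5),
            beta.ppf(1.0 - a, x + 0.5, n - x + 0.5))
```

For 8 successes in 10 trials, the Wald interval runs into the upper boundary at 1, while Wilson and Jeffreys pull the interval toward 1/2, which is one reason competing interval constructions matter for small n.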
Bayesian Inference with Missing Data
Institute of Scientific and Technical Information of China (English)
虞健飞; 张恒喜; 朱家元
2002-01-01
Recently, the Bayesian network (BN) has become a notable research direction in data mining. In this paper we first introduce missing-data mechanisms, and then several methods for performing Bayesian inference with missing data based on those mechanisms. These methods are useful in practice, especially when data are scarce and expensive. It is foreseeable that, with the methods presented here, Bayesian networks will become a powerful tool in data mining.
A Bayesian network approach for causal inferences in pesticide risk assessment and management
Pesticide risk assessment and management must balance societal benefits and ecosystem protection, based on quantified risks and the strength of the causal linkages between uses of the pesticide and socioeconomic and ecological endpoints of concern. A Bayesian network (BN) is a gr...
DEFF Research Database (Denmark)
Møller, Jesper; Rasmussen, Jakob Gulddahl
points is such that the dependent cluster point is likely to occur closely to a previous cluster point. We demonstrate the flexibility of the model for producing point patterns with linear structures, and propose to use the model as the likelihood in a Bayesian setting when analyzing a spatial point...
Multi-Pitch Estimation and Tracking Using Bayesian Inference in Block Sparsity
DEFF Research Database (Denmark)
Karimian-Azari, Sam; Jakobsson, Andreas; Jensen, Jesper Rindom
2015-01-01
tracking of the found sources, without posing detailed a priori assumptions of the number of harmonics for each source. The method incorporates a Bayesian prior and assigns data-dependent regularization coefficients to efficiently incorporate both earlier and future data blocks in the tracking of estimates...
Data-Driven Inference on Sign Restrictions in Bayesian Structural Vector Autoregression
DEFF Research Database (Denmark)
Lanne, Markku; Luoto, Jani
asymptotically. In other words, within the set the impulse responses are driven by the implicit prior, and the likelihood has no significance. In this paper, we introduce a Bayesian SVAR model where unique identification is achieved by statistical properties of the data. Our setup facilitates assuming...
Bayesian Inference for Growth Mixture Models with Latent Class Dependent Missing Data
Lu, Zhenqiu Laura; Zhang, Zhiyong; Lubke, Gitta
2011-01-01
"Growth mixture models" (GMMs) with nonignorable missing data have drawn increasing attention in research communities but have not been fully studied. The goal of this article is to propose and to evaluate a Bayesian method to estimate the GMMs with latent class dependent missing data. An extended GMM is first presented in which class…
On the Practice of Bayesian Inference in Basic Economic Time Series Models using Gibbs Sampling
M.D. de Pooter (Michiel); R. Segers (René); H.K. van Dijk (Herman)
2006-01-01
Several lessons learned from a Bayesian analysis of basic economic time series models by means of the Gibbs sampling algorithm are presented. Models include the Cochrane-Orcutt model for serial correlation, the Koyck distributed lag model, the Unit Root model, the Instrumental Variables
Energy Technology Data Exchange (ETDEWEB)
Blanc, Guillermo A. [Observatories of the Carnegie Institution for Science, 813 Santa Barbara Street, Pasadena, CA 91101 (United States); Kewley, Lisa; Vogt, Frédéric P. A.; Dopita, Michael A. [Research School of Astronomy and Astrophysics, Australian National University, Cotter Road, Weston, ACT 2611 (Australia)
2015-01-10
We present a new method for inferring the metallicity (Z) and ionization parameter (q) of H II regions and star-forming galaxies using strong nebular emission lines (SELs). We use Bayesian inference to derive the joint and marginalized posterior probability density functions for Z and q given a set of observed line fluxes and an input photoionization model. Our approach allows the use of arbitrary sets of SELs and the inclusion of flux upper limits. The method provides a self-consistent way of determining the physical conditions of ionized nebulae that is not tied to the arbitrary choice of a particular SEL diagnostic and uses all the available information. Unlike theoretically calibrated SEL diagnostics, the method is flexible and not tied to a particular photoionization model. We describe our algorithm, validate it against other methods, and present a tool that implements it called IZI. Using a sample of nearby extragalactic H II regions, we assess the performance of commonly used SEL abundance diagnostics. We also use a sample of 22 local H II regions having both direct and recombination line (RL) oxygen abundance measurements in the literature to study discrepancies in the abundance scale between different methods. We find that oxygen abundances derived through Bayesian inference using currently available photoionization models in the literature can be in good (∼30%) agreement with RL abundances, although some models perform significantly better than others. We also confirm that abundances measured using the direct method are typically ∼0.2 dex lower than both RL and photoionization-model-based abundances.
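The core of such an inference is simple to illustrate. The sketch below is a toy stand-in, not IZI itself: it evaluates a Gaussian flux likelihood for a hypothetical two-line model on a (Z, q) grid under flat priors, normalizes it into a joint posterior, and marginalizes over q.

```python
import math

def grid_posterior(obs_flux, sigma, model_flux, z_grid, q_grid):
    """Joint posterior on a (Z, q) grid under flat priors and Gaussian
    flux errors; model_flux(z, q, line) stands in for a photoionization model."""
    joint = [[0.0] * len(q_grid) for _ in z_grid]
    total = 0.0
    for i, z in enumerate(z_grid):
        for j, q in enumerate(q_grid):
            chi2 = sum((obs_flux[line] - model_flux(z, q, line)) ** 2
                       / sigma[line] ** 2 for line in obs_flux)
            joint[i][j] = math.exp(-0.5 * chi2)
            total += joint[i][j]
    joint = [[p / total for p in row] for row in joint]
    z_marginal = [sum(row) for row in joint]
    return joint, z_marginal
```

Flux upper limits, as used in the paper, would replace the Gaussian term for the affected lines with a cumulative (error-function) likelihood.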
Jakkareddy, Pradeep S.; Balaji, C.
2016-09-01
This paper employs the Bayesian-based Metropolis-Hastings Markov chain Monte Carlo (MH-MCMC) algorithm to solve the inverse heat transfer problem of determining the spatially varying heat transfer coefficient on a flat plate with flush-mounted discrete heat sources, from temperatures measured at the bottom of the plate. The Nusselt number is assumed to be of the form Nu = aReb(x/l)c. To input reasonable values of 'a' and 'b' into the inverse problem, limited two-dimensional conjugate convection simulations were first performed with Comsol. Guided by these, different values of 'a' and 'b' were input to a computationally simpler problem of conjugate conduction in the flat plate (15 mm thick), and temperature distributions were obtained at the bottom of the plate, which is a more convenient location for measuring temperatures without disturbing the flow. Since the goal of this work is to demonstrate the efficacy of the Bayesian approach in accurately retrieving 'a' and 'b', numerically generated temperatures with known values of 'a' and 'b' are treated as 'surrogate' experimental data. The inverse problem is then solved by repeatedly using the forward solutions together with the MH-MCMC approach. To speed up the estimation, the forward model is replaced by an artificial neural network. The mean, maximum a posteriori, and standard deviation of the estimated parameters 'a' and 'b' are reported. The robustness of the proposed method is examined by synthetically adding noise to the temperatures.
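The estimation step can be sketched compactly. The following is a minimal, hypothetical random-walk Metropolis-Hastings sampler with a toy power-law forward model standing in for the paper's neural-network surrogate; the starting point, step size, and flat priors are illustrative assumptions, not the authors' values.

```python
import math
import random

def metropolis_hastings(y_obs, forward, sigma, n_iter=20000, step=0.05, seed=1):
    """Random-walk MH over theta = (a, b) with a Gaussian likelihood and
    flat priors; forward(a, b) returns model temperatures."""
    rng = random.Random(seed)
    theta = [1.0, 0.5]  # arbitrary starting point

    def log_post(t):
        sse = sum((yo - ym) ** 2 for yo, ym in zip(y_obs, forward(*t)))
        return -0.5 * sse / sigma ** 2

    lp = log_post(theta)
    chain = []
    for _ in range(n_iter):
        prop = [t + rng.gauss(0.0, step) for t in theta]
        lp_prop = log_post(prop)
        # accept with probability min(1, posterior ratio)
        if math.log(rng.random()) < lp_prop - lp:
            theta, lp = prop, lp_prop
        chain.append(list(theta))
    return chain
```

Mirroring the paper's surrogate-data idea, data generated with known values such as a = 2.0, b = 0.8 can then be recovered from the second half of the chain.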
Li, Peng; Gong, Ping; Li, Haoni; Perkins, Edward J; Wang, Nan; Zhang, Chaoyang
2014-12-01
The Dialogue for Reverse Engineering Assessments and Methods (DREAM) project was initiated in 2006 as a community-wide effort for the development of network inference challenges for rigorous assessment of reverse engineering methods for biological networks. We participated in the in silico network inference challenge of DREAM3 in 2008. Here we report the details of our approach and its performance on the synthetic challenge datasets. In our methodology, we first developed a model called relative change ratio (RCR), which took advantage of the heterozygous knockdown data and null-mutant knockout data provided by the challenge, in order to identify the potential regulators for the genes. With this information, a time-delayed dynamic Bayesian network (TDBN) approach was then used to infer gene regulatory networks from time series trajectory datasets. Our approach considerably reduced the searching space of TDBN; hence, it gained a much higher efficiency and accuracy. The networks predicted using our approach were evaluated comparatively along with 29 other submissions by two metrics (area under the ROC curve and area under the precision-recall curve). The overall performance of our approach ranked the second among all participating teams.
Garcia, Jesus E.; Gonzalez-Lopez, Veronica A.
2010-01-01
In this work we introduce a new and richer class of finite order Markov chain models and address the following model selection problem: find the Markov model with the minimal set of parameters (minimal Markov model) which is necessary to represent a source as a Markov chain of finite order. Let $M$ denote the order of the chain and $A$ the finite alphabet. To determine the minimal Markov model, we define an equivalence relation on the state space $A^{M}$, such that all the sequences of size $M$ with the same transition probabilities are put in the same category. In this way we have one set of $(|A|-1)$ transition probabilities for each category, obtaining a model with a minimal number of parameters. We show that the model can be selected consistently using the Bayesian information criterion.
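The last step, order selection by BIC, admits a short sketch. The code below is an illustrative implementation of plain BIC order selection, not the authors' minimal-Markov-model construction (which additionally merges contexts into equivalence classes): it fits Markov chains of increasing order to a symbol sequence and picks the order with the lowest BIC.

```python
import math
from collections import Counter

def bic_markov_order(seq, alphabet, max_order):
    """Select a Markov chain order by BIC.

    For order m, the MLE log-likelihood uses transition counts
    n(context, symbol); the penalty counts |A|^m (|A| - 1) free parameters.
    """
    n = len(seq)
    best_m, best_bic = None, float("inf")
    for m in range(max_order + 1):
        counts = Counter(tuple(seq[i:i + m + 1]) for i in range(n - m))
        contexts = Counter(tuple(seq[i:i + m]) for i in range(n - m))
        loglik = sum(c * math.log(c / contexts[key[:-1]])
                     for key, c in counts.items())
        k = len(alphabet) ** m * (len(alphabet) - 1)
        bic = -2.0 * loglik + k * math.log(n - m)
        if bic < best_bic:
            best_m, best_bic = m, bic
    return best_m
```

On a long binary sequence generated by a sticky first-order chain, the selector recovers order 1: the likelihood gain over order 0 dwarfs the penalty, while order 2 doubles the parameter count for no gain.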
Bayesian inference of the resonance content of p(gamma,K^+)Lambda
De Cruz, Lesley; Vancraeyveld, Pieter; Ryckebusch, Jan
2011-01-01
A Bayesian analysis of the world's p(gamma,K^+)Lambda data is presented. We find that the following nucleon resonances have the highest probability of contributing to the reaction: S11(1535), S11(1650), F15(1680), P13(1720), D13(1900), P13(1900), P11(1900), and F15(2000). We adopt a Regge-plus-resonance framework featuring consistent couplings for nucleon resonances up to spin J=5/2. We evaluate all possible combinations of 11 candidate resonances. The best model is selected from the 2048 model variants by calculating the Bayesian evidence values against the world's p(gamma,K^+)Lambda data.
Davies, Andrew J; Hope, Max J
2015-07-15
Contingency plans are essential in guiding the response to marine oil spills. However, they are written before the pollution event occurs so must contain some degree of assumption and prediction and hence may be unsuitable for a real incident when it occurs. The use of Bayesian networks in ecology, environmental management, oil spill contingency planning and post-incident analysis is reviewed and analysed to establish their suitability for use as real-time environmental decision support systems during an oil spill response. It is demonstrated that Bayesian networks are appropriate for facilitating the re-assessment and re-validation of contingency plans following pollutant release, thus helping ensure that the optimum response strategy is adopted. This can minimise the possibility of sub-optimal response strategies causing additional environmental and socioeconomic damage beyond the original pollution event.
From least squares to multilevel modeling: A graphical introduction to Bayesian inference
Loredo, Thomas J.
2016-01-01
This tutorial presentation will introduce some of the key ideas and techniques involved in applying Bayesian methods to problems in astrostatistics. The focus will be on the big picture: understanding the foundations (interpreting probability, Bayes's theorem, the law of total probability and marginalization), making connections to traditional methods (propagation of errors, least squares, chi-squared, maximum likelihood, Monte Carlo simulation), and highlighting problems where a Bayesian approach can be particularly powerful (Poisson processes, density estimation and curve fitting with measurement error). The "graphical" component of the title reflects an emphasis on pictorial representations of some of the math, but also on the use of graphical models (multilevel or hierarchical models) for analyzing complex data. Code for some examples from the talk will be available to participants, in Python and in the Stan probabilistic programming language.
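The foundational step, Bayes's theorem with the law of total probability in the denominator, fits in a few lines. This is a generic illustration (not taken from the talk) for a discrete set of hypotheses:

```python
def posterior(prior, likelihood):
    """Bayes's theorem over discrete hypotheses:
    p(h|d) = p(d|h) p(h) / sum_h' p(d|h') p(h')."""
    evidence = sum(prior[h] * likelihood[h] for h in prior)
    return {h: prior[h] * likelihood[h] / evidence for h in prior}
```

With the classic (hypothetical) diagnostic-test numbers, a 1% base rate, 90% sensitivity, and a 5% false-positive rate, a positive result yields a posterior of only about 15%, a standard illustration of why the marginalization in the denominator matters.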
DEFF Research Database (Denmark)
Ehsani, Alireza; Sørensen, Peter; Pomp, Daniel;
2012-01-01
Background: To understand the genetic architecture of complex traits and bridge the genotype-phenotype gap, it is useful to study intermediate -omics data, e.g. the transcriptome. The present study introduces a method for simultaneous quantification of the contributions from single nucleotide...... polymorphisms (SNPs) and transcript abundances in explaining phenotypic variance, using Bayesian whole-omics models. Bayesian mixed models and variable selection models were used and, based on parameter samples from the model posterior distributions, explained variances were further partitioned at the level......-modal distribution of genomic values collapses when gene expressions are added to the model. Conclusions: With increased availability of various -omics data, integrative approaches are promising tools for understanding the genetic architecture of complex traits. Partitioning of explained variances at the chromosome...
Berradja, Khadidja; Boughanmi, Nabil
2016-09-01
In dynamic cardiac PET FDG studies the assessment of the myocardial metabolic rate of glucose (MMRG) requires knowledge of the blood input function (IF). The IF can be obtained by manual or automatic blood sampling and cross-calibrated with PET. These procedures are cumbersome, invasive and generate uncertainties. The IF is contaminated by spillover of radioactivity from the adjacent myocardium, and this can cause important errors in the estimated MMRG. In this study, we show that the IF can be extracted from the images in a rat heart study with 18F-fluorodeoxyglucose (18F-FDG) by means of Independent Component Analysis (ICA) based on Bayesian theory and a Markov chain Monte Carlo (MCMC) sampling method (BICA). Images of the hearts of rats were acquired with the Sherbrooke small animal PET scanner. A region of interest (ROI) was drawn around the rat image and decomposed into blood and tissue using BICA. The statistical study showed that there is a significant difference (p < 0.05) between the MMRG obtained with the IF extracted by BICA and that obtained with the IF extracted from measured images corrupted with spillover.
Directory of Open Access Journals (Sweden)
Kanagi Kanapathy
2014-01-01
Full Text Available The research question is whether the positive relationship found between supplier involvement practices and new product development performance in developed economies also holds in emerging economies. The role of supplier involvement practices in new product development performance is yet to be substantially investigated in emerging economies (other than China). This premise was examined by distributing a survey instrument (Jayaram's (2008) published survey instrument that has been utilised in developed economies) to Malaysian manufacturing companies. To gauge the relationship between supplier involvement practices and the new product development (NPD) project performance of 146 companies, structural equation modelling was adopted. Our findings prove that supplier involvement practices have a significant positive impact on NPD project performance in an emerging economy with respect to quality objectives, design objectives, cost objectives, and “time-to-market” objectives. Further analysis using the Bayesian Markov chain Monte Carlo algorithm, yielding a more credible and feasible differentiation, confirmed these results (even in the case of an emerging economy) and indicated that these practices have a 28% impact on the variance of NPD project performance. This considerable effect implies that supplier involvement is a must-have, although further research is needed to identify the contingencies for its practices.
Ma, Junsheng; Chan, Wenyaw; Tsai, Chu-Lin; Xiong, Momiao; Tilley, Barbara C
2015-11-30
Continuous time Markov chain (CTMC) models are often used to study the progression of chronic diseases in medical research but rarely applied to studies of the process of behavioral change. In studies of interventions to modify behaviors, a widely used psychosocial model is based on the transtheoretical model that often has more than three states (representing stages of change) and conceptually permits all possible instantaneous transitions. Very little attention is given to the study of the relationships between a CTMC model and associated covariates under the framework of transtheoretical model. We developed a Bayesian approach to evaluate the covariate effects on a CTMC model through a log-linear regression link. A simulation study of this approach showed that model parameters were accurately and precisely estimated. We analyzed an existing data set on stages of change in dietary intake from the Next Step Trial using the proposed method and the generalized multinomial logit model. We found that the generalized multinomial logit model was not suitable for these data because it ignores the unbalanced data structure and temporal correlation between successive measurements. Our analysis not only confirms that the nutrition intervention was effective but also provides information on how the intervention affected the transitions among the stages of change. We found that, compared with the control group, subjects in the intervention group, on average, spent substantively less time in the precontemplation stage and were more/less likely to move from an unhealthy/healthy state to a healthy/unhealthy state.
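Under a CTMC, the transition probabilities over an interval t follow from the generator matrix Q as P(t) = exp(Qt). The stdlib-only sketch below is illustrative (the paper's models additionally let the rates in Q depend on covariates through a log-linear link); it evaluates the matrix exponential with a truncated Taylor series, which is adequate for small ||Qt||.

```python
def matmul(A, B):
    """Plain dense matrix product for small square matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def ctmc_transition(Q, t, terms=40):
    """P(t) = exp(Q t) via truncated Taylor series; rows of Q sum to zero,
    so rows of P(t) sum to one."""
    n = len(Q)
    Qt = [[q * t for q in row] for row in Q]
    P = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in P]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, Qt)]
        P = [[P[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return P
```

For the two-state generator Q = [[-1, 1], [2, -2]], the analytic result P00(t) = 2/3 + (1/3)e^(-3t) provides a direct check.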
Directory of Open Access Journals (Sweden)
Alex Avilés
2016-01-01
Full Text Available The scarcity of water resources in mountain areas can distort normal water application patterns with, among other effects, a negative impact on water supply and river ecosystems. Knowing the probability of droughts might help to optimize a priori the planning and management of water resources in general and of Andean watersheds in particular. This study compares Markov chain- (MC) and Bayesian network- (BN) based models in drought forecasting using a recently developed drought index, with respect to their capability to characterize different drought severity states. Copula functions were used to solve the BNs and the ranked probability skill score (RPSS) to evaluate the performance of the models. Monthly rainfall and streamflow data of the Chulco River basin, located in southern Ecuador, were used to assess the performance of both approaches. Global evaluation results revealed that the MC-based models better predict wet and dry periods, while the BN-based models generate slightly more accurate forecasts of the most severe droughts. However, evaluation of monthly results reveals that, for each month of the hydrological year, either the MC- or the BN-based model provides better forecasts. The presented approach could be of assistance to water managers to ensure that timely decision-making on drought response is undertaken.
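The RPSS used for evaluation compares cumulative forecast and outcome probabilities over the ordered severity categories against a reference forecast, typically climatology. A minimal sketch under those assumptions:

```python
def rps(forecast_probs, outcome_index):
    """Ranked probability score for one forecast over ordered categories:
    sum of squared differences between cumulative forecast and outcome."""
    cum_f, cum_o, score = 0.0, 0.0, 0.0
    for k, p in enumerate(forecast_probs):
        cum_f += p
        cum_o += 1.0 if k == outcome_index else 0.0
        score += (cum_f - cum_o) ** 2
    return score

def rpss(forecasts, outcomes, climatology):
    """RPSS = 1 - RPS_model / RPS_reference; 1 is perfect, 0 matches
    the climatological baseline, negative is worse than the baseline."""
    rps_m = sum(rps(f, o) for f, o in zip(forecasts, outcomes))
    rps_c = sum(rps(climatology, o) for o in outcomes)
    return 1.0 - rps_m / rps_c
```

A perfect (one-hot) forecast sequence scores RPSS = 1 against a uniform climatology, which is the sense in which the paper ranks the MC- and BN-based models.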
Seichter, Felicia; Vogt, Josef; Radermacher, Peter; Mizaikoff, Boris
2017-01-25
The calibration of analytical systems is time-consuming, and the effort for daily calibration routines should therefore be minimized while maintaining analytical accuracy and precision. The 'calibration transfer' approach proposes to combine calibration data already recorded with actual calibration measurements. However, this strategy was developed for the multivariate, linear analysis of spectroscopic data, and thus cannot be applied to sensors with a single response channel and/or a non-linear relationship between signal and desired analyte concentration. To fill this gap for a non-linear calibration equation, we assume that the coefficients for the equation, collected over several calibration runs, are normally distributed. Considering that the coefficients of an actual calibration are a sample of this distribution, only a few standards are needed for a complete calibration data set. The resulting calibration transfer approach is demonstrated for a fluorescence oxygen sensor and implemented as a hierarchical Bayesian model, combined with a Lagrange multipliers technique and Markov chain Monte Carlo sampling. The latter provides realistic estimates for coefficients and predictions together with accurate error bounds by simulating known measurement errors and system fluctuations. Performance criteria for validation and optimal selection of a reduced set of calibration samples were developed and lead to a setup which maintains the analytical performance of a full calibration. Strategies for a rapid determination of problems occurring in a daily calibration routine are proposed, thereby opening the possibility of correcting the problem just in time.
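The pooling of past calibration coefficients can be illustrated with the simplest conjugate case: if previous runs give a normal prior N(mu0, tau^2) for a coefficient and the new, reduced calibration yields a few estimates with known variance, the posterior is available in closed form. This is a didactic normal-normal sketch, not the paper's full hierarchical model:

```python
def normal_update(mu0, tau2, obs, sigma2):
    """Conjugate normal-normal update for one calibration coefficient.

    Prior N(mu0, tau2) pooled from past runs; obs are new estimates of the
    coefficient, each with known sampling variance sigma2. Returns the
    posterior mean and variance."""
    precision = 1.0 / tau2 + len(obs) / sigma2
    mean = (mu0 / tau2 + sum(obs) / sigma2) / precision
    return mean, 1.0 / precision
```

Two new estimates of 1.0 against a standard-normal prior shrink the posterior mean to 2/3 with variance 1/3, showing how a handful of fresh standards refines, rather than replaces, the pooled history.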
Bayesian inference of the resonance content of p(γ, K+)Λ
Directory of Open Access Journals (Sweden)
Ryckebusch J.
2012-12-01
Full Text Available A Bayesian analysis of the world’s p(γ, K+)Λ data is presented. We adopt a Regge-plus-resonance framework featuring consistent couplings for nucleon resonances up to spin J = 5/2, and evaluate 2048 model variants considering all possible combinations of 11 candidate resonances. The best model, labeled RPR-2011, is discussed with special emphasis on nucleon resonances in the 1900-MeV mass region.
Bayesian inference of the resonance content of p(gamma,K+)Lambda
Vancraeyveld, Pieter; Ryckebusch, Jan; Vrancx, Tom
2012-01-01
A Bayesian analysis of the world's p(gamma,K+)Lambda data is presented. We adopt a Regge-plus-resonance framework featuring consistent couplings for nucleon resonances up to spin J=5/2, and evaluate 2048 model variants considering all possible combinations of 11 candidate resonances. The best model, labeled RPR-2011, is discussed with special emphasis on nucleon resonances in the 1900-MeV mass region.
Safner, T.; Miller, M.P.; McRae, B.H.; Fortin, M.-J.; Manel, S.
2011-01-01
Recently, techniques available for identifying clusters of individuals or boundaries between clusters using genetic data from natural populations have expanded rapidly. Consequently, there is a need to evaluate these different techniques. We used spatially-explicit simulation models to compare three spatial Bayesian clustering programs and two edge detection methods. Spatially-structured populations were simulated where a continuous population was subdivided by barriers. We evaluated the ability of each method to correctly identify boundary locations while varying: (i) time after divergence, (ii) strength of isolation by distance, (iii) level of genetic diversity, and (iv) amount of gene flow across barriers. To further evaluate the methods' effectiveness in detecting genetic clusters in natural populations, we used previously published data on North American pumas and a European shrub. Our results show that with simulated and empirical data, the Bayesian spatial clustering algorithms outperformed direct edge detection methods. All methods incorrectly detected boundaries in the presence of strong patterns of isolation by distance. Based on this finding, we support the application of Bayesian spatial clustering algorithms for boundary detection in empirical datasets, with necessary tests for the influence of isolation by distance. © 2011 by the authors; licensee MDPI, Basel, Switzerland.
Mondal, A.
2010-03-01
In this paper, we study the uncertainty quantification in inverse problems for flows in heterogeneous porous media. Reversible jump Markov chain Monte Carlo algorithms (MCMC) are used for hierarchical modeling of channelized permeability fields. Within each channel, the permeability is assumed to have a lognormal distribution. Uncertainty quantification in history matching is carried out hierarchically by constructing geologic facies boundaries as well as permeability fields within each facies using dynamic data such as production data. The search with Metropolis-Hastings algorithm results in very low acceptance rate, and consequently, the computations are CPU demanding. To speed-up the computations, we use a two-stage MCMC that utilizes upscaled models to screen the proposals. In our numerical results, we assume that the channels intersect the wells and the intersection locations are known. Our results show that the proposed algorithms are capable of capturing the channel boundaries and describe the permeability variations within the channels using dynamic production history at the wells. © 2009 Elsevier Ltd. All rights reserved.
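The screening idea generalizes beyond flow problems. The sketch below is a generic illustration, not the authors' implementation: a two-stage (delayed-acceptance) Metropolis-Hastings step in which a cheap coarse posterior filters proposals, the expensive fine posterior is evaluated only for survivors, and the second-stage correction preserves the fine posterior as the stationary distribution.

```python
import math
import random

def two_stage_mh(log_post_fine, log_post_coarse, propose, theta0,
                 n_iter=20000, seed=0):
    """Two-stage Metropolis-Hastings with a symmetric proposal.

    Stage 1 accepts/rejects with the coarse (e.g. upscaled) posterior;
    stage 2 applies the Christen-Fox correction with the fine posterior."""
    rng = random.Random(seed)
    theta = theta0
    lc = log_post_coarse(theta)
    lf = log_post_fine(theta)
    chain = [theta]
    for _ in range(n_iter):
        prop = propose(theta, rng)
        lc_p = log_post_coarse(prop)
        # stage 1: screen with the cheap coarse posterior
        if math.log(rng.random()) < lc_p - lc:
            lf_p = log_post_fine(prop)
            # stage 2: correct so the fine posterior is targeted exactly
            if math.log(rng.random()) < (lf_p - lf) - (lc_p - lc):
                theta, lc, lf = prop, lc_p, lf_p
        chain.append(theta)
    return chain
```

Because rejections at stage 1 never touch the fine model, the savings grow with the cost ratio between the two solvers, the same mechanism that makes the paper's upscaled screening effective.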
Langmore, Ian; Davis, Anthony B.; Bal, Guillaume; Marzouk, Youssef M.
2012-01-01
We describe a method for accelerating a 3D Monte Carlo forward radiative transfer model to the point where it can be used in a new kind of Bayesian retrieval framework. The remote sensing challenge is to detect and quantify a chemical effluent of a known absorbing gas produced by an industrial facility in a deep valley. The available data is a single low-resolution noisy image of the scene in the near IR at an absorbing wavelength for the gas of interest. The detected sunlight has been multiply reflected by the variable terrain and/or scattered by an aerosol that is assumed partially known and partially unknown. We thus introduce a new class of remote sensing algorithms best described as "multi-pixel" techniques that necessarily call for a 3D radiative transfer model (but demonstrated here in 2D); they can be added to conventional ones that exploit typically multi- or hyper-spectral data, sometimes with multi-angle capability, with or without information about polarization. The novel Bayesian inference methodology uses adaptively, with efficiency in mind, the fact that a Monte Carlo forward model has a known and controllable uncertainty depending on the number of sun-to-detector paths used.
Directory of Open Access Journals (Sweden)
Fonnesbeck, C. J.
2004-06-01
Full Text Available When endeavoring to make informed decisions, conservation biologists must frequently contend with disparate sources of data and competing hypotheses about the likely impacts of proposed decisions on the resource status. Frequently, statistical analyses, modeling (e.g., for population projection), and optimization or simulation are conducted as separate exercises. For example, a population model might be constructed, whose parameters are then estimated from data (e.g., ringing studies, population surveys). This model might then be used to predict future population states, from current population estimates, under a particular management regime. Finally, the parameterized model might also be used to evaluate alternative candidate management decisions, via simulation, optimization, or both. This approach, while effective, does not take full advantage of the integration of data and model components for prediction and updating; we propose a hierarchical Bayesian context for this integration. In the case of American black ducks (Anas rubripes), managers are simultaneously faced with trying to extract a sustainable harvest from the species, while maintaining individual stocks above acceptable thresholds. The problem is complicated by spatial heterogeneity in the growth rates and carrying capacity of black duck stocks, movement between stocks, regional differences in the intensity of harvest pressure, and heterogeneity in the degree of competition from a close congener, the mallard (Anas platyrhynchos), among stocks. We have constructed a population life cycle model that takes these components into account and simultaneously performs parameter estimation and population prediction in a Bayesian framework. Ringing data are used to develop posterior predictive distributions for harvest mortality rates, given as input decisions about harvest regulations. Population surveys of black ducks and mallards are used to obtain stock-specific estimates of population size for
Likelihood-based inference for clustered line transect data
DEFF Research Database (Denmark)
Waagepetersen, Rasmus; Schweder, Tore
2006-01-01
The uncertainty in estimation of spatial animal density from line transect surveys depends on the degree of spatial clustering in the animal population. To quantify the clustering we model line transect data as independent thinnings of spatial shot-noise Cox processes. Likelihood-based inference...... is implemented using Markov chain Monte Carlo (MCMC) methods to obtain efficient estimates of spatial clustering parameters. Uncertainty is addressed using parametric bootstrap or by consideration of posterior distributions in a Bayesian setting. Maximum likelihood estimation and Bayesian inference are compared...
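The parametric bootstrap mentioned above follows a generic recipe: simulate replicate datasets from the fitted model, re-estimate on each, and read off empirical quantiles. A minimal percentile-interval sketch (illustrative, not the line-transect machinery itself):

```python
import random

def parametric_bootstrap_ci(estimate, simulate, theta_hat, n,
                            reps=1000, level=0.95, seed=0):
    """Percentile bootstrap interval: simulate(theta, n, rng) draws one
    dataset from the fitted model; estimate(data) returns a point estimate."""
    rng = random.Random(seed)
    stats = sorted(estimate(simulate(theta_hat, n, rng)) for _ in range(reps))
    lo = stats[int(reps * (1 - level) / 2)]
    hi = stats[int(reps * (1 + level) / 2)]
    return lo, hi
```

For instance, bootstrapping the mean of an exponential model fitted with rate parameter 1/2 yields an interval centered near 2 whose width shrinks with the sample size n.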
DEFF Research Database (Denmark)
Heller, Rasmus; Chikhi, Lounes; Siegismund, Hans
2013-01-01
Many coalescent-based methods aiming to infer the demographic history of populations assume a single, isolated and panmictic population (i.e. a Wright-Fisher model). While this assumption may be reasonable under many conditions, several recent studies have shown that the results can be misleading...
Directory of Open Access Journals (Sweden)
Mariana Inés Pocovi
2015-06-01
Full Text Available Understanding the population structure and genetic diversity of sugarcane (Saccharum officinarum L.) accessions from the INTA germplasm bank (Argentina) will be of great importance for germplasm collection and breeding improvement, as it will identify diverse parental combinations to create segregating progenies with maximum genetic variability for further selection. A Bayesian approach, ordination methods (PCoA, Principal Coordinate Analysis), and clustering analysis (UPGMA, Unweighted Pair Group Method with Arithmetic Mean) were applied to this purpose. Sixty-three INTA sugarcane hybrids were genotyped for 107 Simple Sequence Repeat (SSR) and 136 Amplified Fragment Length Polymorphism (AFLP) loci. Given the low probability values found with AFLP for individual assignment (4.7%), microsatellites seemed to perform better (54%) for the STRUCTURE analysis, which revealed the germplasm to exist in five optimum groups partly corresponding to their origin. Although the clusters showed a high degree of admixture, F ST values confirmed the existence of differences among groups. Dissimilarity coefficients ranged from 0.079 to 0.651. PCoA separated the sugarcane genotypes into groups that did not agree with those identified by STRUCTURE. Clustering including all genotypes likewise showed no resemblance to the populations found by STRUCTURE, but clustering performed considering only individuals displaying a proportional membership > 0.6 in their primary population obtained with STRUCTURE showed close similarities. The Bayesian method indubitably brought more information on cultivar origins than classical PCoA and the hierarchical clustering method.
Bayesian inference in genetic parameter estimation of visual scores in Nellore beef-cattle
2009-01-01
The aim of this study was to estimate the components of variance and genetic parameters for the visual scores which constitute the Morphological Evaluation System (MES), such as body structure (S), precocity (P) and musculature (M) in Nellore beef-cattle at the weaning and yearling stages, by using threshold Bayesian models. The information used for this was gleaned from visual scores of 5,407 animals evaluated at the weaning and 2,649 at the yearling stages. The genetic parameters for visual score traits were estimated through two-trait analysis, using the threshold animal model, with Bayesian statistics methodology and MTGSAM (Multiple Trait Gibbs Sampler for Animal Models) threshold software. Heritability estimates for S, P and M were 0.68, 0.65 and 0.62 (at weaning) and 0.44, 0.38 and 0.32 (at the yearling stage), respectively. Heritability estimates for S, P and M were found to be high, and so it is expected that these traits should respond favorably to direct selection. The visual scores evaluated at the weaning and yearling stages might be used in the composition of new selection indexes, as they presented sufficient genetic variability to promote genetic progress in such morphological traits. PMID:21637450
A cost minimisation and Bayesian inference model predicts startle reflex modulation across species.
Bach, Dominik R
2015-04-07
In many species, rapid defensive reflexes are paramount to escaping acute danger. These reflexes are modulated by the state of the environment. This is exemplified in fear-potentiated startle, a more vigorous startle response during conditioned anticipation of an unrelated threatening event. Extant explanations of this phenomenon build on descriptive models of underlying psychological states, or neural processes. Yet, they fail to predict invigorated startle during reward anticipation and instructed attention, and do not explain why startle reflex modulation evolved. Here, we fill this lacuna by developing a normative cost minimisation model based on Bayesian optimality principles. This model predicts the observed pattern of startle modification by rewards, punishments, instructed attention, and several other states. Moreover, the mathematical formalism furnishes predictions that can be tested experimentally. Comparing the model with existing data suggests a specific neural implementation of the underlying computations which yields close approximations to the optimal solution under most circumstances. This analysis puts startle modification into the framework of Bayesian decision theory and predictive coding, and illustrates the importance of an adaptive perspective to interpret defensive behaviour across species. Copyright © 2015 The Author. Published by Elsevier Ltd. All rights reserved.
Blanc, Guillermo A; Vogt, Frédéric P A; Dopita, Michael A
2014-01-01
We present a new method for inferring the metallicity (Z) and ionization parameter (q) of HII regions and star-forming galaxies using strong nebular emission lines (SEL). We use Bayesian inference to derive the joint and marginalized posterior probability density functions for Z and q given a set of observed line fluxes and an input photo-ionization model. Our approach allows the use of arbitrary sets of SELs and the inclusion of flux upper limits. The method provides a self-consistent way of determining the physical conditions of ionized nebulae that is not tied to the arbitrary choice of a particular SEL diagnostic and uses all the available information. Unlike theoretically calibrated SEL diagnostics the method is flexible and not tied to a particular photo-ionization model. We describe our algorithm, validate it against other methods, and present a tool that implements it called IZI. Using a sample of nearby extra-galactic HII regions we assess the performance of commonly used SEL abundance diagnostics. W...
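The grid-based joint and marginalized posteriors described above can be sketched in a few lines. The "photo-ionization model" below is a toy linear stand-in, and every name and number is an illustrative assumption, not the IZI implementation:

```python
import numpy as np

def posterior_grid(obs, err, Z_grid, q_grid, model):
    """Joint posterior p(Z, q | obs) on a grid, assuming Gaussian flux
    errors and a flat prior over the grid."""
    logp = np.zeros((len(Z_grid), len(q_grid)))
    for i, Z in enumerate(Z_grid):
        for j, q in enumerate(q_grid):
            pred = model(Z, q)
            logp[i, j] = -0.5 * np.sum(((obs - pred) / err) ** 2)
    p = np.exp(logp - logp.max())
    return p / p.sum()                       # normalized joint posterior

# Toy "photo-ionization model": two line fluxes depending linearly on (Z, q).
model = lambda Z, q: np.array([Z + 0.5 * q, Z - 0.5 * q])

Z_grid = np.linspace(0.0, 2.0, 101)
q_grid = np.linspace(-1.0, 1.0, 101)
obs = model(1.0, 0.4) + np.array([0.01, -0.01])   # slightly perturbed fluxes
err = np.array([0.05, 0.05])

post = posterior_grid(obs, err, Z_grid, q_grid, model)
pZ = post.sum(axis=1)           # marginalized posterior over Z
Z_map = Z_grid[np.argmax(pZ)]   # marginal mode, close to the true Z = 1.0
```

Flux upper limits, as mentioned in the abstract, would replace the Gaussian term for the censored line with the integral of the error model below the limit.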
Energy Technology Data Exchange (ETDEWEB)
Marzouk, Youssef; Fast P. (Lawrence Livermore National Laboratory, Livermore, CA); Kraus, M. (Peterson AFB, CO); Ray, J. P.
2006-01-01
Terrorist attacks using an aerosolized pathogen preparation have gained credibility as a national security concern after the anthrax attacks of 2001. The ability to characterize such attacks, i.e., to estimate the number of people infected, the time of infection, and the average dose received, is important when planning a medical response. We address this question of characterization by formulating a Bayesian inverse problem predicated on a short time-series of diagnosed patients exhibiting symptoms. To be of relevance to response planning, we limit ourselves to 3-5 days of data. In tests performed with anthrax as the pathogen, we find that these data are usually sufficient, especially if the model of the outbreak used in the inverse problem is an accurate one. In some cases the scarcity of data may initially support outbreak characterizations at odds with the true one, but with sufficient data the correct inferences are recovered; in other words, the inverse problem posed and its solution methodology are consistent. We also explore the effect of model error: situations for which the model used in the inverse problem is only a partially accurate representation of the outbreak, where the model predictions and the observations differ by more than a random noise. We find that while there is a consistent discrepancy between the inferred and the true characterizations, they are also close enough to be of relevance when planning a response.
Inferring Alcoholism SNPs and Regulatory Chemical Compounds Based on Ensemble Bayesian Network.
Chen, Huan; Sun, Jiatong; Jiang, Hong; Wang, Xianyue; Wu, Lingxiang; Wu, Wei; Wang, Qh
2016-12-20
The disturbance of consciousness is one of the most common symptoms in those who have alcoholism and may cause disability and mortality. Previous studies indicated that several single nucleotide polymorphisms (SNPs) increase the susceptibility to alcoholism. In this study, we utilized the Ensemble Bayesian Network (EBN) method to identify causal SNPs of alcoholism based on the verified GAW14 data. Thirteen out of eighteen SNPs directly connected with alcoholism were found to be concordant with potential risk regions of alcoholism in the OMIM database. As a number of SNPs were found to contribute to alterations in gene expression, known as expression quantitative trait loci (eQTLs), we further sought to identify chemical compounds acting as regulators of alcoholism genes captured by causal SNPs. Chloroprene and valproic acid were identified as the expression regulators for the genes C11orf66 and SALL3, respectively, which were captured by alcoholism SNPs.
Prudhomme, Serge
2015-01-07
The need for surrogate models and adaptive methods can be best appreciated if one is interested in parameter estimation using a Bayesian calibration procedure for validation purposes. We extend here our latest work on error decomposition and adaptive refinement for response surfaces to the development of surrogate models that can be substituted for the full models to estimate the parameters of Reynolds-averaged Navier-Stokes models. The error estimates and adaptive schemes are driven here by a quantity of interest and are thus based on the approximation of an adjoint problem. We will focus in particular on the accurate estimation of evidences to facilitate model selection. The methodology will be illustrated on the Spalart-Allmaras RANS model for turbulence simulation.
DEFF Research Database (Denmark)
Møller, Jesper; Rasmussen, Jakob Gulddahl
We introduce a flexible spatial point process model for spatial point patterns exhibiting linear structures, without incorporating a latent line process. The model is given by an underlying sequential point process model, i.e. each new point is generated given the previous points. Under this model...... the points can be of one of three types: a ‘background point’, an ‘independent cluster point’, or a ‘dependent cluster point’. The background and independent cluster points are thought to exhibit ‘complete spatial randomness’, while the conditional distribution of a dependent cluster point given the previous...... points is such that the dependent cluster point is likely to occur closely to a previous cluster point. We demonstrate the flexibility of the model for producing point patterns with linear structures, and propose to use the model as the likelihood in a Bayesian setting when analyzing a spatial point...
Formulating Quantum Theory as a Causally Neutral Theory of Bayesian Inference
Leifer, M S
2011-01-01
Quantum theory can be viewed as a generalization of classical probability theory, but the analogy as it has been developed so far is not complete. Classical probability theory is independent of causal structure, whereas the conventional quantum formalism requires causal structure to be fixed in advance. In this paper, we develop the formalism of quantum conditional states, which unifies the description of experiments involving two systems at a single time with the description of those involving a single system at two times. The analogies between quantum theory and classical probability theory are expressed succinctly within the formalism and it unifies the mathematical description of distinct concepts, such as ensemble preparation procedures, measurements, and quantum dynamics. We introduce a quantum generalization of Bayes' theorem and the associated notion of Bayesian conditioning. Conditioning a quantum state on a classical variable is the correct rule for updating quantum states in light of classical data...
Bayesian inference for functional response in a stochastic predator-prey system.
Gilioli, Gianni; Pasquali, Sara; Ruggeri, Fabrizio
2008-02-01
We present a Bayesian method for functional response parameter estimation starting from time series of field data on predator-prey dynamics. Population dynamics is described by a system of stochastic differential equations in which behavioral stochasticities are represented by noise terms affecting each population as well as their interaction. We focus on the estimation of a behavioral parameter appearing in the functional response of predator to prey abundance when a small number of observations is available. To deal with small sample sizes, latent data are introduced between each pair of field observations and are considered as missing data. The method is applied to both simulated and observational data. The results obtained using different numbers of latent data are compared with those achieved following a frequentist approach. As a case study, we consider an acarine predator-prey system relevant to biological control problems.
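As a rough illustration of the forward model in such a study, here is an Euler-Maruyama discretisation of a stochastic Lotka-Volterra system with noise on each population and on the interaction term. Parameter values, noise scales and the positivity clamp are assumptions for this sketch, not those of the paper:

```python
import numpy as np

def simulate(x0, y0, a=1.0, b=0.5, c=0.5, d=0.2, sigma=0.02,
             dt=0.01, n=2000, rng=None):
    """Euler-Maruyama path of a stochastic predator-prey system; dW has one
    component per population plus one shared component on the interaction."""
    rng = rng or np.random.default_rng(0)
    x, y = np.empty(n + 1), np.empty(n + 1)
    x[0], y[0] = x0, y0
    for k in range(n):
        dW = rng.normal(0.0, np.sqrt(dt), size=3)
        inter = b * x[k] * y[k]                   # functional response term
        x[k + 1] = x[k] + (a * x[k] - inter) * dt \
                   + sigma * x[k] * dW[0] + sigma * inter * dW[2]
        y[k + 1] = y[k] + (c * inter - d * y[k]) * dt \
                   + sigma * y[k] * dW[1] - sigma * inter * dW[2]
        x[k + 1] = max(x[k + 1], 1e-9)            # keep populations positive
        y[k + 1] = max(y[k + 1], 1e-9)
    return x, y

prey, pred = simulate(1.0, 1.0)
```

In the paper's setting, latent states at this fine time step would be imputed between the sparse field observations and sampled as missing data.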
Bayesian inference of non-positive spectral functions in quantum field theory
Rothkopf, Alexander
2016-01-01
We present the generalization to non-positive-definite spectral functions of a recently proposed Bayesian deconvolution approach (BR method). The novel prior used here retains many of the beneficial analytic properties of the original method; in particular, it allows us to integrate out the hyperparameter $\alpha$ directly. To preserve the underlying axiom of scale invariance, we introduce a second default-model related function, whose role is discussed. Our reconstruction prescription is contrasted with existing direct methods, as well as with an approach where shift functions are introduced to compensate for negative spectral features. A mock spectrum analysis inspired by the study of gluon spectral functions in QCD illustrates the capabilities of this new approach.
MacCallum, Justin L; Perez, Alberto; Dill, Ken A
2015-06-02
More than 100,000 protein structures are now known at atomic detail. However, far more are not yet known, particularly among large or complex proteins. Often, experimental information is only semireliable because it is uncertain, limited, or confusing in important ways. Some experiments give sparse information, some give ambiguous or nonspecific information, and others give uncertain information-where some is right, some is wrong, but we don't know which. We describe a method called Modeling Employing Limited Data (MELD) that can harness such problematic information in a physics-based, Bayesian framework for improved structure determination. We apply MELD to eight proteins of known structure for which such problematic structural data are available, including a sparse NMR dataset, two ambiguous EPR datasets, and four uncertain datasets taken from sequence evolution data. MELD gives excellent structures, indicating its promise for experimental biomolecule structure determination where only semireliable data are available.
Off-grid Direction of Arrival Estimation Using Sparse Bayesian Inference
Yang, Zai; Zhang, Cishen
2011-01-01
This paper is focused on solving the narrowband direction of arrival estimation problem from a sparse signal reconstruction perspective. Existing sparsity-based methods have shown advantages over conventional ones but exhibit limitations in practical situations where the true directions are not in the sampling grid. A so-called off-grid model is broached to reduce the modeling error caused by the off-grid directions. An iterative algorithm is proposed in this paper to solve the resulting problem from a Bayesian perspective while joint sparsity among different snapshots is exploited by assuming the same Laplace prior. Like existing sparsity-based methods, the new approach applies to arbitrary sensor array and exhibits increased resolution and improved robustness to noise and source correlation. Moreover, our approach results in more accurate direction of arrival estimation, e.g., smaller bias and lower mean squared error. High precision can be obtained with a coarse sampling grid and, meanwhile, computational ...
Bayesian Inference for Reliability of Systems and Networks Using the Survival Signature.
Aslett, Louis J M; Coolen, Frank P A; Wilson, Simon P
2015-09-01
The concept of survival signature has recently been introduced as an alternative to the signature for reliability quantification of systems. While these two concepts are closely related for systems consisting of a single type of component, the survival signature is also suitable for systems with multiple types of component, which is not the case for the signature. This also enables the use of the survival signature for reliability of networks. In this article, we present the use of the survival signature for reliability quantification of systems and networks from a Bayesian perspective. We assume that data are available on tested components that are exchangeable with those in the actual system or network of interest. These data consist of failure times and possibly right-censoring times. We present both a nonparametric and parametric approach.
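The survival signature itself is easy to compute by enumeration for small systems. The sketch below uses an assumed parallel-series example with two component types (not taken from the article) and combines the signature with independent component survival probabilities to get system reliability:

```python
from itertools import combinations
from math import comb

# Assumed example system: type-1 components {a1, a2} in parallel, in series
# with type-2 components {b1, b2} in parallel.
type1, type2 = ["a1", "a2"], ["b1", "b2"]

def works(up):
    """Structure function: does the system work given the set of up components?"""
    return (("a1" in up) or ("a2" in up)) and (("b1" in up) or ("b2" in up))

def survival_signature(l1, l2):
    """Fraction of states with exactly l1 type-1 and l2 type-2 components
    functioning in which the system works (all such states equally likely)."""
    total, working = 0, 0
    for s1 in combinations(type1, l1):
        for s2 in combinations(type2, l2):
            total += 1
            working += works(set(s1) | set(s2))
    return working / total

def system_reliability(p1, p2):
    """P(system works) when each type-k component survives independently
    with probability pk: sum the signature against binomial weights."""
    r = 0.0
    for l1 in range(3):
        for l2 in range(3):
            r += (survival_signature(l1, l2)
                  * comb(2, l1) * p1**l1 * (1 - p1)**(2 - l1)
                  * comb(2, l2) * p2**l2 * (1 - p2)**(2 - l2))
    return r
```

For this parallel-series layout the result matches the closed form (1 - (1-p1)^2)(1 - (1-p2)^2); in the Bayesian setting of the article, the binomial survival probabilities would be replaced by posterior predictive probabilities from the test data.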
Nonparametric Bayesian Inference for Mean Residual Life Functions in Survival Analysis
Poynor, Valerie; Kottas, Athanasios
2014-01-01
Modeling and inference for survival analysis problems typically revolves around different functions related to the survival distribution. Here, we focus on the mean residual life function which provides the expected remaining lifetime given that a subject has survived (i.e., is event-free) up to a particular time. This function is of direct interest in reliability, medical, and actuarial fields. In addition to its practical interpretation, the mean residual life function characterizes the sur...
Open-Universe Theory for Bayesian Inference, Decision, and Sensing (OUTBIDS)
2014-01-01
categorization using latent conditional random field models. One disadvantage of the approach in [19] is that the parameters of the classifiers are... jointly using graph cuts. Evaluation on the Graz02 dataset showed improvements in two categories (bicycle and cars) and no improvement on people... Another disadvantage of the approach in [19] is that inference becomes extremely slow as the number of nodes (pixels or superpixels) increases. This is
Directory of Open Access Journals (Sweden)
Oliver Ratmann
Full Text Available A key priority in infectious disease research is to understand the ecological and evolutionary drivers of viral diseases from data on disease incidence as well as viral genetic and antigenic variation. We propose using a simulation-based, Bayesian method known as Approximate Bayesian Computation (ABC to fit and assess phylodynamic models that simulate pathogen evolution and ecology against summaries of these data. We illustrate the versatility of the method by analyzing two spatial models describing the phylodynamics of interpandemic human influenza virus subtype A(H3N2. The first model captures antigenic drift phenomenologically with continuously waning immunity, and the second epochal evolution model describes the replacement of major, relatively long-lived antigenic clusters. Combining features of long-term surveillance data from The Netherlands with features of influenza A (H3N2 hemagglutinin gene sequences sampled in northern Europe, key phylodynamic parameters can be estimated with ABC. Goodness-of-fit analyses reveal that the irregularity in interannual incidence and H3N2's ladder-like hemagglutinin phylogeny are quantitatively only reproduced under the epochal evolution model within a spatial context. However, the concomitant incidence dynamics result in a very large reproductive number and are not consistent with empirical estimates of H3N2's population level attack rate. These results demonstrate that the interactions between the evolutionary and ecological processes impose multiple quantitative constraints on the phylodynamic trajectories of influenza A(H3N2, so that sequence and surveillance data can be used synergistically. ABC, one of several data synthesis approaches, can easily interface a broad class of phylodynamic models with various types of data but requires careful calibration of the summaries and tolerance parameters.
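A minimal ABC rejection sampler conveys the idea of fitting by simulation. The toy Poisson "simulator", prior, summary statistic and tolerance below are all illustrative assumptions, far simpler than the phylodynamic models of the abstract:

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate(rate, n=100):
    """Toy incidence simulator: n Poisson counts at a given rate."""
    return rng.poisson(rate, size=n)

obs = simulate(4.0)          # stands in for the surveillance data
s_obs = obs.mean()           # summary statistic of the observed data

accepted = []
for _ in range(20000):
    theta = rng.uniform(0.0, 10.0)      # draw from the prior
    s_sim = simulate(theta).mean()      # summary of simulated data
    if abs(s_sim - s_obs) < 0.1:        # keep draws within the tolerance
        accepted.append(theta)

posterior_mean = float(np.mean(accepted))   # approximate posterior mean
```

As the abstract stresses, the choice of summaries and tolerance must be calibrated carefully; with a poor summary the accepted draws approximate the wrong conditional distribution.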
Partial inversion of elliptic operator to speed up computation of likelihood in Bayesian inference
Litvinenko, Alexander
2017-08-09
In this paper, we speed up the solution of inverse problems in Bayesian settings. In computing the likelihood, the most expensive part of the Bayesian formula, one compares the available measurement data with the simulated data. To get simulated data, repeated solution of the forward problem is required, which can be a great challenge. Often, the available measurement is a functional $F(u)$ of the solution $u$ or a small part of $u$. Typical examples of $F(u)$ are the solution at a point, the solution on a coarser grid or in a small subdomain, and the mean value in a subdomain. It is a waste of computational resources to evaluate first the whole solution and then compute only a part of it. In this work, we compute the functional $F(u)$ directly, without computing the full inverse operator and without computing the whole solution $u$. The main ingredients of the developed approach are the hierarchical domain decomposition technique, the finite element method and Schur complements. To speed up computations and to reduce the storage cost, we approximate the forward operator and the Schur complement in the hierarchical matrix format. Applying the hierarchical matrix technique, we reduce the computing cost to $\mathcal{O}(k^2 n \log^2 n)$, where $k \ll n$ and $n$ is the number of degrees of freedom. Up to the $\mathcal{H}$-matrix accuracy, the computation of the functional $F(u)$ is exact. To reduce the computational resources further, we can approximate $F(u)$ on, for instance, multiple coarse meshes. The offered method is well suited for solving multiscale problems. A disadvantage of this method is the assumption that one has to have access to the discretisation and to the procedure of assembling the Galerkin matrix.
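The core idea of recovering only the observed part of the solution through a Schur complement, without forming the whole solution, can be sketched with dense linear algebra (no hierarchical matrices; the matrix below is a random SPD stand-in for a Galerkin matrix):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 50, 5                        # total dofs, observed dofs (last m entries)
A = rng.standard_normal((n, n))
A = A @ A.T + n * np.eye(n)         # SPD toy "Galerkin" matrix
f = rng.standard_normal(n)

I, B = slice(0, n - m), slice(n - m, n)
A_II, A_IB = A[I, I], A[I, B]
A_BI, A_BB = A[B, I], A[B, B]

# Eliminate the interior block: one solve with A_II covers both A_IB and f_I.
X = np.linalg.solve(A_II, np.column_stack([A_IB, f[I]]))
S = A_BB - A_BI @ X[:, :m]                   # Schur complement on the B-block
u_B = np.linalg.solve(S, f[B] - A_BI @ X[:, m])   # only the observed entries

u_full = np.linalg.solve(A, f)      # reference: full solve, for comparison
```

Here the elimination of the interior block is exact; the paper's contribution is doing this hierarchically with H-matrix approximations of the forward operator and the Schur complements, so the full solve is never needed.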
Nakatani-Webster, Eri; Nath, Abhinav
2017-03-14
Amyloid formation is implicated in a number of human diseases, and is thought to proceed via a nucleation-dependent polymerization mechanism. Experimenters often wish to relate changes in amyloid formation kinetics, for example, in response to small molecules to specific mechanistic steps along this pathway. However, fitting kinetic fibril formation data to a complex model including explicit rate constants results in an ill-posed problem with a vast number of potential solutions. The levels of uncertainty remaining in parameters calculated from these models, arising both from experimental noise and high levels of degeneracy or codependency in parameters, is often unclear. Here, we demonstrate that a combination of explicit mathematical models with an approximate Bayesian computation approach can be used to assign the mechanistic effects of modulators on amyloid fibril formation. We show that even when exact rate constants cannot be extracted, parameters derived from these rate constants can be recovered and used to assign mechanistic effects and their relative magnitudes with a great deal of confidence. Furthermore, approximate Bayesian computation provides a robust method for visualizing uncertainty remaining in the model parameters, regardless of its origin. We apply these methods to the problem of heparin-mediated tau polymerization, which displays complex kinetic behavior not amenable to analysis by more traditional methods. Our analysis indicates that the role of heparin cannot be explained by enhancement of nucleation alone, as has been previously proposed. The methods described here are applicable to a wide range of systems, as models can be easily adapted to account for new reactions and reversibility.
Exploring the Connection Between Sampling Problems in Bayesian Inference and Statistical Mechanics
Pohorille, Andrew
2006-01-01
The Bayesian and statistical mechanical communities often share the same objective in their work - estimating and integrating probability distribution functions (pdfs) describing stochastic systems, models or processes. Frequently, these pdfs are complex functions of random variables exhibiting multiple, well separated local minima. Conventional strategies for sampling such pdfs are inefficient, sometimes leading to an apparent non-ergodic behavior. Several recently developed techniques for handling this problem have been successfully applied in statistical mechanics. In the multicanonical and Wang-Landau Monte Carlo (MC) methods, the correct pdfs are recovered from uniform sampling of the parameter space by iteratively establishing proper weighting factors connecting these distributions. Trivial generalizations allow for sampling from any chosen pdf. The closely related transition matrix method relies on estimating transition probabilities between different states. All these methods proved to generate estimates of pdfs with high statistical accuracy. In another MC technique, parallel tempering, several random walks, each corresponding to a different value of a parameter (e.g. "temperature"), are generated and occasionally exchanged using the Metropolis criterion. This method can be considered as a statistically correct version of simulated annealing. An alternative approach is to represent the set of independent variables as a Hamiltonian system. Considerable progress has been made in understanding how to ensure that the system obeys the equipartition theorem or, equivalently, that coupling between the variables is correctly described. Then a host of techniques developed for dynamical systems can be used. Among them, probably the most powerful is the Adaptive Biasing Force method, in which thermodynamic integration and biased sampling are combined to yield very efficient estimates of pdfs. The third class of methods deals with transitions between states described
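Parallel tempering, one of the techniques surveyed in this abstract, can be sketched as a few Metropolis walkers at different inverse temperatures with occasional replica swaps accepted by the Metropolis criterion. The bimodal target, temperature ladder and step sizes below are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(2)

def log_p(x):
    """Bimodal 1-D target: equal-weight Gaussian modes at -3 and +3."""
    return np.logaddexp(-0.5 * (x - 3.0) ** 2, -0.5 * (x + 3.0) ** 2)

betas = np.array([1.0, 0.5, 0.2])     # inverse "temperatures", cold to hot
x = np.zeros(len(betas))              # one walker per temperature
samples = []
for step in range(20000):
    for i, b in enumerate(betas):     # within-chain Metropolis updates
        prop = x[i] + rng.normal(0.0, 1.0)
        if np.log(rng.uniform()) < b * (log_p(prop) - log_p(x[i])):
            x[i] = prop
    j = rng.integers(len(betas) - 1)  # attempt one neighbouring-pair swap
    dlog = (betas[j] - betas[j + 1]) * (log_p(x[j + 1]) - log_p(x[j]))
    if np.log(rng.uniform()) < dlog:
        x[j], x[j + 1] = x[j + 1], x[j]
    samples.append(x[0])              # record only the beta = 1 chain

samples = np.asarray(samples[2000:])  # discard burn-in
```

The hot chain crosses the barrier between the modes easily and hands its states down through swaps, so the cold chain visits both modes, which a single Metropolis walker at beta = 1 would rarely do.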
Directory of Open Access Journals (Sweden)
Mehdi Javanbakht
Full Text Available The aim of this study was to estimate the economic burden of diabetes mellitus (DM) in Iran from 2009 to 2030. A Markov micro-simulation (MM) model was developed to predict the DM population size and associated economic burden. Age- and sex-specific prevalence and incidence of diagnosed and undiagnosed DM were derived from national health surveys. A systematic review was performed to identify the cost of diabetes in Iran, and the mean annual direct and indirect costs of patients with DM were estimated using a random-effect Bayesian meta-analysis. Face, internal, cross and predictive validity of the MM model were assessed by consulting an expert group, performing sensitivity analysis (SA) and comparing model results with published literature and national survey reports. Sensitivity analysis was also performed to explore the effect of uncertainty in the model. We estimated 3.78 million cases of DM (2.74 million diagnosed and 1.04 million undiagnosed) in Iran in 2009. This number is expected to rise to 9.24 million cases (6.73 million diagnosed and 2.50 million undiagnosed) by 2030. The mean annual direct and indirect costs of patients with DM in 2009 were US$ 556 (posterior standard deviation, 221) and US$ 689 (619), respectively. Total estimated annual cost of DM was $3.64 (2009 US$) billion (including US$1.71 billion direct and US$1.93 billion indirect costs) in 2009 and is predicted to increase to $9.0 (in 2009 US$) billion (including US$4.2 billion direct and US$4.8 billion indirect costs) by 2030. The economic burden of DM in Iran is predicted to increase markedly in the coming decades. Identification and implementation of effective strategies to prevent and manage DM should be considered as a public health priority.
Javanbakht, Mehdi; Mashayekhi, Atefeh; Baradaran, Hamid R; Haghdoost, AliAkbar; Afshin, Ashkan
2015-01-01
The aim of this study was to estimate the economic burden of diabetes mellitus (DM) in Iran from 2009 to 2030. A Markov micro-simulation (MM) model was developed to predict the DM population size and associated economic burden. Age- and sex-specific prevalence and incidence of diagnosed and undiagnosed DM were derived from national health surveys. A systematic review was performed to identify the cost of diabetes in Iran and the mean annual direct and indirect costs of patients with DM were estimated using a random-effect Bayesian meta-analysis. Face, internal, cross and predictive validity of the MM model were assessed by consulting an expert group, performing sensitivity analysis (SA) and comparing model results with published literature and national survey reports. Sensitivity analysis was also performed to explore the effect of uncertainty in the model. We estimated 3.78 million cases of DM (2.74 million diagnosed and 1.04 million undiagnosed) in Iran in 2009. This number is expected to rise to 9.24 million cases (6.73 million diagnosed and 2.50 million undiagnosed) by 2030. The mean annual direct and indirect costs of patients with DM in 2009 were US$ 556 (posterior standard deviation, 221) and US$ 689 (619), respectively. Total estimated annual cost of DM was $3.64 (2009 US$) billion (including US$1.71 billion direct and US$1.93 billion indirect costs) in 2009 and is predicted to increase to $9.0 (in 2009 US$) billion (including US$4.2 billion direct and US$4.8 billion indirect costs) by 2030. The economic burden of DM in Iran is predicted to increase markedly in the coming decades. Identification and implementation of effective strategies to prevent and manage DM should be considered as a public health priority.
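A cohort-level Markov projection conveys the flavour of such a model in a few lines. The three states, transition probabilities and cost assignment below are illustrative assumptions for this sketch, not the authors' micro-simulation (which tracks individuals with age- and sex-specific rates):

```python
import numpy as np

# Annual transition matrix over three states (rows sum to 1):
# no DM -> may become undiagnosed or diagnosed; undiagnosed -> may be
# diagnosed; diagnosed is absorbing here (no mortality in this toy model).
P = np.array([
    [0.985, 0.010, 0.005],
    [0.000, 0.850, 0.150],
    [0.000, 0.000, 1.000],
])

# Rough 2009 starting point in millions of people (illustrative split of the
# non-DM population; the DM figures echo the abstract's 1.04M / 2.74M).
state = np.array([68.0, 1.04, 2.74])

# Assumed annual cost per diagnosed patient: the abstract's 2009 mean direct
# plus indirect cost, US$ 556 + 689.
annual_cost = np.array([0.0, 0.0, 556.0 + 689.0])

for year in range(2009, 2030):          # project 21 annual cycles to 2030
    state = state @ P

dm_total = state[1] + state[2]          # projected DM cases in 2030, millions
cost_total = float(state @ annual_cost) # projected annual cost, US$ millions
```

With these made-up transition rates the projection overshoots the paper's 9.24 million figure; calibrating P against observed incidence is exactly the step the authors' validation exercises address.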
Lefkimmiatis, Stamatios; Maragos, Petros; Papandreou, George
2009-08-01
We present an improved statistical model for analyzing Poisson processes, with applications to photon-limited imaging. We build on previous work, adopting a multiscale representation of the Poisson process in which the ratios of the underlying Poisson intensities (rates) in adjacent scales are modeled as mixtures of conjugate parametric distributions. Our main contributions include: 1) a rigorous and robust regularized expectation-maximization (EM) algorithm for maximum-likelihood estimation of the rate-ratio density parameters directly from the noisy observed Poisson data (counts); 2) extension of the method to work under a multiscale hidden Markov tree model (HMT) which couples the mixture label assignments in consecutive scales, thus modeling interscale coefficient dependencies in the vicinity of image edges; 3) exploration of a 2-D recursive quad-tree image representation, involving Dirichlet-mixture rate-ratio densities, instead of the conventional separable binary-tree image representation involving beta-mixture rate-ratio densities; and 4) a novel multiscale image representation, which we term Poisson-Haar decomposition, that better models the image edge structure, thus yielding improved performance. Experimental results on standard images with artificially simulated Poisson noise and on real photon-limited images demonstrate the effectiveness of the proposed techniques.
Alvarado Mora, Mónica Viviana; Romano, Camila Malta; Gomes-Gouvêa, Michele Soares; Gutierrez, Maria Fernanda; Botelho, Livia; Carrilho, Flair José; Pinho, João Renato Rebello
2011-01-01
Hepatitis B is a worldwide health problem affecting about 2 billion people and more than 350 million are chronic carriers of the virus. Nine HBV genotypes (A to I) have been described. The geographical distribution of HBV genotypes is not completely understood due to the limited number of samples from some parts of the world. One such example is Colombia, in which few studies have described the HBV genotypes. In this study, we characterized HBV genotypes in 143 HBsAg-positive volunteer blood donors from Colombia. A fragment of 1306 bp partially comprising HBsAg and the DNA polymerase coding regions (S/POL) was amplified and sequenced. Bayesian phylogenetic analyses were conducted using the Markov Chain Monte Carlo (MCMC) approach to obtain the maximum clade credibility (MCC) tree using BEAST v.1.5.3. Of all samples, 68 were positive and 52 were successfully sequenced. Genotype F was the most prevalent in this population (77%) - subgenotypes F3 (75%) and F1b (2%). Genotype G (7.7%) and subgenotype A2 (15.3%) were also found. Genotype G sequence analysis suggests distinct introductions of this genotype in the country. Furthermore, we estimated the time of the most recent common ancestor (TMRCA) for each HBV/F subgenotype and also for Colombian F3 sequences using two different datasets: (i) 77 sequences comprising 1306 bp of S/POL region and (ii) 283 sequences comprising 681 bp of S/POL region. We also used two other previously estimated evolutionary rates: (i) 2.60 × 10^(-4) s/s/y and (ii) 1.5 × 10^(-5) s/s/y. Here we report the HBV genotypes circulating in Colombia and estimated the TMRCA for the four different subgenotypes of genotype F.
Analysis of Gumbel Model for Software Reliability Using Bayesian Paradigm
Directory of Open Access Journals (Sweden)
Raj Kumar
2012-12-01
Full Text Available In this paper, we have illustrated the suitability of the Gumbel model for software reliability data. The model parameters are estimated using likelihood-based inferential procedures: classical as well as Bayesian. The quasi Newton-Raphson algorithm is applied to obtain the maximum likelihood estimates and associated probability intervals. The Bayesian estimates of the parameters of the Gumbel model are obtained using the Markov Chain Monte Carlo (MCMC) simulation method in OpenBUGS (established software for Bayesian analysis using Markov Chain Monte Carlo methods). R functions are developed to study the statistical properties, model validation and comparison tools of the model, and the output analysis of MCMC samples generated from OpenBUGS. Details of applying MCMC to parameter estimation for the Gumbel model are elaborated, and a real software reliability data set is considered to illustrate the methods of inference discussed in this paper.
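A hand-rolled Metropolis sampler can stand in for the OpenBUGS run when sketching Bayesian estimation of the Gumbel parameters. The synthetic data, flat priors and step sizes below are illustrative assumptions, not the paper's real reliability data set:

```python
import numpy as np

rng = np.random.default_rng(3)
data = rng.gumbel(loc=10.0, scale=2.0, size=200)   # synthetic "failure data"

def log_lik(mu, b):
    """Gumbel log-likelihood with location mu and scale b > 0."""
    if b <= 0:
        return -np.inf
    z = (data - mu) / b
    return np.sum(-z - np.exp(-z)) - len(data) * np.log(b)

mu, b = data.mean(), data.std()        # crude starting point
cur = log_lik(mu, b)
chain = []
for _ in range(20000):
    mu_p = mu + rng.normal(0.0, 0.15)  # joint random-walk proposal
    b_p = b + rng.normal(0.0, 0.10)
    prop = log_lik(mu_p, b_p)
    if np.log(rng.uniform()) < prop - cur:   # flat prior cancels in the ratio
        mu, b, cur = mu_p, b_p, prop
    chain.append((mu, b))

post = np.asarray(chain[5000:])        # discard burn-in
mu_hat, b_hat = post.mean(axis=0)      # posterior means near the true (10, 2)
```

Posterior credible intervals follow directly from quantiles of `post`, which is the same output one would read off the OpenBUGS MCMC samples.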
Estimation and uncertainty of reversible Markov models
Trendelkamp-Schroer, Benjamin; Paul, Fabian; Noé, Frank
2015-01-01
Reversibility is a key concept in the theory of Markov models, simplified kinetic models for the conformation dynamics of molecules. The analysis and interpretation of the transition matrix encoding the kinetic properties of the model relies heavily on the reversibility property. The estimation of a reversible transition matrix from simulation data is therefore crucial to the successful application of the previously developed theory. In this work we discuss methods for the maximum likelihood estimation of transition matrices from finite simulation data and present a new algorithm for the estimation when reversibility with respect to a given stationary vector is desired. We also develop new methods for the Bayesian posterior inference of reversible transition matrices, with and without a given stationary vector, taking into account the need for a suitable prior distribution preserving the metastable features of the observed process during posterior inference.
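For the maximum likelihood side, the well-known fixed-point iteration over a symmetric "flow" matrix (standard in the Markov state model literature) produces a reversible transition matrix from a count matrix. The count matrix below is made up for illustration, and the Bayesian sampling discussed in the abstract is not shown:

```python
import numpy as np

def reversible_mle(C, n_iter=2000):
    """Fixed-point iteration for the reversible MLE transition matrix given a
    count matrix C. X is kept symmetric, so detailed balance holds by
    construction at every iterate."""
    C = np.asarray(C, dtype=float)
    c_i = C.sum(axis=1)                # total outgoing counts per state
    X = C + C.T                        # symmetric initial guess for the flows
    for _ in range(n_iter):
        x_i = X.sum(axis=1)
        X = (C + C.T) / (c_i[:, None] / x_i[:, None]
                         + c_i[None, :] / x_i[None, :])
    x_i = X.sum(axis=1)
    T = X / x_i[:, None]               # row-stochastic transition matrix
    pi = x_i / x_i.sum()               # its stationary distribution
    return T, pi

# Illustrative three-state count matrix from a hypothetical trajectory.
C = np.array([[90, 10, 0],
              [8, 80, 12],
              [0, 15, 85]])
T, pi = reversible_mle(C)
# Detailed balance: pi_i T_ij == pi_j T_ji for all i, j.
```

Because pi_i T_ij = X_ij / sum(X) and X is symmetric, detailed balance is exact regardless of how far the iteration has converged; convergence only sharpens the likelihood value.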
Pham, Tuan D.; Salvetti, Federica; Wang, Bing; Diani, Marco; Heindel, Walter; Knecht, Stefan; Wersching, Heike; Baune, Bernhard T.; Berger, Klaus
2011-02-01
Rating and quantification of cerebral white matter hyperintensities on magnetic resonance imaging (MRI) are important tasks in various clinical and scientific settings. As manual evaluation is time consuming and imprecise, much effort has been made to automate the quantification of white matter hyperintensities. There are rarely any reports that attempt to study the similarity/dissimilarity of white matter hyperintensity patterns that have different sizes, shapes and spatial localizations on MRI. This paper proposes an original computational neuroscience framework for such a conceptual study, with the standpoint that prior knowledge about white matter hyperintensities can be accumulated and utilized to enable a reliable inference of the rating of a new white matter hyperintensity observation. This computational approach to rating inference of white matter hyperintensities, which appears to be the first of its kind, can be utilized as a computerized rating-assistance tool and can be very economical for diagnostic evaluation of brain tissue lesions.
Nagy, László G; Urban, Alexander; Orstadius, Leif; Papp, Tamás; Larsson, Ellen; Vágvölgyi, Csaba
2010-12-01
Recently developed comparative phylogenetic methods offer a wide spectrum of applications in evolutionary biology, although it is generally accepted that their statistical properties are incompletely known. Here, we examine and compare the statistical power of the maximum likelihood (ML) and Bayesian methods with regard to selection of best-fit models of fruiting-body evolution and hypothesis testing of ancestral states on a real-life data set of a physiological trait (autodigestion) in the family Psathyrellaceae. Our phylogenies are based on the first multigene data set generated for the family. Two different coding regimes (binary and multistate) and two data sets differing in taxon sampling density are examined. The Bayesian method outperformed ML with regard to statistical power in all analyses. This is particularly evident if the signal in the data is weak, i.e. in cases when the ML approach does not provide support to choose among competing hypotheses. Results based on binary and multistate coding differed only modestly, although it was evident that multistate analyses were less conclusive in all cases. It seems that increased taxon sampling density has favourable effects on inference of ancestral states, while model parameters are influenced to a smaller extent. The model best fitting our data implies that the rate of losses of deliquescence equals zero, although model selection in ML does not provide proper support to reject three of the four candidate models. The results also support the hypothesis that non-deliquescence (lack of autodigestion) has been ancestral in Psathyrellaceae, and that deliquescent fruiting bodies represent the preferred state, having evolved independently several times during evolution. Copyright © 2010 Elsevier Inc. All rights reserved.
Kaiser, Jacob L; Bland, Cassidy L; Klinke, David J
2016-03-01
Cancer arises from a deregulation of both intracellular and intercellular networks that maintain system homeostasis. Identifying the architecture of these networks and how they are changed in cancer is a prerequisite for designing drugs to restore homeostasis. Since intercellular networks only appear in intact systems, it is difficult to identify how these networks become altered in human cancer using many of the common experimental models. To overcome this, we used the diversity in normal and malignant human tissue samples from the Cancer Genome Atlas (TCGA) database of human breast cancer to identify the topology associated with intercellular networks in vivo. To improve the underlying biological signals, we constructed Bayesian networks using metagene constructs, which represented groups of genes that are concomitantly associated with different immune and cancer states. We also used bootstrap resampling to establish the significance associated with the inferred networks. In short, we found opposing relationships between cell proliferation and epithelial-to-mesenchymal transformation (EMT) with regard to macrophage polarization. These results were consistent across multiple carcinomas in that proliferation was associated with a type 1 cell-mediated anti-tumor immune response and EMT was associated with a pro-tumor anti-inflammatory response. To address the identifiability of these networks from other datasets, we could identify the relationship between EMT and macrophage polarization with fewer samples when the Bayesian network was generated from malignant samples alone. However, the relationship between proliferation and macrophage polarization was identified with fewer samples when the samples were taken from a combination of the normal and malignant samples. © 2016 American Institute of Chemical Engineers Biotechnol. Prog., 32:470-479, 2016.
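The bootstrap step described above can be illustrated on synthetic data. A minimal sketch, assuming an invented positive association between an EMT metagene score and a macrophage-polarization metagene score (the variable names, sample size, and effect size are hypothetical, not TCGA values): resample with replacement, recompute the association, and retain the edge only if the percentile interval excludes zero.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic metagene scores with a built-in positive association
# (illustrative only; not derived from the TCGA breast cancer data).
n = 200
emt = rng.normal(size=n)
macrophage = 0.5 * emt + rng.normal(scale=1.0, size=n)

# Bootstrap the correlation coefficient 1000 times.
boots = np.empty(1000)
for b in range(1000):
    idx = rng.integers(0, n, size=n)
    boots[b] = np.corrcoef(emt[idx], macrophage[idx])[0, 1]

# Keep the network edge only if the 95% percentile interval excludes zero.
lo, hi = np.percentile(boots, [2.5, 97.5])
edge_supported = bool(lo > 0 or hi < 0)
```

The paper applies this idea to whole Bayesian network structures rather than a single pairwise association, but the resampling logic is the same.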
Directory of Open Access Journals (Sweden)
Asger Hobolth
2007-02-01
Full Text Available The genealogical relationship of human, chimpanzee, and gorilla varies along the genome. We develop a hidden Markov model (HMM that incorporates this variation and relate the model parameters to population genetics quantities such as speciation times and ancestral population sizes. Our HMM is an analytically tractable approximation to the coalescent process with recombination, and in simulations we see no apparent bias in the HMM estimates. We apply the HMM to four autosomal contiguous human-chimp-gorilla-orangutan alignments comprising a total of 1.9 million base pairs. We find a very recent speciation time of human-chimp (4.1 ± 0.4 million years), and fairly large ancestral effective population sizes (65,000 ± 30,000 for the human-chimp ancestor and 45,000 ± 10,000 for the human-chimp-gorilla ancestor. Furthermore, around 50% of the human genome coalesces with chimpanzee after speciation with gorilla. We also consider 250,000 base pairs of X-chromosome alignments and find an effective population size much smaller than 75% of the autosomal effective population sizes. Finally, we find that the rate of transitions between different genealogies correlates well with the region-wide present-day human recombination rate, but does not correlate with the fine-scale recombination rates and recombination hot spots, suggesting that the latter are evolutionarily transient.
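The likelihood machinery underlying such genealogy inference rests on the standard forward recursion for hidden Markov models. A minimal, generic sketch with per-step rescaling for numerical stability (a toy two-state, two-symbol HMM; not the coalescent HMM or the parameter values of the paper):

```python
import numpy as np

def forward_loglik(pi0, A, B, obs):
    """Log-likelihood of a discrete observation sequence under an HMM.

    pi0: initial state distribution; A[i, j]: transition probability
    i -> j; B[i, k]: probability that state i emits symbol k;
    obs: sequence of symbol indices.  Per-step rescaling keeps the
    recursion numerically stable for long sequences such as alignments.
    """
    alpha = pi0 * B[:, obs[0]]
    logp = np.log(alpha.sum())
    alpha = alpha / alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        s = alpha.sum()
        logp += np.log(s)
        alpha = alpha / s
    return logp

# Toy two-state, two-symbol example (hypothetical parameters).
pi0 = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.2, 0.8]])
B = np.array([[0.8, 0.2], [0.3, 0.7]])
ll = forward_loglik(pi0, A, B, [0, 0, 1, 1, 0])
```

In the paper's setting the hidden states are local genealogies along the alignment and the emissions are site patterns, but the recursion is identical in form.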
Niwayama, Ritsuya; Nagao, Hiromichi; Kitajima, Tomoya S.; Hufnagel, Lars; Shinohara, Kyosuke; Higuchi, Tomoyuki; Ishikawa, Takuji; Kimura, Akatsuki
2016-01-01
Cellular structures are hydrodynamically interconnected, such that force generation in one location can move distal structures. One example of this phenomenon is cytoplasmic streaming, whereby active forces at the cell cortex induce streaming of the entire cytoplasm. However, it is not known how the spatial distribution and magnitude of these forces move distant objects within the cell. To address this issue, we developed a computational method that used cytoplasm hydrodynamics to infer the spatial distribution of shear stress at the cell cortex induced by active force generators from an experimentally obtained flow field of cytoplasmic streaming. By applying this method, we determined the shear-stress distribution that quantitatively reproduces in vivo flow fields in Caenorhabditis elegans embryos and mouse oocytes during meiosis II. Shear stress in mouse oocytes was predicted to localize to a narrower cortical region than that with a high cortical flow velocity and corresponded with the localization of the cortical actin cap. The predicted patterns of pressure gradient in both species were consistent with species-specific cytoplasmic streaming functions. The shear-stress distribution inferred by our method can contribute to the characterization of active force generation driving biological streaming. PMID:27472658
Clinical inferences and decisions--III. Utility assessment and the Bayesian decision model.
Aspinall, P A; Hill, A R
1984-01-01
It is accepted that errors of misclassification, however small, can occur in clinical decisions, but it cannot be assumed that the importance attached to false positive errors is the same as that for false negatives. The relative importance of these two types of error is frequently implied by a decision maker in the different weighting factors, or utilities, he assigns to the alternative consequences of his decisions. Formal procedures are available by which the value or worth of the outcome of a decision process can be made explicit in numerical form. The two principal methods for generating utilities associated with clinical decisions are described. The concept and application of utility are then expanded from a unidimensional to a multidimensional problem where, for example, one variable may be state of health and another monetary assets. When combined with the principles of subjective probability and test criterion selection outlined in Parts I and II of this series, the use of utilities completes the framework upon which the general Bayesian model of clinical decision making is based. The five main stages of this general decision-making model are described, and applications of the model are illustrated with clinical examples from the field of ophthalmology. These include unidimensional and multidimensional problems which are worked through in detail to illustrate both the principles and the methodology involved in a rationalized normative model of clinical decision-making behaviour.
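The normative model described here combines Bayes' theorem with an expected-utility criterion. A minimal sketch with entirely hypothetical numbers (the prevalence, test characteristics, action names, and utility values are invented for illustration, not drawn from the paper's ophthalmology examples):

```python
# Bayes' theorem followed by expected-utility maximization.
# All numbers below are hypothetical, for illustration only.
prior_disease = 0.02           # assumed prevalence P(disease)
sens, spec = 0.90, 0.95        # assumed test sensitivity and specificity

# Posterior probability of disease given a positive test result.
p_pos = sens * prior_disease + (1 - spec) * (1 - prior_disease)
post = sens * prior_disease / p_pos

# Utilities: actions x states.  The false-negative outcome
# ("wait" while diseased) is penalized much more heavily than the
# false-positive outcome ("treat" while healthy) -- the asymmetry
# the abstract emphasizes.
utilities = {
    "treat": {"disease": 0.9, "healthy": 0.7},
    "wait":  {"disease": 0.1, "healthy": 1.0},
}
expected = {a: u["disease"] * post + u["healthy"] * (1 - post)
            for a, u in utilities.items()}
best = max(expected, key=expected.get)
```

Changing either the prior or the utility asymmetry can flip the preferred action, which is exactly the sensitivity the unidimensional model is meant to expose.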
Zubillaga, María; Skewes, Oscar; Soto, Nicolás; Rabinovich, Jorge E.; Colchero, Fernando
2014-01-01
Understanding the mechanisms that drive population dynamics is fundamental for management of wild populations. The guanaco (Lama guanicoe) is one of two wild camelid species in South America. We evaluated the effects of density dependence and weather variables on population regulation based on a time series of 36 years of population sampling of guanacos in Tierra del Fuego, Chile. The population density varied between 2.7 and 30.7 guanacos/km2, with an apparent monotonic growth during the first 25 years; however, in the last 10 years the population has shown large fluctuations, suggesting that it might have reached its carrying capacity. We used a Bayesian state-space framework and model selection to determine the effect of density and environmental variables on guanaco population dynamics. Our results show that the population is under density-dependent regulation and that it is currently fluctuating around an average carrying capacity of 45,000 guanacos. We also found a significant positive effect of the previous winter's temperature, while sheep density had a strong negative effect on guanaco population growth. We conclude that there are significant density-dependent processes and that climate as well as competition with domestic species have important effects determining the population size of guanacos, with important implications for management and conservation. PMID:25514510
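The state-space logic can be sketched with a density-dependent Ricker model and a one-parameter grid posterior for the carrying capacity K. All parameter values below are hypothetical (only the 36-year horizon and the ~45,000 carrying capacity echo the abstract); the paper's framework additionally handles observation error, covariates, and model selection.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate a density-dependent (Ricker) population with process noise.
# r_true, sigma, and the starting size are hypothetical values.
r_true, K_true, sigma = 0.3, 45_000.0, 0.05
n = np.empty(36)
n[0] = 5_000.0
for t in range(1, 36):
    n[t] = n[t - 1] * np.exp(r_true * (1 - n[t - 1] / K_true)
                             + rng.normal(0.0, sigma))

# Grid posterior for K under a flat prior, holding r and sigma fixed
# at their true values (a deliberate simplification of full inference).
K_grid = np.linspace(30_000, 60_000, 301)
loglik = np.zeros_like(K_grid)
for i, K in enumerate(K_grid):
    pred = np.log(n[:-1]) + r_true * (1 - n[:-1] / K)
    resid = np.log(n[1:]) - pred
    loglik[i] = -0.5 * np.sum((resid / sigma) ** 2)
post = np.exp(loglik - loglik.max())
post /= post.sum()
K_hat = K_grid[post.argmax()]
```

The posterior concentrates around the simulated carrying capacity once the trajectory has spent enough years fluctuating near it, mirroring the abstract's observation that the last decade of data is what pins down K.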
Klein, E K; Oddou-Muratorio, S
2011-03-01
Understanding precisely how plants disperse their seeds and pollen in their neighbourhood is a central question for both ecologists and evolutionary biologists because seed and pollen dispersal governs both the rate of spread of an expanding population and gene flow within and among populations. The concept of a 'dispersal kernel' has become extremely popular in dispersal ecology as a tool that summarizes how dispersal distributes individuals and genes in space and at a given scale. In this issue of Molecular Ecology, the study by Moran & Clark (2011) (M&C in the following) shows how genotypic and spatial data of established seedlings can be analysed in a Bayesian framework to estimate jointly the pollen and seed dispersal kernels and finally derive a parentage analysis from a full-probability approach. This approach applied to red oak shows important dispersal of seeds (138 m on average) and pollen (178 m on average). For seeds, this estimate contrasts with previous results from inverse modelling on seed trap data (9.3 m). This research brings together several methodological advances made in recent years across two research communities and could become a cornerstone of dispersal ecology.
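The kernel concept admits a quick numerical illustration. For a 2D exponential kernel k(x, y) ∝ exp(-r/a), the dispersal distance r follows a Gamma(shape=2, scale=a) distribution, so the mean dispersal distance is 2a. The sketch below back-calculates a hypothetical scale a = 69 m so that the mean matches the ~138 m seed estimate quoted above; this kernel family and scale are chosen for illustration and are not M&C's fitted model.

```python
import numpy as np

rng = np.random.default_rng(2)

# 2D exponential kernel k(x, y) ∝ exp(-r / a): the radial density is
# p(r) ∝ r * exp(-r / a), i.e. Gamma(shape=2, scale=a), with mean 2a.
# a_seed is hypothetical, back-calculated from the ~138 m mean above.
a_seed = 69.0
distances = rng.gamma(shape=2.0, scale=a_seed, size=200_000)
mean_distance = distances.mean()   # close to 2 * a_seed = 138 m
```

The same Monte Carlo check works for any kernel family, which is one reason the kernel abstraction travels so well between the seed- and pollen-dispersal literatures.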