Constrained Bayesian inference of project performance models
Sunmola, Funlade
2013-01-01
Project performance models play an important role in the management of project success. When used for monitoring projects, they can offer predictive ability, such as indications of possible delivery problems. Approaches for monitoring project performance rely on available project information, including restrictions imposed on the project, particularly the constraints of cost, quality, scope and time. We study in this paper a Bayesian inference methodology for project performance modelling in ...
Bayesian Inference of a Multivariate Regression Model
Marick S. Sinay
2014-01-01
We explore Bayesian inference of a multivariate linear regression model with use of a flexible prior for the covariance structure. The commonly adopted Bayesian setup involves the conjugate prior: a multivariate normal distribution for the regression coefficients and an inverse Wishart specification for the covariance matrix. Here we depart from this approach and propose a novel Bayesian estimator for the covariance. A multivariate normal prior for the unique elements of the matrix logarithm of the covariance matrix is considered. Such a structure allows for a richer class of prior distributions for the covariance, with respect to strength of beliefs in prior location hyperparameters, as well as the added ability to model potential correlation amongst the covariance structure. The posterior moments of all relevant parameters of interest are calculated based upon numerical results via a Markov chain Monte Carlo procedure. The Metropolis-Hastings-within-Gibbs algorithm is invoked to account for the construction of a proposal density that closely matches the shape of the target posterior distribution. As an application of the proposed technique, we investigate a multiple regression based upon the 1980 High School and Beyond Survey.
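The matrix-logarithm parametrization described in this abstract can be sketched as follows; the function names and the independent-normal prior on the elements are our own illustrative assumptions, not the author's exact specification:

```python
import numpy as np
from scipy.linalg import expm

def vec_to_sym(v, p):
    """Unpack p*(p+1)/2 unique elements into a symmetric p x p matrix."""
    A = np.zeros((p, p))
    iu = np.triu_indices(p)
    A[iu] = v
    return A + A.T - np.diag(np.diag(A))

def cov_from_log(v, p):
    """Covariance as the matrix exponential of a symmetric matrix:
    positive definite for any real-valued v, so the prior is unconstrained."""
    return expm(vec_to_sym(v, p))

def log_prior(v, mu=0.0, tau=1.0):
    """Independent normal prior on the unique elements of log(Sigma)."""
    return -0.5 * np.sum((v - mu) ** 2) / tau**2

p = 3
rng = np.random.default_rng(0)
v = rng.normal(size=p * (p + 1) // 2)
Sigma = cov_from_log(v, p)
# Any draw of v maps to a valid (symmetric positive definite) covariance.
print(np.allclose(Sigma, Sigma.T), np.linalg.eigvalsh(Sigma).min() > 0)
```

This is the appeal of the log parametrization: a Gaussian prior on unconstrained elements induces a flexible prior on covariances without positivity constraints.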
Bayesian Inference and Optimal Design in the Sparse Linear Model
Seeger, Matthias; Steinke, Florian; Tsuda, Koji
2007-01-01
The sparse linear model has seen many successful applications in Statistics, Machine Learning, and Computational Biology, such as identification of gene regulatory networks from micro-array expression data. Prior work has either approximated Bayesian inference by expensive Markov chain Monte Carlo, or replaced it by point estimation. We show how to obtain a good approximation to Bayesian analysis efficiently, using the Expectation Propagation method. We also address the problems of optimal de...
Bayesian inference model for fatigue life of laminated composites
Dimitrov, Nikolay Krasimirov; Kiureghian, Armen Der; Berggreen, Christian
2016-01-01
A probabilistic model for estimating the fatigue life of laminated composite plates is developed. The model is based on lamina-level input data, making it possible to predict fatigue properties for a wide range of laminate configurations. Model parameters are estimated by Bayesian inference. The...
Bayesian inference of chemical kinetic models from proposed reactions
Galagali, Nikhil
2015-02-01
Bayesian inference provides a natural framework for combining experimental data with prior knowledge to develop chemical kinetic models and quantify the associated uncertainties, not only in parameter values but also in model structure. Most existing applications of Bayesian model selection methods to chemical kinetics have been limited to comparisons among a small set of models, however. The significant computational cost of evaluating posterior model probabilities renders traditional Bayesian methods infeasible when the model space becomes large. We present a new framework for tractable Bayesian model inference and uncertainty quantification using a large number of systematically generated model hypotheses. The approach involves imposing point-mass mixture priors over rate constants and exploring the resulting posterior distribution using an adaptive Markov chain Monte Carlo method. The posterior samples are used to identify plausible models, to quantify rate constant uncertainties, and to extract key diagnostic information about model structure, such as the reactions and operating pathways most strongly supported by the data. We provide numerical demonstrations of the proposed framework by inferring kinetic models for catalytic steam and dry reforming of methane using available experimental data.
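A point-mass mixture ("spike-and-slab") prior of the kind described above can be sketched in a few lines; the mixture weight and lognormal slab are illustrative assumptions, not the paper's calibrated choices:

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_rate_constants(w=0.3, mu=0.0, sigma=1.0, size=10_000):
    """Draw rate constants from a point-mass mixture prior:
    with probability 1 - w the reaction is excluded (k = 0 exactly),
    otherwise k comes from a lognormal 'slab'."""
    include = rng.random(size) < w
    return np.where(include, rng.lognormal(mu, sigma, size), 0.0)

k = sample_rate_constants()
# The fraction of exactly-zero draws estimates the prior probability
# that a given reaction is absent from the mechanism.
print(round((k == 0).mean(), 2))
```

In posterior sampling, the indicator variables then encode which reactions the data support, which is how model structure is inferred jointly with the rate constants.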
Bayesian Inference and Forecasting in the Stationary Bilinear Model
Roberto Leon-Gonzalez; Fuyu Yang
2014-01-01
A stationary bilinear (SB) model can be used to describe processes with a time-varying degree of persistence that depends on past shocks. An example of such a process is inflation. This study develops methods for Bayesian inference, model comparison, and forecasting in the SB model. Using monthly U.K. inflation data, we find that the SB model outperforms the random walk and first order autoregressive AR(1) models in terms of root mean squared forecast errors for both the one-step-ahead and th...
GPU Computing in Bayesian Inference of Realized Stochastic Volatility Model
The realized stochastic volatility (RSV) model, which utilizes the realized volatility as additional information, has been proposed to infer the volatility of financial time series. We consider Bayesian inference of the RSV model by the Hybrid Monte Carlo (HMC) algorithm. The HMC algorithm can be parallelized and thus performed on the GPU for speedup. The GPU code is developed with CUDA Fortran. We compare the computational time of the HMC algorithm on a GPU (GTX 760) and a CPU (Intel i7-4770, 3.4 GHz) and find that the GPU can be up to 17 times faster than the CPU. We also code the program with OpenACC and find that appropriate coding can achieve a speedup similar to that of CUDA Fortran.
Bayesian inference for partially identified models exploring the limits of limited data
Gustafson, Paul
2015-01-01
Introduction; Identification: What Is against Us? What Is for Us?; Some Simple Examples of Partially Identified Models; The Road Ahead; The Structure of Inference in Partially Identified Models; Bayesian Inference; The Structure of Posterior Distributions in PIMs; Computational Strategies; Strength of Bayesian Updating, Revisited; Posterior Moments; Credible Intervals; Evaluating the Worth of Inference; Partial Identification versus Model Misspecification; The Siren Call of Identification; Comp...
Markov Model of Wind Power Time Series Using Bayesian Inference of Transition Matrix
Chen, Peiyuan; Berthelsen, Kasper Klitgaard; Bak-Jensen, Birgitte; Chen, Zhe
2009-01-01
This paper proposes to use Bayesian inference of the transition matrix when developing a discrete Markov model of a wind speed/power time series, and a 95% credible interval for model verification. The Dirichlet distribution is used as a conjugate prior for the transition matrix. Three discrete Markov models are compared, i.e. the basic Markov model, the Bayesian Markov model and the birth-and-death Markov model. The proposed Bayesian Markov model shows the best accuracy in modeling the autocorr...
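The Dirichlet-conjugate update at the heart of such a Bayesian Markov model can be sketched in a few lines; the toy state sequence below is invented for illustration, and real use would first discretize measured wind speeds into states:

```python
import numpy as np
from scipy import stats

# Toy quantized wind-speed series (illustrative data only).
states = np.array([0, 0, 1, 2, 1, 0, 1, 1, 2, 2, 1, 0, 0, 1])
n = 3
counts = np.zeros((n, n))
for a, b in zip(states[:-1], states[1:]):
    counts[a, b] += 1  # row i: observed transitions out of state i

alpha0 = np.ones(n)  # symmetric Dirichlet prior for each row
post_mean = np.zeros((n, n))
ci_lo = np.zeros((n, n))
ci_hi = np.zeros((n, n))
for i in range(n):
    alpha = alpha0 + counts[i]        # conjugate Dirichlet posterior
    post_mean[i] = alpha / alpha.sum()
    # Each marginal p_ij ~ Beta(alpha_j, alpha_sum - alpha_j),
    # giving a 95% credible interval per transition probability.
    ci_lo[i] = stats.beta.ppf(0.025, alpha, alpha.sum() - alpha)
    ci_hi[i] = stats.beta.ppf(0.975, alpha, alpha.sum() - alpha)

print(np.round(post_mean, 2))
```

Conjugacy makes the update exact and cheap: each row of the transition matrix is simply prior pseudo-counts plus observed transition counts.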
Frühwirth-Schnatter, Sylvia
1990-01-01
In the paper at hand we apply it to Bayesian statistics to obtain "Fuzzy Bayesian Inference". In the subsequent sections we discuss a fuzzy-valued likelihood function, Bayes' theorem for both fuzzy data and fuzzy priors, a fuzzy Bayes estimator, fuzzy predictive densities and distributions, and fuzzy H.P.D. regions. (author's abstract)
A localization model to localize multiple sources using Bayesian inference
Dunham, Joshua Rolv
Accurate localization of a sound source in a room setting is important in both psychoacoustics and architectural acoustics. Binaural models have been proposed to explain how the brain processes and utilizes the interaural time differences (ITDs) and interaural level differences (ILDs) of sound waves arriving at the ears of a listener in determining source location. Recent work shows that applying Bayesian methods to this problem is proving fruitful. In this thesis, pink noise samples are convolved with head-related transfer functions (HRTFs) and compared to combinations of one and two anechoic speech signals convolved with different HRTFs or binaural room impulse responses (BRIRs) to simulate room positions. Through exhaustive calculation of Bayesian posterior probabilities and a maximum-likelihood approach, model selection determines the number of sources present, and parameter estimation yields the azimuthal direction of the source(s).
Robust Bayesian inference in lq-Spherical models
Osiewalski, Jacek; Steel, Mark F. J.
1992-01-01
The class of multivariate lq-spherical distributions is introduced and defined through their isodensity surfaces. We prove that, under a Jeffreys' type improper prior on the scale parameter, posterior inference on the location parameters is the same for all lq-spherical sampling models with common q. This gives us perfect inference robustness with respect to any departures from the reference case of independent sampling from the exponential power distribution.
Bayesian inference and model comparison for metallic fatigue data
Babuška, Ivo; Sawlan, Zaid; Scavino, Marco; Szabó, Barna; Tempone, Raúl
2016-02-23
In this work, we present a statistical treatment of stress-life (S-N) data drawn from a collection of records of fatigue experiments that were performed on 75S-T6 aluminum alloys. Our main objective is to predict the fatigue life of materials by providing a systematic approach to model calibration, model selection and model ranking with reference to S-N data. To this purpose, we consider fatigue-limit models and random fatigue-limit models that are specially designed to allow the treatment of the run-outs (right-censored data). We first fit the models to the data by maximum likelihood methods and estimate the quantiles of the life distribution of the alloy specimen. To assess the robustness of the estimation of the quantile functions, we obtain bootstrap confidence bands by stratified resampling with respect to the cycle ratio. We then compare and rank the models by classical measures of fit based on information criteria. We also consider a Bayesian approach that provides, under the prior distribution of the model parameters selected by the user, their simulation-based posterior distributions. We implement and apply Bayesian model comparison methods, such as Bayes factor ranking and predictive information criteria based on cross-validation techniques under various a priori scenarios.
The Lumiere Project: Bayesian User Modeling for Inferring the Goals and Needs of Software Users
Horvitz, Eric J.; Breese, John S.; Heckerman, David; Hovel, David; Rommelse, Koos
2013-01-01
The Lumiere Project centers on harnessing probability and utility to provide assistance to computer software users. We review work on Bayesian user models that can be employed to infer a user's needs by considering the user's background, actions, and queries. Several problems were tackled in Lumiere research, including (1) the construction of Bayesian models for reasoning about the time-varying goals of computer users from their observed actions and queries, (2) gaining access to a stream of eve...
Approximate Bayesian inference in semi-mechanistic models
Aderhold, Andrej; Husmeier, Dirk; Grzegorczyk, Marco
2016-01-01
Inference of interaction networks represented by systems of differential equations is a challenging problem in many scientific disciplines. In the present article, we follow a semi-mechanistic modelling approach based on gradient matching. We investigate the extent to which key factors, including the kinetic model, statistical formulation and numerical methods, impact upon performance at network reconstruction. We emphasize general lessons for computational statisticians when faced with the c...
Bayesian inference of BWR model parameters by Markov chain Monte Carlo
In this paper, the Markov chain Monte Carlo approach to Bayesian inference is applied for estimating the parameters of a reduced-order model of the dynamics of a boiling water reactor system. A Bayesian updating strategy is devised to progressively refine the estimates, as newly measured data become available. Finally, the technique is used for detecting parameter changes during the system lifetime, e.g. due to component degradation
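The progressive Bayesian updating strategy described above can be sketched with a generic random-walk Metropolis sampler; the scalar toy model below stands in for the reduced-order BWR dynamics, and all names and numbers are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

def metropolis(logpost, x0, n=6000, step=0.5):
    """Generic random-walk Metropolis sampler."""
    x, lp = x0, logpost(x0)
    chain = []
    for _ in range(n):
        xp = x + step * rng.normal()
        lpp = logpost(xp)
        if np.log(rng.random()) < lpp - lp:  # accept/reject
            x, lp = xp, lpp
        chain.append(x)
    return np.array(chain)

# Toy stand-in for the reduced-order model: y = theta + noise.
theta_true, noise_sd = 1.5, 0.3

def make_logpost(data):
    def logpost(th):  # N(0, 10^2) prior, Gaussian likelihood
        return -0.5 * (th / 10.0) ** 2 - 0.5 * np.sum((data - th) ** 2) / noise_sd**2
    return logpost

# Progressive refinement: redo inference as measurements accumulate;
# the posterior spread shrinks with more data.
results = {}
for n_obs in (5, 50):
    data = theta_true + noise_sd * rng.normal(size=n_obs)
    samples = metropolis(make_logpost(data), x0=0.0)[1000:]
    results[n_obs] = (samples.mean(), samples.std())
    print(n_obs, round(samples.mean(), 2), round(samples.std(), 3))
```

Parameter change detection then amounts to watching the refined posterior drift away from its earlier credible region as new data arrive.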
Bayesian inference for a wavefront model of the Neolithisation of Europe
Baggaley, Andrew W; Shukurov, Anvar; Boys, Richard J; Golightly, Andrew
2012-01-01
We consider a wavefront model for the spread of Neolithic culture across Europe, and use Bayesian inference techniques to provide estimates for the parameters within this model, as constrained by radiocarbon data from Southern and Western Europe. Our wavefront model allows for both an isotropic background spread (incorporating the effects of local geography), and a localized anisotropic spread associated with major waterways. We introduce an innovative numerical scheme to track the wavefront, allowing us to simulate the times of the first arrival at any site orders of magnitude more efficiently than traditional PDE approaches. We adopt a Bayesian approach to inference and use Gaussian process emulators to facilitate further increases in efficiency in the inference scheme, thereby making Markov chain Monte Carlo methods practical. We allow for uncertainty in the fit of our model, and also infer a parameter specifying the magnitude of this uncertainty. We obtain a magnitude for the background spread of order 1 ...
deBInfer: Bayesian inference for dynamical models of biological systems in R
Boersch-Supan, Philipp H.; Johnson, Leah R
2016-01-01
1. Differential equations (DEs) are commonly used to model the temporal evolution of biological systems, but statistical methods for comparing DE models to data and for parameter inference are relatively poorly developed. This is especially problematic in the context of biological systems where observations are often noisy and only a small number of time points may be available. 2. Bayesian approaches offer a coherent framework for parameter inference that can account for multiple sources of ...
Probabilistic Modelling of Fatigue Life of Composite Laminates Using Bayesian Inference
Dimitrov, Nikolay Krasimirov; Kiureghian, Armen Der
2014-01-01
. Model parameters are estimated by Bayesian inference. The reference data used consists of constant-amplitude fatigue test results for a multi-directional laminate subjected to seven different load ratios. The paper describes the modelling techniques and the parameter estimation procedure, supported by...
Moritz Boos
2016-05-01
Cognitive determinants of probabilistic inference were examined using hierarchical Bayesian modelling techniques. A classic urn-ball paradigm served as the experimental strategy, involving a factorial two (prior probabilities) by two (likelihoods) design. Five computational models of cognitive processes were compared with the observed behaviour. Parameter-free Bayesian posterior probabilities and parameter-free base rate neglect provided inadequate models of probabilistic inference. The introduction of distorted subjective probabilities yielded more robust and generalizable results. A general class of (inverted) S-shaped probability weighting functions has been proposed; however, the possibility of large differences in probability distortions not only across experimental conditions, but also across individuals, seems critical for the model's success. It also seems advantageous to consider individual differences in parameters of probability weighting as being sampled from weakly informative prior distributions of individual parameter values. Thus, the results from hierarchical Bayesian modelling converge with previous results in revealing that probability weighting parameters show considerable task dependency and individual differences. Methodologically, this work exemplifies the usefulness of hierarchical Bayesian modelling techniques for cognitive psychology. Theoretically, human probabilistic inference might be best described as the application of individualized strategic policies for Bayesian belief revision.
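One commonly proposed inverted-S weighting family is the one-parameter Tversky-Kahneman form, used here only as an illustrative stand-in for the paper's specific models:

```python
import numpy as np

def weight(p, gamma):
    """Inverted-S probability weighting (Tversky-Kahneman one-parameter form):
    for gamma < 1, small probabilities are overweighted and large ones
    underweighted, i.e. distorted subjective probabilities."""
    return p**gamma / (p**gamma + (1.0 - p) ** gamma) ** (1.0 / gamma)

probs = np.array([0.01, 0.1, 0.5, 0.9, 0.99])
print(np.round(weight(probs, 0.6), 3))
```

In a hierarchical Bayesian treatment, each participant's gamma would be drawn from a weakly informative group-level prior, which is what lets the model capture the individual differences the abstract emphasizes.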
Skill Rating by Bayesian Inference
Di Fatta, Giuseppe; Haworth, Guy McCrossan; Regan, Kenneth W.
2009-01-01
Systems Engineering often involves computer modelling the behaviour of proposed systems and their components. Where a component is human, fallibility must be modelled by a stochastic agent. The identification of a model of decision-making over quantifiable options is investigated using the game-domain of Chess. Bayesian methods are used to infer the distribution of players’ skill levels from the moves they play rather than from their competitive results. The approach is used on large sets of ...
Wu, Yuefeng; Hooker, Giles
2013-01-01
This paper introduces a hierarchical framework to incorporate Hellinger distance methods into Bayesian analysis. We propose to modify a prior over non-parametric densities with the exponential of twice the Hellinger distance between a candidate and a parametric density. By incorporating a prior over the parameters of the second density, we arrive at a hierarchical model in which a non-parametric model is placed between parameters and the data. The parameters of the family can then be estimate...
Prudhomme, Serge
2015-09-17
Parameter estimation for complex models using Bayesian inference is usually a very costly process as it requires a large number of solves of the forward problem. We show here how the construction of adaptive surrogate models using a posteriori error estimates for quantities of interest can significantly reduce the computational cost in problems of statistical inference. As surrogate models provide only approximations of the true solutions of the forward problem, it is nevertheless necessary to control these errors in order to construct an accurate reduced model with respect to the observables utilized in the identification of the model parameters. Effectiveness of the proposed approach is demonstrated on a numerical example dealing with the Spalart–Allmaras model for the simulation of turbulent channel flows. In particular, we illustrate how Bayesian model selection using the adapted surrogate model in place of solving the coupled nonlinear equations leads to the same quality of results while requiring fewer nonlinear PDE solves.
Bayesian inference of models and hyper-parameters for robust optic-flow estimation
Héas, Patrick; Herzet, Cédric; Memin, Etienne
2012-01-01
Selecting optimal models and hyper-parameters is crucial for accurate optic-flow estimation. This paper provides a solution to the problem in a generic Bayesian framework. The method is based on a conditional model linking the image intensity function, the unknown velocity field, hyper-parameters, and the prior and likelihood motion models. Inference is performed on each of the three levels of this so-defined hierarchical model by maximization of the marginalized a...
Inference in hybrid Bayesian networks
Langseth, Helge; Nielsen, Thomas Dyhre; Rumí, Rafael;
2009-01-01
and reliability block diagrams). However, limitations in the BNs' calculation engine have prevented BNs from becoming equally popular for domains containing mixtures of both discrete and continuous variables (so-called hybrid domains). In this paper we focus on these difficulties, and summarize some of the last decade's research on inference in hybrid Bayesian networks. The discussions are linked to an example model for estimating human reliability.
Bayesian Inference for Radio Observations
Lochner, Michelle; Zwart, Jonathan T L; Smirnov, Oleg; Bassett, Bruce A; Oozeer, Nadeem; Kunz, Martin
2015-01-01
(Abridged) New telescopes like the Square Kilometre Array (SKA) will push into a new sensitivity regime and expose systematics, such as direction-dependent effects, that could previously be ignored. Current methods for handling such systematics rely on alternating best estimates of instrumental calibration and models of the underlying sky, which can lead to inaccurate uncertainty estimates and biased results because such methods ignore any correlations between parameters. These deconvolution algorithms produce a single image that is assumed to be a true representation of the sky, when in fact it is just one realisation of an infinite ensemble of images compatible with the noise in the data. In contrast, here we report a Bayesian formalism that simultaneously infers both systematics and science. Our technique, Bayesian Inference for Radio Observations (BIRO), determines all parameters directly from the raw data, bypassing image-making entirely, by sampling from the joint posterior probability distribution. Thi...
Roche, Alexis
2015-01-01
This paper revisits the concept of composite likelihood from the perspective of probabilistic inference, and proposes a generalization called super composite likelihood for sharper inference in multiclass problems. It is argued that, beside providing a new interpretation and a general justification of naïve Bayes procedures, super composite likelihood yields a much wider class of discriminative models suitable for unsupervised and weakly supervised learning.
Inferring activities from context measurements using Bayesian inference and random utility models
Hurtubia, Ricardo; Bierlaire, Michel; Flötteröd, Gunnar
2009-01-01
Smartphones collect a wealth of information about their users. This includes GPS tracks and the MAC addresses of devices around the user, and it can go as far as taking visual and acoustic samples of the user's environment. We present a framework to identify a smartphone user's activities in a Bayesian setting. As prior information, we use a random utility model that accounts for the type of activity a user is likely to perform at any given location and time; this model was estimated for the w...
GNU MCSim : bayesian statistical inference for SBML-coded systems biology models
Bois, Frédéric Y.
2009-01-01
Statistical inference about the parameter values of complex models, such as the ones routinely developed in systems biology, is efficiently performed through Bayesian numerical techniques. In that framework, prior information and multiple levels of uncertainty can be seamlessly integrated. GNU MCSim was precisely developed to achieve those aims, in a general non-linear differential context. Starting with version 5.3.0, GNU MCSim reads in and simulates Systems Biology...
Höhna, Sebastian; Landis, Michael J; Heath, Tracy A; Boussau, Bastien; Lartillot, Nicolas; Moore, Brian R; Huelsenbeck, John P; Ronquist, Fredrik
2016-07-01
Programs for Bayesian inference of phylogeny currently implement a unique and fixed suite of models. Consequently, users of these software packages are simultaneously forced to use a number of programs for a given study, while also lacking the freedom to explore models that have not been implemented by the developers of those programs. We developed a new open-source software package, RevBayes, to address these problems. RevBayes is entirely based on probabilistic graphical models, a powerful generic framework for specifying and analyzing statistical models. Phylogenetic graphical models can be specified interactively in RevBayes, piece by piece, using a new succinct and intuitive language called Rev. Rev is similar to the R language and the BUGS model-specification language, and should be easy to learn for most users. The strength of RevBayes is the simplicity with which one can design, specify, and implement new and complex models. Fortunately, this tremendous flexibility does not come at the cost of slower computation; as we demonstrate, RevBayes outperforms competing software for several standard analyses. Compared with other programs, RevBayes has fewer black-box elements. Users need to explicitly specify each part of the model and analysis. Although this explicitness may initially be unfamiliar, we are convinced that this transparency will improve understanding of phylogenetic models in our field. Moreover, it will motivate the search for improvements to existing methods by brazenly exposing the model choices that we make to critical scrutiny. RevBayes is freely available at http://www.RevBayes.com [Bayesian inference; Graphical models; MCMC; statistical phylogenetics.]. PMID:27235697
Bayesian Inference in the Time Varying Cointegration Model
Gary Koop; Roberto Leon-Gonzalez; Rodney Strachan
2008-01-01
There are both theoretical and empirical reasons for believing that the parameters of macroeconomic models may vary over time. However, work with time-varying parameter models has largely involved vector autoregressions (VARs), ignoring cointegration. This is despite the fact that cointegration plays an important role in informing macroeconomists on a range of issues. In this paper we develop time-varying parameter models which permit cointegration. Time-varying parameter VARs (TVP-VARs) ...
Quantum Inference on Bayesian Networks
Yoder, Theodore; Low, Guang Hao; Chuang, Isaac
2014-03-01
Because quantum physics is naturally probabilistic, it seems reasonable to expect physical systems to describe probabilities and their evolution in a natural fashion. Here, we use quantum computation to speed up sampling from a graphical probability model, the Bayesian network. A specialization of this sampling problem is approximate Bayesian inference, where the distribution on query variables is sampled given the values e of evidence variables. Inference is a key part of modern machine learning and artificial intelligence tasks, but is known to be NP-hard. Classically, a single unbiased sample is obtained from a Bayesian network on n variables with at most m parents per node in time O(nm P(e)^{-1}), depending critically on P(e), the probability that the evidence might occur in the first place. However, by implementing a quantum version of rejection sampling, we obtain a square-root speedup, taking O(n 2^m P(e)^{-1/2}) time per sample. The speedup is the result of amplitude amplification, which is proving to be broadly applicable in sampling and machine learning tasks. In particular, we provide an explicit and efficient circuit construction that implements the algorithm without the need for oracle access.
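The classical baseline that the quantum algorithm improves on, rejection sampling with evidence, can be sketched on a toy two-node network; the network and its probabilities are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy two-node network (invented numbers): Rain -> WetGrass.
# P(rain) = 0.2; P(wet | rain) = 0.9; P(wet | not rain) = 0.1.
def sample_network():
    rain = rng.random() < 0.2
    wet = rng.random() < (0.9 if rain else 0.1)
    return rain, wet

# Rejection sampling: discard samples inconsistent with the evidence wet=True.
# Cost per accepted sample scales as 1/P(e); amplitude amplification improves
# this dependence to 1/sqrt(P(e)).
accepted, tries = [], 0
while len(accepted) < 2000:
    tries += 1
    rain, wet = sample_network()
    if wet:
        accepted.append(rain)

# Here P(e) = 0.2*0.9 + 0.8*0.1 = 0.26, so roughly 2000/0.26 tries are needed.
print(round(np.mean(accepted), 2), tries)
```

The accepted fraction of `rain = True` samples estimates P(rain | wet); the try counter makes the 1/P(e) inefficiency concrete.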
Nonparametric Bayesian inference in biostatistics
Müller, Peter
2015-01-01
Nonparametric Bayesian approaches (BNP) play an ever-expanding role in biostatistical inference, from use in proteomics to clinical trials. As chapters in this book demonstrate, BNP has important uses in clinical sciences and in inference for issues like unknown partitions in genomics. Many research problems involve an abundance of data and require flexible and complex probability models beyond the traditional parametric approaches. As this book's expert contributors show, BNP approaches can be the answer. Survival analysis, in particular survival regression, has traditionally used BNP, but BNP's potential is now very broad. This applies to important tasks like the arrangement of patients into clinically meaningful subpopulations and segmenting the genome into functionally distinct regions. This book is designed to both review and introduce application areas for BNP. While existing books provide theoretical foundations, this book connects theory to practice through engaging examples and research questions. Chapters c...
Cuevas Rivera, Dario; Bitzer, Sebastian; Kiebel, Stefan J
2015-10-01
The olfactory information that is received by the insect brain is encoded in the form of spatiotemporal patterns in the projection neurons of the antennal lobe. These dense and overlapping patterns are transformed into a sparse code in Kenyon cells in the mushroom body. Although it is clear that this sparse code is the basis for rapid categorization of odors, it is yet unclear how the sparse code in Kenyon cells is computed and what information it represents. Here we show that this computation can be modeled by sequential firing rate patterns using Lotka-Volterra equations and Bayesian online inference. This new model can be understood as an 'intelligent coincidence detector', which robustly and dynamically encodes the presence of specific odor features. We found that the model is able to qualitatively reproduce experimentally observed activity in both the projection neurons and the Kenyon cells. In particular, the model explains mechanistically how sparse activity in the Kenyon cells arises from the dense code in the projection neurons. The odor classification performance of the model proved to be robust against noise and time jitter in the observed input sequences. As in recent experimental results, we found that recognition of an odor happened very early during stimulus presentation in the model. Critically, by using the model, we found surprising but simple computational explanations for several experimental phenomena. PMID:26451888
Computationally efficient Bayesian inference for inverse problems.
Marzouk, Youssef M.; Najm, Habib N.; Rahn, Larry A.
2007-10-01
Bayesian statistics provides a foundation for inference from noisy and incomplete data, a natural mechanism for regularization in the form of prior information, and a quantitative assessment of uncertainty in the inferred results. Inverse problems - representing indirect estimation of model parameters, inputs, or structural components - can be fruitfully cast in this framework. Complex and computationally intensive forward models arising in physical applications, however, can render a Bayesian approach prohibitive. This difficulty is compounded by high-dimensional model spaces, as when the unknown is a spatiotemporal field. We present new algorithmic developments for Bayesian inference in this context, showing strong connections with the forward propagation of uncertainty. In particular, we introduce a stochastic spectral formulation that dramatically accelerates the Bayesian solution of inverse problems via rapid evaluation of a surrogate posterior. We also explore dimensionality reduction for the inference of spatiotemporal fields, using truncated spectral representations of Gaussian process priors. These new approaches are demonstrated on scalar transport problems arising in contaminant source inversion and in the inference of inhomogeneous material or transport properties. We also present a Bayesian framework for parameter estimation in stochastic models, where intrinsic stochasticity may be intermingled with observational noise. Evaluation of a likelihood function may not be analytically tractable in these cases, and thus several alternative Markov chain Monte Carlo (MCMC) schemes, operating on the product space of the observations and the parameters, are introduced.
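The MCMC machinery this abstract relies on can be illustrated with a minimal random-walk Metropolis sampler for a toy one-parameter inverse problem. Everything here (the forward model G(theta) = theta**2, the prior, the noise level) is illustrative and not taken from the cited work:

```python
import math
import random

def log_posterior(theta, data, sigma=0.5):
    # Toy inverse problem: forward model G(theta) = theta**2 observed with
    # Gaussian noise, standard-normal prior on theta (illustrative only).
    log_prior = -0.5 * theta ** 2
    log_lik = sum(-0.5 * ((y - theta ** 2) / sigma) ** 2 for y in data)
    return log_prior + log_lik

def metropolis(data, n_steps=5000, step=0.3, seed=1):
    """Random-walk Metropolis sampler targeting log_posterior."""
    random.seed(seed)
    theta = 0.0
    lp = log_posterior(theta, data)
    samples = []
    for _ in range(n_steps):
        prop = theta + random.gauss(0.0, step)
        lp_prop = log_posterior(prop, data)
        # Accept with probability min(1, posterior ratio)
        if math.log(random.random()) < lp_prop - lp:
            theta, lp = prop, lp_prop
        samples.append(theta)
    return samples

samples = metropolis([1.1, 0.9, 1.0])
```

A surrogate posterior, as in the abstract, would replace the expensive forward model inside `log_posterior` with a cheap spectral approximation; the sampling loop itself is unchanged.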
Albert, Carlo; Ulzega, Simone; Stoop, Ruedi
2016-04-01
Parameter inference is a fundamental problem in data-driven modeling. Given observed data that is believed to be a realization of some parameterized model, the aim is to find parameter values that are able to explain the observed data. In many situations, the dominant sources of uncertainty must be included into the model for making reliable predictions. This naturally leads to stochastic models. Stochastic models render parameter inference much harder, as the aim then is to find a distribution of likely parameter values. In Bayesian statistics, which is a consistent framework for data-driven learning, this so-called posterior distribution can be used to make probabilistic predictions. We propose a novel, exact, and very efficient approach for generating posterior parameter distributions for stochastic differential equation models calibrated to measured time series. The algorithm is inspired by reinterpreting the posterior distribution as a statistical mechanics partition function of an object akin to a polymer, where the measurements are mapped on heavier beads compared to those of the simulated data. To arrive at distribution samples, we employ a Hamiltonian Monte Carlo approach combined with a multiple time-scale integration. A separation of time scales naturally arises if either the number of measurement points or the number of simulation points becomes large. Furthermore, at least for one-dimensional problems, we can decouple the harmonic modes between measurement points and solve the fastest part of their dynamics analytically. Our approach is applicable to a wide range of inference problems and is highly parallelizable.
Optimal modeling of 1D azimuth correlations in the context of Bayesian inference
De Kock, Michiel B; Trainor, Thomas A
2015-01-01
Analysis and interpretation of spectrum and correlation data from high-energy nuclear collisions is currently controversial because two opposing physics narratives derive contradictory implications from the same data: one narrative claiming collision dynamics is dominated by dijet production and projectile-nucleon fragmentation, the other claiming collision dynamics is dominated by a dense, flowing QCD medium. Opposing interpretations seem to be supported by alternative data models, and current model-comparison schemes are unable to distinguish between them. There is clearly a need for a convincing new methodology to break the deadlock. In this study we introduce Bayesian Inference (BI) methods applied to angular correlation data as a basis to evaluate competing data models. For simplicity the data considered are projections of 2D angular correlations onto 1D azimuth from three centrality classes of 200 GeV Au-Au collisions. We consider several data models typical of current model choices, including Fourier seri...
Probability biases as Bayesian inference
Andre; C. R. Martins
2006-11-01
Full Text Available In this article, I will show how several observed biases in human probabilistic reasoning can be partially explained as good heuristics for making inferences in an environment where probabilities have uncertainties associated with them. Previous results show that the weight functions and the observed violations of coalescing and stochastic dominance can be understood from a Bayesian point of view. We will review those results and see that Bayesian methods should also be used as part of the explanation behind other known biases. That means that, although the observed errors are still errors under the experimental conditions, they can be understood as adaptations to the solution of real-life problems. Heuristics that allow fast evaluations and mimic a Bayesian inference would be an evolutionary advantage, since they would give us an efficient way of making decisions.
Bayesian inference in camera trapping studies for a class of spatial capture-recapture models
Royle, J. Andrew; Karanth, K. Ullas; Gopalaswamy, Arjun M.; Kumar, N. Samba
2009-01-01
We develop a class of models for inference about abundance or density using spatial capture-recapture data from studies based on camera trapping and related methods. The model is a hierarchical model composed of two components: a point process model describing the distribution of individuals in space (or their home range centers) and a model describing the observation of individuals in traps. We suppose that trap- and individual-specific capture probabilities are a function of distance between individual home range centers and trap locations. We show that the models can be regarded as generalized linear mixed models, where the individual home range centers are random effects. We adopt a Bayesian framework for inference under these models using a formulation based on data augmentation. We apply the models to camera trapping data on tigers from the Nagarahole Reserve, India, collected over 48 nights in 2006. For this study, 120 camera locations were used, but cameras were only operational at 30 locations during any given sample occasion. Movement of traps is common in many camera-trapping studies and represents an important feature of the observation model that we address explicitly in our application.
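A common concrete choice for the distance-dependent capture probability described above is a half-normal detection function. The sketch below (coordinates and parameter values are hypothetical, not from the tiger study) computes trap-specific capture probabilities from a home-range center and trap locations:

```python
import math

def halfnormal_capture_prob(center, trap, p0=0.3, sigma=1.5):
    """Capture probability declining with distance between an individual's
    home-range center and a trap (half-normal form; values illustrative)."""
    d2 = (center[0] - trap[0]) ** 2 + (center[1] - trap[1]) ** 2
    return p0 * math.exp(-d2 / (2.0 * sigma ** 2))

# Hypothetical trap array and home-range center
traps = [(0.0, 0.0), (2.0, 0.0), (4.0, 0.0)]
center = (0.5, 0.5)
probs = [halfnormal_capture_prob(center, t) for t in traps]
```

Capture probability falls off monotonically with distance, which is the key structural assumption linking the point-process and observation components of the hierarchical model.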
Jensen, Finn Verner; Nielsen, Thomas Dyhre
2016-01-01
Mathematically, a Bayesian graphical model is a compact representation of the joint probability distribution for a set of variables. The most frequently used type of Bayesian graphical models are Bayesian networks. The structural part of a Bayesian graphical model is a graph consisting of nodes and...... largely due to the availability of efficient inference algorithms for answering probabilistic queries about the states of the variables in the network. Furthermore, to support the construction of Bayesian network models, learning algorithms are also available. We give an overview of the Bayesian network...
von Nessi, G T
2012-01-01
A new method, based on Bayesian analysis, is presented which unifies the inference of plasma equilibria parameters in a Tokamak with the ability to quantify differences between inferred equilibria and Grad-Shafranov force-balance solutions. At the heart of this technique is the new method of observation splitting, which allows multiple forward models to be associated with a single diagnostic observation. This new idea subsequently provides a means by which the space of GS solutions can be efficiently characterised via a prior distribution. Moreover, by folding force-balance directly into one set of forward models and utilising simple Biot-Savart responses in another, the Bayesian inference of the plasma parameters itself produces an evidence (a normalisation constant of the inferred posterior distribution) which is sensitive to the relative consistency between both sets of models. This evidence can then be used to help determine the relative accuracy of the tested force-balance model across several discha...
Hill, T; Minier, V; Burton, M G; Cunningham, M R
2008-01-01
Concatenating data from the millimetre regime to the infrared, we have performed spectral energy distribution modelling for 227 of the 405 millimetre continuum sources of Hill et al. (2005) which are thought to contain young massive stars in the earliest stages of their formation. Three main parameters are extracted from the fits: temperature, mass and luminosity. The method employed was Bayesian inference, which allows a statistically probable range of suitable values for each parameter to be drawn for each individual protostellar candidate. This is the first application of this method to massive star formation. The cumulative distribution plots of the SED modelled parameters in this work indicate that collectively, the sources without methanol maser and/or radio continuum associations (MM-only cores) display similar characteristics to those of high mass star formation regions. Attributing significance to the marginal distinctions between the MM-only cores and the high-mass star formation sample we draw hypo...
Tactile length contraction as Bayesian inference.
Tong, Jonathan; Ngo, Vy; Goldreich, Daniel
2016-08-01
To perceive, the brain must interpret stimulus-evoked neural activity. This is challenging: The stochastic nature of the neural response renders its interpretation inherently uncertain. Perception would be optimized if the brain used Bayesian inference to interpret inputs in light of expectations derived from experience. Bayesian inference would improve perception on average but cause illusions when stimuli violate expectation. Intriguingly, tactile, auditory, and visual perception are all prone to length contraction illusions, characterized by the dramatic underestimation of the distance between punctate stimuli delivered in rapid succession; the origin of these illusions has been mysterious. We previously proposed that length contraction illusions occur because the brain interprets punctate stimulus sequences using Bayesian inference with a low-velocity expectation. A novel prediction of our Bayesian observer model is that length contraction should intensify if stimuli are made more difficult to localize. Here we report a tactile psychophysical study that tested this prediction. Twenty humans compared two distances on the forearm: a fixed reference distance defined by two taps with 1-s temporal separation and an adjustable comparison distance defined by two taps with temporal separation t ≤ 1 s. We observed significant length contraction: As t was decreased, participants perceived the two distances as equal only when the comparison distance was made progressively greater than the reference distance. Furthermore, the use of weaker taps significantly enhanced participants' length contraction. These findings confirm the model's predictions, supporting the view that the spatiotemporal percept is a best estimate resulting from a Bayesian inference process. PMID:27121574
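The low-velocity-prior account above can be sketched numerically: combine a Gaussian likelihood for the measured separation with a zero-mean Gaussian prior on velocity (length divided by inter-tap interval), and the posterior mode shifts toward shorter lengths as the interval shrinks. All parameter values below are illustrative, not fitted to the study's data:

```python
def map_length(measured, t, sigma_m=1.0, sigma_v=5.0):
    """Grid-search MAP estimate of true separation under a Gaussian
    measurement likelihood and a zero-mean Gaussian prior on
    velocity = length / t (illustrative parameter values)."""
    best, best_lp = 0.0, -float("inf")
    for i in range(2001):
        length = i * 0.01  # candidate length on a grid, 0 .. 20
        lp = (-0.5 * ((measured - length) / sigma_m) ** 2
              - 0.5 * (length / (t * sigma_v)) ** 2)
        if lp > best_lp:
            best, best_lp = length, lp
    return best

# Perceived length contracts more at shorter inter-tap intervals.
long_gap = map_length(10.0, t=1.0)
short_gap = map_length(10.0, t=0.2)
```

The closed form of this MAP estimate is measured / (1 + sigma_m**2 / (t**2 * sigma_v**2)), which makes the contraction at small t explicit; weaker taps correspond to a larger sigma_m and hence stronger contraction, matching the reported result.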
Bayesian inference on proportional elections.
Brunello, Gabriel Hideki Vatanabe; Nakano, Eduardo Yoshio
2015-01-01
Polls for majoritarian voting systems usually show estimates of the percentage of votes for each candidate. However, proportional vote systems do not necessarily guarantee the candidate with the most percentage of votes will be elected. Thus, traditional methods used in majoritarian elections cannot be applied on proportional elections. In this context, the purpose of this paper was to perform a Bayesian inference on proportional elections considering the Brazilian system of seats distribution. More specifically, a methodology to answer the probability that a given party will have representation on the chamber of deputies was developed. Inferences were made on a Bayesian scenario using the Monte Carlo simulation technique, and the developed methodology was applied on data from the Brazilian elections for Members of the Legislative Assembly and Federal Chamber of Deputies in 2010. A performance rate was also presented to evaluate the efficiency of the methodology. Calculations and simulations were carried out using the free R statistical software. PMID:25786259
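The kind of Monte Carlo computation described above can be sketched as follows. Note this uses a plain D'Hondt highest-averages allocation as a stand-in; Brazil's actual rules involve an electoral quotient plus an averages step, so this is a simplified illustration with hypothetical vote counts:

```python
import random

def dhondt(votes, seats):
    """Highest-averages (D'Hondt) seat allocation (simplified stand-in
    for the Brazilian seat-distribution rules)."""
    alloc = [0] * len(votes)
    for _ in range(seats):
        quotients = [v / (a + 1) for v, a in zip(votes, alloc)]
        alloc[quotients.index(max(quotients))] += 1
    return alloc

def prob_any_seat(counts, seats, party, n_sim=2000, seed=7):
    """Monte Carlo posterior probability that `party` wins at least one
    seat, sampling vote shares from a Dirichlet posterior (flat prior)
    via normalized Gamma draws."""
    random.seed(seed)
    hits = 0
    for _ in range(n_sim):
        shares = [random.gammavariate(c + 1, 1.0) for c in counts]
        if dhondt(shares, seats)[party] > 0:
            hits += 1
    return hits / n_sim

p = prob_any_seat([5000, 3000, 300], seats=4, party=2)
```

The allocation step is scale-invariant, so the unnormalized Gamma draws can be fed to `dhondt` directly.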
Bazin, Eric; Dawson, Kevin J; Beaumont, Mark A
2010-06-01
We address the problem of finding evidence of natural selection from genetic data, accounting for the confounding effects of demographic history. In the absence of natural selection, gene genealogies should all be sampled from the same underlying distribution, often approximated by a coalescent model. Selection at a particular locus will lead to a modified genealogy, and this motivates a number of recent approaches for detecting the effects of natural selection in the genome as "outliers" under some models. The demographic history of a population affects the sampling distribution of genealogies, and therefore the observed genotypes and the classification of outliers. Since we cannot see genealogies directly, we have to infer them from the observed data under some model of mutation and demography. Thus the accuracy of an outlier-based approach depends to a greater or a lesser extent on the uncertainty about the demographic and mutational model. A natural modeling framework for this type of problem is provided by Bayesian hierarchical models, in which parameters, such as mutation rates and selection coefficients, are allowed to vary across loci. It has proved quite difficult computationally to implement fully probabilistic genealogical models with complex demographies, and this has motivated the development of approximations such as approximate Bayesian computation (ABC). In ABC the data are compressed into summary statistics, and computation of the likelihood function is replaced by simulation of data under the model. In a hierarchical setting one may be interested both in hyperparameters and parameters, and there may be very many of the latter--for example, in a genetic model, these may be parameters describing each of many loci or populations. This poses a problem for ABC in that one then requires summary statistics for each locus, which, if used naively, leads to a consequent difficulty in conditional density estimation. We develop a general method for applying
BAMBI: blind accelerated multimodal Bayesian inference
Graff, Philip; Hobson, Michael P; Lasenby, Anthony
2011-01-01
In this paper we present an algorithm for rapid Bayesian analysis that combines the benefits of nested sampling and artificial neural networks. The blind accelerated multimodal Bayesian inference (BAMBI) algorithm implements the MultiNest package for nested sampling as well as the training of an artificial neural network (NN) to learn the likelihood function. In the case of computationally expensive likelihoods, this allows the substitution of a much more rapid approximation in order to increase significantly the speed of the analysis. We begin by demonstrating, with a few toy examples, the ability of a NN to learn complicated likelihood surfaces. BAMBI's ability to decrease running time for Bayesian inference is then demonstrated in the context of estimating cosmological parameters from WMAP and other observations. We show that valuable speed increases are achieved in addition to obtaining NNs trained on the likelihood functions for the different model and data combinations. These NNs can then be used for an...
Universal Darwinism as a process of Bayesian inference
Campbell, John O
2016-01-01
Many of the mathematical frameworks describing natural selection are equivalent to Bayes Theorem, also known as Bayesian updating. By definition, a process of Bayesian Inference is one which involves a Bayesian update, so we may conclude that these frameworks describe natural selection as a process of Bayesian inference. Thus natural selection serves as a counter example to a widely-held interpretation that restricts Bayesian Inference to human mental processes (including the endeavors of statisticians). As Bayesian inference can always be cast in terms of (variational) free energy minimization, natural selection can be viewed as comprising two components: a generative model of an "experiment" in the external world environment, and the results of that "experiment" or the "surprise" entailed by predicted and actual outcomes of the "experiment". Minimization of free energy implies that the implicit measure of "surprise" experienced serves to update the generative model in a Bayesian manner. This description clo...
Bayesian model comparison and parameter inference in systems biology using nested sampling.
Pullen, Nick; Morris, Richard J
2014-01-01
Inferring parameters for models of biological processes is a current challenge in systems biology, as is the related problem of comparing competing models that explain the data. In this work we apply Skilling's nested sampling to address both of these problems. Nested sampling is a Bayesian method for exploring parameter space that transforms a multi-dimensional integral to a 1D integration over likelihood space. This approach focuses on the computation of the marginal likelihood or evidence. The ratio of evidences of different models leads to the Bayes factor, which can be used for model comparison. We demonstrate how nested sampling can be used to reverse-engineer a system's behaviour whilst accounting for the uncertainty in the results. The effect of missing initial conditions of the variables as well as unknown parameters is investigated. We show how the evidence and the model ranking can change as a function of the available data. Furthermore, the addition of data from extra variables of the system can deliver more information for model comparison than increasing the data from one variable, thus providing a basis for experimental design. PMID:24523891
Bayesian model comparison and parameter inference in systems biology using nested sampling.
Nick Pullen
Full Text Available Inferring parameters for models of biological processes is a current challenge in systems biology, as is the related problem of comparing competing models that explain the data. In this work we apply Skilling's nested sampling to address both of these problems. Nested sampling is a Bayesian method for exploring parameter space that transforms a multi-dimensional integral to a 1D integration over likelihood space. This approach focuses on the computation of the marginal likelihood or evidence. The ratio of evidences of different models leads to the Bayes factor, which can be used for model comparison. We demonstrate how nested sampling can be used to reverse-engineer a system's behaviour whilst accounting for the uncertainty in the results. The effect of missing initial conditions of the variables as well as unknown parameters is investigated. We show how the evidence and the model ranking can change as a function of the available data. Furthermore, the addition of data from extra variables of the system can deliver more information for model comparison than increasing the data from one variable, thus providing a basis for experimental design.
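Skilling's nested sampling, as used in the two records above, repeatedly replaces the lowest-likelihood live point and accumulates the evidence as Z ≈ Σ L_i ΔX_i over shrinking prior volumes. A bare-bones sketch for a one-dimensional toy likelihood (the likelihood, prior, and all settings are illustrative; real implementations draw constrained points far more cleverly than rejection sampling):

```python
import math
import random

def logaddexp(a, b):
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def nested_sampling(loglike, n_live=100, n_iter=600, seed=2):
    """Bare-bones nested sampler with a uniform prior on [0, 1]. New
    points come from naive rejection against the current likelihood
    threshold, which is workable only for easy toy problems."""
    rng = random.Random(seed)
    live = [rng.random() for _ in range(n_live)]
    logL = [loglike(x) for x in live]
    logZ = -float("inf")
    for i in range(n_iter):
        worst = logL.index(min(logL))
        # Weight of the shrinking prior-volume shell X_i - X_{i+1},
        # with X_i approximated as exp(-i / n_live).
        log_w = -i / n_live + math.log(1.0 - math.exp(-1.0 / n_live))
        logZ = logaddexp(logZ, logL[worst] + log_w)
        threshold = logL[worst]
        while True:  # replace worst live point with a draw above threshold
            x = rng.random()
            if loglike(x) > threshold:
                live[worst], logL[worst] = x, loglike(x)
                break
    return logZ

# Toy Gaussian likelihood centered at 0.5; the true log-evidence is
# approximately log(0.1 * sqrt(2 * pi)) ~ -1.38.
logZ = nested_sampling(lambda x: -0.5 * ((x - 0.5) / 0.1) ** 2)
```

Running two such models and differencing their log-evidences gives the log Bayes factor used for model comparison in the abstract.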
Bayesian inference for Hawkes processes
Rasmussen, Jakob Gulddahl
The Hawkes process is a practically and theoretically important class of point processes, but parameter-estimation for such a process can pose various problems. In this paper we explore and compare two approaches to Bayesian inference. The first approach is based on the so-called conditional...... intensity function, while the second approach is based on an underlying clustering and branching structure in the Hawkes process. For practical use, MCMC (Markov chain Monte Carlo) methods are employed. The two approaches are compared numerically using three examples of the Hawkes process....
Bayesian inference for Hawkes processes
Rasmussen, Jakob Gulddahl
2013-01-01
The Hawkes process is a practically and theoretically important class of point processes, but parameter-estimation for such a process can pose various problems. In this paper we explore and compare two approaches to Bayesian inference. The first approach is based on the so-called conditional...... intensity function, while the second approach is based on an underlying clustering and branching structure in the Hawkes process. For practical use, MCMC (Markov chain Monte Carlo) methods are employed. The two approaches are compared numerically using three examples of the Hawkes process....
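The conditional-intensity approach mentioned in the two records above evaluates, for an exponential-kernel Hawkes process, lambda(t) = mu + sum over past events t_i of alpha * beta * exp(-beta * (t - t_i)). A minimal evaluation with illustrative parameter values:

```python
import math

def hawkes_intensity(t, events, mu=0.5, alpha=0.8, beta=1.2):
    """Conditional intensity of a Hawkes process with an exponential
    excitation kernel (parameter values illustrative)."""
    return mu + sum(alpha * beta * math.exp(-beta * (t - ti))
                    for ti in events if ti < t)

events = [1.0, 1.5, 3.0]
lam = hawkes_intensity(3.2, events)
```

The likelihood used in the conditional-intensity approach to Bayesian inference is built from exactly these evaluations plus an integral of lambda over the observation window; the clustering-based approach instead augments the data with the latent branching structure.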
Multi-scale inference of interaction rules in animal groups using Bayesian model selection.
Richard P Mann
Full Text Available Inference of interaction rules of animals moving in groups usually relies on an analysis of large scale system behaviour. Models are tuned through repeated simulation until they match the observed behaviour. More recent work has used the fine scale motions of animals to validate and fit the rules of interaction of animals in groups. Here, we use a Bayesian methodology to compare a variety of models to the collective motion of glass prawns (Paratya australiensis). We show that these exhibit a stereotypical 'phase transition', whereby an increase in density leads to the onset of collective motion in one direction. We fit models to this data, which range from: a mean-field model where all prawns interact globally; to a spatial Markovian model where prawns are self-propelled particles influenced only by the current positions and directions of their neighbours; up to non-Markovian models where prawns have 'memory' of previous interactions, integrating their experiences over time when deciding to change behaviour. We show that the mean-field model fits the large scale behaviour of the system, but does not capture the observed locality of interactions. Traditional self-propelled particle models fail to capture the fine scale dynamics of the system. The most sophisticated model, the non-Markovian model, provides a good match to the data at both the fine scale and in terms of reproducing global dynamics, while maintaining a biologically plausible perceptual range. We conclude that prawns' movements are influenced by not just the current direction of nearby conspecifics, but also those encountered in the recent past. Given the simplicity of prawns as a study system our research suggests that self-propelled particle models of collective motion should, if they are to be realistic at multiple biological scales, include memory of previous interactions and other non-Markovian effects.
Multi-scale inference of interaction rules in animal groups using Bayesian model selection.
Richard P Mann
2012-01-01
Full Text Available Inference of interaction rules of animals moving in groups usually relies on an analysis of large scale system behaviour. Models are tuned through repeated simulation until they match the observed behaviour. More recent work has used the fine scale motions of animals to validate and fit the rules of interaction of animals in groups. Here, we use a Bayesian methodology to compare a variety of models to the collective motion of glass prawns (Paratya australiensis). We show that these exhibit a stereotypical 'phase transition', whereby an increase in density leads to the onset of collective motion in one direction. We fit models to this data, which range from: a mean-field model where all prawns interact globally; to a spatial Markovian model where prawns are self-propelled particles influenced only by the current positions and directions of their neighbours; up to non-Markovian models where prawns have 'memory' of previous interactions, integrating their experiences over time when deciding to change behaviour. We show that the mean-field model fits the large scale behaviour of the system, but does not capture fine scale rules of interaction, which are primarily mediated by physical contact. Conversely, the Markovian self-propelled particle model captures the fine scale rules of interaction but fails to reproduce global dynamics. The most sophisticated model, the non-Markovian model, provides a good match to the data at both the fine scale and in terms of reproducing global dynamics. We conclude that prawns' movements are influenced by not just the current direction of nearby conspecifics, but also those encountered in the recent past. Given the simplicity of prawns as a study system our research suggests that self-propelled particle models of collective motion should, if they are to be realistic at multiple biological scales, include memory of previous interactions and other non-Markovian effects.
A Fast Iterative Bayesian Inference Algorithm for Sparse Channel Estimation
Pedersen, Niels Lovmand; Manchón, Carles Navarro; Fleury, Bernard Henri
2013-01-01
representation of the Bessel K probability density function; a highly efficient, fast iterative Bayesian inference method is then applied to the proposed model. The resulting estimator outperforms other state-of-the-art Bayesian and non-Bayesian estimators, either by yielding lower mean squared estimation error...
Gustafson, Paul
2014-01-01
Partially identified models are characterized by the distribution of observables being compatible with a set of values for the target parameter, rather than a single value. This set is often referred to as an identification region. From a non-Bayesian point of view, the identification region is the object revealed to the investigator in the limit of increasing sample size. Conversely, a Bayesian analysis provides the identification region plus the limiting posterior distribution over this reg...
Hug, Sabine Carolin
2015-01-01
In this thesis we use differential equations for mathematically representing biological processes. For this we have to infer the associated parameters for fitting the differential equations to measurement data. If the structure of the ODE itself is uncertain, model selection methods have to be applied. We refine several existing Bayesian methods, ranging from an adaptive scheme for the computation of high-dimensional integrals to multi-chain Metropolis-Hastings algorithms for high-dimensional...
Bayesianism and inference to the best explanation
Valeriano IRANZO
2008-01-01
Full Text Available Bayesianism and Inference to the best explanation (IBE) are two different models of inference. Recently there has been some debate about the possibility of “bayesianizing” IBE. Firstly I explore several alternatives to include explanatory considerations in Bayes’s Theorem. Then I distinguish two different interpretations of prior probabilities: “IBE-Bayesianism” (IBE-Bay) and “frequentist-Bayesianism” (Freq-Bay). After detailing the content of the latter, I propose a rule for assessing the priors. I also argue that Freq-Bay: (i) endorses a role for explanatory value in the assessment of scientific hypotheses; (ii) avoids a purely subjectivist reading of prior probabilities; and (iii) fits better than IBE-Bayesianism with two basic facts about science, i.e., the prominent role played by empirical testing and the existence of many scientific theories in the past that failed to fulfil their promises and were subsequently abandoned.
Compiling Relational Bayesian Networks for Exact Inference
Jaeger, Manfred; Chavira, Mark; Darwiche, Adnan
2004-01-01
We describe a system for exact inference with relational Bayesian networks as defined in the publicly available \\primula\\ tool. The system is based on compiling propositional instances of relational Bayesian networks into arithmetic circuits and then performing online inference by evaluating and ...
Compiling Relational Bayesian Networks for Exact Inference
Jaeger, Manfred; Darwiche, Adnan; Chavira, Mark
We describe in this paper a system for exact inference with relational Bayesian networks as defined in the publicly available PRIMULA tool. The system is based on compiling propositional instances of relational Bayesian networks into arithmetic circuits and then performing online inference by eva...
Tang, An-Min; Tang, Nian-Sheng
2015-02-28
We propose a semiparametric multivariate skew-normal joint model for multivariate longitudinal and multivariate survival data. One main feature of the posited model is that we relax the commonly used normality assumption for random effects and within-subject error by using a centered Dirichlet process prior to specify the random effects distribution and using a multivariate skew-normal distribution to specify the within-subject error distribution and model trajectory functions of longitudinal responses semiparametrically. A Bayesian approach is proposed to simultaneously obtain Bayesian estimates of unknown parameters, random effects and nonparametric functions by combining the Gibbs sampler and the Metropolis-Hastings algorithm. Particularly, a Bayesian local influence approach is developed to assess the effect of minor perturbations to within-subject measurement error and random effects. Several simulation studies and an example are presented to illustrate the proposed methodologies. PMID:25404574
Bayesian Estimation and Inference Using Stochastic Electronics.
Thakur, Chetan Singh; Afshar, Saeed; Wang, Runchun M; Hamilton, Tara J; Tapson, Jonathan; van Schaik, André
2016-01-01
In this paper, we present the implementation of two types of Bayesian inference problems to demonstrate the potential of building probabilistic algorithms in hardware using a single set of building blocks with the ability to perform these computations in real time. The first implementation, referred to as the BEAST (Bayesian Estimation and Stochastic Tracker), demonstrates a simple problem where an observer uses an underlying Hidden Markov Model (HMM) to track a target in one dimension. In this implementation, sensors make noisy observations of the target position at discrete time steps. The tracker learns the transition model for target movement, and the observation model for the noisy sensors, and uses these to estimate the target position by solving the Bayesian recursive equation online. We show the tracking performance of the system and demonstrate how it can learn the observation model, the transition model, and the external distractor (noise) probability interfering with the observations. In the second implementation, referred to as the Bayesian INference in DAG (BIND), we show how inference can be performed in a Directed Acyclic Graph (DAG) using stochastic circuits. We show how these building blocks can be easily implemented using simple digital logic gates. An advantage of the stochastic electronic implementation is that it is robust to certain types of noise, which may become an issue in integrated circuit (IC) technology with feature sizes in the order of tens of nanometers due to their low noise margin, the effect of high-energy cosmic rays and the low supply voltage. In our framework, the flipping of random individual bits would not affect the system performance because information is encoded in a bit stream. PMID:27047326
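The bit-stream encoding underlying this hardware can be sketched in a few lines: stochastic computing represents a probability as the density of 1s in a random bit stream, so multiplying two probabilities reduces to AND-ing two independent streams. This is a generic stochastic-computing illustration, not the paper's specific circuits:

```python
import random

def bernoulli_stream(p, n, rng):
    """Bit stream whose density of 1s encodes the probability p."""
    return [1 if rng.random() < p else 0 for _ in range(n)]

def stochastic_multiply(p, q, n=100000, seed=3):
    """Approximate p * q by AND-ing two independent Bernoulli bit
    streams, as a single AND gate would in stochastic hardware."""
    rng = random.Random(seed)
    a = bernoulli_stream(p, n, rng)
    b = bernoulli_stream(q, n, rng)
    return sum(x & y for x, y in zip(a, b)) / n

est = stochastic_multiply(0.6, 0.5)
```

Flipping a handful of individual bits perturbs the estimate only slightly, which is the noise robustness the abstract highlights.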
Fang-Rong Yan
Full Text Available This article provides a fully Bayesian approach for modeling of single-dose and complete pharmacokinetic data in a population pharmacokinetic (PK) model. To overcome the impact of outliers and the difficulty of computation, a generalized linear model is chosen with the hypothesis that the errors follow a multivariate Student t distribution, which is a heavy-tailed distribution. The aim of this study is to investigate and implement the performance of the multivariate t distribution to analyze population pharmacokinetic data. Bayesian predictive inferences and the Metropolis-Hastings algorithm schemes are used to process the intractable posterior integration. The precision and accuracy of the proposed model are illustrated by simulated data and a real example of theophylline data.
Universal Darwinism As a Process of Bayesian Inference.
Campbell, John O
2016-01-01
Many of the mathematical frameworks describing natural selection are equivalent to Bayes' Theorem, also known as Bayesian updating. By definition, a process of Bayesian Inference is one which involves a Bayesian update, so we may conclude that these frameworks describe natural selection as a process of Bayesian inference. Thus, natural selection serves as a counterexample to a widely held interpretation that restricts Bayesian Inference to human mental processes (including the endeavors of statisticians). As Bayesian inference can always be cast in terms of (variational) free energy minimization, natural selection can be viewed as comprising two components: a generative model of an "experiment" in the external world environment, and the results of that "experiment" or the "surprise" entailed by predicted and actual outcomes of the "experiment." Minimization of free energy implies that the implicit measure of "surprise" experienced serves to update the generative model in a Bayesian manner. This description closely accords with the mechanisms of the generalized Darwinian processes proposed both by Dawkins, in terms of replicators and vehicles, and Campbell, in terms of inferential systems. Bayesian inference is an algorithm for the accumulation of evidence-based knowledge. This algorithm is now seen to operate over a wide range of evolutionary processes, including natural selection, the evolution of mental models and cultural evolutionary processes, notably including science itself. The variational principle of free energy minimization may thus serve as a unifying mathematical framework for universal Darwinism, the study of evolutionary processes operating throughout nature. PMID:27375438
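The equivalence the abstract asserts can be checked directly for discrete types: one Bayesian update with likelihood playing the role of fitness is term-by-term identical to one step of discrete replicator dynamics. The frequencies and fitness values below are illustrative:

```python
# Bayesian updating vs. discrete replicator dynamics: the same map.
prior = {"A": 0.5, "B": 0.3, "C": 0.2}      # population frequencies
fitness = {"A": 1.0, "B": 2.0, "C": 0.5}    # likelihood of surviving the "experiment"

def bayes_update(prior, likelihood):
    """Posterior ∝ prior × likelihood (Bayes' theorem over discrete types)."""
    z = sum(prior[t] * likelihood[t] for t in prior)   # evidence = mean fitness
    return {t: prior[t] * likelihood[t] / z for t in prior}

def replicator_step(freq, fitness):
    """Discrete replicator dynamics: x_i' = x_i * f_i / mean fitness."""
    mean_f = sum(freq[t] * fitness[t] for t in freq)
    return {t: freq[t] * fitness[t] / mean_f for t in freq}

posterior = bayes_update(prior, fitness)
next_gen = replicator_step(prior, fitness)
print(posterior == next_gen)  # -> True: identical maps, term by term
```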
Bayesian default probability models
Andrlíková, Petra
2014-01-01
This paper proposes a methodology for default probability estimation for low default portfolios, where the statistical inference may become troublesome. The author suggests using logistic regression models with the Bayesian estimation of parameters. The piecewise logistic regression model and Box-Cox transformation of credit risk score is used to derive the estimates of probability of default, which extends the work by Neagu et al. (2009). The paper shows that the Bayesian models are more acc...
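The core idea of Bayesian logistic regression for default probabilities can be sketched with a one-coefficient grid posterior. The scores, default indicators, and standard normal prior below are toy assumptions, not the paper's portfolio data or its piecewise/Box-Cox specification:

```python
import math

# Grid posterior for the slope of a single-covariate logistic model
# p(default | score) = sigmoid(beta * score), with a N(0, 1) prior on beta.
scores = [-2.0, -1.0, 0.0, 1.0, 2.0]     # credit risk scores (toy data)
defaults = [0, 0, 0, 1, 1]               # default indicators

def log_post(beta):
    ll = 0.0
    for s, d in zip(scores, defaults):
        p = 1.0 / (1.0 + math.exp(-beta * s))
        ll += d * math.log(p) + (1 - d) * math.log(1 - p)
    return ll - 0.5 * beta * beta        # standard normal prior on beta

grid = [0.01 * k - 5 for k in range(1001)]        # beta in [-5, 5]
w = [math.exp(log_post(b)) for b in grid]
z = sum(w)
post_mean = sum(b * wi for b, wi in zip(grid, w)) / z
print(post_mean > 0)   # defaults at positive scores pull beta positive
```

For low default portfolios the prior dominates this posterior, which is exactly why the Bayesian treatment stabilises the estimates where maximum likelihood becomes troublesome.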
Møller, Jesper; Rasmussen, Jakob Gulddahl
We introduce a flexible spatial point process model for spatial point patterns exhibiting linear structures, without incorporating a latent line process. The model is given by an underlying sequential point process model, i.e. each new point is generated given the previous points. Under this model...... previous points is such that the dependent cluster point is likely to occur closely to a previous cluster point. We demonstrate the flexibility of the model for producing point patterns with linear structures, and propose to use the model as the likelihood in a Bayesian setting when analyzing a spatial...
From least squares to multilevel modeling: A graphical introduction to Bayesian inference
Loredo, Thomas J.
2016-01-01
This tutorial presentation will introduce some of the key ideas and techniques involved in applying Bayesian methods to problems in astrostatistics. The focus will be on the big picture: understanding the foundations (interpreting probability, Bayes's theorem, the law of total probability and marginalization), making connections to traditional methods (propagation of errors, least squares, chi-squared, maximum likelihood, Monte Carlo simulation), and highlighting problems where a Bayesian approach can be particularly powerful (Poisson processes, density estimation and curve fitting with measurement error). The "graphical" component of the title reflects an emphasis on pictorial representations of some of the math, but also on the use of graphical models (multilevel or hierarchical models) for analyzing complex data. Code for some examples from the talk will be available to participants, in Python and in the Stan probabilistic programming language.
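In the spirit of the tutorial's "big picture" (Bayes's theorem, marginalization, Poisson processes), a grid approximation to the posterior of a Poisson rate can be compared against the known conjugate Gamma result. The counts and prior below are illustrative:

```python
import math

# Posterior of a Poisson rate: conjugate Gamma update vs. grid Bayes.
counts = [3, 5, 4, 6, 2]                    # observed Poisson counts (toy data)
a0, b0 = 1.0, 1.0                           # Gamma(shape, rate) prior

# Conjugate update: Gamma(a0 + sum(counts), b0 + n)
a_post, b_post = a0 + sum(counts), b0 + len(counts)
exact_mean = a_post / b_post                # -> 3.5

# Numerical posterior via Bayes' theorem on a grid
grid = [0.01 * k for k in range(1, 2000)]
def log_post(lam):
    loglik = sum(c * math.log(lam) - lam - math.lgamma(c + 1) for c in counts)
    logprior = (a0 - 1) * math.log(lam) - b0 * lam
    return loglik + logprior

w = [math.exp(log_post(l)) for l in grid]
z = sum(w)                                   # marginalization over the grid
grid_mean = sum(l * wi for l, wi in zip(grid, w)) / z
print(round(exact_mean, 2), round(grid_mean, 2))
```

The same grid-then-normalize pattern generalizes to the multilevel models of the title, where the grid is replaced by MCMC over the joint parameter space.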
Propriety conditions for the Bayesian autologistic model – Inference for histone modifications
Mitra, Riten; Müller, Peter; Ji, Yuan
2013-01-01
Motivated by inference for a set of histone modifications we consider an improper prior for an autologistic model. We state sufficient conditions for posterior propriety under a constant prior on the coefficients of an autologistic model. We use known results for a multinomial logistic regression to prove posterior propriety under the autologistic model. The conditions are easily verified.
Bayesian Inference Methods for Sparse Channel Estimation
Pedersen, Niels Lovmand
2013-01-01
This thesis deals with sparse Bayesian learning (SBL) with application to radio channel estimation. As opposed to the classical approach for sparse signal representation, we focus on the problem of inferring complex signals. Our investigations within SBL constitute the basis for the development of Bayesian inference algorithms for sparse channel estimation. Sparse inference methods aim at finding the sparse representation of a signal given in some overcomplete dictionary of basis vectors. Within this context, one of our main contributions to the field of SBL is a hierarchical representation of ... inference algorithms based on the proposed prior representation for sparse channel estimation in orthogonal frequency-division multiplexing receivers. The inference algorithms, which are mainly obtained from variational Bayesian methods, exploit the underlying sparse structure of wireless channel responses ...
Zhang, Guannan [ORNL]; Webster, Clayton G [ORNL]; Gunzburger, Max D [ORNL]
2012-09-01
Although Bayesian analysis has become vital to the quantification of prediction uncertainty in groundwater modeling, its application has been hindered due to the computational cost associated with numerous model executions needed for exploring the posterior probability density function (PPDF) of model parameters. This is particularly the case when the PPDF is estimated using Markov Chain Monte Carlo (MCMC) sampling. In this study, we develop a new approach that improves the computational efficiency of Bayesian inference by constructing a surrogate system based on an adaptive sparse-grid high-order stochastic collocation (aSG-hSC) method. Unlike previous works using a first-order hierarchical basis, we utilize a compactly supported higher-order hierarchical basis to construct the surrogate system, resulting in a significant reduction in the number of computational simulations required. In addition, we use the hierarchical surplus as an error indicator to determine adaptive sparse grids. This allows local refinement in the uncertain domain and/or anisotropic detection with respect to the random model parameters, which further improves computational efficiency. Finally, we incorporate a global optimization technique and propose an iterative algorithm for building the surrogate system for the PPDF with multiple significant modes. Once the surrogate system is determined, the PPDF can be evaluated by sampling the surrogate system directly with very little computational cost. The developed method is evaluated first using a simple analytical density function with multiple modes and then using two synthetic groundwater reactive transport models. The groundwater models represent different levels of complexity; the first example involves coupled linear reactions and the second example simulates nonlinear uranium surface complexation. The results show that the aSG-hSC is an effective and efficient tool for Bayesian inference in groundwater modeling in comparison with conventional
Bayesian inference tools for inverse problems
Mohammad-Djafari, Ali
2013-08-01
In this paper, the basics of Bayesian inference with a parametric model of the data are first presented. Then, the extensions needed when dealing with inverse problems are given, in particular for linear models such as deconvolution or image reconstruction in Computed Tomography (CT). The main point discussed is the prior modeling of signals and images. A classification of these priors is presented: first separable and Markovian models, then simple or hierarchical models with hidden variables. For practical applications, we also need to consider the estimation of the hyperparameters. Finally, we see that we have to infer simultaneously the unknowns, the hidden variables, and the hyperparameters. Very often, the expression of this joint posterior law is too complex to be handled directly; indeed, rarely can we obtain analytical solutions for point estimators such as the Maximum A Posteriori (MAP) or Posterior Mean (PM). Three main tools can then be used: Laplace approximation (LAP), Markov Chain Monte Carlo (MCMC), and Bayesian Variational Approximations (BVA). To illustrate these aspects, we consider a deconvolution problem where we know that the input signal is sparse and propose a Student-t prior for it. To handle the Bayesian computations with this model, we use the property that the Student-t can be modeled as an infinite mixture of Gaussians, thus introducing hidden variables, which are the variances. The expression of the joint posterior of the input signal samples, the hidden variables (here the inverse variances of those samples), and the hyperparameters of the problem (for example the variance of the noise) is then given. From this point, we present the joint maximization by alternate optimization and the three possible approximation methods. Finally, the proposed methodology is applied in different applications such as mass spectrometry, spectrum estimation of quasi-periodic biological signals and
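The scale-mixture property this abstract relies on can be verified numerically: a Student-t density equals an infinite mixture of zero-mean Gaussians whose precisions (the hidden variables) follow a Gamma distribution. The degrees of freedom and evaluation point below are arbitrary choices, not values from the paper:

```python
import math

# Check: t_nu(x) = ∫ N(x | 0, 1/tau) Gamma(tau | nu/2, nu/2) dtau
nu, x = 4.0, 1.3   # degrees of freedom and evaluation point (illustrative)

def t_pdf(x, nu):
    c = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return c * (1 + x * x / nu) ** (-(nu + 1) / 2)

def gamma_pdf(tau, a, b):
    return b ** a * tau ** (a - 1) * math.exp(-b * tau) / math.gamma(a)

def normal_pdf(x, tau):           # zero-mean Gaussian with precision tau
    return math.sqrt(tau / (2 * math.pi)) * math.exp(-0.5 * tau * x * x)

# Riemann sum over the hidden precision variable on a fine grid
h = 0.001
mix = sum(normal_pdf(x, tau) * gamma_pdf(tau, nu / 2, nu / 2) * h
          for tau in (h * k for k in range(1, 40000)))

print(abs(mix - t_pdf(x, nu)) < 1e-4)  # -> True: the two densities agree
```

Conditioning on the hidden precisions makes the model conditionally Gaussian, which is what makes the MCMC and variational computations in the paper tractable.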
Human collective intelligence as distributed Bayesian inference
Krafft, Peter M; Pan, Wei; Della Penna, Nicolás; Altshuler, Yaniv; Shmueli, Erez; Tenenbaum, Joshua B; Pentland, Alex
2016-01-01
Collective intelligence is believed to underlie the remarkable success of human society. The formation of accurate shared beliefs is one of the key components of human collective intelligence. How are accurate shared beliefs formed in groups of fallible individuals? Answering this question requires a multiscale analysis. We must understand both the individual decision mechanisms people use, and the properties and dynamics of those mechanisms in the aggregate. As of yet, mathematical tools for such an approach have been lacking. To address this gap, we introduce a new analytical framework: We propose that groups arrive at accurate shared beliefs via distributed Bayesian inference. Distributed inference occurs through information processing at the individual level, and yields rational belief formation at the group level. We instantiate this framework in a new model of human social decision-making, which we validate using a dataset we collected of over 50,000 users of an online social trading platform where inves...
Bayesian Inference on Gravitational Waves
Asad Ali
2015-12-01
Full Text Available The Bayesian approach is becoming increasingly popular among the astrophysics data analysis communities. However, the Pakistan statistics communities are unaware of this fertile interaction between the two disciplines. Bayesian methods have been in use to address astronomical problems since the very birth of Bayes' probability in the eighteenth century. Today the Bayesian methods for the detection and parameter estimation of gravitational waves have solid theoretical grounds with a strong promise for realistic applications. This article aims to introduce the Pakistan statistics communities to the applications of Bayesian Monte Carlo methods in the analysis of gravitational wave data, with an overview of the Bayesian signal detection and estimation methods and a demonstration by a couple of simplified examples.
Elsheikh, Ahmed H.
2014-02-01
An efficient Bayesian calibration method based on the nested sampling (NS) algorithm and non-intrusive polynomial chaos method is presented. Nested sampling is a Bayesian sampling algorithm that builds a discrete representation of the posterior distributions by iteratively re-focusing a set of samples to high likelihood regions. NS allows representing the posterior probability density function (PDF) with a smaller number of samples and reduces the curse of dimensionality effects. The main difficulty of the NS algorithm is in the constrained sampling step which is commonly performed using a random walk Markov Chain Monte-Carlo (MCMC) algorithm. In this work, we perform a two-stage sampling using a polynomial chaos response surface to filter out rejected samples in the Markov Chain Monte-Carlo method. The combined use of nested sampling and the two-stage MCMC based on approximate response surfaces provides significant computational gains in terms of the number of simulation runs. The proposed algorithm is applied for calibration and model selection of subsurface flow models. © 2013.
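The two-stage MCMC idea can be sketched with a toy target: a cheap surrogate log-posterior screens each proposal, and the "expensive" model is only evaluated on proposals that survive the first stage. The target, surrogate, and step size below are illustrative stand-ins, not the paper's subsurface flow models or polynomial chaos surfaces:

```python
import math
import random

random.seed(0)

def expensive_logpost(x):      # stands in for a full simulation run
    return -0.5 * x * x

def surrogate_logpost(x):      # cheap approximation, slightly biased on purpose
    return -0.5 * x * x + 0.01 * x

def two_stage_mh(n_steps, step=1.0):
    x, samples, expensive_calls = 0.0, [], 0
    for _ in range(n_steps):
        y = x + random.gauss(0.0, step)
        # Stage 1: accept/reject against the surrogate only (cheap)
        if random.random() < math.exp(min(0.0, surrogate_logpost(y) - surrogate_logpost(x))):
            # Stage 2: correct with the expensive model, so the chain
            # still targets the exact posterior despite the biased surrogate
            expensive_calls += 1
            a = (expensive_logpost(y) - expensive_logpost(x)) \
                - (surrogate_logpost(y) - surrogate_logpost(x))
            if random.random() < math.exp(min(0.0, a)):
                x = y
        samples.append(x)
    return samples, expensive_calls

samples, calls = two_stage_mh(5000)
print(calls < 5000)  # the expensive model ran on only a fraction of steps
```

The paper combines this filtering with nested sampling's constrained-sampling step, where the saved expensive evaluations translate directly into fewer simulation runs.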
Efficient Bayesian inference for ARFIMA processes
Graves, T.; Gramacy, R. B.; Franzke, C. L. E.; Watkins, N. W.
2015-03-01
Many geophysical quantities, like atmospheric temperature, water levels in rivers, and wind speeds, have shown evidence of long-range dependence (LRD). LRD means that these quantities experience non-trivial temporal memory, which potentially enhances their predictability, but also hampers the detection of externally forced trends. Thus, it is important to reliably identify whether or not a system exhibits LRD. In this paper we present a modern and systematic approach to the inference of LRD. Rather than Mandelbrot's fractional Gaussian noise, we use the more flexible Autoregressive Fractional Integrated Moving Average (ARFIMA) model which is widely used in time series analysis, and of increasing interest in climate science. Unlike most previous work on the inference of LRD, which is frequentist in nature, we provide a systematic treatment of Bayesian inference. In particular, we provide a new approximate likelihood for efficient parameter inference, and show how nuisance parameters (e.g. short memory effects) can be integrated over in order to focus on long memory parameters, and hypothesis testing more directly. We illustrate our new methodology on the Nile water level data, with favorable comparison to the standard estimators.
Systematic validation of non-equilibrium thermochemical models using Bayesian inference
Miki, Kenji [NASA Glenn Research Center, OAI, 22800 Cedar Point Rd, Cleveland, OH 44142 (United States); Panesi, Marco, E-mail: mpanesi@illinois.edu [Department of Aerospace Engineering, University of Illinois at Urbana-Champaign, 306 Talbot Lab, 104 S. Wright St., Urbana, IL 61801 (United States); Prudhomme, Serge [Département de mathématiques et de génie industriel, Ecole Polytechnique de Montréal, C.P. 6079, succ. Centre-ville, Montréal, QC, H3C 3A7 (Canada)
2015-10-01
The validation process proposed by Babuška et al. [1] is applied to thermochemical models describing post-shock flow conditions. In this validation approach, experimental data is involved only in the calibration of the models, and the decision process is based on quantities of interest (QoIs) predicted on scenarios that are not necessarily amenable experimentally. Moreover, uncertainties present in the experimental data, as well as those resulting from an incomplete physical model description, are propagated to the QoIs. We investigate four commonly used thermochemical models: a one-temperature model (which assumes thermal equilibrium among all inner modes), and two-temperature models developed by Macheret et al. [2], Marrone and Treanor [3], and Park [4]. Up to 16 uncertain parameters are estimated using Bayesian updating based on the latest absolute volumetric radiance data collected at the Electric Arc Shock Tube (EAST) installed inside the NASA Ames Research Center. Following the solution of the inverse problems, the forward problems are solved in order to predict the radiative heat flux, QoI, and examine the validity of these models. Our results show that all four models are invalid, but for different reasons: the one-temperature model simply fails to reproduce the data while the two-temperature models exhibit unacceptably large uncertainties in the QoI predictions.
An Intuitive Dashboard for Bayesian Network Inference
Reddy, Vikas; Charisse Farr, Anna; Wu, Paul; Mengersen, Kerrie; Yarlagadda, Prasad K. D. V.
2014-03-01
Current Bayesian network software packages provide good graphical interface for users who design and develop Bayesian networks for various applications. However, the intended end-users of these networks may not necessarily find such an interface appealing and at times it could be overwhelming, particularly when the number of nodes in the network is large. To circumvent this problem, this paper presents an intuitive dashboard, which provides an additional layer of abstraction, enabling the end-users to easily perform inferences over the Bayesian networks. Unlike most software packages, which display the nodes and arcs of the network, the developed tool organises the nodes based on the cause-and-effect relationship, making the user-interaction more intuitive and friendly. In addition to performing various types of inferences, the users can conveniently use the tool to verify the behaviour of the developed Bayesian network. The tool has been developed using QT and SMILE libraries in C++.
Kernel Bayesian Inference with Posterior Regularization
Song, Yang; Zhu, Jun; Ren, Yong
2016-01-01
We propose a vector-valued regression problem whose solution is equivalent to the reproducing kernel Hilbert space (RKHS) embedding of the Bayesian posterior distribution. This equivalence provides a new understanding of kernel Bayesian inference. Moreover, the optimization problem induces a new regularization for the posterior embedding estimator, which is faster and has comparable performance to the squared regularization in kernel Bayes' rule. This regularization coincides with a former th...
Linden, Daniel W; Roloff, Gary J
2015-08-01
Pilot studies are often used to design short-term research projects and long-term ecological monitoring programs, but data are sometimes discarded when they do not match the eventual survey design. Bayesian hierarchical modeling provides a convenient framework for integrating multiple data sources while explicitly separating sample variation into observation and ecological state processes. Such an approach can better estimate state uncertainty and improve inferences from short-term studies in dynamic systems. We used a dynamic multistate occupancy model to estimate the probabilities of occurrence and nesting for white-headed woodpeckers Picoides albolarvatus in recent harvest units within managed forests of northern California, USA. Our objectives were to examine how occupancy states and state transitions were related to forest management practices, and how the probabilities changed over time. Using Gibbs variable selection, we made inferences using multiple model structures and generated model-averaged estimates. Probabilities of white-headed woodpecker occurrence and nesting were high in 2009 and 2010, and the probability that nesting persisted at a site was positively related to the snag density in harvest units. Prior-year nesting resulted in higher probabilities of subsequent occurrence and nesting. We demonstrate the benefit of forest management practices that increase the density of retained snags in harvest units for providing white-headed woodpecker nesting habitat. While including an additional year of data from our pilot study did not drastically alter management recommendations, it changed the interpretation of the mechanism behind the observed dynamics. Bayesian hierarchical modeling has the potential to maximize the utility of studies based on small sample sizes while fully accounting for measurement error and both estimation and model uncertainty, thereby improving the ability of observational data to inform conservation and management strategies
Marzouk, Youssef; Fast P. (Lawrence Livermore National Laboratory, Livermore, CA); Kraus, M. (Peterson AFB, CO); Ray, J. P.
2006-01-01
Terrorist attacks using an aerosolized pathogen preparation have gained credibility as a national security concern after the anthrax attacks of 2001. The ability to characterize such attacks, i.e., to estimate the number of people infected, the time of infection, and the average dose received, is important when planning a medical response. We address this question of characterization by formulating a Bayesian inverse problem predicated on a short time-series of diagnosed patients exhibiting symptoms. To be of relevance to response planning, we limit ourselves to 3-5 days of data. In tests performed with anthrax as the pathogen, we find that these data are usually sufficient, especially if the model of the outbreak used in the inverse problem is an accurate one. In some cases the scarcity of data may initially support outbreak characterizations at odds with the true one, but with sufficient data the correct inferences are recovered; in other words, the inverse problem posed and its solution methodology are consistent. We also explore the effect of model error, i.e., situations in which the model used in the inverse problem is only a partially accurate representation of the outbreak; here, the model predictions and the observations differ by more than random noise. We find that while there is a consistent discrepancy between the inferred and the true characterizations, they are also close enough to be of relevance when planning a response.
Bayesian inference of a lake water quality model by emulating its posterior density
Dietzel, A.; Reichert, P.
2014-10-01
We use a Gaussian stochastic process emulator to interpolate the posterior probability density of a computationally demanding application of the biogeochemical-ecological lake model BELAMO to accelerate statistical inference of deterministic model and error model parameters. The deterministic model consists of a mechanistic description of key processes influencing the mass balance of nutrients, dissolved oxygen, organic particles, and phytoplankton and zooplankton in the lake. This model is complemented by a Gaussian stochastic process to describe the remaining model bias and by Normal, independent observation errors. A small subsample of the Markov chain representing the posterior of the model parameters is propagated through the full model to get model predictions and uncertainty estimates. We expect this approximation to be more accurate at only slightly higher computational costs compared to using a Normal approximation to the posterior probability density and linear error propagation to the results as we did in an earlier paper. The performance of the two techniques is compared for a didactical example as well as for the lake model. As expected, for the didactical example, the use of the emulator led to posterior marginals of the model parameters that are closer to those calculated by Markov chain simulation using the full model than those based on the Normal approximation. For the lake model, the new technique proved applicable without an excessive increase in computational requirements, but we faced challenges in the choice of the design data set for emulator calibration. As the posterior is a scalar function of the parameters, the suggested technique is an alternative to the emulation of a potentially more complex, structured output of the simulation model that allows for the use of a less case-specific emulator. This is at the cost that still the full model has to be used for prediction (which can be done with a smaller, approximately independent subsample
Heringstad, Bjørg
2010-07-01
Full Text Available Abstract Background In the genetic analysis of binary traits with one observation per animal, animal threshold models frequently give biased heritability estimates. In some cases, this problem can be circumvented by fitting sire- or sire-dam models. However, these models are not appropriate in cases where individual records exist on parents. Therefore, the aim of our study was to develop a new Gibbs sampling algorithm for a proper estimation of genetic (co)variance components within an animal threshold model framework. Methods In the proposed algorithm, individuals are classified as either "informative" or "non-informative" with respect to genetic (co)variance components. The "non-informative" individuals are characterized by their Mendelian sampling deviations (deviations from the mid-parent mean) being completely confounded with a single residual on the underlying liability scale. For threshold models, residual variance on the underlying scale is not identifiable. Hence, the variance of fully confounded Mendelian sampling deviations cannot be identified either, but can be inferred from the between-family variation. In the new algorithm, breeding values are sampled as in a standard animal model using the full relationship matrix, but genetic (co)variance components are inferred from the sampled breeding values and relationships between "informative" individuals (usually parents) only. The latter is analogous to a sire-dam model (in cases with no individual records on the parents). Results When applied to simulated data sets, the standard animal threshold model failed to produce useful results since samples of genetic variance always drifted towards infinity, while the new algorithm produced proper parameter estimates essentially identical to the results from a sire-dam model (given the fact that no individual records exist for the parents). Furthermore, the new algorithm showed much faster Markov chain mixing properties for genetic parameters (similar to
Bayesian inference of baseline fertility and treatment effects via a crop yield-fertility model.
Chen, Hungyen; Yamagishi, Junko; Kishino, Hirohisa
2014-01-01
To effectively manage soil fertility, knowledge is needed of how a crop uses nutrients from fertilizer applied to the soil. Soil quality is a combination of biological, chemical and physical properties and is hard to assess directly because of collective and multiple functional effects. In this paper, we focus on the application of these concepts to agriculture. We define the baseline fertility of soil as the level of fertility that a crop can acquire for growth from the soil. With this strict definition, we propose a new crop yield-fertility model that enables quantification of the process of improving baseline fertility and the effects of treatments solely from the time series of crop yields. The model was modified from Michaelis-Menten kinetics and measured the additional effects of the treatments given the baseline fertility. Using more than 30 years of experimental data, we used the Bayesian framework to estimate the improvements in baseline fertility and the effects of fertilizer and farmyard manure (FYM) on maize (Zea mays), barley (Hordeum vulgare), and soybean (Glycine max) yields. Fertilizer contributed the most to the barley yield and FYM contributed the most to the soybean yield among the three crops. The baseline fertility of the subsurface soil was very low for maize and barley prior to fertilization. In contrast, the baseline fertility in this soil approximated half-saturated fertility for the soybean crop. The long-term soil fertility was increased by adding FYM, but the effect of FYM addition was reduced by the addition of fertilizer. Our results provide evidence that long-term soil fertility under continuous farming was maintained, or increased, by the application of natural nutrients compared with the application of synthetic fertilizer. PMID:25405353
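The Michaelis-Menten form that the yield-fertility model is modified from can be written in two lines; the parameter values and names below are assumptions for illustration only, not the paper's estimates:

```python
# Saturating yield response: yield approaches y_max as fertility grows,
# and half-saturated fertility (F = k_half) gives exactly half the maximum.
def mm_yield(fertility, y_max=10.0, k_half=2.0):
    """Expected yield: y_max * F / (k_half + F)."""
    return y_max * fertility / (k_half + fertility)

print(mm_yield(2.0))   # -> 5.0, half the maximum at half-saturation
```

In the paper, the baseline fertility enters as an effective fertility term inferred jointly with treatment effects from the yield time series, under a Bayesian framework.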
Stan: A Probabilistic Programming Language for Bayesian Inference and Optimization
Gelman, Andrew; Lee, Daniel; Guo, Jiqiang
2015-01-01
Stan is a free and open-source C++ program that performs Bayesian inference or optimization for arbitrary user-specified models and can be called from the command line, R, Python, Matlab, or Julia and has great promise for fitting large and complex statistical models in many areas of application. We discuss Stan from users' and developers'…
Boers, Niklas; Goswami, Bedartha; Chekroun, Mickael; Svensson, Anders; Rousseau, Denis-Didier; Ghil, Michael
2016-04-01
In the recent past, empirical stochastic models have been successfully applied to model a wide range of climatic phenomena [1,2]. In addition to enhancing our understanding of the geophysical systems under consideration, multilayer stochastic models (MSMs) have been shown to be solidly grounded in the Mori-Zwanzig formalism of statistical physics [3]. They are also well-suited for predictive purposes, e.g., for the El Niño Southern Oscillation [4] and the Madden-Julian Oscillation [5]. In general, these models are trained on a given time series under consideration, and then assumed to reproduce certain dynamical properties of the underlying natural system. Most existing approaches are based on least-squares fitting to determine optimal model parameters, which does not allow for an uncertainty estimation of these parameters. This approach significantly limits the degree to which dynamical characteristics of the time series can be safely inferred from the model. Here, we are specifically interested in fitting low-dimensional stochastic models to time series obtained from paleoclimatic proxy records, such as the oxygen isotope ratio and dust concentration of the NGRIP record [6]. The time series derived from these records exhibit substantial dating uncertainties, in addition to the proxy measurement errors. In particular, for time series of this kind, it is crucial to obtain uncertainty estimates for the final model parameters. Following [7], we first propose a statistical procedure to shift dating uncertainties from the time axis to the proxy axis of layer-counted paleoclimatic records. Thereafter, we show how Maximum Likelihood Estimation in combination with Markov Chain Monte Carlo parameter sampling can be employed to translate all uncertainties present in the original proxy time series to uncertainties of the parameter estimates of the stochastic model. We compare time series simulated by the empirical model to the original time series in terms of standard
A Bayesian Approach to Protein Inference Problem in Shotgun Proteomics
Li, Yong Fuga; Arnold, Randy J.; Li, Yixue; Radivojac, Predrag; Sheng, Quanhu; Tang, Haixu
2009-01-01
The protein inference problem represents a major challenge in shotgun proteomics. In this article, we describe a novel Bayesian approach to address this challenge by incorporating the predicted peptide detectabilities as the prior probabilities of peptide identification. We propose a rigorous probabilistic model for protein inference and provide practical algorithmic solutions to this problem. We used a complex synthetic protein mixture to test our method and obtained promising results.
The NIFTY way of Bayesian signal inference
We introduce NIFTY, 'Numerical Information Field Theory', a software package for the development of Bayesian signal inference algorithms that operate independently from any underlying spatial grid and its resolution. A large number of Bayesian and Maximum Entropy methods for 1D signal reconstruction, 2D imaging, as well as 3D tomography appear formally similar, but one often finds individualized implementations that are neither flexible nor easily transferable. Signal inference in the framework of NIFTY can be done in an abstract way, such that algorithms prototyped in 1D can be applied to real-world problems in higher-dimensional settings. As a versatile library, NIFTY is applicable to, and has already been applied in, 1D, 2D, 3D, and spherical settings. A recent application is the D3PO algorithm, targeting the non-trivial task of denoising, deconvolving, and decomposing photon observations in high-energy astronomy.
Using Alien Coins to Test Whether Simple Inference Is Bayesian
Cassey, Peter; Hawkins, Guy E.; Donkin, Chris; Brown, Scott D.
2016-01-01
Reasoning and inference are well-studied aspects of basic cognition that have been explained as statistically optimal Bayesian inference. Using a simplified experimental design, we conducted quantitative comparisons between Bayesian inference and human inference at the level of individuals. In 3 experiments, with more than 13,000 participants, we…
Analysis of KATRIN data using Bayesian inference
Riis, Anna Sejersen; Weinheimer, Christian
2011-01-01
The KATRIN (KArlsruhe TRItium Neutrino) experiment will analyze the tritium beta-spectrum to determine the mass of the neutrino with a sensitivity of 0.2 eV (90% C.L.). This approach to a measurement of the absolute value of the neutrino mass relies only on the principle of energy conservation and can in some sense be called model-independent as compared to cosmology and neutrinoless double beta decay. However, by model-independent we mean only within the minimal extension of the Standard Model. One should therefore also analyze the data for non-standard couplings, e.g. to right-handed or sterile neutrinos. As an alternative to the frequentist minimization methods used in the analysis of the earlier experiments in Mainz and Troitsk, we have been investigating Markov Chain Monte Carlo (MCMC) methods, which are very well suited for probing multi-parameter spaces. We found that implementing the KATRIN chi-squared function in the COSMOMC package - an MCMC code using Bayesian parameter inference - solved the ...
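The core move in the abstract above, wrapping a chi-squared function in an MCMC sampler so that exp(-chi2/2) acts as the likelihood, can be sketched on a toy endpoint-fit problem. The "spectrum" below is a deliberately crude stand-in (not the KATRIN beta-spectrum model or COSMOMC); all shapes and numbers are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy spectrum: quadratic falloff near an endpoint E0; m2 shifts the
# effective endpoint. Purely illustrative, not a physical beta-spectrum.
E = np.linspace(18.0, 18.6, 40)

def model(E, m2, E0=18.57):
    s = np.clip(E0 - E, 0, None)
    return np.where(s**2 > m2, s**2 - m2, 0.0)

data = model(E, 0.04) + 0.01 * rng.standard_normal(E.size)
err = 0.01

def chi2(m2):
    return np.sum(((data - model(E, m2)) / err) ** 2)

# Metropolis with likelihood exp(-chi2/2): the chain samples the posterior
# of m2, and its spread quantifies the fit uncertainty.
m2, c = 0.1, chi2(0.1)
chain = []
for _ in range(20000):
    prop = m2 + 0.02 * rng.standard_normal()
    if prop >= 0:  # keep the squared-mass parameter non-negative
        cp = chi2(prop)
        if np.log(rng.uniform()) < 0.5 * (c - cp):
            m2, c = prop, cp
    chain.append(m2)

chain = np.array(chain[5000:])
print(f"m2 = {chain.mean():.3f} +/- {chain.std():.3f}")
```

The same pattern scales to genuine multi-parameter spaces, which is where MCMC's advantage over grid or minimizer scans becomes decisive.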
Bayesian Inference of Kinematics and Memberships of Open Cluster
Shao, Z. Y.; Chen, L.; Zhong, J.; Hou, J. L.
2014-07-01
Based on the Bayesian Inference (BI) method, the Multiple-modelling approach is improved to combine coordinative positions, proper motions (PM) and radial velocities (RV), to separate the motion of the open cluster from field stars, as well as to describe the intrinsic kinematic status of the cluster.
Bayesian modeling using WinBUGS
Ntzoufras, Ioannis
2009-01-01
A hands-on introduction to the principles of Bayesian modeling using WinBUGS. Bayesian Modeling Using WinBUGS provides an easily accessible introduction to the use of WinBUGS programming techniques in a variety of Bayesian modeling settings. The author provides an accessible treatment of the topic, offering readers a smooth introduction to the principles of Bayesian modeling with detailed guidance on the practical implementation of key principles. The book begins with a basic introduction to Bayesian inference and the WinBUGS software and goes on to cover key topics, including: Markov Chain Monte Carlo algorithms in Bayesian inference; generalized linear models; Bayesian hierarchical models; predictive distribution and model checking; and Bayesian model and variable evaluation. Computational notes and screen captures illustrate the use of both WinBUGS and R software to apply the discussed techniques. Exercises at the end of each chapter allow readers to test their understanding of the presented concepts and all ...
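The kind of conjugate full-conditional updating that BUGS-family software performs from a model specification can be sketched by hand. Below is a minimal Gibbs sampler for a normal model with unknown mean and precision, written in Python rather than the WinBUGS language; the near-flat priors and all numbers are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)

# Gibbs sampling for y_i ~ Normal(mu, 1/tau) with conjugate full
# conditionals, the style of update WinBUGS derives automatically.
y = rng.normal(10.0, 2.0, size=100)
n, ybar = y.size, y.mean()

mu, tau = 0.0, 1.0
mus, taus = [], []
for _ in range(5000):
    # mu | tau, y ~ Normal(ybar, 1/(n*tau))  (flat prior on mu)
    mu = rng.normal(ybar, 1.0 / np.sqrt(n * tau))
    # tau | mu, y ~ Gamma(shape=n/2, rate=sum((y-mu)^2)/2)  (Jeffreys-style prior)
    tau = rng.gamma(n / 2.0, 2.0 / np.sum((y - mu) ** 2))
    mus.append(mu)
    taus.append(tau)

mus, taus = np.array(mus[500:]), np.array(taus[500:])
print(f"mu ≈ {mus.mean():.2f}, sigma ≈ {(1 / np.sqrt(taus)).mean():.2f}")
```

Note that NumPy's `gamma` takes a scale parameter, so the rate is inverted when drawing tau.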
Bayesian Inference and Application of Robust Growth Curve Models Using Student's "t" Distribution
Zhang, Zhiyong; Lai, Keke; Lu, Zhenqiu; Tong, Xin
2013-01-01
Despite the widespread popularity of growth curve analysis, few studies have investigated robust growth curve models. In this article, the "t" distribution is applied to model heavy-tailed data and contaminated normal data with outliers for growth curve analysis. The derived robust growth curve models are estimated through Bayesian…
Bayesian Inference in the Modern Design of Experiments
DeLoach, Richard
2008-01-01
This paper provides an elementary tutorial overview of Bayesian inference and its potential for application in aerospace experimentation in general and wind tunnel testing in particular. Bayes Theorem is reviewed and examples are provided to illustrate how it can be applied to objectively revise prior knowledge by incorporating insights subsequently obtained from additional observations, resulting in new (posterior) knowledge that combines information from both sources. A logical merger of Bayesian methods and certain aspects of Response Surface Modeling is explored. Specific applications to wind tunnel testing, computational code validation, and instrumentation calibration are discussed.
Møller, Jesper; Rasmussen, Jakob Gulddahl
2012-01-01
We introduce a flexible spatial point process model for spatial point patterns exhibiting linear structures, without incorporating a latent line process. The model is given by an underlying sequential point process model. Under this model, the points can be of one of three types: a 'background point', an 'independent cluster point', or a 'dependent cluster point'. The background and independent cluster points are thought to exhibit 'complete spatial randomness', whereas the dependent cluster points are likely to occur close to previous cluster points. We demonstrate the flexibility of the model...
Wang Shu-Qiang
2012-07-01
Full Text Available Abstract Background A key challenge in the post-genome era is to identify genome-wide transcriptional regulatory networks, which specify the interactions between transcription factors and their target genes. Numerous methods have been developed for reconstructing gene regulatory networks from expression data. However, most of them are based on coarse-grained qualitative models and cannot provide a quantitative view of regulatory systems. Results A binding-affinity-based regulatory model is proposed to quantify the transcriptional regulatory network. Multiple quantities, including binding affinity and the activity level of the transcription factor (TF), are incorporated into a general learning model. The sequence features of the promoter and the possible occupancy of nucleosomes are exploited to estimate the binding probability of regulators. Compared with previous models that employ only microarray data, the proposed model can bridge the gap between the relative background frequency of the observed nucleotide and the gene's transcription rate. Conclusions We validate the proposed approach on two real-world microarray datasets. Experimental results show that the proposed model can effectively identify the parameters and the activity level of the TF. Moreover, the kinetic parameters introduced in the proposed model reveal more biological insight than previous models do.
Bayesian Inference for Smoking Cessation with a Latent Cure State
Luo, Sheng; Crainiceanu, Ciprian M.; Louis, Thomas A.; Chatterjee, Nilanjan
2009-01-01
We present a Bayesian approach to modeling dynamic smoking addiction behavior processes when cure is not directly observed due to censoring. Subject-specific probabilities model the stochastic transitions among three behavioral states: smoking, transient quitting, and permanent quitting (absorbing state). A multivariate normal distribution for random effects is used to account for the potential correlation among the subject-specific transition probabilities. Inference is conducted using a Bay...
An equilibrium validation technique based on Bayesian inference
In recent years, Bayesian probability theory has been used in a number of experiments to fold uncertainties and interdependencies in the diagnostic data and forward models together with prior knowledge of the state of the plasma, and thus increase the accuracy of inferred physics variables. Key developments include the application to current and flux surface tomography, effective charge, the electron energy distribution function, neutron spectrometry, and density. Virtual observations have also been introduced to better constrain inferred quantities in current tomography. In this work we present Bayesian inference results of toroidal and poloidal current and flux surface tomography. Whilst the uncertainty in these profiles, as well as the uncertainty in inferred parameters such as the safety factor profile, is small (<5%), the inference can change substantially depending on the physics model used. We also present Bayesian inference results for Thomson scattering and charge-exchange recombination spectroscopy. In separate work we have computed radial force balance components on the midplane of the Mega Ampere Spherical Tokamak. Our aim is to establish a validation framework for different equilibrium physics models. We find that in the overlapping region of the core (normalized poloidal flux less than 0.4) and motional Stark effect (MSE) chords, the plasma is consistent with static Grad-Shafranov force balance to within two standard deviations. In the outboard edge region, where MSE data are also available, the pressure gradient exceeds the Lorentz force. Most likely, this is because the poloidal current is not constrained to zero at the plasma edge. To lowest order, the results suggest that computing components of force balance is useful to assess data consistency, independent of any equilibrium solution. To first order, we have integrated the residue of force balance to infer an energetic particle pressure.
In-Home Activity Recognition: Bayesian Inference for Hidden Markov Models
F. Javier Ordoñez; G. Englebienne; P. de Toledo; T. van Kasteren; A. Sanchez; B. Kröse
2014-01-01
Activity recognition in a home setting is being widely explored as a means to support elderly people living alone. Probabilistic models using classical, maximum-likelihood estimation methods are known to work well in this domain, but they are prone to overfitting and require labeled activity data fo
Bayesian Models of Brain and Behaviour
Penny, William
2012-01-01
This paper presents a review of Bayesian models of brain and behaviour. We first review the basic principles of Bayesian inference. This is followed by descriptions of sampling and variational methods for approximate inference, and forward and backward recursions in time for inference in dynamical models. The review of behavioural models covers work in visual processing, sensory integration, sensorimotor integration, and collective decision making. The review of brain models covers a range of...
Revealing ecological networks using Bayesian network inference algorithms.
Milns, Isobel; Beale, Colin M; Smith, V Anne
2010-07-01
Understanding functional relationships within ecological networks can help reveal keys to ecosystem stability or fragility. Revealing these relationships is complicated by the difficulties of isolating variables or performing experimental manipulations within a natural ecosystem, and thus inferences are often made by matching models to observational data. Such models, however, require assumptions, or detailed measurements, of parameters such as birth and death rate, encounter frequency, territorial exclusion, and predation success. Here, we evaluate the use of a Bayesian network inference algorithm, which can reveal ecological networks based upon species and habitat abundance alone. We test the algorithm's performance and applicability on observational data of avian communities and habitat in the Peak District National Park, United Kingdom. The resulting networks correctly reveal known relationships among habitat types and known interspecific relationships. In addition, the networks produced novel insights into ecosystem structure and identified key species with high connectivity. Thus, Bayesian networks show potential for becoming a valuable tool in ecosystem analysis. PMID:20715607
Church, Jonathan R.
New condensed matter metrologies are being used to probe ever smaller length scales. In support of the diverse field of materials research, synchrotron-based spectroscopies provide sub-micron spatial resolutions and a breadth of photon wavelengths for scientific studies. For electronic materials, the thinnest layers in a complementary metal-oxide-semiconductor (CMOS) device have been reduced to just a few nanometers. This raises concerns for layer uniformity, complete surface coverage, and interfacial quality. Deposition processes like chemical vapor deposition (CVD) and atomic layer deposition (ALD) have been shown to deposit the needed high-quality films at the requisite thicknesses. However, new materials beget new chemistries and, unfortunately, unwanted side-reactions and by-products. CVD/ALD tools and chemical precursors provided by our collaborators at Air Liquide utilized these new chemistries, and films were deposited for which novel spectroscopic characterization methods were used. The second portion of the thesis focuses on fading and decomposing paint pigments in iconic artworks. Efforts have been directed towards understanding the micro-environments causing degradation. Hard X-ray photoelectron spectroscopy (HAXPES) and variable kinetic energy X-ray photoelectron spectroscopy (VKE-XPS) are advanced XPS techniques capable of elucidating both chemical environments and electronic band structures in sub-surface regions of electronic materials. HAXPES has been used to study the electronic band structure in a typical CMOS structure; it will be shown that unexpected band alignments are associated with the presence of electronic charges near a buried interface. Additionally, a computational modeling algorithm, Bayes-Sim, was developed to reconstruct compositional depth profiles (CDP) using VKE-XPS data sets; a subset algorithm also reconstructs CDP from angle-resolved XPS data. Reconstructed CDP produced by Bayes-Sim were most strongly correlated to the real
Approximate Bayesian inference for complex ecosystems
Michael P H Stumpf
2014-01-01
Mathematical models have been central to ecology for nearly a century. Simple models of population dynamics have allowed us to understand fundamental aspects underlying the dynamics and stability of ecological systems. What has remained a challenge, however, is to meaningfully interpret experimental or observational data in light of mathematical models. Here, we review recent developments, notably in the growing field of approximate Bayesian computation (ABC), that allow us to calibrate mathe...
Rajabi, Mohammad Mahdi; Ataie-Ashtiani, Behzad
2016-05-01
Bayesian inference has traditionally been conceived as the proper framework for the formal incorporation of expert knowledge in parameter estimation of groundwater models. However, conventional Bayesian inference is incapable of taking into account the imprecision essentially embedded in expert-provided information. In order to solve this problem, a number of extensions to conventional Bayesian inference have been introduced in recent years. One of these extensions is 'fuzzy Bayesian inference', which is the result of integrating fuzzy techniques into Bayesian statistics. Fuzzy Bayesian inference has a number of desirable features which make it an attractive approach for incorporating expert knowledge in the parameter estimation process of groundwater models: (1) it is well adapted to the nature of expert-provided information, (2) it allows one to model both uncertainty and imprecision distinctly, and (3) it presents a framework for fusing expert-provided information regarding the various inputs of the Bayesian inference algorithm. However, an important obstacle to employing fuzzy Bayesian inference in groundwater numerical modeling applications is the computational burden, as the required number of numerical model simulations often becomes extremely large and computationally infeasible. In this paper, a novel approach to accelerating the fuzzy Bayesian inference algorithm is proposed, which is based on using approximate posterior distributions derived from surrogate modeling as a screening tool in the computations. The proposed approach is first applied to a synthetic test case of seawater intrusion (SWI) in a coastal aquifer. It is shown that for this synthetic test case, the proposed approach decreases the number of required numerical simulations by an order of magnitude. Then the proposed approach is applied to a real-world test case involving three-dimensional numerical modeling of SWI in Kish Island, located in the Persian Gulf. An expert
Christley, Scott; Emr, Bryanna; Ghosh, Auyon; Satalin, Josh; Gatto, Louis; Vodovotz, Yoram; Nieman, Gary F.; An, Gary
2013-06-01
Acute respiratory distress syndrome (ARDS) is acute lung failure secondary to severe systemic inflammation, resulting in a derangement of alveolar mechanics (i.e. the dynamic change in alveolar size and shape during tidal ventilation), leading to alveolar instability that can cause further damage to the pulmonary parenchyma. Mechanical ventilation is a mainstay in the treatment of ARDS, but may induce mechano-physical stresses on unstable alveoli, which can paradoxically propagate the cellular and molecular processes exacerbating ARDS pathology. This phenomenon is called ventilator induced lung injury (VILI), and plays a significant role in morbidity and mortality associated with ARDS. In order to identify optimal ventilation strategies to limit VILI and treat ARDS, it is necessary to understand the complex interplay between biological and physical mechanisms of VILI, first at the alveolar level, and then in aggregate at the whole-lung level. Since there is no current consensus about the underlying dynamics of alveolar mechanics, as an initial step we investigate the ventilatory dynamics of an alveolar sac (AS) with the lung alveolar spatial model (LASM), a 3D spatial biomechanical representation of the AS and its interaction with airflow pressure and the surface tension effects of pulmonary surfactant. We use the LASM to identify the mechanical ramifications of alveolar dynamics associated with ARDS. Using graphical processing unit parallel algorithms, we perform Bayesian inference on the model parameters using experimental data from rat lung under control and Tween-induced ARDS conditions. Our results provide two plausible models that recapitulate two fundamental hypotheses about volume change at the alveolar level: (1) increase in alveolar size through isotropic volume change, or (2) minimal change in AS radius with primary expansion of the mouth of the AS, with the implication that the majority of change in lung volume during the respiratory cycle occurs in the
Bayesian Inference for Functional Dynamics Exploring in fMRI Data
Xuan Guo
2016-01-01
Full Text Available This paper aims to review state-of-the-art Bayesian-inference-based methods applied to functional magnetic resonance imaging (fMRI) data. Particularly, we focus on one specific long-standing challenge in the computational modeling of fMRI datasets: how to effectively explore typical functional interactions from fMRI time series and the corresponding boundaries of temporal segments. Bayesian inference is a method of statistical inference which has been shown to be a powerful tool to encode dependence relationships among the variables with uncertainty. Here we provide an introduction to a group of Bayesian-inference-based methods for fMRI data analysis, which were designed to detect magnitude or functional connectivity change points and to infer their functional interaction patterns based on corresponding temporal boundaries. We also provide a comparison of three popular Bayesian models, that is, the Bayesian Magnitude Change Point Model (BMCPM), the Bayesian Connectivity Change Point Model (BCCPM), and the Dynamic Bayesian Variable Partition Model (DBVPM), and give a summary of their applications. We envision that more delicate Bayesian inference models will be emerging and play increasingly important roles in modeling brain functions in the years to come.
Optical tweezers calibration with Bayesian inference
Türkcan, Silvan; Richly, Maximilian U.; Le Gall, Antoine; Fiszman, Nicolas; Masson, Jean-Baptiste; Westbrook, Nathalie; Perronet, Karen; Alexandrou, Antigoni
2014-09-01
We present a new method for calibrating an optical-tweezer setup that is based on Bayesian inference [1]. This method employs an algorithm previously used to analyze the confined trajectories of receptors within lipid rafts [2,3]. The main advantages of this method are that it does not require input parameters and is insensitive to systematic errors like the drift of the setup. Additionally, it exploits a much larger amount of the information stored in the recorded bead trajectory than standard calibration approaches. The additional information can be used to detect deviations from the perfect harmonic potential or detect environmental influences on the bead. The algorithm infers the diffusion coefficient and the potential felt by a trapped bead, and only requires the bead trajectory as input. We demonstrate that this method outperforms the equipartition method and the power-spectrum method in input information required (bead radius and trajectory length) and in output accuracy. Furthermore, by inferring a higher order potential our method can reveal deviations from the assumed second-order potential. More generally, this method can also be used for magnetic-tweezer calibration.
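The inference task the abstract describes, recovering a diffusion coefficient and trap parameters from a bead trajectory alone, can be illustrated on a simulated overdamped bead in a harmonic trap (an Ornstein-Uhlenbeck process). This is a simplified sketch of the general likelihood-based idea, not the authors' algorithm; the grid posterior, parameter ranges, and all numbers are assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulate a bead in a harmonic trap (overdamped OU process):
# x_{t+dt} = a*x_t + sqrt(D/theta * (1 - a^2)) * eps,  a = exp(-theta*dt),
# where theta is the trap relaxation rate and D the diffusion coefficient.
theta_true, D_true, dt = 5.0, 1.0, 0.01
a = np.exp(-theta_true * dt)
s = np.sqrt(D_true / theta_true * (1 - a**2))
x = np.zeros(5000)
for t in range(1, x.size):
    x[t] = a * x[t - 1] + s * rng.standard_normal()

def log_like(theta, D):
    """Exact OU transition log-likelihood of the trajectory."""
    a = np.exp(-theta * dt)
    var = D / theta * (1 - a**2)
    r = x[1:] - a * x[:-1]
    return -0.5 * np.sum(r**2) / var - 0.5 * (x.size - 1) * np.log(2 * np.pi * var)

# Grid posterior under a flat prior: normalize the likelihood over a grid.
thetas = np.linspace(3, 7, 41)
Ds = np.linspace(0.5, 1.5, 41)
ll = np.array([[log_like(th, D) for D in Ds] for th in thetas])
post = np.exp(ll - ll.max())
post /= post.sum()
theta_mean = (post.sum(axis=1) * thetas).sum()
D_mean = (post.sum(axis=0) * Ds).sum()
print(f"theta ≈ {theta_mean:.2f}, D ≈ {D_mean:.2f}")
```

Only the trajectory enters the inference, mirroring the abstract's point that no bead radius or other input parameters are needed.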
Parsing optical scanned 3D data by Bayesian inference
Xiong, Hanwei; Xu, Jun; Xu, Chenxi; Pan, Ming
2015-10-01
Optical devices are always used to digitize complex objects to get their shapes in form of point clouds. The results have no semantic meaning about the objects, and tedious process is indispensable to segment the scanned data to get meanings. The reason for a person to perceive an object correctly is the usage of knowledge, so Bayesian inference is used to the goal. A probabilistic And-Or-Graph is used as a unified framework of representation, learning, and recognition for a large number of object categories, and a probabilistic model defined on this And-Or-Graph is learned from a relatively small training set per category. Given a set of 3D scanned data, the Bayesian inference constructs a most probable interpretation of the object, and a semantic segment is obtained from the part decomposition. Some examples are given to explain the method.
Fast Bayesian inference for slow-roll inflation
Ringeval, Christophe
2013-01-01
We present and discuss a new approach increasing by orders of magnitude the speed of performing Bayesian inference and parameter estimation within the framework of slow-roll inflation. The method relies on the determination of an effective likelihood for inflation which is a function of the primordial amplitude of the scalar perturbations, complemented with the necessary number of the so-called Hubble flow functions to reach the desired accuracy. Starting from any cosmological data set, the effective likelihood is obtained by marginalisation over the standard cosmological parameters, here viewed as "nuisance" from the early-Universe point of view. Since the effective likelihood is low-dimensional, basic machine-learning algorithms can be trained to accurately reproduce its multidimensional shape and then be used as a proxy to perform fast Bayesian inference on the inflationary models. The robustness and accuracy of the method are illustrated using the Planck Cosmic Microwave Background (CMB) data to perform primordial parameter estima...
Approximate bayesian parameter inference for dynamical systems in systems biology
This paper proposes to use approximate instead of exact stochastic simulation algorithms for approximate Bayesian parameter inference of dynamical systems in systems biology. It first presents the mathematical framework for the description of systems biology models, especially from the aspect of a stochastic formulation as opposed to deterministic model formulations based on the law of mass action. In contrast to maximum likelihood methods for parameter inference, approximate inference methods are presented which are based on sampling parameters from a known prior probability distribution, which gradually evolves toward a posterior distribution through the comparison of simulated data from the model to a given data set of measurements. The paper then discusses the simulation process, where an overview is given of the different exact and approximate methods for stochastic simulation and their improvements that we propose. The exact and approximate simulators are implemented and used within approximate Bayesian parameter inference methods. Our evaluation of these methods on two tasks of parameter estimation in two different models shows that equally good results are obtained much faster when using approximate simulation as compared to using exact simulation. (Author)
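The prior-to-posterior evolution through comparison of simulated and measured data described above is the essence of approximate Bayesian computation. A minimal rejection-ABC sketch (not the paper's simulators or models; the Poisson stand-in, summary statistic, and threshold are illustrative assumptions) looks like this:

```python
import numpy as np

rng = np.random.default_rng(3)

# "Observed" data: draws from a Poisson process with an unknown rate,
# standing in for measurements of a stochastic biological model.
true_rate = 4.0
observed = rng.poisson(true_rate, size=50)
obs_mean = observed.mean()

# ABC rejection: sample the rate from the prior, run the (cheap, possibly
# approximate) simulator, and keep parameters whose simulated summary
# statistic lands within epsilon of the observed one.
accepted = []
for _ in range(20000):
    rate = rng.uniform(0, 10)             # prior
    sim = rng.poisson(rate, size=50)      # simulator
    if abs(sim.mean() - obs_mean) < 0.2:  # distance threshold epsilon
        accepted.append(rate)

accepted = np.array(accepted)
print(f"posterior mean ≈ {accepted.mean():.2f} from {accepted.size} samples")
```

Because each iteration only needs a forward simulation, swapping an exact stochastic simulator for a faster approximate one speeds up the whole inference, which is the paper's central point.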
Nonparametric Bayesian Modeling of Complex Networks
Schmidt, Mikkel Nørgaard; Mørup, Morten
2013-01-01
Modeling structure in complex networks using Bayesian nonparametrics makes it possible to specify flexible model structures and infer the adequate model complexity from the observed data. This article provides a gentle introduction to nonparametric Bayesian modeling of complex networks: Using...... for complex networks can be derived and point out relevant literature....
Pitombeira-Neto, Anselmo Ramalho; Loureiro, Carlos Felipe Grangeiro; Carvalho, Luis Eduardo
2016-01-01
Estimation of origin-destination (OD) demand plays a key role in successful transportation studies. In this paper, we consider the estimation of time-varying day-to-day OD flows given data on traffic volumes in a transportation network for a sequence of days. We propose a dynamic linear model (DLM) in order to represent the stochastic evolution of OD flows over time. DLM's are Bayesian state-space models which can capture non-stationarity. We take into account the hierarchical relationships b...
Bayesian Inference of Reticulate Phylogenies under the Multispecies Network Coalescent.
Wen, Dingqiao; Yu, Yun; Nakhleh, Luay
2016-05-01
The multispecies coalescent (MSC) is a statistical framework that models how gene genealogies grow within the branches of a species tree. The field of computational phylogenetics has witnessed an explosion in the development of methods for species tree inference under MSC, owing mainly to the accumulating evidence of incomplete lineage sorting in phylogenomic analyses. However, the evolutionary history of a set of genomes, or species, could be reticulate due to the occurrence of evolutionary processes such as hybridization or horizontal gene transfer. We report on a novel method for Bayesian inference of genome and species phylogenies under the multispecies network coalescent (MSNC). This framework models gene evolution within the branches of a phylogenetic network, thus incorporating reticulate evolutionary processes, such as hybridization, in addition to incomplete lineage sorting. As phylogenetic networks with different numbers of reticulation events correspond to points of different dimensions in the space of models, we devise a reversible-jump Markov chain Monte Carlo (RJMCMC) technique for sampling the posterior distribution of phylogenetic networks under MSNC. We implemented the methods in the publicly available, open-source software package PhyloNet and studied their performance on simulated and biological data. The work extends the reach of Bayesian inference to phylogenetic networks and enables new evolutionary analyses that account for reticulation. PMID:27144273
Dagne, Getachew; Huang, Yangxin
2012-01-01
Censored data are characteristics of many bioassays in HIV/AIDS studies where assays may not be sensitive enough to determine gradations in viral load determination among those below a detectable threshold. Not accounting for such left-censoring appropriately can lead to biased parameter estimates in most data analysis. To properly adjust for left-censoring, this paper presents an extension of the Tobit model for fitting nonlinear dynamic mixed-effects models with skew distributions. Such extensions allow one to specify the conditional distributions for viral load response to account for left-censoring, skewness and heaviness in the tails of the distributions of the response variable. A Bayesian modeling approach via Markov Chain Monte Carlo (MCMC) algorithm is used to estimate model parameters. The proposed methods are illustrated using real data from an HIV/AIDS study. PMID:22992288
Progress on Bayesian Inference of the Fast Ion Distribution Function
Stagner, L.; Heidbrink, W.W.; Chen, X.
2013-01-01
The fast-ion distribution function (DF) has a complicated dependence on several phase-space variables. The standard analysis procedure in energetic particle research is to compute the DF theoretically, use that DF in forward modeling to predict diagnostic signals, then compare with measured data. ... However, when theory and experiment disagree (for one or more diagnostics), it is unclear how to proceed. Bayesian statistics provides a framework to infer the DF, quantify errors, and reconcile discrepant diagnostic measurements. Diagnostic errors and weight functions that describe the phase space...
Towards Bayesian Inference of the Fast-Ion Distribution Function
Stagner, L.; Heidbrink, W.W.; Salewski, Mirko
2012-01-01
The fast-ion distribution function (DF) has a complicated dependence on several phase-space variables. The standard analysis procedure in energetic particle research is to compute the DF theoretically, use that DF in forward modeling to predict diagnostic signals, then compare with measured data. ... However, when theory and experiment disagree (for one or more diagnostics), it is unclear how to proceed. Bayesian statistics provides a framework to infer the DF, quantify errors, and reconcile discrepant diagnostic measurements. Diagnostic errors and "weight functions" that describe the phase space...
Bayesian Inference for Signal-Based Seismic Monitoring
Moore, D.
2015-12-01
Traditional seismic monitoring systems rely on discrete detections produced by station processing software, discarding significant information present in the original recorded signal. SIG-VISA (Signal-based Vertically Integrated Seismic Analysis) is a system for global seismic monitoring through Bayesian inference on seismic signals. By modeling signals directly, our forward model is able to incorporate a rich representation of the physics underlying the signal generation process, including source mechanisms, wave propagation, and station response. This allows inference in the model to recover the qualitative behavior of recent geophysical methods including waveform matching and double-differencing, all as part of a unified Bayesian monitoring system that simultaneously detects and locates events from a global network of stations. We demonstrate recent progress in scaling up SIG-VISA to efficiently process the data stream of global signals recorded by the International Monitoring System (IMS), including comparisons against existing processing methods that show increased sensitivity from our signal-based model and in particular the ability to locate events (including aftershock sequences that can tax analyst processing) precisely from waveform correlation effects. We also provide a Bayesian analysis of an alleged low-magnitude event near the DPRK test site in May 2010 [1] [2], investigating whether such an event could plausibly be detected through automated processing in a signal-based monitoring system. [1] Zhang, Miao and Wen, Lianxing. "Seismological Evidence for a Low-Yield Nuclear Test on 12 May 2010 in North Korea". Seismological Research Letters, January/February 2015. [2] Richards, Paul. "A Seismic Event in North Korea on 12 May 2010". CTBTO SnT 2015 oral presentation, video at https://video-archive.ctbto.org/index.php/kmc/preview/partner_id/103/uiconf_id/4421629/entry_id/0_ymmtpps0/delivery/http
Bayesian inference on genetic merit under uncertain paternity
Tempelman Robert J
2003-09-01
A hierarchical animal model was developed for inference on genetic merit of livestock with uncertain paternity. Fully conditional posterior distributions for fixed and genetic effects, variance components, sire assignments and their probabilities are derived to facilitate a Bayesian inference strategy using MCMC methods. We compared this model to a model based on the Henderson average numerator relationship matrix (ANRM) in a simulation study with 10 replicated datasets generated for each of two traits. Trait 1 had a medium heritability (h2) for each of direct and maternal genetic effects, whereas Trait 2 had a high h2 attributable only to direct effects. The average posterior probabilities inferred on the true sire were between 1 and 10% larger than the corresponding priors (the inverse of the number of candidate sires in a mating pasture) for Trait 1, and between 4 and 13% larger than the corresponding priors for Trait 2. The predicted additive and maternal genetic effects were very similar using both models; however, model choice criteria (Pseudo Bayes Factor and Deviance Information Criterion) decisively favored the proposed hierarchical model over the ANRM model.
Kevin McNally
2012-01-01
There are numerous biomonitoring programs, both recent and ongoing, to evaluate environmental exposure of humans to chemicals. Due to the lack of exposure and kinetic data, the correlation of biomarker levels with exposure concentrations leads to difficulty in utilizing biomonitoring data for biological guidance values. Exposure reconstruction or reverse dosimetry is the retrospective interpretation of external exposure consistent with biomonitoring data. We investigated the integration of physiologically based pharmacokinetic modelling, global sensitivity analysis, Bayesian inference, and Markov chain Monte Carlo simulation to obtain a population estimate of inhalation exposure to m-xylene. We used exhaled breath and venous blood m-xylene and urinary 3-methylhippuric acid measurements from a controlled human volunteer study in order to evaluate the ability of our computational framework to predict known inhalation exposures. We also investigated the importance of model structure and dimensionality with respect to its ability to reconstruct exposure.
Bayesian inference for Markov jump processes with informative observations.
Golightly, Andrew; Wilkinson, Darren J
2015-04-01
In this paper we consider the problem of parameter inference for Markov jump process (MJP) representations of stochastic kinetic models. Since transition probabilities are intractable for most processes of interest yet forward simulation is straightforward, Bayesian inference typically proceeds through computationally intensive methods such as (particle) MCMC. Such methods ostensibly require the ability to simulate trajectories from the conditioned jump process. When observations are highly informative, use of the forward simulator is likely to be inefficient and may even preclude an exact (simulation based) analysis. We therefore propose three methods for improving the efficiency of simulating conditioned jump processes. A conditioned hazard is derived based on an approximation to the jump process, and used to generate end-point conditioned trajectories for use inside an importance sampling algorithm. We also adapt a recently proposed sequential Monte Carlo scheme to our problem. Essentially, trajectories are reweighted at a set of intermediate time points, with more weight assigned to trajectories that are consistent with the next observation. We consider two implementations of this approach, based on two continuous approximations of the MJP. We compare these constructs for a simple tractable jump process before using them to perform inference for a Lotka-Volterra system. The best performing construct is used to infer the parameters governing a simple model of motility regulation in Bacillus subtilis. PMID:25720091
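Since forward simulation of an MJP is straightforward (as the abstract notes), the basic building block these samplers condition on can be shown with a minimal Gillespie-style simulator. The immigration-death process and all rates below are illustrative stand-ins, far simpler than the paper's Lotka-Volterra or motility-regulation models:

```python
import numpy as np

def gillespie_immigration_death(lam, mu, x0, t_end, rng):
    """Exact (Gillespie) forward simulation of an immigration-death MJP:
    immigration at rate lam, per-individual death at rate mu."""
    t, x = 0.0, x0
    times, states = [t], [x]
    while True:
        total = lam + mu * x              # total event rate (always > 0 here)
        t += rng.exponential(1.0 / total) # exponential waiting time
        if t > t_end:
            break
        x += 1 if rng.random() < lam / total else -1
        times.append(t)
        states.append(x)
    return np.array(times), np.array(states)

rng = np.random.default_rng(0)
times, states = gillespie_immigration_death(10.0, 0.5, 0, 200.0, rng)

# time-averaged population; the stationary mean is lam / mu = 20
dt = np.diff(times)
tavg = (states[:-1] * dt).sum() / dt.sum()
```

The inference problem the paper addresses is the inverse of this: given noisy observations of `states` at a few times, recover `lam` and `mu`, which requires simulating trajectories *conditioned* on hitting the observations.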
Marcelo Costa Souza
2004-10-01
... in which the parameters are regarded as fixed quantities, not changing over time. This work aimed at fitting autoregressive models of order 2, AR(2), specified in the form of dynamic linear models using Bayesian inference. Markov chain Monte Carlo (MCMC) was used to obtain the estimates, via the Gibbs sampler and Forward Filtering Backward Sampling (FFBS). To evaluate the fit, two chains with 8000 iterations each, and three different series sizes, with 200, 500 and 800 observations, were sampled. The Canadian lynx series (NICHOLLS and QUIN, 1982) was fitted with different discount factors (0.90, 0.95 and 0.99), and the resulting mean square error was used for comparison with the fit obtained using classical inference. A better fit was observed for the model with discount factor equal to 0.99. One-step-ahead forecasts were done to check the estimates obtained for the updated and the backward-sampled series. For the latter, the fit was better and the mean square error lower. In general, a good fit of the AR(2) dynamic models via Bayesian inference was observed, which gives a better understanding of the fitting in different situations, both simulated and real.
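As a classical point of comparison for the AR(2) fitting discussed above (not the FFBS dynamic-model machinery itself), conditional least squares recovers fixed autoregressive coefficients from a simulated series; the coefficients and series length below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(42)
phi1, phi2, n = 0.5, 0.3, 2000      # arbitrary stationary AR(2) coefficients
y = np.zeros(n)
for t in range(2, n):
    y[t] = phi1 * y[t - 1] + phi2 * y[t - 2] + rng.normal()

# conditional least squares: regress y[t] on (y[t-1], y[t-2])
X = np.column_stack([y[1:-1], y[:-2]])
coef, *_ = np.linalg.lstsq(X, y[2:], rcond=None)
```

The dynamic-linear-model formulation generalizes this by letting the coefficients evolve over time, with the discount factor controlling how quickly past information is downweighted.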
Bayesian inference for identifying interaction rules in moving animal groups.
Mann, Richard P
2011-01-01
The emergence of similar collective patterns from different self-propelled particle models of animal groups points to a restricted set of "universal" classes for these patterns. While universality is interesting, it is often the fine details of animal interactions that are of biological importance. Universality thus presents a challenge to inferring such interactions from macroscopic group dynamics since these can be consistent with many underlying interaction models. We present a Bayesian framework for learning animal interaction rules from fine scale recordings of animal movements in swarms. We apply these techniques to the inverse problem of inferring interaction rules from simulation models, showing that parameters can often be inferred from a small number of observations. Our methodology allows us to quantify our confidence in parameter fitting. For example, we show that attraction and alignment terms can be reliably estimated when animals are milling in a torus shape, while interaction radius cannot be reliably measured in such a situation. We assess the importance of rate of data collection and show how to test different models, such as topological and metric neighbourhood models. Taken together our results both inform the design of experiments on animal interactions and suggest how these data should be best analysed. PMID:21829657
Bayesian inference of population size history from multiple loci
Drummond Alexei J
2008-10-01
Background: Effective population size (Ne) is related to genetic variability and is a basic parameter in many models of population genetics. A number of methods for inferring current and past population sizes from genetic data have been developed since JFC Kingman introduced the n-coalescent in 1982. Here we present the Extended Bayesian Skyline Plot, a non-parametric Bayesian Markov chain Monte Carlo algorithm that extends a previous coalescent-based method in several ways, including the ability to analyze multiple loci. Results: Through extensive simulations we show the accuracy and limitations of inferring population size as a function of the amount of data, including recovering information about evolutionary bottlenecks. We also analyzed two real data sets to demonstrate the behavior of the new method: a single-gene Hepatitis C virus data set sampled from Egypt, and a 10-locus Drosophila ananassae data set representing 16 different populations. Conclusion: The results demonstrate the essential role of multiple loci in recovering population size dynamics. Multi-locus data from a small number of individuals can precisely recover past bottlenecks in population size which cannot be characterized by analysis of a single locus. We also demonstrate that sequence data quality is important, because even moderate levels of sequencing error result in a considerable decrease in estimation accuracy for realistic levels of population genetic variability.
Sigeti, David E. [Los Alamos National Laboratory; Pelak, Robert A. [Los Alamos National Laboratory
2012-09-11
We present a Bayesian statistical methodology for identifying improvement in predictive simulations, including an analysis of the number of (presumably expensive) simulations that will need to be made in order to establish with a given level of confidence that an improvement has been observed. Our analysis assumes the ability to predict (or postdict) the same experiments with legacy and new simulation codes and uses a simple binomial model for the probability, θ, that, in an experiment chosen at random, the new code will provide a better prediction than the old. This model makes it possible to do statistical analysis with an absolute minimum of assumptions about the statistics of the quantities involved, at the price of discarding some potentially important information in the data. In particular, the analysis depends only on whether or not the new code predicts better than the old in any given experiment, and not on the magnitude of the improvement. We show how the posterior distribution for θ may be used, in a kind of Bayesian hypothesis testing, both to decide if an improvement has been observed and to quantify our confidence in that decision. We quantify the predictive probability that should be assigned, prior to taking any data, to the possibility of achieving a given level of confidence, as a function of sample size. We show how this predictive probability depends on the true value of θ and, in particular, how there will always be a region around θ = 1/2 where it is highly improbable that we will be able to identify an improvement in predictive capability, although the width of this region will shrink to zero as the sample size goes to infinity. We show how the posterior standard deviation may be used, as a kind of "plan B metric", in the case that the analysis shows that θ is close to 1/2, and argue that such a plan B should generally be part of hypothesis testing. All the analysis presented in the paper is done with a
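The binomial model for θ described above has a closed-form conjugate analysis: under a uniform Beta(1, 1) prior, the posterior after observing k wins in n trials is Beta(1 + k, 1 + n − k), so both the hypothesis probability and the posterior-spread "plan B" metric are one-liners. The counts below are hypothetical:

```python
from scipy.stats import beta

n, k = 20, 15                        # hypothetical: new code beat old in 15 of 20
posterior = beta(1 + k, 1 + n - k)   # Beta posterior under a uniform prior

p_improved = 1 - posterior.cdf(0.5)  # posterior P(theta > 1/2)
sd = posterior.std()                 # posterior standard deviation ("plan B" metric)
```

With θ near 1/2 the same calculation gives `p_improved` close to 0.5 regardless of n, which is exactly the hard-to-decide region the abstract describes.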
Unsupervised Transient Light Curve Analysis Via Hierarchical Bayesian Inference
Sanders, Nathan; Soderberg, Alicia
2014-01-01
Historically, light curve studies of supernovae (SNe) and other transient classes have focused on individual objects with copious and high signal-to-noise observations. In the nascent era of wide field transient searches, objects with detailed observations are decreasing as a fraction of the overall known SN population, and this strategy sacrifices the majority of the information contained in the data about the underlying population of transients. A population level modeling approach, simultaneously fitting all available observations of objects in a transient sub-class of interest, fully mines the data to infer the properties of the population and avoids certain systematic biases. We present a novel hierarchical Bayesian statistical model for population level modeling of transient light curves, and discuss its implementation using an efficient Hamiltonian Monte Carlo technique. As a test case, we apply this model to the Type IIP SN sample from the Pan-STARRS1 Medium Deep Survey, consisting of 18,837 photometr...
To study the impact of the Deepwater Horizon oil spill on photosynthesis of coastal salt marsh plants in Mississippi, we developed a hierarchical Bayesian (HB) model based on field measurements collected from July 2010 to November 2011. We sampled three locations in Davis Bayou, Mississippi (30.375°N, 88.790°W) representative of a range of oil spill impacts. Measured photosynthesis was negative (respiration only) at the heavily oiled location in July 2010 only, and rates started to increase by August 2010. Photosynthesis at the medium oiling location was lower than at the control location in July 2010 and it continued to decrease in September 2010. During winter 2010–2011, the contrast between the control and the two impacted locations was not as obvious as in the growing season of 2010. Photosynthesis increased through spring 2011 at the three locations and decreased starting with October at the control location and a month earlier (September) at the impacted locations. Using the field data, we developed an HB model. The model simulations agreed well with the measured photosynthesis, capturing most of the variability of the measured data. On the basis of the posteriors of the parameters, we found that air temperature and photosynthetic active radiation positively influenced photosynthesis whereas the leaf stress level negatively affected photosynthesis. The photosynthesis rates at the heavily impacted location had recovered to the status of the control location about 140 days after the initial impact, while the impact at the medium impact location was never severe enough to make photosynthesis significantly lower than that at the control location over the study period. The uncertainty in modeling photosynthesis rates mainly came from the individual and micro-site scales, and to a lesser extent from the leaf scale. (letter)
Bayesian networks inference algorithm to implement Dempster Shafer theory in reliability analysis
This paper deals with the use of Bayesian networks to compute system reliability. The reliability analysis problem is described and the usual methods for quantitative reliability analysis are presented within a case study. Some drawbacks that justify the use of Bayesian networks are identified. The basic concepts of the Bayesian networks application to reliability analysis are introduced and a model to compute the reliability for the case study is presented. Dempster Shafer theory to treat epistemic uncertainty in reliability analysis is then discussed and its basic concepts that can be applied thanks to the Bayesian network inference algorithm are introduced. Finally, it is shown, with a numerical example, how Bayesian networks' inference algorithms compute complex system reliability and what the Dempster Shafer theory can provide to reliability analysis
Metainference: A Bayesian Inference Method for Heterogeneous Systems
Bonomi, Massimiliano; Cavalli, Andrea; Vendruscolo, Michele
2015-01-01
Modelling a complex system is almost invariably a challenging task. The incorporation of experimental observations can be used to improve the quality of a model, and thus to obtain better predictions about the behavior of the corresponding system. This approach, however, is affected by a variety of different errors, especially when a system simultaneously populates an ensemble of different states and experimental data are measured as averages over such states. To address this problem we present a method, called metainference, that combines Bayesian inference, which is a powerful strategy to deal with errors in experimental measurements, with the maximum entropy principle, which represents a rigorous approach to deal with experimental measurements averaged over multiple states. To illustrate the method we present its application to the determination of an ensemble of structures corresponding to the thermal fluctuations of a protein molecule. Metainference thus provides an approach to model complex systems with...
Storz, Jay F; Beaumont, Mark A; Alberts, Susan C
2002-11-01
The purpose of this study was to test for evidence that savannah baboons (Papio cynocephalus) underwent a population expansion in concert with a hypothesized expansion of African human and chimpanzee populations during the late Pleistocene. The rationale is that any type of environmental event sufficient to cause simultaneous population expansions in African humans and chimpanzees would also be expected to affect other codistributed mammals. To test for genetic evidence of population expansion or contraction, we performed a coalescent analysis of multilocus microsatellite data using a hierarchical Bayesian model. Markov chain Monte Carlo (MCMC) simulations were used to estimate the posterior probability density of demographic and genealogical parameters. The model was designed to allow interlocus variation in mutational and demographic parameters, which made it possible to detect aberrant patterns of variation at individual loci that could result from heterogeneity in mutational dynamics or from the effects of selection at linked sites. Results of the MCMC simulations were consistent with zero variance in demographic parameters among loci, but there was evidence for a 10- to 20-fold difference in mutation rate between the most slowly and most rapidly evolving loci. Results of the model provided strong evidence that savannah baboons have undergone a long-term historical decline in population size. The mode of the highest posterior density for the joint distribution of current and ancestral population size indicated a roughly eightfold contraction over the past 1,000 to 250,000 years. These results indicate that savannah baboons apparently did not share a common demographic history with other codistributed primate species. PMID:12411607
Kernel Approximate Bayesian Computation for Population Genetic Inferences
Nakagome, Shigeki; Fukumizu, Kenji; Mano, Shuhei
2012-01-01
Approximate Bayesian computation (ABC) is a likelihood-free approach for Bayesian inference based on a rejection algorithm that applies a tolerance of dissimilarity between summary statistics from observed and simulated data. Although several improvements to the algorithm have been proposed, none of these improvements avoid the following two sources of approximation: 1) lack of sufficient statistics: sampling is not from the true posterior density given data but from an approximate po...
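The basic rejection-ABC loop the abstract refers to can be sketched for a toy normal-mean problem; the prior, tolerance, and summary statistic below are illustrative choices, not the paper's kernel-based construction:

```python
import numpy as np

rng = np.random.default_rng(0)
# "observed" data: 100 draws from a normal with unknown mean (truth: 3.0)
observed = rng.normal(3.0, 1.0, size=100)
s_obs = observed.mean()                  # summary statistic

accepted = []
for _ in range(20000):
    mu = rng.uniform(-10.0, 10.0)        # draw a candidate from the prior
    sim = rng.normal(mu, 1.0, size=100)  # simulate data under the candidate
    if abs(sim.mean() - s_obs) < 0.1:    # keep it if summaries are close
        accepted.append(mu)

posterior_mean = np.mean(accepted)       # approximate posterior mean of mu
```

Both approximation sources the abstract lists are visible here: the summary `sim.mean()` may lose information relative to the full data, and the tolerance `0.1` means accepted draws come from a smeared version of the true posterior.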
Bayesian large-scale structure inference and cosmic web analysis
Leclercq, Florent
2015-01-01
Surveys of the cosmic large-scale structure carry opportunities for building and testing cosmological theories about the origin and evolution of the Universe. This endeavor requires appropriate data assimilation tools, for establishing the contact between survey catalogs and models of structure formation. In this thesis, we present an innovative statistical approach for the ab initio simultaneous analysis of the formation history and morphology of the cosmic web: the BORG algorithm infers the primordial density fluctuations and produces physical reconstructions of the dark matter distribution that underlies observed galaxies, by assimilating the survey data into a cosmological structure formation model. The method, based on Bayesian probability theory, provides accurate means of uncertainty quantification. We demonstrate the application of BORG to the Sloan Digital Sky Survey data and describe the primordial and late-time large-scale structure in the observed volume. We show how the approach has led to the fi...
Bayesian inference of mass segregation of open clusters
Shao, Zhengyi; Chen, Li; Lin, Chien-Cheng; Zhong, Jing; Hou, Jinliang
2015-08-01
Based on the Bayesian inference (BI) method, the mixture-modeling approach is improved to combine all kinematic data, including positions, proper motions (PM) and radial velocities (RV), to separate the motion of the cluster from field stars in its area, as well as to describe its intrinsic kinematic status. Meanwhile, the membership probabilities of individual stars are determined as by-products. This method has been tested by simulation of toy models, and it was found that the joint use of multiple kinematic data types can significantly reduce the missing rate of membership determination, say from ~15% for a single data type to 1% when position, proper motion and radial velocity data are all used. By combining kinematic data from multiple photometric and redshift surveys, such as WIYN and APOGEE, M67 and NGC 188 are revisited. Mass segregation is identified clearly for both of these two old open clusters, in both position and PM space, since the Bayesian evidence (BE) of the model that includes the segregation parameters is much larger than that of the model without them. Ongoing work is applying this method to the LAMOST released data, which contain a large number of RVs covering ~200 nearby open clusters. If the coming GAIA data can be used, the accuracy of tangential velocities will be largely improved, and the intrinsic kinematics of open clusters, which are usually less than 1 km/s, can be well investigated.
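The membership probabilities produced as a by-product of mixture modeling are posterior component responsibilities. A one-dimensional two-component toy version, with made-up cluster and field kinematics and far simpler than the multi-dimensional model described, looks like:

```python
import numpy as np
from scipy.stats import norm

# Toy 1-D stand-in for the kinematic mixture: a tight cluster component
# plus a broad field component (all parameters made up)
w_cl, mu_cl, sd_cl = 0.3, 0.0, 1.0   # cluster weight, mean, dispersion
mu_f, sd_f = 5.0, 10.0               # field mean and dispersion

v = np.array([-0.5, 0.2, 4.0, 20.0]) # e.g. radial velocities of four stars
num = w_cl * norm.pdf(v, mu_cl, sd_cl)
den = num + (1.0 - w_cl) * norm.pdf(v, mu_f, sd_f)
membership = num / den               # posterior cluster-membership probability
```

Stars whose velocities sit inside the narrow cluster distribution get probabilities near 1; combining several kinematic dimensions sharpens this separation, which is why the joint treatment reduces the missing rate.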
Online query answering with differential privacy: a utility-driven approach using Bayesian inference
Xiao, Yonghui
2012-01-01
Data privacy issues frequently and increasingly arise for data sharing and data analysis tasks. In this paper, we study the problem of online query answering under the rigorous differential privacy model. The existing interactive mechanisms for differential privacy can only support a limited number of queries before the accumulated cost of privacy reaches a certain bound. This limitation has greatly hindered their applicability, especially in the scenario where multiple users legitimately need to pose a large number of queries. To minimize the privacy cost and extend the life span of a system, we propose a utility-driven mechanism for online query answering using Bayesian statistical inference. The key idea is to keep track of the query history and use Bayesian inference to answer a new query using previous query answers. The Bayesian inference algorithm provides both optimal point estimation and optimal interval estimation. We formally quantify the error of the inference result to determine if it satisfies t...
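While the paper's utility-driven Bayesian mechanism is more involved, the differential-privacy primitive whose budget it tries to conserve is standard calibrated noise addition. A sketch of the Laplace mechanism for a counting query, with all numbers hypothetical:

```python
import numpy as np

def laplace_mechanism(true_answer, sensitivity, epsilon, rng):
    """epsilon-differentially private release: add Laplace noise with
    scale sensitivity / epsilon to the true query answer."""
    return true_answer + rng.laplace(0.0, sensitivity / epsilon)

rng = np.random.default_rng(7)
true_count = 120                  # a counting query has sensitivity 1
releases = [laplace_mechanism(true_count, 1.0, 0.5, rng) for _ in range(200)]

# Averaging repeated noisy answers tightens the estimate -- the intuition
# behind exploiting the query history. Note, though, that each fresh
# release here spends its own epsilon; the paper's point is to answer new
# queries from past answers *without* consuming more privacy budget.
estimate = np.mean(releases)
```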
Methods for Bayesian power spectrum inference with galaxy surveys
Jasche, Jens
2013-01-01
We derive and implement a full Bayesian large-scale structure inference method aiming at precision recovery of the cosmological power spectrum from galaxy redshift surveys. Our approach improves over previous Bayesian methods by performing a joint inference of the three-dimensional density field, the cosmological power spectrum, luminosity-dependent galaxy biases and corresponding normalizations. We account for all joint and correlated uncertainties between all inferred quantities. Classes of galaxies with different biases are treated as separate sub-samples. The method therefore also allows the combined analysis of more than one galaxy survey. In particular, it solves the problem of inferring the power spectrum from galaxy surveys with non-trivial survey geometries by exploring the joint posterior distribution with efficient implementations of multiple block Markov chain and Hybrid Monte Carlo methods. Our Markov sampler achieves high statistical efficiency in low signal-to-noise regimes by using a determini...
Bayesian inference on the sphere beyond statistical isotropy
Das, Santanu; Souradeep, Tarun
2015-01-01
We present a general method for Bayesian inference of the underlying covariance structure of random fields on a sphere. We employ the Bipolar Spherical Harmonic (BipoSH) representation of general covariance structure on the sphere. We illustrate the efficacy of the method as a principled approach to assess violation of statistical isotropy (SI) in sky maps of Cosmic Microwave Background (CMB) fluctuations. SI violation in observed CMB maps arises due to known physical effects such as Doppler boost and weak lensing; yet-unknown theoretical possibilities like cosmic topology and subtle violations of the cosmological principle; as well as expected observational artefacts of scanning the sky with a non-circular beam, masking, foreground residuals, anisotropic noise, etc. We explicitly demonstrate the recovery of the input SI violation signals with their full statistics in simulated CMB maps. Our formalism easily adapts to exploring parametric physical models with non-SI covariance, as we illustrate for the in...
Bayesian Inference of Natural Rankings in Incomplete Competition Networks
Park, Juyong
2013-01-01
Competition between a complex system's constituents and a corresponding reward mechanism based on it have a profound influence on the functioning, stability, and evolution of the system. But determining the dominance hierarchy or ranking among the constituent parts from the strongest to the weakest -- essential in determining reward or penalty -- is almost always an ambiguous task due to the incomplete nature of competition networks. Here we introduce "Natural Ranking," a desirably unambiguous ranking method applicable to a complete (full) competition network, and formulate an analytical model based on the Bayesian formula for inferring the expected mean and error of the natural ranking of nodes from an incomplete network. We investigate its potential and uses in solving issues in ranking by applying it to a real-world competition network of economic and social importance.
Pig Data and Bayesian Inference on Multinomial Probabilities
Kern, John C.
2006-01-01
Bayesian inference on multinomial probabilities is conducted based on data collected from the game Pass the Pigs[R]. Prior information on these probabilities is readily available from the instruction manual, and is easily incorporated in a Dirichlet prior. Posterior analysis of the scoring probabilities quantifies the discrepancy between empirical…
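The Dirichlet-multinomial conjugacy used here reduces posterior analysis to adding counts: the posterior hyperparameters are simply the prior pseudo-counts plus the observed outcome counts. With hypothetical hyperparameters and roll counts (not the actual Pass the Pigs data):

```python
import numpy as np

# Hypothetical Dirichlet prior pseudo-counts over scoring outcomes,
# e.g. elicited from rough instruction-manual probabilities (made up here)
prior = np.array([10.0, 6.0, 2.0, 1.0, 1.0])
# Hypothetical observed outcome counts from repeated rolls
counts = np.array([43, 21, 9, 4, 3])

posterior = prior + counts                    # conjugate Dirichlet update
posterior_mean = posterior / posterior.sum()  # posterior mean probabilities
```

Comparing `posterior_mean` with the prior means `prior / prior.sum()` quantifies the discrepancy between the manual's stated probabilities and the empirical data, which is the analysis the abstract describes.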
Perkins, Simon; Zwart, Jonathan; Natarajan, Iniyan; Smirnov, Oleg
2015-01-01
We present Montblanc, a GPU implementation of the Radio interferometer measurement equation (RIME) in support of the Bayesian inference for radio observations (BIRO) technique. BIRO uses Bayesian inference to select sky models that best match the visibilities observed by a radio interferometer. To accomplish this, BIRO evaluates the RIME multiple times, varying sky model parameters to produce multiple model visibilities. Chi-squared values computed from the model and observed visibilities are used as likelihood values to drive the Bayesian sampling process and select the best sky model. As most of the elements of the RIME and chi-squared calculation are independent of one another, they are highly amenable to parallel computation. Additionally, Montblanc caters for iterative RIME evaluation to produce multiple chi-squared values. Only modified model parameters are transferred to the GPU between each iteration. We implemented Montblanc as a Python package based upon NVIDIA's CUDA architecture. As such, it is ea...
Metainference: A Bayesian inference method for heterogeneous systems.
Bonomi, Massimiliano; Camilloni, Carlo; Cavalli, Andrea; Vendruscolo, Michele
2016-01-01
Modeling a complex system is almost invariably a challenging task. The incorporation of experimental observations can be used to improve the quality of a model and thus to obtain better predictions about the behavior of the corresponding system. This approach, however, is affected by a variety of different errors, especially when a system simultaneously populates an ensemble of different states and experimental data are measured as averages over such states. To address this problem, we present a Bayesian inference method, called "metainference," that is able to deal with errors in experimental measurements and with experimental measurements averaged over multiple states. To achieve this goal, metainference models a finite sample of the distribution of models using a replica approach, in the spirit of the replica-averaging modeling based on the maximum entropy principle. To illustrate the method, we present its application to a heterogeneous model system and to the determination of an ensemble of structures corresponding to the thermal fluctuations of a protein molecule. Metainference thus provides an approach to modeling complex systems with heterogeneous components and interconverting between different states by taking into account all possible sources of errors. PMID:26844300
Fast, fully Bayesian spatiotemporal inference for fMRI data.
Musgrove, Donald R; Hughes, John; Eberly, Lynn E
2016-04-01
We propose a spatial Bayesian variable selection method for detecting blood oxygenation level dependent activation in functional magnetic resonance imaging (fMRI) data. Typical fMRI experiments generate large datasets that exhibit complex spatial and temporal dependence. Fitting a full statistical model to such data can be so computationally burdensome that many practitioners resort to fitting oversimplified models, which can lead to lower quality inference. We develop a full statistical model that permits efficient computation. Our approach eases the computational burden in two ways. We partition the brain into 3D parcels, and fit our model to the parcels in parallel. Voxel-level activation within each parcel is modeled as regressions located on a lattice. Regressors represent the magnitude of change in blood oxygenation in response to a stimulus, while a latent indicator for each regressor represents whether the change is zero or non-zero. A sparse spatial generalized linear mixed model captures the spatial dependence among indicator variables within a parcel and for a given stimulus. The sparse SGLMM permits considerably more efficient computation than does the spatial model typically employed in fMRI. Through simulation we show that our parcellation scheme performs well in various realistic scenarios. Importantly, indicator variables on the boundary between parcels do not exhibit edge effects. We conclude by applying our methodology to data from a task-based fMRI experiment. PMID:26553916
A tutorial introduction to Bayesian models of cognitive development
Perfors, Amy; Tenenbaum, Joshua B.; Griffiths, Thomas L.; Xu, Fei
2010-01-01
We present an introduction to Bayesian inference as it is used in probabilistic models of cognitive development. Our goal is to provide an intuitive and accessible guide to the what, the how, and the why of the Bayesian approach: what sorts of problems and data the framework is most relevant for, and how and why it may be useful for developmentalists. We emphasize a qualitative understanding of Bayesian inference, but also include information about additional resources for those interested in...
Bayesian Inference for Neighborhood Filters With Application in Denoising.
Huang, Chao-Tsung
2015-11-01
Range-weighted neighborhood filters are useful and popular for their edge-preserving property and simplicity, but they are originally proposed as intuitive tools. Previous works needed to connect them to other tools or models for indirect property reasoning or parameter estimation. In this paper, we introduce a unified empirical Bayesian framework to do both directly. A neighborhood noise model is proposed to reason and infer the Yaroslavsky, bilateral, and modified non-local means filters by joint maximum a posteriori and maximum likelihood estimation. Then, the essential parameter, range variance, can be estimated via model fitting to the empirical distribution of an observable chi scale mixture variable. An algorithm based on expectation-maximization and quasi-Newton optimization is devised to perform the model fitting efficiently. Finally, we apply this framework to the problem of color-image denoising. A recursive fitting and filtering scheme is proposed to improve the image quality. Extensive experiments are performed for a variety of configurations, including different kernel functions, filter types and support sizes, color channel numbers, and noise types. The results show that the proposed framework can fit noisy images well and the range variance can be estimated successfully and efficiently. PMID:26259244
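A range-weighted neighborhood filter of the Yaroslavsky type can be sketched in a few lines on a 1D signal; the radius and range variance below are illustrative, not values from the paper:

```python
import numpy as np

def yaroslavsky_filter(signal, radius, range_var):
    """Range-weighted neighborhood filter: each sample becomes a weighted
    mean of its neighbors, with weights decaying in the *intensity*
    difference (not the spatial distance), so edges are preserved."""
    signal = np.asarray(signal, dtype=float)
    out = np.empty_like(signal)
    for i in range(len(signal)):
        lo, hi = max(0, i - radius), min(len(signal), i + radius + 1)
        window = signal[lo:hi]
        w = np.exp(-(window - signal[i]) ** 2 / (2.0 * range_var))
        out[i] = np.sum(w * window) / np.sum(w)
    return out

# A step edge survives because cross-edge weights are vanishingly small.
sig = np.array([0.0, 0.0, 0.0, 10.0, 10.0, 10.0])
filtered = yaroslavsky_filter(sig, radius=2, range_var=0.1)
```

Estimating `range_var` from the data, rather than fixing it by hand as here, is exactly the model-fitting problem the paper's empirical Bayesian framework addresses.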
Type Ia Supernova Light Curve Inference: Hierarchical Bayesian Analysis in the Near Infrared
Mandel, Kaisey S; Friedman, Andrew S; Kirshner, Robert P
2009-01-01
We present a comprehensive statistical analysis of the properties of Type Ia SN light curves in the near infrared using recent data from PAIRITEL and the literature. We construct a hierarchical Bayesian framework, incorporating several uncertainties including photometric error, peculiar velocities, dust extinction and intrinsic variations, for coherent statistical inference. SN Ia light curve inferences are drawn from the global posterior probability of parameters describing both individual supernovae and the population conditioned on the entire SN Ia NIR dataset. The logical structure of the hierarchical Bayesian model is represented by a directed acyclic graph. Fully Bayesian analysis of the model and data is enabled by an efficient MCMC algorithm exploiting the conditional structure using Gibbs sampling. We apply this framework to the JHK_s SN Ia light curve data. A new light curve model captures the observed J-band light curve shape variations. The intrinsic variances in peak absolute magnitudes are: sigm...
Practical Statistics for LHC Physicists: Bayesian Inference (3/3)
CERN. Geneva
2015-01-01
These lectures cover those principles and practices of statistics that are most relevant for work at the LHC. The first lecture discusses the basic ideas of descriptive statistics, probability and likelihood. The second lecture covers the key ideas in the frequentist approach, including confidence limits, profile likelihoods, p-values, and hypothesis testing. The third lecture covers inference in the Bayesian approach. Throughout, real-world examples will be used to illustrate the practical application of the ideas. No previous knowledge is assumed.
Bayesian Fusion Algorithm for Inferring Trust in Wireless Sensor Networks
Mohammad Momani; Subhash Challa; Rami Alhmouz
2010-01-01
This paper introduces a new Bayesian fusion algorithm to combine more than one trust component (data trust and communication trust) to infer the overall trust between nodes. This research work proposes that one trust component is not enough when deciding on whether or not to trust a specific node in a wireless sensor network. This paper discusses and analyses the results from the communication trust component (binary) and the data trust component (continuous) and proves that either component ...
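A minimal sketch of the idea, assuming a Beta prior over binary communication outcomes and a simple weighted fusion rule; the fusion weight is an assumption for illustration, not the paper's algorithm:

```python
def beta_trust(successes, failures, a=1.0, b=1.0):
    """Posterior mean of a Beta(a, b) prior updated with binary
    interaction outcomes -- a common model for the communication
    trust component."""
    return (a + successes) / (a + b + successes + failures)

def fuse_trust(comm_trust, data_trust, w_comm=0.5):
    """Illustrative linear fusion of the two trust components; the
    weight w_comm is a hypothetical choice, not from the paper."""
    return w_comm * comm_trust + (1.0 - w_comm) * data_trust

comm = beta_trust(successes=8, failures=2)          # (1+8)/(2+10) = 0.75
overall = fuse_trust(comm, data_trust=0.9, w_comm=0.5)
```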
Halo detection via large-scale Bayesian inference
Merson, Alexander I.; Jasche, Jens; Abdalla, Filipe B.; Lahav, Ofer; Wandelt, Benjamin; Jones, D. Heath; Colless, Matthew
2016-08-01
We present a proof-of-concept of a novel and fully Bayesian methodology designed to detect haloes of different masses in cosmological observations subject to noise and systematic uncertainties. Our methodology combines the previously published Bayesian large-scale structure inference algorithm, HAmiltonian Density Estimation and Sampling algorithm (HADES), and a Bayesian chain rule (the Blackwell-Rao estimator), which we use to connect the inferred density field to the properties of dark matter haloes. To demonstrate the capability of our approach, we construct a realistic galaxy mock catalogue emulating the wide-area 6-degree Field Galaxy Survey, which has a median redshift of approximately 0.05. Application of HADES to the catalogue provides us with accurately inferred three-dimensional density fields and corresponding quantification of uncertainties inherent to any cosmological observation. We then use a cosmological simulation to relate the amplitude of the density field to the probability of detecting a halo with mass above a specified threshold. With this information, we can sum over the HADES density field realisations to construct maps of detection probabilities and demonstrate the validity of this approach within our mock scenario. We find that the probability of successful detection of haloes in the mock catalogue increases as a function of the signal to noise of the local galaxy observations. Our proposed methodology can easily be extended to account for more complex scientific questions and is a promising novel tool to analyse the cosmic large-scale structure in observations.
Trans-Dimensional Bayesian Inference for Gravitational Lens Substructures
Brewer, Brendon J; Lewis, Geraint F
2015-01-01
We introduce a Bayesian solution to the problem of inferring the density profile of strong gravitational lenses when the lens galaxy may contain multiple dark or faint substructures. The source and lens models are based on a superposition of an unknown number of non-negative basis functions (or "blobs") whose form was chosen with speed as a primary criterion. The prior distribution for the blobs' properties is specified hierarchically, so the mass function of substructures is a natural output of the method. We use reversible jump Markov Chain Monte Carlo (MCMC) within Diffusive Nested Sampling (DNS) to sample the posterior distribution and evaluate the marginal likelihood of the model, including the summation over the unknown number of blobs in the source and the lens. We demonstrate the method on a simulated data set with a single substructure, which is recovered well with moderate uncertainties. We also apply the method to the g-band image of the "Cosmic Horseshoe" system, and find some hints of potential s...
UNSUPERVISED TRANSIENT LIGHT CURVE ANALYSIS VIA HIERARCHICAL BAYESIAN INFERENCE
Sanders, N. E.; Soderberg, A. M. [Harvard-Smithsonian Center for Astrophysics, 60 Garden Street, Cambridge, MA 02138 (United States); Betancourt, M., E-mail: nsanders@cfa.harvard.edu [Department of Statistics, University of Warwick, Coventry CV4 7AL (United Kingdom)
2015-02-10
Historically, light curve studies of supernovae (SNe) and other transient classes have focused on individual objects with copious and high signal-to-noise observations. In the nascent era of wide field transient searches, objects with detailed observations are decreasing as a fraction of the overall known SN population, and this strategy sacrifices the majority of the information contained in the data about the underlying population of transients. A population level modeling approach, simultaneously fitting all available observations of objects in a transient sub-class of interest, fully mines the data to infer the properties of the population and avoids certain systematic biases. We present a novel hierarchical Bayesian statistical model for population level modeling of transient light curves, and discuss its implementation using an efficient Hamiltonian Monte Carlo technique. As a test case, we apply this model to the Type IIP SN sample from the Pan-STARRS1 Medium Deep Survey, consisting of 18,837 photometric observations of 76 SNe, corresponding to a joint posterior distribution with 9176 parameters under our model. Our hierarchical model fits provide improved constraints on light curve parameters relevant to the physical properties of their progenitor stars relative to modeling individual light curves alone. Moreover, we directly evaluate the probability for occurrence rates of unseen light curve characteristics from the model hyperparameters, addressing observational biases in survey methodology. We view this modeling framework as an unsupervised machine learning technique with the ability to maximize scientific returns from data to be collected by future wide field transient searches like LSST.
Methods for Bayesian Power Spectrum Inference with Galaxy Surveys
Jasche, Jens; Wandelt, Benjamin D.
2013-12-01
We derive and implement a full Bayesian large scale structure inference method aiming at precision recovery of the cosmological power spectrum from galaxy redshift surveys. Our approach improves upon previous Bayesian methods by performing a joint inference of the three-dimensional density field, the cosmological power spectrum, luminosity dependent galaxy biases, and corresponding normalizations. We account for all joint and correlated uncertainties between all inferred quantities. Classes of galaxies with different biases are treated as separate subsamples. This method therefore also allows the combined analysis of more than one galaxy survey. In particular, it solves the problem of inferring the power spectrum from galaxy surveys with non-trivial survey geometries by exploring the joint posterior distribution with efficient implementations of multiple block Markov chain and Hybrid Monte Carlo methods. Our Markov sampler achieves high statistical efficiency in low signal-to-noise regimes by using a deterministic reversible jump algorithm. This approach reduces the correlation length of the sampler by several orders of magnitude, turning the otherwise numerically unfeasible problem of joint parameter exploration into a numerically manageable task. We test our method on an artificial mock galaxy survey, emulating characteristic features of the Sloan Digital Sky Survey data release 7, such as its survey geometry and luminosity-dependent biases. These tests demonstrate the numerical feasibility of our large scale Bayesian inference framework when the parameter space has millions of dimensions. This method reveals and correctly treats the anti-correlation between bias amplitudes and the power spectrum, which is not taken into account in current approaches to power spectrum estimation, a 20% effect across large ranges in k space. In addition, this method results in constrained realizations of density fields obtained without assuming the power spectrum or bias parameters.
Bayesian Inference for NASA Probabilistic Risk and Reliability Analysis
Dezfuli, Homayoon; Kelly, Dana; Smith, Curtis; Vedros, Kurt; Galyean, William
2009-01-01
This document, Bayesian Inference for NASA Probabilistic Risk and Reliability Analysis, is intended to provide guidelines for the collection and evaluation of risk and reliability-related data. It is aimed at scientists and engineers familiar with risk and reliability methods and provides a hands-on approach to the investigation and application of a variety of risk and reliability data assessment methods, tools, and techniques. This document provides both: A broad perspective on data analysis collection and evaluation issues. A narrow focus on the methods to implement a comprehensive information repository. The topics addressed herein cover the fundamentals of how data and information are to be used in risk and reliability analysis models and their potential role in decision making. Understanding these topics is essential to attaining a risk informed decision making environment that is being sought by NASA requirements and procedures such as 8000.4 (Agency Risk Management Procedural Requirements), NPR 8705.05 (Probabilistic Risk Assessment Procedures for NASA Programs and Projects), and the System Safety requirements of NPR 8715.3 (NASA General Safety Program Requirements).
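One of the standard conjugate workflows such guidelines cover is the gamma-Poisson model for a failure rate; a minimal sketch with hypothetical prior and data values, not figures from the document:

```python
def gamma_poisson_update(alpha, beta, failures, exposure_time):
    """Conjugate update for a failure rate lambda: a Gamma(alpha, beta)
    prior plus `failures` events observed over `exposure_time` hours
    gives a Gamma(alpha + failures, beta + exposure_time) posterior."""
    return alpha + failures, beta + exposure_time

# Hypothetical prior (1 pseudo-failure per 1000 h) and observed data.
a_post, b_post = gamma_poisson_update(alpha=1.0, beta=1000.0,
                                      failures=2, exposure_time=5000.0)
post_mean_rate = a_post / b_post   # posterior mean failures per hour
```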
DeLannoy, Gabrielle J. M.; Reichle, Rolf H.; Vrugt, Jasper A.
2013-01-01
Uncertainties in L-band (1.4 GHz) radiative transfer modeling (RTM) affect the simulation of brightness temperatures (Tb) over land and the inversion of satellite-observed Tb into soil moisture retrievals. In particular, accurate estimates of the microwave soil roughness, vegetation opacity and scattering albedo for large-scale applications are difficult to obtain from field studies and often lack an uncertainty estimate. Here, a Markov Chain Monte Carlo (MCMC) simulation method is used to determine satellite-scale estimates of RTM parameters and their posterior uncertainty by minimizing the misfit between long-term averages and standard deviations of simulated and observed Tb at a range of incidence angles, at horizontal and vertical polarization, and for morning and evening overpasses. Tb simulations are generated with the Goddard Earth Observing System (GEOS-5) and confronted with Tb observations from the Soil Moisture Ocean Salinity (SMOS) mission. The MCMC algorithm suggests that the relative uncertainty of the RTM parameter estimates is typically less than 25% of the maximum a posteriori density (MAP) parameter value. Furthermore, the actual root-mean-square differences in long-term Tb averages and standard deviations are found to be consistent with the respective estimated total simulation and observation error standard deviations of 3.1 K and 2.4 K. It is also shown that the MAP parameter values estimated through MCMC simulation are in close agreement with those obtained with Particle Swarm Optimization (PSO).
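The MCMC machinery underlying such parameter estimation can be illustrated with a bare-bones random-walk Metropolis sampler on a toy one-dimensional target (not the RTM model itself):

```python
import numpy as np

def metropolis(log_post, x0, n_steps, step=0.5, rng=None):
    """Random-walk Metropolis: propose x' ~ N(x, step^2) and accept with
    probability min(1, p(x')/p(x)), evaluated in log space."""
    rng = rng if rng is not None else np.random.default_rng(0)
    x, lp = x0, log_post(x0)
    chain = []
    for _ in range(n_steps):
        x_new = x + step * rng.standard_normal()
        lp_new = log_post(x_new)
        if np.log(rng.uniform()) < lp_new - lp:   # MH acceptance test
            x, lp = x_new, lp_new
        chain.append(x)
    return np.array(chain)

# Toy target: a standard-normal-shaped posterior centered at 2.
chain = metropolis(lambda x: -0.5 * (x - 2.0) ** 2, x0=0.0, n_steps=5000)
```

After discarding burn-in, the chain's moments approximate the posterior mean and uncertainty, which is the role MCMC plays for the RTM parameters above.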
Inference algorithms and learning theory for Bayesian sparse factor analysis
Bayesian sparse factor analysis has many applications; for example, it has been applied to the problem of inferring a sparse regulatory network from gene expression data. We describe a number of inference algorithms for Bayesian sparse factor analysis using a slab and spike mixture prior. These include well-established Markov chain Monte Carlo (MCMC) and variational Bayes (VB) algorithms as well as a novel hybrid of VB and Expectation Propagation (EP). For the case of a single latent factor we derive a theory for learning performance using the replica method. We compare the MCMC and VB/EP algorithm results with simulated data to the theoretical prediction. The results for MCMC agree closely with the theory as expected. Results for VB/EP are slightly sub-optimal but show that the new algorithm is effective for sparse inference. In large-scale problems MCMC is infeasible due to computational limitations and the VB/EP algorithm then provides a very useful computationally efficient alternative.
Bayesian inference of the demographic history of chimpanzees.
Wegmann, Daniel; Excoffier, Laurent
2010-06-01
Due to an almost complete absence of fossil record, the evolutionary history of chimpanzees has only recently been studied on the basis of genetic data. Although the general topology of the chimpanzee phylogeny is well established, uncertainties remain concerning the size of current and past populations, the occurrence of bottlenecks or population expansions, and the divergence times and migration rates between subspecies. Here, we present a novel attempt at globally inferring the detailed evolution of the Pan genus based on approximate Bayesian computation, an approach preferentially applied to complex models where the likelihood cannot be computed analytically. Based on two microsatellite and DNA sequence data sets, and adjusting simulated data for local levels of inbreeding and patterns of missing data, we find support for several new features of chimpanzee evolution as compared with previous studies based on smaller data sets and simpler evolutionary models. We find that the central chimpanzees are certainly the oldest population of all P. troglodytes subspecies and that the other two P. t. subspecies diverged from the central chimpanzees by founder events. We also find an older divergence time (1.6 million years [My]) between common chimpanzees and bonobos than previous studies (0.9-1.3 My), but this divergence appears to have been very progressive, with the maintenance of relatively high levels of gene flow between the ancestral chimpanzee population and the bonobos. Finally, we confirm the existence of strong unidirectional gene flow from the western into the central chimpanzees. These results show that interesting and innovative features of chimpanzee history emerge when considering their whole evolutionary history in a single analysis, rather than relying on simpler models involving several comparisons of pairs of populations. PMID:20118191
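Approximate Bayesian computation, as used in this record, can be sketched as a rejection sampler on a toy problem; the prior, simulator, and tolerance below are illustrative stand-ins, not the demographic model:

```python
import numpy as np

def abc_rejection(observed_stat, simulate, prior_sample, n_draws, tol, rng=None):
    """ABC by rejection: draw parameters from the prior, simulate a summary
    statistic, and keep draws whose statistic lands within `tol` of the
    observed one -- no likelihood evaluation required."""
    rng = rng if rng is not None else np.random.default_rng(1)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)
        if abs(simulate(theta, rng) - observed_stat) < tol:
            accepted.append(theta)
    return np.array(accepted)

# Toy stand-in for a demographic parameter: infer a mean from a noisy
# simulated summary statistic.
post = abc_rejection(
    observed_stat=1.0,
    simulate=lambda th, rng: th + rng.normal(scale=0.1),
    prior_sample=lambda rng: rng.uniform(-5.0, 5.0),
    n_draws=20000, tol=0.2)
```

The accepted draws approximate the posterior; shrinking `tol` trades acceptance rate for approximation accuracy.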
Bayesian Spatial Modelling with R-INLA
Finn Lindgren; Håvard Rue
2015-01-01
The principles behind the interface to continuous domain spatial models in the R-INLA software package for R are described. The integrated nested Laplace approximation (INLA) approach proposed by Rue, Martino, and Chopin (2009) is a computationally effective alternative to MCMC for Bayesian inference. INLA is designed for latent Gaussian models, a very wide and flexible class of models ranging from (generalized) linear mixed to spatial and spatio-temporal models. Combined with the stochastic...
Congdon, Peter
2014-01-01
This book provides an accessible approach to Bayesian computing and data analysis, with an emphasis on the interpretation of real data sets. Following in the tradition of the successful first edition, this book aims to make a wide range of statistical modeling applications accessible using tested code that can be readily adapted to the reader's own applications. The second edition has been thoroughly reworked and updated to take account of advances in the field. A new set of worked examples is included. The novel aspect of the first edition was the coverage of statistical modeling using WinBU
Structural damage identification using piezoelectric impedance and Bayesian inference
Shuai, Q.; Zhou, K.; Tang, J.
2015-04-01
Structural damage identification is a challenging subject in structural health monitoring research. Piezoelectric impedance-based damage identification, which usually utilizes matrix inverse-based optimization, may in theory identify the damage location and damage severity. However, the sensitivity matrix is oftentimes ill-conditioned in practice, since the number of unknowns may far exceed the useful measurements/inputs. In this research, a new method based on an intelligent inference framework for damage identification is presented. Bayesian inference is used to directly predict damage location and severity from impedance measurements through forward prediction and comparison. A Gaussian process is employed to enrich the forward analysis results, thereby reducing computational cost. A case study is carried out to illustrate the identification performance.
Efficient Nonparametric Bayesian Modelling with Sparse Gaussian Process Approximations
Seeger, Matthias; Lawrence, Neil; Herbrich, Ralf
2006-01-01
Sparse approximations to Bayesian inference for nonparametric Gaussian Process models scale linearly in the number of training points, allowing for the application of powerful kernel-based models to large datasets. We present a general framework based on the informative vector machine (IVM) (Lawrence et.al., 2002) and show how the complete Bayesian task of inference and learning of free hyperparameters can be performed in a practically efficient manner. Our framework allows for arbitrary like...
Inference for Multiplicative Models
Wexler, Ydo; Meek, Christopher
2012-01-01
The paper introduces a generalization of known probabilistic models such as log-linear and graphical models, called here multiplicative models. These models, which express probabilities via products of parameters, are shown to capture multiple forms of contextual independence between variables, including decision graphs and noisy-OR functions. An inference algorithm for multiplicative models is provided and its correctness is proved. The complexity analysis of the inference algorithm uses a mor...
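The noisy-OR function mentioned above has a compact closed form, which is what lets it be expressed as a product of parameters; a minimal sketch, with the leak term and inhibition probabilities as illustrative assumptions:

```python
def noisy_or(parent_states, inhibit_probs, leak=0.0):
    """Leaky noisy-OR CPD: P(child = 1 | parents) is one minus the product
    of the inhibition probabilities of the *active* parents, times the
    probability that the leak cause also stays off."""
    p_off = 1.0 - leak
    for on, q in zip(parent_states, inhibit_probs):
        if on:
            p_off *= q
    return 1.0 - p_off

# Two active causes, each failing to trigger the effect with prob. 0.2:
p = noisy_or([1, 1], [0.2, 0.2])   # 1 - 0.2 * 0.2 = 0.96
```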
Uncertainty Analysis in Fatigue Life Prediction of Gas Turbine Blades Using Bayesian Inference
Li, Yan-Feng; Zhu, Shun-Peng; Li, Jing; Peng, Weiwen; Huang, Hong-Zhong
2015-12-01
This paper investigates Bayesian model selection for fatigue life estimation of gas turbine blades considering model uncertainty and parameter uncertainty. Fatigue life estimation of gas turbine blades is a critical issue for the operation and health management of modern aircraft engines. Since many life prediction models have been proposed for the fatigue life of gas turbine blades, model uncertainty and model selection among these models have consequently become an important issue in the lifecycle management of turbine blades. In this paper, fatigue life estimation is carried out by considering model uncertainty and parameter uncertainty simultaneously. It is formulated as the joint posterior distribution of a fatigue life prediction model and its model parameters using the Bayesian inference method. The Bayes factor is incorporated to implement model selection with the quantified model uncertainty. The Markov chain Monte Carlo method is used to facilitate the calculation. A pictorial framework and a step-by-step procedure of the Bayesian inference method for fatigue life estimation considering model uncertainty are presented. Fatigue life estimation of a gas turbine blade is implemented to demonstrate the proposed method.
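A Bayes factor compares the marginal likelihoods of competing models; a toy sketch using a coin-flip sequence rather than fatigue-life models, to show the mechanics only:

```python
from math import factorial

def marginal_uniform(k, n):
    """Marginal likelihood of a specific 0/1 sequence with k ones under a
    Uniform(0,1) prior on the success probability: the Beta(k+1, n-k+1)
    normalizing constant, k! (n-k)! / (n+1)!."""
    return factorial(k) * factorial(n - k) / factorial(n + 1)

def marginal_fixed(k, n, p=0.5):
    """Marginal likelihood of the same sequence under a fixed-p model."""
    return p ** k * (1 - p) ** (n - k)

# 8 successes in 10 trials: compare "p is unknown" against "p = 0.5".
k, n = 8, 10
bayes_factor = marginal_uniform(k, n) / marginal_fixed(k, n)
```

A Bayes factor above 1 favors the first model; the same ratio of marginal likelihoods, computed via MCMC, drives the model selection among fatigue-life models in the paper.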
Dorn, C; Khan, A; Heng, K; Alibert, Y; Helled, R; Rivoldini, A; Benz, W
2016-01-01
We aim to present a generalized Bayesian inference method for constraining interiors of super Earths and sub-Neptunes. Our methodology succeeds in quantifying the degeneracy and correlation of structural parameters for high dimensional parameter spaces. Specifically, we identify what constraints can be placed on composition and thickness of core, mantle, ice, ocean, and atmospheric layers given observations of mass, radius, and bulk refractory abundance constraints (Fe, Mg, Si) from observations of the host star's photospheric composition. We employed a full probabilistic Bayesian inference analysis that formally accounts for observational and model uncertainties. Using a Markov chain Monte Carlo technique, we computed joint and marginal posterior probability distributions for all structural parameters of interest. We included state-of-the-art structural models based on self-consistent thermodynamics of core, mantle, high-pressure ice, and liquid water. Furthermore, we tested and compared two different atmosp...
Stochastic Collapsed Variational Bayesian Inference for Latent Dirichlet Allocation
J. Foulds; L. Boyles; C. DuBois; P. Smyth; M. Welling
2013-01-01
There has been an explosion in the amount of digital text information available in recent years, leading to challenges of scale for traditional inference algorithms for topic models. Recent advances in stochastic variational inference algorithms for latent Dirichlet allocation (LDA) have made it fea
Dorazio, R.M.; Johnson, F.A.
2003-01-01
Bayesian inference and decision theory may be used in the solution of relatively complex problems of natural resource management, owing to recent advances in statistical theory and computing. In particular, Markov chain Monte Carlo algorithms provide a computational framework for fitting models of adequate complexity and for evaluating the expected consequences of alternative management actions. We illustrate these features using an example based on management of waterfowl habitat.
Seyed Taghi Akhavan Niaki; Mohammad Saber Fallah Nezhad
2007-01-01
In order to design a decision-making framework in production environments, in this study we use both stochastic dynamic programming and Bayesian inference concepts. Using the posterior probability that the production process is in state λ (the hazard rate of defective products), we first formulate the problem as a stochastic dynamic programming model. Next, we derive some properties of the optimal value of the objective function. Then, we propose a solution algorithm. At the end, the...
Inference of Gene Regulatory Network Based on Local Bayesian Networks
Liu, Fei; Zhang, Shao-Wu; Guo, Wei-Feng; Chen, Luonan
2016-01-01
The inference of gene regulatory networks (GRNs) from expression data can mine the direct regulations among genes and gain deep insights into biological processes at a network level. During past decades, numerous computational approaches have been introduced for inferring the GRNs. However, many of them still suffer from various problems, e.g., Bayesian network (BN) methods cannot handle large-scale networks due to their high computational complexity, while information theory-based methods cannot identify the directions of regulatory interactions and also suffer from false positive/negative problems. To overcome the limitations, in this work we present a novel algorithm, namely local Bayesian network (LBN), to infer GRNs from gene expression data by using the network decomposition strategy and false-positive edge elimination scheme. Specifically, LBN algorithm first uses conditional mutual information (CMI) to construct an initial network or GRN, which is decomposed into a number of local networks or GRNs. Then, BN method is employed to generate a series of local BNs by selecting the k-nearest neighbors of each gene as its candidate regulatory genes, which significantly reduces the exponential search space from all possible GRN structures. Integrating these local BNs forms a tentative network or GRN by performing CMI, which reduces redundant regulations in the GRN and thus alleviates the false positive problem. The final network or GRN can be obtained by iteratively performing CMI and local BN on the tentative network. In the iterative process, the false or redundant regulations are gradually removed. When tested on the benchmark GRN datasets from DREAM challenge as well as the SOS DNA repair network in E.coli, our results suggest that LBN outperforms other state-of-the-art methods (ARACNE, GENIE3 and NARROMI) significantly, with more accurate and robust performance. In particular, the decomposition strategy with local Bayesian networks not only effectively reduce
Ball, William T; Egerton, Jack S; Haigh, Joanna D
2014-01-01
We investigate the relationship between spectral solar irradiance (SSI) and ozone in the tropical upper stratosphere. We find that solar cycle (SC) changes in ozone can be well approximated by considering the ozone response to SSI changes in a small number of individual wavelength bands between 176 and 310 nm, operating independently of each other. Additionally, we find that ozone varies approximately linearly with changes in the SSI. Using these facts, we present a Bayesian formalism for inferring SC SSI changes and uncertainties from measured SC ozone profiles. Bayesian inference is a powerful, mathematically self-consistent method of considering both the uncertainties of the data and additional external information to provide the best estimate of the parameters being estimated. Using this method, we show that, given measurement uncertainties in both ozone and SSI datasets, it is not currently possible to distinguish between observed and modelled SSI datasets using available estimates of ozone change profiles, ...
Ødegård, Jørgen; Meuwissen, Theo HE; Heringstad, Bjørg;
2010-01-01
… relationship matrix, but genetic (co)variance components are inferred from the sampled breeding values and relationships between "informative" individuals (usually parents) only. The latter is analogous to a sire-dam model (in cases with no individual records on the parents). Results: When applied to simulated …, residual variance on the underlying scale is not identifiable. Hence, variance of fully confounded Mendelian sampling deviations cannot be identified either, but can be inferred from the between-family variation. In the new algorithm, breeding values are sampled as in a standard animal model using the full …
Gaffney, Jim A; Sonnad, Vijay; Libby, Stephen B
2013-01-01
First principles microphysics models are essential to the design and analysis of high energy density physics experiments. Using experimental data to investigate the underlying physics is also essential, particularly when simulations and experiments are not consistent with each other. This is a difficult task, due to the large number of physical models that play a role, and due to the complex (and as a result, noisy) nature of the experiments. This results in a large number of parameters that make any inference a daunting task; it is also very important to consistently treat both experimental and prior understanding of the problem. In this paper we present a Bayesian method that includes both these effects, and allows the inference of a set of modifiers which have been constructed to give information about microphysics models from experimental data. We pay particular attention to radiation transport models. The inference takes into account a large set of experimental parameters and an estimate of the prior kno...
Hadwin, Paul J.; Sipkens, T. A.; Thomson, K. A.; Liu, F.; Daun, K. J.
2016-01-01
Auto-correlated laser-induced incandescence (AC-LII) infers the soot volume fraction (SVF) of soot particles by comparing the spectral incandescence from laser-energized particles to the pyrometrically inferred peak soot temperature. This calculation requires detailed knowledge of model parameters such as the absorption function of soot, which may vary with combustion chemistry, soot age, and the internal structure of the soot. This work presents a Bayesian methodology to quantify such uncertainties. This technique treats the additional "nuisance" model parameters, including the soot absorption function, as stochastic variables and incorporates the current state of knowledge of these parameters into the inference process through maximum entropy priors. While standard AC-LII analysis provides a point estimate of the SVF, Bayesian techniques infer the posterior probability density, which will allow scientists and engineers to better assess the reliability of AC-LII inferred SVFs in the context of environmental regulations and competing diagnostics.
Bayesian Fusion Algorithm for Inferring Trust in Wireless Sensor Networks
Mohammad Momani
2010-07-01
This paper introduces a new Bayesian fusion algorithm to combine more than one trust component (data trust and communication trust) to infer the overall trust between nodes. This research work proposes that one trust component is not enough when deciding on whether or not to trust a specific node in a wireless sensor network. This paper discusses and analyses the results from the communication trust component (binary) and the data trust component (continuous) and proves that either component by itself can mislead the network and eventually cause a total breakdown of the network. As a result of this, new algorithms are needed to combine more than one trust component to infer the overall trust. The proposed algorithm is simple and generic as it allows trust components to be added and deleted easily. Simulation results demonstrate that a node is highly trustworthy provided that both trust components simultaneously confirm its trustworthiness and, conversely, a node is highly untrustworthy if its untrustworthiness is asserted by both components.
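A minimal sketch of this kind of fusion, assuming a Beta-Bernoulli model for the binary communication component and a simple weighted combination. The weighting rule and prior values here are illustrative; the paper's actual fusion algorithm may differ:

```python
def beta_trust(successes, failures, a0=1.0, b0=1.0):
    """Posterior mean of a Beta(a0, b0) prior updated with binary outcomes."""
    return (a0 + successes) / (a0 + b0 + successes + failures)

def fuse_trust(comm_trust, data_trust, w=0.5):
    """Illustrative linear fusion; w weights the communication component."""
    return w * comm_trust + (1.0 - w) * data_trust

# A node with good communication behaviour but poor data consistency ends up
# with only a middling overall trust score, as the abstract argues it should.
comm = beta_trust(successes=18, failures=2)
overall = fuse_trust(comm, data_trust=0.2)
```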
Bayesian Modelling of fMRI Time Series
Højen-Sørensen, Pedro; Hansen, Lars Kai; Rasmussen, Carl Edward
2000-01-01
We present a Hidden Markov Model (HMM) for inferring the hidden psychological state (or neural activity) during single trial fMRI activation experiments with blocked task paradigms. Inference is based on Bayesian methodology, using a combination of analytical and a variety of Markov Chain Monte...
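The core likelihood computation in such an HMM can be sketched with the standard forward algorithm in log space. This is the generic recursion, not the paper's specific analytical/MCMC hybrid:

```python
import numpy as np

def forward_loglik(log_pi, log_A, log_B):
    """HMM log-likelihood via the forward algorithm.

    log_pi: (K,) initial-state log-probabilities
    log_A:  (K, K) transition log-probabilities, rows indexed by from-state
    log_B:  (T, K) per-timestep observation log-likelihoods
    """
    alpha = log_pi + log_B[0]
    for t in range(1, len(log_B)):
        # log-sum-exp over the previous state, for each current state
        alpha = np.logaddexp.reduce(alpha[:, None] + log_A, axis=0) + log_B[t]
    return np.logaddexp.reduce(alpha)

# Sanity check: with uniform dynamics and emission likelihood 0.5 everywhere,
# the sequence likelihood is 0.5**T regardless of the hidden path.
K, T = 2, 3
log_half = np.log(0.5)
loglik = forward_loglik(np.full(K, log_half),
                        np.full((K, K), log_half),
                        np.full((T, K), log_half))
```

Posterior state probabilities (the "hidden psychological state" of the abstract) would additionally require the backward pass, which has the same structure.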
A formal model of interpersonal inference
Michael Moutoussis
2014-03-01
Introduction: We propose that active Bayesian inference – a general framework for decision-making – can equally be applied to interpersonal exchanges. Social cognition, however, entails special challenges. We address these challenges through a novel formulation of a formal model and demonstrate its psychological significance. Method: We review relevant literature, especially with regard to interpersonal representations, formulate a mathematical model and present a simulation study. The model accommodates normative models from utility theory and places them within the broader setting of Bayesian inference. Crucially, we endow people's prior beliefs, into which utilities are absorbed, with preferences of self and others. The simulation illustrates the model's dynamics and furnishes elementary predictions of the theory. Results: 1. Because beliefs about self and others inform both the desirability and plausibility of outcomes, in this framework interpersonal representations become beliefs that have to be actively inferred. This inference, akin to 'mentalising' in the psychological literature, is based upon the outcomes of interpersonal exchanges. 2. We show how some well-known social-psychological phenomena (e.g. self-serving biases) can be explained in terms of active interpersonal inference. 3. Mentalising naturally entails Bayesian updating of how people value social outcomes. Crucially, this includes inference about one's own qualities and preferences. Conclusion: We inaugurate a Bayes-optimal framework for modelling intersubject variability in mentalising during interpersonal exchanges. Here, interpersonal representations are endowed with explicit functional and affective properties. We suggest the active inference framework lends itself to the study of psychiatric conditions where mentalising is distorted.
Michael J McGeachie
2014-06-01
Bayesian Networks (BNs) have been a popular predictive modeling formalism in bioinformatics, but their application in modern genomics has been slowed by an inability to cleanly handle domains with mixed discrete and continuous variables. Existing free BN software packages either discretize continuous variables, which can lead to information loss, or do not include inference routines, which makes prediction with the BN impossible. We present CGBayesNets, a BN package focused on prediction of a clinical phenotype from mixed discrete and continuous variables, which fills these gaps. CGBayesNets implements Bayesian likelihood and inference algorithms for the conditional Gaussian Bayesian network (CGBN) formalism, one appropriate for predicting an outcome of interest from, e.g., multimodal genomic data. We provide four different network learning algorithms, each making a different tradeoff between computational cost and network likelihood. CGBayesNets provides a full suite of functions for model exploration and verification, including cross validation, bootstrapping, and AUC manipulation. We highlight several results obtained previously with CGBayesNets, including predictive models of wood properties from tree genomics, leukemia subtype classification from mixed genomic data, and robust prediction of intensive care unit mortality outcomes from metabolomic profiles. We also provide detailed example analysis on public metabolomic and gene expression datasets. CGBayesNets is implemented in MATLAB and available as MATLAB source code, under an Open Source license and anonymous download at http://www.cgbayesnets.com.
Bayesian electron density inference from JET lithium beam emission spectra using Gaussian processes
Kwak, Sehyun; Brix, M; Ghim, Y -c
2016-01-01
A Bayesian model to infer edge electron density profiles is developed for the JET lithium beam emission spectroscopy system, measuring Li I line radiation using 26 channels with ~1 cm spatial resolution and 10~20 ms temporal resolution. The density profile is modelled using a Gaussian process prior, and the uncertainty of the density profile is calculated by a Markov Chain Monte Carlo (MCMC) scheme. From the spectra measured by the transmission grating spectrometer, the Li line intensities are extracted, and modelled as a function of the plasma density by a multi-state model which describes the relevant processes between neutral lithium beam atoms and plasma particles. The spectral model fully takes into account interference filter and instrument effects, that are separately estimated, again using Gaussian processes. The line intensities are inferred based on a spectral model consistent with the measured spectra within their uncertainties, which includes photon statistics and electronic noise. Our newly devel...
Sraj, Ihab
2015-10-22
This paper addresses model dimensionality reduction for Bayesian inference based on prior Gaussian fields with uncertainty in the covariance function hyper-parameters. The dimensionality reduction is traditionally achieved using the Karhunen-Loève expansion of a prior Gaussian process assuming covariance function with fixed hyper-parameters, despite the fact that these are uncertain in nature. The posterior distribution of the Karhunen-Loève coordinates is then inferred using available observations. The resulting inferred field is therefore dependent on the assumed hyper-parameters. Here, we seek to efficiently estimate both the field and covariance hyper-parameters using Bayesian inference. To this end, a generalized Karhunen-Loève expansion is derived using a coordinate transformation to account for the dependence with respect to the covariance hyper-parameters. Polynomial Chaos expansions are employed for the acceleration of the Bayesian inference using similar coordinate transformations, enabling us to avoid expanding explicitly the solution dependence on the uncertain hyper-parameters. We demonstrate the feasibility of the proposed method on a transient diffusion equation by inferring spatially-varying log-diffusivity fields from noisy data. The inferred profiles were found closer to the true profiles when including the hyper-parameters’ uncertainty in the inference formulation.
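The standard fixed-hyperparameter Karhunen-Loève construction that this work generalizes can be sketched as an eigendecomposition of a covariance matrix on a grid. The squared-exponential kernel, grid, and truncation level below are illustrative choices, not the paper's setup:

```python
import numpy as np

def kl_modes(xs, corr_len, var=1.0, n_modes=5):
    """Leading Karhunen-Loeve modes of a squared-exponential Gaussian field."""
    d = xs[:, None] - xs[None, :]
    cov = var * np.exp(-0.5 * (d / corr_len) ** 2)
    lam, phi = np.linalg.eigh(cov)            # eigenvalues in ascending order
    order = np.argsort(lam)[::-1][:n_modes]   # keep the largest n_modes
    return lam[order], phi[:, order]

xs = np.linspace(0.0, 1.0, 64)
lam, phi = kl_modes(xs, corr_len=0.2)

# A field realization: sum of modes weighted by sqrt(eigenvalue) times
# standard-normal coordinates xi_k -- the coordinates that get inferred.
rng = np.random.default_rng(1)
xi = rng.normal(size=lam.size)
field = phi @ (np.sqrt(lam) * xi)
```

In the generalized expansion of the paper, the eigenpairs themselves become functions of the uncertain covariance hyperparameters (here, `corr_len` and `var`), which is what the coordinate transformation handles.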
Bayesian Nonparametrics in Topic Modeling: A Brief Tutorial
Spangher, Alexander
2015-01-01
Using nonparametric methods has been increasingly explored in Bayesian hierarchical modeling as a way to increase model flexibility. Although the field shows a lot of promise, inference in many models, including Hierarchical Dirichlet Processes (HDP), remains prohibitively slow. One promising path forward is to exploit the submodularity inherent in the Indian Buffet Process (IBP) to derive near-optimal solutions in polynomial time. In this work, I will present a brief tutorial on Bayesian nonparame...
Bayesian inference of solar and stellar magnetic fields in the weak-field approximation
Ramos, A Asensio
2011-01-01
The weak-field approximation is one of the simplest models that allows us to relate the observed polarization induced by the Zeeman effect with the magnetic field vector present in the plasma of interest. It is usually applied for diagnosing magnetic fields in the solar and stellar atmospheres. A fully Bayesian approach to the inference of magnetic properties in unresolved structures is presented. The analytical expression for the marginal posterior distribution is obtained, from which we can obtain statistically relevant information about the model parameters. The role of a priori information is discussed, and a hierarchical procedure is presented that gives robust results that are almost insensitive to the precise choice of the prior. The strength of the formalism is demonstrated through an application to IMaX data. Bayesian methods can optimally exploit data from filter-polarimeters given the scarcity of spectral information as compared with spectro-polarimeters. The effect of noise and how it degrades ou...
Bayesian Inference on Predictors of Sex of the Baby
Scarpa, Bruno
2016-01-01
It is well known that the sex ratio at birth is a biological constant, being about 106 boys to 100 girls. However, couples have always wanted to know, and to decide in advance, the sex of a newborn. For example, a large number of papers have appeared connecting biometrical variables, such as the length of the follicular phase in the woman's menstrual cycle or the timing of intercourse acts, to the sex of the new baby. In this paper, we propose a Bayesian model to validate some of these theories by using an independent dat...
Planetary micro-rover operations on Mars using a Bayesian framework for inference and control
Post, Mark A.; Li, Junquan; Quine, Brendan M.
2016-03-01
With the recent progress toward the application of commercially-available hardware to small-scale space missions, it is now becoming feasible for groups of small, efficient robots based on low-power embedded hardware to perform simple tasks on other planets in the place of large-scale, heavy and expensive robots. In this paper, we describe design and programming of the Beaver micro-rover developed for Northern Light, a Canadian initiative to send a small lander and rover to Mars to study the Martian surface and subsurface. For a small, hardware-limited rover to handle an uncertain and mostly unknown environment without constant management by human operators, we use a Bayesian network of discrete random variables as an abstraction of expert knowledge about the rover and its environment, and inference operations for control. A framework for efficient construction and inference into a Bayesian network using only the C language and fixed-point mathematics on embedded hardware has been developed for the Beaver to make intelligent decisions with minimal sensor data. We study the performance of the Beaver as it probabilistically maps a simple outdoor environment with sensor models that include uncertainty. Results indicate that the Beaver and other small and simple robotic platforms can make use of a Bayesian network to make intelligent decisions in uncertain planetary environments.
Bayesian inference and life testing plans for generalized exponential distribution
KUNDU; Debasis; PRADHAN; Biswabrata
2009-01-01
Recently the generalized exponential distribution has received considerable attention. In this paper, we deal with the Bayesian inference of the unknown parameters of the progressively censored generalized exponential distribution. It is assumed that the scale and the shape parameters have independent gamma priors. The Bayes estimates of the unknown parameters cannot be obtained in closed form. Lindley's approximation and importance sampling techniques have been suggested to compute the approximate Bayes estimates. The Markov chain Monte Carlo method has been used to compute the approximate Bayes estimates and also to construct the highest posterior density credible intervals. We also provide different criteria to compare two different sampling schemes and hence to find the optimal sampling schemes. It is observed that finding the optimum censoring procedure is a computationally expensive process, and we recommend using the sub-optimal censoring procedure, which can be obtained very easily. Monte Carlo simulations are performed to compare the performances of the different methods, and one data analysis has been performed for illustrative purposes.
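A random-walk Metropolis sketch for the generalized exponential model with independent Gamma(1, 1) priors. This simplifies the paper's setting (no progressive censoring, and plain Metropolis rather than Lindley's approximation or importance sampling); the tuning constants are illustrative:

```python
import numpy as np

def ge_loglik(x, alpha, lam):
    """Log-likelihood of the generalized exponential, F(t) = (1 - e^{-lam t})^alpha."""
    u = -np.expm1(-lam * x)                # 1 - exp(-lam*x), computed stably
    return (x.size * (np.log(alpha) + np.log(lam))
            - lam * x.sum() + (alpha - 1.0) * np.log(u).sum())

def mh_sample(x, n_iter=3000, step=0.1, seed=0):
    """Random-walk Metropolis on (log alpha, log lam) with Gamma(1, 1) priors."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(2)                    # start at alpha = lam = 1

    def logpost(t):
        a, l = np.exp(t)
        # Gamma(1, 1) priors contribute -a - l; t.sum() is the log-Jacobian
        # of sampling on the log scale.
        return ge_loglik(x, a, l) - a - l + t.sum()

    lp, draws = logpost(theta), []
    for _ in range(n_iter):
        prop = theta + step * rng.normal(size=2)
        lp_prop = logpost(prop)
        if np.log(rng.random()) < lp_prop - lp:
            theta, lp = prop, lp_prop
        draws.append(np.exp(theta))
    return np.array(draws)

# Synthetic data: alpha = 1 reduces to an ordinary exponential with rate lam = 2,
# so the posterior should concentrate near (1, 2).
data = np.random.default_rng(42).exponential(scale=0.5, size=300)
draws = mh_sample(data)
```

Posterior means and highest-density intervals can then be read off the post-burn-in draws.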
Internal dosimetry of uranium isotopes using bayesian inference methods
A group of personnel at Los Alamos National Laboratory is routinely monitored for the presence of uranium isotopes by urine bioassay. Samples are analysed by alpha spectroscopy, and the results are examined for evidence of an intake of uranium. Because the measurement uncertainties are often comparable to the quantities of material we wish to detect, statistical considerations are crucial for the proper interpretation of the data. The problem is further complicated by the significant, but highly non-uniform, presence of uranium in local drinking water and, in some cases, food supply. Software originally developed for internal dosimetry of plutonium has been adapted to the problem of uranium dosimetry. The software uses an unfolding algorithm to calculate an approximate Bayesian solution to the problem of characterising any intakes which may have occurred, given the history of urine bioassay results for each individual in the monitored population. The program uses biokinetic models from ICRP Publications 68 and later, and a prior probability distribution derived empirically from the body of uranium bioassay data collected at Los Alamos over the operating history of the Laboratory. For each individual, the software creates a posterior probability distribution of intake quantity and solubility type as a function of time. From this distribution, estimates are made of the cumulative committed dose (CEDE) to each individual. Results of the method are compared with those obtained using an earlier classical (non-Bayesian) algorithm for uranium dosimetry. We also discuss the problem of distinguishing occupational intakes from intake of environmental uranium, within a Bayesian framework. (author)
Jones, Matt; Love, Bradley C
2011-08-01
The prominence of Bayesian modeling of cognition has increased recently largely because of mathematical advances in specifying and deriving predictions from complex probabilistic models. Much of this research aims to demonstrate that cognitive behavior can be explained from rational principles alone, without recourse to psychological or neurological processes and representations. We note commonalities between this rational approach and other movements in psychology - namely, Behaviorism and evolutionary psychology - that set aside mechanistic explanations or make use of optimality assumptions. Through these comparisons, we identify a number of challenges that limit the rational program's potential contribution to psychological theory. Specifically, rational Bayesian models are significantly unconstrained, both because they are uninformed by a wide range of process-level data and because their assumptions about the environment are generally not grounded in empirical measurement. The psychological implications of most Bayesian models are also unclear. Bayesian inference itself is conceptually trivial, but strong assumptions are often embedded in the hypothesis sets and the approximation algorithms used to derive model predictions, without a clear delineation between psychological commitments and implementational details. Comparing multiple Bayesian models of the same task is rare, as is the realization that many Bayesian models recapitulate existing (mechanistic level) theories. Despite the expressive power of current Bayesian models, we argue they must be developed in conjunction with mechanistic considerations to offer substantive explanations of cognition. We lay out several means for such an integration, which take into account the representations on which Bayesian inference operates, as well as the algorithms and heuristics that carry it out. We argue this unification will better facilitate lasting contributions to psychological theory, avoiding the pitfalls
Mocapy++ - A toolkit for inference and learning in dynamic Bayesian networks
Paluszewski, Martin; Hamelryck, Thomas Wim
2010-01-01
Background: Mocapy++ is a toolkit for parameter learning and inference in dynamic Bayesian networks (DBNs). It supports a wide range of DBN architectures and probability distributions, including distributions from directional statistics (the statistics of angles, directions and orientations...
VIGoR: Variational Bayesian Inference for Genome-Wide Regression
Onogi, Akio; Iwata, Hiroyoshi
2016-01-01
Genome-wide regression using a number of genome-wide markers as predictors is now widely used for genome-wide association mapping and genomic prediction. We developed novel software for genome-wide regression which we named VIGoR (variational Bayesian inference for genome-wide regression). Variational Bayesian inference is computationally much faster than widely used Markov chain Monte Carlo algorithms. VIGoR implements seven regression methods, and is provided as a command line program packa...
Hierarchical Bayesian inference of galaxy redshift distributions from photometric surveys
Leistedt, Boris; Peiris, Hiranya V
2016-01-01
Accurately characterizing the redshift distributions of galaxies is essential for analysing deep photometric surveys and testing cosmological models. We present a technique to simultaneously infer redshift distributions and individual redshifts from photometric galaxy catalogues. Our model constructs a piecewise constant representation (effectively a histogram) of the distribution of galaxy types and redshifts, the parameters of which are efficiently inferred from noisy photometric flux measurements. This approach can be seen as a generalization of template-fitting photometric redshift methods and relies on a library of spectral templates to relate the photometric fluxes of individual galaxies to their redshifts. We illustrate this technique on simulated galaxy survey data, and demonstrate that it delivers correct posterior distributions on the underlying type and redshift distributions, as well as on the individual types and redshifts of galaxies. We show that even with uninformative priors, large photometri...
Perkins, S. J.; Marais, P. C.; Zwart, J. T. L.; Natarajan, I.; Tasse, C.; Smirnov, O.
2015-09-01
We present Montblanc, a GPU implementation of the Radio interferometer measurement equation (RIME) in support of the Bayesian inference for radio observations (BIRO) technique. BIRO uses Bayesian inference to select sky models that best match the visibilities observed by a radio interferometer. To accomplish this, BIRO evaluates the RIME multiple times, varying sky model parameters to produce multiple model visibilities. χ2 values computed from the model and observed visibilities are used as likelihood values to drive the Bayesian sampling process and select the best sky model. As most of the elements of the RIME and χ2 calculation are independent of one another, they are highly amenable to parallel computation. Additionally, Montblanc caters for iterative RIME evaluation to produce multiple χ2 values. Modified model parameters are transferred to the GPU between each iteration. We implemented Montblanc as a Python package based upon NVIDIA's CUDA architecture. As such, it is easy to extend and implement different pipelines. At present, Montblanc supports point and Gaussian morphologies, but is designed for easy addition of new source profiles. Montblanc's RIME implementation is performant: On an NVIDIA K40, it is approximately 250 times faster than MEQTREES on a dual hexacore Intel E5-2620v2 CPU. Compared to the OSKAR simulator's GPU-implemented RIME components it is 7.7 and 12 times faster on the same K40 for single and double-precision floating point respectively. However, OSKAR's RIME implementation is more general than Montblanc's BIRO-tailored RIME. Theoretical analysis of Montblanc's dominant CUDA kernel suggests that it is memory bound. In practice, profiling shows that it is balanced between compute and memory, as much of the data required by the problem is retained in L1 and L2 caches.
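The χ2 term that drives the sampling in this scheme is simple to state; a minimal NumPy version over complex visibilities (a serial sketch, not Montblanc's CUDA kernel) is:

```python
import numpy as np

def vis_chi2(model_vis, obs_vis, sigma):
    """Chi-squared between model and observed complex visibilities.

    Treats real and imaginary parts as independent Gaussian measurements
    with per-visibility standard deviation sigma.
    """
    r = obs_vis - model_vis
    return np.sum((r.real ** 2 + r.imag ** 2) / sigma ** 2)

# Synthetic check: observed = model + noise of known sigma, so chi2 should
# land near the number of real degrees of freedom (2 per complex visibility).
rng = np.random.default_rng(0)
model = rng.normal(size=100) + 1j * rng.normal(size=100)
obs = model + 0.1 * (rng.normal(size=100) + 1j * rng.normal(size=100))
chi2 = vis_chi2(model, obs, sigma=0.1)
```

The un-normalised Gaussian log-likelihood fed to the sampler is then simply -0.5 * chi2.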
Occurrence of hazardous accidents in nuclear power plants and industrial units usually leads to the release of radioactive materials and pollutants into the environment. These materials and pollutants can be transported far downstream by the wind flow. In this paper, we implemented an atmospheric dispersion code to solve the inverse problem. Having received and detected the pollutants in one region, we may estimate the rate and location of the unknown source. The modelling requires a model capable of atmospheric dispersion calculation, together with a mathematical approach to infer the source location and the related release rates. In this paper the AERMOD software and Bayesian inference along with Markov chain Monte Carlo have been applied. Applying the Bayesian approach and Markov chain Monte Carlo to this subject is not new, but coupling them with AERMOD, a new and well-known regulatory model, is, and it enhances the reliability of the outcomes. To evaluate the method, an example is considered by defining pollutant concentrations in a specific region and then obtaining the source location and intensity by a direct calculation. The result of the calculation estimates the average source location at a distance of 7 km with an accuracy of 5 m, which is good enough to support the ability of the proposed algorithm.
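Under stated simplifying assumptions — a toy isotropic plume standing in for an AERMOD run, a flat prior, a known release rate, and a grid approximation in place of MCMC — the source-inversion idea can be sketched as:

```python
import numpy as np

def toy_plume(sx, sy, q, rx, ry, sigma=60.0):
    """Illustrative forward model: concentration decays with receptor distance.

    sx, sy: source location; q: release strength; rx, ry: receptor coordinates.
    """
    r2 = (rx - sx) ** 2 + (ry - sy) ** 2
    return q * np.exp(-r2 / (2.0 * sigma ** 2))

def grid_posterior(obs, rx, ry, cand, q, noise_sd):
    """Un-normalised log-posterior of source location over candidate points."""
    return np.array([
        -0.5 * np.sum((obs - toy_plume(sx, sy, q, rx, ry)) ** 2) / noise_sd ** 2
        for sx, sy in cand
    ])

# Receptors on a 5x5 grid; synthetic observations from a source at (120, 80).
rx, ry = [g.ravel() for g in np.meshgrid(np.linspace(0, 200, 5),
                                         np.linspace(0, 200, 5))]
rng = np.random.default_rng(3)
obs = toy_plume(120.0, 80.0, q=1.0, rx=rx, ry=ry) + 0.01 * rng.normal(size=rx.size)

# Posterior maximum over a coarse candidate grid recovers the source location.
cand = [(sx, sy) for sx in np.arange(0, 201, 20) for sy in np.arange(0, 201, 20)]
best = cand[int(np.argmax(grid_posterior(obs, rx, ry, cand, q=1.0, noise_sd=0.01)))]
```

A real analysis would replace the grid search with MCMC over (location, rate) and the toy plume with the regulatory dispersion model, as the abstract describes.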
Bayesian model reduction and empirical Bayes for group (DCM) studies.
Friston, Karl J; Litvak, Vladimir; Oswal, Ashwini; Razi, Adeel; Stephan, Klaas E; van Wijk, Bernadette C M; Ziegler, Gabriel; Zeidman, Peter
2016-03-01
This technical note describes some Bayesian procedures for the analysis of group studies that use nonlinear models at the first (within-subject) level - e.g., dynamic causal models - and linear models at subsequent (between-subject) levels. Its focus is on using Bayesian model reduction to finesse the inversion of multiple models of a single dataset or a single (hierarchical or empirical Bayes) model of multiple datasets. These applications of Bayesian model reduction allow one to consider parametric random effects and make inferences about group effects very efficiently (in a few seconds). We provide the relatively straightforward theoretical background to these procedures and illustrate their application using a worked example. This example uses a simulated mismatch negativity study of schizophrenia. We illustrate the robustness of Bayesian model reduction to violations of the (commonly used) Laplace assumption in dynamic causal modelling and show how its recursive application can facilitate both classical and Bayesian inference about group differences. Finally, we consider the application of these empirical Bayesian procedures to classification and prediction. PMID:26569570
Modeling Social Annotation: a Bayesian Approach
Plangprasopchok, Anon
2008-01-01
Collaborative tagging systems, such as del.icio.us, CiteULike, and others, allow users to annotate objects, e.g., Web pages or scientific papers, with descriptive labels called tags. The social annotations, contributed by thousands of users, can potentially be used to infer categorical knowledge, classify documents or recommend new relevant information. Traditional text inference methods do not make best use of socially-generated data, since they do not take into account variations in individual users' perspectives and vocabulary. In a previous work, we introduced a simple probabilistic model that takes interests of individual annotators into account in order to find hidden topics of annotated objects. Unfortunately, our proposed approach had a number of shortcomings, including overfitting, local maxima and the requirement to specify values for some parameters. In this paper we address these shortcomings in two ways. First, we extend the model to a fully Bayesian framework. Second, we describe an infinite ver...
Bayesian Model Averaging for Propensity Score Analysis
Kaplan, David; Chen, Jianshen
2013-01-01
The purpose of this study is to explore Bayesian model averaging in the propensity score context. Previous research on Bayesian propensity score analysis does not take into account model uncertainty. In this regard, an internally consistent Bayesian framework for model building and estimation must also account for model uncertainty. The…
Hierarchical Bayesian inference of galaxy redshift distributions from photometric surveys
Leistedt, Boris; Mortlock, Daniel J.; Peiris, Hiranya V.
2016-08-01
Accurately characterizing the redshift distributions of galaxies is essential for analysing deep photometric surveys and testing cosmological models. We present a technique to simultaneously infer redshift distributions and individual redshifts from photometric galaxy catalogues. Our model constructs a piecewise constant representation (effectively a histogram) of the distribution of galaxy types and redshifts, the parameters of which are efficiently inferred from noisy photometric flux measurements. This approach can be seen as a generalization of template-fitting photometric redshift methods and relies on a library of spectral templates to relate the photometric fluxes of individual galaxies to their redshifts. We illustrate this technique on simulated galaxy survey data, and demonstrate that it delivers correct posterior distributions on the underlying type and redshift distributions, as well as on the individual types and redshifts of galaxies. We show that even with uninformative priors, large photometric errors and parameter degeneracies, the redshift and type distributions can be recovered robustly thanks to the hierarchical nature of the model, which is not possible with common photometric redshift estimation techniques. As a result, redshift uncertainties can be fully propagated in cosmological analyses for the first time, fulfilling an essential requirement for the current and future generations of surveys.
TYPE Ia SUPERNOVA LIGHT-CURVE INFERENCE: HIERARCHICAL BAYESIAN ANALYSIS IN THE NEAR-INFRARED
We present a comprehensive statistical analysis of the properties of Type Ia supernova (SN Ia) light curves in the near-infrared using recent data from Peters Automated InfraRed Imaging TELescope and the literature. We construct a hierarchical Bayesian framework, incorporating several uncertainties including photometric error, peculiar velocities, dust extinction, and intrinsic variations, for principled and coherent statistical inference. SN Ia light-curve inferences are drawn from the global posterior probability of parameters describing both individual supernovae and the population conditioned on the entire SN Ia NIR data set. The logical structure of the hierarchical model is represented by a directed acyclic graph. Fully Bayesian analysis of the model and data is enabled by an efficient Markov Chain Monte Carlo algorithm exploiting the conditional probabilistic structure using Gibbs sampling. We apply this framework to the JHK_s SN Ia light-curve data. A new light-curve model captures the observed J-band light-curve shape variations. The marginal intrinsic variances in peak absolute magnitudes are σ(M_J) = 0.17 ± 0.03, σ(M_H) = 0.11 ± 0.03, and σ(M_Ks) = 0.19 ± 0.04. We describe the first quantitative evidence for correlations between the NIR absolute magnitudes and J-band light-curve shapes, and demonstrate their utility for distance estimation. The average residual in the Hubble diagram for the training set SNe at cz > 2000 km s^-1 is 0.10 mag. The new application of bootstrap cross-validation to SN Ia light-curve inference tests the sensitivity of the statistical model fit to the finite sample and estimates the prediction error at 0.15 mag. These results demonstrate that SN Ia NIR light curves are as effective as corrected optical light curves, and, because they are less vulnerable to dust absorption, they have great potential as precise and accurate cosmological distance indicators.
Evidence cross-validation and Bayesian inference of MAST plasma equilibria
In this paper, current profiles for plasma discharges on the Mega Ampere Spherical Tokamak (MAST) are directly calculated from pickup coil, flux loop, and motional-Stark-effect observations via methods based on the statistical theory of Bayesian analysis. By representing the toroidal plasma current as a series of axisymmetric current beams with rectangular cross-section and inferring the current in each of these beams, flux-surface geometry and q-profiles are subsequently calculated by elementary application of the Biot-Savart law. The use of this plasma model in the context of Bayesian analysis was pioneered by Svensson and Werner on the Joint European Torus [Svensson and Werner, Plasma Phys. Controlled Fusion 50(8), 085002 (2008)]. In this framework, linear forward models are used to generate diagnostic predictions, and the probability distribution for the currents in the collection of plasma beams is subsequently calculated directly via application of Bayes' formula. In this work, we introduce a new diagnostic technique to identify and remove outlier observations associated with diagnostics falling out of calibration or suffering from an unidentified malfunction. These modifications enable good agreement between the Bayesian inference of the last-closed flux surface and other corroborating data, such as that from force-balance considerations using EFIT++ [Appel et al., "A unified approach to equilibrium reconstruction," Proceedings of the 33rd EPS Conference on Plasma Physics (Rome, Italy, 2006)]. In addition, this analysis also yields errors on the plasma current profile and flux-surface geometry, as well as directly predicting the Shafranov shift of the plasma core.
A Non-Parametric Bayesian Method for Inferring Hidden Causes
Wood, Frank; Griffiths, Thomas; Ghahramani, Zoubin
2012-01-01
We present a non-parametric Bayesian approach to structure learning with hidden causes. Previous Bayesian treatments of this problem define a prior over the number of hidden causes and use algorithms such as reversible jump Markov chain Monte Carlo to move between solutions. In contrast, we assume that the number of hidden causes is unbounded, but only a finite number influence observable variables. This makes it possible to use a Gibbs sampler to approximate the distribution over causal stru...
Large-Scale Optimization for Bayesian Inference in Complex Systems
Willcox, Karen [MIT]; Marzouk, Youssef [MIT]
2013-11-12
The SAGUARO (Scalable Algorithms for Groundwater Uncertainty Analysis and Robust Optimization) Project focused on the development of scalable numerical algorithms for large-scale Bayesian inversion in complex systems that capitalize on advances in large-scale simulation-based optimization and inversion methods. The project was a collaborative effort among MIT, the University of Texas at Austin, Georgia Institute of Technology, and Sandia National Laboratories. The research was directed in three complementary areas: efficient approximations of the Hessian operator, reductions in complexity of forward simulations via stochastic spectral approximations and model reduction, and employing large-scale optimization concepts to accelerate sampling. The MIT--Sandia component of the SAGUARO Project addressed the intractability of conventional sampling methods for large-scale statistical inverse problems by devising reduced-order models that are faithful to the full-order model over a wide range of parameter values; sampling then employs the reduced model rather than the full model, resulting in very large computational savings. Results indicate little effect on the computed posterior distribution. On the other hand, in the Texas--Georgia Tech component of the project, we retain the full-order model, but exploit inverse problem structure (adjoint-based gradients and partial Hessian information of the parameter-to-observation map) to implicitly extract lower dimensional information on the posterior distribution; this greatly speeds up sampling methods, so that fewer sampling points are needed. We can think of these two approaches as ``reduce then sample'' and ``sample then reduce.'' In fact, these two approaches are complementary, and can be used in conjunction with each other. Moreover, they both exploit deterministic inverse problem structure, in the form of adjoint-based gradient and Hessian information of the underlying parameter-to-observation map, to
Validi, AbdoulAhad
2013-01-01
This study introduces a non-intrusive approach, in the context of low-rank separated representations, to construct a surrogate of high-dimensional stochastic functions, e.g., PDEs/ODEs, in order to decrease the computational cost of Markov chain Monte Carlo simulations in Bayesian inference. The surrogate model is constructed via a regularized alternating least-squares regression with Tikhonov regularization, using a roughening matrix that computes the gradient of the solution, in conjunction with a perturbation-based error indicator to detect optimal model complexities. The model approximates a vector of a continuous solution at discrete values of a physical variable. The required number of random realizations to achieve a successful approximation depends linearly on the function dimensionality. The computational cost of the model construction is quadratic in the number of random inputs, which potentially tackles the curse of dimensionality in high-dimensional stochastic functions. Furthermore, this vector-valued sep...
A Nonparametric Bayesian Model for Nested Clustering.
Lee, Juhee; Müller, Peter; Zhu, Yitan; Ji, Yuan
2016-01-01
We propose a nonparametric Bayesian model for clustering where clusters of experimental units are determined by a shared pattern of clustering another set of experimental units. The proposed model is motivated by the analysis of protein activation data, where we cluster proteins such that all proteins in one cluster give rise to the same clustering of patients. That is, we define clusters of proteins by the way that patients group with respect to the corresponding protein activations. This is in contrast to (almost) all currently available models that use shared parameters in the sampling model to define clusters. This includes in particular model based clustering, Dirichlet process mixtures, product partition models, and more. We show results for two typical biostatistical inference problems that give rise to clustering. PMID:26519174
Bayesian Spatial Modelling with R-INLA
Finn Lindgren
2015-02-01
The principles behind the interface to continuous-domain spatial models in the R-INLA software package for R are described. The integrated nested Laplace approximation (INLA) approach proposed by Rue, Martino, and Chopin (2009) is a computationally effective alternative to MCMC for Bayesian inference. INLA is designed for latent Gaussian models, a very wide and flexible class of models ranging from (generalized) linear mixed models to spatial and spatio-temporal models. Combined with the stochastic partial differential equation (SPDE) approach of Lindgren, Rue, and Lindström (2011), one can accommodate all kinds of geographically referenced data, including areal and geostatistical data, as well as spatial point-process data. The implementation interface covers stationary spatial models, non-stationary spatial models, and also spatio-temporal models, and is applicable in epidemiology, ecology, environmental risk assessment, as well as general geostatistics.
Bayesian kinematic earthquake source models
Minson, S. E.; Simons, M.; Beck, J. L.; Genrich, J. F.; Galetzka, J. E.; Chowdhury, F.; Owen, S. E.; Webb, F.; Comte, D.; Glass, B.; Leiva, C.; Ortega, F. H.
2009-12-01
Most coseismic, postseismic, and interseismic slip models are based on highly regularized optimizations which yield one solution which satisfies the data given a particular set of regularizing constraints. This regularization hampers our ability to answer basic questions such as whether seismic and aseismic slip overlap or instead rupture separate portions of the fault zone. We present a Bayesian methodology for generating kinematic earthquake source models with a focus on large subduction zone earthquakes. Unlike classical optimization approaches, Bayesian techniques sample the ensemble of all acceptable models presented as an a posteriori probability density function (PDF), and thus we can explore the entire solution space to determine, for example, which model parameters are well determined and which are not, or what is the likelihood that two slip distributions overlap in space. Bayesian sampling also has the advantage that all a priori knowledge of the source process can be used to mold the a posteriori ensemble of models. Although very powerful, Bayesian methods have up to now been of limited use in geophysical modeling because they are only computationally feasible for problems with a small number of free parameters due to what is called the "curse of dimensionality." However, our methodology can successfully sample solution spaces of many hundreds of parameters, which is sufficient to produce finite fault kinematic earthquake models. Our algorithm is a modification of the tempered Markov chain Monte Carlo (tempered MCMC or TMCMC) method. In our algorithm, we sample a "tempered" a posteriori PDF using many MCMC simulations running in parallel and evolutionary computation in which models which fit the data poorly are preferentially eliminated in favor of models which better predict the data. We present results for both synthetic test problems as well as for the 2007 Mw 7.8 Tocopilla, Chile earthquake, the latter of which is constrained by InSAR, local high
A Bayesian Nonparametric IRT Model
Karabatsos, George
2015-01-01
This paper introduces a flexible Bayesian nonparametric Item Response Theory (IRT) model, which applies to dichotomous or polytomous item responses, and which can apply to either unidimensional or multidimensional scaling. This is an infinite-mixture IRT model, with person ability and item difficulty parameters, and with a random intercept parameter that is assigned a mixing distribution, with mixing weights a probit function of other person and item parameters. As a result of its flexibility...
Bayesian Stable Isotope Mixing Models
Parnell, Andrew C.; Phillips, Donald L.; Bearhop, Stuart; Semmens, Brice X.; Ward, Eric J.; Moore, Jonathan W.; Andrew L Jackson; Inger, Richard
2012-01-01
In this paper we review recent advances in Stable Isotope Mixing Models (SIMMs) and place them into an over-arching Bayesian statistical framework which allows for several useful extensions. SIMMs are used to quantify the proportional contributions of various sources to a mixture. The most widely used application is quantifying the diet of organisms based on the food sources they have been observed to consume. At the centre of the multivariate statistical model we propose is a compositional m...
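The core mixing idea — a consumer's isotopic signature as a convex combination of source signatures, with the mixing proportions as the inferential target — can be sketched with a two-source toy model and a grid posterior. This is an illustration only, not the paper's multivariate compositional model, and all numeric values below are invented.

```python
import math
import random

random.seed(2)

# Two-source toy mixing model (a sketch of the SIMM idea only; the paper's
# model is multivariate and compositional). A consumer's isotope value is a
# convex combination p*S1 + (1-p)*S2 plus Gaussian noise; we infer the diet
# proportion p on a grid. All numbers below are hypothetical.
S1, S2, SIGMA = -24.0, -12.0, 1.0     # hypothetical source signatures
TRUE_P = 0.7
data = [TRUE_P * S1 + (1 - TRUE_P) * S2 + random.gauss(0, SIGMA)
        for _ in range(30)]

grid = [i / 200 for i in range(201)]  # uniform prior on p in [0, 1]
loglik = []
for p in grid:
    mu = p * S1 + (1 - p) * S2
    loglik.append(sum(-0.5 * ((v - mu) / SIGMA) ** 2 for v in data))
m = max(loglik)
w = [math.exp(v - m) for v in loglik]  # normalise in a numerically stable way
p_mean = sum(p * wi for p, wi in zip(grid, w)) / sum(w)

print(round(p_mean, 2))                # posterior mean of p, near 0.7
```

With more than two sources the proportions live on a simplex, which is why the full Bayesian treatment uses compositional (e.g. Dirichlet-type) priors rather than a one-dimensional grid.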
L.S. Vismara
2007-12-01
Dynamics of weed populations can be described as a system of equations relating the produced seed and seedling densities in crop areas. The model parameter values can be either directly inferred from experimentation and statistical analysis or obtained from the literature. The objective of this work was to estimate the weed population density model parameters based on experimental field data at Embrapa Milho e Sorgo, Sete Lagoas, MG, using classical and Bayesian inference.
Bayesian interpolation in a dynamic sinusoidal model with application to packet-loss concealment
Nielsen, Jesper Kjær; Christensen, Mads Græsbøll; Cemgil, Ali Taylan
2010-01-01
a Bayesian inference scheme for the missing observations, hidden states and model parameters of the dynamic model. The inference scheme is based on a Markov chain Monte Carlo method known as the Gibbs sampler. We illustrate the performance of the inference scheme in the application to packet-loss concealment...
Jannson, Tomasz; Wang, Wenjian; Hodelin, Juan; Forrester, Thomas; Romanov, Volodymyr; Kostrzewski, Andrew
2016-05-01
In this paper, Bayesian Binary Sensing (BBS) is discussed as an effective tool for Bayesian Inference (BI) evaluation in interdisciplinary areas such as ISR (and C3I), Homeland Security, QC, medicine, defense, and many others. In particular, the Hilbertian Sine (HS) is introduced as an absolute measure of BI, avoiding the relativity of decision-threshold identification that affects traditional measures of BI based on false positives and false negatives.
Bayesian inference of biochemical kinetic parameters using the linear noise approximation
Finkenstädt Bärbel
2009-10-01
Background: Fluorescent and luminescent gene reporters allow us to dynamically quantify changes in molecular species concentration over time at the single-cell level. The mathematical modelling of their interaction through multivariate dynamical models requires the development of effective statistical methods to calibrate such models against available data. Given the prevalence of stochasticity and noise in biochemical systems, inference for stochastic models is of special interest. In this paper we present a simple and computationally efficient algorithm for the estimation of biochemical kinetic parameters from gene reporter data. Results: We use the linear noise approximation to model biochemical reactions through a stochastic dynamic model which essentially approximates a diffusion model by an ordinary differential equation model with an appropriately defined noise process. An explicit formula for the likelihood function can be derived, allowing for computationally efficient parameter estimation. The proposed algorithm is embedded in a Bayesian framework and inference is performed using Markov chain Monte Carlo. Conclusion: The major advantage of the method is that, in contrast to the more established diffusion-approximation-based methods, the computationally costly methods of data augmentation are not necessary. Our approach also allows for unobserved variables and measurement error. The application of the method to both simulated and experimental data shows that the proposed methodology provides a useful alternative to diffusion-approximation-based methods.
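The overall workflow — an explicit likelihood for a dynamic model, sampled with MCMC — can be sketched with a toy decay reaction. This is a hedged illustration only: the paper's linear-noise-approximation likelihood is replaced by a plain Gaussian one, and every constant below is invented.

```python
import math
import random

random.seed(0)

# Illustrative sketch only: infer the rate k of a toy decay x(t) = X0*exp(-k t)
# from noisy observations, using a random-walk Metropolis sampler under a flat
# prior on k > 0. All constants are hypothetical.
X0, TRUE_K, SIGMA = 10.0, 0.5, 0.2
ts = [0.5 * i for i in range(1, 11)]
data = [X0 * math.exp(-TRUE_K * t) + random.gauss(0, SIGMA) for t in ts]

def log_lik(k):
    if k <= 0:
        return float("-inf")           # prior support: k > 0
    return sum(-0.5 * ((y - X0 * math.exp(-k * t)) / SIGMA) ** 2
               for t, y in zip(ts, data))

k, ll = 1.0, log_lik(1.0)
samples = []
for it in range(5000):
    prop = k + random.gauss(0, 0.05)   # random-walk proposal
    ll_prop = log_lik(prop)
    if math.log(random.random()) < ll_prop - ll:   # Metropolis acceptance
        k, ll = prop, ll_prop
    if it >= 1000:                     # discard burn-in
        samples.append(k)

print(round(sum(samples) / len(samples), 2))       # posterior mean, near 0.5
```

In the paper's setting, `log_lik` would be the closed-form likelihood supplied by the linear noise approximation rather than this simple Gaussian residual sum.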
Raue, Andreas; Theis, Fabian Joachim; Timmer, Jens
2012-01-01
Increasingly complex applications involve large datasets in combination with non-linear and high dimensional mathematical models. In this context, statistical inference is a challenging issue that calls for pragmatic approaches that take advantage of both Bayesian and frequentist methods. The elegance of Bayesian methodology is founded in the propagation of information content provided by experimental data and prior assumptions to the posterior probability distribution of model predictions. However, for complex applications experimental data and prior assumptions potentially constrain the posterior probability distribution insufficiently. In these situations Bayesian Markov chain Monte Carlo sampling can be infeasible. From a frequentist point of view insufficient experimental data and prior assumptions can be interpreted as non-identifiability. The profile likelihood approach offers to detect and to resolve non-identifiability by experimental design iteratively. Therefore, it allows one to better constrain t...
Alsing, Justin; Jaffe, Andrew H
2016-01-01
We apply two Bayesian hierarchical inference schemes to infer shear power spectra, shear maps and cosmological parameters from the CFHTLenS weak lensing survey - the first application of this method to data. In the first approach, we sample the joint posterior distribution of the shear maps and power spectra by Gibbs sampling, with minimal model assumptions. In the second approach, we sample the joint posterior of the shear maps and cosmological parameters, providing a new, accurate and principled approach to cosmological parameter inference from cosmic shear data. As a first demonstration on data we perform a 2-bin tomographic analysis to constrain cosmological parameters and investigate the possibility of photometric redshift bias in the CFHTLenS data. Under the baseline $\\Lambda$CDM model we constrain $S_8 = \\sigma_8(\\Omega_\\mathrm{m}/0.3)^{0.5} = 0.67 ^{\\scriptscriptstyle+ 0.03 }_{\\scriptscriptstyle- 0.03 }$ $(68\\%)$, consistent with previous CFHTLenS analysis but in tension with Planck. Adding neutrino m...
Stochastic model updating utilizing Bayesian approach and Gaussian process model
Wan, Hua-Ping; Ren, Wei-Xin
2016-03-01
Stochastic model updating (SMU) has been increasingly applied in quantifying structural parameter uncertainty from response variability. SMU for parameter uncertainty quantification refers to the problem of inverse uncertainty quantification (IUQ), which is a nontrivial task. Inverse problems solved with optimization usually bring about the issues of gradient computation, ill-conditioning, and non-uniqueness. Moreover, the uncertainty present in the response makes the inverse problem more complicated. In this study, a Bayesian approach is adopted in SMU for parameter uncertainty quantification. The prominent strength of the Bayesian approach for the IUQ problem is that it solves the problem in a straightforward manner, which enables it to avoid the aforementioned issues. However, when applied to engineering structures that are modeled with a high-resolution finite element model (FEM), the Bayesian approach is still computationally expensive, since the commonly used Markov chain Monte Carlo (MCMC) method for Bayesian inference requires a large number of model runs to guarantee convergence. Herein we reduce the computational cost in two aspects. On the one hand, a fast-running Gaussian process model (GPM) is utilized to approximate the time-consuming high-resolution FEM. On the other hand, an advanced MCMC method using the delayed rejection adaptive Metropolis (DRAM) algorithm, which combines a local adaptive strategy with a global adaptive strategy, is employed for Bayesian inference. In addition, we propose the use of powerful variance-based global sensitivity analysis (GSA) in parameter selection to exclude non-influential parameters from the calibration parameters, which yields a reduced-order model and thus further alleviates the computational burden. A simulated aluminum plate and a real-world complex cable-stayed pedestrian bridge are presented to illustrate the proposed framework and verify its feasibility.
A Bayesian method for inferring transmission chains in a partially observed epidemic.
Marzouk, Youssef M.; Ray, Jaideep
2008-10-01
We present a Bayesian approach for estimating transmission chains and rates in the Abakaliki smallpox epidemic of 1967. The epidemic affected 30 individuals in a community of 74; only the dates of appearance of symptoms were recorded. Our model assumes stochastic transmission of the infections over a social network. Distinct binomial random graphs model intra- and inter-compound social connections, while disease transmission over each link is treated as a Poisson process. Link probabilities and rate parameters are objects of inference. Dates of infection and recovery comprise the remaining unknowns. Distributions for smallpox incubation and recovery periods are obtained from historical data. Using Markov chain Monte Carlo, we explore the joint posterior distribution of the scalar parameters and provide an expected connectivity pattern for the social graph and infection pathway.
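The generative side of this model class — contacts as a binomial random graph, transmission along each contact as a Poisson process — can be sketched as a forward simulation. This is an illustration of the model structure only, not the authors' inference code: N matches the community size mentioned above, but P and BETA are hypothetical values.

```python
import heapq
import random

random.seed(3)

# Forward simulation of the model class described above (illustrative only):
# contacts form a binomial random graph G(N, P), and transmission along each
# edge is a Poisson process, i.e. an exponentially distributed delay with
# rate BETA. P and BETA are hypothetical, not inferred values.
N, P, BETA = 74, 0.08, 0.3
adj = {i: [] for i in range(N)}
for i in range(N):
    for j in range(i + 1, N):
        if random.random() < P:
            delay = random.expovariate(BETA)  # transmission delay on this edge
            adj[i].append((j, delay))
            adj[j].append((i, delay))

# First-passage times (Dijkstra) give each person's infection time from the
# index case 0; people outside the reachable component are never infected.
inf_time = {}
heap = [(0.0, 0)]
while heap:
    t, u = heapq.heappop(heap)
    if u in inf_time:
        continue
    inf_time[u] = t
    for v, delay in adj[u]:
        if v not in inf_time:
            heapq.heappush(heap, (t + delay, v))

print(len(inf_time))   # outbreak size = component reachable from the index case
```

Inference then runs this logic in reverse: MCMC proposes link probabilities, rates and unobserved infection dates, scoring each proposal against the recorded symptom dates.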
Mocapy++ - A toolkit for inference and learning in dynamic Bayesian networks
Hamelryck Thomas
2010-03-01
Background: Mocapy++ is a toolkit for parameter learning and inference in dynamic Bayesian networks (DBNs). It supports a wide range of DBN architectures and probability distributions, including distributions from directional statistics (the statistics of angles, directions and orientations). Results: The program package is freely available under the GNU General Public Licence (GPL) from SourceForge http://sourceforge.net/projects/mocapy. The package contains the source for building the Mocapy++ library, several usage examples and the user manual. Conclusions: Mocapy++ is especially suitable for constructing probabilistic models of biomolecular structure, due to its support for directional statistics. In particular, it supports the Kent distribution on the sphere and the bivariate von Mises distribution on the torus. These distributions have proven useful to formulate probabilistic models of protein and RNA structure in atomic detail.
Chakraborty, Shubhankar; Roy Chaudhuri, Partha; Das, Prasanta Kr.
2016-07-01
In this communication, a novel optical technique has been proposed for the reconstruction of the shape of a Taylor bubble using measurements from multiple arrays of optical sensors. The deviation of an optical beam passing through the bubble depends on the contour of bubble surface. A theoretical model of the deviation of a beam during the traverse of a Taylor bubble through it has been developed. Using this model and the time history of the deviation captured by the sensor array, the bubble shape has been reconstructed. The reconstruction has been performed using an inverse algorithm based on Bayesian inference technique and Markov chain Monte Carlo sampling algorithm. The reconstructed nose shape has been compared with the true shape, extracted through image processing of high speed images. Finally, an error analysis has been performed to pinpoint the sources of the errors.
Murakami, Yohei; Takada, Shoji
2013-01-01
When exact values of model parameters in systems biology are not available from experiments, they need to be inferred so that the resulting simulation reproduces the experimentally known phenomena. For this purpose, Bayesian statistics with Markov chain Monte Carlo (MCMC) is a useful method. Biological experiments are often performed with cell populations, and the results are represented by histograms. On another front, experiments sometimes indicate the existence of a specific bifurcation patt...
What is the `relevant population' in Bayesian forensic inference?
Brümmer, Niko; de Villiers, Edward
2014-01-01
In works discussing the Bayesian paradigm for presenting forensic evidence in court, the concept of a `relevant population' is often mentioned without a clear definition of what is meant, and without recommendations of how to select such populations. This note tries to clarify this concept. Our analysis is intended to be general enough to be applicable to different forensic technologies, and we consider both DNA profiling and speaker recognition as examples.
Improving PWR core simulations by Monte Carlo uncertainty analysis and Bayesian inference
Castro, Emilio; Buss, Oliver; Garcia-Herranz, Nuria; Hoefer, Axel; Porsch, Dieter
2016-01-01
A Monte Carlo-based Bayesian inference model is applied to the prediction of reactor operation parameters of a PWR nuclear power plant. In this non-perturbative framework, high-dimensional covariance information describing the uncertainty of microscopic nuclear data is combined with measured reactor operation data in order to provide statistically sound, well founded uncertainty estimates of integral parameters, such as the boron letdown curve and the burnup-dependent reactor power distribution. The performance of this methodology is assessed in a blind test approach, where we use measurements of a given reactor cycle to improve the prediction of the subsequent cycle. As it turns out, the resulting improvement of the prediction quality is impressive. In particular, the prediction uncertainty of the boron letdown curve, which is of utmost importance for the planning of the reactor cycle length, can be reduced by one order of magnitude by including the boron concentration measurement information of the previous...
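The mechanism by which a measurement tightens a prediction can be illustrated by a one-dimensional conjugate Gaussian update. This is a toy analogue only: the paper works with high-dimensional covariances and a Monte Carlo framework, and the numbers below are hypothetical, not reactor data.

```python
# One-dimensional conjugate Gaussian update, a toy analogue of the idea above:
# combining an uncertain model prediction with a precise measurement shrinks
# the posterior uncertainty. Numbers are hypothetical, not reactor data.
prior_mean, prior_var = 1200.0, 50.0 ** 2   # predicted value and its variance
meas, meas_var = 1150.0, 10.0 ** 2          # measurement and its variance

# Precision-weighted combination (posterior of a normal mean, known variances).
post_var = 1.0 / (1.0 / prior_var + 1.0 / meas_var)
post_mean = post_var * (prior_mean / prior_var + meas / meas_var)

print(round(post_mean, 1), round(post_var ** 0.5, 1))   # → 1151.9 9.8
```

The posterior standard deviation (about 9.8) falls well below the prior's 50, the same qualitative effect as the uncertainty reduction reported for the boron letdown curve, here with entirely made-up numbers.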
左哲
2015-01-01
In order to research the evolutionary laws of unconfined vapor cloud explosions (UVCE) induced by combustible gas leaks in long-distance oil and gas pipelines, Bayesian networks for buried-pipeline corrosion leak fires were built by analyzing event nodes in four stages: inner and outer pipe-wall corrosion failure, combustible gas leak, gas cloud diffusion, and UVCE. The state ranges and discretization methods of the node variables were studied. The prior probabilities and conditional probability distributions of the node variables were set based on accident statistics and expert judgement. A Bayesian network inference strategy was developed, the sensitivity of each network node variable on the inference results was analyzed by studying the evolution mechanism of corrosion leak fires, and the rationality of the model was verified. The results show that there is considerable uncertainty in the process of pipeline corrosion leaks and secondary disasters, mainly reflected in the fact that intermediate events each have multiple possible states, and the probabilities of accident evolution paths are strongly affected by the model input conditions. The Bayesian network method has considerable advantages in describing the dependencies between intermediate node events in the accident process, and can quantitatively measure the uncertainty of accident risk.
Constraining East Antarctic mass trends using a Bayesian inference approach
Martin-Español, Alba; Bamber, Jonathan L.
2016-04-01
East Antarctica is an order of magnitude larger than its western neighbour and the Greenland ice sheet. It has the greatest potential to contribute to sea level rise of any source, including non-glacial contributors. It is, however, the most challenging ice mass to constrain because of a range of factors, including the relative paucity of in-situ observations and the poor signal-to-noise ratio of Earth Observation data such as satellite altimetry and gravimetry. A recent study using satellite radar and laser altimetry (Zwally et al. 2015) concluded that the East Antarctic Ice Sheet (EAIS) had been accumulating mass at a rate of 136±28 Gt/yr for the period 2003-08. Here, we use a Bayesian hierarchical model, which has been tested on, and applied to, the whole of Antarctica, to investigate the impact of different assumptions regarding the origin of elevation changes of the EAIS. We combined GRACE, satellite laser and radar altimeter data and GPS measurements to solve simultaneously for surface processes (primarily surface mass balance, SMB), ice dynamics and glacio-isostatic adjustment over the period 2003-13. The hierarchical model partitions mass trends between SMB and ice dynamics based on physical principles and measures of statistical likelihood. Without imposing the division between these processes, the model apportions about a third of the mass trend to ice dynamics, +18 Gt/yr, and two thirds, +39 Gt/yr, to SMB. The total mass trend for that period for the EAIS was 57±20 Gt/yr. Over the period 2003-08, we obtain an ice dynamic trend of 12 Gt/yr and a SMB trend of 15 Gt/yr, with a total mass trend of 27 Gt/yr. We then imposed the condition that the surface mass balance is tightly constrained by the regional climate model RACMO2.3 and allowed height changes due to ice dynamics to occur in areas of low surface velocities (<10 m/yr), such as those in the interior of East Antarctica (a similar condition as used in Zwally 2015). The model must find a solution that
Cornuet, Jean-Marie; Santos, Filipe; Beaumont, Mark A; Robert, Christian P.; Marin, Jean-Michel; Balding, David J.; Guillemaud, Thomas; Estoup, Arnaud
2008-01-01
Summary: Genetic data obtained on population samples convey information about their evolutionary history. Inference methods can extract part of this information but they require sophisticated statistical techniques that have been made available to the biologist community (through computer programs) only for simple and standard situations typically involving a small number of samples. We propose here a computer program (DIY ABC) for inference based on approximate Bayesian computation (ABC), in...
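The ABC recipe that DIY ABC automates — simulate from the prior, compare summary statistics, keep close matches — can be shown in miniature. This is a hedged sketch only: the real program targets rich population-genetic models, whereas here the "model" is just a Gaussian with unknown mean.

```python
import random

random.seed(7)

# Rejection ABC in miniature (illustrative only; DIY ABC targets rich
# population-genetics models). The model is a Gaussian with unknown mean
# theta, the summary statistic is the sample mean, and prior draws are kept
# when simulated and observed summaries are close.
obs = [random.gauss(2.0, 1.0) for _ in range(100)]
s_obs = sum(obs) / len(obs)              # observed summary statistic
EPS = 0.05                               # acceptance tolerance

accepted = []
while len(accepted) < 200:
    theta = random.uniform(-5, 5)        # draw from the prior
    sim = [random.gauss(theta, 1.0) for _ in range(100)]
    if abs(sum(sim) / len(sim) - s_obs) < EPS:
        accepted.append(theta)           # summaries match: keep this theta

print(round(sum(accepted) / len(accepted), 1))   # posterior mean, near 2.0
```

The accepted draws approximate the posterior without ever evaluating a likelihood, which is exactly why ABC suits models where simulation is easy but the likelihood is intractable.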
Analysis of Gumbel Model for Software Reliability Using Bayesian Paradigm
Raj Kumar
2012-12-01
In this paper, we have illustrated the suitability of the Gumbel model for software reliability data. The model parameters are estimated using likelihood-based inferential procedures: classical as well as Bayesian. The quasi-Newton-Raphson algorithm is applied to obtain the maximum likelihood estimates and associated probability intervals. The Bayesian estimates of the parameters of the Gumbel model are obtained using the Markov chain Monte Carlo (MCMC) simulation method in OpenBUGS (established software for Bayesian analysis using Markov chain Monte Carlo methods). R functions are developed to study the statistical properties, model validation and comparison tools of the model, and the output analysis of MCMC samples generated from OpenBUGS. Details of applying MCMC to parameter estimation for the Gumbel model are elaborated, and a real software reliability data set is considered to illustrate the methods of inference discussed in this paper.
Bayesian variable order Markov models: Towards Bayesian predictive state representations
C. Dimitrakakis
2009-01-01
We present a Bayesian variable order Markov model that shares many similarities with predictive state representations. The resulting models are compact and much easier to specify and learn than classical predictive state representations. Moreover, we show that they significantly outperform a more st
Picchini, Umberto; Forman, Julie Lyng
2016-01-01
applications. A simulation study is conducted to compare our strategy with exact Bayesian inference, the latter being two orders of magnitude slower than ABC-MCMC for the considered set-up. Finally, the ABC algorithm is applied to a large protein data set. The suggested methodology is fairly general and...
Trans-dimensional Bayesian inference for large sequential data sets
Mandolesi, E.; Dettmer, J.; Dosso, S. E.; Holland, C. W.
2015-12-01
This work develops a sequential Monte Carlo method to infer seismic parameters of layered seabeds from large sequential reflection-coefficient data sets. The approach provides parameter estimates and uncertainties along survey tracks with the goal to aid in the detection of unexploded ordnance in shallow water. The sequential data are acquired by a moving platform with source and receiver array towed close to the seabed. This geometry requires consideration of spherical reflection coefficients, computed efficiently by massively parallel implementation of the Sommerfeld integral via Levin integration on a graphics processing unit. The seabed is parametrized with a trans-dimensional model to account for changes in the environment (i.e. changes in layering) along the track. The method combines advanced Markov chain Monte Carlo methods (annealing) with particle filtering (resampling). Since data from closely-spaced source transmissions (pings) often sample similar environments, the solution from one ping can be utilized to efficiently estimate the posterior for data from subsequent pings. Since reflection-coefficient data are highly informative, the likelihood function can be extremely peaked, resulting in little overlap between posteriors of adjacent pings. This is addressed by adding bridging distributions (via annealed importance sampling) between pings for more efficient transitions. The approach assumes the environment to be changing slowly enough to justify the local 1D parametrization. However, bridging allows rapid changes between pings to be addressed and we demonstrate the method to be stable in such situations. Results are in terms of trans-D parameter estimates and uncertainties along the track. The algorithm is examined for realistic simulated data along a track and applied to a dataset collected by an autonomous underwater vehicle on the Malta Plateau, Mediterranean Sea. [Work supported by the SERDP, DoD.]
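The particle-filtering ingredient — propagate a particle cloud, weight by the likelihood of the new datum, resample — can be shown on a toy state-space model. This is a minimal bootstrap filter, a hedged sketch far simpler than the paper's trans-dimensional machinery; the random-walk state and the noise levels are invented.

```python
import math
import random

random.seed(5)

# Minimal bootstrap particle filter (a sketch only; the paper combines
# trans-dimensional MCMC with particle filtering, which is far richer).
# Hidden state: random walk with process noise Q; observations add noise R.
Q, R, NP = 0.1, 0.5, 1000
x, xs, ys = 0.0, [], []
for _ in range(50):
    x += random.gauss(0, Q)
    xs.append(x)
    ys.append(x + random.gauss(0, R))

particles = [0.0] * NP
est = []
for y in ys:
    particles = [p + random.gauss(0, Q) for p in particles]        # propagate
    weights = [math.exp(-0.5 * ((y - p) / R) ** 2) for p in particles]
    particles = random.choices(particles, weights=weights, k=NP)   # resample
    est.append(sum(particles) / NP)                                # estimate

err = sum(abs(a - b) for a, b in zip(est, xs)) / len(xs)
print(round(err, 2))   # mean absolute tracking error, well below R
```

Sequential seabed inference follows the same propagate-weight-resample cycle, with each ping's posterior seeding the next and annealed bridging distributions easing the transitions between highly peaked likelihoods.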
Matrix-free Large Scale Bayesian inference in cosmology
Jasche, Jens
2014-01-01
In this work we propose a new matrix-free implementation of the Wiener sampler, which is traditionally applied to high dimensional analyses when signal covariances are unknown. Specifically, the proposed method addresses the problem of jointly inferring a high dimensional signal and its corresponding covariance matrix from a set of observations. Our method implements a Gibbs sampling adaptation of the previously presented messenger approach, permitting the complex multivariate inference problem to be cast as a sequence of univariate random processes. In this fashion, the traditional requirement of inverting high dimensional matrices is completely eliminated from the inference process, resulting in an efficient algorithm that is trivial to implement. Using cosmic large scale structure data as a showcase, we demonstrate the capabilities of our Gibbs sampling approach by performing a joint analysis of three dimensional density fields and corresponding power-spectra from Gaussian mock catalogues. These tests clear...
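The central trick — replacing one joint multivariate draw with a sequence of univariate conditional draws, so that no joint covariance matrix ever needs inverting — can be illustrated with a generic toy Gibbs sampler. This is an illustrative sketch of Gibbs sampling in general, not the messenger implementation described above:

```python
import numpy as np

rng = np.random.default_rng(3)

def gibbs_bivariate_normal(rho, n_iter=20000):
    """Gibbs sampling of a zero-mean bivariate normal with correlation rho.

    The joint draw is replaced by alternating univariate conditional
    draws, so the joint covariance matrix is never inverted.
    """
    x = y = 0.0
    out = np.empty((n_iter, 2))
    s = np.sqrt(1 - rho**2)                 # conditional standard deviation
    for i in range(n_iter):
        x = rng.normal(rho * y, s)          # x | y ~ N(rho*y, 1 - rho^2)
        y = rng.normal(rho * x, s)          # y | x ~ N(rho*x, 1 - rho^2)
        out[i] = x, y
    return out

samples = gibbs_bivariate_normal(rho=0.8)
```

After discarding burn-in, the empirical correlation of the chain recovers the target correlation; the messenger approach applies the same decomposition idea at cosmological scale.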
Combinatorial Inference for Graphical Models
Neykov, Matey; Lu, Junwei; Liu, Han
2016-01-01
We propose a new family of combinatorial inference problems for graphical models. Unlike classical statistical inference where the main interest is point estimation or parameter testing, combinatorial inference aims at testing the global structure of the underlying graph. Examples include testing the graph connectivity, the presence of a cycle of certain size, or the maximum degree of the graph. To begin with, we develop a unified theory for the fundamental limits of a large family of combina...
Solving #SAT and Bayesian Inference with Backtracking Search
Bacchus, Fahiem; Dalmao, Shannon; Pitassi, Toniann
2014-01-01
Inference in Bayes Nets (BAYES) is an important problem with numerous applications in probabilistic reasoning. Counting the number of satisfying assignments of a propositional formula (#SAT) is a closely related problem of fundamental theoretical importance. Both these problems, and others, are members of the class of sum-of-products (SUMPROD) problems. In this paper we show that standard backtracking search when augmented with a simple memoization scheme (caching) can solve any sum-of-produc...
Emery, A. F.; Valenti, E.; Bardot, D.
2007-01-01
Parameter estimation is generally based upon the maximum likelihood approach and often involves regularization. Typically it is desired that the results be unbiased and of minimum variance. However, it is often better to accept biased estimates that have minimum mean square error. Bayesian inference is an attractive approach that achieves this goal and incorporates regularization automatically. More importantly, it permits us to analyse experiments in which both the system response and the independent variables (time, sensor position, experimental conditions, etc) are corrupted by noise and in which the model includes nuisance variables. This paper describes the use of Bayesian inference for an apparently simple experiment which is, in fact, fundamentally difficult and is compounded by a nuisance variable. By presenting this analysis we hope that members of the inverse community will see the value of applying Bayesian inference.
Performance and prediction: Bayesian modelling of fallible choice in chess
Haworth, Guy McCrossan; Regan, Ken; Di Fatta, Giuseppe
2010-01-01
Evaluating agents in decision-making applications requires assessing their skill and predicting their behaviour. Both are well developed in Poker-like situations, but less so in more complex game and model domains. This paper addresses both tasks by using Bayesian inference in a benchmark space of reference agents. The concepts are explained and demonstrated using the game of chess but the model applies generically to any domain with quantifiable options and fallible choice. Demonstration ...
Gelman, Andrew; Stern, Hal S.; Dunson, David B.; Vehtari, Aki; Rubin, Donald B.
2013-01-01
FUNDAMENTALS OF BAYESIAN INFERENCE: Probability and Inference; Single-Parameter Models; Introduction to Multiparameter Models; Asymptotics and Connections to Non-Bayesian Approaches; Hierarchical Models. FUNDAMENTALS OF BAYESIAN DATA ANALYSIS: Model Checking; Evaluating, Comparing, and Expanding Models; Modeling Accounting for Data Collection; Decision Analysis. ADVANCED COMPUTATION: Introduction to Bayesian Computation; Basics of Markov Chain Simulation; Computationally Efficient Markov Chain Simulation; Modal and Distributional Approximations. REGRESSION MODELS: Introduction to Regression Models; Hierarchical Linear
Merging Digital Surface Models Implementing Bayesian Approaches
Sadeq, H.; Drummond, J.; Li, Z.
2016-06-01
In this research, DSMs from different sources have been merged. The merging is based on a probabilistic model using a Bayesian approach. The input data have been sourced from very high resolution satellite imagery sensors (e.g. WorldView-1 and Pleiades). A Bayesian approach is deemed preferable when the data obtained from the sensors are limited and acquiring many measurements is difficult or very costly; the lack of data can then be mitigated by introducing a priori estimates. To infer the prior data, the roofs of the buildings are assumed to be smooth, and for that purpose local entropy has been implemented. In addition to the a priori estimates, GNSS RTK measurements have been collected in the field and are used as check points to assess the quality of the DSMs and to validate the merging result. The model has been applied in the West End of Glasgow, which contains different kinds of buildings, such as flat-roofed and hipped-roofed buildings. Both quantitative and qualitative methods have been employed to validate the merged DSM. The validation results have shown that the model was able to improve the quality of the DSMs and some of their characteristics, such as the roof surfaces, which consequently led to better representations. In addition, the developed model has been compared with the well-established maximum likelihood model and showed similar quantitative statistical results and better qualitative results. Although the proposed model has been applied to DSMs derived from satellite imagery, it can be applied to DSMs from any other source.
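The per-cell update behind such a Bayesian merge can be sketched as a Gaussian precision-weighted fusion of sensor heights with a prior height estimate. The function and variable names below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def merge_dsm_cells(heights, variances, prior_mean, prior_var):
    """Per-cell Bayesian fusion of DSM height estimates.

    heights, variances: per-sensor height estimates and their error
    variances for the same cell(s), first axis indexing sensors.
    prior_mean, prior_var: a priori height (e.g. from a smooth-roof
    assumption) and its variance.

    Under independent Gaussian errors the posterior is Gaussian, with a
    precision-weighted mean and the summed precisions as posterior precision.
    """
    precisions = 1.0 / np.asarray(variances, dtype=float)
    prior_prec = 1.0 / prior_var
    post_prec = prior_prec + precisions.sum(axis=0)
    post_mean = (prior_mean * prior_prec +
                 (np.asarray(heights, dtype=float) * precisions).sum(axis=0)) / post_prec
    return post_mean, 1.0 / post_prec

# Two sensors observe the same cell; the prior pulls the fused height
# toward the smooth-roof expectation.
mean, var = merge_dsm_cells(heights=[10.2, 9.8], variances=[0.25, 0.25],
                            prior_mean=10.0, prior_var=1.0)   # → fused mean 10.0 m
```

A tighter prior (smaller `prior_var`) enforces the smoothness assumption more strongly, while accurate sensors (small variances) dominate the fused estimate.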
A Bayesian Approach to Inferring Rates of Selfing and Locus-Specific Mutation.
Redelings, Benjamin D; Kumagai, Seiji; Tatarenkov, Andrey; Wang, Liuyang; Sakai, Ann K; Weller, Stephen G; Culley, Theresa M; Avise, John C; Uyenoyama, Marcy K
2015-11-01
We present a Bayesian method for characterizing the mating system of populations reproducing through a mixture of self-fertilization and random outcrossing. Our method uses patterns of genetic variation across the genome as a basis for inference about reproduction under pure hermaphroditism, gynodioecy, and a model developed to describe the self-fertilizing killifish Kryptolebias marmoratus. We extend the standard coalescence model to accommodate these mating systems, accounting explicitly for multilocus identity disequilibrium, inbreeding depression, and variation in fertility among mating types. We incorporate the Ewens sampling formula (ESF) under the infinite-alleles model of mutation to obtain a novel expression for the likelihood of mating system parameters. Our Markov chain Monte Carlo (MCMC) algorithm assigns locus-specific mutation rates, drawn from a common mutation rate distribution that is itself estimated from the data using a Dirichlet process prior model. Our sampler is designed to accommodate additional information, including observations pertaining to the sex ratio, the intensity of inbreeding depression, and other aspects of reproduction. It can provide joint posterior distributions for the population-wide proportion of uniparental individuals, locus-specific mutation rates, and the number of generations since the most recent outcrossing event for each sampled individual. Further, estimation of all basic parameters of a given model permits estimation of functions of those parameters, including the proportion of the gene pool contributed by each sex and relative effective numbers. PMID:26374460
This study introduces a non-intrusive approach in the context of low-rank separated representation to construct a surrogate of high-dimensional stochastic functions, e.g., PDEs/ODEs, in order to decrease the computational cost of Markov Chain Monte Carlo simulations in Bayesian inference. The surrogate model is constructed via a regularized alternative least-square regression with Tikhonov regularization using a roughening matrix computing the gradient of the solution, in conjunction with a perturbation-based error indicator to detect optimal model complexities. The model approximates a vector of a continuous solution at discrete values of a physical variable. The required number of random realizations to achieve a successful approximation linearly depends on the function dimensionality. The computational cost of the model construction is quadratic in the number of random inputs, which potentially tackles the curse of dimensionality in high-dimensional stochastic functions. Furthermore, this vector-valued separated representation-based model, in comparison to the available scalar-valued case, leads to a significant reduction in the cost of approximation by an order of magnitude equal to the vector size. The performance of the method is studied through its application to three numerical examples including a 41-dimensional elliptic PDE and a 21-dimensional cavity flow
A Bayesian approach to model uncertainty
A Bayesian approach to model uncertainty is taken. For the case of a finite number of alternative models, the model uncertainty is equivalent to parameter uncertainty. A derivation based on Savage's partition problem is given
Differential gene co-expression networks via Bayesian biclustering models
Gao, Chuan; Zhao, Shiwen; McDowell, Ian C.; Brown, Christopher D.; Engelhardt, Barbara E.
2014-01-01
Identifying latent structure in large data matrices is essential for exploring biological processes. Here, we consider recovering gene co-expression networks from gene expression data, where each network encodes relationships between genes that are locally co-regulated by shared biological mechanisms. To do this, we develop a Bayesian statistical model for biclustering to infer subsets of co-regulated genes whose covariation may be observed in only a subset of the samples. Our biclustering me...
Bayesian Models of Learning and Reasoning with Relations
Chen, Dawn
2014-01-01
How do humans acquire relational concepts such as larger, which are essential for analogical inference and other forms of high-level reasoning? Are they necessarily innate, or can they be learned from non-relational inputs? Using comparative relations as a model domain, we show that structured relations can be learned from unstructured inputs of realistic complexity, applying bottom-up Bayesian learning mechanisms that make minimal assumptions about innate representations. First, we introduce...
Computational methods for Bayesian model choice
Robert, Christian P.; Wraith, Darren
2009-01-01
In this note, we shortly survey some recent approaches on the approximation of the Bayes factor used in Bayesian hypothesis testing and in Bayesian model choice. In particular, we reassess importance sampling, harmonic mean sampling, and nested sampling from a unified perspective.
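The simplest estimator in this family — importance sampling with the prior as the proposal — can be sketched on a conjugate toy model where the exact marginal likelihood is known, so the approximation can be checked (the toy model and all names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy conjugate model so the estimator can be verified: x ~ N(theta, 1)
# with prior theta ~ N(0, 1), hence marginally x ~ N(0, 2).
x = 1.5
n_samples = 200_000

theta = rng.normal(0.0, 1.0, size=n_samples)            # draws from the prior
lik = np.exp(-0.5 * (x - theta) ** 2) / np.sqrt(2 * np.pi)

m_hat = lik.mean()                                      # importance-sampling estimate of m(x)
m_exact = np.exp(-x**2 / 4.0) / np.sqrt(4 * np.pi)      # exact N(x; 0, 2) density

# A Bayes factor between two models is the ratio of two such
# marginal-likelihood estimates.
```

The harmonic mean estimator the note reassesses replaces prior draws with posterior draws but can have infinite variance, which is one reason a unified comparison of these estimators is valuable.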
Sankararaman, Shankar
2016-01-01
This paper presents a computational framework for uncertainty characterization and propagation, and sensitivity analysis in the presence of aleatory and epistemic uncertainty, and develops a rigorous methodology for efficient refinement of epistemic uncertainty by identifying important epistemic variables that significantly affect the overall performance of an engineering system. The proposed methodology is illustrated using the NASA Langley Uncertainty Quantification Challenge (NASA-LUQC) problem, which deals with uncertainty analysis of a generic transport model (GTM). First, Bayesian inference is used to infer subsystem-level epistemic quantities using the subsystem-level model and corresponding data. Second, tools of variance-based global sensitivity analysis are used to identify four important epistemic variables (this limitation specified in the NASA-LUQC is reflective of practical engineering situations where not all epistemic variables can be refined due to time/budget constraints) that significantly affect system-level performance. The most significant contribution of this paper is the development of the sequential refinement methodology, where epistemic variables for refinement are not identified all at once. Instead, only one variable is first identified, and then Bayesian inference and global sensitivity calculations are repeated to identify the next important variable. This procedure is continued until all four variables are identified and the refinement in the system-level performance is computed. The advantages of the proposed sequential refinement methodology over the all-at-once uncertainty refinement approach are explained and then applied to the NASA Langley Uncertainty Quantification Challenge problem.
Knuth, K. H.
2001-05-01
We consider the application of Bayesian inference to the study of self-organized structures in complex adaptive systems. In particular, we examine the distribution of elements, agents, or processes in systems dominated by hierarchical structure. We demonstrate that results obtained by Caianiello [1] on Hierarchical Modular Systems (HMS) can be found by applying Jaynes' Principle of Group Invariance [2] to a few key assumptions about our knowledge of hierarchical organization. Subsequent application of the Principle of Maximum Entropy allows inferences to be made about specific systems. The utility of the Bayesian method is considered by examining both successes and failures of the hierarchical model. We discuss how Caianiello's original statements suffer from the Mind Projection Fallacy [3] and we restate his assumptions thus widening the applicability of the HMS model. The relationship between inference and statistical physics, described by Jaynes [4], is reiterated with the expectation that this realization will aid the field of complex systems research by moving away from often inappropriate direct application of statistical mechanics to a more encompassing inferential methodology.
Hu, Zixi; Yao, Zhewei; Li, Jinglai
2015-01-01
Many scientific and engineering problems require performing Bayesian inference for unknowns of infinite dimension. In such problems, many standard Markov chain Monte Carlo (MCMC) algorithms become arbitrarily slow under mesh refinement, which is referred to as being dimension dependent. To address this, a family of dimension-independent MCMC algorithms, known as the preconditioned Crank-Nicolson (pCN) methods, was proposed to sample the infinite dimensional parameters. In this work we devel...
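A finite-dimensional sketch of the pCN method, assuming a standard normal prior (the sampler and the toy likelihood below are illustrative, not the authors' development):

```python
import numpy as np

rng = np.random.default_rng(1)

def pcn_sampler(neg_log_lik, dim, beta=0.2, n_iter=20000):
    """Preconditioned Crank-Nicolson MCMC under a N(0, I) prior.

    The pCN proposal u' = sqrt(1 - beta^2) u + beta xi, xi ~ N(0, I),
    is reversible with respect to the prior, so the acceptance ratio
    involves only the (negative log-)likelihood; the acceptance rate
    does not degenerate as `dim` grows (mesh refinement).
    """
    u = rng.standard_normal(dim)
    phi = neg_log_lik(u)
    samples = np.empty((n_iter, dim))
    for i in range(n_iter):
        prop = np.sqrt(1 - beta**2) * u + beta * rng.standard_normal(dim)
        phi_prop = neg_log_lik(prop)
        if np.log(rng.uniform()) < phi - phi_prop:   # accept on likelihood only
            u, phi = prop, phi_prop
        samples[i] = u
    return samples

# Toy target: observe y = u + unit Gaussian noise; the posterior is
# N(y/2, I/2), so the chain mean should approach 0.5 per coordinate.
y = np.ones(50)
chain = pcn_sampler(lambda u: 0.5 * np.sum((y - u) ** 2), dim=50)
```

Because the prior cancels in the acceptance ratio, the same `beta` keeps a healthy acceptance rate as `dim` increases, which is exactly the dimension-independence property the abstract refers to.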
Bayesian inferences of the thermal properties of a wall using temperature and heat flux measurements
Iglesias, Marco; Sawlan, Zaid; Scavino, Marco; Tempone, Raul; Wood, Christopher
2016-01-01
We develop a hierarchical Bayesian inference method to estimate the thermal resistance and the volumetric heat capacity of a wall. These thermal properties are essential for accurate building energy simulations that are needed to make effective energy-saving policies. We apply our methodology to an experimental case study conducted in an environmental chamber, where measurements are recorded every minute from temperature probes and heat flux sensors placed on both sides of a solid brick wall ...
Adaptability and phenotypic stability of common bean genotypes through Bayesian inference.
Corrêa, A M; Teodoro, P E; Gonçalves, M C; Barroso, L M A; Nascimento, M; Santos, A; Torres, F E
2016-01-01
This study used Bayesian inference to investigate the genotype x environment interaction in common bean grown in Mato Grosso do Sul State, and it also evaluated the efficiency of using informative and minimally informative a priori distributions. Six trials were conducted in randomized blocks, and the grain yield of 13 common bean genotypes was assessed. To represent the minimally informative a priori distributions, a probability distribution with high variance was used, and a meta-analysis concept was adopted to represent the informative a priori distributions. Bayes factors were used to conduct comparisons between the a priori distributions. The Bayesian inference was effective for the selection of upright common bean genotypes with high adaptability and phenotypic stability using the Eberhart and Russell method. Bayes factors indicated that the use of informative a priori distributions provided more accurate results than minimally informative a priori distributions. According to Bayesian inference, the EMGOPA-201, BAMBUÍ, CNF 4999, CNF 4129 A 54, and CNFv 8025 genotypes had specific adaptability to favorable environments, while the IAPAR 14 and IAC CARIOCA ETE genotypes had specific adaptability to unfavorable environments. PMID:27173270
Kang, Seong Keun; Seong, Poong Hyun [KAIST, Daejeon (Korea, Republic of)
2014-08-15
Bayesian methodology has been widely used in various research fields. It is a method of inference that uses Bayes' rule to update the estimated probability of a hypothesis as additional evidence is acquired. According to current research, malfunction of a nuclear power plant can be detected by using Bayesian inference, which consistently accumulates newly incoming data and updates its estimation. However, those studies are based on the assumption that people reason as perfectly as a computer, which can be criticized and may cause problems in real-world application. Studies in cognitive psychology indicate that when the amount of information becomes large, people cannot retain all of the data, because human memory capacity is limited (well known as working memory) and attention is limited as well. The purpose of this paper is to consider these psychological factors and examine how much working memory and attention affect the resulting estimation based on Bayesian inference. To confirm this, an experiment with human subjects is needed, and the tool of the experiment is the Compact Nuclear Simulator (CNS)
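The accumulation of incoming evidence described above is plain sequential Bayesian updating; the sketch below uses hypothetical sensor-mean hypotheses standing in for normal vs. malfunctioning plant states (all names and numbers are illustrative):

```python
import numpy as np

def bayes_update(log_prior, log_lik_per_hyp):
    """One step of Bayes' rule in log space: posterior ∝ prior × likelihood."""
    log_post = log_prior + log_lik_per_hyp
    log_post -= np.logaddexp.reduce(log_post)    # normalize over hypotheses
    return log_post

rng = np.random.default_rng(2)

# Two hypotheses about a sensor reading: H0 normal (mu = 0) vs.
# H1 malfunction (mu = 1), unit observation noise, flat prior.
mus = np.array([0.0, 1.0])
log_p = np.log(np.array([0.5, 0.5]))

data = rng.normal(1.0, 1.0, size=40)             # simulated truth: malfunction
for obs in data:
    log_lik = -0.5 * (obs - mus) ** 2            # Gaussian log-likelihood (unit variance)
    log_p = bayes_update(log_p, log_lik)

p_malfunction = np.exp(log_p[1])
```

A limited-working-memory observer, by contrast, would effectively update on only the most recent few observations, which is the degradation the study sets out to measure.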
A novel multimode process monitoring method integrating LDRSKM with Bayesian inference
Ren, Shi-jin; Liang, Yin; Zhao, Xiang-jun; Yang, Mao-yun
2015-01-01
A local discriminant regularized soft k-means (LDRSKM) method with Bayesian inference is proposed for multimode process monitoring. LDRSKM extends the regularized soft k-means algorithm by exploiting the local and non-local geometric information of the data and generalized linear discriminant analysis to provide a better and more meaningful data partition. LDRSKM can perform clustering and subspace selection simultaneously, enhancing the separability of data residing in different clusters. With the data partition obtained, kernel support vector data description (KSVDD) is used to establish the monitoring statistics and control limits. Two Bayesian inference based global fault detection indicators are then developed using the local monitoring results associated with principal and residual subspaces. Based on clustering analysis, Bayesian inference and manifold learning methods, the within- and cross-mode correlations and local geometric information can be exploited to enhance monitoring performance for nonlinear and non-Gaussian processes. The effectiveness and efficiency of the proposed method are evaluated using the Tennessee Eastman benchmark process.
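One common Bayesian-fusion construction for such global indicators — following the general multimode-monitoring literature rather than this paper's exact definition, with illustrative names throughout — weights per-mode fault probabilities by the posterior mode probabilities:

```python
import numpy as np

def global_fault_probability(stats, limits, mode_post, alpha=0.01):
    """Bayesian fusion of local monitoring statistics across operating modes.

    stats[k], limits[k]: monitoring statistic and its control limit in mode k.
    mode_post[k]: posterior probability that the sample belongs to mode k.
    alpha: significance level, reused as the prior fault probability.

    In each mode, heuristic likelihoods P(x|normal) = exp(-stat/limit) and
    P(x|fault) = exp(-limit/stat) are combined with priors (1 - alpha, alpha)
    via Bayes' rule, and the per-mode fault probabilities are averaged
    under the mode posteriors.
    """
    stats, limits, mode_post = map(np.asarray, (stats, limits, mode_post))
    p_x_normal = np.exp(-stats / limits)
    p_x_fault = np.exp(-limits / stats)
    p_fault = alpha * p_x_fault / (alpha * p_x_fault + (1 - alpha) * p_x_normal)
    return float(np.dot(mode_post, p_fault))

# A statistic far above its control limit in the sample's most probable
# mode drives the global indicator toward 1.
p = global_fault_probability(stats=[10.0, 0.5], limits=[1.0, 1.0],
                             mode_post=[0.9, 0.1], alpha=0.01)
```

A normal sample (statistics below their limits in every mode) instead yields a global probability near the prior level `alpha`, so a single threshold on the fused indicator covers all modes.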