Inferring on the Intentions of Others by Hierarchical Bayesian Learning
Diaconescu, Andreea O.; Mathys, Christoph; Weber, Lilian A. E.; Daunizeau, Jean; Kasper, Lars; Lomakina, Ekaterina I.; Fehr, Ernst; Stephan, Klaas E.
2014-01-01
Inferring on others' (potentially time-varying) intentions is a fundamental problem during many social transactions. To investigate the underlying mechanisms, we applied computational modeling to behavioral data from an economic game in which 16 pairs of volunteers (randomly assigned to “player” or “adviser” roles) interacted. The player performed a probabilistic reinforcement learning task, receiving information about a binary lottery from a visual pie chart. The adviser, who received more predictive information, issued an additional recommendation. Critically, the game was structured such that the adviser's incentives to provide helpful or misleading information varied in time. Using a meta-Bayesian modeling framework, we found that the players' behavior was best explained by the deployment of hierarchical learning: they inferred upon the volatility of the advisers' intentions in order to optimize their predictions about the validity of their advice. Beyond learning, volatility estimates also affected the trial-by-trial variability of decisions: participants were more likely to rely on their estimates of advice accuracy for making choices when they believed that the adviser's intentions were presently stable. Finally, our model of the players' inference predicted the players' interpersonal reactivity index (IRI) scores, explicit ratings of the advisers' helpfulness and the advisers' self-reports on their chosen strategy. Overall, our results suggest that humans (i) employ hierarchical generative models to infer on the changing intentions of others, (ii) use volatility estimates to inform decision-making in social interactions, and (iii) integrate estimates of advice accuracy with non-social sources of information. The Bayesian framework presented here can quantify individual differences in these mechanisms from simple behavioral readouts and may prove useful in future clinical studies of maladaptive social cognition. PMID:25187943
UNSUPERVISED TRANSIENT LIGHT CURVE ANALYSIS VIA HIERARCHICAL BAYESIAN INFERENCE
Energy Technology Data Exchange (ETDEWEB)
Sanders, N. E.; Soderberg, A. M. [Harvard-Smithsonian Center for Astrophysics, 60 Garden Street, Cambridge, MA 02138 (United States); Betancourt, M., E-mail: nsanders@cfa.harvard.edu [Department of Statistics, University of Warwick, Coventry CV4 7AL (United Kingdom)
2015-02-10
Historically, light curve studies of supernovae (SNe) and other transient classes have focused on individual objects with copious and high signal-to-noise observations. In the nascent era of wide field transient searches, objects with detailed observations are decreasing as a fraction of the overall known SN population, and this strategy sacrifices the majority of the information contained in the data about the underlying population of transients. A population level modeling approach, simultaneously fitting all available observations of objects in a transient sub-class of interest, fully mines the data to infer the properties of the population and avoids certain systematic biases. We present a novel hierarchical Bayesian statistical model for population level modeling of transient light curves, and discuss its implementation using an efficient Hamiltonian Monte Carlo technique. As a test case, we apply this model to the Type IIP SN sample from the Pan-STARRS1 Medium Deep Survey, consisting of 18,837 photometric observations of 76 SNe, corresponding to a joint posterior distribution with 9176 parameters under our model. Our hierarchical model fits provide improved constraints on light curve parameters relevant to the physical properties of their progenitor stars relative to modeling individual light curves alone. Moreover, we directly evaluate the probability for occurrence rates of unseen light curve characteristics from the model hyperparameters, addressing observational biases in survey methodology. We view this modeling framework as an unsupervised machine learning technique with the ability to maximize scientific returns from data to be collected by future wide field transient searches like LSST.
UNSUPERVISED TRANSIENT LIGHT CURVE ANALYSIS VIA HIERARCHICAL BAYESIAN INFERENCE
International Nuclear Information System (INIS)
Sanders, N. E.; Soderberg, A. M.; Betancourt, M.
2015-01-01
Historically, light curve studies of supernovae (SNe) and other transient classes have focused on individual objects with copious and high signal-to-noise observations. In the nascent era of wide field transient searches, objects with detailed observations are decreasing as a fraction of the overall known SN population, and this strategy sacrifices the majority of the information contained in the data about the underlying population of transients. A population level modeling approach, simultaneously fitting all available observations of objects in a transient sub-class of interest, fully mines the data to infer the properties of the population and avoids certain systematic biases. We present a novel hierarchical Bayesian statistical model for population level modeling of transient light curves, and discuss its implementation using an efficient Hamiltonian Monte Carlo technique. As a test case, we apply this model to the Type IIP SN sample from the Pan-STARRS1 Medium Deep Survey, consisting of 18,837 photometric observations of 76 SNe, corresponding to a joint posterior distribution with 9176 parameters under our model. Our hierarchical model fits provide improved constraints on light curve parameters relevant to the physical properties of their progenitor stars relative to modeling individual light curves alone. Moreover, we directly evaluate the probability for occurrence rates of unseen light curve characteristics from the model hyperparameters, addressing observational biases in survey methodology. We view this modeling framework as an unsupervised machine learning technique with the ability to maximize scientific returns from data to be collected by future wide field transient searches like LSST
Boos, Moritz; Seer, Caroline; Lange, Florian; Kopp, Bruno
2016-01-01
Cognitive determinants of probabilistic inference were examined using hierarchical Bayesian modeling techniques. A classic urn-ball paradigm served as experimental strategy, involving a factorial two (prior probabilities) by two (likelihoods) design. Five computational models of cognitive processes were compared with the observed behavior. Parameter-free Bayesian posterior probabilities and parameter-free base rate neglect provided inadequate models of probabilistic inference. The introduction of distorted subjective probabilities yielded more robust and generalizable results. A general class of (inverted) S-shaped probability weighting functions had been proposed; however, the possibility of large differences in probability distortions not only across experimental conditions, but also across individuals, seems critical for the model's success. It also seems advantageous to consider individual differences in parameters of probability weighting as being sampled from weakly informative prior distributions of individual parameter values. Thus, the results from hierarchical Bayesian modeling converge with previous results in revealing that probability weighting parameters show considerable task dependency and individual differences. Methodologically, this work exemplifies the usefulness of hierarchical Bayesian modeling techniques for cognitive psychology. Theoretically, human probabilistic inference might be best described as the application of individualized strategic policies for Bayesian belief revision.
Directory of Open Access Journals (Sweden)
Moritz eBoos
2016-05-01
Full Text Available Cognitive determinants of probabilistic inference were examined using hierarchical Bayesian modelling techniques. A classic urn-ball paradigm served as experimental strategy, involving a factorial two (prior probabilities by two (likelihoods design. Five computational models of cognitive processes were compared with the observed behaviour. Parameter-free Bayesian posterior probabilities and parameter-free base rate neglect provided inadequate models of probabilistic inference. The introduction of distorted subjective probabilities yielded more robust and generalizable results. A general class of (inverted S-shaped probability weighting functions had been proposed; however, the possibility of large differences in probability distortions not only across experimental conditions, but also across individuals, seems critical for the model’s success. It also seems advantageous to consider individual differences in parameters of probability weighting as being sampled from weakly informative prior distributions of individual parameter values. Thus, the results from hierarchical Bayesian modelling converge with previous results in revealing that probability weighting parameters show considerable task dependency and individual differences. Methodologically, this work exemplifies the usefulness of hierarchical Bayesian modelling techniques for cognitive psychology. Theoretically, human probabilistic inference might be best described as the application of individualized strategic policies for Bayesian belief revision.
Directory of Open Access Journals (Sweden)
Ross H Johnstone
2017-03-01
Full Text Available Dose-response (or ‘concentration-effect’ relationships commonly occur in biological and pharmacological systems and are well characterised by Hill curves. These curves are described by an equation with two parameters: the inhibitory concentration 50% (IC50; and the Hill coefficient. Typically just the ‘best fit’ parameter values are reported in the literature. Here we introduce a Python-based software tool, PyHillFit , and describe the underlying Bayesian inference methods that it uses, to infer probability distributions for these parameters as well as the level of experimental observation noise. The tool also allows for hierarchical fitting, characterising the effect of inter-experiment variability. We demonstrate the use of the tool on a recently published dataset on multiple ion channel inhibition by multiple drug compounds. We compare the maximum likelihood, Bayesian and hierarchical Bayesian approaches. We then show how uncertainty in dose-response inputs can be characterised and propagated into a cardiac action potential simulation to give a probability distribution on model outputs.
Directory of Open Access Journals (Sweden)
Hea-Jung Kim
2017-06-01
Full Text Available This paper develops Bayesian inference in reliability of a class of scale mixtures of log-normal failure time (SMLNFT models with stochastic (or uncertain constraint in their reliability measures. The class is comprehensive and includes existing failure time (FT models (such as log-normal, log-Cauchy, and log-logistic FT models as well as new models that are robust in terms of heavy-tailed FT observations. Since classical frequency approaches to reliability analysis based on the SMLNFT model with stochastic constraint are intractable, the Bayesian method is pursued utilizing a Markov chain Monte Carlo (MCMC sampling based approach. This paper introduces a two-stage maximum entropy (MaxEnt prior, which elicits a priori uncertain constraint and develops Bayesian hierarchical SMLNFT model by using the prior. The paper also proposes an MCMC method for Bayesian inference in the SMLNFT model reliability and calls attention to properties of the MaxEnt prior that are useful for method development. Finally, two data sets are used to illustrate how the proposed methodology works.
TYPE Ia SUPERNOVA LIGHT-CURVE INFERENCE: HIERARCHICAL BAYESIAN ANALYSIS IN THE NEAR-INFRARED
International Nuclear Information System (INIS)
Mandel, Kaisey S.; Friedman, Andrew S.; Kirshner, Robert P.; Wood-Vasey, W. Michael
2009-01-01
We present a comprehensive statistical analysis of the properties of Type Ia supernova (SN Ia) light curves in the near-infrared using recent data from Peters Automated InfraRed Imaging TELescope and the literature. We construct a hierarchical Bayesian framework, incorporating several uncertainties including photometric error, peculiar velocities, dust extinction, and intrinsic variations, for principled and coherent statistical inference. SN Ia light-curve inferences are drawn from the global posterior probability of parameters describing both individual supernovae and the population conditioned on the entire SN Ia NIR data set. The logical structure of the hierarchical model is represented by a directed acyclic graph. Fully Bayesian analysis of the model and data is enabled by an efficient Markov Chain Monte Carlo algorithm exploiting the conditional probabilistic structure using Gibbs sampling. We apply this framework to the JHK s SN Ia light-curve data. A new light-curve model captures the observed J-band light-curve shape variations. The marginal intrinsic variances in peak absolute magnitudes are σ(M J ) = 0.17 ± 0.03, σ(M H ) = 0.11 ± 0.03, and σ(M Ks ) = 0.19 ± 0.04. We describe the first quantitative evidence for correlations between the NIR absolute magnitudes and J-band light-curve shapes, and demonstrate their utility for distance estimation. The average residual in the Hubble diagram for the training set SNe at cz > 2000kms -1 is 0.10 mag. The new application of bootstrap cross-validation to SN Ia light-curve inference tests the sensitivity of the statistical model fit to the finite sample and estimates the prediction error at 0.15 mag. These results demonstrate that SN Ia NIR light curves are as effective as corrected optical light curves, and, because they are less vulnerable to dust absorption, they have great potential as precise and accurate cosmological distance indicators.
Hierarchical Bayesian inference of the initial mass function in composite stellar populations
Dries, M.; Trager, S. C.; Koopmans, L. V. E.; Popping, G.; Somerville, R. S.
2018-03-01
The initial mass function (IMF) is a key ingredient in many studies of galaxy formation and evolution. Although the IMF is often assumed to be universal, there is continuing evidence that it is not universal. Spectroscopic studies that derive the IMF of the unresolved stellar populations of a galaxy often assume that this spectrum can be described by a single stellar population (SSP). To alleviate these limitations, in this paper we have developed a unique hierarchical Bayesian framework for modelling composite stellar populations (CSPs). Within this framework, we use a parametrized IMF prior to regulate a direct inference of the IMF. We use this new framework to determine the number of SSPs that is required to fit a set of realistic CSP mock spectra. The CSP mock spectra that we use are based on semi-analytic models and have an IMF that varies as a function of stellar velocity dispersion of the galaxy. Our results suggest that using a single SSP biases the determination of the IMF slope to a higher value than the true slope, although the trend with stellar velocity dispersion is overall recovered. If we include more SSPs in the fit, the Bayesian evidence increases significantly and the inferred IMF slopes of our mock spectra converge, within the errors, to their true values. Most of the bias is already removed by using two SSPs instead of one. We show that we can reconstruct the variable IMF of our mock spectra for signal-to-noise ratios exceeding ˜75.
How does aging affect recognition-based inference? A hierarchical Bayesian modeling approach.
Horn, Sebastian S; Pachur, Thorsten; Mata, Rui
2015-01-01
The recognition heuristic (RH) is a simple strategy for probabilistic inference according to which recognized objects are judged to score higher on a criterion than unrecognized objects. In this article, a hierarchical Bayesian extension of the multinomial r-model is applied to measure use of the RH on the individual participant level and to re-evaluate differences between younger and older adults' strategy reliance across environments. Further, it is explored how individual r-model parameters relate to alternative measures of the use of recognition and other knowledge, such as adherence rates and indices from signal-detection theory (SDT). Both younger and older adults used the RH substantially more often in an environment with high than low recognition validity, reflecting adaptivity in strategy use across environments. In extension of previous analyses (based on adherence rates), hierarchical modeling revealed that in an environment with low recognition validity, (a) older adults had a stronger tendency than younger adults to rely on the RH and (b) variability in RH use between individuals was larger than in an environment with high recognition validity; variability did not differ between age groups. Further, the r-model parameters correlated moderately with an SDT measure expressing how well people can discriminate cases where the RH leads to a correct vs. incorrect inference; this suggests that the r-model and the SDT measures may offer complementary insights into the use of recognition in decision making. In conclusion, younger and older adults are largely adaptive in their application of the RH, but cognitive aging may be associated with an increased tendency to rely on this strategy. Copyright © 2014 Elsevier B.V. All rights reserved.
Izquierdo, K.; Lekic, V.; Montesi, L.
2017-12-01
Gravity inversions are especially important for planetary applications since measurements of the variations in gravitational acceleration are often the only constraint available to map out lateral density variations in the interiors of planets and other Solar system objects. Currently, global gravity data is available for the terrestrial planets and the Moon. Although several methods for inverting these data have been developed and applied, the non-uniqueness of global density models that fit the data has not yet been fully characterized. We make use of Bayesian inference and a Reversible Jump Markov Chain Monte Carlo (RJMCMC) approach to develop a Trans-dimensional Hierarchical Bayesian (THB) inversion algorithm that yields a large sample of models that fit a gravity field. From this group of models, we can determine the most likely value of parameters of a global density model and a measure of the non-uniqueness of each parameter when the number of anomalies describing the gravity field is not fixed a priori. We explore the use of a parallel tempering algorithm and fast multipole method to reduce the number of iterations and computing time needed. We applied this method to a synthetic gravity field of the Moon and a long wavelength synthetic model of density anomalies in the Earth's lower mantle. We obtained a good match between the given gravity field and the gravity field produced by the most likely model in each inversion. The number of anomalies of the models showed parsimony of the algorithm, the value of the noise variance of the input data was retrieved, and the non-uniqueness of the models was quantified. Our results show that the ability to constrain the latitude and longitude of density anomalies, which is excellent at shallow locations (information about the overall density distribution of celestial bodies even when there is no other geophysical data available.
Kim, Daesang; El Gharamti, Iman; Hantouche, Mireille; Elwardani, Ahmed Elsaid; Farooq, Aamir; Bisetti, Fabrizio; Knio, Omar
2017-01-01
We developed a novel two-step hierarchical method for the Bayesian inference of the rate parameters of a target reaction from time-resolved concentration measurements in shock tubes. The method was applied to the calibration of the parameters
Galliano, Frédéric
2018-05-01
This article presents a new dust spectral energy distribution (SED) model, named HerBIE, aimed at eliminating the noise-induced correlations and large scatter obtained when performing least-squares fits. The originality of this code is to apply the hierarchical Bayesian approach to full dust models, including realistic optical properties, stochastic heating, and the mixing of physical conditions in the observed regions. We test the performances of our model by applying it to synthetic observations. We explore the impact on the recovered parameters of several effects: signal-to-noise ratio, SED shape, sample size, the presence of intrinsic correlations, the wavelength coverage, and the use of different SED model components. We show that this method is very efficient: the recovered parameters are consistently distributed around their true values. We do not find any clear bias, even for the most degenerate parameters, or with extreme signal-to-noise ratios.
Bayesian inference with ecological applications
Link, William A
2009-01-01
This text is written to provide a mathematically sound but accessible and engaging introduction to Bayesian inference specifically for environmental scientists, ecologists and wildlife biologists. It emphasizes the power and usefulness of Bayesian methods in an ecological context. The advent of fast personal computers and easily available software has simplified the use of Bayesian and hierarchical models . One obstacle remains for ecologists and wildlife biologists, namely the near absence of Bayesian texts written specifically for them. The book includes many relevant examples, is supported by software and examples on a companion website and will become an essential grounding in this approach for students and research ecologists. Engagingly written text specifically designed to demystify a complex subject Examples drawn from ecology and wildlife research An essential grounding for graduate and research ecologists in the increasingly prevalent Bayesian approach to inference Companion website with analyt...
Bayesian statistical inference
Directory of Open Access Journals (Sweden)
Bruno De Finetti
2017-04-01
Full Text Available This work was translated into English and published in the volume: Bruno De Finetti, Induction and Probability, Biblioteca di Statistica, eds. P. Monari, D. Cocchi, Clueb, Bologna, 1993.Bayesian statistical Inference is one of the last fundamental philosophical papers in which we can find the essential De Finetti's approach to the statistical inference.
Directory of Open Access Journals (Sweden)
Mario A Pardo
Full Text Available We inferred the population densities of blue whales (Balaenoptera musculus and short-beaked common dolphins (Delphinus delphis in the Northeast Pacific Ocean as functions of the water-column's physical structure by implementing hierarchical models in a Bayesian framework. This approach allowed us to propagate the uncertainty of the field observations into the inference of species-habitat relationships and to generate spatially explicit population density predictions with reduced effects of sampling heterogeneity. Our hypothesis was that the large-scale spatial distributions of these two cetacean species respond primarily to ecological processes resulting from shoaling and outcropping of the pycnocline in regions of wind-forced upwelling and eddy-like circulation. Physically, these processes affect the thermodynamic balance of the water column, decreasing its volume and thus the height of the absolute dynamic topography (ADT. Biologically, they lead to elevated primary productivity and persistent aggregation of low-trophic-level prey. Unlike other remotely sensed variables, ADT provides information about the structure of the entire water column and it is also routinely measured at high spatial-temporal resolution by satellite altimeters with uniform global coverage. Our models provide spatially explicit population density predictions for both species, even in areas where the pycnocline shoals but does not outcrop (e.g. the Costa Rica Dome and the North Equatorial Countercurrent thermocline ridge. Interannual variations in distribution during El Niño anomalies suggest that the population density of both species decreases dramatically in the Equatorial Cold Tongue and the Costa Rica Dome, and that their distributions retract to particular areas that remain productive, such as the more oceanic waters in the central California Current System, the northern Gulf of California, the North Equatorial Countercurrent thermocline ridge, and the more
Bayesian nonparametric hierarchical modeling.
Dunson, David B
2009-04-01
In biomedical research, hierarchical models are very widely used to accommodate dependence in multivariate and longitudinal data and for borrowing of information across data from different sources. A primary concern in hierarchical modeling is sensitivity to parametric assumptions, such as linearity and normality of the random effects. Parametric assumptions on latent variable distributions can be challenging to check and are typically unwarranted, given available prior knowledge. This article reviews some recent developments in Bayesian nonparametric methods motivated by complex, multivariate and functional data collected in biomedical studies. The author provides a brief review of flexible parametric approaches relying on finite mixtures and latent class modeling. Dirichlet process mixture models are motivated by the need to generalize these approaches to avoid assuming a fixed finite number of classes. Focusing on an epidemiology application, the author illustrates the practical utility and potential of nonparametric Bayes methods.
Bayesian Inference Methods for Sparse Channel Estimation
DEFF Research Database (Denmark)
Pedersen, Niels Lovmand
2013-01-01
This thesis deals with sparse Bayesian learning (SBL) with application to radio channel estimation. As opposed to the classical approach for sparse signal representation, we focus on the problem of inferring complex signals. Our investigations within SBL constitute the basis for the development...... of Bayesian inference algorithms for sparse channel estimation. Sparse inference methods aim at finding the sparse representation of a signal given in some overcomplete dictionary of basis vectors. Within this context, one of our main contributions to the field of SBL is a hierarchical representation...... analysis of the complex prior representation, where we show that the ability to induce sparse estimates of a given prior heavily depends on the inference method used and, interestingly, whether real or complex variables are inferred. We also show that the Bayesian estimators derived from the proposed...
Applied Bayesian hierarchical methods
National Research Council Canada - National Science Library
Congdon, P
2010-01-01
... . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.2 Posterior Inference from Bayes Formula . . . . . . . . . . . . 1.3 Markov Chain Monte Carlo Sampling in Relation to Monte Carlo Methods: Obtaining Posterior...
Kim, Daesang
2017-06-22
We developed a novel two-step hierarchical method for the Bayesian inference of the rate parameters of a target reaction from time-resolved concentration measurements in shock tubes. The method was applied to the calibration of the parameters of the reaction of hydroxyl with 2-methylfuran, which is studied experimentally via absorption measurements of the OH radical\\'s concentration following shock-heating. In the first step of the approach, each shock tube experiment is treated independently to infer the posterior distribution of the rate constant and error hyper-parameter that best explains the OH signal. In the second step, these posterior distributions are sampled to calibrate the parameters appearing in the Arrhenius reaction model for the rate constant. Furthermore, the second step is modified and repeated in order to explore alternative rate constant models and to assess the effect of uncertainties in the reflected shock\\'s temperature. Comparisons of the estimates obtained via the proposed methodology against the common least squares approach are presented. The relative merits of the novel Bayesian framework are highlighted, especially with respect to the opportunity to utilize the posterior distributions of the parameters in future uncertainty quantification studies.
Wu, Zhen; Liu, Yong; Liang, Zhongyao; Wu, Sifeng; Guo, Huaicheng
2017-06-01
Lake eutrophication is associated with excessive anthropogenic nutrients (mainly nitrogen (N) and phosphorus (P)) and unobserved internal nutrient cycling. Despite the advances in understanding the role of external loadings, the contribution of internal nutrient cycling is still an open question. A dynamic mass-balance model was developed to simulate and measure the contributions of internal cycling and external loading. It was based on the temporal Bayesian Hierarchical Framework (BHM), where we explored the seasonal patterns in the dynamics of nutrient cycling processes and the limitation of N and P on phytoplankton growth in hyper-eutrophic Lake Dianchi, China. The dynamic patterns of the five state variables (Chla, TP, ammonia, nitrate and organic N) were simulated based on the model. Five parameters (algae growth rate, sediment exchange rate of N and P, nitrification rate and denitrification rate) were estimated based on BHM. The model provided a good fit to observations. Our model results highlighted the role of internal cycling of N and P in Lake Dianchi. The internal cycling processes contributed more than external loading to the N and P changes in the water column. Further insights into the nutrient limitation analysis indicated that the sediment exchange of P determined the P limitation. Allowing for the contribution of denitrification to N removal, N was the more limiting nutrient in most of the time, however, P was the more important nutrient for eutrophication management. For Lake Dianchi, it would not be possible to recover solely by reducing the external watershed nutrient load; the mechanisms of internal cycling should also be considered as an approach to inhibit the release of sediments and to enhance denitrification. Copyright © 2017 Elsevier Ltd. All rights reserved.
Inference in hybrid Bayesian networks
DEFF Research Database (Denmark)
Lanseth, Helge; Nielsen, Thomas Dyhre; Rumí, Rafael
2009-01-01
Since the 1980s, Bayesian Networks (BNs) have become increasingly popular for building statistical models of complex systems. This is particularly true for boolean systems, where BNs often prove to be a more efficient modelling framework than traditional reliability-techniques (like fault trees...... decade's research on inference in hybrid Bayesian networks. The discussions are linked to an example model for estimating human reliability....
Interactive Instruction in Bayesian Inference
DEFF Research Database (Denmark)
Khan, Azam; Breslav, Simon; Hornbæk, Kasper
2018-01-01
An instructional approach is presented to improve human performance in solving Bayesian inference problems. Starting from the original text of the classic Mammography Problem, the textual expression is modified and visualizations are added according to Mayer’s principles of instruction. These pri......An instructional approach is presented to improve human performance in solving Bayesian inference problems. Starting from the original text of the classic Mammography Problem, the textual expression is modified and visualizations are added according to Mayer’s principles of instruction....... These principles concern coherence, personalization, signaling, segmenting, multimedia, spatial contiguity, and pretraining. Principles of self-explanation and interactivity are also applied. Four experiments on the Mammography Problem showed that these principles help participants answer the questions...... that an instructional approach to improving human performance in Bayesian inference is a promising direction....
Bailer-Jones, Coryn A. L.
2017-04-01
Preface; 1. Probability basics; 2. Estimation and uncertainty; 3. Statistical models and inference; 4. Linear models, least squares, and maximum likelihood; 5. Parameter estimation: single parameter; 6. Parameter estimation: multiple parameters; 7. Approximating distributions; 8. Monte Carlo methods for inference; 9. Parameter estimation: Markov chain Monte Carlo; 10. Frequentist hypothesis testing; 11. Model comparison; 12. Dealing with more complicated problems; References; Index.
Probability biases as Bayesian inference
Directory of Open Access Journals (Sweden)
Andre; C. R. Martins
2006-11-01
Full Text Available In this article, I will show how several observed biases in human probabilistic reasoning can be partially explained as good heuristics for making inferences in an environment where probabilities have uncertainties associated to them. Previous results show that the weight functions and the observed violations of coalescing and stochastic dominance can be understood from a Bayesian point of view. We will review those results and see that Bayesian methods should also be used as part of the explanation behind other known biases. That means that, although the observed errors are still errors under the be understood as adaptations to the solution of real life problems. Heuristics that allow fast evaluations and mimic a Bayesian inference would be an evolutionary advantage, since they would give us an efficient way of making decisions. %XX In that sense, it should be no surprise that humans reason with % probability as it has been observed.
Inference in hybrid Bayesian networks
International Nuclear Information System (INIS)
Langseth, Helge; Nielsen, Thomas D.; Rumi, Rafael; Salmeron, Antonio
2009-01-01
Since the 1980s, Bayesian networks (BNs) have become increasingly popular for building statistical models of complex systems. This is particularly true for boolean systems, where BNs often prove to be a more efficient modelling framework than traditional reliability techniques (like fault trees and reliability block diagrams). However, limitations in the BNs' calculation engine have prevented BNs from becoming equally popular for domains containing mixtures of both discrete and continuous variables (the so-called hybrid domains). In this paper we focus on these difficulties, and summarize some of the last decade's research on inference in hybrid Bayesian networks. The discussions are linked to an example model for estimating human reliability.
Bayesian inference on proportional elections.
Directory of Open Access Journals (Sweden)
Gabriel Hideki Vatanabe Brunello
Full Text Available Polls for majoritarian voting systems usually show estimates of the percentage of votes for each candidate. However, proportional vote systems do not necessarily guarantee the candidate with the most percentage of votes will be elected. Thus, traditional methods used in majoritarian elections cannot be applied on proportional elections. In this context, the purpose of this paper was to perform a Bayesian inference on proportional elections considering the Brazilian system of seats distribution. More specifically, a methodology to answer the probability that a given party will have representation on the chamber of deputies was developed. Inferences were made on a Bayesian scenario using the Monte Carlo simulation technique, and the developed methodology was applied on data from the Brazilian elections for Members of the Legislative Assembly and Federal Chamber of Deputies in 2010. A performance rate was also presented to evaluate the efficiency of the methodology. Calculations and simulations were carried out using the free R statistical software.
Hierarchical Bayesian Modeling of Fluid-Induced Seismicity
Broccardo, M.; Mignan, A.; Wiemer, S.; Stojadinovic, B.; Giardini, D.
2017-11-01
In this study, we present a Bayesian hierarchical framework to model fluid-induced seismicity. The framework is based on a nonhomogeneous Poisson process with a fluid-induced seismicity rate proportional to the rate of injected fluid. The fluid-induced seismicity rate model depends upon a set of physically meaningful parameters and has been validated for six fluid-induced case studies. In line with the vision of hierarchical Bayesian modeling, the rate parameters are considered as random variables. We develop both the Bayesian inference and updating rules, which are used to develop a probabilistic forecasting model. We tested the Basel 2006 fluid-induced seismic case study to prove that the hierarchical Bayesian model offers a suitable framework to coherently encode both epistemic uncertainty and aleatory variability. Moreover, it provides a robust and consistent short-term seismic forecasting model suitable for online risk quantification and mitigation.
Bayesian methods for hackers probabilistic programming and Bayesian inference
Davidson-Pilon, Cameron
2016-01-01
Bayesian methods of inference are deeply natural and extremely powerful. However, most discussions of Bayesian inference rely on intensely complex mathematical analyses and artificial examples, making it inaccessible to anyone without a strong mathematical background. Now, though, Cameron Davidson-Pilon introduces Bayesian inference from a computational perspective, bridging theory to practice–freeing you to get results using computing power. Bayesian Methods for Hackers illuminates Bayesian inference through probabilistic programming with the powerful PyMC language and the closely related Python tools NumPy, SciPy, and Matplotlib. Using this approach, you can reach effective solutions in small increments, without extensive mathematical intervention. Davidson-Pilon begins by introducing the concepts underlying Bayesian inference, comparing it with other techniques and guiding you through building and training your first Bayesian model. Next, he introduces PyMC through a series of detailed examples a...
Bayesian inference for Hawkes processes
DEFF Research Database (Denmark)
Rasmussen, Jakob Gulddahl
The Hawkes process is a practically and theoretically important class of point processes, but parameter-estimation for such a process can pose various problems. In this paper we explore and compare two approaches to Bayesian inference. The first approach is based on the so-called conditional...... intensity function, while the second approach is based on an underlying clustering and branching structure in the Hawkes process. For practical use, MCMC (Markov chain Monte Carlo) methods are employed. The two approaches are compared numerically using three examples of the Hawkes process....
Bayesian inference for Hawkes processes
DEFF Research Database (Denmark)
Rasmussen, Jakob Gulddahl
2013-01-01
The Hawkes process is a practically and theoretically important class of point processes, but parameter-estimation for such a process can pose various problems. In this paper we explore and compare two approaches to Bayesian inference. The first approach is based on the so-called conditional...... intensity function, while the second approach is based on an underlying clustering and branching structure in the Hawkes process. For practical use, MCMC (Markov chain Monte Carlo) methods are employed. The two approaches are compared numerically using three examples of the Hawkes process....
Nonparametric Bayesian inference in biostatistics
Müller, Peter
2015-01-01
As chapters in this book demonstrate, BNP has important uses in clinical sciences and inference for issues like unknown partitions in genomics. Nonparametric Bayesian approaches (BNP) play an ever expanding role in biostatistical inference from use in proteomics to clinical trials. Many research problems involve an abundance of data and require flexible and complex probability models beyond the traditional parametric approaches. As this book's expert contributors show, BNP approaches can be the answer. Survival Analysis, in particular survival regression, has traditionally used BNP, but BNP's potential is now very broad. This applies to important tasks like arrangement of patients into clinically meaningful subpopulations and segmenting the genome into functionally distinct regions. This book is designed to both review and introduce application areas for BNP. While existing books provide theoretical foundations, this book connects theory to practice through engaging examples and research questions. Chapters c...
Variations on Bayesian Prediction and Inference
2016-05-09
inference 2.2.1 Background There are a number of statistical inference problems that are not generally formulated via a full probability model...problem of inference about an unknown parameter, the Bayesian approach requires a full probability 1. REPORT DATE (DD-MM-YYYY) 4. TITLE AND...the problem of inference about an unknown parameter, the Bayesian approach requires a full probability model/likelihood which can be an obstacle
Compiling Relational Bayesian Networks for Exact Inference
DEFF Research Database (Denmark)
Jaeger, Manfred; Darwiche, Adnan; Chavira, Mark
2006-01-01
We describe in this paper a system for exact inference with relational Bayesian networks as defined in the publicly available PRIMULA tool. The system is based on compiling propositional instances of relational Bayesian networks into arithmetic circuits and then performing online inference...
Compiling Relational Bayesian Networks for Exact Inference
DEFF Research Database (Denmark)
Jaeger, Manfred; Chavira, Mark; Darwiche, Adnan
2004-01-01
We describe a system for exact inference with relational Bayesian networks as defined in the publicly available \\primula\\ tool. The system is based on compiling propositional instances of relational Bayesian networks into arithmetic circuits and then performing online inference by evaluating...
Cortical hierarchies perform Bayesian causal inference in multisensory perception.
Directory of Open Access Journals (Sweden)
Tim Rohe
2015-02-01
Full Text Available To form a veridical percept of the environment, the brain needs to integrate sensory signals from a common source but segregate those from independent sources. Thus, perception inherently relies on solving the "causal inference problem." Behaviorally, humans solve this problem optimally as predicted by Bayesian Causal Inference; yet, the underlying neural mechanisms are unexplored. Combining psychophysics, Bayesian modeling, functional magnetic resonance imaging (fMRI, and multivariate decoding in an audiovisual spatial localization task, we demonstrate that Bayesian Causal Inference is performed by a hierarchy of multisensory processes in the human brain. At the bottom of the hierarchy, in auditory and visual areas, location is represented on the basis that the two signals are generated by independent sources (= segregation. At the next stage, in posterior intraparietal sulcus, location is estimated under the assumption that the two signals are from a common source (= forced fusion. Only at the top of the hierarchy, in anterior intraparietal sulcus, the uncertainty about the causal structure of the world is taken into account and sensory signals are combined as predicted by Bayesian Causal Inference. Characterizing the computational operations of signal interactions reveals the hierarchical nature of multisensory perception in human neocortex. It unravels how the brain accomplishes Bayesian Causal Inference, a statistical computation fundamental for perception and cognition. Our results demonstrate how the brain combines information in the face of uncertainty about the underlying causal structure of the world.
Hierarchical Bayesian Models of Subtask Learning
Anglim, Jeromy; Wynton, Sarah K. A.
2015-01-01
The current study used Bayesian hierarchical methods to challenge and extend previous work on subtask learning consistency. A general model of individual-level subtask learning was proposed focusing on power and exponential functions with constraints to test for inconsistency. To study subtask learning, we developed a novel computer-based booking…
Bayesian Inference on Gravitational Waves
Directory of Open Access Journals (Sweden)
Asad Ali
2015-12-01
Full Text Available The Bayesian approach is increasingly becoming popular among the astrophysics data analysis communities. However, the Pakistan statistics communities are unaware of this fertile interaction between the two disciplines. Bayesian methods have been in use to address astronomical problems since the very birth of the Bayes probability in eighteenth century. Today the Bayesian methods for the detection and parameter estimation of gravitational waves have solid theoretical grounds with a strong promise for the realistic applications. This article aims to introduce the Pakistan statistics communities to the applications of Bayesian Monte Carlo methods in the analysis of gravitational wave data with an overview of the Bayesian signal detection and estimation methods and demonstration by a couple of simplified examples.
Testing adaptive toolbox models: a Bayesian hierarchical approach.
Scheibehenne, Benjamin; Rieskamp, Jörg; Wagenmakers, Eric-Jan
2013-01-01
Many theories of human cognition postulate that people are equipped with a repertoire of strategies to solve the tasks they face. This theoretical framework of a cognitive toolbox provides a plausible account of intra- and interindividual differences in human behavior. Unfortunately, it is often unclear how to rigorously test the toolbox framework. How can a toolbox model be quantitatively specified? How can the number of toolbox strategies be limited to prevent uncontrolled strategy sprawl? How can a toolbox model be formally tested against alternative theories? The authors show how these challenges can be met by using Bayesian inference techniques. By means of parameter recovery simulations and the analysis of empirical data across a variety of domains (i.e., judgment and decision making, children's cognitive development, function learning, and perceptual categorization), the authors illustrate how Bayesian inference techniques allow toolbox models to be quantitatively specified, strategy sprawl to be contained, and toolbox models to be rigorously tested against competing theories. The authors demonstrate that their approach applies at the individual level but can also be generalized to the group level with hierarchical Bayesian procedures. The suggested Bayesian inference techniques represent a theoretical and methodological advancement for toolbox theories of cognition and behavior.
BAYESIAN INFERENCE OF CMB GRAVITATIONAL LENSING
Energy Technology Data Exchange (ETDEWEB)
Anderes, Ethan [Department of Statistics, University of California, Davis, CA 95616 (United States); Wandelt, Benjamin D.; Lavaux, Guilhem [Sorbonne Universités, UPMC Univ Paris 06 and CNRS, UMR7095, Institut d’Astrophysique de Paris, F-75014, Paris (France)
2015-08-01
The Planck satellite, along with several ground-based telescopes, has mapped the cosmic microwave background (CMB) at sufficient resolution and signal-to-noise so as to allow a detection of the subtle distortions due to the gravitational influence of the intervening matter distribution. A natural modeling approach is to write a Bayesian hierarchical model for the lensed CMB in terms of the unlensed CMB and the lensing potential. So far there has been no feasible algorithm for inferring the posterior distribution of the lensing potential from the lensed CMB map. We propose a solution that allows efficient Markov Chain Monte Carlo sampling from the joint posterior of the lensing potential and the unlensed CMB map using the Hamiltonian Monte Carlo technique. The main conceptual step in the solution is a re-parameterization of CMB lensing in terms of the lensed CMB and the “inverse lensing” potential. We demonstrate a fast implementation on simulated data, including noise and a sky cut, that uses a further acceleration based on a very mild approximation of the inverse lensing potential. We find that the resulting Markov Chain has short correlation lengths and excellent convergence properties, making it promising for applications to high-resolution CMB data sets in the future.
An Intuitive Dashboard for Bayesian Network Inference
International Nuclear Information System (INIS)
Reddy, Vikas; Farr, Anna Charisse; Wu, Paul; Mengersen, Kerrie; Yarlagadda, Prasad K D V
2014-01-01
Current Bayesian network software packages provide good graphical interface for users who design and develop Bayesian networks for various applications. However, the intended end-users of these networks may not necessarily find such an interface appealing and at times it could be overwhelming, particularly when the number of nodes in the network is large. To circumvent this problem, this paper presents an intuitive dashboard, which provides an additional layer of abstraction, enabling the end-users to easily perform inferences over the Bayesian networks. Unlike most software packages, which display the nodes and arcs of the network, the developed tool organises the nodes based on the cause-and-effect relationship, making the user-interaction more intuitive and friendly. In addition to performing various types of inferences, the users can conveniently use the tool to verify the behaviour of the developed Bayesian network. The tool has been developed using QT and SMILE libraries in C++
An Intuitive Dashboard for Bayesian Network Inference
Reddy, Vikas; Charisse Farr, Anna; Wu, Paul; Mengersen, Kerrie; Yarlagadda, Prasad K. D. V.
2014-03-01
Current Bayesian network software packages provide good graphical interface for users who design and develop Bayesian networks for various applications. However, the intended end-users of these networks may not necessarily find such an interface appealing and at times it could be overwhelming, particularly when the number of nodes in the network is large. To circumvent this problem, this paper presents an intuitive dashboard, which provides an additional layer of abstraction, enabling the end-users to easily perform inferences over the Bayesian networks. Unlike most software packages, which display the nodes and arcs of the network, the developed tool organises the nodes based on the cause-and-effect relationship, making the user-interaction more intuitive and friendly. In addition to performing various types of inferences, the users can conveniently use the tool to verify the behaviour of the developed Bayesian network. The tool has been developed using QT and SMILE libraries in C++.
Introduction to Hierarchical Bayesian Modeling for Ecological Data
Parent, Eric
2012-01-01
Making statistical modeling and inference more accessible to ecologists and related scientists, Introduction to Hierarchical Bayesian Modeling for Ecological Data gives readers a flexible and effective framework to learn about complex ecological processes from various sources of data. It also helps readers get started on building their own statistical models. The text begins with simple models that progressively become more complex and realistic through explanatory covariates and intermediate hidden states variables. When fitting the models to data, the authors gradually present the concepts a
Bayesian disease mapping: hierarchical modeling in spatial epidemiology
National Research Council Canada - National Science Library
Lawson, Andrew
2013-01-01
.... Exploring these new developments, Bayesian Disease Mapping: Hierarchical Modeling in Spatial Epidemiology, Second Edition provides an up-to-date, cohesive account of the full range of Bayesian disease mapping methods and applications...
Prediction of road accidents: A Bayesian hierarchical approach
DEFF Research Database (Denmark)
Deublein, Markus; Schubert, Matthias; Adey, Bryan T.
2013-01-01
the expected number of accidents in which an injury has occurred and the expected number of light, severe and fatally injured road users. Additionally, the methodology is used for geo-referenced identification of road sections with increased occurrence probabilities of injury accident events on a road link......In this paper a novel methodology for the prediction of the occurrence of road accidents is presented. The methodology utilizes a combination of three statistical methods: (1) gamma-updating of the occurrence rates of injury accidents and injured road users, (2) hierarchical multivariate Poisson......-lognormal regression analysis taking into account correlations amongst multiple dependent model response variables and effects of discrete accident count data e.g. over-dispersion, and (3) Bayesian inference algorithms, which are applied by means of data mining techniques supported by Bayesian Probabilistic Networks...
A Bayesian Network Schema for Lessening Database Inference
National Research Council Canada - National Science Library
Chang, LiWu; Moskowitz, Ira S
2001-01-01
.... The authors introduce a formal schema for database inference analysis, based upon a Bayesian network structure, which identifies critical parameters involved in the inference problem and represents...
Empirical Bayesian inference and model uncertainty
International Nuclear Information System (INIS)
Poern, K.
1994-01-01
This paper presents a hierarchical or multistage empirical Bayesian approach for the estimation of uncertainty concerning the intensity of a homogeneous Poisson process. A class of contaminated gamma distributions is considered to describe the uncertainty concerning the intensity. These distributions in turn are defined through a set of secondary parameters, the knowledge of which is also described and updated via Bayes formula. This two-stage Bayesian approach is an example where the modeling uncertainty is treated in a comprehensive way. Each contaminated gamma distributions, represented by a point in the 3D space of secondary parameters, can be considered as a specific model of the uncertainty about the Poisson intensity. Then, by the empirical Bayesian method each individual model is assigned a posterior probability
Polynomial Chaos Surrogates for Bayesian Inference
Le Maitre, Olivier
2016-01-06
The Bayesian inference is a popular probabilistic method to solve inverse problems, such as the identification of field parameter in a PDE model. The inference rely on the Bayes rule to update the prior density of the sought field, from observations, and derive its posterior distribution. In most cases the posterior distribution has no explicit form and has to be sampled, for instance using a Markov-Chain Monte Carlo method. In practice the prior field parameter is decomposed and truncated (e.g. by means of Karhunen- Lo´eve decomposition) to recast the inference problem into the inference of a finite number of coordinates. Although proved effective in many situations, the Bayesian inference as sketched above faces several difficulties requiring improvements. First, sampling the posterior can be a extremely costly task as it requires multiple resolutions of the PDE model for different values of the field parameter. Second, when the observations are not very much informative, the inferred parameter field can highly depends on its prior which can be somehow arbitrary. These issues have motivated the introduction of reduced modeling or surrogates for the (approximate) determination of the parametrized PDE solution and hyperparameters in the description of the prior field. Our contribution focuses on recent developments in these two directions: the acceleration of the posterior sampling by means of Polynomial Chaos expansions and the efficient treatment of parametrized covariance functions for the prior field. We also discuss the possibility of making such approach adaptive to further improve its efficiency.
The NIFTY way of Bayesian signal inference
International Nuclear Information System (INIS)
Selig, Marco
2014-01-01
We introduce NIFTY, 'Numerical Information Field Theory', a software package for the development of Bayesian signal inference algorithms that operate independently from any underlying spatial grid and its resolution. A large number of Bayesian and Maximum Entropy methods for 1D signal reconstruction, 2D imaging, as well as 3D tomography, appear formally similar, but one often finds individualized implementations that are neither flexible nor easily transferable. Signal inference in the framework of NIFTY can be done in an abstract way, such that algorithms, prototyped in 1D, can be applied to real world problems in higher-dimensional settings. NIFTY as a versatile library is applicable and already has been applied in 1D, 2D, 3D and spherical settings. A recent application is the D 3 PO algorithm targeting the non-trivial task of denoising, deconvolving, and decomposing photon observations in high energy astronomy
The NIFTy way of Bayesian signal inference
Selig, Marco
2014-12-01
We introduce NIFTy, "Numerical Information Field Theory", a software package for the development of Bayesian signal inference algorithms that operate independently from any underlying spatial grid and its resolution. A large number of Bayesian and Maximum Entropy methods for 1D signal reconstruction, 2D imaging, as well as 3D tomography, appear formally similar, but one often finds individualized implementations that are neither flexible nor easily transferable. Signal inference in the framework of NIFTy can be done in an abstract way, such that algorithms, prototyped in 1D, can be applied to real world problems in higher-dimensional settings. NIFTy as a versatile library is applicable and already has been applied in 1D, 2D, 3D and spherical settings. A recent application is the D3PO algorithm targeting the non-trivial task of denoising, deconvolving, and decomposing photon observations in high energy astronomy.
Bayesianism and inference to the best explanation
Directory of Open Access Journals (Sweden)
Valeriano IRANZO
2008-01-01
Full Text Available Bayesianism and Inference to the best explanation (IBE are two different models of inference. Recently there has been some debate about the possibility of “bayesianizing” IBE. Firstly I explore several alternatives to include explanatory considerations in Bayes’s Theorem. Then I distinguish two different interpretations of prior probabilities: “IBE-Bayesianism” (IBE-Bay and “frequentist-Bayesianism” (Freq-Bay. After detailing the content of the latter, I propose a rule for assessing the priors. I also argue that Freq-Bay: (i endorses a role for explanatory value in the assessment of scientific hypotheses; (ii avoids a purely subjectivist reading of prior probabilities; and (iii fits better than IBE-Bayesianism with two basic facts about science, i.e., the prominent role played by empirical testing and the existence of many scientific theories in the past that failed to fulfil their promises and were subsequently abandoned.
Optimal inference with suboptimal models: Addiction and active Bayesian inference
Schwartenbeck, Philipp; FitzGerald, Thomas H.B.; Mathys, Christoph; Dolan, Ray; Wurst, Friedrich; Kronbichler, Martin; Friston, Karl
2015-01-01
When casting behaviour as active (Bayesian) inference, optimal inference is defined with respect to an agent’s beliefs – based on its generative model of the world. This contrasts with normative accounts of choice behaviour, in which optimal actions are considered in relation to the true structure of the environment – as opposed to the agent’s beliefs about worldly states (or the task). This distinction shifts an understanding of suboptimal or pathological behaviour away from aberrant inference as such, to understanding the prior beliefs of a subject that cause them to behave less ‘optimally’ than our prior beliefs suggest they should behave. Put simply, suboptimal or pathological behaviour does not speak against understanding behaviour in terms of (Bayes optimal) inference, but rather calls for a more refined understanding of the subject’s generative model upon which their (optimal) Bayesian inference is based. Here, we discuss this fundamental distinction and its implications for understanding optimality, bounded rationality and pathological (choice) behaviour. We illustrate our argument using addictive choice behaviour in a recently described ‘limited offer’ task. Our simulations of pathological choices and addictive behaviour also generate some clear hypotheses, which we hope to pursue in ongoing empirical work. PMID:25561321
Using Alien Coins to Test Whether Simple Inference Is Bayesian
Cassey, Peter; Hawkins, Guy E.; Donkin, Chris; Brown, Scott D.
2016-01-01
Reasoning and inference are well-studied aspects of basic cognition that have been explained as statistically optimal Bayesian inference. Using a simplified experimental design, we conducted quantitative comparisons between Bayesian inference and human inference at the level of individuals. In 3 experiments, with more than 13,000 participants, we…
Constrained bayesian inference of project performance models
Sunmola, Funlade
2013-01-01
Project performance models play an important role in the management of project success. When used for monitoring projects, they can offer predictive ability such as indications of possible delivery problems. Approaches for monitoring project performance relies on available project information including restrictions imposed on the project, particularly the constraints of cost, quality, scope and time. We study in this paper a Bayesian inference methodology for project performance modelling in ...
Bayesian inference in processing experimental data: principles and basic applications
International Nuclear Information System (INIS)
D'Agostini, G
2003-01-01
This paper introduces general ideas and some basic methods of the Bayesian probability theory applied to physics measurements. Our aim is to make the reader familiar, through examples rather than rigorous formalism, with concepts such as the following: model comparison (including the automatic Ockham's Razor filter provided by the Bayesian approach); parametric inference; quantification of the uncertainty about the value of physical quantities, also taking into account systematic effects; role of marginalization; posterior characterization; predictive distributions; hierarchical modelling and hyperparameters; Gaussian approximation of the posterior and recovery of conventional methods, especially maximum likelihood and chi-square fits under well-defined conditions; conjugate priors, transformation invariance and maximum entropy motivated priors; and Monte Carlo (MC) estimates of expectation, including a short introduction to Markov Chain MC methods
Bayesian structural inference for hidden processes
Strelioff, Christopher C.; Crutchfield, James P.
2014-04-01
We introduce a Bayesian approach to discovering patterns in structurally complex processes. The proposed method of Bayesian structural inference (BSI) relies on a set of candidate unifilar hidden Markov model (uHMM) topologies for inference of process structure from a data series. We employ a recently developed exact enumeration of topological ɛ-machines. (A sequel then removes the topological restriction.) This subset of the uHMM topologies has the added benefit that inferred models are guaranteed to be ɛ-machines, irrespective of estimated transition probabilities. Properties of ɛ-machines and uHMMs allow for the derivation of analytic expressions for estimating transition probabilities, inferring start states, and comparing the posterior probability of candidate model topologies, despite process internal structure being only indirectly present in data. We demonstrate BSI's effectiveness in estimating a process's randomness, as reflected by the Shannon entropy rate, and its structure, as quantified by the statistical complexity. We also compare using the posterior distribution over candidate models and the single, maximum a posteriori model for point estimation and show that the former more accurately reflects uncertainty in estimated values. We apply BSI to in-class examples of finite- and infinite-order Markov processes, as well to an out-of-class, infinite-state hidden process.
Bayesian Estimation and Inference using Stochastic Hardware
Directory of Open Access Journals (Sweden)
Chetan Singh Thakur
2016-03-01
Full Text Available In this paper, we present the implementation of two types of Bayesian inference problems to demonstrate the potential of building probabilistic algorithms in hardware using single set of building blocks with the ability to perform these computations in real time. The first implementation, referred to as the BEAST (Bayesian Estimation and Stochastic Tracker, demonstrates a simple problem where an observer uses an underlying Hidden Markov Model (HMM to track a target in one dimension. In this implementation, sensors make noisy observations of the target position at discrete time steps. The tracker learns the transition model for target movement, and the observation model for the noisy sensors, and uses these to estimate the target position by solving the Bayesian recursive equation online. We show the tracking performance of the system and demonstrate how it can learn the observation model, the transition model, and the external distractor (noise probability interfering with the observations. In the second implementation, referred to as the Bayesian INference in DAG (BIND, we show how inference can be performed in a Directed Acyclic Graph (DAG using stochastic circuits. We show how these building blocks can be easily implemented using simple digital logic gates. An advantage of the stochastic electronic implementation is that it is robust to certain types of noise, which may become an issue in integrated circuit (IC technology with feature sizes in the order of tens of nanometers due to their low noise margin, the effect of high-energy cosmic rays and the low supply voltage. In our framework, the flipping of random individual bits would not affect the system performance because information is encoded in a bit stream.
Bayesian Estimation and Inference Using Stochastic Electronics.
Thakur, Chetan Singh; Afshar, Saeed; Wang, Runchun M; Hamilton, Tara J; Tapson, Jonathan; van Schaik, André
2016-01-01
In this paper, we present the implementation of two types of Bayesian inference problems to demonstrate the potential of building probabilistic algorithms in hardware using single set of building blocks with the ability to perform these computations in real time. The first implementation, referred to as the BEAST (Bayesian Estimation and Stochastic Tracker), demonstrates a simple problem where an observer uses an underlying Hidden Markov Model (HMM) to track a target in one dimension. In this implementation, sensors make noisy observations of the target position at discrete time steps. The tracker learns the transition model for target movement, and the observation model for the noisy sensors, and uses these to estimate the target position by solving the Bayesian recursive equation online. We show the tracking performance of the system and demonstrate how it can learn the observation model, the transition model, and the external distractor (noise) probability interfering with the observations. In the second implementation, referred to as the Bayesian INference in DAG (BIND), we show how inference can be performed in a Directed Acyclic Graph (DAG) using stochastic circuits. We show how these building blocks can be easily implemented using simple digital logic gates. An advantage of the stochastic electronic implementation is that it is robust to certain types of noise, which may become an issue in integrated circuit (IC) technology with feature sizes in the order of tens of nanometers due to their low noise margin, the effect of high-energy cosmic rays and the low supply voltage. In our framework, the flipping of random individual bits would not affect the system performance because information is encoded in a bit stream.
Heuristics as Bayesian inference under extreme priors.
Parpart, Paula; Jones, Matt; Love, Bradley C
2018-05-01
Simple heuristics are often regarded as tractable decision strategies because they ignore a great deal of information in the input data. One puzzle is why heuristics can outperform full-information models, such as linear regression, which make full use of the available information. These "less-is-more" effects, in which a relatively simpler model outperforms a more complex model, are prevalent throughout cognitive science, and are frequently argued to demonstrate an inherent advantage of simplifying computation or ignoring information. In contrast, we show at the computational level (where algorithmic restrictions are set aside) that it is never optimal to discard information. Through a formal Bayesian analysis, we prove that popular heuristics, such as tallying and take-the-best, are formally equivalent to Bayesian inference under the limit of infinitely strong priors. Varying the strength of the prior yields a continuum of Bayesian models with the heuristics at one end and ordinary regression at the other. Critically, intermediate models perform better across all our simulations, suggesting that down-weighting information with the appropriate prior is preferable to entirely ignoring it. Rather than because of their simplicity, our analyses suggest heuristics perform well because they implement strong priors that approximate the actual structure of the environment. We end by considering how new heuristics could be derived by infinitely strengthening the priors of other Bayesian models. These formal results have implications for work in psychology, machine learning and economics. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Bayesian inference on genetic merit under uncertain paternity
Directory of Open Access Journals (Sweden)
Tempelman Robert J
2003-09-01
Full Text Available Abstract A hierarchical animal model was developed for inference on genetic merit of livestock with uncertain paternity. Fully conditional posterior distributions for fixed and genetic effects, variance components, sire assignments and their probabilities are derived to facilitate a Bayesian inference strategy using MCMC methods. We compared this model to a model based on the Henderson average numerator relationship (ANRM in a simulation study with 10 replicated datasets generated for each of two traits. Trait 1 had a medium heritability (h2 for each of direct and maternal genetic effects whereas Trait 2 had a high h2 attributable only to direct effects. The average posterior probabilities inferred on the true sire were between 1 and 10% larger than the corresponding priors (the inverse of the number of candidate sires in a mating pasture for Trait 1 and between 4 and 13% larger than the corresponding priors for Trait 2. The predicted additive and maternal genetic effects were very similar using both models; however, model choice criteria (Pseudo Bayes Factor and Deviance Information Criterion decisively favored the proposed hierarchical model over the ANRM model.
Bayesian Inference of a Multivariate Regression Model
Directory of Open Access Journals (Sweden)
Marick S. Sinay
2014-01-01
Full Text Available We explore Bayesian inference of a multivariate linear regression model with use of a flexible prior for the covariance structure. The commonly adopted Bayesian setup involves the conjugate prior, multivariate normal distribution for the regression coefficients and inverse Wishart specification for the covariance matrix. Here we depart from this approach and propose a novel Bayesian estimator for the covariance. A multivariate normal prior for the unique elements of the matrix logarithm of the covariance matrix is considered. Such structure allows for a richer class of prior distributions for the covariance, with respect to strength of beliefs in prior location hyperparameters, as well as the added ability, to model potential correlation amongst the covariance structure. The posterior moments of all relevant parameters of interest are calculated based upon numerical results via a Markov chain Monte Carlo procedure. The Metropolis-Hastings-within-Gibbs algorithm is invoked to account for the construction of a proposal density that closely matches the shape of the target posterior distribution. As an application of the proposed technique, we investigate a multiple regression based upon the 1980 High School and Beyond Survey.
Efficient Bayesian inference for ARFIMA processes
Graves, T.; Gramacy, R. B.; Franzke, C. L. E.; Watkins, N. W.
2015-03-01
Many geophysical quantities, like atmospheric temperature, water levels in rivers, and wind speeds, have shown evidence of long-range dependence (LRD). LRD means that these quantities experience non-trivial temporal memory, which potentially enhances their predictability, but also hampers the detection of externally forced trends. Thus, it is important to reliably identify whether or not a system exhibits LRD. In this paper we present a modern and systematic approach to the inference of LRD. Rather than Mandelbrot's fractional Gaussian noise, we use the more flexible Autoregressive Fractional Integrated Moving Average (ARFIMA) model which is widely used in time series analysis, and of increasing interest in climate science. Unlike most previous work on the inference of LRD, which is frequentist in nature, we provide a systematic treatment of Bayesian inference. In particular, we provide a new approximate likelihood for efficient parameter inference, and show how nuisance parameters (e.g. short memory effects) can be integrated over in order to focus on long memory parameters, and hypothesis testing more directly. We illustrate our new methodology on the Nile water level data, with favorable comparison to the standard estimators.
HIERARCHICAL PROBABILISTIC INFERENCE OF COSMIC SHEAR
International Nuclear Information System (INIS)
Schneider, Michael D.; Dawson, William A.; Hogg, David W.; Marshall, Philip J.; Bard, Deborah J.; Meyers, Joshua; Lang, Dustin
2015-01-01
Point estimators for the shearing of galaxy images induced by gravitational lensing involve a complex inverse problem in the presence of noise, pixelization, and model uncertainties. We present a probabilistic forward modeling approach to gravitational lensing inference that has the potential to mitigate the biased inferences in most common point estimators and is practical for upcoming lensing surveys. The first part of our statistical framework requires specification of a likelihood function for the pixel data in an imaging survey given parameterized models for the galaxies in the images. We derive the lensing shear posterior by marginalizing over all intrinsic galaxy properties that contribute to the pixel data (i.e., not limited to galaxy ellipticities) and learn the distributions for the intrinsic galaxy properties via hierarchical inference with a suitably flexible conditional probabilitiy distribution specification. We use importance sampling to separate the modeling of small imaging areas from the global shear inference, thereby rendering our algorithm computationally tractable for large surveys. With simple numerical examples we demonstrate the improvements in accuracy from our importance sampling approach, as well as the significance of the conditional distribution specification for the intrinsic galaxy properties when the data are generated from an unknown number of distinct galaxy populations with different morphological characteristics
Integrating distributed Bayesian inference and reinforcement learning for sensor management
Grappiolo, C.; Whiteson, S.; Pavlin, G.; Bakker, B.
2009-01-01
This paper introduces a sensor management approach that integrates distributed Bayesian inference (DBI) and reinforcement learning (RL). DBI is implemented using distributed perception networks (DPNs), a multiagent approach to performing efficient inference, while RL is used to automatically
Bayesian hierarchical modelling of North Atlantic windiness
Vanem, E.; Breivik, O. N.
2013-03-01
Extreme weather conditions represent serious natural hazards to ship operations and may be the direct cause or contributing factor to maritime accidents. Such severe environmental conditions can be taken into account in ship design and operational windows can be defined that limits hazardous operations to less extreme conditions. Nevertheless, possible changes in the statistics of extreme weather conditions, possibly due to anthropogenic climate change, represent an additional hazard to ship operations that is less straightforward to account for in a consistent way. Obviously, there are large uncertainties as to how future climate change will affect the extreme weather conditions at sea and there is a need for stochastic models that can describe the variability in both space and time at various scales of the environmental conditions. Previously, Bayesian hierarchical space-time models have been developed to describe the variability and complex dependence structures of significant wave height in space and time. These models were found to perform reasonably well and provided some interesting results, in particular, pertaining to long-term trends in the wave climate. In this paper, a similar framework is applied to oceanic windiness and the spatial and temporal variability of the 10-m wind speed over an area in the North Atlantic ocean is investigated. When the results from the model for North Atlantic windiness is compared to the results for significant wave height over the same area, it is interesting to observe that whereas an increasing trend in significant wave height was identified, no statistically significant long-term trend was estimated in windiness. This may indicate that the increase in significant wave height is not due to an increase in locally generated wind waves, but rather to increased swell. This observation is also consistent with studies that have suggested a poleward shift of the main storm tracks.
Bayesian hierarchical modelling of North Atlantic windiness
Directory of Open Access Journals (Sweden)
E. Vanem
2013-03-01
Full Text Available Extreme weather conditions represent serious natural hazards to ship operations and may be the direct cause or contributing factor to maritime accidents. Such severe environmental conditions can be taken into account in ship design and operational windows can be defined that limits hazardous operations to less extreme conditions. Nevertheless, possible changes in the statistics of extreme weather conditions, possibly due to anthropogenic climate change, represent an additional hazard to ship operations that is less straightforward to account for in a consistent way. Obviously, there are large uncertainties as to how future climate change will affect the extreme weather conditions at sea and there is a need for stochastic models that can describe the variability in both space and time at various scales of the environmental conditions. Previously, Bayesian hierarchical space-time models have been developed to describe the variability and complex dependence structures of significant wave height in space and time. These models were found to perform reasonably well and provided some interesting results, in particular, pertaining to long-term trends in the wave climate. In this paper, a similar framework is applied to oceanic windiness and the spatial and temporal variability of the 10-m wind speed over an area in the North Atlantic ocean is investigated. When the results from the model for North Atlantic windiness is compared to the results for significant wave height over the same area, it is interesting to observe that whereas an increasing trend in significant wave height was identified, no statistically significant long-term trend was estimated in windiness. This may indicate that the increase in significant wave height is not due to an increase in locally generated wind waves, but rather to increased swell. This observation is also consistent with studies that have suggested a poleward shift of the main storm tracks.
A Fast Iterative Bayesian Inference Algorithm for Sparse Channel Estimation
DEFF Research Database (Denmark)
Pedersen, Niels Lovmand; Manchón, Carles Navarro; Fleury, Bernard Henri
2013-01-01
representation of the Bessel K probability density function; a highly efficient, fast iterative Bayesian inference method is then applied to the proposed model. The resulting estimator outperforms other state-of-the-art Bayesian and non-Bayesian estimators, either by yielding lower mean squared estimation error...
Comparative Study of Inference Methods for Bayesian Nonnegative Matrix Factorisation
DEFF Research Database (Denmark)
Brouwer, Thomas; Frellsen, Jes; Liò, Pietro
2017-01-01
In this paper, we study the trade-offs of different inference approaches for Bayesian matrix factorisation methods, which are commonly used for predicting missing values, and for finding patterns in the data. In particular, we consider Bayesian nonnegative variants of matrix factorisation and tri......-factorisation, and compare non-probabilistic inference, Gibbs sampling, variational Bayesian inference, and a maximum-a-posteriori approach. The variational approach is new for the Bayesian nonnegative models. We compare their convergence, and robustness to noise and sparsity of the data, on both synthetic and real...
Prediction of road accidents: A Bayesian hierarchical approach.
Deublein, Markus; Schubert, Matthias; Adey, Bryan T; Köhler, Jochen; Faber, Michael H
2013-03-01
In this paper a novel methodology for the prediction of the occurrence of road accidents is presented. The methodology utilizes a combination of three statistical methods: (1) gamma-updating of the occurrence rates of injury accidents and injured road users, (2) hierarchical multivariate Poisson-lognormal regression analysis taking into account correlations amongst multiple dependent model response variables and effects of discrete accident count data e.g. over-dispersion, and (3) Bayesian inference algorithms, which are applied by means of data mining techniques supported by Bayesian Probabilistic Networks in order to represent non-linearity between risk indicating and model response variables, as well as different types of uncertainties which might be present in the development of the specific models. Prior Bayesian Probabilistic Networks are first established by means of multivariate regression analysis of the observed frequencies of the model response variables, e.g. the occurrence of an accident, and observed values of the risk indicating variables, e.g. degree of road curvature. Subsequently, parameter learning is done using updating algorithms, to determine the posterior predictive probability distributions of the model response variables, conditional on the values of the risk indicating variables. The methodology is illustrated through a case study using data of the Austrian rural motorway network. In the case study, on randomly selected road segments the methodology is used to produce a model to predict the expected number of accidents in which an injury has occurred and the expected number of light, severe and fatally injured road users. Additionally, the methodology is used for geo-referenced identification of road sections with increased occurrence probabilities of injury accident events on a road link between two Austrian cities. It is shown that the proposed methodology can be used to develop models to estimate the occurrence of road accidents for any
Yang, Jingjing; Cox, Dennis D; Lee, Jong Soo; Ren, Peng; Choi, Taeryon
2017-12-01
Functional data are defined as realizations of random functions (mostly smooth functions) varying over a continuum, which are usually collected on discretized grids with measurement errors. In order to accurately smooth noisy functional observations and deal with the issue of high-dimensional observation grids, we propose a novel Bayesian method based on the Bayesian hierarchical model with a Gaussian-Wishart process prior and basis function representations. We first derive an induced model for the basis-function coefficients of the functional data, and then use this model to conduct posterior inference through Markov chain Monte Carlo methods. Compared to the standard Bayesian inference that suffers serious computational burden and instability in analyzing high-dimensional functional data, our method greatly improves the computational scalability and stability, while inheriting the advantage of simultaneously smoothing raw observations and estimating the mean-covariance functions in a nonparametric way. In addition, our method can naturally handle functional data observed on random or uncommon grids. Simulation and real studies demonstrate that our method produces similar results to those obtainable by the standard Bayesian inference with low-dimensional common grids, while efficiently smoothing and estimating functional data with random and high-dimensional observation grids when the standard Bayesian inference fails. In conclusion, our method can efficiently smooth and estimate high-dimensional functional data, providing one way to resolve the curse of dimensionality for Bayesian functional data analysis with Gaussian-Wishart processes. © 2017, The International Biometric Society.
Analyzing thresholds and efficiency with hierarchical Bayesian logistic regression.
Houpt, Joseph W; Bittner, Jennifer L
2018-05-10
Ideal observer analysis is a fundamental tool used widely in vision science for analyzing the efficiency with which a cognitive or perceptual system uses available information. The performance of an ideal observer provides a formal measure of the amount of information in a given experiment. The ratio of human to ideal performance is then used to compute efficiency, a construct that can be directly compared across experimental conditions while controlling for the differences due to the stimuli and/or task specific demands. In previous research using ideal observer analysis, the effects of varying experimental conditions on efficiency have been tested using ANOVAs and pairwise comparisons. In this work, we present a model that combines Bayesian estimates of psychometric functions with hierarchical logistic regression for inference about both unadjusted human performance metrics and efficiencies. Our approach improves upon the existing methods by constraining the statistical analysis using a standard model connecting stimulus intensity to human observer accuracy and by accounting for variability in the estimates of human and ideal observer performance scores. This allows for both individual and group level inferences. Copyright © 2018 Elsevier Ltd. All rights reserved.
Towards Bayesian Inference of the Fast-Ion Distribution Function
DEFF Research Database (Denmark)
Stagner, L.; Heidbrink, W.W.; Salewski, Mirko
2012-01-01
sensitivity of the measurements are incorporated into Bayesian likelihood probabilities, while prior probabilities enforce physical constraints. As an initial step, this poster uses Bayesian statistics to infer the DIII-D electron density profile from multiple diagnostic measurements. Likelihood functions....... However, when theory and experiment disagree (for one or more diagnostics), it is unclear how to proceed. Bayesian statistics provides a framework to infer the DF, quantify errors, and reconcile discrepant diagnostic measurements. Diagnostic errors and ``weight functions" that describe the phase space...
Quantum-Like Representation of Non-Bayesian Inference
Asano, M.; Basieva, I.; Khrennikov, A.; Ohya, M.; Tanaka, Y.
2013-01-01
This research is related to the problem of "irrational decision making or inference" that have been discussed in cognitive psychology. There are some experimental studies, and these statistical data cannot be described by classical probability theory. The process of decision making generating these data cannot be reduced to the classical Bayesian inference. For this problem, a number of quantum-like coginitive models of decision making was proposed. Our previous work represented in a natural way the classical Bayesian inference in the frame work of quantum mechanics. By using this representation, in this paper, we try to discuss the non-Bayesian (irrational) inference that is biased by effects like the quantum interference. Further, we describe "psychological factor" disturbing "rationality" as an "environment" correlating with the "main system" of usual Bayesian inference.
Fully probabilistic design of hierarchical Bayesian models
Czech Academy of Sciences Publication Activity Database
Quinn, A.; Kárný, Miroslav; Guy, Tatiana Valentine
2016-01-01
Roč. 369, č. 1 (2016), s. 532-547 ISSN 0020-0255 R&D Projects: GA ČR GA13-13502S Institutional support: RVO:67985556 Keywords : Fully probabilistic design * Ideal distribution * Minimum cross-entropy principle * Bayesian conditioning * Kullback-Leibler divergence * Bayesian nonparametric modelling Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 4.832, year: 2016 http://library.utia.cas.cz/separaty/2016/AS/karny-0463052.pdf
Universal Darwinism As a Process of Bayesian Inference.
Campbell, John O
2016-01-01
Many of the mathematical frameworks describing natural selection are equivalent to Bayes' Theorem, also known as Bayesian updating. By definition, a process of Bayesian Inference is one which involves a Bayesian update, so we may conclude that these frameworks describe natural selection as a process of Bayesian inference. Thus, natural selection serves as a counter example to a widely-held interpretation that restricts Bayesian Inference to human mental processes (including the endeavors of statisticians). As Bayesian inference can always be cast in terms of (variational) free energy minimization, natural selection can be viewed as comprising two components: a generative model of an "experiment" in the external world environment, and the results of that "experiment" or the "surprise" entailed by predicted and actual outcomes of the "experiment." Minimization of free energy implies that the implicit measure of "surprise" experienced serves to update the generative model in a Bayesian manner. This description closely accords with the mechanisms of generalized Darwinian process proposed both by Dawkins, in terms of replicators and vehicles, and Campbell, in terms of inferential systems. Bayesian inference is an algorithm for the accumulation of evidence-based knowledge. This algorithm is now seen to operate over a wide range of evolutionary processes, including natural selection, the evolution of mental models and cultural evolutionary processes, notably including science itself. The variational principle of free energy minimization may thus serve as a unifying mathematical framework for universal Darwinism, the study of evolutionary processes operating throughout nature.
Universal Darwinism as a process of Bayesian inference
Directory of Open Access Journals (Sweden)
John Oberon Campbell
2016-06-01
Full Text Available Many of the mathematical frameworks describing natural selection are equivalent to Bayes’ Theorem, also known as Bayesian updating. By definition, a process of Bayesian Inference is one which involves a Bayesian update, so we may conclude that these frameworks describe natural selection as a process of Bayesian inference. Thus natural selection serves as a counter example to a widely-held interpretation that restricts Bayesian Inference to human mental processes (including the endeavors of statisticians. As Bayesian inference can always be cast in terms of (variational free energy minimization, natural selection can be viewed as comprising two components: a generative model of an ‘experiment’ in the external world environment, and the results of that 'experiment' or the 'surprise' entailed by predicted and actual outcomes of the ‘experiment’. Minimization of free energy implies that the implicit measure of 'surprise' experienced serves to update the generative model in a Bayesian manner. This description closely accords with the mechanisms of generalized Darwinian process proposed both by Dawkins, in terms of replicators and vehicles, and Campbell, in terms of inferential systems. Bayesian inference is an algorithm for the accumulation of evidence-based knowledge. This algorithm is now seen to operate over a wide range of evolutionary processes, including natural selection, the evolution of mental models and cultural evolutionary processes, notably including science itself. The variational principle of free energy minimization may thus serve as a unifying mathematical framework for universal Darwinism, the study of evolutionary processes operating throughout nature.
Bayesian inference of radiation belt loss timescales.
Camporeale, E.; Chandorkar, M.
2017-12-01
Electron fluxes in the Earth's radiation belts are routinely studied using the classical quasi-linear radial diffusion model. Although this simplified linear equation has proven to be an indispensable tool in understanding the dynamics of the radiation belt, it requires specification of quantities such as the diffusion coefficient and electron loss timescales that are never directly measured. Researchers have so far assumed a-priori parameterisations for radiation belt quantities and derived the best fit using satellite data. The state of the art in this domain lacks a coherent formulation of this problem in a probabilistic framework. We present some recent progress that we have made in performing Bayesian inference of radial diffusion parameters. We achieve this by making extensive use of the theory connecting Gaussian Processes and linear partial differential equations, and performing Markov Chain Monte Carlo sampling of radial diffusion parameters. These results are important for understanding the role and the propagation of uncertainties in radiation belt simulations and, eventually, for providing a probabilistic forecast of energetic electron fluxes in a Space Weather context.
Bayesian Inference for Linear Parabolic PDEs with Noisy Boundary Conditions
Ruggeri, Fabrizio; Sawlan, Zaid A; Scavino, Marco; Tempone, Raul
2016-01-01
In this work we develop a hierarchical Bayesian setting to infer unknown parameters in initial-boundary value problems (IBVPs) for one-dimensional linear parabolic partial differential equations. Noisy boundary data and known initial condition are assumed. We derive the likelihood function associated with the forward problem, given some measurements of the solution field subject to Gaussian noise. Such function is then analytically marginalized using the linearity of the equation. Gaussian priors have been assumed for the time-dependent Dirichlet boundary values. Our approach is applied to synthetic data for the one-dimensional heat equation model, where the thermal diffusivity is the unknown parameter. We show how to infer the thermal diffusivity parameter when its prior distribution is lognormal or modeled by means of a space-dependent stationary lognormal random field. We use the Laplace method to provide approximated Gaussian posterior distributions for the thermal diffusivity. Expected information gains and predictive posterior densities for observable quantities are numerically estimated for different experimental setups.
Bayesian Inference for Linear Parabolic PDEs with Noisy Boundary Conditions
Ruggeri, Fabrizio
2015-01-07
In this work we develop a hierarchical Bayesian setting to infer unknown parameters in initial-boundary value problems (IBVPs) for one-dimensional linear parabolic partial differential equations. Noisy boundary data and known initial condition are assumed. We derive the likelihood function associated with the forward problem, given some measurements of the solution field subject to Gaussian noise. Such function is then analytically marginalized using the linearity of the equation. Gaussian priors have been assumed for the time-dependent Dirichlet boundary values. Our approach is applied to synthetic data for the one-dimensional heat equation model, where the thermal diffusivity is the unknown parameter. We show how to infer the thermal diffusivity parameter when its prior distribution is lognormal or modeled by means of a space-dependent stationary lognormal random field. We use the Laplace method to provide approximated Gaussian posterior distributions for the thermal diffusivity. Expected information gains and predictive posterior densities for observable quantities are numerically estimated for different experimental setups.
Bayesian Inference for Linear Parabolic PDEs with Noisy Boundary Conditions
Ruggeri, Fabrizio
2016-01-06
In this work we develop a hierarchical Bayesian setting to infer unknown parameters in initial-boundary value problems (IBVPs) for one-dimensional linear parabolic partial differential equations. Noisy boundary data and known initial condition are assumed. We derive the likelihood function associated with the forward problem, given some measurements of the solution field subject to Gaussian noise. Such function is then analytically marginalized using the linearity of the equation. Gaussian priors have been assumed for the time-dependent Dirichlet boundary values. Our approach is applied to synthetic data for the one-dimensional heat equation model, where the thermal diffusivity is the unknown parameter. We show how to infer the thermal diffusivity parameter when its prior distribution is lognormal or modeled by means of a space-dependent stationary lognormal random field. We use the Laplace method to provide approximated Gaussian posterior distributions for the thermal diffusivity. Expected information gains and predictive posterior densities for observable quantities are numerically estimated for different experimental setups.
Bayesian inference of chemical kinetic models from proposed reactions
Galagali, Nikhil; Marzouk, Youssef M.
2015-01-01
© 2014 Elsevier Ltd. Bayesian inference provides a natural framework for combining experimental data with prior knowledge to develop chemical kinetic models and quantify the associated uncertainties, not only in parameter values but also in model
Efficient design and inference in distributed Bayesian networks: an overview
de Oude, P.; Groen, F.C.A.; Pavlin, G.; Bezhanishvili, N.; Löbner, S.; Schwabe, K.; Spada, L.
2011-01-01
This paper discusses an approach to distributed Bayesian modeling and inference, which is relevant for an important class of contemporary real world situation assessment applications. By explicitly considering the locality of causal relations, the presented approach (i) supports coherent distributed
Bayesian Information Criterion as an Alternative way of Statistical Inference
Directory of Open Access Journals (Sweden)
Nadejda Yu. Gubanova
2012-05-01
Full Text Available The article treats Bayesian information criterion as an alternative to traditional methods of statistical inference, based on NHST. The comparison of ANOVA and BIC results for psychological experiment is discussed.
On Bayesian Inference under Sampling from Scale Mixtures of Normals
Fernández, C.; Steel, M.F.J.
1996-01-01
This paper considers a Bayesian analysis of the linear regression model under independent sampling from general scale mixtures of Normals.Using a common reference prior, we investigate the validity of Bayesian inference and the existence of posterior moments of the regression and precision
Hierarchical Bayesian sparse image reconstruction with application to MRFM.
Dobigeon, Nicolas; Hero, Alfred O; Tourneret, Jean-Yves
2009-09-01
This paper presents a hierarchical Bayesian model to reconstruct sparse images when the observations are obtained from linear transformations and corrupted by an additive white Gaussian noise. Our hierarchical Bayes model is well suited to such naturally sparse image applications as it seamlessly accounts for properties such as sparsity and positivity of the image via appropriate Bayes priors. We propose a prior that is based on a weighted mixture of a positive exponential distribution and a mass at zero. The prior has hyperparameters that are tuned automatically by marginalization over the hierarchical Bayesian model. To overcome the complexity of the posterior distribution, a Gibbs sampling strategy is proposed. The Gibbs samples can be used to estimate the image to be recovered, e.g., by maximizing the estimated posterior distribution. In our fully Bayesian approach, the posteriors of all the parameters are available. Thus, our algorithm provides more information than other previously proposed sparse reconstruction methods that only give a point estimate. The performance of the proposed hierarchical Bayesian sparse reconstruction method is illustrated on synthetic data and real data collected from a tobacco virus sample using a prototype MRFM instrument.
Advances in Applications of Hierarchical Bayesian Methods with Hydrological Models
Alexander, R. B.; Schwarz, G. E.; Boyer, E. W.
2017-12-01
Mechanistic and empirical watershed models are increasingly used to inform water resource decisions. Growing access to historical stream measurements and data from in-situ sensor technologies has increased the need for improved techniques for coupling models with hydrological measurements. Techniques that account for the intrinsic uncertainties of both models and measurements are especially needed. Hierarchical Bayesian methods provide an efficient modeling tool for quantifying model and prediction uncertainties, including those associated with measurements. Hierarchical methods can also be used to explore spatial and temporal variations in model parameters and uncertainties that are informed by hydrological measurements. We used hierarchical Bayesian methods to develop a hybrid (statistical-mechanistic) SPARROW (SPAtially Referenced Regression On Watershed attributes) model of long-term mean annual streamflow across diverse environmental and climatic drainages in 18 U.S. hydrological regions. Our application illustrates the use of a new generation of Bayesian methods that offer more advanced computational efficiencies than the prior generation. Evaluations of the effects of hierarchical (regional) variations in model coefficients and uncertainties on model accuracy indicates improved prediction accuracies (median of 10-50%) but primarily in humid eastern regions, where model uncertainties are one-third of those in arid western regions. Generally moderate regional variability is observed for most hierarchical coefficients. Accounting for measurement and structural uncertainties, using hierarchical state-space techniques, revealed the effects of spatially-heterogeneous, latent hydrological processes in the "localized" drainages between calibration sites; this improved model precision, with only minor changes in regional coefficients. Our study can inform advances in the use of hierarchical methods with hydrological models to improve their integration with stream
Bayesian Inference and Online Learning in Poisson Neuronal Networks.
Huang, Yanping; Rao, Rajesh P N
2016-08-01
Motivated by the growing evidence for Bayesian computation in the brain, we show how a two-layer recurrent network of Poisson neurons can perform both approximate Bayesian inference and learning for any hidden Markov model. The lower-layer sensory neurons receive noisy measurements of hidden world states. The higher-layer neurons infer a posterior distribution over world states via Bayesian inference from inputs generated by sensory neurons. We demonstrate how such a neuronal network with synaptic plasticity can implement a form of Bayesian inference similar to Monte Carlo methods such as particle filtering. Each spike in a higher-layer neuron represents a sample of a particular hidden world state. The spiking activity across the neural population approximates the posterior distribution over hidden states. In this model, variability in spiking is regarded not as a nuisance but as an integral feature that provides the variability necessary for sampling during inference. We demonstrate how the network can learn the likelihood model, as well as the transition probabilities underlying the dynamics, using a Hebbian learning rule. We present results illustrating the ability of the network to perform inference and learning for arbitrary hidden Markov models.
Bayesian hierarchical model for large-scale covariance matrix estimation.
Zhu, Dongxiao; Hero, Alfred O
2007-12-01
Many bioinformatics problems implicitly depend on estimating large-scale covariance matrix. The traditional approaches tend to give rise to high variance and low accuracy due to "overfitting." We cast the large-scale covariance matrix estimation problem into the Bayesian hierarchical model framework, and introduce dependency between covariance parameters. We demonstrate the advantages of our approaches over the traditional approaches using simulations and OMICS data analysis.
Progress on Bayesian Inference of the Fast Ion Distribution Function
DEFF Research Database (Denmark)
Stagner, L.; Heidbrink, W.W,; Chen, X.
2013-01-01
. However, when theory and experiment disagree (for one or more diagnostics), it is unclear how to proceed. Bayesian statistics provides a framework to infer the DF, quantify errors, and reconcile discrepant diagnostic measurements. Diagnostic errors and weight functions that describe the phase space...... sensitivity of the measurements are incorporated into Bayesian likelihood probabilities. Prior probabilities describe physical constraints. This poster will show reconstructions of classically described, low-power, MHD-quiescent distribution functions from actual FIDA measurements. A description of the full...
Robust bayesian inference of generalized Pareto distribution ...
African Journals Online (AJOL)
En utilisant une etude exhaustive de Monte Carlo, nous prouvons que, moyennant une fonction perte generalisee adequate, on peut construire un estimateur Bayesien robuste du modele. Key words: Bayesian estimation; Extreme value; Generalized Fisher information; Gener- alized Pareto distribution; Monte Carlo; ...
A Bayesian nonparametric approach to causal inference on quantiles.
Xu, Dandan; Daniels, Michael J; Winterstein, Almut G
2018-02-25
We propose a Bayesian nonparametric approach (BNP) for causal inference on quantiles in the presence of many confounders. In particular, we define relevant causal quantities and specify BNP models to avoid bias from restrictive parametric assumptions. We first use Bayesian additive regression trees (BART) to model the propensity score and then construct the distribution of potential outcomes given the propensity score using a Dirichlet process mixture (DPM) of normals model. We thoroughly evaluate the operating characteristics of our approach and compare it to Bayesian and frequentist competitors. We use our approach to answer an important clinical question involving acute kidney injury using electronic health records. © 2018, The International Biometric Society.
Royle, J. Andrew; Dorazio, Robert M.
2008-01-01
A guide to data collection, modeling and inference strategies for biological survey data using Bayesian and classical statistical methods. This book describes a general and flexible framework for modeling and inference in ecological systems based on hierarchical models, with a strict focus on the use of probability models and parametric inference. Hierarchical models represent a paradigm shift in the application of statistics to ecological inference problems because they combine explicit models of ecological system structure or dynamics with models of how ecological systems are observed. The principles of hierarchical modeling are developed and applied to problems in population, metapopulation, community, and metacommunity systems. The book provides the first synthetic treatment of many recent methodological advances in ecological modeling and unifies disparate methods and procedures. The authors apply principles of hierarchical modeling to ecological problems, including * occurrence or occupancy models for estimating species distribution * abundance models based on many sampling protocols, including distance sampling * capture-recapture models with individual effects * spatial capture-recapture models based on camera trapping and related methods * population and metapopulation dynamic models * models of biodiversity, community structure and dynamics.
Neuronal integration of dynamic sources: Bayesian learning and Bayesian inference.
Siegelmann, Hava T; Holzman, Lars E
2010-09-01
One of the brain's most basic functions is integrating sensory data from diverse sources. This ability causes us to question whether the neural system is computationally capable of intelligently integrating data, not only when sources have known, fixed relative dependencies but also when it must determine such relative weightings based on dynamic conditions, and then use these learned weightings to accurately infer information about the world. We suggest that the brain is, in fact, fully capable of computing this parallel task in a single network and describe a neural inspired circuit with this property. Our implementation suggests the possibility that evidence learning requires a more complex organization of the network than was previously assumed, where neurons have different specialties, whose emergence brings the desired adaptivity seen in human online inference.
A Bayesian hierarchical approach to comparative audit for carotid surgery.
Kuhan, G; Marshall, E C; Abidia, A F; Chetter, I C; McCollum, P T
2002-12-01
the aim of this study was to illustrate how a Bayesian hierarchical modelling approach can aid the reliable comparison of outcome rates between surgeons. retrospective analysis of prospective and retrospective data. binary outcome data (death/stroke within 30 days), together with information on 15 possible risk factors specific for CEA were available on 836 CEAs performed by four vascular surgeons from 1992-99. The median patient age was 68 (range 38-86) years and 60% were men. the model was developed using the WinBUGS software. After adjusting for patient-level risk factors, a cross-validatory approach was adopted to identify "divergent" performance. A ranking exercise was also carried out. the overall observed 30-day stroke/death rate was 3.9% (33/836). The model found diabetes, stroke and heart disease to be significant risk factors. There was no significant difference between the predicted and observed outcome rates for any surgeon (Bayesian p -value>0.05). Each surgeon had a median rank of 3 with associated 95% CI 1.0-5.0, despite the variability of observed stroke/death rate from 2.9-4.4%. After risk adjustment, there was very little residual between-surgeon variability in outcome rate. Bayesian hierarchical models can help to accurately quantify the uncertainty associated with surgeons' performance and rank.
Bayesian inference with information content model check for Langevin equations
DEFF Research Database (Denmark)
Krog, Jens F. C.; Lomholt, Michael Andersen
2017-01-01
The Bayesian data analysis framework has been proven to be a systematic and effective method of parameter inference and model selection for stochastic processes. In this work we introduce an information content model check which may serve as a goodness-of-fit, like the chi-square procedure...
Bayesian inference model for fatigue life of laminated composites
DEFF Research Database (Denmark)
Dimitrov, Nikolay Krasimirov; Kiureghian, Armen Der; Berggreen, Christian
2016-01-01
A probabilistic model for estimating the fatigue life of laminated composite plates is developed. The model is based on lamina-level input data, making it possible to predict fatigue properties for a wide range of laminate configurations. Model parameters are estimated by Bayesian inference. The ...
Stan: A Probabilistic Programming Language for Bayesian Inference and Optimization
Gelman, Andrew; Lee, Daniel; Guo, Jiqiang
2015-01-01
Stan is a free and open-source C++ program that performs Bayesian inference or optimization for arbitrary user-specified models and can be called from the command line, R, Python, Matlab, or Julia and has great promise for fitting large and complex statistical models in many areas of application. We discuss Stan from users' and developers'…
Road network safety evaluation using Bayesian hierarchical joint model.
Wang, Jie; Huang, Helai
2016-05-01
Safety and efficiency are commonly regarded as two significant performance indicators of transportation systems. In practice, road network planning has focused on road capacity and transport efficiency whereas the safety level of a road network has received little attention in the planning stage. This study develops a Bayesian hierarchical joint model for road network safety evaluation to help planners take traffic safety into account when planning a road network. The proposed model establishes relationships between road network risk and micro-level variables related to road entities and traffic volume, as well as socioeconomic, trip generation and network density variables at macro level which are generally used for long term transportation plans. In addition, network spatial correlation between intersections and their connected road segments is also considered in the model. A road network is elaborately selected in order to compare the proposed hierarchical joint model with a previous joint model and a negative binomial model. According to the results of the model comparison, the hierarchical joint model outperforms the joint model and negative binomial model in terms of the goodness-of-fit and predictive performance, which indicates the reasonableness of considering the hierarchical data structure in crash prediction and analysis. Moreover, both random effects at the TAZ level and the spatial correlation between intersections and their adjacent segments are found to be significant, supporting the employment of the hierarchical joint model as an alternative in road-network-level safety modeling as well. Copyright © 2016 Elsevier Ltd. All rights reserved.
Bayesian inference of substrate properties from film behavior
International Nuclear Information System (INIS)
Aggarwal, R; Demkowicz, M J; Marzouk, Y M
2015-01-01
We demonstrate that by observing the behavior of a film deposited on a substrate, certain features of the substrate may be inferred with quantified uncertainty using Bayesian methods. We carry out this demonstration on an illustrative film/substrate model where the substrate is a Gaussian random field and the film is a two-component mixture that obeys the Cahn–Hilliard equation. We construct a stochastic reduced order model to describe the film/substrate interaction and use it to infer substrate properties from film behavior. This quantitative inference strategy may be adapted to other film/substrate systems. (paper)
Constructive Epistemic Modeling: A Hierarchical Bayesian Model Averaging Method
Tsai, F. T. C.; Elshall, A. S.
2014-12-01
Constructive epistemic modeling is the idea that our understanding of a natural system through a scientific model is a mental construct that continually develops through learning about and from the model. Using the hierarchical Bayesian model averaging (HBMA) method [1], this study shows that segregating different uncertain model components through a BMA tree of posterior model probabilities, model prediction, within-model variance, between-model variance and total model variance serves as a learning tool [2]. First, the BMA tree of posterior model probabilities permits the comparative evaluation of the candidate propositions of each uncertain model component. Second, systemic model dissection is imperative for understanding the individual contribution of each uncertain model component to the model prediction and variance. Third, the hierarchical representation of the between-model variance facilitates the prioritization of the contribution of each uncertain model component to the overall model uncertainty. We illustrate these concepts using the groundwater modeling of a siliciclastic aquifer-fault system. The sources of uncertainty considered are from geological architecture, formation dip, boundary conditions and model parameters. The study shows that the HBMA analysis helps in advancing knowledge about the model rather than forcing the model to fit a particularly understanding or merely averaging several candidate models. [1] Tsai, F. T.-C., and A. S. Elshall (2013), Hierarchical Bayesian model averaging for hydrostratigraphic modeling: Uncertainty segregation and comparative evaluation. Water Resources Research, 49, 5520-5536, doi:10.1002/wrcr.20428. [2] Elshall, A.S., and F. T.-C. Tsai (2014). Constructive epistemic modeling of groundwater flow with geological architecture and boundary condition uncertainty under Bayesian paradigm, Journal of Hydrology, 517, 105-119, doi: 10.1016/j.jhydrol.2014.05.027.
Variational Bayesian Inference of Line Spectra
DEFF Research Database (Denmark)
Badiu, Mihai Alin; Hansen, Thomas Lundgaard; Fleury, Bernard Henri
2017-01-01
parameters. We propose an accurate representation of the pdfs of the frequencies by mixtures of von Mises pdfs, which yields closed-form expectations. We define the algorithm VALSE in which the estimates of the pdfs and parameters are iteratively updated. VALSE is a gridless, convergent method, does......; and the coefficients are governed by a Bernoulli-Gaussian prior model turning model order selection into binary sequence detection. Unlike earlier works which retain only point estimates of the frequencies, we undertake a more complete Bayesian treatment by estimating the posterior probability density functions (pdfs......) of the frequencies and computing expectations over them. Thus, we additionally capture and operate with the uncertainty of the frequency estimates. Aiming to maximize the model evidence, variational optimization provides analytic approximations of the posterior pdfs and also gives estimates of the additional...
Hierarchical Bayesian spatial models for multispecies conservation planning and monitoring.
Carroll, Carlos; Johnson, Devin S; Dunk, Jeffrey R; Zielinski, William J
2010-12-01
Biologists who develop and apply habitat models are often familiar with the statistical challenges posed by their data's spatial structure but are unsure of whether the use of complex spatial models will increase the utility of model results in planning. We compared the relative performance of nonspatial and hierarchical Bayesian spatial models for three vertebrate and invertebrate taxa of conservation concern (Church's sideband snails [Monadenia churchi], red tree voles [Arborimus longicaudus], and Pacific fishers [Martes pennanti pacifica]) that provide examples of a range of distributional extents and dispersal abilities. We used presence-absence data derived from regional monitoring programs to develop models with both landscape and site-level environmental covariates. We used Markov chain Monte Carlo algorithms and a conditional autoregressive or intrinsic conditional autoregressive model framework to fit spatial models. The fit of Bayesian spatial models was between 35 and 55% better than the fit of nonspatial analogue models. Bayesian spatial models outperformed analogous models developed with maximum entropy (Maxent) methods. Although the best spatial and nonspatial models included similar environmental variables, spatial models provided estimates of residual spatial effects that suggested how ecological processes might structure distribution patterns. Spatial models built from presence-absence data improved fit most for localized endemic species with ranges constrained by poorly known biogeographic factors and for widely distributed species suspected to be strongly affected by unmeasured environmental variables or population processes. By treating spatial effects as a variable of interest rather than a nuisance, hierarchical Bayesian spatial models, especially when they are based on a common broad-scale spatial lattice (here the national Forest Inventory and Analysis grid of 24 km(2) hexagons), can increase the relevance of habitat models to multispecies
Inferring hierarchical clustering structures by deterministic annealing
International Nuclear Information System (INIS)
Hofmann, T.; Buhmann, J.M.
1996-01-01
The unsupervised detection of hierarchical structures is a major topic in unsupervised learning and one of the key questions in data analysis and representation. We propose a novel algorithm for the problem of learning decision trees for data clustering and related problems. In contrast to many other methods based on successive tree growing and pruning, we propose an objective function for tree evaluation and we derive a non-greedy technique for tree growing. Applying the principles of maximum entropy and minimum cross entropy, a deterministic annealing algorithm is derived in a meanfield approximation. This technique allows us to canonically superimpose tree structures and to fit parameters to averaged or open-quote fuzzified close-quote trees
HDDM: Hierarchical Bayesian estimation of the Drift-Diffusion Model in Python.
Wiecki, Thomas V; Sofer, Imri; Frank, Michael J
2013-01-01
The diffusion model is a commonly used tool to infer latent psychological processes underlying decision-making, and to link them to neural mechanisms based on response times. Although efficient open source software has been made available to quantitatively fit the model to data, current estimation methods require an abundance of response time measurements to recover meaningful parameters, and only provide point estimates of each parameter. In contrast, hierarchical Bayesian parameter estimation methods are useful for enhancing statistical power, allowing for simultaneous estimation of individual subject parameters and the group distribution that they are drawn from, while also providing measures of uncertainty in these parameters in the posterior distribution. Here, we present a novel Python-based toolbox called HDDM (hierarchical drift diffusion model), which allows fast and flexible estimation of the the drift-diffusion model and the related linear ballistic accumulator model. HDDM requires fewer data per subject/condition than non-hierarchical methods, allows for full Bayesian data analysis, and can handle outliers in the data. Finally, HDDM supports the estimation of how trial-by-trial measurements (e.g., fMRI) influence decision-making parameters. This paper will first describe the theoretical background of the drift diffusion model and Bayesian inference. We then illustrate usage of the toolbox on a real-world data set from our lab. Finally, parameter recovery studies show that HDDM beats alternative fitting methods like the χ(2)-quantile method as well as maximum likelihood estimation. The software and documentation can be downloaded at: http://ski.clps.brown.edu/hddm_docs/
HDDM: Hierarchical Bayesian estimation of the Drift-Diffusion Model in Python
Directory of Open Access Journals (Sweden)
Thomas V Wiecki
2013-08-01
Full Text Available The diffusion model is a commonly used tool to infer latent psychological processes underlying decision making, and to link them to neural mechanisms based on reaction times. Although efficient open source software has been made available to quantitatively fit the model to data, current estimation methods require an abundance of reaction time measurements to recover meaningful parameters, and only provide point estimates of each parameter. In contrast, hierarchical Bayesian parameter estimation methods are useful for enhancing statistical power, allowing for simultaneous estimation of individual subject parameters and the group distribution that they are drawn from, while also providing measures of uncertainty in these parameters in the posterior distribution. Here, we present a novel Python-based toolbox called HDDM (hierarchical drift diffusion model, which allows fast and flexible estimation of the the drift-diffusion model and the related linear ballistic accumulator model. HDDM requires fewer data per subject / condition than non-hierarchical method, allows for full Bayesian data analysis, and can handle outliers in the data. Finally, HDDM supports the estimation of how trial-by-trial measurements (e.g. fMRI influence decision making parameters. This paper will first describe the theoretical background of drift-diffusion model and Bayesian inference. We then illustrate usage of the toolbox on a real-world data set from our lab. Finally, parameter recovery studies show that HDDM beats alternative fitting methods like the chi-quantile method as well as maximum likelihood estimation. The software and documentation can be downloaded at: http://ski.clps.brown.edu/hddm_docs
Bayesian inference of chemical kinetic models from proposed reactions
Galagali, Nikhil
2015-02-01
© 2014 Elsevier Ltd. Bayesian inference provides a natural framework for combining experimental data with prior knowledge to develop chemical kinetic models and quantify the associated uncertainties, not only in parameter values but also in model structure. Most existing applications of Bayesian model selection methods to chemical kinetics have been limited to comparisons among a small set of models, however. The significant computational cost of evaluating posterior model probabilities renders traditional Bayesian methods infeasible when the model space becomes large. We present a new framework for tractable Bayesian model inference and uncertainty quantification using a large number of systematically generated model hypotheses. The approach involves imposing point-mass mixture priors over rate constants and exploring the resulting posterior distribution using an adaptive Markov chain Monte Carlo method. The posterior samples are used to identify plausible models, to quantify rate constant uncertainties, and to extract key diagnostic information about model structure-such as the reactions and operating pathways most strongly supported by the data. We provide numerical demonstrations of the proposed framework by inferring kinetic models for catalytic steam and dry reforming of methane using available experimental data.
International Nuclear Information System (INIS)
Ma Xiang; Zabaras, Nicholas
2009-01-01
A new approach to modeling inverse problems using a Bayesian inference method is introduced. The Bayesian approach considers the unknown parameters as random variables and seeks the probabilistic distribution of the unknowns. By introducing the concept of the stochastic prior state space to the Bayesian formulation, we reformulate the deterministic forward problem as a stochastic one. The adaptive hierarchical sparse grid collocation (ASGC) method is used for constructing an interpolant to the solution of the forward model in this prior space which is large enough to capture all the variability/uncertainty in the posterior distribution of the unknown parameters. This solution can be considered as a function of the random unknowns and serves as a stochastic surrogate model for the likelihood calculation. Hierarchical Bayesian formulation is used to derive the posterior probability density function (PPDF). The spatial model is represented as a convolution of a smooth kernel and a Markov random field. The state space of the PPDF is explored using Markov chain Monte Carlo algorithms to obtain statistics of the unknowns. The likelihood calculation is performed by directly sampling the approximate stochastic solution obtained through the ASGC method. The technique is assessed on two nonlinear inverse problems: source inversion and permeability estimation in flow through porous media
Yau, Christopher; Holmes, Chris
2011-07-01
We propose a hierarchical Bayesian nonparametric mixture model for clustering when some of the covariates are assumed to be of varying relevance to the clustering problem. This can be thought of as an issue in variable selection for unsupervised learning. We demonstrate that by defining a hierarchical population based nonparametric prior on the cluster locations scaled by the inverse covariance matrices of the likelihood we arrive at a 'sparsity prior' representation which admits a conditionally conjugate prior. This allows us to perform full Gibbs sampling to obtain posterior distributions over parameters of interest including an explicit measure of each covariate's relevance and a distribution over the number of potential clusters present in the data. This also allows for individual cluster specific variable selection. We demonstrate improved inference on a number of canonical problems.
Bayesian Inference of High-Dimensional Dynamical Ocean Models
Lin, J.; Lermusiaux, P. F. J.; Lolla, S. V. T.; Gupta, A.; Haley, P. J., Jr.
2015-12-01
This presentation addresses a holistic set of challenges in high-dimension ocean Bayesian nonlinear estimation: i) predict the probability distribution functions (pdfs) of large nonlinear dynamical systems using stochastic partial differential equations (PDEs); ii) assimilate data using Bayes' law with these pdfs; iii) predict the future data that optimally reduce uncertainties; and (iv) rank the known and learn the new model formulations themselves. Overall, we allow the joint inference of the state, equations, geometry, boundary conditions and initial conditions of dynamical models. Examples are provided for time-dependent fluid and ocean flows, including cavity, double-gyre and Strait flows with jets and eddies. The Bayesian model inference, based on limited observations, is illustrated first by the estimation of obstacle shapes and positions in fluid flows. Next, the Bayesian inference of biogeochemical reaction equations and of their states and parameters is presented, illustrating how PDE-based machine learning can rigorously guide the selection and discovery of complex ecosystem models. Finally, the inference of multiscale bottom gravity current dynamics is illustrated, motivated in part by classic overflows and dense water formation sites and their relevance to climate monitoring and dynamics. This is joint work with our MSEAS group at MIT.
Bayesian inference data evaluation and decisions
Harney, Hanns Ludwig
2016-01-01
This new edition offers a comprehensive introduction to the analysis of data using Bayes rule. It generalizes Gaussian error intervals to situations in which the data follow distributions other than Gaussian. This is particularly useful when the observed parameter is barely above the background or the histogram of multiparametric data contains many empty bins, so that the determination of the validity of a theory cannot be based on the chi-squared-criterion. In addition to the solutions of practical problems, this approach provides an epistemic insight: the logic of quantum mechanics is obtained as the logic of unbiased inference from counting data. New sections feature factorizing parameters, commuting parameters, observables in quantum mechanics, the art of fitting with coherent and with incoherent alternatives and fitting with multinomial distribution. Additional problems and examples help deepen the knowledge. Requiring no knowledge of quantum mechanics, the book is written on introductory level, with man...
Bayesian inference and updating of reliability data
International Nuclear Information System (INIS)
Sabri, Z.A.; Cullingford, M.C.; David, H.T.; Husseiny, A.A.
1980-01-01
A Bayes methodology for inference of reliability values using available but scarce current data is discussed. The method can be used to update failure rates as more information becomes available from field experience, assuming that the performance of a given component (or system) exhibits a nonhomogeneous Poisson process. Bayes' theorem is used to summarize the historical evidence and current component data in the form of a posterior distribution suitable for prediction and for smoothing or interpolation. An example is given. It may be appropriate to apply the methodology developed here to human error data, in which case the exponential model might be used to describe the learning behavior of the operator or maintenance crew personnel
Evolution in Mind: Evolutionary Dynamics, Cognitive Processes, and Bayesian Inference.
Suchow, Jordan W; Bourgin, David D; Griffiths, Thomas L
2017-07-01
Evolutionary theory describes the dynamics of population change in settings affected by reproduction, selection, mutation, and drift. In the context of human cognition, evolutionary theory is most often invoked to explain the origins of capacities such as language, metacognition, and spatial reasoning, framing them as functional adaptations to an ancestral environment. However, evolutionary theory is useful for understanding the mind in a second way: as a mathematical framework for describing evolving populations of thoughts, ideas, and memories within a single mind. In fact, deep correspondences exist between the mathematics of evolution and of learning, with perhaps the deepest being an equivalence between certain evolutionary dynamics and Bayesian inference. This equivalence permits reinterpretation of evolutionary processes as algorithms for Bayesian inference and has relevance for understanding diverse cognitive capacities, including memory and creativity. Copyright © 2017 Elsevier Ltd. All rights reserved.
Application of Bayesian inference to stochastic analytic continuation
International Nuclear Information System (INIS)
Fuchs, S; Pruschke, T; Jarrell, M
2010-01-01
We present an algorithm for the analytic continuation of imaginary-time quantum Monte Carlo data. The algorithm is strictly based on principles of Bayesian statistical inference. It utilizes Monte Carlo simulations to calculate a weighted average of possible energy spectra. We apply the algorithm to imaginary-time quantum Monte Carlo data and compare the resulting energy spectra with those from a standard maximum entropy calculation.
Practical Statistics for LHC Physicists: Bayesian Inference (3/3)
CERN. Geneva
2015-01-01
These lectures cover those principles and practices of statistics that are most relevant for work at the LHC. The first lecture discusses the basic ideas of descriptive statistics, probability and likelihood. The second lecture covers the key ideas in the frequentist approach, including confidence limits, profile likelihoods, p-values, and hypothesis testing. The third lecture covers inference in the Bayesian approach. Throughout, real-world examples will be used to illustrate the practical application of the ideas. No previous knowledge is assumed.
Bayesian techniques for fatigue life prediction and for inference in linear time dependent PDEs
Scavino, Marco
2016-01-08
In this talk we introduce first the main characteristics of a systematic statistical approach to model calibration, model selection and model ranking when stress-life data are drawn from a collection of records of fatigue experiments. Focusing on Bayesian prediction assessment, we consider fatigue-limit models and random fatigue-limit models under different a priori assumptions. In the second part of the talk, we present a hierarchical Bayesian technique for the inference of the coefficients of time dependent linear PDEs, under the assumption that noisy measurements are available in both the interior of a domain of interest and from boundary conditions. We present a computational technique based on the marginalization of the contribution of the boundary parameters and apply it to inverse heat conduction problems.
Sampling-free Bayesian inversion with adaptive hierarchical tensor representations
Eigel, Martin; Marschall, Manuel; Schneider, Reinhold
2018-03-01
A sampling-free approach to Bayesian inversion with an explicit polynomial representation of the parameter densities is developed, based on an affine-parametric representation of a linear forward model. This becomes feasible due to the complete treatment in function spaces, which requires an efficient model reduction technique for numerical computations. The advocated perspective yields the crucial benefit that error bounds can be derived for all occuring approximations, leading to provable convergence subject to the discretization parameters. Moreover, it enables a fully adaptive a posteriori control with automatic problem-dependent adjustments of the employed discretizations. The method is discussed in the context of modern hierarchical tensor representations, which are used for the evaluation of a random PDE (the forward model) and the subsequent high-dimensional quadrature of the log-likelihood, alleviating the ‘curse of dimensionality’. Numerical experiments demonstrate the performance and confirm the theoretical results.
Rahpeyma, Sahar; Halldorsson, Benedikt; Hrafnkelsson, Birgir; Jonsson, Sigurjon
2018-01-01
Knowledge of the characteristics of earthquake ground motion is fundamental for earthquake hazard assessments. Over small distances, relative to the source–site distance, where uniform site conditions are expected, the ground motion variability is also expected to be insignificant. However, despite being located on what has been characterized as a uniform lava‐rock site condition, considerable peak ground acceleration (PGA) variations were observed on stations of a small‐aperture array (covering approximately 1 km2) of accelerographs in Southwest Iceland during the Ölfus earthquake of magnitude 6.3 on May 29, 2008 and its sequence of aftershocks. We propose a novel Bayesian hierarchical model for the PGA variations accounting separately for earthquake event effects, station effects, and event‐station effects. An efficient posterior inference scheme based on Markov chain Monte Carlo (MCMC) simulations is proposed for the new model. The variance of the station effect is certainly different from zero according to the posterior density, indicating that individual station effects are different from one another. The Bayesian hierarchical model thus captures the observed PGA variations and quantifies to what extent the source and recording sites contribute to the overall variation in ground motions over relatively small distances on the lava‐rock site condition.
Rahpeyma, Sahar
2018-04-17
Knowledge of the characteristics of earthquake ground motion is fundamental for earthquake hazard assessments. Over small distances, relative to the source–site distance, where uniform site conditions are expected, the ground motion variability is also expected to be insignificant. However, despite being located on what has been characterized as a uniform lava‐rock site condition, considerable peak ground acceleration (PGA) variations were observed on stations of a small‐aperture array (covering approximately 1 km2) of accelerographs in Southwest Iceland during the Ölfus earthquake of magnitude 6.3 on May 29, 2008 and its sequence of aftershocks. We propose a novel Bayesian hierarchical model for the PGA variations accounting separately for earthquake event effects, station effects, and event‐station effects. An efficient posterior inference scheme based on Markov chain Monte Carlo (MCMC) simulations is proposed for the new model. The variance of the station effect is certainly different from zero according to the posterior density, indicating that individual station effects are different from one another. The Bayesian hierarchical model thus captures the observed PGA variations and quantifies to what extent the source and recording sites contribute to the overall variation in ground motions over relatively small distances on the lava‐rock site condition.
Hierarchical Active Inference: A Theory of Motivated Control.
Pezzulo, Giovanni; Rigoli, Francesco; Friston, Karl J
2018-04-01
Motivated control refers to the coordination of behaviour to achieve affectively valenced outcomes or goals. The study of motivated control traditionally assumes a distinction between control and motivational processes, which map to distinct (dorsolateral versus ventromedial) brain systems. However, the respective roles and interactions between these processes remain controversial. We offer a novel perspective that casts control and motivational processes as complementary aspects - goal propagation and prioritization, respectively - of active inference and hierarchical goal processing under deep generative models. We propose that the control hierarchy propagates prior preferences or goals, but their precision is informed by the motivational context, inferred at different levels of the motivational hierarchy. The ensuing integration of control and motivational processes underwrites action and policy selection and, ultimately, motivated behaviour, by enabling deep inference to prioritize goals in a context-sensitive way. Copyright © 2018 The Author(s). Published by Elsevier Ltd.. All rights reserved.
Directory of Open Access Journals (Sweden)
David Lunn
Full Text Available The advantages of Bayesian statistical approaches, such as flexibility and the ability to acknowledge uncertainty in all parameters, have made them the prevailing method for analysing the spread of infectious diseases in human or animal populations. We introduce a Bayesian approach to experimental host-pathogen systems that shares these attractive features. Since uncertainty in all parameters is acknowledged, existing information can be accounted for through prior distributions, rather than through fixing some parameter values. The non-linear dynamics, multi-factorial design, multiple measurements of responses over time and sampling error that are typical features of experimental host-pathogen systems can also be naturally incorporated. We analyse the dynamics of the free-living protozoan Paramecium caudatum and its specialist bacterial parasite Holospora undulata. Our analysis provides strong evidence for a saturable infection function, and we were able to reproduce the two waves of infection apparent in the data by separating the initial inoculum from the parasites released after the first cycle of infection. In addition, the parameter estimates from the hierarchical model can be combined to infer variations in the parasite's basic reproductive ratio across experimental groups, enabling us to make predictions about the effect of resources and host genotype on the ability of the parasite to spread. Even though the high level of variability between replicates limited the resolution of the results, this Bayesian framework has strong potential to be used more widely in experimental ecology.
Fast model updating coupling Bayesian inference and PGD model reduction
Rubio, Paul-Baptiste; Louf, François; Chamoin, Ludovic
2018-04-01
The paper focuses on a coupled Bayesian-Proper Generalized Decomposition (PGD) approach for the real-time identification and updating of numerical models. The purpose is to use the most general case of Bayesian inference theory in order to address inverse problems and to deal with different sources of uncertainties (measurement and model errors, stochastic parameters). In order to do so with a reasonable CPU cost, the idea is to replace the direct model called for Monte-Carlo sampling by a PGD reduced model, and in some cases directly compute the probability density functions from the obtained analytical formulation. This procedure is first applied to a welding control example with the updating of a deterministic parameter. In the second application, the identification of a stochastic parameter is studied through a glued assembly example.
Sparse linear models: Variational approximate inference and Bayesian experimental design
International Nuclear Information System (INIS)
Seeger, Matthias W
2009-01-01
A wide range of problems such as signal reconstruction, denoising, source separation, feature selection, and graphical model search are addressed today by posterior maximization for linear models with sparsity-favouring prior distributions. The Bayesian posterior contains useful information far beyond its mode, which can be used to drive methods for sampling optimization (active learning), feature relevance ranking, or hyperparameter estimation, if only this representation of uncertainty can be approximated in a tractable manner. In this paper, we review recent results for variational sparse inference, and show that they share underlying computational primitives. We discuss how sampling optimization can be implemented as sequential Bayesian experimental design. While there has been tremendous recent activity to develop sparse estimation, little attendance has been given to sparse approximate inference. In this paper, we argue that many problems in practice, such as compressive sensing for real-world image reconstruction, are served much better by proper uncertainty approximations than by ever more aggressive sparse estimation algorithms. Moreover, since some variational inference methods have been given strong convex optimization characterizations recently, theoretical analysis may become possible, promising new insights into nonlinear experimental design.
Inference algorithms and learning theory for Bayesian sparse factor analysis
International Nuclear Information System (INIS)
Rattray, Magnus; Sharp, Kevin; Stegle, Oliver; Winn, John
2009-01-01
Bayesian sparse factor analysis has many applications; for example, it has been applied to the problem of inferring a sparse regulatory network from gene expression data. We describe a number of inference algorithms for Bayesian sparse factor analysis using a slab and spike mixture prior. These include well-established Markov chain Monte Carlo (MCMC) and variational Bayes (VB) algorithms as well as a novel hybrid of VB and Expectation Propagation (EP). For the case of a single latent factor we derive a theory for learning performance using the replica method. We compare the MCMC and VB/EP algorithm results with simulated data to the theoretical prediction. The results for MCMC agree closely with the theory as expected. Results for VB/EP are slightly sub-optimal but show that the new algorithm is effective for sparse inference. In large-scale problems MCMC is infeasible due to computational limitations and the VB/EP algorithm then provides a very useful computationally efficient alternative.
Sparse linear models: Variational approximate inference and Bayesian experimental design
Energy Technology Data Exchange (ETDEWEB)
Seeger, Matthias W [Saarland University and Max Planck Institute for Informatics, Campus E1.4, 66123 Saarbruecken (Germany)
2009-12-01
A wide range of problems such as signal reconstruction, denoising, source separation, feature selection, and graphical model search are addressed today by posterior maximization for linear models with sparsity-favouring prior distributions. The Bayesian posterior contains useful information far beyond its mode, which can be used to drive methods for sampling optimization (active learning), feature relevance ranking, or hyperparameter estimation, if only this representation of uncertainty can be approximated in a tractable manner. In this paper, we review recent results for variational sparse inference, and show that they share underlying computational primitives. We discuss how sampling optimization can be implemented as sequential Bayesian experimental design. While there has been tremendous recent activity to develop sparse estimation, little attendance has been given to sparse approximate inference. In this paper, we argue that many problems in practice, such as compressive sensing for real-world image reconstruction, are served much better by proper uncertainty approximations than by ever more aggressive sparse estimation algorithms. Moreover, since some variational inference methods have been given strong convex optimization characterizations recently, theoretical analysis may become possible, promising new insights into nonlinear experimental design.
Inference algorithms and learning theory for Bayesian sparse factor analysis
Energy Technology Data Exchange (ETDEWEB)
Rattray, Magnus; Sharp, Kevin [School of Computer Science, University of Manchester, Manchester M13 9PL (United Kingdom); Stegle, Oliver [Max-Planck-Institute for Biological Cybernetics, Tuebingen (Germany); Winn, John, E-mail: magnus.rattray@manchester.ac.u [Microsoft Research Cambridge, Roger Needham Building, Cambridge, CB3 0FB (United Kingdom)
2009-12-01
Bayesian sparse factor analysis has many applications; for example, it has been applied to the problem of inferring a sparse regulatory network from gene expression data. We describe a number of inference algorithms for Bayesian sparse factor analysis using a slab and spike mixture prior. These include well-established Markov chain Monte Carlo (MCMC) and variational Bayes (VB) algorithms as well as a novel hybrid of VB and Expectation Propagation (EP). For the case of a single latent factor we derive a theory for learning performance using the replica method. We compare the MCMC and VB/EP algorithm results with simulated data to the theoretical prediction. The results for MCMC agree closely with the theory as expected. Results for VB/EP are slightly sub-optimal but show that the new algorithm is effective for sparse inference. In large-scale problems MCMC is infeasible due to computational limitations and the VB/EP algorithm then provides a very useful computationally efficient alternative.
Ruggeri, Fabrizio
2016-05-12
In this work we develop a Bayesian setting to infer unknown parameters in initial-boundary value problems related to linear parabolic partial differential equations. We realistically assume that the boundary data are noisy, for a given prescribed initial condition. We show how to derive the joint likelihood function for the forward problem, given some measurements of the solution field subject to Gaussian noise. Given Gaussian priors for the time-dependent Dirichlet boundary values, we analytically marginalize the joint likelihood using the linearity of the equation. Our hierarchical Bayesian approach is fully implemented in an example that involves the heat equation. In this example, the thermal diffusivity is the unknown parameter. We assume that the thermal diffusivity parameter can be modeled a priori through a lognormal random variable or by means of a space-dependent stationary lognormal random field. Synthetic data are used to test the inference. We exploit the behavior of the non-normalized log posterior distribution of the thermal diffusivity. Then, we use the Laplace method to obtain an approximated Gaussian posterior and therefore avoid costly Markov Chain Monte Carlo computations. Expected information gains and predictive posterior densities for observable quantities are numerically estimated using Laplace approximation for different experimental setups.
ANUBIS: artificial neuromodulation using a Bayesian inference system.
Smith, Benjamin J H; Saaj, Chakravarthini M; Allouis, Elie
2013-01-01
Gain tuning is a crucial part of controller design and depends not only on an accurate understanding of the system in question, but also on the designer's ability to predict what disturbances and other perturbations the system will encounter throughout its operation. This letter presents ANUBIS (artificial neuromodulation using a Bayesian inference system), a novel biologically inspired technique for automatically tuning controller parameters in real time. ANUBIS is based on the Bayesian brain concept and modifies it by incorporating a model of the neuromodulatory system comprising four artificial neuromodulators. It has been applied to the controller of EchinoBot, a prototype walking rover for Martian exploration. ANUBIS has been implemented at three levels of the controller; gait generation, foot trajectory planning using Bézier curves, and foot trajectory tracking using a terminal sliding mode controller. We compare the results to a similar system that has been tuned using a multilayer perceptron. The use of Bayesian inference means that the system retains mathematical interpretability, unlike other intelligent tuning techniques, which use neural networks, fuzzy logic, or evolutionary algorithms. The simulation results show that ANUBIS provides significant improvements in efficiency and adaptability of the three controller components; it allows the robot to react to obstacles and uncertainties faster than the system tuned with the MLP, while maintaining stability and accuracy. As well as advancing rover autonomy, ANUBIS could also be applied to other situations where operating conditions are likely to change or cannot be accurately modeled in advance, such as process control. In addition, it demonstrates one way in which neuromodulation could fit into the Bayesian brain framework.
A hierarchical bayesian approach to ecological count data: a flexible tool for ecologists.
Directory of Open Access Journals (Sweden)
James A Fordyce
Full Text Available Many ecological studies use the analysis of count data to arrive at biologically meaningful inferences. Here, we introduce a hierarchical bayesian approach to count data. This approach has the advantage over traditional approaches in that it directly estimates the parameters of interest at both the individual-level and population-level, appropriately models uncertainty, and allows for comparisons among models, including those that exceed the complexity of many traditional approaches, such as ANOVA or non-parametric analogs. As an example, we apply this method to oviposition preference data for butterflies in the genus Lycaeides. Using this method, we estimate the parameters that describe preference for each population, compare the preference hierarchies among populations, and explore various models that group populations that share the same preference hierarchy.
Skataric, Maja; Bose, Sandip; Zeroug, Smaine; Tilke, Peter
2017-02-01
It is not uncommon in the field of non-destructive evaluation that multiple measurements encompassing a variety of modalities are available for analysis and interpretation for determining the underlying states of nature of the materials or parts being tested. Despite and sometimes due to the richness of data, significant challenges arise in the interpretation manifested as ambiguities and inconsistencies due to various uncertain factors in the physical properties (inputs), environment, measurement device properties, human errors, and the measurement data (outputs). Most of these uncertainties cannot be described by any rigorous mathematical means, and modeling of all possibilities is usually infeasible for many real time applications. In this work, we will discuss an approach based on Hierarchical Bayesian Graphical Models (HBGM) for the improved interpretation of complex (multi-dimensional) problems with parametric uncertainties that lack usable physical models. In this setting, the input space of the physical properties is specified through prior distributions based on domain knowledge and expertise, which are represented as Gaussian mixtures to model the various possible scenarios of interest for non-destructive testing applications. Forward models are then used offline to generate the expected distribution of the proposed measurements which are used to train a hierarchical Bayesian network. In Bayesian analysis, all model parameters are treated as random variables, and inference of the parameters is made on the basis of posterior distribution given the observed data. Learned parameters of the posterior distribution obtained after the training can therefore be used to build an efficient classifier for differentiating new observed data in real time on the basis of pre-trained models. We will illustrate the implementation of the HBGM approach to ultrasonic measurements used for cement evaluation of cased wells in the oil industry.
GPU Computing in Bayesian Inference of Realized Stochastic Volatility Model
International Nuclear Information System (INIS)
Takaishi, Tetsuya
2015-01-01
The realized stochastic volatility (RSV) model that utilizes the realized volatility as additional information has been proposed to infer volatility of financial time series. We consider the Bayesian inference of the RSV model by the Hybrid Monte Carlo (HMC) algorithm. The HMC algorithm can be parallelized and thus performed on the GPU for speedup. The GPU code is developed with CUDA Fortran. We compare the computational time in performing the HMC algorithm on GPU (GTX 760) and CPU (Intel i7-4770 3.4GHz) and find that the GPU can be up to 17 times faster than the CPU. We also code the program with OpenACC and find that appropriate coding can achieve the similar speedup with CUDA Fortran
Bayesian inference method for stochastic damage accumulation modeling
International Nuclear Information System (INIS)
Jiang, Xiaomo; Yuan, Yong; Liu, Xian
2013-01-01
Damage accumulation based reliability model plays an increasingly important role in successful realization of condition based maintenance for complicated engineering systems. This paper developed a Bayesian framework to establish stochastic damage accumulation model from historical inspection data, considering data uncertainty. Proportional hazards modeling technique is developed to model the nonlinear effect of multiple influencing factors on system reliability. Different from other hazard modeling techniques such as normal linear regression model, the approach does not require any distribution assumption for the hazard model, and can be applied for a wide variety of distribution models. A Bayesian network is created to represent the nonlinear proportional hazards models and to estimate model parameters by Bayesian inference with Markov Chain Monte Carlo simulation. Both qualitative and quantitative approaches are developed to assess the validity of the established damage accumulation model. Anderson–Darling goodness-of-fit test is employed to perform the normality test, and Box–Cox transformation approach is utilized to convert the non-normality data into normal distribution for hypothesis testing in quantitative model validation. The methodology is illustrated with the seepage data collected from real-world subway tunnels.
Bayesian inference for hybrid discrete-continuous stochastic kinetic models
International Nuclear Information System (INIS)
Sherlock, Chris; Golightly, Andrew; Gillespie, Colin S
2014-01-01
We consider the problem of efficiently performing simulation and inference for stochastic kinetic models. Whilst it is possible to work directly with the resulting Markov jump process (MJP), computational cost can be prohibitive for networks of realistic size and complexity. In this paper, we consider an inference scheme based on a novel hybrid simulator that classifies reactions as either ‘fast’ or ‘slow’ with fast reactions evolving as a continuous Markov process whilst the remaining slow reaction occurrences are modelled through a MJP with time-dependent hazards. A linear noise approximation (LNA) of fast reaction dynamics is employed and slow reaction events are captured by exploiting the ability to solve the stochastic differential equation driving the LNA. This simulation procedure is used as a proposal mechanism inside a particle MCMC scheme, thus allowing Bayesian inference for the model parameters. We apply the scheme to a simple application and compare the output with an existing hybrid approach and also a scheme for performing inference for the underlying discrete stochastic model. (paper)
Bayesian inference for Markov jump processes with informative observations.
Golightly, Andrew; Wilkinson, Darren J
2015-04-01
In this paper we consider the problem of parameter inference for Markov jump process (MJP) representations of stochastic kinetic models. Since transition probabilities are intractable for most processes of interest yet forward simulation is straightforward, Bayesian inference typically proceeds through computationally intensive methods such as (particle) MCMC. Such methods ostensibly require the ability to simulate trajectories from the conditioned jump process. When observations are highly informative, use of the forward simulator is likely to be inefficient and may even preclude an exact (simulation based) analysis. We therefore propose three methods for improving the efficiency of simulating conditioned jump processes. A conditioned hazard is derived based on an approximation to the jump process, and used to generate end-point conditioned trajectories for use inside an importance sampling algorithm. We also adapt a recently proposed sequential Monte Carlo scheme to our problem. Essentially, trajectories are reweighted at a set of intermediate time points, with more weight assigned to trajectories that are consistent with the next observation. We consider two implementations of this approach, based on two continuous approximations of the MJP. We compare these constructs for a simple tractable jump process before using them to perform inference for a Lotka-Volterra system. The best performing construct is used to infer the parameters governing a simple model of motility regulation in Bacillus subtilis.
Bayesian inference from count data using discrete uniform priors.
Directory of Open Access Journals (Sweden)
Federico Comoglio
Full Text Available We consider a set of sample counts obtained by sampling arbitrary fractions of a finite volume containing an homogeneously dispersed population of identical objects. We report a Bayesian derivation of the posterior probability distribution of the population size using a binomial likelihood and non-conjugate, discrete uniform priors under sampling with or without replacement. Our derivation yields a computationally feasible formula that can prove useful in a variety of statistical problems involving absolute quantification under uncertainty. We implemented our algorithm in the R package dupiR and compared it with a previously proposed Bayesian method based on a Gamma prior. As a showcase, we demonstrate that our inference framework can be used to estimate bacterial survival curves from measurements characterized by extremely low or zero counts and rather high sampling fractions. All in all, we provide a versatile, general purpose algorithm to infer population sizes from count data, which can find application in a broad spectrum of biological and physical problems.
Bayesian Inference for Signal-Based Seismic Monitoring
Moore, D.
2015-12-01
Traditional seismic monitoring systems rely on discrete detections produced by station processing software, discarding significant information present in the original recorded signal. SIG-VISA (Signal-based Vertically Integrated Seismic Analysis) is a system for global seismic monitoring through Bayesian inference on seismic signals. By modeling signals directly, our forward model is able to incorporate a rich representation of the physics underlying the signal generation process, including source mechanisms, wave propagation, and station response. This allows inference in the model to recover the qualitative behavior of recent geophysical methods including waveform matching and double-differencing, all as part of a unified Bayesian monitoring system that simultaneously detects and locates events from a global network of stations. We demonstrate recent progress in scaling up SIG-VISA to efficiently process the data stream of global signals recorded by the International Monitoring System (IMS), including comparisons against existing processing methods that show increased sensitivity from our signal-based model and in particular the ability to locate events (including aftershock sequences that can tax analyst processing) precisely from waveform correlation effects. We also provide a Bayesian analysis of an alleged low-magnitude event near the DPRK test site in May 2010 [1] [2], investigating whether such an event could plausibly be detected through automated processing in a signal-based monitoring system. [1] Zhang, Miao and Wen, Lianxing. "Seismological Evidence for a Low-Yield Nuclear Test on 12 May 2010 in North Korea". Seismological Research Letters, January/February 2015. [2] Richards, Paul. "A Seismic Event in North Korea on 12 May 2010". CTBTO SnT 2015 oral presentation, video at https://video-archive.ctbto.org/index.php/kmc/preview/partner_id/103/uiconf_id/4421629/entry_id/0_ymmtpps0/delivery/http
Bayesian Hierarchical Random Effects Models in Forensic Science
Directory of Open Access Journals (Sweden)
Colin G. G. Aitken
2018-04-01
Full Text Available Statistical modeling of the evaluation of evidence with the use of the likelihood ratio has a long history. It dates from the Dreyfus case at the end of the nineteenth century through the work at Bletchley Park in the Second World War to the present day. The development received a significant boost in 1977 with a seminal work by Dennis Lindley which introduced a Bayesian hierarchical random effects model for the evaluation of evidence with an example of refractive index measurements on fragments of glass. Many models have been developed since then. The methods have now been sufficiently well-developed and have become so widespread that it is timely to try and provide a software package to assist in their implementation. With that in mind, a project (SAILR: Software for the Analysis and Implementation of Likelihood Ratios was funded by the European Network of Forensic Science Institutes through their Monopoly programme to develop a software package for use by forensic scientists world-wide that would assist in the statistical analysis and implementation of the approach based on likelihood ratios. It is the purpose of this document to provide a short review of a small part of this history. The review also provides a background, or landscape, for the development of some of the models within the SAILR package and references to SAILR as made as appropriate.
Bayesian Hierarchical Random Effects Models in Forensic Science.
Aitken, Colin G G
2018-01-01
Statistical modeling of the evaluation of evidence with the use of the likelihood ratio has a long history. It dates from the Dreyfus case at the end of the nineteenth century through the work at Bletchley Park in the Second World War to the present day. The development received a significant boost in 1977 with a seminal work by Dennis Lindley which introduced a Bayesian hierarchical random effects model for the evaluation of evidence with an example of refractive index measurements on fragments of glass. Many models have been developed since then. The methods have now been sufficiently well-developed and have become so widespread that it is timely to try and provide a software package to assist in their implementation. With that in mind, a project (SAILR: Software for the Analysis and Implementation of Likelihood Ratios) was funded by the European Network of Forensic Science Institutes through their Monopoly programme to develop a software package for use by forensic scientists world-wide that would assist in the statistical analysis and implementation of the approach based on likelihood ratios. It is the purpose of this document to provide a short review of a small part of this history. The review also provides a background, or landscape, for the development of some of the models within the SAILR package and references to SAILR as made as appropriate.
Bayesian inference for identifying interaction rules in moving animal groups.
Directory of Open Access Journals (Sweden)
Richard P Mann
Full Text Available The emergence of similar collective patterns from different self-propelled particle models of animal groups points to a restricted set of "universal" classes for these patterns. While universality is interesting, it is often the fine details of animal interactions that are of biological importance. Universality thus presents a challenge to inferring such interactions from macroscopic group dynamics since these can be consistent with many underlying interaction models. We present a Bayesian framework for learning animal interaction rules from fine scale recordings of animal movements in swarms. We apply these techniques to the inverse problem of inferring interaction rules from simulation models, showing that parameters can often be inferred from a small number of observations. Our methodology allows us to quantify our confidence in parameter fitting. For example, we show that attraction and alignment terms can be reliably estimated when animals are milling in a torus shape, while interaction radius cannot be reliably measured in such a situation. We assess the importance of rate of data collection and show how to test different models, such as topological and metric neighbourhood models. Taken together our results both inform the design of experiments on animal interactions and suggest how these data should be best analysed.
Bayesian methods for data analysis
Carlin, Bradley P.
2009-01-01
Approaches for statistical inference Introduction Motivating Vignettes Defining the Approaches The Bayes-Frequentist Controversy Some Basic Bayesian Models The Bayes approach Introduction Prior Distributions Bayesian Inference Hierarchical Modeling Model Assessment Nonparametric Methods Bayesian computation Introduction Asymptotic Methods Noniterative Monte Carlo Methods Markov Chain Monte Carlo Methods Model criticism and selection Bayesian Modeling Bayesian Robustness Model Assessment Bayes Factors via Marginal Density Estimation Bayes Factors
Bayesian inference for psychology. Part I : Theoretical advantages and practical ramifications
Wagenmakers, E.-J.; Marsman, M.; Jamil, T.; Ly, A.; Verhagen, J.; Love, J.; Selker, R.; Gronau, Q.F.; Šmíra, M.; Epskamp, S.; Matzke, D.; Rouder, J.N.; Morey, R.D.
2018-01-01
Bayesian parameter estimation and Bayesian hypothesis testing present attractive alternatives to classical inference using confidence intervals and p values. In part I of this series we outline ten prominent advantages of the Bayesian approach. Many of these advantages translate to concrete
Directory of Open Access Journals (Sweden)
J. P. Werner
2015-03-01
Full Text Available Reconstructions of the late-Holocene climate rely heavily upon proxies that are assumed to be accurately dated by layer counting, such as measurements of tree rings, ice cores, and varved lake sediments. Considerable advances could be achieved if time-uncertain proxies were able to be included within these multiproxy reconstructions, and if time uncertainties were recognized and correctly modeled for proxies commonly treated as free of age model errors. Current approaches for accounting for time uncertainty are generally limited to repeating the reconstruction using each one of an ensemble of age models, thereby inflating the final estimated uncertainty – in effect, each possible age model is given equal weighting. Uncertainties can be reduced by exploiting the inferred space–time covariance structure of the climate to re-weight the possible age models. Here, we demonstrate how Bayesian hierarchical climate reconstruction models can be augmented to account for time-uncertain proxies. Critically, although a priori all age models are given equal probability of being correct, the probabilities associated with the age models are formally updated within the Bayesian framework, thereby reducing uncertainties. Numerical experiments show that updating the age model probabilities decreases uncertainty in the resulting reconstructions, as compared with the current de facto standard of sampling over all age models, provided there is sufficient information from other data sources in the spatial region of the time-uncertain proxy. This approach can readily be generalized to non-layer-counted proxies, such as those derived from marine sediments.
A novel Bayesian hierarchical model for road safety hotspot prediction.
Fawcett, Lee; Thorpe, Neil; Matthews, Joseph; Kremer, Karsten
2017-02-01
In this paper, we propose a Bayesian hierarchical model for predicting accident counts in future years at sites within a pool of potential road safety hotspots. The aim is to inform road safety practitioners of the location of likely future hotspots to enable a proactive, rather than reactive, approach to road safety scheme implementation. A feature of our model is the ability to rank sites according to their potential to exceed, in some future time period, a threshold accident count which may be used as a criterion for scheme implementation. Our model specification enables the classical empirical Bayes formulation - commonly used in before-and-after studies, wherein accident counts from a single before period are used to estimate counterfactual counts in the after period - to be extended to incorporate counts from multiple time periods. This allows site-specific variations in historical accident counts (e.g. locally-observed trends) to offset estimates of safety generated by a global accident prediction model (APM), which itself is used to help account for the effects of global trend and regression-to-mean (RTM). The Bayesian posterior predictive distribution is exploited to formulate predictions and to properly quantify our uncertainty in these predictions. The main contributions of our model include (i) the ability to allow accident counts from multiple time-points to inform predictions, with counts in more recent years lending more weight to predictions than counts from time-points further in the past; (ii) where appropriate, the ability to offset global estimates of trend by variations in accident counts observed locally, at a site-specific level; and (iii) the ability to account for unknown/unobserved site-specific factors which may affect accident counts. We illustrate our model with an application to accident counts at 734 potential hotspots in the German city of Halle; we also propose some simple diagnostics to validate the predictive capability of our
BayesTwin: An R Package for Bayesian Inference of Item-Level Twin Data
Directory of Open Access Journals (Sweden)
Inga Schwabe
2017-11-01
Full Text Available BayesTwin is an open-source R package that serves as a pipeline to the MCMC program JAGS to perform Bayesian inference on genetically-informative hierarchical twin data. Simultaneously to the biometric model, an item response theory (IRT measurement model is estimated, allowing analysis of the raw phenotypic (item-level data. The integration of such a measurement model is important since earlier research has shown that an analysis based on an aggregated measure (e.g., a sum-score based analysis can lead to an underestimation of heritability and the spurious finding of genotype-environment interactions. The package includes all common biometric and IRT models as well as functions that help plot relevant information or determine whether the analysis was performed well. Funding statement: Partly funded by the PROO grant 411-12-623 from the Netherlands Organisation for Scientific Research (NWO.
Fast Bayesian Inference in Dirichlet Process Mixture Models.
Wang, Lianming; Dunson, David B
2011-01-01
There has been increasing interest in applying Bayesian nonparametric methods in large samples and high dimensions. As Markov chain Monte Carlo (MCMC) algorithms are often infeasible, there is a pressing need for much faster algorithms. This article proposes a fast approach for inference in Dirichlet process mixture (DPM) models. Viewing the partitioning of subjects into clusters as a model selection problem, we propose a sequential greedy search algorithm for selecting the partition. Then, when conjugate priors are chosen, the resulting posterior conditionally on the selected partition is available in closed form. This approach allows testing of parametric models versus nonparametric alternatives based on Bayes factors. We evaluate the approach using simulation studies and compare it with four other fast nonparametric methods in the literature. We apply the proposed approach to three datasets including one from a large epidemiologic study. Matlab codes for the simulation and data analyses using the proposed approach are available online in the supplemental materials.
A Bayesian hierarchical model for demand curve analysis.
Ho, Yen-Yi; Nhu Vo, Tien; Chu, Haitao; Luo, Xianghua; Le, Chap T
2018-07-01
Drug self-administration experiments are a frequently used approach to assessing the abuse liability and reinforcing property of a compound. It has been used to assess the abuse liabilities of various substances such as psychomotor stimulants and hallucinogens, food, nicotine, and alcohol. The demand curve generated from a self-administration study describes how demand of a drug or non-drug reinforcer varies as a function of price. With the approval of the 2009 Family Smoking Prevention and Tobacco Control Act, demand curve analysis provides crucial evidence to inform the US Food and Drug Administration's policy on tobacco regulation, because it produces several important quantitative measurements to assess the reinforcing strength of nicotine. The conventional approach popularly used to analyze the demand curve data is individual-specific non-linear least square regression. The non-linear least square approach sets out to minimize the residual sum of squares for each subject in the dataset; however, this one-subject-at-a-time approach does not allow for the estimation of between- and within-subject variability in a unified model framework. In this paper, we review the existing approaches to analyze the demand curve data, non-linear least square regression, and the mixed effects regression and propose a new Bayesian hierarchical model. We conduct simulation analyses to compare the performance of these three approaches and illustrate the proposed approaches in a case study of nicotine self-administration in rats. We present simulation results and discuss the benefits of using the proposed approaches.
Efficient hierarchical trans-dimensional Bayesian inversion of magnetotelluric data
Xiang, Enming; Guo, Rongwen; Dosso, Stan E.; Liu, Jianxin; Dong, Hao; Ren, Zhengyong
2018-06-01
This paper develops an efficient hierarchical trans-dimensional (trans-D) Bayesian algorithm to invert magnetotelluric (MT) data for subsurface geoelectrical structure, with unknown geophysical model parameterization (the number of conductivity-layer interfaces) and data-error models parameterized by an auto-regressive (AR) process to account for potential error correlations. The reversible-jump Markov-chain Monte Carlo algorithm, which adds/removes interfaces and AR parameters in birth/death steps, is applied to sample the trans-D posterior probability density for model parameterization, model parameters, error variance and AR parameters, accounting for the uncertainties of model dimension and data-error statistics in the uncertainty estimates of the conductivity profile. To provide efficient sampling over the multiple subspaces of different dimensions, advanced proposal schemes are applied. Parameter perturbations are carried out in principal-component space, defined by eigen-decomposition of the unit-lag model covariance matrix, to minimize the effect of inter-parameter correlations and provide effective perturbation directions and length scales. Parameters of new layers in birth steps are proposed from the prior, instead of focused distributions centred at existing values, to improve birth acceptance rates. Parallel tempering, based on a series of parallel interacting Markov chains with successively relaxed likelihoods, is applied to improve chain mixing over model dimensions. The trans-D inversion is applied in a simulation study to examine the resolution of model structure according to the data information content. The inversion is also applied to a measured MT data set from south-central Australia.
Sparse Event Modeling with Hierarchical Bayesian Kernel Methods
2016-01-05
SECURITY CLASSIFICATION OF: The research objective of this proposal was to develop a predictive Bayesian kernel approach to model count data based on...several predictive variables. Such an approach, which we refer to as the Poisson Bayesian kernel model, is able to model the rate of occurrence of... kernel methods made use of: (i) the Bayesian property of improving predictive accuracy as data are dynamically obtained, and (ii) the kernel function
Bayesian inference for partially identified models exploring the limits of limited data
Gustafson, Paul
2015-01-01
Introduction Identification What Is against Us? What Is for Us? Some Simple Examples of Partially Identified ModelsThe Road Ahead The Structure of Inference in Partially Identified Models Bayesian Inference The Structure of Posterior Distributions in PIMs Computational Strategies Strength of Bayesian Updating, Revisited Posterior MomentsCredible Intervals Evaluating the Worth of Inference Partial Identification versus Model Misspecification The Siren Call of Identification Comp
Rajabi, Mohammad Mahdi; Ataie-Ashtiani, Behzad
2016-05-01
Bayesian inference has traditionally been conceived as the proper framework for the formal incorporation of expert knowledge in parameter estimation of groundwater models. However, conventional Bayesian inference is incapable of taking into account the imprecision essentially embedded in expert provided information. In order to solve this problem, a number of extensions to conventional Bayesian inference have been introduced in recent years. One of these extensions is 'fuzzy Bayesian inference' which is the result of integrating fuzzy techniques into Bayesian statistics. Fuzzy Bayesian inference has a number of desirable features which makes it an attractive approach for incorporating expert knowledge in the parameter estimation process of groundwater models: (1) it is well adapted to the nature of expert provided information, (2) it allows to distinguishably model both uncertainty and imprecision, and (3) it presents a framework for fusing expert provided information regarding the various inputs of the Bayesian inference algorithm. However an important obstacle in employing fuzzy Bayesian inference in groundwater numerical modeling applications is the computational burden, as the required number of numerical model simulations often becomes extremely exhaustive and often computationally infeasible. In this paper, a novel approach of accelerating the fuzzy Bayesian inference algorithm is proposed which is based on using approximate posterior distributions derived from surrogate modeling, as a screening tool in the computations. The proposed approach is first applied to a synthetic test case of seawater intrusion (SWI) in a coastal aquifer. It is shown that for this synthetic test case, the proposed approach decreases the number of required numerical simulations by an order of magnitude. Then the proposed approach is applied to a real-world test case involving three-dimensional numerical modeling of SWI in Kish Island, located in the Persian Gulf. An expert
Bayesian Inference for Functional Dynamics Exploring in fMRI Data
Directory of Open Access Journals (Sweden)
Xuan Guo
2016-01-01
Full Text Available This paper aims to review state-of-the-art Bayesian-inference-based methods applied to functional magnetic resonance imaging (fMRI data. Particularly, we focus on one specific long-standing challenge in the computational modeling of fMRI datasets: how to effectively explore typical functional interactions from fMRI time series and the corresponding boundaries of temporal segments. Bayesian inference is a method of statistical inference which has been shown to be a powerful tool to encode dependence relationships among the variables with uncertainty. Here we provide an introduction to a group of Bayesian-inference-based methods for fMRI data analysis, which were designed to detect magnitude or functional connectivity change points and to infer their functional interaction patterns based on corresponding temporal boundaries. We also provide a comparison of three popular Bayesian models, that is, Bayesian Magnitude Change Point Model (BMCPM, Bayesian Connectivity Change Point Model (BCCPM, and Dynamic Bayesian Variable Partition Model (DBVPM, and give a summary of their applications. We envision that more delicate Bayesian inference models will be emerging and play increasingly important roles in modeling brain functions in the years to come.
Bayesian inference in probabilistic risk assessment-The current state of the art
International Nuclear Information System (INIS)
Kelly, Dana L.; Smith, Curtis L.
2009-01-01
Markov chain Monte Carlo (MCMC) approaches to sampling directly from the joint posterior distribution of aleatory model parameters have led to tremendous advances in Bayesian inference capability in a wide variety of fields, including probabilistic risk analysis. The advent of freely available software coupled with inexpensive computing power has catalyzed this advance. This paper examines where the risk assessment community is with respect to implementing modern computational-based Bayesian approaches to inference. Through a series of examples in different topical areas, it introduces salient concepts and illustrates the practical application of Bayesian inference via MCMC sampling to a variety of important problems
BayesCLUMPY: BAYESIAN INFERENCE WITH CLUMPY DUSTY TORUS MODELS
International Nuclear Information System (INIS)
Asensio Ramos, A.; Ramos Almeida, C.
2009-01-01
Our aim is to present a fast and general Bayesian inference framework based on the synergy between machine learning techniques and standard sampling methods and apply it to infer the physical properties of clumpy dusty torus using infrared photometric high spatial resolution observations of active galactic nuclei. We make use of the Metropolis-Hastings Markov Chain Monte Carlo algorithm for sampling the posterior distribution function. Such distribution results from combining all a priori knowledge about the parameters of the model and the information introduced by the observations. The main difficulty resides in the fact that the model used to explain the observations is computationally demanding and the sampling is very time consuming. For this reason, we apply a set of artificial neural networks that are used to approximate and interpolate a database of models. As a consequence, models not present in the original database can be computed ensuring continuity. We focus on the application of this solution scheme to the recently developed public database of clumpy dusty torus models. The machine learning scheme used in this paper allows us to generate any model from the database using only a factor of 10 -4 of the original size of the database and a factor of 10 -3 in computing time. The posterior distribution obtained for each model parameter allows us to investigate how the observations constrain the parameters and which ones remain partially or completely undetermined, providing statistically relevant confidence intervals. As an example, the application to the nuclear region of Centaurus A shows that the optical depth of the clouds, the total number of clouds, and the radial extent of the cloud distribution zone are well constrained using only six filters. The code is freely available from the authors.
Bayesian disease mapping: hierarchical modeling in spatial epidemiology
National Research Council Canada - National Science Library
Lawson, Andrew
2013-01-01
Since the publication of the first edition, many new Bayesian tools and methods have been developed for space-time data analysis, the predictive modeling of health outcomes, and other spatial biostatistical areas...
Bayesian Inference for NASA Probabilistic Risk and Reliability Analysis
Dezfuli, Homayoon; Kelly, Dana; Smith, Curtis; Vedros, Kurt; Galyean, William
2009-01-01
This document, Bayesian Inference for NASA Probabilistic Risk and Reliability Analysis, is intended to provide guidelines for the collection and evaluation of risk and reliability-related data. It is aimed at scientists and engineers familiar with risk and reliability methods and provides a hands-on approach to the investigation and application of a variety of risk and reliability data assessment methods, tools, and techniques. This document provides both: A broad perspective on data analysis collection and evaluation issues. A narrow focus on the methods to implement a comprehensive information repository. The topics addressed herein cover the fundamentals of how data and information are to be used in risk and reliability analysis models and their potential role in decision making. Understanding these topics is essential to attaining a risk informed decision making environment that is being sought by NASA requirements and procedures such as 8000.4 (Agency Risk Management Procedural Requirements), NPR 8705.05 (Probabilistic Risk Assessment Procedures for NASA Programs and Projects), and the System Safety requirements of NPR 8715.3 (NASA General Safety Program Requirements).
Bayesian Inference of Ecological Interactions from Spatial Data
Directory of Open Access Journals (Sweden)
Christopher R. Stephens
2017-11-01
Full Text Available The characterization and quantification of ecological interactions and the construction of species’ distributions and their associated ecological niches are of fundamental theoretical and practical importance. In this paper, we discuss a Bayesian inference framework, which, using spatial data, offers a general formalism within which ecological interactions may be characterized and quantified. Interactions are identified through deviations of the spatial distribution of co-occurrences of spatial variables relative to a benchmark for the non-interacting system and based on a statistical ensemble of spatial cells. The formalism allows for the integration of both biotic and abiotic factors of arbitrary resolution. We concentrate on the conceptual and mathematical underpinnings of the formalism, showing how, using the naive Bayes approximation, it can be used to not only compare and contrast the relative contribution from each variable, but also to construct species’ distributions and ecological niches based on an arbitrary variable type. We also show how non-linear interactions between distinct niche variables can be identified and the degree of confounding between variables accounted for.
Metainference: A Bayesian inference method for heterogeneous systems.
Bonomi, Massimiliano; Camilloni, Carlo; Cavalli, Andrea; Vendruscolo, Michele
2016-01-01
Modeling a complex system is almost invariably a challenging task. The incorporation of experimental observations can be used to improve the quality of a model and thus to obtain better predictions about the behavior of the corresponding system. This approach, however, is affected by a variety of different errors, especially when a system simultaneously populates an ensemble of different states and experimental data are measured as averages over such states. To address this problem, we present a Bayesian inference method, called "metainference," that is able to deal with errors in experimental measurements and with experimental measurements averaged over multiple states. To achieve this goal, metainference models a finite sample of the distribution of models using a replica approach, in the spirit of the replica-averaging modeling based on the maximum entropy principle. To illustrate the method, we present its application to a heterogeneous model system and to the determination of an ensemble of structures corresponding to the thermal fluctuations of a protein molecule. Metainference thus provides an approach to modeling complex systems with heterogeneous components and interconverting between different states by taking into account all possible sources of errors.
Self-Associations Influence Task-Performance through Bayesian Inference.
Bengtsson, Sara L; Penny, Will D
2013-01-01
The way we think about ourselves impacts greatly on our behavior. This paper describes a behavioral study and a computational model that shed new light on this important area. Participants were primed "clever" and "stupid" using a scrambled sentence task, and we measured the effect on response time and error-rate on a rule-association task. First, we observed a confirmation bias effect in that associations to being "stupid" led to a gradual decrease in performance, whereas associations to being "clever" did not. Second, we observed that the activated self-concepts selectively modified attention toward one's performance. There was an early to late double dissociation in RTs in that primed "clever" resulted in RT increase following error responses, whereas primed "stupid" resulted in RT increase following correct responses. We propose a computational model of subjects' behavior based on the logic of the experimental task that involves two processes; memory for rules and the integration of rules with subsequent visual cues. The model incorporates an adaptive decision threshold based on Bayes rule, whereby decision thresholds are increased if integration was inferred to be faulty. Fitting the computational model to experimental data confirmed our hypothesis that priming affects the memory process. This model explains both the confirmation bias and double dissociation effects and demonstrates that Bayesian inferential principles can be used to study the effect of self-concepts on behavior.
Self-associations influence task-performance through Bayesian inference
Directory of Open Access Journals (Sweden)
Sara L Bengtsson
2013-08-01
Full Text Available The way we think about ourselves impacts greatly on our behaviour. This paper describes a behavioural study and a computational model that sheds new light on this important area. Participants were primed 'clever' and 'stupid' using a scrambled sentence task, and we measured the effect on response time and error-rate on a rule-association task. First, we observed a confirmation bias effect in that associations to being 'stupid' led to a gradual decrease in performance, whereas associations to being 'clever' did not. Second, we observed that the activated self-concepts selectively modified attention towards one's performance. There was an early to late double dissociation in RTs in that primed 'clever' resulted in RT increase following error responses, whereas primed 'stupid' resulted in RT increase following correct responses. We propose a computational model of subjects' behaviour based on the logic of the experimental task that involves two processes; memory for rules and the integration of rules with subsequent visual cues. The model also incorporates an adaptive decision threshold based on Bayes rule, whereby decision thresholds are increased if integration was inferred to be faulty. Fitting the computational model to experimental data confirmed our hypothesis that priming affects the memory process. This model explains both the confirmation bias and double dissociation effects and demonstrates that Bayesian inferential principles can be used to study the effect of self-concepts on behaviour.
Bayesian inference on EMRI signals using low frequency approximations
International Nuclear Information System (INIS)
Ali, Asad; Meyer, Renate; Christensen, Nelson; Röver, Christian
2012-01-01
Extreme mass ratio inspirals (EMRIs) are thought to be one of the most exciting gravitational wave sources to be detected with LISA. Due to their complicated nature and weak amplitudes the detection and parameter estimation of such sources is a challenging task. In this paper we present a statistical methodology based on Bayesian inference in which the estimation of parameters is carried out by advanced Markov chain Monte Carlo (MCMC) algorithms such as parallel tempering MCMC. We analysed high and medium mass EMRI systems that fall well inside the low frequency range of LISA. In the context of the Mock LISA Data Challenges, our investigation and results are also the first instance in which a fully Markovian algorithm is applied for EMRI searches. Results show that our algorithm worked well in recovering EMRI signals from different (simulated) LISA data sets having single and multiple EMRI sources and holds great promise for posterior computation under more realistic conditions. The search and estimation methods presented in this paper are general in their nature, and can be applied in any other scenario such as AdLIGO, AdVIRGO and Einstein Telescope with their respective response functions. (paper)
Serang, Oliver
2014-01-01
Exact Bayesian inference can sometimes be performed efficiently for special cases where a function has commutative and associative symmetry of its inputs (called “causal independence”). For this reason, it is desirable to exploit such symmetry on big data sets. Here we present a method to exploit a general form of this symmetry on probabilistic adder nodes by transforming those probabilistic adder nodes into a probabilistic convolution tree with which dynamic programming computes exact probabilities. A substantial speedup is demonstrated using an illustration example that can arise when identifying splice forms with bottom-up mass spectrometry-based proteomics. On this example, even state-of-the-art exact inference algorithms require a runtime more than exponential in the number of splice forms considered. By using the probabilistic convolution tree, we reduce the runtime to and the space to where is the number of variables joined by an additive or cardinal operator. This approach, which can also be used with junction tree inference, is applicable to graphs with arbitrary dependency on counting variables or cardinalities and can be used on diverse problems and fields like forward error correcting codes, elemental decomposition, and spectral demixing. The approach also trivially generalizes to multiple dimensions. PMID:24626234
Statistical detection of EEG synchrony using empirical bayesian inference.
Directory of Open Access Journals (Sweden)
Archana K Singh
Full Text Available There is growing interest in understanding how the brain utilizes synchronized oscillatory activity to integrate information across functionally connected regions. Computing phase-locking values (PLV between EEG signals is a popular method for quantifying such synchronizations and elucidating their role in cognitive tasks. However, high-dimensionality in PLV data incurs a serious multiple testing problem. Standard multiple testing methods in neuroimaging research (e.g., false discovery rate, FDR suffer severe loss of power, because they fail to exploit complex dependence structure between hypotheses that vary in spectral, temporal and spatial dimension. Previously, we showed that a hierarchical FDR and optimal discovery procedures could be effectively applied for PLV analysis to provide better power than FDR. In this article, we revisit the multiple comparison problem from a new Empirical Bayes perspective and propose the application of the local FDR method (locFDR; Efron, 2001 for PLV synchrony analysis to compute FDR as a posterior probability that an observed statistic belongs to a null hypothesis. We demonstrate the application of Efron's Empirical Bayes approach for PLV synchrony analysis for the first time. We use simulations to validate the specificity and sensitivity of locFDR and a real EEG dataset from a visual search study for experimental validation. We also compare locFDR with hierarchical FDR and optimal discovery procedures in both simulation and experimental analyses. Our simulation results showed that the locFDR can effectively control false positives without compromising on the power of PLV synchrony inference. Our results from the application locFDR on experiment data detected more significant discoveries than our previously proposed methods whereas the standard FDR method failed to detect any significant discoveries.
Statistical detection of EEG synchrony using empirical bayesian inference.
Singh, Archana K; Asoh, Hideki; Takeda, Yuji; Phillips, Steven
2015-01-01
There is growing interest in understanding how the brain utilizes synchronized oscillatory activity to integrate information across functionally connected regions. Computing phase-locking values (PLV) between EEG signals is a popular method for quantifying such synchronizations and elucidating their role in cognitive tasks. However, high-dimensionality in PLV data incurs a serious multiple testing problem. Standard multiple testing methods in neuroimaging research (e.g., false discovery rate, FDR) suffer severe loss of power, because they fail to exploit complex dependence structure between hypotheses that vary in spectral, temporal and spatial dimension. Previously, we showed that a hierarchical FDR and optimal discovery procedures could be effectively applied for PLV analysis to provide better power than FDR. In this article, we revisit the multiple comparison problem from a new Empirical Bayes perspective and propose the application of the local FDR method (locFDR; Efron, 2001) for PLV synchrony analysis to compute FDR as a posterior probability that an observed statistic belongs to a null hypothesis. We demonstrate the application of Efron's Empirical Bayes approach for PLV synchrony analysis for the first time. We use simulations to validate the specificity and sensitivity of locFDR and a real EEG dataset from a visual search study for experimental validation. We also compare locFDR with hierarchical FDR and optimal discovery procedures in both simulation and experimental analyses. Our simulation results showed that the locFDR can effectively control false positives without compromising on the power of PLV synchrony inference. Our results from the application locFDR on experiment data detected more significant discoveries than our previously proposed methods whereas the standard FDR method failed to detect any significant discoveries.
A Hierarchical Bayesian Model to Predict Self-Thinning Line for Chinese Fir in Southern China.
Directory of Open Access Journals (Sweden)
Xiongqing Zhang
Full Text Available Self-thinning is a dynamic equilibrium between forest growth and mortality at full site occupancy. Parameters of the self-thinning lines are often confounded by differences across various stand and site conditions. For overcoming the problem of hierarchical and repeated measures, we used hierarchical Bayesian method to estimate the self-thinning line. The results showed that the self-thinning line for Chinese fir (Cunninghamia lanceolata (Lamb.Hook. plantations was not sensitive to the initial planting density. The uncertainty of model predictions was mostly due to within-subject variability. The simulation precision of hierarchical Bayesian method was better than that of stochastic frontier function (SFF. Hierarchical Bayesian method provided a reasonable explanation of the impact of other variables (site quality, soil type, aspect, etc. on self-thinning line, which gave us the posterior distribution of parameters of self-thinning line. The research of self-thinning relationship could be benefit from the use of hierarchical Bayesian method.
Mocapy++ - a toolkit for inference and learning in dynamic Bayesian networks
DEFF Research Database (Denmark)
Paluszewski, Martin; Hamelryck, Thomas Wim
2010-01-01
Background Mocapy++ is a toolkit for parameter learning and inference in dynamic Bayesian networks (DBNs). It supports a wide range of DBN architectures and probability distributions, including distributions from directional statistics (the statistics of angles, directions and orientations...
Using hierarchical Bayesian methods to examine the tools of decision-making
Michael D. Lee; Benjamin R. Newell
2011-01-01
Hierarchical Bayesian methods offer a principled and comprehensive way to relate psychological models to data. Here we use them to model the patterns of information search, stopping and deciding in a simulated binary comparison judgment task. The simulation involves 20 subjects making 100 forced choice comparisons about the relative magnitudes of two objects (which of two German cities has more inhabitants). Two worked-examples show how hierarchical models can be developed to account for and ...
Cruz-Marcelo, Alejandro; Ensor, Katherine B.; Rosner, Gary L.
2011-01-01
The term structure of interest rates is used to price defaultable bonds and credit derivatives, as well as to infer the quality of bonds for risk management purposes. We introduce a model that jointly estimates term structures by means of a Bayesian hierarchical model with a prior probability model based on Dirichlet process mixtures. The modeling methodology borrows strength across term structures for purposes of estimation. The main advantage of our framework is its ability to produce reliable estimators at the company level even when there are only a few bonds per company. After describing the proposed model, we discuss an empirical application in which the term structure of 197 individual companies is estimated. The sample of 197 consists of 143 companies with only one or two bonds. In-sample and out-of-sample tests are used to quantify the improvement in accuracy that results from approximating the term structure of corporate bonds with estimators by company rather than by credit rating, the latter being a popular choice in the financial literature. A complete description of a Markov chain Monte Carlo (MCMC) scheme for the proposed model is available as Supplementary Material. PMID:21765566
Inconsistency of Bayesian Inference for Misspecified Linear Models, and a Proposal for Repairing It
Grünwald, P.; van Ommen, T.
2017-01-01
We empirically show that Bayesian inference can be inconsistent under misspecification in simple linear regression problems, both in a model averaging/selection and in a Bayesian ridge regression setting. We use the standard linear model, which assumes homoskedasticity, whereas the data are
Inconsistency of Bayesian inference for misspecified linear models, and a proposal for repairing it
P.D. Grünwald (Peter); T. van Ommen (Thijs)
2017-01-01
textabstractWe empirically show that Bayesian inference can be inconsistent under misspecification in simple linear regression problems, both in a model averaging/selection and in a Bayesian ridge regression setting. We use the standard linear model, which assumes homoskedasticity, whereas the data
Epigenetic change detection and pattern recognition via Bayesian hierarchical hidden Markov models.
Wang, Xinlei; Zang, Miao; Xiao, Guanghua
2013-06-15
Epigenetics is the study of changes to the genome that can switch genes on or off and determine which proteins are transcribed without altering the DNA sequence. Recently, epigenetic changes have been linked to the development and progression of disease such as psychiatric disorders. High-throughput epigenetic experiments have enabled researchers to measure genome-wide epigenetic profiles and yield data consisting of intensity ratios of immunoprecipitation versus reference samples. The intensity ratios can provide a view of genomic regions where protein binding occur under one experimental condition and further allow us to detect epigenetic alterations through comparison between two different conditions. However, such experiments can be expensive, with only a few replicates available. Moreover, epigenetic data are often spatially correlated with high noise levels. In this paper, we develop a Bayesian hierarchical model, combined with hidden Markov processes with four states for modeling spatial dependence, to detect genomic sites with epigenetic changes from two-sample experiments with paired internal control. One attractive feature of the proposed method is that the four states of the hidden Markov process have well-defined biological meanings and allow us to directly call the change patterns based on the corresponding posterior probabilities. In contrast, none of existing methods can offer this advantage. In addition, the proposed method offers great power in statistical inference by spatial smoothing (via hidden Markov modeling) and information pooling (via hierarchical modeling). Both simulation studies and real data analysis in a cocaine addiction study illustrate the reliability and success of this method. Copyright © 2012 John Wiley & Sons, Ltd.
Czech Academy of Sciences Publication Activity Database
Suparta, W.; Gusrizal, G.; Kudela, Karel; Isa, Z.
2017-01-01
Roč. 28, č. 3 (2017), s. 357-370 ISSN 1017-0839 R&D Projects: GA MŠk EF15_003/0000481 Institutional support: RVO:61389005 Keywords : trapped particle * spatio-temporal * hierarchical Bayesian * forecasting Subject RIV: DG - Athmosphere Sciences, Meteorology OBOR OECD: Meteorology and atmospheric sciences Impact factor: 0.752, year: 2016
Osei, Frank B.; Osei, F.B.; Duker, Alfred A.; Stein, A.
2011-01-01
This study analyses the joint effects of the two transmission routes of cholera on the space-time diffusion dynamics. Statistical models are developed and presented to investigate the transmission network routes of cholera diffusion. A hierarchical Bayesian modelling approach is employed for a joint
Bayesian hierarchical models for regional climate reconstructions of the last glacial maximum
Weitzel, Nils; Hense, Andreas; Ohlwein, Christian
2017-04-01
Spatio-temporal reconstructions of past climate are important for the understanding of the long term behavior of the climate system and the sensitivity to forcing changes. Unfortunately, they are subject to large uncertainties, have to deal with a complex proxy-climate structure, and a physically reasonable interpolation between the sparse proxy observations is difficult. Bayesian Hierarchical Models (BHMs) are a class of statistical models that is well suited for spatio-temporal reconstructions of past climate because they permit the inclusion of multiple sources of information (e.g. records from different proxy types, uncertain age information, output from climate simulations) and quantify uncertainties in a statistically rigorous way. BHMs in paleoclimatology typically consist of three stages which are modeled individually and are combined using Bayesian inference techniques. The data stage models the proxy-climate relation (often named transfer function), the process stage models the spatio-temporal distribution of the climate variables of interest, and the prior stage consists of prior distributions of the model parameters. For our BHMs, we translate well-known proxy-climate transfer functions for pollen to a Bayesian framework. In addition, we can include Gaussian distributed local climate information from preprocessed proxy records. The process stage combines physically reasonable spatial structures from prior distributions with proxy records which leads to a multivariate posterior probability distribution for the reconstructed climate variables. The prior distributions that constrain the possible spatial structure of the climate variables are calculated from climate simulation output. We present results from pseudoproxy tests as well as new regional reconstructions of temperatures for the last glacial maximum (LGM, ˜ 21,000 years BP). These reconstructions combine proxy data syntheses with information from climate simulations for the LGM that were
Constraining East Antarctic mass trends using a Bayesian inference approach
Martin-Español, Alba; Bamber, Jonathan L.
2016-04-01
East Antarctica is an order of magnitude larger than its western neighbour and the Greenland ice sheet. It has the greatest potential to contribute to sea level rise of any source, including non-glacial contributors. It is, however, the most challenging ice mass to constrain because of a range of factors including the relative paucity of in-situ observations and the poor signal to noise ratio of Earth Observation data such as satellite altimetry and gravimetry. A recent study using satellite radar and laser altimetry (Zwally et al. 2015) concluded that the East Antarctic Ice Sheet (EAIS) had been accumulating mass at a rate of 136±28 Gt/yr for the period 2003-08. Here, we use a Bayesian hierarchical model, which has been tested on, and applied to, the whole of Antarctica, to investigate the impact of different assumptions regarding the origin of elevation changes of the EAIS. We combined GRACE, satellite laser and radar altimeter data and GPS measurements to solve simultaneously for surface processes (primarily surface mass balance, SMB), ice dynamics and glacio-isostatic adjustment over the period 2003-13. The hierarchical model partitions mass trends between SMB and ice dynamics based on physical principles and measures of statistical likelihood. Without imposing the division between these processes, the model apportions about a third of the mass trend to ice dynamics, +18 Gt/yr, and two thirds, +39 Gt/yr, to SMB. The total mass trend for that period for the EAIS was 57±20 Gt/yr. Over the period 2003-08, we obtain an ice dynamic trend of 12 Gt/yr and a SMB trend of 15 Gt/yr, with a total mass trend of 27 Gt/yr. We then imposed the condition that the surface mass balance is tightly constrained by the regional climate model RACMO2.3 and allowed height changes due to ice dynamics to occur in areas of low surface velocities (solution that satisfies all the input data, given these constraints. By imposing these conditions, over the period 2003-13 we obtained a mass
Bayesian inference for psychology. Part II: Example applications with JASP.
Wagenmakers, Eric-Jan; Love, Jonathon; Marsman, Maarten; Jamil, Tahira; Ly, Alexander; Verhagen, Josine; Selker, Ravi; Gronau, Quentin F; Dropmann, Damian; Boutin, Bruno; Meerhoff, Frans; Knight, Patrick; Raj, Akash; van Kesteren, Erik-Jan; van Doorn, Johnny; Šmíra, Martin; Epskamp, Sacha; Etz, Alexander; Matzke, Dora; de Jong, Tim; van den Bergh, Don; Sarafoglou, Alexandra; Steingroever, Helen; Derks, Koen; Rouder, Jeffrey N; Morey, Richard D
2018-02-01
Bayesian hypothesis testing presents an attractive alternative to p value hypothesis testing. Part I of this series outlined several advantages of Bayesian hypothesis testing, including the ability to quantify evidence and the ability to monitor and update this evidence as data come in, without the need to know the intention with which the data were collected. Despite these and other practical advantages, Bayesian hypothesis tests are still reported relatively rarely. An important impediment to the widespread adoption of Bayesian tests is arguably the lack of user-friendly software for the run-of-the-mill statistical problems that confront psychologists for the analysis of almost every experiment: the t-test, ANOVA, correlation, regression, and contingency tables. In Part II of this series we introduce JASP ( http://www.jasp-stats.org ), an open-source, cross-platform, user-friendly graphical software package that allows users to carry out Bayesian hypothesis tests for standard statistical problems. JASP is based in part on the Bayesian analyses implemented in Morey and Rouder's BayesFactor package for R. Armed with JASP, the practical advantages of Bayesian hypothesis testing are only a mouse click away.
Radial anisotropy of Northeast Asia inferred from Bayesian inversions of ambient noise data
Lee, S. J.; Kim, S.; Rhie, J.
2017-12-01
The eastern margin of the Eurasia plate exhibits complex tectonic settings due to interactions with the subducting Pacific and Philippine Sea plates and the colliding India plate. Distributed extensional basins and intraplate volcanoes, and their heterogeneous features in the region are not easily explained with a simple mechanism. Observations of radial anisotropy in the entire lithosphere and the part of the asthenosphere provide the most effective evidence for the deformation of the lithosphere and the associated variation of the lithosphere-asthenosphere boundary (LAB). To infer anisotropic structures of crustal and upper-mantle in this region, radial anisotropy is measured using ambient noise data. In a continuation of previous Rayleigh wave tomography study in Northeast Asia, we conduct Love wave tomography to determine radial anisotropy using the Bayesian inversion techniques. Continuous seismic noise recordings of 237 broad-band seismic stations are used and more than 55,000 group and phase velocities of fundamental mode are measured for periods of 5-60 s. Total 8 different types of dispersion maps of Love wave from this study (period 10-60 s), Rayleigh wave from previous tomographic study (Kim et al., 2016; period 8-70 s) and longer period data (period 70-200 s) from a global model (Ekstrom, 2011) are jointly inverted using a hierarchical and transdimensional Bayesian technique. For each grid-node, boundary depths, velocities and anisotropy parameters of layers are sampled simultaneously on the assumption of the layered half-space model. The constructed 3-D radial anisotropy model provides much more details about the crust and upper mantle anisotropic structures, and about the complex undulation of the LAB.
Narimani, Zahra; Beigy, Hamid; Ahmad, Ashar; Masoudi-Nejad, Ali; Fröhlich, Holger
2017-01-01
Inferring the structure of molecular networks from time series protein or gene expression data provides valuable information about the complex biological processes of the cell. Causal network structure inference has been approached using different methods in the past. Most causal network inference techniques, such as Dynamic Bayesian Networks and ordinary differential equations, are limited by their computational complexity and thus make large scale inference infeasible. This is specifically true if a Bayesian framework is applied in order to deal with the unavoidable uncertainty about the correct model. We devise a novel Bayesian network reverse engineering approach using ordinary differential equations with the ability to include non-linearity. Besides modeling arbitrary, possibly combinatorial and time dependent perturbations with unknown targets, one of our main contributions is the use of Expectation Propagation, an algorithm for approximate Bayesian inference over large scale network structures in short computation time. We further explore the possibility of integrating prior knowledge into network inference. We evaluate the proposed model on DREAM4 and DREAM8 data and find it competitive against several state-of-the-art existing network inference methods.
Bayesian Predictive Inference of a Proportion Under a Twofold Small-Area Model
Directory of Open Access Journals (Sweden)
Nandram Balgobin
2016-03-01
Full Text Available We extend the twofold small-area model of Stukel and Rao (1997; 1999 to accommodate binary data. An example is the Third International Mathematics and Science Study (TIMSS, in which pass-fail data for mathematics of students from US schools (clusters are available at the third grade by regions and communities (small areas. We compare the finite population proportions of these small areas. We present a hierarchical Bayesian model in which the firststage binary responses have independent Bernoulli distributions, and each subsequent stage is modeled using a beta distribution, which is parameterized by its mean and a correlation coefficient. This twofold small-area model has an intracluster correlation at the first stage and an intercluster correlation at the second stage. The final-stage mean and all correlations are assumed to be noninformative independent random variables. We show how to infer the finite population proportion of each area. We have applied our models to synthetic TIMSS data to show that the twofold model is preferred over a onefold small-area model that ignores the clustering within areas. We further compare these models using a simulation study, which shows that the intracluster correlation is particularly important.
Bayesian inference of the heat transfer properties of a wall using experimental data
Iglesias, Marco
2016-01-06
A hierarchical Bayesian inference method is developed to estimate the thermal resistance and volumetric heat capacity of a wall. We apply our methodology to a real case study where measurements are recorded each minute from two temperature probes and two heat flux sensors placed on both sides of a solid brick wall along a period of almost five days. We model the heat transfer through the wall by means of the one-dimensional heat equation with Dirichlet boundary conditions. The initial/boundary conditions for the temperature are approximated by piecewise linear functions. We assume that temperature and heat flux measurements have independent Gaussian noise and derive the joint likelihood of the wall parameters and the initial/boundary conditions. Under the model assumptions, the boundary conditions are marginalized analytically from the joint likelihood. ApproximatedGaussian posterior distributions for the wall parameters and the initial condition parameter are obtained using the Laplace method, after incorporating the available prior information. The information gain is estimated under different experimental setups, to determine the best allocation of resources.
Numerical methods for Bayesian inference in the face of aging
International Nuclear Information System (INIS)
Clarotti, C.A.; Villain, B.; Procaccia, H.
1996-01-01
In recent years, much attention has been paid to Bayesian methods for Risk Assessment. Until now, these methods have been studied from a theoretical point of view. Researchers have been mainly interested in: studying the effectiveness of Bayesian methods in handling rare events; debating about the problem of priors and other philosophical issues. An aspect central to the Bayesian approach is numerical computation because any safety/reliability problem, in a Bayesian frame, ends with a problem of numerical integration. This aspect has been neglected until now because most Risk studies assumed the Exponential model as the basic probabilistic model. The existence of conjugate priors makes numerical integration unnecessary in this case. If aging is to be taken into account, no conjugate family is available and the use of numerical integration becomes compulsory. EDF (National Board of Electricity, of France) and ENEA (National Committee for Energy, New Technologies and Environment, of Italy) jointly carried out a research program aimed at developing quadrature methods suitable for Bayesian Interference with underlying Weibull or gamma distributions. The paper will illustrate the main results achieved during the above research program and will discuss, via some sample cases, the performances of the numerical algorithms which on the appearance of stress corrosion cracking in the tubes of Steam Generators of PWR French power plants. (authors)
Bayesian inference for disease prevalence using negative binomial group testing
Pritchard, Nicholas A.; Tebbs, Joshua M.
2011-01-01
Group testing, also known as pooled testing, and inverse sampling are both widely used methods of data collection when the goal is to estimate a small proportion. Taking a Bayesian approach, we consider the new problem of estimating disease prevalence from group testing when inverse (negative binomial) sampling is used. Using different distributions to incorporate prior knowledge of disease incidence and different loss functions, we derive closed form expressions for posterior distributions and resulting point and credible interval estimators. We then evaluate our new estimators, on Bayesian and classical grounds, and apply our methods to a West Nile Virus data set. PMID:21259308
Thomas, D.L.; Johnson, D.; Griffith, B.
2006-01-01
Bayesian hierarchical discrete-choice model for resource selection can provide managers with 2 components of population-level inference: average population selection and variability of selection. Both components are necessary to make sound management decisions based on animal selection.
DEFF Research Database (Denmark)
Picchini, Umberto; Forman, Julie Lyng
2016-01-01
a nonlinear stochastic differential equation model observed with correlated measurement errors and an application to protein folding modelling. An approximate Bayesian computation (ABC)-MCMC algorithm is suggested to allow inference for model parameters within reasonable time constraints. The ABC algorithm......In recent years, dynamical modelling has been provided with a range of breakthrough methods to perform exact Bayesian inference. However, it is often computationally unfeasible to apply exact statistical methodologies in the context of large data sets and complex models. This paper considers...... applications. A simulation study is conducted to compare our strategy with exact Bayesian inference, the latter resulting two orders of magnitude slower than ABC-MCMC for the considered set-up. Finally, the ABC algorithm is applied to a large size protein data. The suggested methodology is fairly general...
Sparse Bayesian Inference and the Temperature Structure of the Solar Corona
Energy Technology Data Exchange (ETDEWEB)
Warren, Harry P. [Space Science Division, Naval Research Laboratory, Washington, DC 20375 (United States); Byers, Jeff M. [Materials Science and Technology Division, Naval Research Laboratory, Washington, DC 20375 (United States); Crump, Nicholas A. [Naval Center for Space Technology, Naval Research Laboratory, Washington, DC 20375 (United States)
2017-02-20
Measuring the temperature structure of the solar atmosphere is critical to understanding how it is heated to high temperatures. Unfortunately, the temperature of the upper atmosphere cannot be observed directly, but must be inferred from spectrally resolved observations of individual emission lines that span a wide range of temperatures. Such observations are “inverted” to determine the distribution of plasma temperatures along the line of sight. This inversion is ill posed and, in the absence of regularization, tends to produce wildly oscillatory solutions. We introduce the application of sparse Bayesian inference to the problem of inferring the temperature structure of the solar corona. Within a Bayesian framework a preference for solutions that utilize a minimum number of basis functions can be encoded into the prior and many ad hoc assumptions can be avoided. We demonstrate the efficacy of the Bayesian approach by considering a test library of 40 assumed temperature distributions.
Sraj, Ihab; Zedler, Sarah E.; Knio, Omar; Jackson, Charles S.; Hoteit, Ibrahim
2016-01-01
The authors present a polynomial chaos (PC)-based Bayesian inference method for quantifying the uncertainties of the K-profile parameterization (KPP) within the MIT general circulation model (MITgcm) of the tropical Pacific. The inference
Directory of Open Access Journals (Sweden)
Fidel Ernesto Castro Morales
2016-03-01
Full Text Available Abstract Objectives: to propose the use of a Bayesian hierarchical model to study the allometric scaling of the fetoplacental weight ratio, including possible confounders. Methods: data from 26 singleton pregnancies with gestational age at birth between 37 and 42 weeks were analyzed. The placentas were collected immediately after delivery and stored under refrigeration until the time of analysis, which occurred within up to 12 hours. Maternal data were collected from medical records. A Bayesian hierarchical model was proposed and Markov chain Monte Carlo simulation methods were used to obtain samples from distribution a posteriori. Results: the model developed showed a reasonable fit, even allowing for the incorporation of variables and a priori information on the parameters used. Conclusions: new variables can be added to the modelfrom the available code, allowing many possibilities for data analysis and indicating the potential for use in research on the subject.
Nonparametric Bayesian inference for multidimensional compound Poisson processes
Gugushvili, S.; van der Meulen, F.; Spreij, P.
2015-01-01
Given a sample from a discretely observed multidimensional compound Poisson process, we study the problem of nonparametric estimation of its jump size density r0 and intensity λ0. We take a nonparametric Bayesian approach to the problem and determine posterior contraction rates in this context,
Bayesian inference using WBDev: a tutorial for social scientists
Wetzels, R.; Lee, M.D.; Wagenmakers, E.-J.
2010-01-01
Over the last decade, the popularity of Bayesian data analysis in the empirical sciences has greatly increased. This is partly due to the availability of WinBUGS, a free and flexible statistical software package that comes with an array of predefined functions and distributions, allowing users to
Yi Huang; Francesca Dominici; Michelle Bell
2004-01-01
In this paper, we develop Bayesian hierarchical distributed lag models for estimating associations between daily variations in summer ozone levels and daily variations in cardiovascular and respiratory (CVDRESP) mortality counts for 19 U.S. large cities included in the National Morbidity Mortality Air Pollution Study (NMMAPS) for the period 1987 - 1994. At the first stage, we define a semi-parametric distributed lag Poisson regression model to estimate city-specific relative rates of CVDRESP ...
Metis: A Pure Metropolis Markov Chain Monte Carlo Bayesian Inference Library
Energy Technology Data Exchange (ETDEWEB)
Bates, Cameron Russell [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Mckigney, Edward Allen [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
2018-01-09
The use of Bayesian inference in data analysis has become the standard for large scienti c experiments [1, 2]. The Monte Carlo Codes Group(XCP-3) at Los Alamos has developed a simple set of algorithms currently implemented in C++ and Python to easily perform at-prior Markov Chain Monte Carlo Bayesian inference with pure Metropolis sampling. These implementations are designed to be user friendly and extensible for customization based on speci c application requirements. This document describes the algorithmic choices made and presents two use cases.
Statistical modelling of railway track geometry degradation using Hierarchical Bayesian models
International Nuclear Information System (INIS)
Andrade, A.R.; Teixeira, P.F.
2015-01-01
Railway maintenance planners require a predictive model that can assess the railway track geometry degradation. The present paper uses a Hierarchical Bayesian model as a tool to model the main two quality indicators related to railway track geometry degradation: the standard deviation of longitudinal level defects and the standard deviation of horizontal alignment defects. Hierarchical Bayesian Models (HBM) are flexible statistical models that allow specifying different spatially correlated components between consecutive track sections, namely for the deterioration rates and the initial qualities parameters. HBM are developed for both quality indicators, conducting an extensive comparison between candidate models and a sensitivity analysis on prior distributions. HBM is applied to provide an overall assessment of the degradation of railway track geometry, for the main Portuguese railway line Lisbon–Oporto. - Highlights: • Rail track geometry degradation is analysed using Hierarchical Bayesian models. • A Gibbs sampling strategy is put forward to estimate the HBM. • Model comparison and sensitivity analysis find the most suitable model. • We applied the most suitable model to all the segments of the main Portuguese line. • Tackling spatial correlations using CAR structures lead to a better model fit
Hierarchical Bayesian Analysis of Biased Beliefs and Distributional Other-Regarding Preferences
Directory of Open Access Journals (Sweden)
Jeroen Weesie
2013-02-01
Full Text Available This study investigates the relationship between an actor’s beliefs about others’ other-regarding (social preferences and her own other-regarding preferences, using an “avant-garde” hierarchical Bayesian method. We estimate two distributional other-regarding preference parameters, α and β, of actors using incentivized choice data in binary Dictator Games. Simultaneously, we estimate the distribution of actors’ beliefs about others α and β, conditional on actors’ own α and β, with incentivized belief elicitation. We demonstrate the benefits of the Bayesian method compared to it’s hierarchical frequentist counterparts. Results show a positive association between an actor’s own (α; β and her beliefs about average(α; β in the population. The association between own preferences and the variance in beliefs about others’ preferences in the population, however, is curvilinear for α and insignificant for β. These results are partially consistent with the cone effect [1,2] which is described in detail below. Because in the Bayesian-Nash equilibrium concept, beliefs and own preferences are assumed to be independent, these results cast doubt on the application of the Bayesian-Nash equilibrium concept to experimental data.
Inference method using bayesian network for diagnosis of pulmonary nodules
International Nuclear Information System (INIS)
Kawagishi, Masami; Iizuka, Yoshio; Yamamoto, Hiroyuki; Yakami, Masahiro; Kubo, Takeshi; Fujimoto, Koji; Togashi, Kaori
2010-01-01
This report describes the improvements of a naive Bayes model that infers the diagnosis of pulmonary nodules in chest CT images based on the findings obtained when a radiologist interprets the CT images. We have previously introduced an inference model using a naive Bayes classifier and have reported its clinical value based on evaluation using clinical data. In the present report, we introduce the following improvements to the original inference model: the selection of findings based on correlations and the generation of a model using only these findings, and the introduction of classifiers that integrate several simple classifiers each of which is specialized for specific diagnosis. These improvements were found to increase the inference accuracy by 10.4% (p<.01) as compared to the original model in 100 cases (222 nodules) based on leave-one-out evaluation. (author)
On a full Bayesian inference for force reconstruction problems
Aucejo, M.; De Smet, O.
2018-05-01
In a previous paper, the authors introduced a flexible methodology for reconstructing mechanical sources in the frequency domain from prior local information on both their nature and location over a linear and time invariant structure. The proposed approach was derived from Bayesian statistics, because of its ability in mathematically accounting for experimenter's prior knowledge. However, since only the Maximum a Posteriori estimate was computed, the posterior uncertainty about the regularized solution given the measured vibration field, the mechanical model and the regularization parameter was not assessed. To answer this legitimate question, this paper fully exploits the Bayesian framework to provide, from a Markov Chain Monte Carlo algorithm, credible intervals and other statistical measures (mean, median, mode) for all the parameters of the force reconstruction problem.
Operational modal analysis modeling, Bayesian inference, uncertainty laws
Au, Siu-Kui
2017-01-01
This book presents operational modal analysis (OMA), employing a coherent and comprehensive Bayesian framework for modal identification and covering stochastic modeling, theoretical formulations, computational algorithms, and practical applications. Mathematical similarities and philosophical differences between Bayesian and classical statistical approaches to system identification are discussed, allowing their mathematical tools to be shared and their results correctly interpreted. Many chapters can be used as lecture notes for the general topic they cover beyond the OMA context. After an introductory chapter (1), Chapters 2–7 present the general theory of stochastic modeling and analysis of ambient vibrations. Readers are first introduced to the spectral analysis of deterministic time series (2) and structural dynamics (3), which do not require the use of probability concepts. The concepts and techniques in these chapters are subsequently extended to a probabilistic context in Chapter 4 (on stochastic pro...
Bayesian inference and the analytic continuation of imaginary-time quantum Monte Carlo data
International Nuclear Information System (INIS)
Gubernatis, J.E.; Bonca, J.; Jarrell, M.
1995-01-01
We present brief description of how methods of Bayesian inference are used to obtain real frequency information by the analytic continuation of imaginary-time quantum Monte Carlo data. We present the procedure we used, which is due to R. K. Bryan, and summarize several bottleneck issues
DEFF Research Database (Denmark)
Jacobsen, Christian Robert Dahl; Møller, Jesper
2017-01-01
We introduce new estimation methods for a subclass of the Gaussian scale mixture models for wavelet trees by Wainwright, Simoncelli and Willsky that rely on modern results for composite likelihoods and approximate Bayesian inference. Our methodology is illustrated for denoising and edge detection...
Paudel, Y.; Botzen, W.J.W.; Aerts, J.C.J.H.
2013-01-01
This study applies Bayesian Inference to estimate flood risk for 53 dyke ring areas in the Netherlands, and focuses particularly on the data scarcity and extreme behaviour of catastrophe risk. The probability density curves of flood damage are estimated through Monte Carlo simulations. Based on
Bayesian Inference for Linear Parabolic PDEs with Noisy Boundary Conditions
Ruggeri, Fabrizio; Sawlan, Zaid A; Scavino, Marco; Tempone, Raul
2015-01-01
have been assumed for the time-dependent Dirichlet boundary values. Our approach is applied to synthetic data for the one-dimensional heat equation model, where the thermal diffusivity is the unknown parameter. We show how to infer the thermal
2017-09-01
application of statistical inference. Even when human forecasters leverage their professional experience, which is often gained through long periods of... application throughout statistics and Bayesian data analysis. The multivariate form of 2( , ) (e.g., Figure 12) is similarly analytically...data (i.e., no systematic manipulations with analytical functions), it is common in the statistical literature to apply mathematical transformations
Bayesian inference and model comparison for metallic fatigue data
Babuška, Ivo
2016-02-23
In this work, we present a statistical treatment of stress-life (S-N) data drawn from a collection of records of fatigue experiments that were performed on 75S-T6 aluminum alloys. Our main objective is to predict the fatigue life of materials by providing a systematic approach to model calibration, model selection and model ranking with reference to S-N data. To this purpose, we consider fatigue-limit models and random fatigue-limit models that are specially designed to allow the treatment of the run-outs (right-censored data). We first fit the models to the data by maximum likelihood methods and estimate the quantiles of the life distribution of the alloy specimen. To assess the robustness of the estimation of the quantile functions, we obtain bootstrap confidence bands by stratified resampling with respect to the cycle ratio. We then compare and rank the models by classical measures of fit based on information criteria. We also consider a Bayesian approach that provides, under the prior distribution of the model parameters selected by the user, their simulation-based posterior distributions. We implement and apply Bayesian model comparison methods, such as Bayes factor ranking and predictive information criteria based on cross-validation techniques under various a priori scenarios.
Bayesian inference and model comparison for metallic fatigue data
Babuška, Ivo; Sawlan, Zaid A; Scavino, Marco; Szabó , Barna; Tempone, Raul
2016-01-01
In this work, we present a statistical treatment of stress-life (S-N) data drawn from a collection of records of fatigue experiments that were performed on 75S-T6 aluminum alloys. Our main objective is to predict the fatigue life of materials by providing a systematic approach to model calibration, model selection and model ranking with reference to S-N data. To this purpose, we consider fatigue-limit models and random fatigue-limit models that are specially designed to allow the treatment of the run-outs (right-censored data). We first fit the models to the data by maximum likelihood methods and estimate the quantiles of the life distribution of the alloy specimen. To assess the robustness of the estimation of the quantile functions, we obtain bootstrap confidence bands by stratified resampling with respect to the cycle ratio. We then compare and rank the models by classical measures of fit based on information criteria. We also consider a Bayesian approach that provides, under the prior distribution of the model parameters selected by the user, their simulation-based posterior distributions. We implement and apply Bayesian model comparison methods, such as Bayes factor ranking and predictive information criteria based on cross-validation techniques under various a priori scenarios.
Risk Assessment for Mobile Systems Through a Multilayered Hierarchical Bayesian Network.
Li, Shancang; Tryfonas, Theo; Russell, Gordon; Andriotis, Panagiotis
2016-08-01
Mobile systems are facing a number of application vulnerabilities that can be combined together and utilized to penetrate systems with devastating impact. When assessing the overall security of a mobile system, it is important to assess the security risks posed by each mobile applications (apps), thus gaining a stronger understanding of any vulnerabilities present. This paper aims at developing a three-layer framework that assesses the potential risks which apps introduce within the Android mobile systems. A Bayesian risk graphical model is proposed to evaluate risk propagation in a layered risk architecture. By integrating static analysis, dynamic analysis, and behavior analysis in a hierarchical framework, the risks and their propagation through each layer are well modeled by the Bayesian risk graph, which can quantitatively analyze risks faced to both apps and mobile systems. The proposed hierarchical Bayesian risk graph model offers a novel way to investigate the security risks in mobile environment and enables users and administrators to evaluate the potential risks. This strategy allows to strengthen both app security as well as the security of the entire system.
Practical Bayesian inference a primer for physical scientists
Bailer-Jones, Coryn A L
2017-01-01
Science is fundamentally about learning from data, and doing so in the presence of uncertainty. This volume is an introduction to the major concepts of probability and statistics, and the computational tools for analysing and interpreting data. It describes the Bayesian approach, and explains how this can be used to fit and compare models in a range of problems. Topics covered include regression, parameter estimation, model assessment, and Monte Carlo methods, as well as widely used classical methods such as regularization and hypothesis testing. The emphasis throughout is on the principles, the unifying probabilistic approach, and showing how the methods can be implemented in practice. R code (with explanations) is included and is available online, so readers can reproduce the plots and results for themselves. Aimed primarily at undergraduate and graduate students, these techniques can be applied to a wide range of data analysis problems beyond the scope of this work.
Sahai, Swupnil
This thesis includes three parts. The overarching theme is how to analyze structured hierarchical data, with applications to astronomy and sociology. The first part discusses how expectation propagation can be used to parallelize the computation when fitting big hierarchical bayesian models. This methodology is then used to fit a novel, nonlinear mixture model to ultraviolet radiation from various regions of the observable universe. The second part discusses how the Stan probabilistic programming language can be used to numerically integrate terms in a hierarchical bayesian model. This technique is demonstrated on supernovae data to significantly speed up convergence to the posterior distribution compared to a previous study that used a Gibbs-type sampler. The third part builds a formal latent kernel representation for aggregate relational data as a way to more robustly estimate the mixing characteristics of agents in a network. In particular, the framework is applied to sociology surveys to estimate, as a function of ego age, the age and sex composition of the personal networks of individuals in the United States.
Directory of Open Access Journals (Sweden)
A. A. Zolotin
2015-07-01
Full Text Available Posteriori inference is one of the three kinds of probabilistic-logic inferences in the probabilistic graphical models theory and the base for processing of knowledge patterns with probabilistic uncertainty using Bayesian networks. The paper deals with a task of local posteriori inference description in algebraic Bayesian networks that represent a class of probabilistic graphical models by means of matrix-vector equations. The latter are essentially based on the use of tensor product of matrices, Kronecker degree and Hadamard product. Matrix equations for calculating posteriori probabilities vectors within posteriori inference in knowledge patterns with quanta propositions are obtained. Similar equations of the same type have already been discussed within the confines of the theory of algebraic Bayesian networks, but they were built only for the case of posteriori inference in the knowledge patterns on the ideals of conjuncts. During synthesis and development of matrix-vector equations on quanta propositions probability vectors, a number of earlier results concerning normalizing factors in posteriori inference and assignment of linear projective operator with a selector vector was adapted. We consider all three types of incoming evidences - deterministic, stochastic and inaccurate - combined with scalar and interval estimation of probability truth of propositional formulas in the knowledge patterns. Linear programming problems are formed. Their solution gives the desired interval values of posterior probabilities in the case of inaccurate evidence or interval estimates in a knowledge pattern. That sort of description of a posteriori inference gives the possibility to extend the set of knowledge pattern types that we can use in the local and global posteriori inference, as well as simplify complex software implementation by use of existing third-party libraries, effectively supporting submission and processing of matrices and vectors when
Bayesian Uncertainty Quantification for Subsurface Inversion Using a Multiscale Hierarchical Model
Mondal, Anirban
2014-07-03
We consider a Bayesian approach to nonlinear inverse problems in which the unknown quantity is a random field (spatial or temporal). The Bayesian approach contains a natural mechanism for regularization in the form of prior information, can incorporate information from heterogeneous sources and provide a quantitative assessment of uncertainty in the inverse solution. The Bayesian setting casts the inverse solution as a posterior probability distribution over the model parameters. The Karhunen-Loeve expansion is used for dimension reduction of the random field. Furthermore, we use a hierarchical Bayes model to inject multiscale data in the modeling framework. In this Bayesian framework, we show that this inverse problem is well-posed by proving that the posterior measure is Lipschitz continuous with respect to the data in total variation norm. Computational challenges in this construction arise from the need for repeated evaluations of the forward model (e.g., in the context of MCMC) and are compounded by high dimensionality of the posterior. We develop two-stage reversible jump MCMC that has the ability to screen the bad proposals in the first inexpensive stage. Numerical results are presented by analyzing simulated as well as real data from hydrocarbon reservoir. This article has supplementary material available online. © 2014 American Statistical Association and the American Society for Quality.
A Bayesian Hierarchical Model for Relating Multiple SNPs within Multiple Genes to Disease Risk
Directory of Open Access Journals (Sweden)
Lewei Duan
2013-01-01
Full Text Available A variety of methods have been proposed for studying the association of multiple genes thought to be involved in a common pathway for a particular disease. Here, we present an extension of a Bayesian hierarchical modeling strategy that allows for multiple SNPs within each gene, with external prior information at either the SNP or gene level. The model involves variable selection at the SNP level through latent indicator variables and Bayesian shrinkage at the gene level towards a prior mean vector and covariance matrix that depend on external information. The entire model is fitted using Markov chain Monte Carlo methods. Simulation studies show that the approach is capable of recovering many of the truly causal SNPs and genes, depending upon their frequency and size of their effects. The method is applied to data on 504 SNPs in 38 candidate genes involved in DNA damage response in the WECARE study of second breast cancers in relation to radiotherapy exposure.
International Nuclear Information System (INIS)
Von Nessi, G T; Hole, M J
2014-01-01
We present recent results and technical breakthroughs for the Bayesian inference of tokamak equilibria using force-balance as a prior constraint. Issues surrounding model parameter representation and posterior analysis are discussed and addressed. These points motivate the recent advancements embodied in the Bayesian Equilibrium Analysis and Simulation Tool (BEAST) software being presently utilized to study equilibria on the Mega-Ampere Spherical Tokamak (MAST) experiment in the UK (von Nessi et al 2012 J. Phys. A 46 185501). State-of-the-art results of using BEAST to study MAST equilibria are reviewed, with recent code advancements being systematically presented though out the manuscript. (paper)
Bayesian Inference on the Memory Parameter for Gamma-Modulated Regression Models
Directory of Open Access Journals (Sweden)
Plinio Andrade
2015-09-01
Full Text Available In this work, we propose a Bayesian methodology to make inferences for the memory parameter and other characteristics under non-standard assumptions for a class of stochastic processes. This class generalizes the Gamma-modulated process, with trajectories that exhibit long memory behavior, as well as decreasing variability as time increases. Different values of the memory parameter influence the speed of this decrease, making this heteroscedastic model very flexible. Its properties are used to implement an approximate Bayesian computation and MCMC scheme to obtain posterior estimates. We test and validate our method through simulations and real data from the big earthquake that occurred in 2010 in Chile.
Bayesian inference and model comparison for metallic fatigue data
Babuska, Ivo
2016-01-06
In this work, we present a statistical treatment of stress-life (S-N) data drawn from a collection of records of fatigue experiments that were performed on 75S-T6 aluminum alloys. Our main objective is to predict the fatigue life of materials by providing a systematic approach to model calibration, model selection and model ranking with reference to S-N data. To this purpose, we consider fatigue-limit models and random fatigue-limit models that are specially designed to allow the treatment of the run-outs (right-censored data). We first fit the models to the data by maximum likelihood methods and estimate the quantiles of the life distribution of the alloy specimen. We then compare and rank the models by classical measures of fit based on information criteria. We also consider a Bayesian approach that provides, under the prior distribution of the model parameters selected by the user, their simulation-based posterior distributions.
Estimating mountain basin-mean precipitation from streamflow using Bayesian inference
Henn, Brian; Clark, Martyn P.; Kavetski, Dmitri; Lundquist, Jessica D.
2015-10-01
Estimating basin-mean precipitation in complex terrain is difficult due to uncertainty in the topographical representativeness of precipitation gauges relative to the basin. To address this issue, we use Bayesian methodology coupled with a multimodel framework to infer basin-mean precipitation from streamflow observations, and we apply this approach to snow-dominated basins in the Sierra Nevada of California. Using streamflow observations, forcing data from lower-elevation stations, the Bayesian Total Error Analysis (BATEA) methodology and the Framework for Understanding Structural Errors (FUSE), we infer basin-mean precipitation, and compare it to basin-mean precipitation estimated using topographically informed interpolation from gauges (PRISM, the Parameter-elevation Regression on Independent Slopes Model). The BATEA-inferred spatial patterns of precipitation show agreement with PRISM in terms of the rank of basins from wet to dry but differ in absolute values. In some of the basins, these differences may reflect biases in PRISM, because some implied PRISM runoff ratios may be inconsistent with the regional climate. We also infer annual time series of basin precipitation using a two-step calibration approach. Assessment of the precision and robustness of the BATEA approach suggests that uncertainty in the BATEA-inferred precipitation is primarily related to uncertainties in hydrologic model structure. Despite these limitations, time series of inferred annual precipitation under different model and parameter assumptions are strongly correlated with one another, suggesting that this approach is capable of resolving year-to-year variability in basin-mean precipitation.
Shao, Kan; Allen, Bruce C; Wheeler, Matthew W
2017-10-01
Human variability is a very important factor considered in human health risk assessment for protecting sensitive populations from chemical exposure. Traditionally, to account for this variability, an interhuman uncertainty factor is applied to lower the exposure limit. However, using a fixed uncertainty factor rather than probabilistically accounting for human variability can hardly support probabilistic risk assessment advocated by a number of researchers; new methods are needed to probabilistically quantify human population variability. We propose a Bayesian hierarchical model to quantify variability among different populations. This approach jointly characterizes the distribution of risk at background exposure and the sensitivity of response to exposure, which are commonly represented by model parameters. We demonstrate, through both an application to real data and a simulation study, that using the proposed hierarchical structure adequately characterizes variability across different populations. © 2016 Society for Risk Analysis.
Gunji, Yukio-Pegio; Shinohara, Shuji; Haruna, Taichi; Basios, Vasileios
2017-02-01
To overcome the dualism between mind and matter and to implement consciousness in science, a physical entity has to be embedded with a measurement process. Although quantum mechanics have been regarded as a candidate for implementing consciousness, nature at its macroscopic level is inconsistent with quantum mechanics. We propose a measurement-oriented inference system comprising Bayesian and inverse Bayesian inferences. While Bayesian inference contracts probability space, the newly defined inverse one relaxes the space. These two inferences allow an agent to make a decision corresponding to an immediate change in their environment. They generate a particular pattern of joint probability for data and hypotheses, comprising multiple diagonal and noisy matrices. This is expressed as a nondistributive orthomodular lattice equivalent to quantum logic. We also show that an orthomodular lattice can reveal information generated by inverse syllogism as well as the solutions to the frame and symbol-grounding problems. Our model is the first to connect macroscopic cognitive processes with the mathematical structure of quantum mechanics with no additional assumptions. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
An empirical Bayesian approach for model-based inference of cellular signaling networks
Directory of Open Access Journals (Sweden)
Klinke David J
2009-11-01
Full Text Available Abstract Background A common challenge in systems biology is to infer mechanistic descriptions of biological process given limited observations of a biological system. Mathematical models are frequently used to represent a belief about the causal relationships among proteins within a signaling network. Bayesian methods provide an attractive framework for inferring the validity of those beliefs in the context of the available data. However, efficient sampling of high-dimensional parameter space and appropriate convergence criteria provide barriers for implementing an empirical Bayesian approach. The objective of this study was to apply an Adaptive Markov chain Monte Carlo technique to a typical study of cellular signaling pathways. Results As an illustrative example, a kinetic model for the early signaling events associated with the epidermal growth factor (EGF signaling network was calibrated against dynamic measurements observed in primary rat hepatocytes. A convergence criterion, based upon the Gelman-Rubin potential scale reduction factor, was applied to the model predictions. The posterior distributions of the parameters exhibited complicated structure, including significant covariance between specific parameters and a broad range of variance among the parameters. The model predictions, in contrast, were narrowly distributed and were used to identify areas of agreement among a collection of experimental studies. Conclusion In summary, an empirical Bayesian approach was developed for inferring the confidence that one can place in a particular model that describes signal transduction mechanisms and for inferring inconsistencies in experimental measurements.
Large-Scale Optimization for Bayesian Inference in Complex Systems
Energy Technology Data Exchange (ETDEWEB)
Willcox, Karen [MIT; Marzouk, Youssef [MIT
2013-11-12
The SAGUARO (Scalable Algorithms for Groundwater Uncertainty Analysis and Robust Optimization) Project focused on the development of scalable numerical algorithms for large-scale Bayesian inversion in complex systems that capitalize on advances in large-scale simulation-based optimization and inversion methods. The project was a collaborative effort among MIT, the University of Texas at Austin, Georgia Institute of Technology, and Sandia National Laboratories. The research was directed in three complementary areas: efficient approximations of the Hessian operator, reductions in complexity of forward simulations via stochastic spectral approximations and model reduction, and employing large-scale optimization concepts to accelerate sampling. The MIT--Sandia component of the SAGUARO Project addressed the intractability of conventional sampling methods for large-scale statistical inverse problems by devising reduced-order models that are faithful to the full-order model over a wide range of parameter values; sampling then employs the reduced model rather than the full model, resulting in very large computational savings. Results indicate little effect on the computed posterior distribution. On the other hand, in the Texas--Georgia Tech component of the project, we retain the full-order model, but exploit inverse problem structure (adjoint-based gradients and partial Hessian information of the parameter-to-observation map) to implicitly extract lower dimensional information on the posterior distribution; this greatly speeds up sampling methods, so that fewer sampling points are needed. We can think of these two approaches as ``reduce then sample'' and ``sample then reduce.'' In fact, these two approaches are complementary, and can be used in conjunction with each other. Moreover, they both exploit deterministic inverse problem structure, in the form of adjoint-based gradient and Hessian information of the underlying parameter-to-observation map, to
Bayesian inference of shared recombination hotspots between humans and chimpanzees.
Wang, Ying; Rannala, Bruce
2014-12-01
Recombination generates variation and facilitates evolution. Recombination (or lack thereof) also contributes to human genetic disease. Methods for mapping genes influencing complex genetic diseases via association rely on linkage disequilibrium (LD) in human populations, which is influenced by rates of recombination across the genome. Comparative population genomic analyses of recombination using related primate species can identify factors influencing rates of recombination in humans. Such studies can indicate how variable hotspots for recombination may be both among individuals (or populations) and over evolutionary timescales. Previous studies have suggested that locations of recombination hotspots are not conserved between humans and chimpanzees. We made use of the data sets from recent resequencing projects and applied a Bayesian method for identifying hotspots and estimating recombination rates. We also reanalyzed SNP data sets for regions with known hotspots in humans using samples from the human and chimpanzee. The Bayes factors (BF) of shared recombination hotspots between human and chimpanzee across regions were obtained. Based on the analysis of the aligned regions of human chromosome 21, locations where the two species show evidence of shared recombination hotspots (with high BFs) were identified. Interestingly, previous comparative studies of human and chimpanzee that focused on the known human recombination hotspots within the β-globin and HLA regions did not find overlapping of hotspots. Our results show high BFs of shared hotspots at locations within both regions, and the estimated locations of shared hotspots overlap with the locations of human recombination hotspots obtained from sperm-typing studies. Copyright © 2014 by the Genetics Society of America.
Directory of Open Access Journals (Sweden)
Salvador Dura-Bernal
Full Text Available Hierarchical generative models, such as Bayesian networks, and belief propagation have been shown to provide a theoretical framework that can account for perceptual processes, including feedforward recognition and feedback modulation. The framework explains both psychophysical and physiological experimental data and maps well onto the hierarchical distributed cortical anatomy. However, the complexity required to model cortical processes makes inference, even using approximate methods, very computationally expensive. Thus, existing object perception models based on this approach are typically limited to tree-structured networks with no loops, use small toy examples or fail to account for certain perceptual aspects such as invariance to transformations or feedback reconstruction. In this study we develop a Bayesian network with an architecture similar to that of HMAX, a biologically-inspired hierarchical model of object recognition, and use loopy belief propagation to approximate the model operations (selectivity and invariance. Crucially, the resulting Bayesian network extends the functionality of HMAX by including top-down recursive feedback. Thus, the proposed model not only achieves successful feedforward recognition invariant to noise, occlusions, and changes in position and size, but is also able to reproduce modulatory effects such as illusory contour completion and attention. Our novel and rigorous methodology covers key aspects such as learning using a layerwise greedy algorithm, combining feedback information from multiple parents and reducing the number of operations required. Overall, this work extends an established model of object recognition to include high-level feedback modulation, based on state-of-the-art probabilistic approaches. The methodology employed, consistent with evidence from the visual cortex, can be potentially generalized to build models of hierarchical perceptual organization that include top-down and bottom
DEFF Research Database (Denmark)
Møller, Jesper
2010-01-01
Chapter 9: This contribution concerns statistical inference for parametric models used in stochastic geometry and based on quick and simple simulation free procedures as well as more comprehensive methods based on a maximum likelihood or Bayesian approach combined with markov chain Monte Carlo...... (MCMC) techniques. Due to space limitations the focus is on spatial point processes....
DEFF Research Database (Denmark)
Heller, Rasmus; Chikhi, Lounes; Siegismund, Hans
2013-01-01
Many coalescent-based methods aiming to infer the demographic history of populations assume a single, isolated and panmictic population (i.e. a Wright-Fisher model). While this assumption may be reasonable under many conditions, several recent studies have shown that the results can be misleading...... when it is violated. Among the most widely applied demographic inference methods are Bayesian skyline plots (BSPs), which are used across a range of biological fields. Violations of the panmixia assumption are to be expected in many biological systems, but the consequences for skyline plot inferences...... the best scheme for inferring demographic change over a typical time scale. Analyses of data from a structured African buffalo population demonstrate how BSP results can be strengthened by simulations. We recommend that sample selection should be carefully considered in relation to population structure...
Detection of multiple damages employing best achievable eigenvectors under Bayesian inference
Prajapat, Kanta; Ray-Chaudhuri, Samit
2018-05-01
A novel approach is presented in this work to localize simultaneously multiple damaged elements in a structure along with the estimation of damage severity for each of the damaged elements. For detection of damaged elements, a best achievable eigenvector based formulation has been derived. To deal with noisy data, Bayesian inference is employed in the formulation wherein the likelihood of the Bayesian algorithm is formed on the basis of errors between the best achievable eigenvectors and the measured modes. In this approach, the most probable damage locations are evaluated under Bayesian inference by generating combinations of various possible damaged elements. Once damage locations are identified, damage severities are estimated using a Bayesian inference Markov chain Monte Carlo simulation. The efficiency of the proposed approach has been demonstrated by carrying out a numerical study involving a 12-story shear building. It has been found from this study that damage scenarios involving as low as 10% loss of stiffness in multiple elements are accurately determined (localized and severities quantified) even when 2% noise contaminated modal data are utilized. Further, this study introduces a term parameter impact (evaluated based on sensitivity of modal parameters towards structural parameters) to decide the suitability of selecting a particular mode, if some idea about the damaged elements are available. It has been demonstrated here that the accuracy and efficiency of the Bayesian quantification algorithm increases if damage localization is carried out a-priori. An experimental study involving a laboratory scale shear building and different stiffness modification scenarios shows that the proposed approach is efficient enough to localize the stories with stiffness modification.
Bayesian Inference using Neural Net Likelihood Models for Protein Secondary Structure Prediction
Directory of Open Access Journals (Sweden)
Seong-Gon Kim
2011-06-01
Full Text Available Several techniques such as Neural Networks, Genetic Algorithms, Decision Trees and other statistical or heuristic methods have been used to approach the complex non-linear task of predicting Alpha-helicies, Beta-sheets and Turns of a proteins secondary structure in the past. This project introduces a new machine learning method by using an offline trained Multilayered Perceptrons (MLP as the likelihood models within a Bayesian Inference framework to predict secondary structures proteins. Varying window sizes are used to extract neighboring amino acid information and passed back and forth between the Neural Net models and the Bayesian Inference process until there is a convergence of the posterior secondary structure probability.
Prudhomme, Serge; Bryant, Corey M.
2015-01-01
Parameter estimation for complex models using Bayesian inference is usually a very costly process as it requires a large number of solves of the forward problem. We show here how the construction of adaptive surrogate models using a posteriori error estimates for quantities of interest can significantly reduce the computational cost in problems of statistical inference. As surrogate models provide only approximations of the true solutions of the forward problem, it is nevertheless necessary to control these errors in order to construct an accurate reduced model with respect to the observables utilized in the identification of the model parameters. Effectiveness of the proposed approach is demonstrated on a numerical example dealing with the Spalart–Allmaras model for the simulation of turbulent channel flows. In particular, we illustrate how Bayesian model selection using the adapted surrogate model in place of solving the coupled nonlinear equations leads to the same quality of results while requiring fewer nonlinear PDE solves.
Prudhomme, Serge
2015-09-17
Parameter estimation for complex models using Bayesian inference is usually a very costly process as it requires a large number of solves of the forward problem. We show here how the construction of adaptive surrogate models using a posteriori error estimates for quantities of interest can significantly reduce the computational cost in problems of statistical inference. As surrogate models provide only approximations of the true solutions of the forward problem, it is nevertheless necessary to control these errors in order to construct an accurate reduced model with respect to the observables utilized in the identification of the model parameters. Effectiveness of the proposed approach is demonstrated on a numerical example dealing with the Spalart–Allmaras model for the simulation of turbulent channel flows. In particular, we illustrate how Bayesian model selection using the adapted surrogate model in place of solving the coupled nonlinear equations leads to the same quality of results while requiring fewer nonlinear PDE solves.
A method for crack sizing using Bayesian inference arising in eddy current testing
International Nuclear Information System (INIS)
Kojima, Fumio; Kikuchi, Mitsuhiro
2008-01-01
This paper is concerned with a sizing methodology of crack using Bayesian inference arising in eddy current testing. There is often uncertainty about data through quantitative measurements of nondestructive testing and this can yield misleading inference of crack sizing at on-site monitoring. In this paper, we propose optimal strategies of measurements in eddy current testing using Bayesian prior-to-posteriori analysis. First our likelihood functional is given by Gaussian distribution with the measurement model based on the hybrid use of finite and boundary element methods. Secondly, given a priori distributions of crack sizing, we propose a method for estimating the region of interest for sizing cracks. Finally an optimal sensing method is demonstrated using our idea. (author)
Shankle, William R.; Pooley, James P.; Steyvers, Mark; Hara, Junko; Mangrola, Tushar; Reisberg, Barry; Lee, Michael D.
2012-01-01
Determining how cognition affects functional abilities is important in Alzheimer’s disease and related disorders (ADRD). 280 patients (normal or ADRD) received a total of 1,514 assessments using the Functional Assessment Staging Test (FAST) procedure and the MCI Screen (MCIS). A hierarchical Bayesian cognitive processing (HBCP) model was created by embedding a signal detection theory (SDT) model of the MCIS delayed recognition memory task into a hierarchical Bayesian framework. The SDT model used latent parameters of discriminability (memory process) and response bias (executive function) to predict, simultaneously, recognition memory performance for each patient and each FAST severity group. The observed recognition memory data did not distinguish the six FAST severity stages, but the latent parameters completely separated them. The latent parameters were also used successfully to transform the ordinal FAST measure into a continuous measure reflecting the underlying continuum of functional severity. HBCP models applied to recognition memory data from clinical practice settings accurately translated a latent measure of cognition to a continuous measure of functional severity for both individuals and FAST groups. Such a translation links two levels of brain information processing, and may enable more accurate correlations with other levels, such as those characterized by biomarkers. PMID:22407225
Prion Amplification and Hierarchical Bayesian Modeling Refine Detection of Prion Infection
Wyckoff, A. Christy; Galloway, Nathan; Meyerett-Reid, Crystal; Powers, Jenny; Spraker, Terry; Monello, Ryan J.; Pulford, Bruce; Wild, Margaret; Antolin, Michael; Vercauteren, Kurt; Zabel, Mark
2015-02-01
Prions are unique infectious agents that replicate without a genome and cause neurodegenerative diseases that include chronic wasting disease (CWD) of cervids. Immunohistochemistry (IHC) is currently considered the gold standard for diagnosis of a prion infection but may be insensitive to early or sub-clinical CWD that are important to understanding CWD transmission and ecology. We assessed the potential of serial protein misfolding cyclic amplification (sPMCA) to improve detection of CWD prior to the onset of clinical signs. We analyzed tissue samples from free-ranging Rocky Mountain elk (Cervus elaphus nelsoni) and used hierarchical Bayesian analysis to estimate the specificity and sensitivity of IHC and sPMCA conditional on simultaneously estimated disease states. Sensitivity estimates were higher for sPMCA (99.51%, credible interval (CI) 97.15-100%) than IHC of obex (brain stem, 76.56%, CI 57.00-91.46%) or retropharyngeal lymph node (90.06%, CI 74.13-98.70%) tissues, or both (98.99%, CI 90.01-100%). Our hierarchical Bayesian model predicts the prevalence of prion infection in this elk population to be 18.90% (CI 15.50-32.72%), compared to previous estimates of 12.90%. Our data reveal a previously unidentified sub-clinical prion-positive portion of the elk population that could represent silent carriers capable of significantly impacting CWD ecology.
Prion amplification and hierarchical Bayesian modeling refine detection of prion infection.
Wyckoff, A Christy; Galloway, Nathan; Meyerett-Reid, Crystal; Powers, Jenny; Spraker, Terry; Monello, Ryan J; Pulford, Bruce; Wild, Margaret; Antolin, Michael; VerCauteren, Kurt; Zabel, Mark
2015-02-10
Prions are unique infectious agents that replicate without a genome and cause neurodegenerative diseases that include chronic wasting disease (CWD) of cervids. Immunohistochemistry (IHC) is currently considered the gold standard for diagnosis of a prion infection but may be insensitive to early or sub-clinical CWD that are important to understanding CWD transmission and ecology. We assessed the potential of serial protein misfolding cyclic amplification (sPMCA) to improve detection of CWD prior to the onset of clinical signs. We analyzed tissue samples from free-ranging Rocky Mountain elk (Cervus elaphus nelsoni) and used hierarchical Bayesian analysis to estimate the specificity and sensitivity of IHC and sPMCA conditional on simultaneously estimated disease states. Sensitivity estimates were higher for sPMCA (99.51%, credible interval (CI) 97.15-100%) than IHC of obex (brain stem, 76.56%, CI 57.00-91.46%) or retropharyngeal lymph node (90.06%, CI 74.13-98.70%) tissues, or both (98.99%, CI 90.01-100%). Our hierarchical Bayesian model predicts the prevalence of prion infection in this elk population to be 18.90% (CI 15.50-32.72%), compared to previous estimates of 12.90%. Our data reveal a previously unidentified sub-clinical prion-positive portion of the elk population that could represent silent carriers capable of significantly impacting CWD ecology.
International Nuclear Information System (INIS)
Brown, Kristen A.; Harlim, John
2013-01-01
In this paper, we consider a practical filtering approach for assimilating irregularly spaced, sparsely observed turbulent signals through a hierarchical Bayesian reduced stochastic filtering framework. The proposed hierarchical Bayesian approach consists of two steps, blending a data-driven interpolation scheme and the Mean Stochastic Model (MSM) filter. We examine the potential of using the deterministic piecewise linear interpolation scheme and the ordinary kriging scheme in interpolating irregularly spaced raw data to regularly spaced processed data and the importance of dynamical constraint (through MSM) in filtering the processed data on a numerically stiff state estimation problem. In particular, we test this approach on a two-layer quasi-geostrophic model in a two-dimensional domain with a small radius of deformation to mimic ocean turbulence. Our numerical results suggest that the dynamical constraint becomes important when the observation noise variance is large. Second, we find that the filtered estimates with ordinary kriging are superior to those with linear interpolation when observation networks are not too sparse; such robust results are found from numerical simulations with many randomly simulated irregularly spaced observation networks, various observation time intervals, and observation error variances. Third, when the observation network is very sparse, we find that both the kriging and linear interpolations are comparable
A hierarchical bayesian model to quantify uncertainty of stream water temperature forecasts.
Directory of Open Access Journals (Sweden)
Guillaume Bal
Full Text Available Providing generic and cost effective modelling approaches to reconstruct and forecast freshwater temperature using predictors as air temperature and water discharge is a prerequisite to understanding ecological processes underlying the impact of water temperature and of global warming on continental aquatic ecosystems. Using air temperature as a simple linear predictor of water temperature can lead to significant bias in forecasts as it does not disentangle seasonality and long term trends in the signal. Here, we develop an alternative approach based on hierarchical Bayesian statistical time series modelling of water temperature, air temperature and water discharge using seasonal sinusoidal periodic signals and time varying means and amplitudes. Fitting and forecasting performances of this approach are compared with that of simple linear regression between water and air temperatures using i an emotive simulated example, ii application to three French coastal streams with contrasting bio-geographical conditions and sizes. The time series modelling approach better fit data and does not exhibit forecasting bias in long term trends contrary to the linear regression. This new model also allows for more accurate forecasts of water temperature than linear regression together with a fair assessment of the uncertainty around forecasting. Warming of water temperature forecast by our hierarchical Bayesian model was slower and more uncertain than that expected with the classical regression approach. These new forecasts are in a form that is readily usable in further ecological analyses and will allow weighting of outcomes from different scenarios to manage climate change impacts on freshwater wildlife.
Directory of Open Access Journals (Sweden)
Benjamin W. Y. Lo
2013-01-01
Full Text Available Objective. The novel clinical prediction approach of Bayesian neural networks with fuzzy logic inferences is created and applied to derive prognostic decision rules in cerebral aneurysmal subarachnoid hemorrhage (aSAH. Methods. The approach of Bayesian neural networks with fuzzy logic inferences was applied to data from five trials of Tirilazad for aneurysmal subarachnoid hemorrhage (3551 patients. Results. Bayesian meta-analyses of observational studies on aSAH prognostic factors gave generalizable posterior distributions of population mean log odd ratios (ORs. Similar trends were noted in Bayesian and linear regression ORs. Significant outcome predictors include normal motor response, cerebral infarction, history of myocardial infarction, cerebral edema, history of diabetes mellitus, fever on day 8, prior subarachnoid hemorrhage, admission angiographic vasospasm, neurological grade, intraventricular hemorrhage, ruptured aneurysm size, history of hypertension, vasospasm day, age and mean arterial pressure. Heteroscedasticity was present in the nontransformed dataset. Artificial neural networks found nonlinear relationships with 11 hidden variables in 1 layer, using the multilayer perceptron model. Fuzzy logic decision rules (centroid defuzzification technique denoted cut-off points for poor prognosis at greater than 2.5 clusters. Discussion. This aSAH prognostic system makes use of existing knowledge, recognizes unknown areas, incorporates one's clinical reasoning, and compensates for uncertainty in prognostication.
2016-03-01
each IDF curve and subsequently used to force a calibrated and validated precipitation - runoff model. Probability-based, risk-informed hydrologic...ERDC/CHL CHETN-X-2 March 2016 Approved for public release; distribution is unlimited. Bayesian Inference of Nonstationary Precipitation Intensity...based means by which to develop local precipitation Intensity-Duration-Frequency (IDF) curves using historical rainfall time series data collected for
Khan, Haseeb A.; Arif, Ibrahim A.; Bahkali, Ali H.; Al Farhan, Ahmad H.; Al Homaidan, Ali A.
2008-01-01
This investigation was aimed to compare the inference of antelope phylogenies resulting from the 16S rRNA, cytochrome-b (cyt-b) and d-loop segments of mitochondrial DNA using three different computational models including Bayesian (BA), maximum parsimony (MP) and unweighted pair group method with arithmetic mean (UPGMA). The respective nucleotide sequences of three Oryx species (Oryx leucoryx, Oryx dammah and Oryx gazella) and an out-group (Addax nasomaculatus) were aligned and subjected to B...
Dorazio, R.M.; Johnson, F.A.
2003-01-01
Bayesian inference and decision theory may be used in the solution of relatively complex problems of natural resource management, owing to recent advances in statistical theory and computing. In particular, Markov chain Monte Carlo algorithms provide a computational framework for fitting models of adequate complexity and for evaluating the expected consequences of alternative management actions. We illustrate these features using an example based on management of waterfowl habitat.
Probabilistic daily ILI syndromic surveillance with a spatio-temporal Bayesian hierarchical model.
Directory of Open Access Journals (Sweden)
Ta-Chien Chan
Full Text Available BACKGROUND: For daily syndromic surveillance to be effective, an efficient and sensible algorithm would be expected to detect aberrations in influenza illness, and alert public health workers prior to any impending epidemic. This detection or alert surely contains uncertainty, and thus should be evaluated with a proper probabilistic measure. However, traditional monitoring mechanisms simply provide a binary alert, failing to adequately address this uncertainty. METHODS AND FINDINGS: Based on the Bayesian posterior probability of influenza-like illness (ILI visits, the intensity of outbreak can be directly assessed. The numbers of daily emergency room ILI visits at five community hospitals in Taipei City during 2006-2007 were collected and fitted with a Bayesian hierarchical model containing meteorological factors such as temperature and vapor pressure, spatial interaction with conditional autoregressive structure, weekend and holiday effects, seasonality factors, and previous ILI visits. The proposed algorithm recommends an alert for action if the posterior probability is larger than 70%. External data from January to February of 2008 were retained for validation. The decision rule detects successfully the peak in the validation period. When comparing the posterior probability evaluation with the modified Cusum method, results show that the proposed method is able to detect the signals 1-2 days prior to the rise of ILI visits. CONCLUSIONS: This Bayesian hierarchical model not only constitutes a dynamic surveillance system but also constructs a stochastic evaluation of the need to call for alert. The monitoring mechanism provides earlier detection as well as a complementary tool for current surveillance programs.
Energy Technology Data Exchange (ETDEWEB)
Kang, Seong Keun; Seong, Poong Hyun [KAIST, Daejeon (Korea, Republic of)
2014-08-15
Bayesian methodology has been used widely used in various research fields. It is method of inference using Bayes' rule to update the estimation of probability for the certain hypothesis when additional evidences are acquired. According to the current researches, malfunction of nuclear power plant can be detected by using this Bayesian inference which consistently piles up the newly incoming data and updates its estimation. However, those researches are based on the assumption that people are doing like computer perfectly, which can be criticized and may cause a problem in real world application. Studies in cognitive psychology indicates that when the amount of information becomes larger, people can't save the whole data because people have limited memory capacity which is well known as working memory, and also they have attention problem. The purpose of this paper is to consider the psychological factors and confirm how much this working memory and attention will affect the resulted estimation based on the Bayesian inference. To confirm this, experiment on human is needed, and the tool of experiment is Compact Nuclear Simulator (CNS)
International Nuclear Information System (INIS)
Kang, Seong Keun; Seong, Poong Hyun
2014-01-01
Bayesian methodology has been used widely used in various research fields. It is method of inference using Bayes' rule to update the estimation of probability for the certain hypothesis when additional evidences are acquired. According to the current researches, malfunction of nuclear power plant can be detected by using this Bayesian inference which consistently piles up the newly incoming data and updates its estimation. However, those researches are based on the assumption that people are doing like computer perfectly, which can be criticized and may cause a problem in real world application. Studies in cognitive psychology indicates that when the amount of information becomes larger, people can't save the whole data because people have limited memory capacity which is well known as working memory, and also they have attention problem. The purpose of this paper is to consider the psychological factors and confirm how much this working memory and attention will affect the resulted estimation based on the Bayesian inference. To confirm this, experiment on human is needed, and the tool of experiment is Compact Nuclear Simulator (CNS)
Adaptability and phenotypic stability of common bean genotypes through Bayesian inference.
Corrêa, A M; Teodoro, P E; Gonçalves, M C; Barroso, L M A; Nascimento, M; Santos, A; Torres, F E
2016-04-27
This study used Bayesian inference to investigate the genotype x environment interaction in common bean grown in Mato Grosso do Sul State, and it also evaluated the efficiency of using informative and minimally informative a priori distributions. Six trials were conducted in randomized blocks, and the grain yield of 13 common bean genotypes was assessed. To represent the minimally informative a priori distributions, a probability distribution with high variance was used, and a meta-analysis concept was adopted to represent the informative a priori distributions. Bayes factors were used to conduct comparisons between the a priori distributions. The Bayesian inference was effective for the selection of upright common bean genotypes with high adaptability and phenotypic stability using the Eberhart and Russell method. Bayes factors indicated that the use of informative a priori distributions provided more accurate results than minimally informative a priori distributions. According to Bayesian inference, the EMGOPA-201, BAMBUÍ, CNF 4999, CNF 4129 A 54, and CNFv 8025 genotypes had specific adaptability to favorable environments, while the IAPAR 14 and IAC CARIOCA ETE genotypes had specific adaptability to unfavorable environments.
Sraj, Ihab; Le Maî tre, Olivier P.; Knio, Omar; Hoteit, Ibrahim
2015-01-01
using a coordinate transformation to account for the dependence with respect to the covariance hyper-parameters. Polynomial Chaos expansions are employed for the acceleration of the Bayesian inference using similar coordinate transformations, enabling us
A Bayesian Framework That Integrates Heterogeneous Data for Inferring Gene Regulatory Networks
Energy Technology Data Exchange (ETDEWEB)
Santra, Tapesh, E-mail: tapesh.santra@ucd.ie [Systems Biology Ireland, University College Dublin, Dublin (Ireland)
2014-05-20
Reconstruction of gene regulatory networks (GRNs) from experimental data is a fundamental challenge in systems biology. A number of computational approaches have been developed to infer GRNs from mRNA expression profiles. However, expression profiles alone are proving to be insufficient for inferring GRN topologies with reasonable accuracy. Recently, it has been shown that integration of external data sources (such as gene and protein sequence information, gene ontology data, protein–protein interactions) with mRNA expression profiles may increase the reliability of the inference process. Here, I propose a new approach that incorporates transcription factor binding sites (TFBS) and physical protein interactions (PPI) among transcription factors (TFs) in a Bayesian variable selection (BVS) algorithm which can infer GRNs from mRNA expression profiles subjected to genetic perturbations. Using real experimental data, I show that the integration of TFBS and PPI data with mRNA expression profiles leads to significantly more accurate networks than those inferred from expression profiles alone. Additionally, the performance of the proposed algorithm is compared with a series of least absolute shrinkage and selection operator (LASSO) regression-based network inference methods that can also incorporate prior knowledge in the inference framework. The results of this comparison suggest that BVS can outperform LASSO regression-based method in some circumstances.
A Bayesian Framework That Integrates Heterogeneous Data for Inferring Gene Regulatory Networks
International Nuclear Information System (INIS)
Santra, Tapesh
2014-01-01
Reconstruction of gene regulatory networks (GRNs) from experimental data is a fundamental challenge in systems biology. A number of computational approaches have been developed to infer GRNs from mRNA expression profiles. However, expression profiles alone are proving to be insufficient for inferring GRN topologies with reasonable accuracy. Recently, it has been shown that integration of external data sources (such as gene and protein sequence information, gene ontology data, protein–protein interactions) with mRNA expression profiles may increase the reliability of the inference process. Here, I propose a new approach that incorporates transcription factor binding sites (TFBS) and physical protein interactions (PPI) among transcription factors (TFs) in a Bayesian variable selection (BVS) algorithm which can infer GRNs from mRNA expression profiles subjected to genetic perturbations. Using real experimental data, I show that the integration of TFBS and PPI data with mRNA expression profiles leads to significantly more accurate networks than those inferred from expression profiles alone. Additionally, the performance of the proposed algorithm is compared with a series of least absolute shrinkage and selection operator (LASSO) regression-based network inference methods that can also incorporate prior knowledge in the inference framework. The results of this comparison suggest that BVS can outperform LASSO regression-based method in some circumstances.
On inferring the noise in probabilistic seismic AVO inversion using hierarchical Bayes
DEFF Research Database (Denmark)
Madsen, Rasmus Bødker; Zunino, Andrea; Hansen, Thomas Mejer
2017-01-01
A realistic noise model is essential for trustworthy inversion of geophysical data. Sometimes, as in case of seismic data, quan- tification of the noise model is non-trivial. To remedy this, a hierarchical Bayes approach can be adopted in which proper- ties of the noise model, such as the amplitude...... of an assumed uncorrelated Gaussian noise model, can be inferred as part of the inversion. Here we demonstrate how such an approach can lead to substantial overfitting of noise when inverting a 1D re- flection seismic NMO data set. We then argue that usually the noise model is correlated, and suggest to infer...
From least squares to multilevel modeling: A graphical introduction to Bayesian inference
Loredo, Thomas J.
2016-01-01
This tutorial presentation will introduce some of the key ideas and techniques involved in applying Bayesian methods to problems in astrostatistics. The focus will be on the big picture: understanding the foundations (interpreting probability, Bayes's theorem, the law of total probability and marginalization), making connections to traditional methods (propagation of errors, least squares, chi-squared, maximum likelihood, Monte Carlo simulation), and highlighting problems where a Bayesian approach can be particularly powerful (Poisson processes, density estimation and curve fitting with measurement error). The "graphical" component of the title reflects an emphasis on pictorial representations of some of the math, but also on the use of graphical models (multilevel or hierarchical models) for analyzing complex data. Code for some examples from the talk will be available to participants, in Python and in the Stan probabilistic programming language.
Anderson, Eric C; Ng, Thomas C
2016-02-01
We develop a computational framework for addressing pedigree inference problems using small numbers (80-400) of single nucleotide polymorphisms (SNPs). Our approach relaxes the assumptions, which are commonly made, that sampling is complete with respect to the pedigree and that there is no genotyping error. It relies on representing the inferred pedigree as a factor graph and invoking the Sum-Product algorithm to compute and store quantities that allow the joint probability of the data to be rapidly computed under a large class of rearrangements of the pedigree structure. This allows efficient MCMC sampling over the space of pedigrees, and, hence, Bayesian inference of pedigree structure. In this paper we restrict ourselves to inference of pedigrees without loops using SNPs assumed to be unlinked. We present the methodology in general for multigenerational inference, and we illustrate the method by applying it to the inference of full sibling groups in a large sample (n=1157) of Chinook salmon typed at 95 SNPs. The results show that our method provides a better point estimate and estimate of uncertainty than the currently best-available maximum-likelihood sibling reconstruction method. Extensions of this work to more complex scenarios are briefly discussed. Published by Elsevier Inc.
Empirical verification for application of Bayesian inference in situation awareness evaluations
International Nuclear Information System (INIS)
Kang, Seongkeun; Kim, Ar Ryum; Seong, Poong Hyun
2017-01-01
Highlights: • Situation awareness (SA) of human operators is significantly important for safe operation in nuclear power plants (NPPs). • SA of human operators was empirically estimated using Bayesian inference. • In this empirical study, the effect of attention and working memory to SA was considered. • Complexcity of the given task and design of human machine interface (HMI) considerably affect SA of human operators. - Abstract: Bayesian methodology has been widely used in various research fields. According to current research, malfunctions of nuclear power plants can be detected using this Bayesian inference, which consistently piles up newly incoming data and updates the estimation. However, these studies have been based on the assumption that people work like computers—perfectly—a supposition that may cause a problem in real world applications. Studies in cognitive psychology indicate that when the amount of information to be processed becomes larger, people cannot save the whole set of data in their heads due to limited attention and limited memory capacity, also known as working memory. The purpose of the current research is to consider how actual human aware the situation contrasts with our expectations, and how such disparity affects the results of conventional Bayesian inference, if at all. We compared situation awareness (SA) of ideal operators with SA of human operators, and for the human operator we used both text-based human machine interface (HMI) and infographic-based HMI to further compare two existing human operators. In addition, two different scenarios were selected how scenario complexity affects SA of human operators. As a results, when a malfunction occurred, the ideal operator found the malfunction nearly 100% probability of the time using Bayesian inference. In contrast, out of forty-six human operators, only 69.57% found the correct malfunction with simple scenario and 58.70% with complex scenario in the text-based HMI. In
McGeachie, Michael J; Chang, Hsun-Hsien; Weiss, Scott T
2014-06-01
Bayesian Networks (BN) have been a popular predictive modeling formalism in bioinformatics, but their application in modern genomics has been slowed by an inability to cleanly handle domains with mixed discrete and continuous variables. Existing free BN software packages either discretize continuous variables, which can lead to information loss, or do not include inference routines, which makes prediction with the BN impossible. We present CGBayesNets, a BN package focused around prediction of a clinical phenotype from mixed discrete and continuous variables, which fills these gaps. CGBayesNets implements Bayesian likelihood and inference algorithms for the conditional Gaussian Bayesian network (CGBNs) formalism, one appropriate for predicting an outcome of interest from, e.g., multimodal genomic data. We provide four different network learning algorithms, each making a different tradeoff between computational cost and network likelihood. CGBayesNets provides a full suite of functions for model exploration and verification, including cross validation, bootstrapping, and AUC manipulation. We highlight several results obtained previously with CGBayesNets, including predictive models of wood properties from tree genomics, leukemia subtype classification from mixed genomic data, and robust prediction of intensive care unit mortality outcomes from metabolomic profiles. We also provide detailed example analysis on public metabolomic and gene expression datasets. CGBayesNets is implemented in MATLAB and available as MATLAB source code, under an Open Source license and anonymous download at http://www.cgbayesnets.com.
Tools to estimate PM2.5 mass have expanded in recent years, and now include: 1) stationary monitor readings, 2) Community Multi-Scale Air Quality (CMAQ) model estimates, 3) Hierarchical Bayesian (HB) estimates from combined stationary monitor readings and CMAQ model output; and, ...
Ross, Michelle; Wakefield, Jon
2015-10-01
Two-phase study designs are appealing since they allow for the oversampling of rare sub-populations which improves efficiency. In this paper we describe a Bayesian hierarchical model for the analysis of two-phase data. Such a model is particularly appealing in a spatial setting in which random effects are introduced to model between-area variability. In such a situation, one may be interested in estimating regression coefficients or, in the context of small area estimation, in reconstructing the population totals by strata. The efficiency gains of the two-phase sampling scheme are compared to standard approaches using 2011 birth data from the research triangle area of North Carolina. We show that the proposed method can overcome small sample difficulties and improve on existing techniques. We conclude that the two-phase design is an attractive approach for small area estimation.
Coley, Rebecca Yates; Browna, Elizabeth R.
2016-01-01
Inconsistent results in recent HIV prevention trials of pre-exposure prophylactic interventions may be due to heterogeneity in risk among study participants. Intervention effectiveness is most commonly estimated with the Cox model, which compares event times between populations. When heterogeneity is present, this population-level measure underestimates intervention effectiveness for individuals who are at risk. We propose a likelihood-based Bayesian hierarchical model that estimates the individual-level effectiveness of candidate interventions by accounting for heterogeneity in risk with a compound Poisson-distributed frailty term. This model reflects the mechanisms of HIV risk and allows that some participants are not exposed to HIV and, therefore, have no risk of seroconversion during the study. We assess model performance via simulation and apply the model to data from an HIV prevention trial. PMID:26869051
Feeney, Stephen M.; Mortlock, Daniel J.; Dalmasso, Niccolò
2018-05-01
Estimates of the Hubble constant, H0, from the local distance ladder and from the cosmic microwave background (CMB) are discrepant at the ˜3σ level, indicating a potential issue with the standard Λ cold dark matter (ΛCDM) cosmology. A probabilistic (i.e. Bayesian) interpretation of this tension requires a model comparison calculation, which in turn depends strongly on the tails of the H0 likelihoods. Evaluating the tails of the local H0 likelihood requires the use of non-Gaussian distributions to faithfully represent anchor likelihoods and outliers, and simultaneous fitting of the complete distance-ladder data set to ensure correct uncertainty propagation. We have hence developed a Bayesian hierarchical model of the full distance ladder that does not rely on Gaussian distributions and allows outliers to be modelled without arbitrary data cuts. Marginalizing over the full ˜3000-parameter joint posterior distribution, we find H0 = (72.72 ± 1.67) km s-1 Mpc-1 when applied to the outlier-cleaned Riess et al. data, and (73.15 ± 1.78) km s-1 Mpc-1 with supernova outliers reintroduced (the pre-cut Cepheid data set is not available). Using our precise evaluation of the tails of the H0 likelihood, we apply Bayesian model comparison to assess the evidence for deviation from ΛCDM given the distance-ladder and CMB data. The odds against ΛCDM are at worst ˜10:1 when considering the Planck 2015 XIII data, regardless of outlier treatment, considerably less dramatic than naïvely implied by the 2.8σ discrepancy. These odds become ˜60:1 when an approximation to the more-discrepant Planck Intermediate XLVI likelihood is included.
Holan, S.H.; Davis, G.M.; Wildhaber, M.L.; DeLonay, A.J.; Papoulias, D.M.
2009-01-01
The timing of spawning in fish is tightly linked to environmental factors; however, these factors are not very well understood for many species. Specifically, little information is available to guide recruitment efforts for endangered species such as the sturgeon. Therefore, we propose a Bayesian hierarchical model for predicting the success of spawning of the shovelnose sturgeon which uses both biological and behavioural (longitudinal) data. In particular, we use data that were produced from a tracking study that was conducted in the Lower Missouri River. The data that were produced from this study consist of biological variables associated with readiness to spawn along with longitudinal behavioural data collected by using telemetry and archival data storage tags. These high frequency data are complex both biologically and in the underlying behavioural process. To accommodate such complexity we developed a hierarchical linear regression model that uses an eigenvalue predictor, derived from the transition probability matrix of a two-state Markov switching model with generalized auto-regressive conditional heteroscedastic dynamics. Finally, to minimize the computational burden that is associated with estimation of this model, a parallel computing approach is proposed. ?? Journal compilation 2009 Royal Statistical Society.
A Bayesian inference approach to unveil supply curves in electricity markets
DEFF Research Database (Denmark)
Mitridati, Lesia Marie-Jeanne Mariane; Pinson, Pierre
2017-01-01
in the literature on modeling this uncertainty. In this study we introduce a Bayesian inference approach to reveal the aggregate supply curve in a day-ahead electricity market. The proposed algorithm relies on Markov Chain Monte Carlo and Sequential Monte Carlo methods. The major appeal of this approach......With increased competition in wholesale electricity markets, the need for new decision-making tools for strategic producers has arisen. Optimal bidding strategies have traditionally been modeled as stochastic profit maximization problems. However, for producers with non-negligible market power...
Bayesian inference for multivariate point processes observed at sparsely distributed times
DEFF Research Database (Denmark)
Rasmussen, Jakob Gulddahl; Møller, Jesper; Aukema, B.H.
We consider statistical and computational aspects of simulation-based Bayesian inference for a multivariate point process which is only observed at sparsely distributed times. For specicity we consider a particular data set which has earlier been analyzed by a discrete time model involving unknown...... normalizing constants. We discuss the advantages and disadvantages of using continuous time processes compared to discrete time processes in the setting of the present paper as well as other spatial-temporal situations. Keywords: Bark beetle, conditional intensity, forest entomology, Markov chain Monte Carlo...
Directory of Open Access Journals (Sweden)
Y. Paudel
2013-03-01
Full Text Available This study applies Bayesian Inference to estimate flood risk for 53 dyke ring areas in the Netherlands, and focuses particularly on the data scarcity and extreme behaviour of catastrophe risk. The probability density curves of flood damage are estimated through Monte Carlo simulations. Based on these results, flood insurance premiums are estimated using two different practical methods that each account in different ways for an insurer's risk aversion and the dispersion rate of loss data. This study is of practical relevance because insurers have been considering the introduction of flood insurance in the Netherlands, which is currently not generally available.
DEFF Research Database (Denmark)
Meng, Weizhi; Li, Wenjuan; Xiang, Yang
2017-01-01
and experience for both patients and healthcare workers, and the underlying network architecture to support such devices is also referred to as medical smartphone networks (MSNs). MSNs, similar to other networks, are subject to a wide range of attacks (e.g. leakage of sensitive patient information by a malicious...... insider). In this work, we focus on MSNs and present a compact but efficient trust-based approach using Bayesian inference to identify malicious nodes in such an environment. We then demonstrate the effectiveness of our approach in detecting malicious nodes by evaluating the deployment of our proposed...
Paudel, Y.; Botzen, W. J. W.; Aerts, J. C. J. H.
2013-03-01
This study applies Bayesian Inference to estimate flood risk for 53 dyke ring areas in the Netherlands, and focuses particularly on the data scarcity and extreme behaviour of catastrophe risk. The probability density curves of flood damage are estimated through Monte Carlo simulations. Based on these results, flood insurance premiums are estimated using two different practical methods that each account in different ways for an insurer's risk aversion and the dispersion rate of loss data. This study is of practical relevance because insurers have been considering the introduction of flood insurance in the Netherlands, which is currently not generally available.
Partial inversion of elliptic operator to speed up computation of likelihood in Bayesian inference
Litvinenko, Alexander
2017-08-09
In this paper, we speed up the solution of inverse problems in Bayesian settings. By computing the likelihood, the most expensive part of the Bayesian formula, one compares the available measurement data with the simulated data. To get simulated data, repeated solution of the forward problem is required. This could be a great challenge. Often, the available measurement is a functional $F(u)$ of the solution $u$ or a small part of $u$. Typical examples of $F(u)$ are the solution in a point, solution on a coarser grid, in a small subdomain, the mean value in a subdomain. It is a waste of computational resources to evaluate, first, the whole solution and then compute a part of it. In this work, we compute the functional $F(u)$ direct, without computing the full inverse operator and without computing the whole solution $u$. The main ingredients of the developed approach are the hierarchical domain decomposition technique, the finite element method and the Schur complements. To speed up computations and to reduce the storage cost, we approximate the forward operator and the Schur complement in the hierarchical matrix format. Applying the hierarchical matrix technique, we reduced the computing cost to $\\\\mathcal{O}(k^2n \\\\log^2 n)$, where $k\\\\ll n$ and $n$ is the number of degrees of freedom. Up to the $\\\\H$-matrix accuracy, the computation of the functional $F(u)$ is exact. To reduce the computational resources further, we can approximate $F(u)$ on, for instance, multiple coarse meshes. The offered method is well suited for solving multiscale problems. A disadvantage of this method is the assumption that one has to have access to the discretisation and to the procedure of assembling the Galerkin matrix.
A Hierarchical Poisson Log-Normal Model for Network Inference from RNA Sequencing Data
Gallopin, Mélina; Rau, Andrea; Jaffrézic, Florence
2013-01-01
Gene network inference from transcriptomic data is an important methodological challenge and a key aspect of systems biology. Although several methods have been proposed to infer networks from microarray data, there is a need for inference methods able to model RNA-seq data, which are count-based and highly variable. In this work we propose a hierarchical Poisson log-normal model with a Lasso penalty to infer gene networks from RNA-seq data; this model has the advantage of directly modelling discrete data and accounting for inter-sample variance larger than the sample mean. Using real microRNA-seq data from breast cancer tumors and simulations, we compare this method to a regularized Gaussian graphical model on log-transformed data, and a Poisson log-linear graphical model with a Lasso penalty on power-transformed data. For data simulated with large inter-sample dispersion, the proposed model performs better than the other methods in terms of sensitivity, specificity and area under the ROC curve. These results show the necessity of methods specifically designed for gene network inference from RNA-seq data. PMID:24147011
Zonta, Zivko J; Flotats, Xavier; Magrí, Albert
2014-08-01
The procedure commonly used for the assessment of the parameters included in activated sludge models (ASMs) relies on the estimation of their optimal value within a confidence region (i.e. frequentist inference). Once optimal values are estimated, parameter uncertainty is computed through the covariance matrix. However, alternative approaches based on the consideration of the model parameters as probability distributions (i.e. Bayesian inference), may be of interest. The aim of this work is to apply (and compare) both Bayesian and frequentist inference methods when assessing uncertainty for an ASM-type model, which considers intracellular storage and biomass growth, simultaneously. Practical identifiability was addressed exclusively considering respirometric profiles based on the oxygen uptake rate and with the aid of probabilistic global sensitivity analysis. Parameter uncertainty was thus estimated according to both the Bayesian and frequentist inferential procedures. Results were compared in order to evidence the strengths and weaknesses of both approaches. Since it was demonstrated that Bayesian inference could be reduced to a frequentist approach under particular hypotheses, the former can be considered as a more generalist methodology. Hence, the use of Bayesian inference is encouraged for tackling inferential issues in ASM environments.
Inferring gene and protein interactions using PubMed citations and consensus Bayesian networks.
Deeter, Anthony; Dalman, Mark; Haddad, Joseph; Duan, Zhong-Hui
2017-01-01
The PubMed database offers an extensive set of publication data that can be useful, yet inherently complex to use without automated computational techniques. Data repositories such as the Genomic Data Commons (GDC) and the Gene Expression Omnibus (GEO) offer experimental data storage and retrieval as well as curated gene expression profiles. Genetic interaction databases, including Reactome and Ingenuity Pathway Analysis, offer pathway and experiment data analysis using data curated from these publications and data repositories. We have created a method to generate and analyze consensus networks, inferring potential gene interactions, using large numbers of Bayesian networks generated by data mining publications in the PubMed database. Through the concept of network resolution, these consensus networks can be tailored to represent possible genetic interactions. We designed a set of experiments to confirm that our method is stable across variation in both sample and topological input sizes. Using gene product interactions from the KEGG pathway database and data mining PubMed publication abstracts, we verify that regardless of the network resolution or the inferred consensus network, our method is capable of inferring meaningful gene interactions through consensus Bayesian network generation with multiple, randomized topological orderings. Our method can not only confirm the existence of currently accepted interactions, but has the potential to hypothesize new ones as well. We show our method confirms the existence of known gene interactions such as JAK-STAT-PI3K-AKT-mTOR, infers novel gene interactions such as RAS- Bcl-2 and RAS-AKT, and found significant pathway-pathway interactions between the JAK-STAT signaling and Cardiac Muscle Contraction KEGG pathways.
Wavelet-Bayesian inference of cosmic strings embedded in the cosmic microwave background
McEwen, J. D.; Feeney, S. M.; Peiris, H. V.; Wiaux, Y.; Ringeval, C.; Bouchet, F. R.
2017-12-01
Cosmic strings are a well-motivated extension to the standard cosmological model and could induce a subdominant component in the anisotropies of the cosmic microwave background (CMB), in addition to the standard inflationary component. The detection of strings, while observationally challenging, would provide a direct probe of physics at very high-energy scales. We develop a framework for cosmic string inference from observations of the CMB made over the celestial sphere, performing a Bayesian analysis in wavelet space where the string-induced CMB component has distinct statistical properties to the standard inflationary component. Our wavelet-Bayesian framework provides a principled approach to compute the posterior distribution of the string tension Gμ and the Bayesian evidence ratio comparing the string model to the standard inflationary model. Furthermore, we present a technique to recover an estimate of any string-induced CMB map embedded in observational data. Using Planck-like simulations, we demonstrate the application of our framework and evaluate its performance. The method is sensitive to Gμ ∼ 5 × 10-7 for Nambu-Goto string simulations that include an integrated Sachs-Wolfe contribution only and do not include any recombination effects, before any parameters of the analysis are optimized. The sensitivity of the method compares favourably with other techniques applied to the same simulations.
Multi-model polynomial chaos surrogate dictionary for Bayesian inference in elasticity problems
Contreras, Andres A.
2016-09-19
A method is presented for inferring the presence of an inclusion inside a domain; the proposed approach is suitable to be used in a diagnostic device with low computational power. Specifically, we use the Bayesian framework for the inference of stiff inclusions embedded in a soft matrix, mimicking tumors in soft tissues. We rely on a polynomial chaos (PC) surrogate to accelerate the inference process. The PC surrogate predicts the dependence of the displacements field with the random elastic moduli of the materials, and are computed by means of the stochastic Galerkin (SG) projection method. Moreover, the inclusion\\'s geometry is assumed to be unknown, and this is addressed by using a dictionary consisting of several geometrical models with different configurations. A model selection approach based on the evidence provided by the data (Bayes factors) is used to discriminate among the different geometrical models and select the most suitable one. The idea of using a dictionary of pre-computed geometrical models helps to maintain the computational cost of the inference process very low, as most of the computational burden is carried out off-line for the resolution of the SG problems. Numerical tests are used to validate the methodology, assess its performance, and analyze the robustness to model errors. © 2016 Elsevier Ltd
Hadwin, Paul J.; Sipkens, T. A.; Thomson, K. A.; Liu, F.; Daun, K. J.
2016-01-01
Auto-correlated laser-induced incandescence (AC-LII) infers the soot volume fraction (SVF) of soot particles by comparing the spectral incandescence from laser-energized particles to the pyrometrically inferred peak soot temperature. This calculation requires detailed knowledge of model parameters such as the absorption function of soot, which may vary with combustion chemistry, soot age, and the internal structure of the soot. This work presents a Bayesian methodology to quantify such uncertainties. This technique treats the additional "nuisance" model parameters, including the soot absorption function, as stochastic variables and incorporates the current state of knowledge of these parameters into the inference process through maximum entropy priors. While standard AC-LII analysis provides a point estimate of the SVF, Bayesian techniques infer the posterior probability density, which will allow scientists and engineers to better assess the reliability of AC-LII inferred SVFs in the context of environmental regulations and competing diagnostics.
On inferring the noise in probabilistic seismic AVO inversion using hierarchical Bayes
Madsen, Rasmus Bødker; Zunino, Andrea; Hansen, Thomas Mejer
2017-01-01
A realistic noise model is essential for trustworthy inversion of geophysical data. Sometimes, as in case of seismic data, quan- tification of the noise model is non-trivial. To remedy this, a hierarchical Bayes approach can be adopted in which proper- ties of the noise model, such as the amplitude of an assumed uncorrelated Gaussian noise model, can be inferred as part of the inversion. Here we demonstrate how such an approach can lead to substantial overfitting of noise when inverting a 1D ...
Bayesian inference of earthquake parameters from buoy data using a polynomial chaos-based surrogate
Giraldi, Loic
2017-04-07
This work addresses the estimation of the parameters of an earthquake model by the consequent tsunami, with an application to the Chile 2010 event. We are particularly interested in the Bayesian inference of the location, the orientation, and the slip of an Okada-based model of the earthquake ocean floor displacement. The tsunami numerical model is based on the GeoClaw software while the observational data is provided by a single DARTⓇ buoy. We propose in this paper a methodology based on polynomial chaos expansion to construct a surrogate model of the wave height at the buoy location. A correlated noise model is first proposed in order to represent the discrepancy between the computational model and the data. This step is necessary, as a classical independent Gaussian noise is shown to be unsuitable for modeling the error, and to prevent convergence of the Markov Chain Monte Carlo sampler. Second, the polynomial chaos model is subsequently improved to handle the variability of the arrival time of the wave, using a preconditioned non-intrusive spectral method. Finally, the construction of a reduced model dedicated to Bayesian inference is proposed. Numerical results are presented and discussed.
International Nuclear Information System (INIS)
Unkelbach, J; Oelfke, U
2005-01-01
We present a method to calculate dose uncertainties due to inter-fraction organ movements in fractionated radiotherapy, i.e. in addition to the expectation value of the dose distribution a variance distribution is calculated. To calculate the expectation value of the dose distribution in the presence of organ movements, one estimates a probability distribution of possible patient geometries. The respective variance of the expected dose distribution arises for two reasons: first, the patient is irradiated with a finite number of fractions only and second, the probability distribution of patient geometries has to be estimated from a small number of images and is therefore not exactly known. To quantify the total dose variance, we propose a method that is based on the principle of Bayesian inference. The method is of particular interest when organ motion is incorporated in inverse IMRT planning by means of inverse planning performed on a probability distribution of patient geometries. In order to make this a robust approach, it turns out that the dose variance should be considered (and minimized) in the optimization process. As an application of the presented concept of Bayesian inference, we compare three approaches to inverse planning based on probability distributions that account for an increasing degree of uncertainty. The Bayes theorem further provides a concept to interpolate between patient specific data and population-based knowledge on organ motion which is relevant since the number of CT images of a patient is typically small
Sraj, Ihab
2015-10-22
This paper addresses model dimensionality reduction for Bayesian inference based on prior Gaussian fields with uncertainty in the covariance function hyper-parameters. The dimensionality reduction is traditionally achieved using the Karhunen-Loève expansion of a prior Gaussian process assuming covariance function with fixed hyper-parameters, despite the fact that these are uncertain in nature. The posterior distribution of the Karhunen-Loève coordinates is then inferred using available observations. The resulting inferred field is therefore dependent on the assumed hyper-parameters. Here, we seek to efficiently estimate both the field and covariance hyper-parameters using Bayesian inference. To this end, a generalized Karhunen-Loève expansion is derived using a coordinate transformation to account for the dependence with respect to the covariance hyper-parameters. Polynomial Chaos expansions are employed for the acceleration of the Bayesian inference using similar coordinate transformations, enabling us to avoid expanding explicitly the solution dependence on the uncertain hyper-parameters. We demonstrate the feasibility of the proposed method on a transient diffusion equation by inferring spatially-varying log-diffusivity fields from noisy data. The inferred profiles were found closer to the true profiles when including the hyper-parameters’ uncertainty in the inference formulation.
Evidence cross-validation and Bayesian inference of MAST plasma equilibria
Energy Technology Data Exchange (ETDEWEB)
Nessi, G. T. von; Hole, M. J. [Research School of Physical Sciences and Engineering, Australian National University, Canberra ACT 0200 (Australia); Svensson, J. [Max-Planck-Institut fuer Plasmaphysik, D-17491 Greifswald (Germany); Appel, L. [EURATOM/CCFE Fusion Association, Culham Science Centre, Abingdon, Oxon OX14 3DB (United Kingdom)
2012-01-15
In this paper, current profiles for plasma discharges on the mega-ampere spherical tokamak are directly calculated from pickup coil, flux loop, and motional-Stark effect observations via methods based in the statistical theory of Bayesian analysis. By representing toroidal plasma current as a series of axisymmetric current beams with rectangular cross-section and inferring the current for each one of these beams, flux-surface geometry and q-profiles are subsequently calculated by elementary application of Biot-Savart's law. The use of this plasma model in the context of Bayesian analysis was pioneered by Svensson and Werner on the joint-European tokamak [Svensson and Werner,Plasma Phys. Controlled Fusion 50(8), 085002 (2008)]. In this framework, linear forward models are used to generate diagnostic predictions, and the probability distribution for the currents in the collection of plasma beams was subsequently calculated directly via application of Bayes' formula. In this work, we introduce a new diagnostic technique to identify and remove outlier observations associated with diagnostics falling out of calibration or suffering from an unidentified malfunction. These modifications enable a good agreement between Bayesian inference of the last-closed flux-surface with other corroborating data, such as that from force balance considerations using EFIT++[Appel et al., ''A unified approach to equilibrium reconstruction'' Proceedings of the 33rd EPS Conference on Plasma Physics (Rome, Italy, 2006)]. In addition, this analysis also yields errors on the plasma current profile and flux-surface geometry as well as directly predicting the Shafranov shift of the plasma core.
Evidence cross-validation and Bayesian inference of MAST plasma equilibria
International Nuclear Information System (INIS)
Nessi, G. T. von; Hole, M. J.; Svensson, J.; Appel, L.
2012-01-01
In this paper, current profiles for plasma discharges on the mega-ampere spherical tokamak are directly calculated from pickup coil, flux loop, and motional-Stark effect observations via methods based in the statistical theory of Bayesian analysis. By representing toroidal plasma current as a series of axisymmetric current beams with rectangular cross-section and inferring the current for each one of these beams, flux-surface geometry and q-profiles are subsequently calculated by elementary application of Biot-Savart's law. The use of this plasma model in the context of Bayesian analysis was pioneered by Svensson and Werner on the joint-European tokamak [Svensson and Werner,Plasma Phys. Controlled Fusion 50(8), 085002 (2008)]. In this framework, linear forward models are used to generate diagnostic predictions, and the probability distribution for the currents in the collection of plasma beams was subsequently calculated directly via application of Bayes' formula. In this work, we introduce a new diagnostic technique to identify and remove outlier observations associated with diagnostics falling out of calibration or suffering from an unidentified malfunction. These modifications enable a good agreement between Bayesian inference of the last-closed flux-surface with other corroborating data, such as that from force balance considerations using EFIT++[Appel et al., ''A unified approach to equilibrium reconstruction'' Proceedings of the 33rd EPS Conference on Plasma Physics (Rome, Italy, 2006)]. In addition, this analysis also yields errors on the plasma current profile and flux-surface geometry as well as directly predicting the Shafranov shift of the plasma core.
Application of hierarchical Bayesian unmixing models in river sediment source apportionment
Blake, Will; Smith, Hugh; Navas, Ana; Bodé, Samuel; Goddard, Rupert; Zou Kuzyk, Zou; Lennard, Amy; Lobb, David; Owens, Phil; Palazon, Leticia; Petticrew, Ellen; Gaspar, Leticia; Stock, Brian; Boeckx, Pacsal; Semmens, Brice
2016-04-01
Fingerprinting and unmixing concepts are used widely across environmental disciplines for forensic evaluation of pollutant sources. In aquatic and marine systems, this includes tracking the source of organic and inorganic pollutants in water and linking problem sediment to soil erosion and land use sources. It is, however, the particular complexity of ecological systems that has driven creation of the most sophisticated mixing models, primarily to (i) evaluate diet composition in complex ecological food webs, (ii) inform population structure and (iii) explore animal movement. In the context of the new hierarchical Bayesian unmixing model, MIXSIAR, developed to characterise intra-population niche variation in ecological systems, we evaluate the linkage between ecological 'prey' and 'consumer' concepts and river basin sediment 'source' and sediment 'mixtures' to exemplify the value of ecological modelling tools to river basin science. Recent studies have outlined advantages presented by Bayesian unmixing approaches in handling complex source and mixture datasets while dealing appropriately with uncertainty in parameter probability distributions. MixSIAR is unique in that it allows individual fixed and random effects associated with mixture hierarchy, i.e. factors that might exert an influence on model outcome for mixture groups, to be explored within the source-receptor framework. This offers new and powerful ways of interpreting river basin apportionment data. In this contribution, key components of the model are evaluated in the context of common experimental designs for sediment fingerprinting studies namely simple, nested and distributed catchment sampling programmes. Illustrative examples using geochemical and compound specific stable isotope datasets are presented and used to discuss best practice with specific attention to (1) the tracer selection process, (2) incorporation of fixed effects relating to sample timeframe and sediment type in the modelling
Cucchi, K.; Kawa, N.; Hesse, F.; Rubin, Y.
2017-12-01
In order to reduce uncertainty in the prediction of subsurface flow and transport processes, practitioners should use all data available. However, classic inverse modeling frameworks typically only make use of information contained in in-situ field measurements to provide estimates of hydrogeological parameters. Such hydrogeological information about an aquifer is difficult and costly to acquire. In this data-scarce context, the transfer of ex-situ information coming from previously investigated sites can be critical for improving predictions by better constraining the estimation procedure. Bayesian inverse modeling provides a coherent framework to represent such ex-situ information by virtue of the prior distribution and combine them with in-situ information from the target site. In this study, we present an innovative data-driven approach for defining such informative priors for hydrogeological parameters at the target site. Our approach consists in two steps, both relying on statistical and machine learning methods. The first step is data selection; it consists in selecting sites similar to the target site. We use clustering methods for selecting similar sites based on observable hydrogeological features. The second step is data assimilation; it consists in assimilating data from the selected similar sites into the informative prior. We use a Bayesian hierarchical model to account for inter-site variability and to allow for the assimilation of multiple types of site-specific data. We present the application and validation of the presented methods on an established database of hydrogeological parameters. Data and methods are implemented in the form of an open-source R-package and therefore facilitate easy use by other practitioners.
International Nuclear Information System (INIS)
Nakamura, Makoto
2009-01-01
It is important for Level 1 PSA to quantify input reliability parameters and their uncertainty. Bayesian methods for inference of system/component unavailability, however, are not well studied. At present practitioners allocate the uncertainty (i.e. error factor) of the unavailability based on engineering judgment. Systematic methods based on Bayesian statistics are needed for quantification of such uncertainty. In this study we have developed a new method for Bayesian inference of unavailability, where the posterior of system/component unavailability is described by the inverted gamma distribution. We show that the average of the posterior comes close to the point estimate of the unavailability as the number of outages goes to infinity. That indicates validity of the new method. Using plant data recorded in NUCIA, we have applied the new method to inference of system unavailability under unplanned outages due to violations of LCO at BWRs in Japan. According to the inference results, the unavailability is populated in the order of 10 -5 -10 -4 and the error factor is within 1-2. Thus, the new Bayesian method allows one to quantify magnitudes and widths (i.e. error factor) of uncertainty distributions of unavailability. (author)
Kim, Daesang; El Gharamti, Iman; Bisetti, Fabrizio; Farooq, Aamir; Knio, Omar
2016-01-01
A new Bayesian inference method has been developed and applied to Furan shock tube experimental data for efficient statistical inferences of the Arrhenius parameters of two OH radical consumption reactions. The collected experimental data, which
Norros, Veera; Laine, Marko; Lignell, Risto; Thingstad, Frede
2017-10-01
Methods for extracting empirically and theoretically sound parameter values are urgently needed in aquatic ecosystem modelling to describe key flows and their variation in the system. Here, we compare three Bayesian formulations for mechanistic model parameterization that differ in their assumptions about the variation in parameter values between various datasets: 1) global analysis - no variation, 2) separate analysis - independent variation and 3) hierarchical analysis - variation arising from a shared distribution defined by hyperparameters. We tested these methods, using computer-generated and empirical data, coupled with simplified and reasonably realistic plankton food web models, respectively. While all methods were adequate, the simulated example demonstrated that a well-designed hierarchical analysis can result in the most accurate and precise parameter estimates and predictions, due to its ability to combine information across datasets. However, our results also highlighted sensitivity to hyperparameter prior distributions as an important caveat of hierarchical analysis. In the more complex empirical example, hierarchical analysis was able to combine precise identification of parameter values with reasonably good predictive performance, although the ranking of the methods was less straightforward. We conclude that hierarchical Bayesian analysis is a promising tool for identifying key ecosystem-functioning parameters and their variation from empirical datasets.
Detecting Hierarchical Structure in Networks
DEFF Research Database (Denmark)
Herlau, Tue; Mørup, Morten; Schmidt, Mikkel Nørgaard
2012-01-01
Many real-world networks exhibit hierarchical organization. Previous models of hierarchies within relational data has focused on binary trees; however, for many networks it is unknown whether there is hierarchical structure, and if there is, a binary tree might not account well for it. We propose...... a generative Bayesian model that is able to infer whether hierarchies are present or not from a hypothesis space encompassing all types of hierarchical tree structures. For efficient inference we propose a collapsed Gibbs sampling procedure that jointly infers a partition and its hierarchical structure....... On synthetic and real data we demonstrate that our model can detect hierarchical structure leading to better link-prediction than competing models. Our model can be used to detect if a network exhibits hierarchical structure, thereby leading to a better comprehension and statistical account the network....
Approximate Bayesian Computation by Subset Simulation using hierarchical state-space models
Vakilzadeh, Majid K.; Huang, Yong; Beck, James L.; Abrahamsson, Thomas
2017-02-01
A new multi-level Markov Chain Monte Carlo algorithm for Approximate Bayesian Computation, ABC-SubSim, has recently appeared that exploits the Subset Simulation method for efficient rare-event simulation. ABC-SubSim adaptively creates a nested decreasing sequence of data-approximating regions in the output space that correspond to increasingly closer approximations of the observed output vector in this output space. At each level, multiple samples of the model parameter vector are generated by a component-wise Metropolis algorithm so that the predicted output corresponding to each parameter value falls in the current data-approximating region. Theoretically, if continued to the limit, the sequence of data-approximating regions would converge on to the observed output vector and the approximate posterior distributions, which are conditional on the data-approximation region, would become exact, but this is not practically feasible. In this paper we study the performance of the ABC-SubSim algorithm for Bayesian updating of the parameters of dynamical systems using a general hierarchical state-space model. We note that the ABC methodology gives an approximate posterior distribution that actually corresponds to an exact posterior where a uniformly distributed combined measurement and modeling error is added. We also note that ABC algorithms have a problem with learning the uncertain error variances in a stochastic state-space model and so we treat them as nuisance parameters and analytically integrate them out of the posterior distribution. In addition, the statistical efficiency of the original ABC-SubSim algorithm is improved by developing a novel strategy to regulate the proposal variance for the component-wise Metropolis algorithm at each level. We demonstrate that Self-regulated ABC-SubSim is well suited for Bayesian system identification by first applying it successfully to model updating of a two degree-of-freedom linear structure for three cases: globally
Calibrated birth-death phylogenetic time-tree priors for bayesian inference.
Heled, Joseph; Drummond, Alexei J
2015-05-01
Here we introduce a general class of multiple calibration birth-death tree priors for use in Bayesian phylogenetic inference. All tree priors in this class separate ancestral node heights into a set of "calibrated nodes" and "uncalibrated nodes" such that the marginal distribution of the calibrated nodes is user-specified whereas the density ratio of the birth-death prior is retained for trees with equal values for the calibrated nodes. We describe two formulations, one in which the calibration information informs the prior on ranked tree topologies, through the (conditional) prior, and the other which factorizes the prior on divergence times and ranked topologies, thus allowing uniform, or any arbitrary prior distribution on ranked topologies. Although the first of these formulations has some attractive properties, the algorithm we present for computing its prior density is computationally intensive. However, the second formulation is always faster and computationally efficient for up to six calibrations. We demonstrate the utility of the new class of multiple-calibration tree priors using both small simulations and a real-world analysis and compare the results to existing schemes. The two new calibrated tree priors described in this article offer greater flexibility and control of prior specification in calibrated time-tree inference and divergence time dating, and will remove the need for indirect approaches to the assessment of the combined effect of calibration densities and tree priors in Bayesian phylogenetic inference. © The Author(s) 2014. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.
Inference of reactive transport model parameters using a Bayesian multivariate approach
Carniato, Luca; Schoups, Gerrit; van de Giesen, Nick
2014-08-01
Parameter estimation of subsurface transport models from multispecies data requires the definition of an objective function that includes different types of measurements. Common approaches are weighted least squares (WLS), where weights are specified a priori for each measurement, and weighted least squares with weight estimation (WLS(we)) where weights are estimated from the data together with the parameters. In this study, we formulate the parameter estimation task as a multivariate Bayesian inference problem. The WLS and WLS(we) methods are special cases in this framework, corresponding to specific prior assumptions about the residual covariance matrix. The Bayesian perspective allows for generalizations to cases where residual correlation is important and for efficient inference by analytically integrating out the variances (weights) and selected covariances from the joint posterior. Specifically, the WLS and WLS(we) methods are compared to a multivariate (MV) approach that accounts for specific residual correlations without the need for explicit estimation of the error parameters. When applied to inference of reactive transport model parameters from column-scale data on dissolved species concentrations, the following results were obtained: (1) accounting for residual correlation between species provides more accurate parameter estimation for high residual correlation levels whereas its influence for predictive uncertainty is negligible, (2) integrating out the (co)variances leads to an efficient estimation of the full joint posterior with a reduced computational effort compared to the WLS(we) method, and (3) in the presence of model structural errors, none of the methods is able to identify the correct parameter values.
DEFF Research Database (Denmark)
Kristensen, Anders Ringgaard; Søllested, Thomas Algot
2004-01-01
improvements. The biological model of the replacement model is described in a previous paper and in this paper the optimization model is described. The model is developed as a prototype for use under practical conditions. The application of the model is demonstrated using data from two commercial Danish sow......Recent methodological improvements in replacement models comprising multi-level hierarchical Markov processes and Bayesian updating have hardly been implemented in any replacement model and the aim of this study is to present a sow replacement model that really uses these methodological...... herds. It is concluded that the Bayesian updating technique and the hierarchical structure decrease the size of the state space dramatically. Since parameter estimates vary considerably among herds it is concluded that decision support concerning sow replacement only makes sense with parameters...
Le Maitre, Olivier
2015-01-07
We address model dimensionality reduction in the Bayesian inference of Gaussian fields, considering prior covariance function with unknown hyper-parameters. The Karhunen-Loeve (KL) expansion of a prior Gaussian process is traditionally derived assuming fixed covariance function with pre-assigned hyperparameter values. Thus, the modes strengths of the Karhunen-Loeve expansion inferred using available observations, as well as the resulting inferred process, dependent on the pre-assigned values for the covariance hyper-parameters. Here, we seek to infer the process and its the covariance hyper-parameters in a single Bayesian inference. To this end, the uncertainty in the hyper-parameters is treated by means of a coordinate transformation, leading to a KL-type expansion on a fixed reference basis of spatial modes, but with random coordinates conditioned on the hyper-parameters. A Polynomial Chaos (PC) expansion of the model prediction is also introduced to accelerate the Bayesian inference and the sampling of the posterior distribution with MCMC method. The PC expansion of the model prediction also rely on a coordinates transformation, enabling us to avoid expanding the dependence of the prediction with respect to the covariance hyper-parameters. We demonstrate the efficiency of the proposed method on a transient diffusion equation by inferring spatially-varying log-diffusivity fields from noisy data.
Directory of Open Access Journals (Sweden)
Mariana Inés Pocovi
2015-06-01
Full Text Available Understanding the population structure and genetic diversity in sugarcane (Saccharum officinarum L. accessions from INTA germplasm bank (Argentina will be of great importance for germplasm collection and breeding improvement as it will identify diverse parental combinations to create segregating progenies with maximum genetic variability for further selection. A Bayesian approach, ordination methods (PCoA, Principal Coordinate Analysis and clustering analysis (UPGMA, Unweighted Pair Group Method with Arithmetic Mean were applied to this purpose. Sixty three INTA sugarcane hybrids were genotyped for 107 Simple Sequence Repeat (SSR and 136 Amplified Fragment Length Polymorphism (AFLP loci. Given the low probability values found with AFLP for individual assignment (4.7%, microsatellites seemed to perform better (54% for STRUCTURE analysis that revealed the germplasm to exist in five optimum groups with partly corresponding to their origin. However clusters shown high degree of admixture, F ST values confirmed the existence of differences among groups. Dissimilarity coefficients ranged from 0.079 to 0.651. PCoA separated sugarcane in groups that did not agree with those identified by STRUCTURE. The clustering including all genotypes neither showed resemblance to populations find by STRUCTURE, but clustering performed considering only individuals displaying a proportional membership > 0.6 in their primary population obtained with STRUCTURE showed close similarities. The Bayesian method indubitably brought more information on cultivar origins than classical PCoA and hierarchical clustering method.
International Nuclear Information System (INIS)
Kelly, Brandon C.; Goodman, Alyssa A.; Shetty, Rahul; Stutz, Amelia M.; Launhardt, Ralf; Kauffmann, Jens
2012-01-01
We present a hierarchical Bayesian method for fitting infrared spectral energy distributions (SEDs) of dust emission to observed fluxes. Under the standard assumption of optically thin single temperature (T) sources, the dust SED as represented by a power-law-modified blackbody is subject to a strong degeneracy between T and the spectral index β. The traditional non-hierarchical approaches, typically based on χ 2 minimization, are severely limited by this degeneracy, as it produces an artificial anti-correlation between T and β even with modest levels of observational noise. The hierarchical Bayesian method rigorously and self-consistently treats measurement uncertainties, including calibration and noise, resulting in more precise SED fits. As a result, the Bayesian fits do not produce any spurious anti-correlations between the SED parameters due to measurement uncertainty. We demonstrate that the Bayesian method is substantially more accurate than the χ 2 fit in recovering the SED parameters, as well as the correlations between them. As an illustration, we apply our method to Herschel and submillimeter ground-based observations of the star-forming Bok globule CB244. This source is a small, nearby molecular cloud containing a single low-mass protostar and a starless core. We find that T and β are weakly positively correlated—in contradiction with the χ 2 fits, which indicate a T-β anti-correlation from the same data set. Additionally, in comparison to the χ 2 fits the Bayesian SED parameter estimates exhibit a reduced range in values.
Czech Academy of Sciences Publication Activity Database
Górecki, J.; Hofert, M.; Holeňa, Martin
2016-01-01
Roč. 46, č. 1 (2016), s. 21-59 ISSN 0925-9902 R&D Projects: GA ČR GA13-17187S Grant - others:Slezská univerzita v Opavě(CZ) SGS/21/2014 Institutional support: RVO:67985807 Keywords : Copula * Hierarchical archimedean copula * Copula estimation * Structure determination * Kendall’s tau * Bayesian classification Subject RIV: IN - Informatics, Computer Science Impact factor: 1.294, year: 2016
Combining information from multiple flood projections in a hierarchical Bayesian framework
Le Vine, Nataliya
2016-04-01
This study demonstrates, in the context of flood frequency analysis, the potential of a recently proposed hierarchical Bayesian approach to combine information from multiple models. The approach explicitly accommodates shared multimodel discrepancy as well as the probabilistic nature of the flood estimates, and treats the available models as a sample from a hypothetical complete (but unobserved) set of models. The methodology is applied to flood estimates from multiple hydrological projections (the Future Flows Hydrology data set) for 135 catchments in the UK. The advantages of the approach are shown to be: (1) to ensure adequate "baseline" with which to compare future changes; (2) to reduce flood estimate uncertainty; (3) to maximize use of statistical information in circumstances where multiple weak predictions individually lack power, but collectively provide meaningful information; (4) to diminish the importance of model consistency when model biases are large; and (5) to explicitly consider the influence of the (model performance) stationarity assumption. Moreover, the analysis indicates that reducing shared model discrepancy is the key to further reduction of uncertainty in the flood frequency analysis. The findings are of value regarding how conclusions about changing exposure to flooding are drawn, and to flood frequency change attribution studies.
Merging information from multi-model flood projections in a hierarchical Bayesian framework
Le Vine, Nataliya
2016-04-01
Multi-model ensembles are becoming widely accepted for flood frequency change analysis. The use of multiple models results in large uncertainty around estimates of flood magnitudes, due to both uncertainty in model selection and natural variability of river flow. The challenge is therefore to extract the most meaningful signal from the multi-model predictions, accounting for both model quality and uncertainties in individual model estimates. The study demonstrates the potential of a recently proposed hierarchical Bayesian approach to combine information from multiple models. The approach facilitates explicit treatment of shared multi-model discrepancy as well as the probabilistic nature of the flood estimates, by treating the available models as a sample from a hypothetical complete (but unobserved) set of models. The advantages of the approach are: 1) to insure an adequate 'baseline' conditions with which to compare future changes; 2) to reduce flood estimate uncertainty; 3) to maximize use of statistical information in circumstances where multiple weak predictions individually lack power, but collectively provide meaningful information; 4) to adjust multi-model consistency criteria when model biases are large; and 5) to explicitly consider the influence of the (model performance) stationarity assumption. Moreover, the analysis indicates that reducing shared model discrepancy is the key to further reduction of uncertainty in the flood frequency analysis. The findings are of value regarding how conclusions about changing exposure to flooding are drawn, and to flood frequency change attribution studies.
Commeau, Natalie; Cornu, Marie; Albert, Isabelle; Denis, Jean-Baptiste; Parent, Eric
2012-03-01
Assessing within-batch and between-batch variability is of major interest for risk assessors and risk managers in the context of microbiological contamination of food. For example, the ratio between the within-batch variability and the between-batch variability has a large impact on the results of a sampling plan. Here, we designed hierarchical Bayesian models to represent such variability. Compatible priors were built mathematically to obtain sound model comparisons. A numeric criterion is proposed to assess the contamination structure comparing the ability of the models to replicate grouped data at the batch level using a posterior predictive loss approach. Models were applied to two case studies: contamination by Listeria monocytogenes of pork breast used to produce diced bacon and contamination by the same microorganism on cold smoked salmon at the end of the process. In the first case study, a contamination structure clearly exists and is located at the batch level, that is, between batches variability is relatively strong, whereas in the second a structure also exists but is less marked. © 2012 Society for Risk Analysis.
Hierarchical Bayesian Spatio Temporal Model Comparison on the Earth Trapped Particle Forecast
International Nuclear Information System (INIS)
Suparta, Wayan; Gusrizal
2014-01-01
We compared two hierarchical Bayesian spatio temporal (HBST) results, Gaussian process (GP) and autoregressive (AR) models, on the Earth trapped particle forecast. Two models were employed on the South Atlantic Anomaly (SAA) region. Electron of >30 keV (mep0e1) from National Oceanic and Atmospheric Administration (NOAA) 15-18 satellites data was chosen as the particle modeled. We used two weeks data to perform the model fitting on a 5°x5° grid of longitude and latitude, and 31 August 2007 was set as the date of forecast. Three statistical validations were performed on the data, i.e. the root mean square error (RMSE), mean absolute percentage error (MAPE) and bias (BIAS). The statistical analysis showed that GP model performed better than AR with the average of RMSE = 0.38 and 0.63, MAPE = 11.98 and 17.30, and BIAS = 0.32 and 0.24, for GP and AR, respectively. Visual validation on both models with the NOAA map's also confirmed the superior of the GP than the AR. The variance of log flux minimum = 0.09 and 1.09, log flux maximum = 1.15 and 1.35, and in successively represents GP and AR
Love, C. A.; Skahill, B. E.; AghaKouchak, A.; Karlovits, G. S.; England, J. F.; Duren, A. M.
2017-12-01
We compare gridded extreme precipitation return levels obtained using spatial Bayesian hierarchical modeling (BHM) with their respective counterparts from a traditional regional frequency analysis (RFA) using the same set of extreme precipitation data. Our study area is the 11,478 square mile Willamette River basin (WRB) located in northwestern Oregon, a major tributary of the Columbia River whose 187 miles long main stem, the Willamette River, flows northward between the Coastal and Cascade Ranges. The WRB contains approximately two thirds of Oregon's population and 20 of the 25 most populous cities in the state. The U.S. Army Corps of Engineers (USACE) Portland District operates thirteen dams and extreme precipitation estimates are required to support risk informed hydrologic analyses as part of the USACE Dam Safety Program. Our intent is to profile for the USACE an alternate methodology to an RFA that was developed in 2008 due to the lack of an official NOAA Atlas 14 update for the state of Oregon. We analyze 24-hour annual precipitation maxima data for the WRB utilizing the spatial BHM R package "spatial.gev.bma", which has been shown to be efficient in developing coherent maps of extreme precipitation by return level. Our BHM modeling analysis involved application of leave-one-out cross validation (LOO-CV), which not only supported model selection but also a comprehensive assessment of location specific model performance. The LOO-CV results will provide a basis for the BHM RFA comparison.
A Bayesian Approach to Model Selection in Hierarchical Mixtures-of-Experts Architectures.
Tanner, Martin A.; Peng, Fengchun; Jacobs, Robert A.
1997-03-01
There does not exist a statistical model that shows good performance on all tasks. Consequently, the model selection problem is unavoidable; investigators must decide which model is best at summarizing the data for each task of interest. This article presents an approach to the model selection problem in hierarchical mixtures-of-experts architectures. These architectures combine aspects of generalized linear models with those of finite mixture models in order to perform tasks via a recursive "divide-and-conquer" strategy. Markov chain Monte Carlo methodology is used to estimate the distribution of the architectures' parameters. One part of our approach to model selection attempts to estimate the worth of each component of an architecture so that relatively unused components can be pruned from the architecture's structure. A second part of this approach uses a Bayesian hypothesis testing procedure in order to differentiate inputs that carry useful information from nuisance inputs. Simulation results suggest that the approach presented here adheres to the dictum of Occam's razor; simple architectures that are adequate for summarizing the data are favored over more complex structures. Copyright 1997 Elsevier Science Ltd. All Rights Reserved.
Okada, Kensuke; Vandekerckhove, Joachim; Lee, Michael D
2018-02-01
People often interact with environments that can provide only a finite number of items as resources. Eventually a book contains no more chapters, there are no more albums available from a band, and every Pokémon has been caught. When interacting with these sorts of environments, people either actively choose to quit collecting new items, or they are forced to quit when the items are exhausted. Modeling the distribution of how many items people collect before they quit involves untangling these two possibilities, We propose that censored geometric models are a useful basic technique for modeling the quitting distribution, and, show how, by implementing these models in a hierarchical and latent-mixture framework through Bayesian methods, they can be extended to capture the additional features of specific situations. We demonstrate this approach by developing and testing a series of models in two case studies involving real-world data. One case study deals with people choosing jokes from a recommender system, and the other deals with people completing items in a personality survey.
A bayesian hierarchical model for classification with selection of functional predictors.
Zhu, Hongxiao; Vannucci, Marina; Cox, Dennis D
2010-06-01
In functional data classification, functional observations are often contaminated by various systematic effects, such as random batch effects caused by device artifacts, or fixed effects caused by sample-related factors. These effects may lead to classification bias and thus should not be neglected. Another issue of concern is the selection of functions when predictors consist of multiple functions, some of which may be redundant. The above issues arise in a real data application where we use fluorescence spectroscopy to detect cervical precancer. In this article, we propose a Bayesian hierarchical model that takes into account random batch effects and selects effective functions among multiple functional predictors. Fixed effects or predictors in nonfunctional form are also included in the model. The dimension of the functional data is reduced through orthonormal basis expansion or functional principal components. For posterior sampling, we use a hybrid Metropolis-Hastings/Gibbs sampler, which suffers slow mixing. An evolutionary Monte Carlo algorithm is applied to improve the mixing. Simulation and real data application show that the proposed model provides accurate selection of functional predictors as well as good classification.
Directory of Open Access Journals (Sweden)
Eils Roland
2006-06-01
Full Text Available Abstract Background The subcellular location of a protein is closely related to its function. It would be worthwhile to develop a method to predict the subcellular location for a given protein when only the amino acid sequence of the protein is known. Although many efforts have been made to predict subcellular location from sequence information only, there is the need for further research to improve the accuracy of prediction. Results A novel method called HensBC is introduced to predict protein subcellular location. HensBC is a recursive algorithm which constructs a hierarchical ensemble of classifiers. The classifiers used are Bayesian classifiers based on Markov chain models. We tested our method on six various datasets; among them are Gram-negative bacteria dataset, data for discriminating outer membrane proteins and apoptosis proteins dataset. We observed that our method can predict the subcellular location with high accuracy. Another advantage of the proposed method is that it can improve the accuracy of the prediction of some classes with few sequences in training and is therefore useful for datasets with imbalanced distribution of classes. Conclusion This study introduces an algorithm which uses only the primary sequence of a protein to predict its subcellular location. The proposed recursive scheme represents an interesting methodology for learning and combining classifiers. The method is computationally efficient and competitive with the previously reported approaches in terms of prediction accuracies as empirical results indicate. The code for the software is available upon request.
Mapping brucellosis increases relative to elk density using hierarchical Bayesian models
Cross, Paul C.; Heisey, Dennis M.; Scurlock, Brandon M.; Edwards, William H.; Brennan, Angela; Ebinger, Michael R.
2010-01-01
The relationship between host density and parasite transmission is central to the effectiveness of many disease management strategies. Few studies, however, have empirically estimated this relationship particularly in large mammals. We applied hierarchical Bayesian methods to a 19-year dataset of over 6400 brucellosis tests of adult female elk (Cervus elaphus) in northwestern Wyoming. Management captures that occurred from January to March were over two times more likely to be seropositive than hunted elk that were killed in September to December, while accounting for site and year effects. Areas with supplemental feeding grounds for elk had higher seroprevalence in 1991 than other regions, but by 2009 many areas distant from the feeding grounds were of comparable seroprevalence. The increases in brucellosis seroprevalence were correlated with elk densities at the elk management unit, or hunt area, scale (mean 2070 km2; range = [95–10237]). The data, however, could not differentiate among linear and non-linear effects of host density. Therefore, control efforts that focus on reducing elk densities at a broad spatial scale were only weakly supported. Additional research on how a few, large groups within a region may be driving disease dynamics is needed for more targeted and effective management interventions. Brucellosis appears to be expanding its range into new regions and elk populations, which is likely to further complicate the United States brucellosis eradication program. This study is an example of how the dynamics of host populations can affect their ability to serve as disease reservoirs.
Mapping brucellosis increases relative to elk density using hierarchical Bayesian models.
Directory of Open Access Journals (Sweden)
Paul C Cross
Full Text Available The relationship between host density and parasite transmission is central to the effectiveness of many disease management strategies. Few studies, however, have empirically estimated this relationship particularly in large mammals. We applied hierarchical Bayesian methods to a 19-year dataset of over 6400 brucellosis tests of adult female elk (Cervus elaphus in northwestern Wyoming. Management captures that occurred from January to March were over two times more likely to be seropositive than hunted elk that were killed in September to December, while accounting for site and year effects. Areas with supplemental feeding grounds for elk had higher seroprevalence in 1991 than other regions, but by 2009 many areas distant from the feeding grounds were of comparable seroprevalence. The increases in brucellosis seroprevalence were correlated with elk densities at the elk management unit, or hunt area, scale (mean 2070 km(2; range = [95-10237]. The data, however, could not differentiate among linear and non-linear effects of host density. Therefore, control efforts that focus on reducing elk densities at a broad spatial scale were only weakly supported. Additional research on how a few, large groups within a region may be driving disease dynamics is needed for more targeted and effective management interventions. Brucellosis appears to be expanding its range into new regions and elk populations, which is likely to further complicate the United States brucellosis eradication program. This study is an example of how the dynamics of host populations can affect their ability to serve as disease reservoirs.
Directory of Open Access Journals (Sweden)
Santosh Jatrana
Full Text Available The aim of this paper was to see whether all-cause and cause-specific mortality rates vary between Asian ethnic subgroups, and whether overseas born Asian subgroup mortality rate ratios varied by nativity and duration of residence. We used hierarchical Bayesian methods to allow for sparse data in the analysis of linked census-mortality data for 25-75 year old New Zealanders. We found directly standardised posterior all-cause and cardiovascular mortality rates were highest for the Indian ethnic group, significantly so when compared with those of Chinese ethnicity. In contrast, cancer mortality rates were lowest for ethnic Indians. Asian overseas born subgroups have about 70% of the mortality rate of their New Zealand born Asian counterparts, a result that showed little variation by Asian subgroup or cause of death. Within the overseas born population, all-cause mortality rates for migrants living 0-9 years in New Zealand were about 60% of the mortality rate of those living more than 25 years in New Zealand regardless of ethnicity. The corresponding figure for cardiovascular mortality rates was 50%. However, while Chinese cancer mortality rates increased with duration of residence, Indian and Other Asian cancer mortality rates did not. Future research on the mechanisms of worsening of health with increased time spent in the host country is required to improve the understanding of the process, and would assist the policy-makers and health planners.
Subjective value of risky foods for individual domestic chicks: a hierarchical Bayesian model.
Kawamori, Ai; Matsushima, Toshiya
2010-05-01
For animals to decide which prey to attack, the gain and delay of the food item must be integrated in a value function. However, the subjective value is not obtained by expected profitability when it is accompanied by risk. To estimate the subjective value, we examined choices in a cross-shaped maze with two colored feeders in domestic chicks. When tested by a reversal in food amount or delay, chicks changed choices similarly in both conditions (experiment 1). We therefore examined risk sensitivity for amount and delay (experiment 2) by supplying one feeder with food of fixed profitability and the alternative feeder with high- or low-profitability food at equal probability. Profitability varied in amount (groups 1 and 2 at high and low variance) or in delay (group 3). To find the equilibrium, the amount (groups 1 and 2) or delay (group 3) of the food in the fixed feeder was adjusted in a total of 18 blocks. The Markov chain Monte Carlo method was applied to a hierarchical Bayesian model to estimate the subjective value. Chicks undervalued the variable feeder in group 1 and were indifferent in group 2 but overvalued the variable feeder in group 3 at a population level. Re-examination without the titration procedure (experiment 3) suggested that the subjective value was not absolute for each option. When the delay was varied, the variable option was often given a paradoxically high value depending on fixed alternative. Therefore, the basic assumption of the uniquely determined value function might be questioned.
A Bayesian hierarchical model with novel prior specifications for estimating HIV testing rates.
An, Qian; Kang, Jian; Song, Ruiguang; Hall, H Irene
2016-04-30
Human immunodeficiency virus (HIV) infection is a severe infectious disease actively spreading globally, and acquired immunodeficiency syndrome (AIDS) is an advanced stage of HIV infection. The HIV testing rate, that is, the probability that an AIDS-free HIV infected person seeks a test for HIV during a particular time interval, given no previous positive test has been obtained prior to the start of the time, is an important parameter for public health. In this paper, we propose a Bayesian hierarchical model with two levels of hierarchy to estimate the HIV testing rate using annual AIDS and AIDS-free HIV diagnoses data. At level one, we model the latent number of HIV infections for each year using a Poisson distribution with the intensity parameter representing the HIV incidence rate. At level two, the annual numbers of AIDS and AIDS-free HIV diagnosed cases and all undiagnosed cases stratified by the HIV infections at different years are modeled using a multinomial distribution with parameters including the HIV testing rate. We propose a new class of priors for the HIV incidence rate and HIV testing rate taking into account the temporal dependence of these parameters to improve the estimation accuracy. We develop an efficient posterior computation algorithm based on the adaptive rejection metropolis sampling technique. We demonstrate our model using simulation studies and the analysis of the national HIV surveillance data in the USA. Copyright © 2015 John Wiley & Sons, Ltd.
Albert, Carlo; Ulzega, Simone; Stoop, Ruedi
2016-04-01
Parameter inference is a fundamental problem in data-driven modeling. Given observed data that is believed to be a realization of some parameterized model, the aim is to find parameter values that are able to explain the observed data. In many situations, the dominant sources of uncertainty must be included into the model for making reliable predictions. This naturally leads to stochastic models. Stochastic models render parameter inference much harder, as the aim then is to find a distribution of likely parameter values. In Bayesian statistics, which is a consistent framework for data-driven learning, this so-called posterior distribution can be used to make probabilistic predictions. We propose a novel, exact, and very efficient approach for generating posterior parameter distributions for stochastic differential equation models calibrated to measured time series. The algorithm is inspired by reinterpreting the posterior distribution as a statistical mechanics partition function of an object akin to a polymer, where the measurements are mapped on heavier beads compared to those of the simulated data. To arrive at distribution samples, we employ a Hamiltonian Monte Carlo approach combined with a multiple time-scale integration. A separation of time scales naturally arises if either the number of measurement points or the number of simulation points becomes large. Furthermore, at least for one-dimensional problems, we can decouple the harmonic modes between measurement points and solve the fastest part of their dynamics analytically. Our approach is applicable to a wide range of inference problems and is highly parallelizable.
Nonparametric Bayesian inference for mean residual life functions in survival analysis.
Poynor, Valerie; Kottas, Athanasios
2018-01-19
Modeling and inference for survival analysis problems typically revolves around different functions related to the survival distribution. Here, we focus on the mean residual life (MRL) function, which provides the expected remaining lifetime given that a subject has survived (i.e. is event-free) up to a particular time. This function is of direct interest in reliability, medical, and actuarial fields. In addition to its practical interpretation, the MRL function characterizes the survival distribution. We develop general Bayesian nonparametric inference for MRL functions built from a Dirichlet process mixture model for the associated survival distribution. The resulting model for the MRL function admits a representation as a mixture of the kernel MRL functions with time-dependent mixture weights. This model structure allows for a wide range of shapes for the MRL function. Particular emphasis is placed on the selection of the mixture kernel, taken to be a gamma distribution, to obtain desirable properties for the MRL function arising from the mixture model. The inference method is illustrated with a data set of two experimental groups and a data set involving right censoring. The supplementary material available at Biostatistics online provides further results on empirical performance of the model, using simulated data examples. © The Author 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Bayesian modeling using WinBUGS
Ntzoufras, Ioannis
2009-01-01
A hands-on introduction to the principles of Bayesian modeling using WinBUGS Bayesian Modeling Using WinBUGS provides an easily accessible introduction to the use of WinBUGS programming techniques in a variety of Bayesian modeling settings. The author provides an accessible treatment of the topic, offering readers a smooth introduction to the principles of Bayesian modeling with detailed guidance on the practical implementation of key principles. The book begins with a basic introduction to Bayesian inference and the WinBUGS software and goes on to cover key topics, including: Markov Chain Monte Carlo algorithms in Bayesian inference Generalized linear models Bayesian hierarchical models Predictive distribution and model checking Bayesian model and variable evaluation Computational notes and screen captures illustrate the use of both WinBUGS as well as R software to apply the discussed techniques. Exercises at the end of each chapter allow readers to test their understanding of the presented concepts and all ...
Gopalan, Giri; Hrafnkelsson, Birgir; Aðalgeirsdóttir, Guðfinna; Jarosch, Alexander H.; Pálsson, Finnur
2018-03-01
Bayesian hierarchical modeling can assist the study of glacial dynamics and ice flow properties. This approach will allow glaciologists to make fully probabilistic predictions for the thickness of a glacier at unobserved spatio-temporal coordinates, and it will also allow for the derivation of posterior probability distributions for key physical parameters such as ice viscosity and basal sliding. The goal of this paper is to develop a proof of concept for a Bayesian hierarchical model constructed, which uses exact analytical solutions for the shallow ice approximation (SIA) introduced by Bueler et al. (2005). A suite of test simulations utilizing these exact solutions suggests that this approach is able to adequately model numerical errors and produce useful physical parameter posterior distributions and predictions. A byproduct of the development of the Bayesian hierarchical model is the derivation of a novel finite difference method for solving the SIA partial differential equation (PDE). An additional novelty of this work is the correction of numerical errors induced through a numerical solution using a statistical model. This error correcting process models numerical errors that accumulate forward in time and spatial variation of numerical errors between the dome, interior, and margin of a glacier.
Pajak, Bozena; Fine, Alex B; Kleinschmidt, Dave F; Jaeger, T Florian
2016-12-01
We present a framework of second and additional language (L2/L n ) acquisition motivated by recent work on socio-indexical knowledge in first language (L1) processing. The distribution of linguistic categories covaries with socio-indexical variables (e.g., talker identity, gender, dialects). We summarize evidence that implicit probabilistic knowledge of this covariance is critical to L1 processing, and propose that L2/L n learning uses the same type of socio-indexical information to probabilistically infer latent hierarchical structure over previously learned and new languages. This structure guides the acquisition of new languages based on their inferred place within that hierarchy, and is itself continuously revised based on new input from any language. This proposal unifies L1 processing and L2/L n acquisition as probabilistic inference under uncertainty over socio-indexical structure. It also offers a new perspective on crosslinguistic influences during L2/L n learning, accommodating gradient and continued transfer (both negative and positive) from previously learned to novel languages, and vice versa.
Extreme-Scale Bayesian Inference for Uncertainty Quantification of Complex Simulations
Energy Technology Data Exchange (ETDEWEB)
Biros, George [Univ. of Texas, Austin, TX (United States)
2018-01-12
Uncertainty quantification (UQ)—that is, quantifying uncertainties in complex mathematical models and their large-scale computational implementations—is widely viewed as one of the outstanding challenges facing the field of CS&E over the coming decade. The EUREKA project set to address the most difficult class of UQ problems: those for which both the underlying PDE model as well as the uncertain parameters are of extreme scale. In the project we worked on these extreme-scale challenges in the following four areas: 1. Scalable parallel algorithms for sampling and characterizing the posterior distribution that exploit the structure of the underlying PDEs and parameter-to-observable map. These include structure-exploiting versions of the randomized maximum likelihood method, which aims to overcome the intractability of employing conventional MCMC methods for solving extreme-scale Bayesian inversion problems by appealing to and adapting ideas from large-scale PDE-constrained optimization, which have been very successful at exploring high-dimensional spaces. 2. Scalable parallel algorithms for construction of prior and likelihood functions based on learning methods and non-parametric density estimation. Constructing problem-specific priors remains a critical challenge in Bayesian inference, and more so in high dimensions. Another challenge is construction of likelihood functions that capture unmodeled couplings between observations and parameters. We will create parallel algorithms for non-parametric density estimation using high dimensional N-body methods and combine them with supervised learning techniques for the construction of priors and likelihood functions. 3. Bayesian inadequacy models, which augment physics models with stochastic models that represent their imperfections. The success of the Bayesian inference framework depends on the ability to represent the uncertainty due to imperfections of the mathematical model of the phenomena of interest. This is a
A Bayesian method for inferring transmission chains in a partially observed epidemic.
Energy Technology Data Exchange (ETDEWEB)
Marzouk, Youssef M.; Ray, Jaideep
2008-10-01
We present a Bayesian approach for estimating transmission chains and rates in the Abakaliki smallpox epidemic of 1967. The epidemic affected 30 individuals in a community of 74; only the dates of appearance of symptoms were recorded. Our model assumes stochastic transmission of the infections over a social network. Distinct binomial random graphs model intra- and inter-compound social connections, while disease transmission over each link is treated as a Poisson process. Link probabilities and rate parameters are objects of inference. Dates of infection and recovery comprise the remaining unknowns. Distributions for smallpox incubation and recovery periods are obtained from historical data. Using Markov chain Monte Carlo, we explore the joint posterior distribution of the scalar parameters and provide an expected connectivity pattern for the social graph and infection pathway.
Mohammad-Djafari, Ali
2015-01-01
The main object of this tutorial article is first to review the main inference tools using Bayesian approach, Entropy, Information theory and their corresponding geometries. This review is focused mainly on the ways these tools have been used in data, signal and image processing. After a short introduction of the different quantities related to the Bayes rule, the entropy and the Maximum Entropy Principle (MEP), relative entropy and the Kullback-Leibler divergence, Fisher information, we will study their use in different fields of data and signal processing such as: entropy in source separation, Fisher information in model order selection, different Maximum Entropy based methods in time series spectral estimation and finally, general linear inverse problems.
Mocapy++ - A toolkit for inference and learning in dynamic Bayesian networks
Directory of Open Access Journals (Sweden)
Hamelryck Thomas
2010-03-01
Full Text Available Abstract Background Mocapy++ is a toolkit for parameter learning and inference in dynamic Bayesian networks (DBNs. It supports a wide range of DBN architectures and probability distributions, including distributions from directional statistics (the statistics of angles, directions and orientations. Results The program package is freely available under the GNU General Public Licence (GPL from SourceForge http://sourceforge.net/projects/mocapy. The package contains the source for building the Mocapy++ library, several usage examples and the user manual. Conclusions Mocapy++ is especially suitable for constructing probabilistic models of biomolecular structure, due to its support for directional statistics. In particular, it supports the Kent distribution on the sphere and the bivariate von Mises distribution on the torus. These distributions have proven useful to formulate probabilistic models of protein and RNA structure in atomic detail.
Energy Technology Data Exchange (ETDEWEB)
Chakraborty, Shubhankar; Das, Prasanta Kr., E-mail: pkd@mech.iitkgp.ernet.in [Department of Mechanical Engineering, Indian Institute of Technology Kharagpur, Kharagpur 721302 (India); Roy Chaudhuri, Partha [Department of Physics, Indian Institute of Technology Kharagpur, Kharagpur 721302 (India)
2016-07-15
In this communication, a novel optical technique has been proposed for the reconstruction of the shape of a Taylor bubble using measurements from multiple arrays of optical sensors. The deviation of an optical beam passing through the bubble depends on the contour of bubble surface. A theoretical model of the deviation of a beam during the traverse of a Taylor bubble through it has been developed. Using this model and the time history of the deviation captured by the sensor array, the bubble shape has been reconstructed. The reconstruction has been performed using an inverse algorithm based on Bayesian inference technique and Markov chain Monte Carlo sampling algorithm. The reconstructed nose shape has been compared with the true shape, extracted through image processing of high speed images. Finally, an error analysis has been performed to pinpoint the sources of the errors.
Multi-Objective data analysis using Bayesian Inference for MagLIF experiments
Knapp, Patrick; Glinksy, Michael; Evans, Matthew; Gom, Matth; Han, Stephanie; Harding, Eric; Slutz, Steve; Hahn, Kelly; Harvey-Thompson, Adam; Geissel, Matthias; Ampleford, David; Jennings, Christopher; Schmit, Paul; Smith, Ian; Schwarz, Jens; Peterson, Kyle; Jones, Brent; Rochau, Gregory; Sinars, Daniel
2017-10-01
The MagLIF concept has recently demonstrated Gbar pressures and confinement of charged fusion products at stagnation. We present a new analysis methodology that allows for integration of multiple diagnostics including nuclear, x-ray imaging, and x-ray power to determine the temperature, pressure, liner areal density, and mix fraction. A simplified hot-spot model is used with a Bayesian inference network to determine the most probable model parameters that describe the observations while simultaneously revealing the principal uncertainties in the analysis. Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC., a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-NA-0003525.
A combined evidence Bayesian method for human ancestry inference applied to Afro-Colombians.
Rishishwar, Lavanya; Conley, Andrew B; Vidakovic, Brani; Jordan, I King
2015-12-15
Uniparental genetic markers, mitochondrial DNA (mtDNA) and Y chromosomal DNA, are widely used for the inference of human ancestry. However, the resolution of ancestral origins based on mtDNA haplotypes is limited by the fact that such haplotypes are often found to be distributed across wide geographical regions. We have addressed this issue here by combining two sources of ancestry information that have typically been considered separately: historical records regarding population origins and genetic information on mtDNA haplotypes. To combine these distinct data sources, we applied a Bayesian approach that considers historical records, in the form of prior probabilities, together with data on the geographical distribution of mtDNA haplotypes, formulated as likelihoods, to yield ancestry assignments from posterior probabilities. This combined evidence Bayesian approach to ancestry assignment was evaluated for its ability to accurately assign sub-continental African ancestral origins to Afro-Colombians based on their mtDNA haplotypes. We demonstrate that the incorporation of historical prior probabilities via this analytical framework can provide for substantially increased resolution in sub-continental African ancestry assignment for members of this population. In addition, a personalized approach to ancestry assignment that involves the tuning of priors to individual mtDNA haplotypes yields even greater resolution for individual ancestry assignment. Despite the fact that Colombia has a large population of Afro-descendants, the ancestry of this community has been understudied relative to populations with primarily European and Native American ancestry. Thus, the application of the kind of combined evidence approach developed here to the study of ancestry in the Afro-Colombian population has the potential to be impactful. The formal Bayesian analytical framework we propose for combining historical and genetic information also has the potential to be widely applied
Khan, Haseeb A; Arif, Ibrahim A; Bahkali, Ali H; Al Farhan, Ahmad H; Al Homaidan, Ali A
2008-10-06
This investigation was aimed to compare the inference of antelope phylogenies resulting from the 16S rRNA, cytochrome-b (cyt-b) and d-loop segments of mitochondrial DNA using three different computational models including Bayesian (BA), maximum parsimony (MP) and unweighted pair group method with arithmetic mean (UPGMA). The respective nucleotide sequences of three Oryx species (Oryx leucoryx, Oryx dammah and Oryx gazella) and an out-group (Addax nasomaculatus) were aligned and subjected to BA, MP and UPGMA models for comparing the topologies of respective phylogenetic trees. The 16S rRNA region possessed the highest frequency of conserved sequences (97.65%) followed by cyt-b (94.22%) and d-loop (87.29%). There were few transitions (2.35%) and none transversions in 16S rRNA as compared to cyt-b (5.61% transitions and 0.17% transversions) and d-loop (11.57% transitions and 1.14% transversions) while comparing the four taxa. All the three mitochondrial segments clearly differentiated the genus Addax from Oryx using the BA or UPGMA models. The topologies of all the gamma-corrected Bayesian trees were identical irrespective of the marker type. The UPGMA trees resulting from 16S rRNA and d-loop sequences were also identical (Oryx dammah grouped with Oryx leucoryx) to Bayesian trees except that the UPGMA tree based on cyt-b showed a slightly different phylogeny (Oryx dammah grouped with Oryx gazella) with a low bootstrap support. However, the MP model failed to differentiate the genus Addax from Oryx. These findings demonstrate the efficiency and robustness of BA and UPGMA methods for phylogenetic analysis of antelopes using mitochondrial markers.
Using Approximate Bayesian Computation to infer sex ratios from acoustic data.
Lehnen, Lisa; Schorcht, Wigbert; Karst, Inken; Biedermann, Martin; Kerth, Gerald; Puechmaille, Sebastien J
2018-01-01
Population sex ratios are of high ecological relevance, but are challenging to determine in species lacking conspicuous external cues indicating their sex. Acoustic sexing is an option if vocalizations differ between sexes, but is precluded by overlapping distributions of the values of male and female vocalizations in many species. A method allowing the inference of sex ratios despite such an overlap will therefore greatly increase the information extractable from acoustic data. To meet this demand, we developed a novel approach using Approximate Bayesian Computation (ABC) to infer the sex ratio of populations from acoustic data. Additionally, parameters characterizing the male and female distribution of acoustic values (mean and standard deviation) are inferred. This information is then used to probabilistically assign a sex to a single acoustic signal. We furthermore develop a simpler means of sex ratio estimation based on the exclusion of calls from the overlap zone. Applying our methods to simulated data demonstrates that sex ratio and acoustic parameter characteristics of males and females are reliably inferred by the ABC approach. Applying both the ABC and the exclusion method to empirical datasets (echolocation calls recorded in colonies of lesser horseshoe bats, Rhinolophus hipposideros) provides similar sex ratios as molecular sexing. Our methods aim to facilitate evidence-based conservation, and to benefit scientists investigating ecological or conservation questions related to sex- or group specific behaviour across a wide range of organisms emitting acoustic signals. The developed methodology is non-invasive, low-cost and time-efficient, thus allowing the study of many sites and individuals. We provide an R-script for the easy application of the method and discuss potential future extensions and fields of applications. The script can be easily adapted to account for numerous biological systems by adjusting the type and number of groups to be
Bayesian electron density inference from JET lithium beam emission spectra using Gaussian processes
Kwak, Sehyun; Svensson, J.; Brix, M.; Ghim, Y.-C.; Contributors, JET
2017-03-01
A Bayesian model to infer edge electron density profiles is developed for the JET lithium beam emission spectroscopy (Li-BES) system, measuring Li I (2p-2s) line radiation using 26 channels with ∼1 cm spatial resolution and 10∼ 20 ms temporal resolution. The density profile is modelled using a Gaussian process prior, and the uncertainty of the density profile is calculated by a Markov Chain Monte Carlo (MCMC) scheme. From the spectra measured by the transmission grating spectrometer, the Li I line intensities are extracted, and modelled as a function of the plasma density by a multi-state model which describes the relevant processes between neutral lithium beam atoms and plasma particles. The spectral model fully takes into account interference filter and instrument effects, that are separately estimated, again using Gaussian processes. The line intensities are inferred based on a spectral model consistent with the measured spectra within their uncertainties, which includes photon statistics and electronic noise. Our newly developed method to infer JET edge electron density profiles has the following advantages in comparison to the conventional method: (i) providing full posterior distributions of edge density profiles, including their associated uncertainties, (ii) the available radial range for density profiles is increased to the full observation range (∼26 cm), (iii) an assumption of monotonic electron density profile is not necessary, (iv) the absolute calibration factor of the diagnostic system is automatically estimated overcoming the limitation of the conventional technique and allowing us to infer the electron density profiles for all pulses without preprocessing the data or an additional boundary condition, and (v) since the full spectrum is modelled, the procedure of modulating the beam to measure the background signal is only necessary for the case of overlapping of the Li I line with impurity lines.
Yu, Bin; Xu, Jia-Meng; Li, Shan; Chen, Cheng; Chen, Rui-Xin; Wang, Lei; Zhang, Yan; Wang, Ming-Hui
2017-10-06
Gene regulatory networks (GRNs) research reveals complex life phenomena from the perspective of gene interaction, which is an important research field in systems biology. Traditional Bayesian networks have a high computational complexity, and the network structure scoring model has a single feature. Information-based approaches cannot identify the direction of regulation. In order to make up for the shortcomings of the above methods, this paper presents a novel hybrid learning method (DBNCS) based on dynamic Bayesian network (DBN) to construct the multiple time-delayed GRNs for the first time, combining the comprehensive score (CS) with the DBN model. DBNCS algorithm first uses CMI2NI (conditional mutual inclusive information-based network inference) algorithm for network structure profiles learning, namely the construction of search space. Then the redundant regulations are removed by using the recursive optimization algorithm (RO), thereby reduce the false positive rate. Secondly, the network structure profiles are decomposed into a set of cliques without loss, which can significantly reduce the computational complexity. Finally, DBN model is used to identify the direction of gene regulation within the cliques and search for the optimal network structure. The performance of DBNCS algorithm is evaluated by the benchmark GRN datasets from DREAM challenge as well as the SOS DNA repair network in Escherichia coli , and compared with other state-of-the-art methods. The experimental results show the rationality of the algorithm design and the outstanding performance of the GRNs.
Bayesian inference for the distribution of grams of marijuana in a joint.
Ridgeway, Greg; Kilmer, Beau
2016-08-01
The average amount of marijuana in a joint is unknown, yet this figure is a critical quantity for creating credible measures of marijuana consumption. It is essential for projecting tax revenues post-legalization, estimating the size of illicit marijuana markets, and learning about how much marijuana users are consuming in order to understand health and behavioral consequences. Arrestee Drug Abuse Monitoring data collected between 2000 and 2010 contain relevant information on 10,628 marijuana transactions, joints and loose marijuana purchases, including the city in which the purchase occurred and the price paid for the marijuana. Using the Brown-Silverman drug pricing model to link marijuana price and weight, we are able to infer the distribution of grams of marijuana in a joint and provide a Bayesian posterior distribution for the mean weight of marijuana in a joint. We estimate that the mean weight of marijuana in a joint is 0.32g (95% Bayesian posterior interval: 0.30-0.35). Our estimate of the mean weight of marijuana in a joint is lower than figures commonly used to make estimates of marijuana consumption. These estimates can be incorporated into drug policy discussions to produce better understanding about illicit marijuana markets, the size of potential legalized marijuana markets, and health and behavior outcomes. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Jiménez, José; García, Emilio J; Llaneza, Luis; Palacios, Vicente; González, Luis Mariano; García-Domínguez, Francisco; Múñoz-Igualada, Jaime; López-Bao, José Vicente
2016-08-01
In many cases, the first step in large-carnivore management is to obtain objective, reliable, and cost-effective estimates of population parameters through procedures that are reproducible over time. However, monitoring predators over large areas is difficult, and the data have a high level of uncertainty. We devised a practical multimethod and multistate modeling approach based on Bayesian hierarchical-site-occupancy models that combined multiple survey methods to estimate different population states for use in monitoring large predators at a regional scale. We used wolves (Canis lupus) as our model species and generated reliable estimates of the number of sites with wolf reproduction (presence of pups). We used 2 wolf data sets from Spain (Western Galicia in 2013 and Asturias in 2004) to test the approach. Based on howling surveys, the naïve estimation (i.e., estimate based only on observations) of the number of sites with reproduction was 9 and 25 sites in Western Galicia and Asturias, respectively. Our model showed 33.4 (SD 9.6) and 34.4 (3.9) sites with wolf reproduction, respectively. The number of occupied sites with wolf reproduction was 0.67 (SD 0.19) and 0.76 (0.11), respectively. This approach can be used to design more cost-effective monitoring programs (i.e., to define the sampling effort needed per site). Our approach should inspire well-coordinated surveys across multiple administrative borders and populations and lead to improved decision making for management of large carnivores on a landscape level. The use of this Bayesian framework provides a simple way to visualize the degree of uncertainty around population-parameter estimates and thus provides managers and stakeholders an intuitive approach to interpreting monitoring results. Our approach can be widely applied to large spatial scales in wildlife monitoring where detection probabilities differ between population states and where several methods are being used to estimate different population
A Bayesian Hierarchical Modeling Approach to Predicting Flow in Ungauged Basins
Gronewold, A.; Alameddine, I.; Anderson, R. M.
2009-12-01
Recent innovative approaches to identifying and applying regression-based relationships between land use patterns (such as increasing impervious surface area and decreasing vegetative cover) and rainfall-runoff model parameters represent novel and promising improvements to predicting flow from ungauged basins. In particular, these approaches allow for predicting flows under uncertain and potentially variable future conditions due to rapid land cover changes, variable climate conditions, and other factors. Despite the broad range of literature on estimating rainfall-runoff model parameters, however, the absence of a robust set of modeling tools for identifying and quantifying uncertainties in (and correlation between) rainfall-runoff model parameters represents a significant gap in current hydrological modeling research. Here, we build upon a series of recent publications promoting novel Bayesian and probabilistic modeling strategies for quantifying rainfall-runoff model parameter estimation uncertainty. Our approach applies alternative measures of rainfall-runoff model parameter joint likelihood (including Nash-Sutcliffe efficiency, among others) to simulate samples from the joint parameter posterior probability density function. We then use these correlated samples as response variables in a Bayesian hierarchical model with land use coverage data as predictor variables in order to develop a robust land use-based tool for forecasting flow in ungauged basins while accounting for, and explicitly acknowledging, parameter estimation uncertainty. We apply this modeling strategy to low-relief coastal watersheds of Eastern North Carolina, an area representative of coastal resource waters throughout the world because of its sensitive embayments and because of the abundant (but currently threatened) natural resources it hosts. Consequently, this area is the subject of several ongoing studies and large-scale planning initiatives, including those conducted through the United
Sub-seasonal-to-seasonal Reservoir Inflow Forecast using Bayesian Hierarchical Hidden Markov Model
Mukhopadhyay, S.; Arumugam, S.
2017-12-01
Sub-seasonal-to-seasonal (S2S) (15-90 days) streamflow forecasting is an emerging area of research that provides seamless information for reservoir operation from weather time scales to seasonal time scales. From an operational perspective, sub-seasonal inflow forecasts are highly valuable as these enable water managers to decide short-term releases (15-30 days), while holding water for seasonal needs (e.g., irrigation and municipal supply) and to meet end-of-the-season target storage at a desired level. We propose a Bayesian Hierarchical Hidden Markov Model (BHHMM) to develop S2S inflow forecasts for the Tennessee Valley Area (TVA) reservoir system. Here, the hidden states are predicted by relevant indices that influence the inflows at S2S time scale. The hidden Markov model also captures the both spatial and temporal hierarchy in predictors that operate at S2S time scale with model parameters being estimated as a posterior distribution using a Bayesian framework. We present our work in two steps, namely single site model and multi-site model. For proof of concept, we consider inflows to Douglas Dam, Tennessee, in the single site model. For multisite model we consider reservoirs in the upper Tennessee valley. Streamflow forecasts are issued and updated continuously every day at S2S time scale. We considered precipitation forecasts obtained from NOAA Climate Forecast System (CFSv2) GCM as predictors for developing S2S streamflow forecasts along with relevant indices for predicting hidden states. Spatial dependence of the inflow series of reservoirs are also preserved in the multi-site model. To circumvent the non-normality of the data, we consider the HMM in a Generalized Linear Model setting. Skill of the proposed approach is tested using split sample validation against a traditional multi-site canonical correlation model developed using the same set of predictors. From the posterior distribution of the inflow forecasts, we also highlight different system behavior
Walter, G.
2015-01-01
In the Bayesian approach to statistical inference, possibly subjective knowledge on model parameters can be expressed by so-called prior distributions. A prior distribution is updated, via Bayes’ Rule, to the so-called posterior distribution, which combines prior information and information from
Directory of Open Access Journals (Sweden)
Takebayashi Naoki
2007-07-01
Full Text Available Abstract Background Although testing for simultaneous divergence (vicariance across different population-pairs that span the same barrier to gene flow is of central importance to evolutionary biology, researchers often equate the gene tree and population/species tree thereby ignoring stochastic coalescent variance in their conclusions of temporal incongruence. In contrast to other available phylogeographic software packages, msBayes is the only one that analyses data from multiple species/population pairs under a hierarchical model. Results msBayes employs approximate Bayesian computation (ABC under a hierarchical coalescent model to test for simultaneous divergence (TSD in multiple co-distributed population-pairs. Simultaneous isolation is tested by estimating three hyper-parameters that characterize the degree of variability in divergence times across co-distributed population pairs while allowing for variation in various within population-pair demographic parameters (sub-parameters that can affect the coalescent. msBayes is a software package consisting of several C and R programs that are run with a Perl "front-end". Conclusion The method reasonably distinguishes simultaneous isolation from temporal incongruence in the divergence of co-distributed population pairs, even with sparse sampling of individuals. Because the estimate step is decoupled from the simulation step, one can rapidly evaluate different ABC acceptance/rejection conditions and the choice of summary statistics. Given the complex and idiosyncratic nature of testing multi-species biogeographic hypotheses, we envision msBayes as a powerful and flexible tool for tackling a wide array of difficult research questions that use population genetic data from multiple co-distributed species. The msBayes pipeline is available for download at http://msbayes.sourceforge.net/ under an open source license (GNU Public License. The msBayes pipeline is comprised of several C and R programs that
Chowdhury, Rasheda Arman; Lina, Jean Marc; Kobayashi, Eliane; Grova, Christophe
2013-01-01
Localizing the generators of epileptic activity in the brain using Electro-EncephaloGraphy (EEG) or Magneto-EncephaloGraphy (MEG) signals is of particular interest during the pre-surgical investigation of epilepsy. Epileptic discharges can be detectable from background brain activity, provided they are associated with spatially extended generators. Using realistic simulations of epileptic activity, this study evaluates the ability of distributed source localization methods to accurately estimate the location of the generators and their sensitivity to the spatial extent of such generators when using MEG data. Source localization methods based on two types of realistic models have been investigated: (i) brain activity may be modeled using cortical parcels and (ii) brain activity is assumed to be locally smooth within each parcel. A Data Driven Parcellization (DDP) method was used to segment the cortical surface into non-overlapping parcels and diffusion-based spatial priors were used to model local spatial smoothness within parcels. These models were implemented within the Maximum Entropy on the Mean (MEM) and the Hierarchical Bayesian (HB) source localization frameworks. We proposed new methods in this context and compared them with other standard ones using Monte Carlo simulations of realistic MEG data involving sources of several spatial extents and depths. Detection accuracy of each method was quantified using Receiver Operating Characteristic (ROC) analysis and localization error metrics. Our results showed that methods implemented within the MEM framework were sensitive to all spatial extents of the sources ranging from 3 cm(2) to 30 cm(2), whatever were the number and size of the parcels defining the model. To reach a similar level of accuracy within the HB framework, a model using parcels larger than the size of the sources should be considered.
Alameddine, Ibrahim; Karmakar, Subhankar; Qian, Song S.; Paerl, Hans W.; Reckhow, Kenneth H.
2013-10-01
The total maximum daily load program aims to monitor more than 40,000 standard violations in around 20,000 impaired water bodies across the United States. Given resource limitations, future monitoring efforts have to be hedged against the uncertainties in the monitored system, while taking into account existing knowledge. In that respect, we have developed a hierarchical spatiotemporal Bayesian model that can be used to optimize an existing monitoring network by retaining stations that provide the maximum amount of information, while identifying locations that would benefit from the addition of new stations. The model assumes the water quality parameters are adequately described by a joint matrix normal distribution. The adopted approach allows for a reduction in redundancies, while emphasizing information richness rather than data richness. The developed approach incorporates the concept of entropy to account for the associated uncertainties. Three different entropy-based criteria are adopted: total system entropy, chlorophyll-a standard violation entropy, and dissolved oxygen standard violation entropy. A multiple attribute decision making framework is adopted to integrate the competing design criteria and to generate a single optimal design. The approach is implemented on the water quality monitoring system of the Neuse River Estuary in North Carolina, USA. The model results indicate that the high priority monitoring areas identified by the total system entropy and the dissolved oxygen violation entropy criteria are largely coincident. The monitoring design based on the chlorophyll-a standard violation entropy proved to be less informative, given the low probabilities of violating the water quality standard in the estuary.
Directory of Open Access Journals (Sweden)
Rasheda Arman Chowdhury
Full Text Available Localizing the generators of epileptic activity in the brain using Electro-EncephaloGraphy (EEG or Magneto-EncephaloGraphy (MEG signals is of particular interest during the pre-surgical investigation of epilepsy. Epileptic discharges can be detectable from background brain activity, provided they are associated with spatially extended generators. Using realistic simulations of epileptic activity, this study evaluates the ability of distributed source localization methods to accurately estimate the location of the generators and their sensitivity to the spatial extent of such generators when using MEG data. Source localization methods based on two types of realistic models have been investigated: (i brain activity may be modeled using cortical parcels and (ii brain activity is assumed to be locally smooth within each parcel. A Data Driven Parcellization (DDP method was used to segment the cortical surface into non-overlapping parcels and diffusion-based spatial priors were used to model local spatial smoothness within parcels. These models were implemented within the Maximum Entropy on the Mean (MEM and the Hierarchical Bayesian (HB source localization frameworks. We proposed new methods in this context and compared them with other standard ones using Monte Carlo simulations of realistic MEG data involving sources of several spatial extents and depths. Detection accuracy of each method was quantified using Receiver Operating Characteristic (ROC analysis and localization error metrics. Our results showed that methods implemented within the MEM framework were sensitive to all spatial extents of the sources ranging from 3 cm(2 to 30 cm(2, whatever were the number and size of the parcels defining the model. To reach a similar level of accuracy within the HB framework, a model using parcels larger than the size of the sources should be considered.
International Nuclear Information System (INIS)
Mandel, Kaisey S.; Kirshner, Robert P.; Foley, Ryan J.
2014-01-01
We investigate the statistical dependence of the peak intrinsic colors of Type Ia supernovae (SNe Ia) on their expansion velocities at maximum light, measured from the Si II λ6355 spectral feature. We construct a new hierarchical Bayesian regression model, accounting for the random effects of intrinsic scatter, measurement error, and reddening by host galaxy dust, and implement a Gibbs sampler and deviance information criteria to estimate the correlation. The method is applied to the apparent colors from BVRI light curves and Si II velocity data for 79 nearby SNe Ia. The apparent color distributions of high-velocity (HV) and normal velocity (NV) supernovae exhibit significant discrepancies for B – V and B – R, but not other colors. Hence, they are likely due to intrinsic color differences originating in the B band, rather than dust reddening. The mean intrinsic B – V and B – R color differences between HV and NV groups are 0.06 ± 0.02 and 0.09 ± 0.02 mag, respectively. A linear model finds significant slopes of –0.021 ± 0.006 and –0.030 ± 0.009 mag (10 3 km s –1 ) –1 for intrinsic B – V and B – R colors versus velocity, respectively. Because the ejecta velocity distribution is skewed toward high velocities, these effects imply non-Gaussian intrinsic color distributions with skewness up to +0.3. Accounting for the intrinsic-color-velocity correlation results in corrections to A V extinction estimates as large as –0.12 mag for HV SNe Ia and +0.06 mag for NV events. Velocity measurements from SN Ia spectra have the potential to diminish systematic errors from the confounding of intrinsic colors and dust reddening affecting supernova distances
Bayesian inference for heterogeneous caprock permeability based on above zone pressure monitoring
Energy Technology Data Exchange (ETDEWEB)
Namhata, Argha; Small, Mitchell J.; Dilmore, Robert M.; Nakles, David V.; King, Seth
2017-02-01
The presence of faults/ fractures or highly permeable zones in the primary sealing caprock of a CO2 storage reservoir can result in leakage of CO2. Monitoring of leakage requires the capability to detect and resolve the onset, location, and volume of leakage in a systematic and timely manner. Pressure-based monitoring possesses such capabilities. This study demonstrates a basis for monitoring network design based on the characterization of CO2 leakage scenarios through an assessment of the integrity and permeability of the caprock inferred from above zone pressure measurements. Four representative heterogeneous fractured seal types are characterized to demonstrate seal permeability ranging from highly permeable to impermeable. Based on Bayesian classification theory, the probability of each fractured caprock scenario given above zone pressure measurements with measurement error is inferred. The sensitivity to injection rate and caprock thickness is also evaluated and the probability of proper classification is calculated. The time required to distinguish between above zone pressure outcomes and the associated leakage scenarios is also computed.
Louzoun, Yoram; Alter, Idan; Gragert, Loren; Albrecht, Mark; Maiers, Martin
2018-05-01
Regardless of sampling depth, accurate genotype imputation is limited in regions of high polymorphism which often have a heavy-tailed haplotype frequency distribution. Many rare haplotypes are thus unobserved. Statistical methods to improve imputation by extending reference haplotype distributions using linkage disequilibrium patterns that relate allele and haplotype frequencies have not yet been explored. In the field of unrelated stem cell transplantation, imputation of highly polymorphic human leukocyte antigen (HLA) genes has an important application in identifying the best-matched stem cell donor when searching large registries totaling over 28,000,000 donors worldwide. Despite these large registry sizes, a significant proportion of searched patients present novel HLA haplotypes. Supporting this observation, HLA population genetic models have indicated that many extant HLA haplotypes remain unobserved. The absent haplotypes are a significant cause of error in haplotype matching. We have applied a Bayesian inference methodology for extending haplotype frequency distributions, using a model where new haplotypes are created by recombination of observed alleles. Applications of this joint probability model offer significant improvement in frequency distribution estimates over the best existing alternative methods, as we illustrate using five-locus HLA frequency data from the National Marrow Donor Program registry. Transplant matching algorithms and disease association studies involving phasing and imputation of rare variants may benefit from this statistical inference framework.
Bayesian nonparametric generative models for causal inference with missing at random covariates.
Roy, Jason; Lum, Kirsten J; Zeldow, Bret; Dworkin, Jordan D; Re, Vincent Lo; Daniels, Michael J
2018-03-26
We propose a general Bayesian nonparametric (BNP) approach to causal inference in the point treatment setting. The joint distribution of the observed data (outcome, treatment, and confounders) is modeled using an enriched Dirichlet process. The combination of the observed data model and causal assumptions allows us to identify any type of causal effect-differences, ratios, or quantile effects, either marginally or for subpopulations of interest. The proposed BNP model is well-suited for causal inference problems, as it does not require parametric assumptions about the distribution of confounders and naturally leads to a computationally efficient Gibbs sampling algorithm. By flexibly modeling the joint distribution, we are also able to impute (via data augmentation) values for missing covariates within the algorithm under an assumption of ignorable missingness, obviating the need to create separate imputed data sets. This approach for imputing the missing covariates has the additional advantage of guaranteeing congeniality between the imputation model and the analysis model, and because we use a BNP approach, parametric models are avoided for imputation. The performance of the method is assessed using simulation studies. The method is applied to data from a cohort study of human immunodeficiency virus/hepatitis C virus co-infected patients. © 2018, The International Biometric Society.
TYPE Ia SUPERNOVA LIGHT CURVE INFERENCE: HIERARCHICAL MODELS IN THE OPTICAL AND NEAR-INFRARED
International Nuclear Information System (INIS)
Mandel, Kaisey S.; Narayan, Gautham; Kirshner, Robert P.
2011-01-01
We have constructed a comprehensive statistical model for Type Ia supernova (SN Ia) light curves spanning optical through near-infrared (NIR) data. A hierarchical framework coherently models multiple random and uncertain effects, including intrinsic supernova (SN) light curve covariances, dust extinction and reddening, and distances. An improved BAYESN Markov Chain Monte Carlo code computes probabilistic inferences for the hierarchical model by sampling the global probability density of parameters describing individual SNe and the population. We have applied this hierarchical model to optical and NIR data of 127 SNe Ia from PAIRITEL, CfA3, Carnegie Supernova Project, and the literature. We find an apparent population correlation between the host galaxy extinction A V and the ratio of total-to-selective dust absorption R V . For SNe with low dust extinction, A V ∼ V ∼ 2.5-2.9, while at high extinctions, A V ∼> 1, low values of R V < 2 are favored. The NIR luminosities are excellent standard candles and are less sensitive to dust extinction. They exhibit low correlation with optical peak luminosities, and thus provide independent information on distances. The combination of NIR and optical data constrains the dust extinction and improves the predictive precision of individual SN Ia distances by about 60%. Using cross-validation, we estimate an rms distance modulus prediction error of 0.11 mag for SNe with optical and NIR data versus 0.15 mag for SNe with optical data alone. Continued study of SNe Ia in the NIR is important for improving their utility as precise and accurate cosmological distance indicators.
The R Package MitISEM: Efficient and Robust Simulation Procedures for Bayesian Inference
Directory of Open Access Journals (Sweden)
Nalan Baştürk
2017-07-01
Full Text Available This paper presents the R package MitISEM (mixture of t by importance sampling weighted expectation maximization which provides an automatic and flexible two-stage method to approximate a non-elliptical target density kernel - typically a posterior density kernel - using an adaptive mixture of Student t densities as approximating density. In the first stage a mixture of Student t densities is fitted to the target using an expectation maximization algorithm where each step of the optimization procedure is weighted using importance sampling. In the second stage this mixture density is a candidate density for efficient and robust application of importance sampling or the Metropolis-Hastings (MH method to estimate properties of the target distribution. The package enables Bayesian inference and prediction on model parameters and probabilities, in particular, for models where densities have multi-modal or other non-elliptical shapes like curved ridges. These shapes occur in research topics in several scientific fields. For instance, analysis of DNA data in bio-informatics, obtaining loans in the banking sector by heterogeneous groups in financial economics and analysis of education's effect on earned income in labor economics. The package MitISEM provides also an extended algorithm, 'sequential MitISEM', which substantially decreases computation time when the target density has to be approximated for increasing data samples. This occurs when the posterior or predictive density is updated with new observations and/or when one computes model probabilities using predictive likelihoods. We illustrate the MitISEM algorithm using three canonical statistical and econometric models that are characterized by several types of non-elliptical posterior shapes and that describe well-known data patterns in econometrics and finance. We show that MH using the candidate density obtained by MitISEM outperforms, in terms of numerical efficiency, MH using a simpler
qPR: An adaptive partial-report procedure based on Bayesian inference.
Baek, Jongsoo; Lesmes, Luis Andres; Lu, Zhong-Lin
2016-08-01
Iconic memory is best assessed with the partial report procedure in which an array of letters appears briefly on the screen and a poststimulus cue directs the observer to report the identity of the cued letter(s). Typically, 6-8 cue delays or 600-800 trials are tested to measure the iconic memory decay function. Here we develop a quick partial report, or qPR, procedure based on a Bayesian adaptive framework to estimate the iconic memory decay function with much reduced testing time. The iconic memory decay function is characterized by an exponential function and a joint probability distribution of its three parameters. Starting with a prior of the parameters, the method selects the stimulus to maximize the expected information gain in the next test trial. It then updates the posterior probability distribution of the parameters based on the observer's response using Bayesian inference. The procedure is reiterated until either the total number of trials or the precision of the parameter estimates reaches a certain criterion. Simulation studies showed that only 100 trials were necessary to reach an average absolute bias of 0.026 and a precision of 0.070 (both in terms of probability correct). A psychophysical validation experiment showed that estimates of the iconic memory decay function obtained with 100 qPR trials exhibited good precision (the half width of the 68.2% credible interval = 0.055) and excellent agreement with those obtained with 1,600 trials of the conventional method of constant stimuli procedure (RMSE = 0.063). Quick partial-report relieves the data collection burden in characterizing iconic memory and makes it possible to assess iconic memory in clinical populations.
DEFF Research Database (Denmark)
Kristensen, Anders Ringgaard; Søllested, Thomas Algot
2004-01-01
that really uses all these methodological improvements. In this paper, the biological model describing the performance and feed intake of sows is presented. In particular, estimation of herd specific parameters is emphasized. The optimization model is described in a subsequent paper......Several replacement models have been presented in literature. In other applicational areas like dairy cow replacement, various methodological improvements like hierarchical Markov processes and Bayesian updating have been implemented, but not in sow models. Furthermore, there are methodological...... improvements like multi-level hierarchical Markov processes with decisions on multiple time scales, efficient methods for parameter estimations at herd level and standard software that has been hardly implemented at all in any replacement model. The aim of this study is to present a sow replacement model...
Rupa, Chandra; Mujumdar, Pradeep
2016-04-01
In urban areas, quantification of extreme precipitation is important in the design of storm water drains and other infrastructure. Intensity Duration Frequency (IDF) relationships are generally used to obtain design return level for a given duration and return period. Due to lack of availability of extreme precipitation data for sufficiently large number of years, estimating the probability of extreme events is difficult. Typically, a single station data is used to obtain the design return levels for various durations and return periods, which are used in the design of urban infrastructure for the entire city. In an urban setting, the spatial variation of precipitation can be high; the precipitation amounts and patterns often vary within short distances of less than 5 km. Therefore it is crucial to study the uncertainties in the spatial variation of return levels for various durations. In this work, the extreme precipitation is modeled spatially using the Bayesian hierarchical analysis and the spatial variation of return levels is studied. The analysis is carried out with Block Maxima approach for defining the extreme precipitation, using Generalized Extreme Value (GEV) distribution for Bangalore city, Karnataka state, India. Daily data for nineteen stations in and around Bangalore city is considered in the study. The analysis is carried out for summer maxima (March - May), monsoon maxima (June - September) and the annual maxima rainfall. In the hierarchical analysis, the statistical model is specified in three layers. The data layer models the block maxima, pooling the extreme precipitation from all the stations. In the process layer, the latent spatial process characterized by geographical and climatological covariates (lat-lon, elevation, mean temperature etc.) which drives the extreme precipitation is modeled and in the prior level, the prior distributions that govern the latent process are modeled. Markov Chain Monte Carlo (MCMC) algorithm (Metropolis Hastings
Aeschbacher, S; Futschik, A; Beaumont, M A
2013-02-01
We propose a two-step procedure for estimating multiple migration rates in an approximate Bayesian computation (ABC) framework, accounting for global nuisance parameters. The approach is not limited to migration, but generally of interest for inference problems with multiple parameters and a modular structure (e.g. independent sets of demes or loci). We condition on a known, but complex demographic model of a spatially subdivided population, motivated by the reintroduction of Alpine ibex (Capra ibex) into Switzerland. In the first step, the global parameters ancestral mutation rate and male mating skew have been estimated for the whole population in Aeschbacher et al. (Genetics 2012; 192: 1027). In the second step, we estimate in this study the migration rates independently for clusters of demes putatively connected by migration. For large clusters (many migration rates), ABC faces the problem of too many summary statistics. We therefore assess by simulation if estimation per pair of demes is a valid alternative. We find that the trade-off between reduced dimensionality for the pairwise estimation on the one hand and lower accuracy due to the assumption of pairwise independence on the other depends on the number of migration rates to be inferred: the accuracy of the pairwise approach increases with the number of parameters, relative to the joint estimation approach. To distinguish between low and zero migration, we perform ABC-type model comparison between a model with migration and one without. Applying the approach to microsatellite data from Alpine ibex, we find no evidence for substantial gene flow via migration, except for one pair of demes in one direction. © 2013 Blackwell Publishing Ltd.
Elsheikh, Ahmed H.; Hoteit, Ibrahim; Wheeler, Mary Fanett
2014-01-01
An efficient Bayesian calibration method based on the nested sampling (NS) algorithm and non-intrusive polynomial chaos method is presented. Nested sampling is a Bayesian sampling algorithm that builds a discrete representation of the posterior
Brain networks for confidence weighting and hierarchical inference during probabilistic learning.
Meyniel, Florent; Dehaene, Stanislas
2017-05-09
Learning is difficult when the world fluctuates randomly and ceaselessly. Classical learning algorithms, such as the delta rule with constant learning rate, are not optimal. Mathematically, the optimal learning rule requires weighting prior knowledge and incoming evidence according to their respective reliabilities. This "confidence weighting" implies the maintenance of an accurate estimate of the reliability of what has been learned. Here, using fMRI and an ideal-observer analysis, we demonstrate that the brain's learning algorithm relies on confidence weighting. While in the fMRI scanner, human adults attempted to learn the transition probabilities underlying an auditory or visual sequence, and reported their confidence in those estimates. They knew that these transition probabilities could change simultaneously at unpredicted moments, and therefore that the learning problem was inherently hierarchical. Subjective confidence reports tightly followed the predictions derived from the ideal observer. In particular, subjects managed to attach distinct levels of confidence to each learned transition probability, as required by Bayes-optimal inference. Distinct brain areas tracked the likelihood of new observations given current predictions, and the confidence in those predictions. Both signals were combined in the right inferior frontal gyrus, where they operated in agreement with the confidence-weighting model. This brain region also presented signatures of a hierarchical process that disentangles distinct sources of uncertainty. Together, our results provide evidence that the sense of confidence is an essential ingredient of probabilistic learning in the human brain, and that the right inferior frontal gyrus hosts a confidence-based statistical learning algorithm for auditory and visual sequences.
Brain networks for confidence weighting and hierarchical inference during probabilistic learning
Meyniel, Florent; Dehaene, Stanislas
2017-01-01
Learning is difficult when the world fluctuates randomly and ceaselessly. Classical learning algorithms, such as the delta rule with constant learning rate, are not optimal. Mathematically, the optimal learning rule requires weighting prior knowledge and incoming evidence according to their respective reliabilities. This “confidence weighting” implies the maintenance of an accurate estimate of the reliability of what has been learned. Here, using fMRI and an ideal-observer analysis, we demonstrate that the brain’s learning algorithm relies on confidence weighting. While in the fMRI scanner, human adults attempted to learn the transition probabilities underlying an auditory or visual sequence, and reported their confidence in those estimates. They knew that these transition probabilities could change simultaneously at unpredicted moments, and therefore that the learning problem was inherently hierarchical. Subjective confidence reports tightly followed the predictions derived from the ideal observer. In particular, subjects managed to attach distinct levels of confidence to each learned transition probability, as required by Bayes-optimal inference. Distinct brain areas tracked the likelihood of new observations given current predictions, and the confidence in those predictions. Both signals were combined in the right inferior frontal gyrus, where they operated in agreement with the confidence-weighting model. This brain region also presented signatures of a hierarchical process that disentangles distinct sources of uncertainty. Together, our results provide evidence that the sense of confidence is an essential ingredient of probabilistic learning in the human brain, and that the right inferior frontal gyrus hosts a confidence-based statistical learning algorithm for auditory and visual sequences. PMID:28439014
Barham, Barbara J.; Barham, Peter J.; Campbell, Kate J.; Crawford, Robert J. M.; Grigg, Jennifer; Horswill, Cat; Morris, Taryn L.; Pichegru, Lorien; Steinfurth, Antje; Weller, Florian; Winker, Henning
2018-01-01
Global forage-fish landings are increasing, with potentially grave consequences for marine ecosystems. Predators of forage fish may be influenced by this harvest, but the nature of these effects is contentious. Experimental fishery manipulations offer the best solution to quantify population-level impacts, but are rare. We used Bayesian inference to examine changes in chick survival, body condition and population growth rate of endangered African penguins Spheniscus demersus in response to 8 years of alternating time–area closures around two pairs of colonies. Our results demonstrate that fishing closures improved chick survival and condition, after controlling for changing prey availability. However, this effect was inconsistent across sites and years, highlighting the difficultly of assessing management interventions in marine ecosystems. Nevertheless, modelled increases in population growth rates exceeded 1% at one colony; i.e. the threshold considered biologically meaningful by fisheries management in South Africa. Fishing closures evidently can improve the population trend of a forage-fish-dependent predator—we therefore recommend they continue in South Africa and support their application elsewhere. However, detecting demographic gains for mobile marine predators from small no-take zones requires experimental time frames and scales that will often exceed those desired by decision makers. PMID:29343602
A dynamic discretization method for reliability inference in Dynamic Bayesian Networks
International Nuclear Information System (INIS)
Zhu, Jiandao; Collette, Matthew
2015-01-01
The material and modeling parameters that drive structural reliability analysis for marine structures are subject to a significant uncertainty. This is especially true when time-dependent degradation mechanisms such as structural fatigue cracking are considered. Through inspection and monitoring, information such as crack location and size can be obtained to improve these parameters and the corresponding reliability estimates. Dynamic Bayesian Networks (DBNs) are a powerful and flexible tool to model dynamic system behavior and update reliability and uncertainty analysis with life cycle data for problems such as fatigue cracking. However, a central challenge in using DBNs is the need to discretize certain types of continuous random variables to perform network inference while still accurately tracking low-probability failure events. Most existing discretization methods focus on getting the overall shape of the distribution correct, with less emphasis on the tail region. Therefore, a novel scheme is presented specifically to estimate the likelihood of low-probability failure events. The scheme is an iterative algorithm which dynamically partitions the discretization intervals at each iteration. Through applications to two stochastic crack-growth example problems, the algorithm is shown to be robust and accurate. Comparisons are presented between the proposed approach and existing methods for the discretization problem. - Highlights: • A dynamic discretization method is developed for low-probability events in DBNs. • The method is compared to existing approaches on two crack growth problems. • The method is shown to improve on existing methods for low-probability events
Characteristics of SiC neutron sensor spectrum unfolding process based on Bayesian inference
Energy Technology Data Exchange (ETDEWEB)
Cetnar, Jerzy; Krolikowski, Igor [Faculty of Energy and Fuels AGH - University of Science and Technology, Al. Mickiewicza 30, 30-059 Krakow (Poland); Ottaviani, L. [IM2NP, UMR CNRS 7334, Aix-Marseille University, Case 231 -13397 Marseille Cedex 20 (France); Lyoussi, A. [CEA, DEN, DER, Instrumentation Sensors and Dosimetry Laboratory, Cadarache, F-13108 St-Paul-Lez-Durance (France)
2015-07-01
This paper deals with SiC detector signal interpretation in neutron radiation measurements in mixed neutron gamma radiation fields, which is called the detector inverse problem or the spectrum unfolding, and it aims in finding a representation of the primary radiation, based on the measured detector signals. In our novel methodology we resort to Bayesian inference approach. In the developed procedure the resultant spectra is unfolded form detector channels reading, where the estimated neutron fluence in a group structure is obtained with its statistical characteristic comprising of standard deviation and correlation matrix. In the paper we present results of unfolding process for case of D-T neutron source in neutron moderating environment. Discussions of statistical properties of obtained results are presented as well as of the physical meaning of obtained correlation matrix of estimated group fluence. The presented works has been carried out within the I-SMART project, which is part of the KIC InnoEnergy R and D program. (authors)
Directory of Open Access Journals (Sweden)
Dario Cuevas Rivera
2015-10-01
Full Text Available The olfactory information that is received by the insect brain is encoded in the form of spatiotemporal patterns in the projection neurons of the antennal lobe. These dense and overlapping patterns are transformed into a sparse code in Kenyon cells in the mushroom body. Although it is clear that this sparse code is the basis for rapid categorization of odors, it is yet unclear how the sparse code in Kenyon cells is computed and what information it represents. Here we show that this computation can be modeled by sequential firing rate patterns using Lotka-Volterra equations and Bayesian online inference. This new model can be understood as an 'intelligent coincidence detector', which robustly and dynamically encodes the presence of specific odor features. We found that the model is able to qualitatively reproduce experimentally observed activity in both the projection neurons and the Kenyon cells. In particular, the model explains mechanistically how sparse activity in the Kenyon cells arises from the dense code in the projection neurons. The odor classification performance of the model proved to be robust against noise and time jitter in the observed input sequences. As in recent experimental results, we found that recognition of an odor happened very early during stimulus presentation in the model. Critically, by using the model, we found surprising but simple computational explanations for several experimental phenomena.
Schmit, C. J.; Pritchard, J. R.
2018-03-01
Next generation radio experiments such as LOFAR, HERA, and SKA are expected to probe the Epoch of Reionization (EoR) and claim a first direct detection of the cosmic 21cm signal within the next decade. Data volumes will be enormous and can thus potentially revolutionize our understanding of the early Universe and galaxy formation. However, numerical modelling of the EoR can be prohibitively expensive for Bayesian parameter inference and how to optimally extract information from incoming data is currently unclear. Emulation techniques for fast model evaluations have recently been proposed as a way to bypass costly simulations. We consider the use of artificial neural networks as a blind emulation technique. We study the impact of training duration and training set size on the quality of the network prediction and the resulting best-fitting values of a parameter search. A direct comparison is drawn between our emulation technique and an equivalent analysis using 21CMMC. We find good predictive capabilities of our network using training sets of as low as 100 model evaluations, which is within the capabilities of fully numerical radiative transfer codes.
International Nuclear Information System (INIS)
George, J.S.; Schmidt, D.M.; Wood, C.C.
1999-01-01
We have developed a Bayesian approach to the analysis of neural electromagnetic (MEG/EEG) data that can incorporate or fuse information from other imaging modalities and addresses the ill-posed inverse problem by sarnpliig the many different solutions which could have produced the given data. From these samples one can draw probabilistic inferences about regions of activation. Our source model assumes a variable number of variable size cortical regions of stimulus-correlated activity. An active region consists of locations on the cortical surf ace, within a sphere centered on some location in cortex. The number and radi of active regions can vary to defined maximum values. The goal of the analysis is to determine the posterior probability distribution for the set of parameters that govern the number, location, and extent of active regions. Markov Chain Monte Carlo is used to generate a large sample of sets of parameters distributed according to the posterior distribution. This sample is representative of the many different source distributions that could account for given data, and allows identification of probable (i.e. consistent) features across solutions. Examples of the use of this analysis technique with both simulated and empirical MEG data are presented
Directory of Open Access Journals (Sweden)
Antonio Canale
2017-06-01
Full Text Available msBP is an R package that implements a new method to perform Bayesian multiscale nonparametric inference introduced by Canale and Dunson (2016. The method, based on mixtures of multiscale beta dictionary densities, overcomes the drawbacks of Pólya trees and inherits many of the advantages of Dirichlet process mixture models. The key idea is that an infinitely-deep binary tree is introduced, with a beta dictionary density assigned to each node of the tree. Using a multiscale stick-breaking characterization, stochastically decreasing weights are assigned to each node. The result is an infinite mixture model. The package msBP implements a series of basic functions to deal with this family of priors such as random densities and numbers generation, creation and manipulation of binary tree objects, and generic functions to plot and print the results. In addition, it implements the Gibbs samplers for posterior computation to perform multiscale density estimation and multiscale testing of group differences described in Canale and Dunson (2016.
Duggento, Andrea; Stankovski, Tomislav; McClintock, Peter V. E.; Stefanovska, Aneta
2012-12-01
Living systems have time-evolving interactions that, until recently, could not be identified accurately from recorded time series in the presence of noise. Stankovski [Phys. Rev. Lett.PRLTAO0031-900710.1103/PhysRevLett.109.024101 109, 024101 (2012)] introduced a method based on dynamical Bayesian inference that facilitates the simultaneous detection of time-varying synchronization, directionality of influence, and coupling functions. It can distinguish unsynchronized dynamics from noise-induced phase slips. The method is based on phase dynamics, with Bayesian inference of the time-evolving parameters being achieved by shaping the prior densities to incorporate knowledge of previous samples. We now present the method in detail using numerically generated data, data from an analog electronic circuit, and cardiorespiratory data. We also generalize the method to encompass networks of interacting oscillators and thus demonstrate its applicability to small-scale networks.
Multiview Bayesian Correlated Component Analysis
DEFF Research Database (Denmark)
Kamronn, Simon Due; Poulsen, Andreas Trier; Hansen, Lars Kai
2015-01-01
are identical. Here we propose a hierarchical probabilistic model that can infer the level of universality in such multiview data, from completely unrelated representations, corresponding to canonical correlation analysis, to identical representations as in correlated component analysis. This new model, which...... we denote Bayesian correlated component analysis, evaluates favorably against three relevant algorithms in simulated data. A well-established benchmark EEG data set is used to further validate the new model and infer the variability of spatial representations across multiple subjects....
Directory of Open Access Journals (Sweden)
Ali Mohammad-Djafari
2015-06-01
Full Text Available The main content of this review article is first to review the main inference tools using Bayes rule, the maximum entropy principle (MEP, information theory, relative entropy and the Kullback–Leibler (KL divergence, Fisher information and its corresponding geometries. For each of these tools, the precise context of their use is described. The second part of the paper is focused on the ways these tools have been used in data, signal and image processing and in the inverse problems, which arise in different physical sciences and engineering applications. A few examples of the applications are described: entropy in independent components analysis (ICA and in blind source separation, Fisher information in data model selection, different maximum entropy-based methods in time series spectral estimation and in linear inverse problems and, finally, the Bayesian inference for general inverse problems. Some original materials concerning the approximate Bayesian computation (ABC and, in particular, the variational Bayesian approximation (VBA methods are also presented. VBA is used for proposing an alternative Bayesian computational tool to the classical Markov chain Monte Carlo (MCMC methods. We will also see that VBA englobes joint maximum a posteriori (MAP, as well as the different expectation-maximization (EM algorithms as particular cases.
Directory of Open Access Journals (Sweden)
Hero Alfred
2010-11-01
Full Text Available Abstract Background Nonparametric Bayesian techniques have been developed recently to extend the sophistication of factor models, allowing one to infer the number of appropriate factors from the observed data. We consider such techniques for sparse factor analysis, with application to gene-expression data from three virus challenge studies. Particular attention is placed on employing the Beta Process (BP, the Indian Buffet Process (IBP, and related sparseness-promoting techniques to infer a proper number of factors. The posterior density function on the model parameters is computed using Gibbs sampling and variational Bayesian (VB analysis. Results Time-evolving gene-expression data are considered for respiratory syncytial virus (RSV, Rhino virus, and influenza, using blood samples from healthy human subjects. These data were acquired in three challenge studies, each executed after receiving institutional review board (IRB approval from Duke University. Comparisons are made between several alternative means of per-forming nonparametric factor analysis on these data, with comparisons as well to sparse-PCA and Penalized Matrix Decomposition (PMD, closely related non-Bayesian approaches. Conclusions Applying the Beta Process to the factor scores, or to the singular values of a pseudo-SVD construction, the proposed algorithms infer the number of factors in gene-expression data. For real data the "true" number of factors is unknown; in our simulations we consider a range of noise variances, and the proposed Bayesian models inferred the number of factors accurately relative to other methods in the literature, such as sparse-PCA and PMD. We have also identified a "pan-viral" factor of importance for each of the three viruses considered in this study. We have identified a set of genes associated with this pan-viral factor, of interest for early detection of such viruses based upon the host response, as quantified via gene-expression data.
Chen, Bo; Chen, Minhua; Paisley, John; Zaas, Aimee; Woods, Christopher; Ginsburg, Geoffrey S; Hero, Alfred; Lucas, Joseph; Dunson, David; Carin, Lawrence
2010-11-09
Nonparametric Bayesian techniques have been developed recently to extend the sophistication of factor models, allowing one to infer the number of appropriate factors from the observed data. We consider such techniques for sparse factor analysis, with application to gene-expression data from three virus challenge studies. Particular attention is placed on employing the Beta Process (BP), the Indian Buffet Process (IBP), and related sparseness-promoting techniques to infer a proper number of factors. The posterior density function on the model parameters is computed using Gibbs sampling and variational Bayesian (VB) analysis. Time-evolving gene-expression data are considered for respiratory syncytial virus (RSV), Rhino virus, and influenza, using blood samples from healthy human subjects. These data were acquired in three challenge studies, each executed after receiving institutional review board (IRB) approval from Duke University. Comparisons are made between several alternative means of per-forming nonparametric factor analysis on these data, with comparisons as well to sparse-PCA and Penalized Matrix Decomposition (PMD), closely related non-Bayesian approaches. Applying the Beta Process to the factor scores, or to the singular values of a pseudo-SVD construction, the proposed algorithms infer the number of factors in gene-expression data. For real data the "true" number of factors is unknown; in our simulations we consider a range of noise variances, and the proposed Bayesian models inferred the number of factors accurately relative to other methods in the literature, such as sparse-PCA and PMD. We have also identified a "pan-viral" factor of importance for each of the three viruses considered in this study. We have identified a set of genes associated with this pan-viral factor, of interest for early detection of such viruses based upon the host response, as quantified via gene-expression data.
Energy Technology Data Exchange (ETDEWEB)
La Russa, D [The Ottawa Hospital Cancer Centre, Ottawa, ON (Canada)
2015-06-15
Purpose: The purpose of this project is to develop a robust method of parameter estimation for a Poisson-based TCP model using Bayesian inference. Methods: Bayesian inference was performed using the PyMC3 probabilistic programming framework written in Python. A Poisson-based TCP regression model that accounts for clonogen proliferation was fit to observed rates of local relapse as a function of equivalent dose in 2 Gy fractions for a population of 623 stage-I non-small-cell lung cancer patients. The Slice Markov Chain Monte Carlo sampling algorithm was used to sample the posterior distributions, and was initiated using the maximum of the posterior distributions found by optimization. The calculation of TCP with each sample step required integration over the free parameter α, which was performed using an adaptive 24-point Gauss-Legendre quadrature. Convergence was verified via inspection of the trace plot and posterior distribution for each of the fit parameters, as well as with comparisons of the most probable parameter values with their respective maximum likelihood estimates. Results: Posterior distributions for α, the standard deviation of α (σ), the average tumour cell-doubling time (Td), and the repopulation delay time (Tk), were generated assuming α/β = 10 Gy, and a fixed clonogen density of 10{sup 7} cm−{sup 3}. Posterior predictive plots generated from samples from these posterior distributions are in excellent agreement with the observed rates of local relapse used in the Bayesian inference. The most probable values of the model parameters also agree well with maximum likelihood estimates. Conclusion: A robust method of performing Bayesian inference of TCP data using a complex TCP model has been established.
International Nuclear Information System (INIS)
Byun, Hyunsuk; Lee, Chul-Yong
2017-01-01
Generally, consumers use electricity without considering the source the electricity was generated from. Since different energy sources exert varying effects on society, it is necessary to analyze consumers’ latent preference for electricity generation sources. The present study estimates Korean consumers’ marginal utility and an appropriate generation mix is derived using the hierarchical Bayesian logit model in a discrete choice experiment. The results show that consumers consider the danger posed by the source of electricity as the most important factor among the effects of electricity generation sources. Additionally, Korean consumers wish to reduce the contribution of nuclear power from the existing 32–11%, and increase that of renewable energy from the existing 4–32%. - Highlights: • We derive an electricity mix reflecting Korean consumers’ latent preferences. • We use the discrete choice experiment and hierarchical Bayesian logit model. • The danger posed by the generation source is the most important attribute. • The consumers wish to increase the renewable energy proportion from 4.3% to 32.8%. • Korea's cost-oriented energy supply policy and consumers’ preference differ markedly.
Yu, Wenxi; Liu, Yang; Ma, Zongwei; Bi, Jun
2017-08-01
Using satellite-based aerosol optical depth (AOD) measurements and statistical models to estimate ground-level PM 2.5 is a promising way to fill the areas that are not covered by ground PM 2.5 monitors. The statistical models used in previous studies are primarily Linear Mixed Effects (LME) and Geographically Weighted Regression (GWR) models. In this study, we developed a new regression model between PM 2.5 and AOD using Gaussian processes in a Bayesian hierarchical setting. Gaussian processes model the stochastic nature of the spatial random effects, where the mean surface and the covariance function is specified. The spatial stochastic process is incorporated under the Bayesian hierarchical framework to explain the variation of PM 2.5 concentrations together with other factors, such as AOD, spatial and non-spatial random effects. We evaluate the results of our model and compare them with those of other, conventional statistical models (GWR and LME) by within-sample model fitting and out-of-sample validation (cross validation, CV). The results show that our model possesses a CV result (R 2 = 0.81) that reflects higher accuracy than that of GWR and LME (0.74 and 0.48, respectively). Our results indicate that Gaussian process models have the potential to improve the accuracy of satellite-based PM 2.5 estimates.
When mechanism matters: Bayesian forecasting using models of ecological diffusion
Hefley, Trevor J.; Hooten, Mevin B.; Russell, Robin E.; Walsh, Daniel P.; Powell, James A.
2017-01-01
Ecological diffusion is a theory that can be used to understand and forecast spatio-temporal processes such as dispersal, invasion, and the spread of disease. Hierarchical Bayesian modelling provides a framework to make statistical inference and probabilistic forecasts, using mechanistic ecological models. To illustrate, we show how hierarchical Bayesian models of ecological diffusion can be implemented for large data sets that are distributed densely across space and time. The hierarchical Bayesian approach is used to understand and forecast the growth and geographic spread in the prevalence of chronic wasting disease in white-tailed deer (Odocoileus virginianus). We compare statistical inference and forecasts from our hierarchical Bayesian model to phenomenological regression-based methods that are commonly used to analyse spatial occurrence data. The mechanistic statistical model based on ecological diffusion led to important ecological insights, obviated a commonly ignored type of collinearity, and was the most accurate method for forecasting.
Jameel, M. Y.; Brewer, S.; Fiorella, R.; Tipple, B. J.; Bowen, G. J.; Terry, S.
2017-12-01
Public water supply systems (PWSS) are complex distribution systems and critical infrastructure, making them vulnerable to physical disruption and contamination. Exploring the susceptibility of PWSS to such perturbations requires detailed knowledge of the supply system structure and operation. Although the physical structure of supply systems (i.e., pipeline connection) is usually well documented for developed cities, the actual flow patterns of water in these systems are typically unknown or estimated based on hydrodynamic models with limited observational validation. Here, we present a novel method for mapping the flow structure of water in a large, complex PWSS, building upon recent work highlighting the potential of stable isotopes of water (SIW) to document water management practices within complex PWSS. We sampled a major water distribution system of the Salt Lake Valley, Utah, measuring SIW of water sources, treatment facilities, and numerous sites within in the supply system. We then developed a hierarchical Bayesian (HB) isotope mixing model to quantify the proportion of water supplied by different sources at sites within the supply system. Known production volumes and spatial distance effects were used to define the prior probabilities for each source; however, we did not include other physical information about the supply system. Our results were in general agreement with those obtained by hydrodynamic models and provide quantitative estimates of contributions of different water sources to a given site along with robust estimates of uncertainty. Secondary properties of the supply system, such as regions of "static" and "dynamic" source (e.g., regions supplied dominantly by one source vs. those experiencing active mixing between multiple sources), can be inferred from the results. The isotope-based HB isotope mixing model offers a new investigative technique for analyzing PWSS and documenting aspects of supply system structure and operation that are
An analysis of the costs of treating schizophrenia in Spain: a hierarchical Bayesian approach.
Vázquez-Polo, Francisco-Jose; Negrín, Miguel; Cabasés, Juan M; Sánchez, Eduardo; Haro, Joseph M; Salvador-Carulla, Luis
2005-09-01
Health care decisions should incorporate cost of illness and treatment data, particularly for disorders such as schizophrenia with a high morbidity rate and a disproportionately low allocation of resources. Previous cost of illness analyses may have disregarded geographical aspects relevant for resource consumption and unit cost calculation. To compare the utilisation of resources and the care costs of schizophrenic patients in four mental-health districts in Spain (in Madrid, Catalonia, Navarra and Andalusia), and to analyse factors that determine the costs and the differences between areas. A treated prevalence bottom-up three year follow-up design was used for obtaining data concerning socio-demography, clinical evolution and the utilisation of services. 1997 reference prices were updated for years 1998-2000 in euros. We propose two different scenarios, varying in the prices applied. In the first (Scenario 0) the reference prices are those obtained for a single geographic area, and so the cost variations are only due to differences in the use of resources. In the second situation (Scenario 1), we analyse the variations in resource utilisation at different levels, using the prices applicable to each healthcare area. Bayesian hierarchical models are used to discuss the factors that determine such costs and the differences between geographic areas. In scenario 0, the estimated mean cost was 4918.948 euros for the first year. In scenario 1 the highest cost was in Gava (Catalonia) and the lowest in Loja (Andalusia). Mean costs were respectively 4547.24 and 2473.98 euros. With respect to the evolution of costs over time, we observed an increase during the second year and a reduction during the third year. Geographical differences appeared in follow-up costs. The variables related to lower treatment costs were: residence in the family household, higher patient age and being in work. On the contrary, the number of relapses is directly related to higher treatment costs
International Nuclear Information System (INIS)
Qin, H.; Zhou, W.; Zhang, S.
2015-01-01
Stochastic process-based models are developed to characterize the generation and growth of metal-loss corrosion defects on oil and gas steel pipelines. The generation of corrosion defects over time is characterized by the non-homogenous Poisson process, and the growth of depths of individual defects is modeled by the non-homogenous gamma process (NHGP). The defect generation and growth models are formulated in a hierarchical Bayesian framework, whereby the parameters of the models are evaluated from the in-line inspection (ILI) data through the Bayesian updating by accounting for the probability of detection (POD) and measurement errors associated with the ILI data. The Markov Chain Monte Carlo (MCMC) simulation in conjunction with the data augmentation (DA) technique is employed to carry out the Bayesian updating. Numerical examples that involve simulated ILI data are used to illustrate and validate the proposed methodology. - Highlights: • Bayesian updating of growth and generation models of defects on energy pipelines. • Non-homogeneous Poisson process for defect generation. • Non-homogeneous gamma process for defect growth. • Updating based on inspection data with detecting and sizing uncertainties. • MCMC in conjunction with data augmentation technique employed for the updating.
Multi-scale inference of interaction rules in animal groups using Bayesian model selection.
Directory of Open Access Journals (Sweden)
Richard P Mann
Full Text Available Inference of interaction rules of animals moving in groups usually relies on an analysis of large scale system behaviour. Models are tuned through repeated simulation until they match the observed behaviour. More recent work has used the fine scale motions of animals to validate and fit the rules of interaction of animals in groups. Here, we use a Bayesian methodology to compare a variety of models to the collective motion of glass prawns (Paratya australiensis. We show that these exhibit a stereotypical 'phase transition', whereby an increase in density leads to the onset of collective motion in one direction. We fit models to this data, which range from: a mean-field model where all prawns interact globally; to a spatial Markovian model where prawns are self-propelled particles influenced only by the current positions and directions of their neighbours; up to non-Markovian models where prawns have 'memory' of previous interactions, integrating their experiences over time when deciding to change behaviour. We show that the mean-field model fits the large scale behaviour of the system, but does not capture the observed locality of interactions. Traditional self-propelled particle models fail to capture the fine scale dynamics of the system. The most sophisticated model, the non-Markovian model, provides a good match to the data at both the fine scale and in terms of reproducing global dynamics, while maintaining a biologically plausible perceptual range. We conclude that prawns' movements are influenced by not just the current direction of nearby conspecifics, but also those encountered in the recent past. Given the simplicity of prawns as a study system our research suggests that self-propelled particle models of collective motion should, if they are to be realistic at multiple biological scales, include memory of previous interactions and other non-Markovian effects.
Energy Technology Data Exchange (ETDEWEB)
Kang, Seongkeun; Seong, Poong Hyun [Korea Advanced Institute of Science and Technology, Daejeon (Korea, Republic of)
2014-05-15
The purpose of this paper is to confirm if Bayesian inference can properly reflect the situation awareness of real human operators, and find the difference between the situation of ideal and practical operators, and investigate the factors which contributes to those difference. As a results, human can not think like computer. If human can memorize all the information, and their thinking process is same to the CPU of computer, the results of these two experiments come out more than 99%. However the probability of finding right malfunction by humans are only 64.52% in simple experiment, and 51.61% in complex experiment. Cognition is the mental processing that includes the attention of working memory, comprehending and producing language, calculating, reasoning, problem solving, and decision making. There are many reasons why human thinking process is different with computer, but in this experiment, we suggest that the working memory is the most important factor. Humans have limited working memory which has only seven chunks capacity. These seven chunks are called magic number. If there are more than seven sequential information, people start to forget the previous information because their working memory capacity is running over. We can check how much working memory affects to the result through the simple experiment. Then what if we neglect the effect of working memory? The total number of subjects who have incorrect memory is 7 (subject 3, 5, 6, 7, 8, 15, 25). They could find the right malfunction if the memory hadn't changed because of lack of working memory. Then the probability of find correct malfunction will be increased to 87.10% from 64.52%. Complex experiment has similar result. In this case, eight subjects(1, 5, 8, 9, 15, 17, 18, 30) had changed the memory, and it affects to find the right malfunction. Considering it, then the probability would be (16+8)/31 = 77.42%.
Palacios, Julia A; Minin, Vladimir N
2013-03-01
Changes in population size influence genetic diversity of the population and, as a result, leave a signature of these changes in individual genomes in the population. We are interested in the inverse problem of reconstructing past population dynamics from genomic data. We start with a standard framework based on the coalescent, a stochastic process that generates genealogies connecting randomly sampled individuals from the population of interest. These genealogies serve as a glue between the population demographic history and genomic sequences. It turns out that only the times of genealogical lineage coalescences contain information about population size dynamics. Viewing these coalescent times as a point process, estimating population size trajectories is equivalent to estimating a conditional intensity of this point process. Therefore, our inverse problem is similar to estimating an inhomogeneous Poisson process intensity function. We demonstrate how recent advances in Gaussian process-based nonparametric inference for Poisson processes can be extended to Bayesian nonparametric estimation of population size dynamics under the coalescent. We compare our Gaussian process (GP) approach to one of the state-of-the-art Gaussian Markov random field (GMRF) methods for estimating population trajectories. Using simulated data, we demonstrate that our method has better accuracy and precision. Next, we analyze two genealogies reconstructed from real sequences of hepatitis C and human Influenza A viruses. In both cases, we recover more believed aspects of the viral demographic histories than the GMRF approach. We also find that our GP method produces more reasonable uncertainty estimates than the GMRF method. Copyright © 2013, The International Biometric Society.
Koskela, J. J.; Croke, B. W. F.; Koivusalo, H.; Jakeman, A. J.; Kokkonen, T.
2012-11-01
Bayesian inference is used to study the effect of precipitation and model structural uncertainty on estimates of model parameters and confidence limits of predictive variables in a conceptual rainfall-runoff model in the snow-fed Rudbäck catchment (142 ha) in southern Finland. The IHACRES model is coupled with a simple degree day model to account for snow accumulation and melt. The posterior probability distribution of the model parameters is sampled by using the Differential Evolution Adaptive Metropolis (DREAM(ZS)) algorithm and the generalized likelihood function. Precipitation uncertainty is taken into account by introducing additional latent variables that were used as multipliers for individual storm events. Results suggest that occasional snow water equivalent (SWE) observations together with daily streamflow observations do not contain enough information to simultaneously identify model parameters, precipitation uncertainty and model structural uncertainty in the Rudbäck catchment. The addition of an autoregressive component to account for model structure error and latent variables having uniform priors to account for input uncertainty lead to dubious posterior distributions of model parameters. Thus our hypothesis that informative priors for latent variables could be replaced by additional SWE data could not be confirmed. The model was found to work adequately in 1-day-ahead simulation mode, but the results were poor in the simulation batch mode. This was caused by the interaction of parameters that were used to describe different sources of uncertainty. The findings may have lessons for other cases where parameterizations are similarly high in relation to available prior information.
Hallo, Miroslav; Asano, Kimiyuki; Gallovič, František
2017-09-01
On April 16, 2016, Kumamoto prefecture in Kyushu region, Japan, was devastated by a shallow M JMA7.3 earthquake. The series of foreshocks started by M JMA6.5 foreshock 28 h before the mainshock. They have originated in Hinagu fault zone intersecting the mainshock Futagawa fault zone; hence, the tectonic background for this earthquake sequence is rather complex. Here we infer centroid moment tensors (CMTs) for 11 events with M JMA between 4.8 and 6.5, using strong motion records of the K-NET, KiK-net and F-net networks. We use upgraded Bayesian full-waveform inversion code ISOLA-ObsPy, which takes into account uncertainty of the velocity model. Such an approach allows us to reliably assess uncertainty of the CMT parameters including the centroid position. The solutions show significant systematic spatial and temporal variations throughout the sequence. Foreshocks are right-lateral steeply dipping strike-slip events connected to the NE-SW shear zone. Those located close to the intersection of the Hinagu and Futagawa fault zones are dipping slightly to ESE, while those in the southern area are dipping to WNW. Contrarily, aftershocks are mostly normal dip-slip events, being related to the N-S extensional tectonic regime. Most of the deviatoric moment tensors contain only minor CLVD component, which can be attributed to the velocity model uncertainty. Nevertheless, two of the CMTs involve a significant CLVD component, which may reflect complex rupture process. Decomposition of those moment tensors into two pure shear moment tensors suggests combined right-lateral strike-slip and normal dip-slip mechanisms, consistent with the tectonic settings of the intersection of the Hinagu and Futagawa fault zones.[Figure not available: see fulltext.
International Nuclear Information System (INIS)
Kang, Seongkeun; Seong, Poong Hyun
2014-01-01
The purpose of this paper is to confirm if Bayesian inference can properly reflect the situation awareness of real human operators, and find the difference between the situation of ideal and practical operators, and investigate the factors which contributes to those difference. As a results, human can not think like computer. If human can memorize all the information, and their thinking process is same to the CPU of computer, the results of these two experiments come out more than 99%. However the probability of finding right malfunction by humans are only 64.52% in simple experiment, and 51.61% in complex experiment. Cognition is the mental processing that includes the attention of working memory, comprehending and producing language, calculating, reasoning, problem solving, and decision making. There are many reasons why human thinking process is different with computer, but in this experiment, we suggest that the working memory is the most important factor. Humans have limited working memory which has only seven chunks capacity. These seven chunks are called magic number. If there are more than seven sequential information, people start to forget the previous information because their working memory capacity is running over. We can check how much working memory affects to the result through the simple experiment. Then what if we neglect the effect of working memory? The total number of subjects who have incorrect memory is 7 (subject 3, 5, 6, 7, 8, 15, 25). They could find the right malfunction if the memory hadn't changed because of lack of working memory. Then the probability of find correct malfunction will be increased to 87.10% from 64.52%. Complex experiment has similar result. In this case, eight subjects(1, 5, 8, 9, 15, 17, 18, 30) had changed the memory, and it affects to find the right malfunction. Considering it, then the probability would be (16+8)/31 = 77.42%
Spatial variability of coastal wetland resilience to sea-level rise using Bayesian inference
Hardy, T.; Wu, W.
2017-12-01
The coastal wetlands in the Northern Gulf of Mexico (NGOM) account for 40% of coastal wetland area in the United States and provide various ecosystem services to the region and broader areas. Increasing rates of relative sea-level rise (RSLR), and reduced sediment input have increased coastal wetland loss in the NGOM, accounting for 80% of coastal wetland loss in the nation. Traditional models for predicting the impact of RSLR on coastal wetlands in the NGOM have focused on coastal erosion driven by geophysical variables only, and/or at small spatial extents. Here we developed a model in Bayesian inference to make probabilistic prediction of wetland loss in the entire NGOM as a function of vegetation productivity and geophysical attributes. We also studied how restoration efforts help maintain the area of coastal wetlands. Vegetation productivity contributes organic matter to wetland sedimentation and was approximated using the remotely sensed normalized difference moisture index (NDMI). The geophysical variables include RSLR, tidal range, river discharge, coastal slope, and wave height. We found a significantly positive relation between wetland loss and RSLR, which varied significantly at different river discharge regimes. There also existed a significantly negative relation between wetland loss and NDMI, indicating that in-situ vegetation productivity contributed to wetland resilience to RSLR. This relation did not vary significantly between river discharge regimes. The spatial relation revealed three areas of high RSLR but relatively low wetland loss; these areas were associated with wetland restoration projects in coastal Louisiana. Two projects were breakwater projects, where hard materials were placed off-shore to reduce wave action and promote sedimentation. And one project was a vegetation planting project used to promote sedimentation and wetland stabilization. We further developed an interactive web tool that allows stakeholders to develop similar wetland
Multi-scale inference of interaction rules in animal groups using Bayesian model selection.
Directory of Open Access Journals (Sweden)
Richard P Mann
2012-01-01
Full Text Available Inference of interaction rules of animals moving in groups usually relies on an analysis of large scale system behaviour. Models are tuned through repeated simulation until they match the observed behaviour. More recent work has used the fine scale motions of animals to validate and fit the rules of interaction of animals in groups. Here, we use a Bayesian methodology to compare a variety of models to the collective motion of glass prawns (Paratya australiensis. We show that these exhibit a stereotypical 'phase transition', whereby an increase in density leads to the onset of collective motion in one direction. We fit models to this data, which range from: a mean-field model where all prawns interact globally; to a spatial Markovian model where prawns are self-propelled particles influenced only by the current positions and directions of their neighbours; up to non-Markovian models where prawns have 'memory' of previous interactions, integrating their experiences over time when deciding to change behaviour. We show that the mean-field model fits the large scale behaviour of the system, but does not capture fine scale rules of interaction, which are primarily mediated by physical contact. Conversely, the Markovian self-propelled particle model captures the fine scale rules of interaction but fails to reproduce global dynamics. The most sophisticated model, the non-Markovian model, provides a good match to the data at both the fine scale and in terms of reproducing global dynamics. We conclude that prawns' movements are influenced by not just the current direction of nearby conspecifics, but also those encountered in the recent past. Given the simplicity of prawns as a study system our research suggests that self-propelled particle models of collective motion should, if they are to be realistic at multiple biological scales, include memory of previous interactions and other non-Markovian effects.
Mandel, Kaisey; Kirshner, R. P.; Narayan, G.; Wood-Vasey, W. M.; Friedman, A. S.; Hicken, M.
2010-01-01
I have constructed a comprehensive statistical model for Type Ia supernova light curves spanning optical through near infrared data simultaneously. The near infrared light curves are found to be excellent standard candles (sigma(MH) = 0.11 +/- 0.03 mag) that are less vulnerable to systematic error from dust extinction, a major confounding factor for cosmological studies. A hierarchical statistical framework incorporates coherently multiple sources of randomness and uncertainty, including photometric error, intrinsic supernova light curve variations and correlations, dust extinction and reddening, peculiar velocity dispersion and distances, for probabilistic inference with Type Ia SN light curves. Inferences are drawn from the full probability density over individual supernovae and the SN Ia and dust populations, conditioned on a dataset of SN Ia light curves and redshifts. To compute probabilistic inferences with hierarchical models, I have developed BayeSN, a Markov Chain Monte Carlo algorithm based on Gibbs sampling. This code explores and samples the global probability density of parameters describing individual supernovae and the population. I have applied this hierarchical model to optical and near infrared data of over 100 nearby Type Ia SN from PAIRITEL, the CfA3 sample, and the literature. Using this statistical model, I find that SN with optical and NIR data have a smaller residual scatter in the Hubble diagram than SN with only optical data. The continued study of Type Ia SN in the near infrared will be important for improving their utility as precise and accurate cosmological distance indicators.
Sparse Estimation Using Bayesian Hierarchical Prior Modeling for Real and Complex Linear Models
DEFF Research Database (Denmark)
Pedersen, Niels Lovmand; Manchón, Carles Navarro; Badiu, Mihai Alin
2015-01-01
In sparse Bayesian learning (SBL), Gaussian scale mixtures (GSMs) have been used to model sparsity-inducing priors that realize a class of concave penalty functions for the regression task in real-valued signal models. Motivated by the relative scarcity of formal tools for SBL in complex-valued m......In sparse Bayesian learning (SBL), Gaussian scale mixtures (GSMs) have been used to model sparsity-inducing priors that realize a class of concave penalty functions for the regression task in real-valued signal models. Motivated by the relative scarcity of formal tools for SBL in complex...... error, and robustness in low and medium signal-to-noise ratio regimes....
Khazraee, S Hadi; Johnson, Valen; Lord, Dominique
2018-08-01
The Poisson-gamma (PG) and Poisson-lognormal (PLN) regression models are among the most popular means for motor vehicle crash data analysis. Both models belong to the Poisson-hierarchical family of models. While numerous studies have compared the overall performance of alternative Bayesian Poisson-hierarchical models, little research has addressed the impact of model choice on the expected crash frequency prediction at individual sites. This paper sought to examine whether there are any trends among candidate models predictions e.g., that an alternative model's prediction for sites with certain conditions tends to be higher (or lower) than that from another model. In addition to the PG and PLN models, this research formulated a new member of the Poisson-hierarchical family of models: the Poisson-inverse gamma (PIGam). Three field datasets (from Texas, Michigan and Indiana) covering a wide range of over-dispersion characteristics were selected for analysis. This study demonstrated that the model choice can be critical when the calibrated models are used for prediction at new sites, especially when the data are highly over-dispersed. For all three datasets, the PIGam model would predict higher expected crash frequencies than would the PLN and PG models, in order, indicating a clear link between the models predictions and the shape of their mixing distributions (i.e., gamma, lognormal, and inverse gamma, respectively). The thicker tail of the PIGam and PLN models (in order) may provide an advantage when the data are highly over-dispersed. The analysis results also illustrated a major deficiency of the Deviance Information Criterion (DIC) in comparing the goodness-of-fit of hierarchical models; models with drastically different set of coefficients (and thus predictions for new sites) may yield similar DIC values, because the DIC only accounts for the parameters in the lowest (observation) level of the hierarchy and ignores the higher levels (regression coefficients
A new Bayesian Inference-based Phase Associator for Earthquake Early Warning
Meier, Men-Andrin; Heaton, Thomas; Clinton, John; Wiemer, Stefan
2013-04-01
State of the art network-based Earthquake Early Warning (EEW) systems can provide warnings for large magnitude 7+ earthquakes. Although regions in the direct vicinity of the epicenter will not receive warnings prior to damaging shaking, real-time event characterization is available before the destructive S-wave arrival across much of the strongly affected region. In contrast, in the case of the more frequent medium size events, such as the devastating 1994 Mw6.7 Northridge, California, earthquake, providing timely warning to the smaller damage zone is more difficult. For such events the "blind zone" of current systems (e.g. the CISN ShakeAlert system in California) is similar in size to the area over which severe damage occurs. We propose a faster and more robust Bayesian inference-based event associator, that in contrast to the current standard associators (e.g. Earthworm Binder), is tailored to EEW and exploits information other than only phase arrival times. In particular, the associator potentially allows for reliable automated event association with as little as two observations, which, compared to the ShakeAlert system, would speed up the real-time characterizations by about ten seconds and thus reduce the blind zone area by up to 80%. We compile an extensive data set of regional and teleseismic earthquake and noise waveforms spanning a wide range of earthquake magnitudes and tectonic regimes. We pass these waveforms through a causal real-time filterbank with passband filters between 0.1 and 50Hz, and, updating every second from the event detection, extract the maximum amplitudes in each frequency band. Using this dataset, we define distributions of amplitude maxima in each passband as a function of epicentral distance and magnitude. For the real-time data, we pass incoming broadband and strong motion waveforms through the same filterbank and extract an evolving set of maximum amplitudes in each passband. We use the maximum amplitude distributions to check
Hierarchical Bayesian modelling of mobility metrics for hazard model input calibration
Calder, Eliza; Ogburn, Sarah; Spiller, Elaine; Rutarindwa, Regis; Berger, Jim
2015-04-01
In this work we present a method to constrain flow mobility input parameters for pyroclastic flow models using hierarchical Bayes modeling of standard mobility metrics such as H/L and flow volume etc. The advantage of hierarchical modeling is that it can leverage the information in global dataset for a particular mobility metric in order to reduce the uncertainty in modeling of an individual volcano, especially important where individual volcanoes have only sparse datasets. We use compiled pyroclastic flow runout data from Colima, Merapi, Soufriere Hills, Unzen and Semeru volcanoes, presented in an open-source database FlowDat (https://vhub.org/groups/massflowdatabase). While the exact relationship between flow volume and friction varies somewhat between volcanoes, dome collapse flows originating from the same volcano exhibit similar mobility relationships. Instead of fitting separate regression models for each volcano dataset, we use a variation of the hierarchical linear model (Kass and Steffey, 1989). The model presents a hierarchical structure with two levels; all dome collapse flows and dome collapse flows at specific volcanoes. The hierarchical model allows us to assume that the flows at specific volcanoes share a common distribution of regression slopes, then solves for that distribution. We present comparisons of the 95% confidence intervals on the individual regression lines for the data set from each volcano as well as those obtained from the hierarchical model. The results clearly demonstrate the advantage of considering global datasets using this technique. The technique developed is demonstrated here for mobility metrics, but can be applied to many other global datasets of volcanic parameters. In particular, such methods can provide a means to better contain parameters for volcanoes for which we only have sparse data, a ubiquitous problem in volcanology.
Directory of Open Access Journals (Sweden)
W David Walter
Full Text Available Bovine tuberculosis is a bacterial disease caused by Mycobacterium bovis in livestock and wildlife with hosts that include Eurasian badgers (Meles meles, brushtail possum (Trichosurus vulpecula, and white-tailed deer (Odocoileus virginianus. Risk-assessment efforts in Michigan have been initiated on farms to minimize interactions of cattle with wildlife hosts but research on M. bovis on cattle farms has not investigated the spatial context of disease epidemiology. To incorporate spatially explicit data, initial likelihood of infection probabilities for cattle farms tested for M. bovis, prevalence of M. bovis in white-tailed deer, deer density, and environmental variables for each farm were modeled in a Bayesian hierarchical framework. We used geo-referenced locations of 762 cattle farms that have been tested for M. bovis, white-tailed deer prevalence, and several environmental variables that may lead to long-term survival and viability of M. bovis on farms and surrounding habitats (i.e., soil type, habitat type. Bayesian hierarchical analyses identified deer prevalence and proportion of sandy soil within our sampling grid as the most supported model. Analysis of cattle farms tested for M. bovis identified that for every 1% increase in sandy soil resulted in an increase in odds of infection by 4%. Our analysis revealed that the influence of prevalence of M. bovis in white-tailed deer was still a concern even after considerable efforts to prevent cattle interactions with white-tailed deer through on-farm mitigation and reduction in the deer population. Cattle farms test positive for M. bovis annually in our study area suggesting that the potential for an environmental source either on farms or in the surrounding landscape may contributing to new or re-infections with M. bovis. Our research provides an initial assessment of potential environmental factors that could be incorporated into additional modeling efforts as more knowledge of deer herd
Huang, Yi-Fei; Golding, G Brian
2015-02-15
A number of statistical phylogenetic methods have been developed to infer conserved functional sites or regions in proteins. Many methods, e.g. Rate4Site, apply the standard phylogenetic models to infer site-specific substitution rates and totally ignore the spatial correlation of substitution rates in protein tertiary structures, which may reduce their power to identify conserved functional patches in protein tertiary structures when the sequences used in the analysis are highly similar. The 3D sliding window method has been proposed to infer conserved functional patches in protein tertiary structures, but the window size, which reflects the strength of the spatial correlation, must be predefined and is not inferred from data. We recently developed GP4Rate to solve these problems under the Bayesian framework. Unfortunately, GP4Rate is computationally slow. Here, we present an intuitive web server, FuncPatch, to perform a fast approximate Bayesian inference of conserved functional patches in protein tertiary structures. Both simulations and four case studies based on empirical data suggest that FuncPatch is a good approximation to GP4Rate. However, FuncPatch is orders of magnitudes faster than GP4Rate. In addition, simulations suggest that FuncPatch is potentially a useful tool complementary to Rate4Site, but the 3D sliding window method is less powerful than FuncPatch and Rate4Site. The functional patches predicted by FuncPatch in the four case studies are supported by experimental evidence, which corroborates the usefulness of FuncPatch. The software FuncPatch is freely available at the web site, http://info.mcmaster.ca/yifei/FuncPatch golding@mcmaster.ca Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Ghosh, Sujit K
2010-01-01
Bayesian methods are rapidly becoming popular tools for making statistical inference in various fields of science including biology, engineering, finance, and genetics. One of the key aspects of Bayesian inferential method is its logical foundation that provides a coherent framework to utilize not only empirical but also scientific information available to a researcher. Prior knowledge arising from scientific background, expert judgment, or previously collected data is used to build a prior distribution which is then combined with current data via the likelihood function to characterize the current state of knowledge using the so-called posterior distribution. Bayesian methods allow the use of models of complex physical phenomena that were previously too difficult to estimate (e.g., using asymptotic approximations). Bayesian methods offer a means of more fully understanding issues that are central to many practical problems by allowing researchers to build integrated models based on hierarchical conditional distributions that can be estimated even with limited amounts of data. Furthermore, advances in numerical integration methods, particularly those based on Monte Carlo methods, have made it possible to compute the optimal Bayes estimators. However, there is a reasonably wide gap between the background of the empirically trained scientists and the full weight of Bayesian statistical inference. Hence, one of the goals of this chapter is to bridge the gap by offering elementary to advanced concepts that emphasize linkages between standard approaches and full probability modeling via Bayesian methods.
Yu, Jiyang; Silva, Jose; Califano, Andrea
2016-01-15
Functional genomics (FG) screens, using RNAi or CRISPR technology, have become a standard tool for systematic, genome-wide loss-of-function studies for therapeutic target discovery. As in many large-scale assays, however, off-target effects, variable reagents' potency and experimental noise must be accounted for appropriately control for false positives. Indeed, rigorous statistical analysis of high-throughput FG screening data remains challenging, particularly when integrative analyses are used to combine multiple sh/sgRNAs targeting the same gene in the library. We use large RNAi and CRISPR repositories that are publicly available to evaluate a novel meta-analysis approach for FG screens via Bayesian hierarchical modeling, Screening Bayesian Evaluation and Analysis Method (ScreenBEAM). Results from our analysis show that the proposed strategy, which seamlessly combines all available data, robustly outperforms classical algorithms developed for microarray data sets as well as recent approaches designed for next generation sequencing technologies. Remarkably, the ScreenBEAM algorithm works well even when the quality of FG screens is relatively low, which accounts for about 80-95% of the public datasets. R package and source code are available at: https://github.com/jyyu/ScreenBEAM. ac2248@columbia.edu, jose.silva@mssm.edu, yujiyang@gmail.com Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Yin, Ping; Mu, Lan; Madden, Marguerite; Vena, John E.
2014-10-01
Lung cancer is the second most commonly diagnosed cancer in both men and women in Georgia, USA. However, the spatio-temporal patterns of lung cancer risk in Georgia have not been fully studied. Hierarchical Bayesian models are used here to explore the spatio-temporal patterns of lung cancer incidence risk by race and gender in Georgia for the period of 2000-2007. With the census tract level as the spatial scale and the 2-year period aggregation as the temporal scale, we compare a total of seven Bayesian spatio-temporal models including two under a separate modeling framework and five under a joint modeling framework. One joint model outperforms others based on the deviance information criterion. Results show that the northwest region of Georgia has consistently high lung cancer incidence risk for all population groups during the study period. In addition, there are inverse relationships between the socioeconomic status and the lung cancer incidence risk among all Georgian population groups, and the relationships in males are stronger than those in females. By mapping more reliable variations in lung cancer incidence risk at a relatively fine spatio-temporal scale for different Georgian population groups, our study aims to better support healthcare performance assessment, etiological hypothesis generation, and health policy making.
Aksoy, Ozan; Weesie, Jeroen
2014-05-01
In this paper, using a within-subjects design, we estimate the utility weights that subjects attach to the outcome of their interaction partners in four decision situations: (1) binary Dictator Games (DG), second player's role in the sequential Prisoner's Dilemma (PD) after the first player (2) cooperated and (3) defected, and (4) first player's role in the sequential Prisoner's Dilemma game. We find that the average weights in these four decision situations have the following order: (1)>(2)>(4)>(3). Moreover, the average weight is positive in (1) but negative in (2), (3), and (4). Our findings indicate the existence of strong negative and small positive reciprocity for the average subject, but there is also high interpersonal variation in the weights in these four nodes. We conclude that the PD frame makes subjects more competitive than the DG frame. Using hierarchical Bayesian modeling, we simultaneously analyze beliefs of subjects about others' utility weights in the same four decision situations. We compare several alternative theoretical models on beliefs, e.g., rational beliefs (Bayesian-Nash equilibrium) and a consensus model. Our results on beliefs strongly support the consensus effect and refute rational beliefs: there is a strong relationship between own preferences and beliefs and this relationship is relatively stable across the four decision situations. Copyright © 2014 Elsevier Inc. All rights reserved.
Raghavan, Ram K.; Goodin, Douglas G.; Neises, Daniel; Anderson, Gary A.; Ganta, Roman R.
2016-01-01
This study aims to examine the spatio-temporal dynamics of Rocky Mountain spotted fever (RMSF) prevalence in four contiguous states of Midwestern United States, and to determine the impact of environmental and socio–economic factors associated with this disease. Bayesian hierarchical models were used to quantify space and time only trends and spatio–temporal interaction effect in the case reports submitted to the state health departments in the region. Various socio–economic, environmental and climatic covariates screened a priori in a bivariate procedure were added to a main–effects Bayesian model in progressive steps to evaluate important drivers of RMSF space-time patterns in the region. Our results show a steady increase in RMSF incidence over the study period to newer geographic areas, and the posterior probabilities of county-specific trends indicate clustering of high risk counties in the central and southern parts of the study region. At the spatial scale of a county, the prevalence levels of RMSF is influenced by poverty status, average relative humidity, and average land surface temperature (>35°C) in the region, and the relevance of these factors in the context of climate–change impacts on tick–borne diseases are discussed. PMID:26942604
Directory of Open Access Journals (Sweden)
Ram K Raghavan
Full Text Available This study aims to examine the spatio-temporal dynamics of Rocky Mountain spotted fever (RMSF prevalence in four contiguous states of Midwestern United States, and to determine the impact of environmental and socio-economic factors associated with this disease. Bayesian hierarchical models were used to quantify space and time only trends and spatio-temporal interaction effect in the case reports submitted to the state health departments in the region. Various socio-economic, environmental and climatic covariates screened a priori in a bivariate procedure were added to a main-effects Bayesian model in progressive steps to evaluate important drivers of RMSF space-time patterns in the region. Our results show a steady increase in RMSF incidence over the study period to newer geographic areas, and the posterior probabilities of county-specific trends indicate clustering of high risk counties in the central and southern parts of the study region. At the spatial scale of a county, the prevalence levels of RMSF is influenced by poverty status, average relative humidity, and average land surface temperature (>35°C in the region, and the relevance of these factors in the context of climate-change impacts on tick-borne diseases are discussed.
Raghavan, Ram K; Goodin, Douglas G; Neises, Daniel; Anderson, Gary A; Ganta, Roman R
2016-01-01
This study aims to examine the spatio-temporal dynamics of Rocky Mountain spotted fever (RMSF) prevalence in four contiguous states of Midwestern United States, and to determine the impact of environmental and socio-economic factors associated with this disease. Bayesian hierarchical models were used to quantify space and time only trends and spatio-temporal interaction effect in the case reports submitted to the state health departments in the region. Various socio-economic, environmental and climatic covariates screened a priori in a bivariate procedure were added to a main-effects Bayesian model in progressive steps to evaluate important drivers of RMSF space-time patterns in the region. Our results show a steady increase in RMSF incidence over the study period to newer geographic areas, and the posterior probabilities of county-specific trends indicate clustering of high risk counties in the central and southern parts of the study region. At the spatial scale of a county, the prevalence levels of RMSF is influenced by poverty status, average relative humidity, and average land surface temperature (>35°C) in the region, and the relevance of these factors in the context of climate-change impacts on tick-borne diseases are discussed.
Raithel, Carolyn A.; Özel, Feryal; Psaltis, Dimitrios
2017-08-01
One of the key goals of observing neutron stars is to infer the equation of state (EoS) of the cold, ultradense matter in their interiors. Here, we present a Bayesian statistical method of inferring the pressures at five fixed densities, from a sample of mock neutron star masses and radii. We show that while five polytropic segments are needed for maximum flexibility in the absence of any prior knowledge of the EoS, regularizers are also necessary to ensure that simple underlying EoS are not over-parameterized. For ideal data with small measurement uncertainties, we show that the pressure at roughly twice the nuclear saturation density, {ρ }{sat}, can be inferred to within 0.3 dex for many realizations of potential sources of uncertainties. The pressures of more complicated EoS with significant phase transitions can also be inferred to within ˜30%. We also find that marginalizing the multi-dimensional parameter space of pressure to infer a mass-radius relation can lead to biases of nearly 1 km in radius, toward larger radii. Using the full, five-dimensional posterior likelihoods avoids this bias.
Paz-Linares, Deirel; Vega-Hernández, Mayrim; Rojas-López, Pedro A; Valdés-Hernández, Pedro A; Martínez-Montes, Eduardo; Valdés-Sosa, Pedro A
2017-01-01
The estimation of EEG generating sources constitutes an Inverse Problem (IP) in Neuroscience. This is an ill-posed problem due to the non-uniqueness of the solution and regularization or prior information is needed to undertake Electrophysiology Source Imaging. Structured Sparsity priors can be attained through combinations of (L1 norm-based) and (L2 norm-based) constraints such as the Elastic Net (ENET) and Elitist Lasso (ELASSO) models. The former model is used to find solutions with a small number of smooth nonzero patches, while the latter imposes different degrees of sparsity simultaneously along different dimensions of the spatio-temporal matrix solutions. Both models have been addressed within the penalized regression approach, where the regularization parameters are selected heuristically, leading usually to non-optimal and computationally expensive solutions. The existing Bayesian formulation of ENET allows hyperparameter learning, but using the computationally intensive Monte Carlo/Expectation Maximization methods, which makes impractical its application to the EEG IP. While the ELASSO have not been considered before into the Bayesian context. In this work, we attempt to solve the EEG IP using a Bayesian framework for ENET and ELASSO models. We propose a Structured Sparse Bayesian Learning algorithm based on combining the Empirical Bayes and the iterative coordinate descent procedures to estimate both the parameters and hyperparameters. Using realistic simulations and avoiding the inverse crime we illustrate that our methods are able to recover complicated source setups more accurately and with a more robust estimation of the hyperparameters and behavior under different sparsity scenarios than classical LORETA, ENET and LASSO Fusion solutions. We also solve the EEG IP using data from a visual attention experiment, finding more interpretable neurophysiological patterns with our methods. The Matlab codes used in this work, including Simulations, Methods
Systematic validation of non-equilibrium thermochemical models using Bayesian inference
Miki, Kenji; Panesi, Marco; Prudhomme, Serge
2015-01-01
and Treanor [3], and Park [4]. Up to 16 uncertain parameters are estimated using Bayesian updating based on the latest absolute volumetric radiance data collected at the Electric Arc Shock Tube (EAST) installed inside the NASA Ames Research Center. Following
DEFF Research Database (Denmark)
Hooghoudt, Jan Otto; Barroso, Margarida; Waagepetersen, Rasmus Plenge
2017-01-01
Főrster resonance energy transfer (FRET) is a quantum-physical phenomenon where energy may be transferred from one molecule to a neighbour molecule if the molecules are close enough. Using fluorophore molecule marking of proteins in a cell it is possible to measure in microscopic images to what....... In this paper we propose a new likelihood-based approach to statistical inference for FRET microscopic data. The likelihood function is obtained from a detailed modeling of the FRET data generating mechanism conditional on a protein configuration. We next follow a Bayesian approach and introduce a spatial point...
Malekmohammadi, Bahram; Tayebzadeh Moghadam, Negar
2018-04-13
Environmental risk assessment (ERA) is a commonly used, effective tool applied to reduce adverse effects of environmental risk factors. In this study, ERA was investigated using the Bayesian network (BN) model based on a hierarchical structure of variables in an influence diagram (ID). ID facilitated ranking of the different alternatives under uncertainty that were then used to evaluate comparisons of the different risk factors. BN was used to present a new model for ERA applicable to complicated development projects such as dam construction. The methodology was applied to the Gabric Dam, in southern Iran. The main environmental risk factors in the region, presented by the Gabric Dam, were identified based on the Delphi technique and specific features of the study area. These included the following: flood, water pollution, earthquake, changes in land use, erosion and sedimentation, effects on the population, and ecosensitivity. These risk factors were then categorized based on results from the output decision node of the BN, including expected utility values for risk factors in the decision node. ERA was performed for the Gabric Dam using the analytical hierarchy process (AHP) method to compare results of BN modeling with those of conventional methods. Results determined that a BN-based hierarchical structure to ERA present acceptable and reasonable risk assessment prioritization in proposing suitable solutions to reduce environmental risks and can be used as a powerful decision support system for evaluating environmental risks.
Directory of Open Access Journals (Sweden)
Thomas J Rodhouse
Full Text Available Monitoring programs that evaluate restoration and inform adaptive management are important for addressing environmental degradation. These efforts may be well served by spatially explicit hierarchical approaches to modeling because of unavoidable spatial structure inherited from past land use patterns and other factors. We developed bayesian hierarchical models to estimate trends from annual density counts observed in a spatially structured wetland forb (Camassia quamash [camas] population following the cessation of grazing and mowing on the study area, and in a separate reference population of camas. The restoration site was bisected by roads and drainage ditches, resulting in distinct subpopulations ("zones" with different land use histories. We modeled this spatial structure by fitting zone-specific intercepts and slopes. We allowed spatial covariance parameters in the model to vary by zone, as in stratified kriging, accommodating anisotropy and improving computation and biological interpretation. Trend estimates provided evidence of a positive effect of passive restoration, and the strength of evidence was influenced by the amount of spatial structure in the model. Allowing trends to vary among zones and accounting for topographic heterogeneity increased precision of trend estimates. Accounting for spatial autocorrelation shifted parameter coefficients in ways that varied among zones depending on strength of statistical shrinkage, autocorrelation and topographic heterogeneity--a phenomenon not widely described. Spatially explicit estimates of trend from hierarchical models will generally be more useful to land managers than pooled regional estimates and provide more realistic assessments of uncertainty. The ability to grapple with historical contingency is an appealing benefit of this approach.
Directory of Open Access Journals (Sweden)
E. Laloy
2017-07-01
Full Text Available The rate at which low-lying sandy areas in temperate regions, such as the Campine Plateau (NE Belgium, have been eroding during the Quaternary is a matter of debate. Current knowledge on the average pace of landscape evolution in the Campine area is largely based on geological inferences and modern analogies. We performed a Bayesian inversion of an in situ-produced 10Be concentration depth profile to infer the average long-term erosion rate together with two other parameters: the surface exposure age and the inherited 10Be concentration. Compared to the latest advances in probabilistic inversion of cosmogenic radionuclide (CRN data, our approach has the following two innovative components: it (1 uses Markov chain Monte Carlo (MCMC sampling and (2 accounts (under certain assumptions for the contribution of model errors to posterior uncertainty. To investigate to what extent our approach differs from the state of the art in practice, a comparison against the Bayesian inversion method implemented in the CRONUScalc program is made. Both approaches identify similar maximum a posteriori (MAP parameter values, but posterior parameter and predictive uncertainty derived using the method taken in CRONUScalc is moderately underestimated. A simple way for producing more consistent uncertainty estimates with the CRONUScalc-like method in the presence of model errors is therefore suggested. Our inferred erosion rate of 39 ± 8. 9 mm kyr−1 (1σ is relatively large in comparison with landforms that erode under comparable (paleo-climates elsewhere in the world. We evaluate this value in the light of the erodibility of the substrate and sudden base level lowering during the Middle Pleistocene. A denser sampling scheme of a two-nuclide concentration depth profile would allow for better inferred erosion rate resolution, and including more uncertain parameters in the MCMC inversion.
Laloy, Eric; Beerten, Koen; Vanacker, Veerle; Christl, Marcus; Rogiers, Bart; Wouters, Laurent
2017-07-01
The rate at which low-lying sandy areas in temperate regions, such as the Campine Plateau (NE Belgium), have been eroding during the Quaternary is a matter of debate. Current knowledge on the average pace of landscape evolution in the Campine area is largely based on geological inferences and modern analogies. We performed a Bayesian inversion of an in situ-produced 10Be concentration depth profile to infer the average long-term erosion rate together with two other parameters: the surface exposure age and the inherited 10Be concentration. Compared to the latest advances in probabilistic inversion of cosmogenic radionuclide (CRN) data, our approach has the following two innovative components: it (1) uses Markov chain Monte Carlo (MCMC) sampling and (2) accounts (under certain assumptions) for the contribution of model errors to posterior uncertainty. To investigate to what extent our approach differs from the state of the art in practice, a comparison against the Bayesian inversion method implemented in the CRONUScalc program is made. Both approaches identify similar maximum a posteriori (MAP) parameter values, but posterior parameter and predictive uncertainty derived using the method taken in CRONUScalc is moderately underestimated. A simple way for producing more consistent uncertainty estimates with the CRONUScalc-like method in the presence of model errors is therefore suggested. Our inferred erosion rate of 39 ± 8. 9 mm kyr-1 (1σ) is relatively large in comparison with landforms that erode under comparable (paleo-)climates elsewhere in the world. We evaluate this value in the light of the erodibility of the substrate and sudden base level lowering during the Middle Pleistocene. A denser sampling scheme of a two-nuclide concentration depth profile would allow for better inferred erosion rate resolution, and including more uncertain parameters in the MCMC inversion.
Energy Technology Data Exchange (ETDEWEB)
Li, Kangji [Institute of Cyber-Systems and Control, Zhejiang University, Hangzhou 310027 (China); School of Electricity Information Engineering, Jiangsu University, Zhenjiang 212013 (China); Su, Hongye [Institute of Cyber-Systems and Control, Zhejiang University, Hangzhou 310027 (China)
2010-11-15
There are several ways to forecast building energy consumption, varying from simple regression to models based on physical principles. In this paper, a new method, namely, the hybrid genetic algorithm-hierarchical adaptive network-based fuzzy inference system (GA-HANFIS) model is developed. In this model, hierarchical structure decreases the rule base dimension. Both clustering and rule base parameters are optimized by GAs and neural networks (NNs). The model is applied to predict a hotel's daily air conditioning consumption for a period over 3 months. The results obtained by the proposed model are presented and compared with regular method of NNs, which indicates that GA-HANFIS model possesses better performance than NNs in terms of their forecasting accuracy. (author)
Bayesian inference – a way to combine statistical data and semantic analysis meaningfully
Directory of Open Access Journals (Sweden)
Eila Lindfors
2011-11-01
Full Text Available This article focuses on presenting the possibilities of Bayesian modelling (Finite Mixture Modelling in the semantic analysis of statistically modelled data. The probability of a hypothesis in relation to the data available is an important question in inductive reasoning. Bayesian modelling allows the researcher to use many models at a time and provides tools to evaluate the goodness of different models. The researcher should always be aware that there is no such thing as the exact probability of an exact event. This is the reason for using probabilistic models. Each model presents a different perspective on the phenomenon in focus, and the researcher has to choose the most probable model with a view to previous research and the knowledge available.The idea of Bayesian modelling is illustrated here by presenting two different sets of data, one from craft science research (n=167 and the other (n=63 from educational research (Lindfors, 2007, 2002. The principles of how to build models and how to combine different profiles are described in the light of the research mentioned.Bayesian modelling is an analysis based on calculating probabilities in relation to a specific set of quantitative data. It is a tool for handling data and interpreting it semantically. The reliability of the analysis arises from an argumentation of which model can be selected from the model space as the basis for an interpretation, and on which arguments.Keywords: method, sloyd, Bayesian modelling, student teachersURN:NBN:no-29959
Ruggeri, Fabrizio; Sawlan, Zaid A; Scavino, Marco; Tempone, Raul
2016-01-01
that the thermal diffusivity parameter can be modeled a priori through a lognormal random variable or by means of a space-dependent stationary lognormal random field. Synthetic data are used to test the inference. We exploit the behavior of the non-normalized log
Yasmirullah, Septia Devi Prihastuti; Iriawan, Nur; Sipayung, Feronika Rosalinda
2017-11-01
The success of regional economic establishment could be measured by economic growth. Since the Act No. 32 of 2004 has been implemented, unbalance economic among the regency in Indonesia is increasing. This condition is contrary different with the government goal to build society welfare through the economic activity development in each region. This research aims to examine economic growth through the distribution of bank credits to each Indonesia's regency. The data analyzed in this research is hierarchically structured data which follow normal distribution in first level. Two modeling approaches are employed in this research, a global-one level Bayesian approach and two-level hierarchical Bayesian approach. The result shows that hierarchical Bayesian has succeeded to demonstrate a better estimation than a global-one level Bayesian. It proves that the different economic growth in each province is significantly influenced by the variations of micro level characteristics in each province. These variations are significantly affected by cities and province characteristics in second level.
Wheeler, David C.; Hickson, DeMarc A.; Waller, Lance A.
2010-01-01
Many diagnostic tools and goodness-of-fit measures, such as the Akaike information criterion (AIC) and the Bayesian deviance information criterion (DIC), are available to evaluate the overall adequacy of linear regression models. In addition, visually assessing adequacy in models has become an essential part of any regression analysis. In this paper, we focus on a spatial consideration of the local DIC measure for model selection and goodness-of-fit evaluation. We use a partitioning of the DIC into the local DIC, leverage, and deviance residuals to assess local model fit and influence for both individual observations and groups of observations in a Bayesian framework. We use visualization of the local DIC and differences in local DIC between models to assist in model selection and to visualize the global and local impacts of adding covariates or model parameters. We demonstrate the utility of the local DIC in assessing model adequacy using HIV prevalence data from pregnant women in the Butare province of Rwanda during 1989-1993 using a range of linear model specifications, from global effects only to spatially varying coefficient models, and a set of covariates related to sexual behavior. Results of applying the diagnostic visualization approach include more refined model selection and greater understanding of the models as applied to the data. PMID:21243121
Lin, Yi-Shin; Heinke, Dietmar; Humphreys, Glyn W
2015-04-01
In this study, we applied Bayesian-based distributional analyses to examine the shapes of response time (RT) distributions in three visual search paradigms, which varied in task difficulty. In further analyses we investigated two common observations in visual search-the effects of display size and of variations in search efficiency across different task conditions-following a design that had been used in previous studies (Palmer, Horowitz, Torralba, & Wolfe, Journal of Experimental Psychology: Human Perception and Performance, 37, 58-71, 2011; Wolfe, Palmer, & Horowitz, Vision Research, 50, 1304-1311, 2010) in which parameters of the response distributions were measured. Our study showed that the distributional parameters in an experimental condition can be reliably estimated by moderate sample sizes when Monte Carlo simulation techniques are applied. More importantly, by analyzing trial RTs, we were able to extract paradigm-dependent shape changes in the RT distributions that could be accounted for by using the EZ2 diffusion model. The study showed that Bayesian-based RT distribution analyses can provide an important means to investigate the underlying cognitive processes in search, including stimulus grouping and the bottom-up guidance of attention.
Manual hierarchical clustering of regional geochemical data using a Bayesian finite mixture model
International Nuclear Information System (INIS)
Ellefsen, Karl J.; Smith, David B.
2016-01-01
Interpretation of regional scale, multivariate geochemical data is aided by a statistical technique called “clustering.” We investigate a particular clustering procedure by applying it to geochemical data collected in the State of Colorado, United States of America. The clustering procedure partitions the field samples for the entire survey area into two clusters. The field samples in each cluster are partitioned again to create two subclusters, and so on. This manual procedure generates a hierarchy of clusters, and the different levels of the hierarchy show geochemical and geological processes occurring at different spatial scales. Although there are many different clustering methods, we use Bayesian finite mixture modeling with two probability distributions, which yields two clusters. The model parameters are estimated with Hamiltonian Monte Carlo sampling of the posterior probability density function, which usually has multiple modes. Each mode has its own set of model parameters; each set is checked to ensure that it is consistent both with the data and with independent geologic knowledge. The set of model parameters that is most consistent with the independent geologic knowledge is selected for detailed interpretation and partitioning of the field samples. - Highlights: • We evaluate a clustering procedure by applying it to geochemical data. • The procedure generates a hierarchy of clusters. • Different levels of the hierarchy show geochemical processes at different spatial scales. • The clustering method is Bayesian finite mixture modeling. • Model parameters are estimated with Hamiltonian Monte Carlo sampling.
Yu, Jianbo
2015-12-01
Prognostics is much efficient to achieve zero-downtime performance, maximum productivity and proactive maintenance of machines. Prognostics intends to assess and predict the time evolution of machine health degradation so that machine failures can be predicted and prevented. A novel prognostics system is developed based on the data-model-fusion scheme using the Bayesian inference-based self-organizing map (SOM) and an integration of logistic regression (LR) and high-order particle filtering (HOPF). In this prognostics system, a baseline SOM is constructed to model the data distribution space of healthy machine under an assumption that predictable fault patterns are not available. Bayesian inference-based probability (BIP) derived from the baseline SOM is developed as a quantification indication of machine health degradation. BIP is capable of offering failure probability for the monitored machine, which has intuitionist explanation related to health degradation state. Based on those historic BIPs, the constructed LR and its modeling noise constitute a high-order Markov process (HOMP) to describe machine health propagation. HOPF is used to solve the HOMP estimation to predict the evolution of the machine health in the form of a probability density function (PDF). An on-line model update scheme is developed to adapt the Markov process changes to machine health dynamics quickly. The experimental results on a bearing test-bed illustrate the potential applications of the proposed system as an effective and simple tool for machine health prognostics.
Zhou, X.; Albertson, J. D.
2016-12-01
Natural gas is considered as a bridge fuel towards clean energy due to its potential lower greenhouse gas emission comparing with other fossil fuels. Despite numerous efforts, an efficient and cost-effective approach to monitor fugitive methane emissions along the natural gas production-supply chain has not been developed yet. Recently, mobile methane measurement has been introduced which applies a Bayesian approach to probabilistically infer methane emission rates and update estimates recursively when new measurements become available. However, the likelihood function, especially the error term which determines the shape of the estimate uncertainty, is not rigorously defined and evaluated with field data. To address this issue, we performed a series of near-source (using a specialized vehicle mounted with fast response methane analyzers and a GPS unit. Methane concentrations were measured at two different heights along mobile traversals downwind of the sources, and concurrent wind and temperature data are recorded by nearby 3-D sonic anemometers. With known methane release rates, the measurements were used to determine the functional form and the parameterization of the likelihood function in the Bayesian inference scheme under different meteorological conditions.
Scliar, Marilia O; Gouveia, Mateus H; Benazzo, Andrea; Ghirotto, Silvia; Fagundes, Nelson J R; Leal, Thiago P; Magalhães, Wagner C S; Pereira, Latife; Rodrigues, Maira R; Soares-Souza, Giordano B; Cabrera, Lilia; Berg, Douglas E; Gilman, Robert H; Bertorelle, Giorgio; Tarazona-Santos, Eduardo
2014-09-30
Archaeology reports millenary cultural contacts between Peruvian Coast-Andes and the Amazon Yunga, a rainforest transitional region between Andes and Lower Amazonia. To clarify the relationships between cultural and biological evolution of these populations, in particular between Amazon Yungas and Andeans, we used DNA-sequence data, a model-based Bayesian approach and several statistical validations to infer a set of demographic parameters. We found that the genetic diversity of the Shimaa (an Amazon Yunga population) is a subset of that of Quechuas from Central-Andes. Using the Isolation-with-Migration population genetics model, we inferred that the Shimaa ancestors were a small subgroup that split less than 5300 years ago (after the development of complex societies) from an ancestral Andean population. After the split, the most plausible scenario compatible with our results is that the ancestors of Shimaas moved toward the Peruvian Amazon Yunga and incorporated the culture and language of some of their neighbors, but not a substantial amount of their genes. We validated our results using Approximate Bayesian Computations, posterior predictive tests and the analysis of pseudo-observed datasets. We presented a case study in which model-based Bayesian approaches, combined with necessary statistical validations, shed light into the prehistoric demographic relationship between Andeans and a population from the Amazon Yunga. Our results offer a testable model for the peopling of this large transitional environmental region between the Andes and the Lower Amazonia. However, studies on larger samples and involving more populations of these regions are necessary to confirm if the predominant Andean biological origin of the Shimaas is the rule, and not the exception.
Directory of Open Access Journals (Sweden)
M. Subbiah
2017-01-01
Full Text Available Extensive statistical practice has shown the importance and relevance of the inferential problem of estimating probability parameters in a binomial experiment; especially on the issues of competing intervals from frequentist, Bayesian, and Bootstrap approaches. The package written in the free R environment and presented in this paper tries to take care of the issues just highlighted, by pooling a number of widely available and well-performing methods and apporting on them essential variations. A wide range of functions helps users with differing skills to estimate, evaluate, summarize, numerically and graphically, various measures adopting either the frequentist or the Bayesian paradigm.
A hierarchical Bayesian spatio-temporal model for extreme precipitation events
Ghosh, Souparno; Mallick, Bani K.
2011-01-01
We propose a new approach to model a sequence of spatially distributed time series of extreme values. Unlike common practice, we incorporate spatial dependence directly in the likelihood and allow the temporal component to be captured at the second level of hierarchy. Inferences about the parameters and spatio-temporal predictions are obtained via MCMC technique. The model is fitted to a gridded precipitation data set collected over 99 years across the continental U.S. © 2010 John Wiley & Sons, Ltd..
A hierarchical Bayesian spatio-temporal model for extreme precipitation events
Ghosh, Souparno
2011-03-01
We propose a new approach to model a sequence of spatially distributed time series of extreme values. Unlike common practice, we incorporate spatial dependence directly in the likelihood and allow the temporal component to be captured at the second level of hierarchy. Inferences about the parameters and spatio-temporal predictions are obtained via MCMC technique. The model is fitted to a gridded precipitation data set collected over 99 years across the continental U.S. © 2010 John Wiley & Sons, Ltd..
Hensman, James; Lawrence, Neil D; Rattray, Magnus
2013-08-20
Time course data from microarrays and high-throughput sequencing experiments require simple, computationally efficient and powerful statistical models to extract meaningful biological signal, and for tasks such as data fusion and clustering. Existing methodologies fail to capture either the temporal or replicated nature of the experiments, and often impose constraints on the data collection process, such as regularly spaced samples, or similar sampling schema across replications. We propose hierarchical Gaussian processes as a general model of gene expression time-series, with application to a variety of problems. In particular, we illustrate the method's capacity for missing data imputation, data fusion and clustering.The method can impute data which is missing both systematically and at random: in a hold-out test on real data, performance is significantly better than commonly used imputation methods. The method's ability to model inter- and intra-cluster variance leads to more biologically meaningful clusters. The approach removes the necessity for evenly spaced samples, an advantage illustrated on a developmental Drosophila dataset with irregular replications. The hierarchical Gaussian process model provides an excellent statistical basis for several gene-expression time-series tasks. It has only a few additional parameters over a regular GP, has negligible additional complexity, is easily implemented and can be integrated into several existing algorithms. Our experiments were implemented in python, and are available from the authors' website: http://staffwww.dcs.shef.ac.uk/people/J.Hensman/.
Czech Academy of Sciences Publication Activity Database
Fernandes, R.; Eley, Y.; Brabec, Marek; Lucquin, A.; Millard, A.; Craig, O.E.
2018-01-01
Roč. 117, March (2018), s. 31-42 ISSN 0146-6380 Institutional support: RVO:67985807 Keywords : Fatty acids * carbon isotopes * pottery use * Bayesian mixing models * FRUITS Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 3.081, year: 2016
Adaptive Bayesian inference on the mean of an infinite-dimensional normal distribution
Belitser, E.; Ghosal, S.
2003-01-01
We consider the problem of estimating the mean of an infinite-break dimensional normal distribution from the Bayesian perspective. Under the assumption that the unknown true mean satisfies a "smoothness condition," we first derive the convergence rate of the posterior distribution for a prior that
International Nuclear Information System (INIS)
Blanc, Guillermo A.; Kewley, Lisa; Vogt, Frédéric P. A.; Dopita, Michael A.
2015-01-01
We present a new method for inferring the metallicity (Z) and ionization parameter (q) of H II regions and star-forming galaxies using strong nebular emission lines (SELs). We use Bayesian inference to derive the joint and marginalized posterior probability density functions for Z and q given a set of observed line fluxes and an input photoionization model. Our approach allows the use of arbitrary sets of SELs and the inclusion of flux upper limits. The method provides a self-consistent way of determining the physical conditions of ionized nebulae that is not tied to the arbitrary choice of a particular SEL diagnostic and uses all the available information. Unlike theoretically calibrated SEL diagnostics, the method is flexible and not tied to a particular photoionization model. We describe our algorithm, validate it against other methods, and present a tool that implements it called IZI. Using a sample of nearby extragalactic H II regions, we assess the performance of commonly used SEL abundance diagnostics. We also use a sample of 22 local H II regions having both direct and recombination line (RL) oxygen abundance measurements in the literature to study discrepancies in the abundance scale between different methods. We find that oxygen abundances derived through Bayesian inference using currently available photoionization models in the literature can be in good (∼30%) agreement with RL abundances, although some models perform significantly better than others. We also confirm that abundances measured using the direct method are typically ∼0.2 dex lower than both RL and photoionization-model-based abundances
Energy Technology Data Exchange (ETDEWEB)
Blanc, Guillermo A. [Observatories of the Carnegie Institution for Science, 813 Santa Barbara Street, Pasadena, CA 91101 (United States); Kewley, Lisa; Vogt, Frédéric P. A.; Dopita, Michael A. [Research School of Astronomy and Astrophysics, Australian National University, Cotter Road, Weston, ACT 2611 (Australia)
2015-01-10
We present a new method for inferring the metallicity (Z) and ionization parameter (q) of H II regions and star-forming galaxies using strong nebular emission lines (SELs). We use Bayesian inference to derive the joint and marginalized posterior probability density functions for Z and q given a set of observed line fluxes and an input photoionization model. Our approach allows the use of arbitrary sets of SELs and the inclusion of flux upper limits. The method provides a self-consistent way of determining the physical conditions of ionized nebulae that is not tied to the arbitrary choice of a particular SEL diagnostic and uses all the available information. Unlike theoretically calibrated SEL diagnostics, the method is flexible and not tied to a particular photoionization model. We describe our algorithm, validate it against other methods, and present a tool that implements it called IZI. Using a sample of nearby extragalactic H II regions, we assess the performance of commonly used SEL abundance diagnostics. We also use a sample of 22 local H II regions having both direct and recombination line (RL) oxygen abundance measurements in the literature to study discrepancies in the abundance scale between different methods. We find that oxygen abundances derived through Bayesian inference using currently available photoionization models in the literature can be in good (∼30%) agreement with RL abundances, although some models perform significantly better than others. We also confirm that abundances measured using the direct method are typically ∼0.2 dex lower than both RL and photoionization-model-based abundances.
Bayesian models: A statistical primer for ecologists
Hobbs, N. Thompson; Hooten, Mevin B.
2015-01-01
Bayesian modeling has become an indispensable tool for ecological research because it is uniquely suited to deal with complexity in a statistically coherent way. This textbook provides a comprehensive and accessible introduction to the latest Bayesian methods—in language ecologists can understand. Unlike other books on the subject, this one emphasizes the principles behind the computations, giving ecologists a big-picture understanding of how to implement this powerful statistical approach.Bayesian Models is an essential primer for non-statisticians. It begins with a definition of probability and develops a step-by-step sequence of connected ideas, including basic distribution theory, network diagrams, hierarchical models, Markov chain Monte Carlo, and inference from single and multiple models. This unique book places less emphasis on computer coding, favoring instead a concise presentation of the mathematical statistics needed to understand how and why Bayesian analysis works. It also explains how to write out properly formulated hierarchical Bayesian models and use them in computing, research papers, and proposals.This primer enables ecologists to understand the statistical principles behind Bayesian modeling and apply them to research, teaching, policy, and management.Presents the mathematical and statistical foundations of Bayesian modeling in language accessible to non-statisticiansCovers basic distribution theory, network diagrams, hierarchical models, Markov chain Monte Carlo, and moreDeemphasizes computer coding in favor of basic principlesExplains how to write out properly factored statistical expressions representing Bayesian models
Walter, W. David; Smith, Rick; Vanderklok, Mike; VerCauterren, Kurt C.
2014-01-01
Bovine tuberculosis is a bacterial disease caused by Mycobacterium bovis in livestock and wildlife with hosts that include Eurasian badgers (Meles meles), brushtail possum (Trichosurus vulpecula), and white-tailed deer (Odocoileus virginianus). Risk-assessment efforts in Michigan have been initiated on farms to minimize interactions of cattle with wildlife hosts but research onM. bovis on cattle farms has not investigated the spatial context of disease epidemiology. To incorporate spatially explicit data, initial likelihood of infection probabilities for cattle farms tested for M. bovis, prevalence of M. bovis in white-tailed deer, deer density, and environmental variables for each farm were modeled in a Bayesian hierarchical framework. We used geo-referenced locations of 762 cattle farms that have been tested for M. bovis, white-tailed deer prevalence, and several environmental variables that may lead to long-term survival and viability of M. bovis on farms and surrounding habitats (i.e., soil type, habitat type). Bayesian hierarchical analyses identified deer prevalence and proportion of sandy soil within our sampling grid as the most supported model. Analysis of cattle farms tested for M. bovisidentified that for every 1% increase in sandy soil resulted in an increase in odds of infection by 4%. Our analysis revealed that the influence of prevalence of M. bovis in white-tailed deer was still a concern even after considerable efforts to prevent cattle interactions with white-tailed deer through on-farm mitigation and reduction in the deer population. Cattle farms test positive for M. bovis annually in our study area suggesting that the potential for an environmental source either on farms or in the surrounding landscape may contributing to new or re-infections with M. bovis. Our research provides an initial assessment of potential environmental factors that could be incorporated into additional modeling efforts as more knowledge of deer herd
Multi-model polynomial chaos surrogate dictionary for Bayesian inference in elasticity problems
Contreras, Andres A.; Le Maî tre, Olivier P.; Aquino, Wilkins; Knio, Omar
2016-01-01
of stiff inclusions embedded in a soft matrix, mimicking tumors in soft tissues. We rely on a polynomial chaos (PC) surrogate to accelerate the inference process. The PC surrogate predicts the dependence of the displacements field with the random elastic
DEFF Research Database (Denmark)
Iglesias, Juan Eugenio; Sabuncu, Mert Rory; Van Leemput, Koen
2013-01-01
Many segmentation algorithms in medical image analysis use Bayesian modeling to augment local image appearance with prior anatomical knowledge. Such methods often contain a large number of free parameters that are first estimated and then kept fixed during the actual segmentation process. However...... in an Alzheimer’s disease classification task. As an additional benefit, the technique also allows one to compute informative “error bars” on the volume estimates of individual structures....
Inferring the most probable maps of underground utilities using Bayesian mapping model
Bilal, Muhammad; Khan, Wasiq; Muggleton, Jennifer; Rustighi, Emiliano; Jenks, Hugo; Pennock, Steve R.; Atkins, Phil R.; Cohn, Anthony
2018-03-01
Mapping the Underworld (MTU), a major initiative in the UK, is focused on addressing social, environmental and economic consequences raised from the inability to locate buried underground utilities (such as pipes and cables) by developing a multi-sensor mobile device. The aim of MTU device is to locate different types of buried assets in real time with the use of automated data processing techniques and statutory records. The statutory records, even though typically being inaccurate and incomplete, provide useful prior information on what is buried under the ground and where. However, the integration of information from multiple sensors (raw data) with these qualitative maps and their visualization is challenging and requires the implementation of robust machine learning/data fusion approaches. An approach for automated creation of revised maps was developed as a Bayesian Mapping model in this paper by integrating the knowledge extracted from sensors raw data and available statutory records. The combination of statutory records with the hypotheses from sensors was for initial estimation of what might be found underground and roughly where. The maps were (re)constructed using automated image segmentation techniques for hypotheses extraction and Bayesian classification techniques for segment-manhole connections. The model consisting of image segmentation algorithm and various Bayesian classification techniques (segment recognition and expectation maximization (EM) algorithm) provided robust performance on various simulated as well as real sites in terms of predicting linear/non-linear segments and constructing refined 2D/3D maps.
Hagemann, M. W.; Gleason, C. J.; Durand, M. T.
2017-11-01
The forthcoming Surface Water and Ocean Topography (SWOT) NASA satellite mission will measure water surface width, height, and slope of major rivers worldwide. The resulting data could provide an unprecedented account of river discharge at continental scales, but reliable methods need to be identified prior to launch. Here we present a novel algorithm for discharge estimation from only remotely sensed stream width, slope, and height at multiple locations along a mass-conserved river segment. The algorithm, termed the Bayesian AMHG-Manning (BAM) algorithm, implements a Bayesian formulation of streamflow uncertainty using a combination of Manning's equation and at-many-stations hydraulic geometry (AMHG). Bayesian methods provide a statistically defensible approach to generating discharge estimates in a physically underconstrained system but rely on prior distributions that quantify the a priori uncertainty of unknown quantities including discharge and hydraulic equation parameters. These were obtained from literature-reported values and from a USGS data set of acoustic Doppler current profiler (ADCP) measurements at USGS stream gauges. A data set of simulated widths, slopes, and heights from 19 rivers was used to evaluate the algorithms using a set of performance metrics. Results across the 19 rivers indicate an improvement in performance of BAM over previously tested methods and highlight a path forward in solving discharge estimation using solely satellite remote sensing.
Safner, T.; Miller, M.P.; McRae, B.H.; Fortin, M.-J.; Manel, S.
2011-01-01
Recently, techniques available for identifying clusters of individuals or boundaries between clusters using genetic data from natural populations have expanded rapidly. Consequently, there is a need to evaluate these different techniques. We used spatially-explicit simulation models to compare three spatial Bayesian clustering programs and two edge detection methods. Spatially-structured populations were simulated where a continuous population was subdivided by barriers. We evaluated the ability of each method to correctly identify boundary locations while varying: (i) time after divergence, (ii) strength of isolation by distance, (iii) level of genetic diversity, and (iv) amount of gene flow across barriers. To further evaluate the methods' effectiveness to detect genetic clusters in natural populations, we used previously published data on North American pumas and a European shrub. Our results show that with simulated and empirical data, the Bayesian spatial clustering algorithms outperformed direct edge detection methods. All methods incorrectly detected boundaries in the presence of strong patterns of isolation by distance. Based on this finding, we support the application of Bayesian spatial clustering algorithms for boundary detection in empirical datasets, with necessary tests for the influence of isolation by distance. ?? 2011 by the authors; licensee MDPI, Basel, Switzerland.
Directory of Open Access Journals (Sweden)
Zhensheng Wang
2017-02-01
Full Text Available The spatial variation of geographical phenomena is a classical problem in spatial data analysis and can provide insight into underlying processes. Traditional exploratory methods mostly depend on the planar distance assumption, but many spatial phenomena are constrained to a subset of Euclidean space. In this study, we apply a method based on a hierarchical Bayesian model to analyse the spatial variation of network-constrained phenomena represented by a link attribute in conjunction with two experiments based on a simplified hypothetical network and a complex road network in Shenzhen that includes 4212 urban facility points of interest (POIs for leisure activities. Then, the methods named local indicators of network-constrained clusters (LINCS are applied to explore local spatial patterns in the given network space. The proposed method is designed for phenomena that are represented by attribute values of network links and is capable of removing part of random variability resulting from small-sample estimation. The effects of spatial dependence and the base distribution are also considered in the proposed method, which could be applied in the fields of urban planning and safety research.
Ishigami, Hideaki
2016-01-01
Relative age effect (RAE) in sports has been well documented. Recent studies investigate the effect of birthplace in addition to the RAE. The first objective of this study was to show the magnitude of the RAE in two major professional sports in Japan, baseball and soccer. Second, we examined the birthplace effect and compared its magnitude with that of the RAE. The effect sizes were estimated using a Bayesian hierarchical Poisson model with the number of players as dependent variable. The RAEs were 9.0% and 7.7% per month for soccer and baseball, respectively. These estimates imply that children born in the first month of a school year have about three times greater chance of becoming a professional player than those born in the last month of the year. Over half of the difference in likelihoods of becoming a professional player between birthplaces was accounted for by weather conditions, with the likelihood decreasing by 1% per snow day. An effect of population size was not detected in the data. By investigating different samples, we demonstrated that using quarterly data leads to underestimation and that the age range of sampled athletes should be set carefully.
Demeter, R M; Kristensen, A R; Dijkstra, J; Oude Lansink, A G J M; Meuwissen, M P M; van Arendonk, J A M
2011-12-01
Herd optimization models that determine economically optimal insemination and replacement decisions are valuable research tools to study various aspects of farming systems. The aim of this study was to develop a herd optimization and simulation model for dairy cattle. The model determines economically optimal insemination and replacement decisions for individual cows and simulates whole-herd results that follow from optimal decisions. The optimization problem was formulated as a multi-level hierarchic Markov process, and a state space model with Bayesian updating was applied to model variation in milk yield. Methodological developments were incorporated in 2 main aspects. First, we introduced an additional level to the model hierarchy to obtain a more tractable and efficient structure. Second, we included a recently developed cattle feed intake model. In addition to methodological developments, new parameters were used in the state space model and other biological functions. Results were generated for Dutch farming conditions, and outcomes were in line with actual herd performance in the Netherlands. Optimal culling decisions were sensitive to variation in milk yield but insensitive to energy requirements for maintenance and feed intake capacity. We anticipate that the model will be applied in research and extension. Copyright © 2011 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Directory of Open Access Journals (Sweden)
Ali Reza Soltanian
2016-08-01
Full Text Available Background Adolescence is one of the most important periods in the course of human evolution and the prevalence of mental disorders among adolescence in different regions of Iran, especially in southern Iran. Objectives This study was conducted to determine the prevalence of mental disorders among high school students in Bushehr province, south of Iran. Methods In this cross-sectional study, 286 high school students were recruited by a multi-stage random sampling in Bushehr province in 2015. A general health questionnaire (GHQ-28 was used to assess mental disorders. The small area method, under the hierarchical Bayesian approach, was used to determine the prevalence of mental disorders and data analysis. Results From 286 questionnaires only 182 were completely filed and evaluated (the response rate was 70.5%. Of the students, 58.79% and 41.21% were male and female, respectively. Of all students, the prevalence of mental disorders in Bushehr, Dayyer, Deylam, Kangan, Dashtestan, Tangestan, Genaveh, and Dashty were 0.48, 0.42, 0.45, 0.52, 0.41, 0.47, 0.42, and 0.43, respectively. Conclusions Based on this study, the prevalence of mental disorders among adolescents was increasing in Bushehr Province counties. The lack of a national policy in this way is a serious obstacle to mental health and wellbeing access.
Directory of Open Access Journals (Sweden)
Allen Rodrigo
2006-01-01
Full Text Available Using the structured serial coalescent with Bayesian MCMC and serial samples, we estimate population size when some demes are not sampled or are hidden, ie ghost demes. It is found that even with the presence of a ghost deme, accurate inference was possible if the parameters are estimated with the true model. However with an incorrect model, estimates were biased and can be positively misleading. We extend these results to the case where there are sequences from the ghost at the last time sample. This case can arise in HIV patients, when some tissue samples and viral sequences only become available after death. When some sequences from the ghost deme are available at the last sampling time, estimation bias is reduced and accurate estimation of parameters associated with the ghost deme is possible despite sampling bias. Migration rates for this case are also shown to be good estimates when migration values are low.
Furtado-Junior, I; Abrunhosa, F A; Holanda, F C A F; Tavares, M C S
2016-06-01
Fishing selectivity of the mangrove crab Ucides cordatus in the north coast of Brazil can be defined as the fisherman's ability to capture and select individuals from a certain size or sex (or a combination of these factors) which suggests an empirical selectivity. Considering this hypothesis, we calculated the selectivity curves for males and females crabs using the logit function of the logistic model in the formulation. The Bayesian inference consisted of obtaining the posterior distribution by applying the Markov chain Monte Carlo (MCMC) method to software R using the OpenBUGS, BRugs, and R2WinBUGS libraries. The estimated results of width average carapace selection for males and females compared with previous studies reporting the average width of the carapace of sexual maturity allow us to confirm the hypothesis that most mature individuals do not suffer from fishing pressure; thus, ensuring their sustainability.
Goodlet, Brent R.; Mills, Leah; Bales, Ben; Charpagne, Marie-Agathe; Murray, Sean P.; Lenthe, William C.; Petzold, Linda; Pollock, Tresa M.
2018-06-01
Bayesian inference is employed to precisely evaluate single crystal elastic properties of novel γ -γ ' Co- and CoNi-based superalloys from simple and non-destructive resonant ultrasound spectroscopy (RUS) measurements. Nine alloys from three Co-, CoNi-, and Ni-based alloy classes were evaluated in the fully aged condition, with one alloy per class also evaluated in the solution heat-treated condition. Comparisons are made between the elastic properties of the three alloy classes and among the alloys of a single class, with the following trends observed. A monotonic rise in the c_{44} (shear) elastic constant by a total of 12 pct is observed between the three alloy classes as Co is substituted for Ni. Elastic anisotropy ( A) is also increased, with a large majority of the nearly 13 pct increase occurring after Co becomes the dominant constituent. Together the five CoNi alloys, with Co:Ni ratios from 1:1 to 1.5:1, exhibited remarkably similar properties with an average A 1.8 pct greater than the Ni-based alloy CMSX-4. Custom code demonstrating a substantial advance over previously reported methods for RUS inversion is also reported here for the first time. CmdStan-RUS is built upon the open-source probabilistic programing language of Stan and formulates the inverse problem using Bayesian methods. Bayesian posterior distributions are efficiently computed with Hamiltonian Monte Carlo (HMC), while initial parameterization is randomly generated from weakly informative prior distributions. Remarkably robust convergence behavior is demonstrated across multiple independent HMC chains in spite of initial parameterization often very far from actual parameter values. Experimental procedures are substantially simplified by allowing any arbitrary misorientation between the specimen and crystal axes, as elastic properties and misorientation are estimated simultaneously.
Ponciano, José Miguel
2017-11-22
Using a nonparametric Bayesian approach Palacios and Minin (2013) dramatically improved the accuracy, precision of Bayesian inference of population size trajectories from gene genealogies. These authors proposed an extension of a Gaussian Process (GP) nonparametric inferential method for the intensity function of non-homogeneous Poisson processes. They found that not only the statistical properties of the estimators were improved with their method, but also, that key aspects of the demographic histories were recovered. The authors' work represents the first Bayesian nonparametric solution to this inferential problem because they specify a convenient prior belief without a particular functional form on the population trajectory. Their approach works so well and provides such a profound understanding of the biological process, that the question arises as to how truly "biology-free" their approach really is. Using well-known concepts of stochastic population dynamics, here I demonstrate that in fact, Palacios and Minin's GP model can be cast as a parametric population growth model with density dependence and environmental stochasticity. Making this link between population genetics and stochastic population dynamics modeling provides novel insights into eliciting biologically meaningful priors for the trajectory of the effective population size. The results presented here also bring novel understanding of GP as models for the evolution of a trait. Thus, the ecological principles foundation of Palacios and Minin (2013)'s prior adds to the conceptual and scientific value of these authors' inferential approach. I conclude this note by listing a series of insights brought about by this connection with Ecology. Copyright © 2017 The Author. Published by Elsevier Inc. All rights reserved.
Underwood, Kristen L.; Rizzo, Donna M.; Schroth, Andrew W.; Dewoolkar, Mandar M.
2017-12-01
Given the variable biogeochemical, physical, and hydrological processes driving fluvial sediment and nutrient export, the water science and management communities need data-driven methods to identify regions prone to production and transport under variable hydrometeorological conditions. We use Bayesian analysis to segment concentration-discharge linear regression models for total suspended solids (TSS) and particulate and dissolved phosphorus (PP, DP) using 22 years of monitoring data from 18 Lake Champlain watersheds. Bayesian inference was leveraged to estimate segmented regression model parameters and identify threshold position. The identified threshold positions demonstrated a considerable range below and above the median discharge—which has been used previously as the default breakpoint in segmented regression models to discern differences between pre and post-threshold export regimes. We then applied a Self-Organizing Map (SOM), which partitioned the watersheds into clusters of TSS, PP, and DP export regimes using watershed characteristics, as well as Bayesian regression intercepts and slopes. A SOM defined two clusters of high-flux basins, one where PP flux was predominantly episodic and hydrologically driven; and another in which the sediment and nutrient sourcing and mobilization were more bimodal, resulting from both hydrologic processes at post-threshold discharges and reactive processes (e.g., nutrient cycling or lateral/vertical exchanges of fine sediment) at prethreshold discharges. A separate DP SOM defined two high-flux clusters exhibiting a bimodal concentration-discharge response, but driven by differing land use. Our novel framework shows promise as a tool with broad management application that provides insights into landscape drivers of riverine solute and sediment export.
Bayesian inference for data assimilation using Least-Squares Finite Element methods
International Nuclear Information System (INIS)
Dwight, Richard P
2010-01-01
It has recently been observed that Least-Squares Finite Element methods (LS-FEMs) can be used to assimilate experimental data into approximations of PDEs in a natural way, as shown by Heyes et al. in the case of incompressible Navier-Stokes flow. The approach was shown to be effective without regularization terms, and can handle substantial noise in the experimental data without filtering. Of great practical importance is that - unlike other data assimilation techniques - it is not significantly more expensive than a single physical simulation. However the method as presented so far in the literature is not set in the context of an inverse problem framework, so that for example the meaning of the final result is unclear. In this paper it is shown that the method can be interpreted as finding a maximum a posteriori (MAP) estimator in a Bayesian approach to data assimilation, with normally distributed observational noise, and a Bayesian prior based on an appropriate norm of the governing equations. In this setting the method may be seen to have several desirable properties: most importantly discretization and modelling error in the simulation code does not affect the solution in limit of complete experimental information, so these errors do not have to be modelled statistically. Also the Bayesian interpretation better justifies the choice of the method, and some useful generalizations become apparent. The technique is applied to incompressible Navier-Stokes flow in a pipe with added velocity data, where its effectiveness, robustness to noise, and application to inverse problems is demonstrated.
Bayesian inference in a discrete shock model using confounded common cause data
International Nuclear Information System (INIS)
Kvam, Paul H.; Martz, Harry F.
1995-01-01
We consider redundant systems of identical components for which reliability is assessed statistically using only demand-based failures and successes. Direct assessment of system reliability can lead to gross errors in estimation if there exist external events in the working environment that cause two or more components in the system to fail in the same demand period which have not been included in the reliability model. We develop a simple Bayesian model for estimating component reliability and the corresponding probability of common cause failure in operating systems for which the data is confounded; that is, the common cause failures cannot be distinguished from multiple independent component failures in the narrative event descriptions
Bayesian inference for spatio-temporal spike-and-slab priors
DEFF Research Database (Denmark)
Andersen, Michael Riis; Vehtari, Aki; Winther, Ole
2017-01-01
a transformed Gaussian process on the spike-and-slab probabilities. An expectation propagation (EP) algorithm for posterior inference under the proposed model is derived. For large scale problems, the standard EP algorithm can be prohibitively slow. We therefore introduce three different approximation schemes...
A cost minimisation and Bayesian inference model predicts startle reflex modulation across species.
Bach, Dominik R
2015-04-07
In many species, rapid defensive reflexes are paramount to escaping acute danger. These reflexes are modulated by the state of the environment. This is exemplified in fear-potentiated startle, a more vigorous startle response during conditioned anticipation of an unrelated threatening event. Extant explanations of this phenomenon build on descriptive models of underlying psychological states, or neural processes. Yet, they fail to predict invigorated startle during reward anticipation and instructed attention, and do not explain why startle reflex modulation evolved. Here, we fill this lacuna by developing a normative cost minimisation model based on Bayesian optimality principles. This model predicts the observed pattern of startle modification by rewards, punishments, instructed attention, and several other states. Moreover, the mathematical formalism furnishes predictions that can be tested experimentally. Comparing the model with existing data suggests a specific neural implementation of the underlying computations which yields close approximations to the optimal solution under most circumstances. This analysis puts startle modification into the framework of Bayesian decision theory and predictive coding, and illustrates the importance of an adaptive perspective to interpret defensive behaviour across species. Copyright © 2015 The Author. Published by Elsevier Ltd.. All rights reserved.
Rajaona, Harizo; Septier, François; Armand, Patrick; Delignon, Yves; Olry, Christophe; Albergel, Armand; Moussafir, Jacques
2015-12-01
In the eventuality of an accidental or intentional atmospheric release, the reconstruction of the source term using measurements from a set of sensors is an important and challenging inverse problem. A rapid and accurate estimation of the source allows faster and more efficient action for first-response teams, in addition to providing better damage assessment. This paper presents a Bayesian probabilistic approach to estimate the location and the temporal emission profile of a pointwise source. The release rate is evaluated analytically by using a Gaussian assumption on its prior distribution, and is enhanced with a positivity constraint to improve the estimation. The source location is obtained by the means of an advanced iterative Monte-Carlo technique called Adaptive Multiple Importance Sampling (AMIS), which uses a recycling process at each iteration to accelerate its convergence. The proposed methodology is tested using synthetic and real concentration data in the framework of the Fusion Field Trials 2007 (FFT-07) experiment. The quality of the obtained results is comparable to those coming from the Markov Chain Monte Carlo (MCMC) algorithm, a popular Bayesian method used for source estimation. Moreover, the adaptive processing of the AMIS provides a better sampling efficiency by reusing all the generated samples.
Bayesian inference in mass flow simulations - from back calculation to prediction
Kofler, Andreas; Fischer, Jan-Thomas; Hellweger, Valentin; Huber, Andreas; Mergili, Martin; Pudasaini, Shiva; Fellin, Wolfgang; Oberguggenberger, Michael
2017-04-01
Mass flow simulations are an integral part of hazard assessment. Determining the hazard potential requires a multidisciplinary approach, including different scientific fields such as geomorphology, meteorology, physics, civil engineering and mathematics. An important task in snow avalanche simulation is to predict process intensities (runout, flow velocity and depth, ...). The application of probabilistic methods allows one to develop a comprehensive simulation concept, ranging from back to forward calculation and finally to prediction of mass flow events. In this context optimized parameter sets for the used simulation model or intensities of the modeled mass flow process (e.g. runout distances) are represented by probability distributions. Existing deterministic flow models, in particular with respect to snow avalanche dynamics, contain several parameters (e.g. friction). Some of these parameters are more conceptual than physical and their direct measurement in the field is hardly possible. Hence, parameters have to be optimized by matching simulation results to field observations. This inverse problem can be solved by a Bayesian approach (Markov chain Monte Carlo). The optimization process yields parameter distributions, that can be utilized for probabilistic reconstruction and prediction of avalanche events. Arising challenges include the limited amount of observations, correlations appearing in model parameters or observed avalanche characteristics (e.g. velocity and runout) and the accurate handling of ensemble simulations, always taking into account the related uncertainties. Here we present an operational Bayesian simulation framework with r.avaflow, the open source GIS simulation model for granular avalanches and debris flows.
Bayesian inference in genetic parameter estimation of visual scores in Nellore beef-cattle
2009-01-01
The aim of this study was to estimate the components of variance and genetic parameters for the visual scores which constitute the Morphological Evaluation System (MES), such as body structure (S), precocity (P) and musculature (M) in Nellore beef-cattle at the weaning and yearling stages, by using threshold Bayesian models. The information used for this was gleaned from visual scores of 5,407 animals evaluated at the weaning and 2,649 at the yearling stages. The genetic parameters for visual score traits were estimated through two-trait analysis, using the threshold animal model, with Bayesian statistics methodology and MTGSAM (Multiple Trait Gibbs Sampler for Animal Models) threshold software. Heritability estimates for S, P and M were 0.68, 0.65 and 0.62 (at weaning) and 0.44, 0.38 and 0.32 (at the yearling stage), respectively. Heritability estimates for S, P and M were found to be high, and so it is expected that these traits should respond favorably to direct selection. The visual scores evaluated at the weaning and yearling stages might be used in the composition of new selection indexes, as they presented sufficient genetic variability to promote genetic progress in such morphological traits. PMID:21637450
Elsheikh, Ahmed H.
2014-02-01
An efficient Bayesian calibration method based on the nested sampling (NS) algorithm and non-intrusive polynomial chaos method is presented. Nested sampling is a Bayesian sampling algorithm that builds a discrete representation of the posterior distributions by iteratively re-focusing a set of samples to high likelihood regions. NS allows representing the posterior probability density function (PDF) with a smaller number of samples and reduces the curse of dimensionality effects. The main difficulty of the NS algorithm is in the constrained sampling step which is commonly performed using a random walk Markov Chain Monte-Carlo (MCMC) algorithm. In this work, we perform a two-stage sampling using a polynomial chaos response surface to filter out rejected samples in the Markov Chain Monte-Carlo method. The combined use of nested sampling and the two-stage MCMC based on approximate response surfaces provides significant computational gains in terms of the number of simulation runs. The proposed algorithm is applied for calibration and model selection of subsurface flow models. © 2013.
Bayesian inference of nonlinear unsteady aerodynamics from aeroelastic limit cycle oscillations
Energy Technology Data Exchange (ETDEWEB)
Sandhu, Rimple [Department of Civil and Environmental Engineering, Carleton University, Ottawa, Ontario (Canada); Poirel, Dominique [Department of Mechanical and Aerospace Engineering, Royal Military College of Canada, Kingston, Ontario (Canada); Pettit, Chris [Department of Aerospace Engineering, United States Naval Academy, Annapolis, MD (United States); Khalil, Mohammad [Department of Civil and Environmental Engineering, Carleton University, Ottawa, Ontario (Canada); Sarkar, Abhijit, E-mail: abhijit.sarkar@carleton.ca [Department of Civil and Environmental Engineering, Carleton University, Ottawa, Ontario (Canada)
2016-07-01
A Bayesian model selection and parameter estimation algorithm is applied to investigate the influence of nonlinear and unsteady aerodynamic loads on the limit cycle oscillation (LCO) of a pitching airfoil in the transitional Reynolds number regime. At small angles of attack, laminar boundary layer trailing edge separation causes negative aerodynamic damping leading to the LCO. The fluid–structure interaction of the rigid, but elastically mounted, airfoil and nonlinear unsteady aerodynamics is represented by two coupled nonlinear stochastic ordinary differential equations containing uncertain parameters and model approximation errors. Several plausible aerodynamic models with increasing complexity are proposed to describe the aeroelastic system leading to LCO. The likelihood in the posterior parameter probability density function (pdf) is available semi-analytically using the extended Kalman filter for the state estimation of the coupled nonlinear structural and unsteady aerodynamic model. The posterior parameter pdf is sampled using a parallel and adaptive Markov Chain Monte Carlo (MCMC) algorithm. The posterior probability of each model is estimated using the Chib–Jeliazkov method that directly uses the posterior MCMC samples for evidence (marginal likelihood) computation. The Bayesian algorithm is validated through a numerical study and then applied to model the nonlinear unsteady aerodynamic loads using wind-tunnel test data at various Reynolds numbers.
Nieland, Simon; Kleinschmit, Birgit; Förster, Michael
2015-05-01
Ontology-based applications hold promise in improving spatial data interoperability. In this work we use remote sensing-based biodiversity information and apply semantic formalisation and ontological inference to show improvements in data interoperability/comparability. The proposed methodology includes an observation-based, "bottom-up" engineering approach for remote sensing applications and gives a practical example of semantic mediation of geospatial products. We apply the methodology to three different nomenclatures used for remote sensing-based classification of two heathland nature conservation areas in Belgium and Germany. We analysed sensor nomenclatures with respect to their semantic formalisation and their bio-geographical differences. The results indicate that a hierarchical and transparent nomenclature is far more important for transferability than the sensor or study area. The inclusion of additional information, not necessarily belonging to a vegetation class description, is a key factor for the future success of using semantics for interoperability in remote sensing.
Energy Technology Data Exchange (ETDEWEB)
Marzouk, Youssef; Fast P. (Lawrence Livermore National Laboratory, Livermore, CA); Kraus, M. (Peterson AFB, CO); Ray, J. P.
2006-01-01
Terrorist attacks using an aerosolized pathogen preparation have gained credibility as a national security concern after the anthrax attacks of 2001. The ability to characterize such attacks, i.e., to estimate the number of people infected, the time of infection, and the average dose received, is important when planning a medical response. We address this question of characterization by formulating a Bayesian inverse problem predicated on a short time-series of diagnosed patients exhibiting symptoms. To be of relevance to response planning, we limit ourselves to 3-5 days of data. In tests performed with anthrax as the pathogen, we find that these data are usually sufficient, especially if the model of the outbreak used in the inverse problem is an accurate one. In some cases the scarcity of data may initially support outbreak characterizations at odds with the true one, but with sufficient data the correct inferences are recovered; in other words, the inverse problem posed and its solution methodology are consistent. We also explore the effect of model error-situations for which the model used in the inverse problem is only a partially accurate representation of the outbreak; here, the model predictions and the observations differ by more than a random noise. We find that while there is a consistent discrepancy between the inferred and the true characterizations, they are also close enough to be of relevance when planning a response.
Sraj, Ihab
2016-08-26
The authors present a polynomial chaos (PC)-based Bayesian inference method for quantifying the uncertainties of the K-profile parameterization (KPP) within the MIT general circulation model (MITgcm) of the tropical Pacific. The inference of the uncertain parameters is based on a Markov chain Monte Carlo (MCMC) scheme that utilizes a newly formulated test statistic taking into account the different components representing the structures of turbulent mixing on both daily and seasonal time scales in addition to the data quality, and filters for the effects of parameter perturbations over those as a result of changes in the wind. To avoid the prohibitive computational cost of integrating the MITgcm model at each MCMC iteration, a surrogate model for the test statistic using the PC method is built. Because of the noise in the model predictions, a basis-pursuit-denoising (BPDN) compressed sensing approach is employed to determine the PC coefficients of a representative surrogate model. The PC surrogate is then used to evaluate the test statistic in the MCMC step for sampling the posterior of the uncertain parameters. Results of the posteriors indicate good agreement with the default values for two parameters of the KPP model, namely the critical bulk and gradient Richardson numbers; while the posteriors of the remaining parameters were barely informative. © 2016 American Meteorological Society.
Lewis, Cecil M
2010-02-01
This study examines a genome-wide dataset of 678 Short Tandem Repeat loci characterized in 444 individuals representing 29 Native American populations as well as the Tundra Netsi and Yakut populations from Siberia. Using these data, the study tests four current hypotheses regarding the hierarchical distribution of neutral genetic variation in native South American populations: (1) the western region of South America harbors more variation than the eastern region of South America, (2) Central American and western South American populations cluster exclusively, (3) populations speaking the Chibchan-Paezan and Equatorial-Tucanoan language stock emerge as a group within an otherwise South American clade, (4) Chibchan-Paezan populations in Central America emerge together at the tips of the Chibchan-Paezan cluster. This study finds that hierarchical models with the best fit place Central American populations, and populations speaking the Chibchan-Paezan language stock, at a basal position or separated from the South American group, which is more consistent with a serial founder effect into South America than that previously described. Western (Andean) South America is found to harbor similar levels of variation as eastern (Equatorial-Tucanoan and Ge-Pano-Carib) South America, which is inconsistent with an initial west coast migration into South America. Moreover, in all relevant models, the estimates of genetic diversity within geographic regions suggest a major bottleneck or founder effect occurring within the North American subcontinent, before the peopling of Central and South America. 2009 Wiley-Liss, Inc.
Burn, Robert W; Underwood, Fiona M; Blanc, Julian
2011-01-01
Elephant poaching and the ivory trade remain high on the agenda at meetings of the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES). Well-informed debates require robust estimates of trends, the spatial distribution of poaching, and drivers of poaching. We present an analysis of trends and drivers of an indicator of elephant poaching of all elephant species. The site-based monitoring system known as Monitoring the Illegal Killing of Elephants (MIKE), set up by the 10(th) Conference of the Parties of CITES in 1997, produces carcass encounter data reported mainly by anti-poaching patrols. Data analyzed were site by year totals of 6,337 carcasses from 66 sites in Africa and Asia from 2002-2009. Analysis of these observational data is a serious challenge to traditional statistical methods because of the opportunistic and non-random nature of patrols, and the heterogeneity across sites. Adopting a bayesian hierarchical modeling approach, we used the proportion of carcasses that were illegally killed (PIKE) as a poaching index, to estimate the trend and the effects of site- and country-level factors associated with poaching. Important drivers of illegal killing that emerged at country level were poor governance and low levels of human development, and at site level, forest cover and area of the site in regions where human population density is low. After a drop from 2002, PIKE remained fairly constant from 2003 until 2006, after which it increased until 2008. The results for 2009 indicate a decline. Sites with PIKE ranging from the lowest to the highest were identified. The results of the analysis provide a sound information base for scientific evidence-based decision making in the CITES process.
Directory of Open Access Journals (Sweden)
Robert W Burn
Full Text Available Elephant poaching and the ivory trade remain high on the agenda at meetings of the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES. Well-informed debates require robust estimates of trends, the spatial distribution of poaching, and drivers of poaching. We present an analysis of trends and drivers of an indicator of elephant poaching of all elephant species. The site-based monitoring system known as Monitoring the Illegal Killing of Elephants (MIKE, set up by the 10(th Conference of the Parties of CITES in 1997, produces carcass encounter data reported mainly by anti-poaching patrols. Data analyzed were site by year totals of 6,337 carcasses from 66 sites in Africa and Asia from 2002-2009. Analysis of these observational data is a serious challenge to traditional statistical methods because of the opportunistic and non-random nature of patrols, and the heterogeneity across sites. Adopting a bayesian hierarchical modeling approach, we used the proportion of carcasses that were illegally killed (PIKE as a poaching index, to estimate the trend and the effects of site- and country-level factors associated with poaching. Important drivers of illegal killing that emerged at country level were poor governance and low levels of human development, and at site level, forest cover and area of the site in regions where human population density is low. After a drop from 2002, PIKE remained fairly constant from 2003 until 2006, after which it increased until 2008. The results for 2009 indicate a decline. Sites with PIKE ranging from the lowest to the highest were identified. The results of the analysis provide a sound information base for scientific evidence-based decision making in the CITES process.
Directory of Open Access Journals (Sweden)
Jack Giovanini
Full Text Available As human demand for ecosystem products increases, management intervention may become more frequent after environmental disturbances. Evaluations of ecological responses to cumulative effects of management interventions and natural disturbances provide critical decision-support tools for managers who strive to balance environmental conservation and economic development. We conducted an experiment to evaluate the effects of salvage logging on avian community composition in lodgepole pine (Pinus contorta forests affected by beetle outbreaks in Oregon, USA, 1996-1998. Treatments consisted of the removal of lodgepole pine snags only, and live trees were not harvested. We used a bayesian hierarchical model to quantify occupancy dynamics for 27 breeding species, while accounting for variation in the detection process. We examined how magnitude and precision of treatment effects varied when incorporating prior information from a separate intervention study that occurred in a similar ecological system. Regardless of which prior we evaluated, we found no evidence that the harvest treatment had a negative impact on species richness, with an estimated average of 0.2-2.2 more species in harvested stands than unharvested stands. Estimated average similarity between control and treatment stands ranged from 0.82-0.87 (1 indicating complete similarity between a pair of stands and suggested that treatment stands did not contain novel assemblies of species responding to the harvesting prescription. Estimated treatment effects were positive for twenty-four (90% of the species, although the credible intervals contained 0 in all cases. These results suggest that, unlike most post-fire salvage logging prescriptions, selective harvesting after beetle outbreaks may meet multiple management objectives, including the maintenance of avian community richness comparable to what is found in unharvested stands. Our results provide managers with prescription alternatives to
Prudhomme, Serge
2015-01-07
The need for surrogate models and adaptive methods can be best appreciated if one is interested in parameter estimation using a Bayesian calibration procedure for validation purposes. We extend here our latest work on error decomposition and adaptive refinement for response surfaces to the development of surrogate models that can be substituted for the full models to estimate the parameters of Reynolds-averaged Navier-Stokes models. The error estimates and adaptive schemes are driven here by a quantity of interest and are thus based on the approximation of an adjoint problem. We will focus in particular to the accurate estimation of evidences to facilitate model selection. The methodology will be illustrated on the Spalart-Allmaras RANS model for turbulence simulation.
Prudhomme, Serge
2015-01-01
The need for surrogate models and adaptive methods can be best appreciated if one is interested in parameter estimation using a Bayesian calibration procedure for validation purposes. We extend here our latest work on error decomposition and adaptive refinement for response surfaces to the development of surrogate models that can be substituted for the full models to estimate the parameters of Reynolds-averaged Navier-Stokes models. The error estimates and adaptive schemes are driven here by a quantity of interest and are thus based on the approximation of an adjoint problem. We will focus in particular to the accurate estimation of evidences to facilitate model selection. The methodology will be illustrated on the Spalart-Allmaras RANS model for turbulence simulation.
Fukuda, J.; Johnson, K. M.
2009-12-01
Studies utilizing inversions of geodetic data for the spatial distribution of coseismic slip on faults typically present the result as a single fault plane and slip distribution. Commonly the geometry of the fault plane is assumed to be known a priori and the data are inverted for slip. However, sometimes there is not strong a priori information on the geometry of the fault that produced the earthquake and the data is not always strong enough to completely resolve the fault geometry. We develop a method to solve for the full posterior probability distribution of fault slip and fault geometry parameters in a Bayesian framework using Monte Carlo methods. The slip inversion problem is particularly challenging because it often involves multiple data sets with unknown relative weights (e.g. InSAR, GPS), model parameters that are related linearly (slip) and nonlinearly (fault geometry) through the theoretical model to surface observations, prior information on model parameters, and a regularization prior to stabilize the inversion. We present the theoretical framework and solution method for a Bayesian inversion that can handle all of these aspects of the problem. The method handles the mixed linear/nonlinear nature of the problem through combination of both analytical least-squares solutions and Monte Carlo methods. We first illustrate and validate the inversion scheme using synthetic data sets. We then apply the method to inversion of geodetic data from the 2003 M6.6 San Simeon, California earthquake. We show that the uncertainty in strike and dip of the fault plane is over 20 degrees. We characterize the uncertainty in the slip estimate with a volume around the mean fault solution in which the slip most likely occurred. Slip likely occurred somewhere in a volume that extends 5-10 km in either direction normal to the fault plane. We implement slip inversions with both traditional, kinematic smoothing constraints on slip and a simple physical condition of uniform stress
Wang, L.; Davis, J. L.; Tamisiea, M. E.
2017-12-01
The Antarctic ice sheet (AIS) holds about 60% of all fresh water on the Earth, an amount equivalent to about 58 m of sea-level rise. Observation of AIS mass change is thus essential in determining and predicting its contribution to sea level. While the ice mass loss estimates for West Antarctica (WA) and the Antarctic Peninsula (AP) are in good agreement, what the mass balance over East Antarctica (EA) is, and whether or not it compensates for the mass loss is under debate. Besides the different error sources and sensitivities of different measurement types, complex spatial and temporal variabilities would be another factor complicating the accurate estimation of the AIS mass balance. Therefore, a model that allows for variabilities in both melting rate and seasonal signals would seem appropriate in the estimation of present-day AIS melting. We present a stochastic filter technique, which enables the Bayesian separation of the systematic stripe noise and mass signal in decade-length GRACE monthly gravity series, and allows the estimation of time-variable seasonal and inter-annual components in the signals. One of the primary advantages of this Bayesian method is that it yields statistically rigorous uncertainty estimates reflecting the inherent spatial resolution of the data. By applying the stochastic filter to the decade-long GRACE observations, we present the temporal variabilities of the AIS mass balance at basin scale, particularly over East Antarctica, and decipher the EA mass variations in the past decade, and their role in affecting overall AIS mass balance and sea level.
Exploring the Connection Between Sampling Problems in Bayesian Inference and Statistical Mechanics
Pohorille, Andrew
2006-01-01
The Bayesian and statistical mechanical communities often share the same objective in their work - estimating and integrating probability distribution functions (pdfs) describing stochastic systems, models or processes. Frequently, these pdfs are complex functions of random variables exhibiting multiple, well separated local minima. Conventional strategies for sampling such pdfs are inefficient, sometimes leading to an apparent non-ergodic behavior. Several recently developed techniques for handling this problem have been successfully applied in statistical mechanics. In the multicanonical and Wang-Landau Monte Carlo (MC) methods, the correct pdfs are recovered from uniform sampling of the parameter space by iteratively establishing proper weighting factors connecting these distributions. Trivial generalizations allow for sampling from any chosen pdf. The closely related transition matrix method relies on estimating transition probabilities between different states. All these methods proved to generate estimates of pdfs with high statistical accuracy. In another MC technique, parallel tempering, several random walks, each corresponding to a different value of a parameter (e.g. "temperature"), are generated and occasionally exchanged using the Metropolis criterion. This method can be considered as a statistically correct version of simulated annealing. An alternative approach is to represent the set of independent variables as a Hamiltonian system. Considerab!e progress has been made in understanding how to ensure that the system obeys the equipartition theorem or, equivalently, that coupling between the variables is correctly described. Then a host of techniques developed for dynamical systems can be used. Among them, probably the most powerful is the Adaptive Biasing Force method, in which thermodynamic integration and biased sampling are combined to yield very efficient estimates of pdfs. The third class of methods deals with transitions between states described
Bayesian inference in an item response theory model with a generalized student t link function
Azevedo, Caio L. N.; Migon, Helio S.
2012-10-01
In this paper we introduce a new item response theory (IRT) model with a generalized Student t-link function with unknown degrees of freedom (df), named generalized t-link (GtL) IRT model. In this model we consider only the difficulty parameter in the item response function. GtL is an alternative to the two parameter logit and probit models, since the degrees of freedom (df) play a similar role to the discrimination parameter. However, the behavior of the curves of the GtL is different from those of the two parameter models and the usual Student t link, since in GtL the curve obtained from different df's can cross the probit curves in more than one latent trait level. The GtL model has similar proprieties to the generalized linear mixed models, such as the existence of sufficient statistics and easy parameter interpretation. Also, many techniques of parameter estimation, model fit assessment and residual analysis developed for that models can be used for the GtL model. We develop fully Bayesian estimation and model fit assessment tools through a Metropolis-Hastings step within Gibbs sampling algorithm. We consider a prior sensitivity choice concerning the degrees of freedom. The simulation study indicates that the algorithm recovers all parameters properly. In addition, some Bayesian model fit assessment tools are considered. Finally, a real data set is analyzed using our approach and other usual models. The results indicate that our model fits the data better than the two parameter models.
The SWELLS survey - VI. Hierarchical inference of the initial mass functions of bulges and discs
DEFF Research Database (Denmark)
Brewer, Brendon J.; Marshal, Philip J.; Auger, Matthew W.
2014-01-01
) and stellar masses (constrained by optical and near-infrared colours in the context of a stellar population synthesis model, up to an IMF normalization parameter). Using minimal assumptions apart from the physical constraint that the total stellar mass m* within any aperture must be less than the total mass...... mtot with in the aperture, we find that the bulges of the galaxies cannot have IMFs heavier (i.e. implying high mass per unit luminosity) than Salpeter, while the disc IMFs are not well constrained by this data set.We also discuss the necessity for hierarchical modelling when combining incomplete...... information about multiple astronomical objects. This modelling approach allows us to place upper limits on the size of any departures from universality. More data, including spatially resolved kinematics (as in Paper V) and stellar population diagnostics over a range of bulge and disc masses, are needed...
International Nuclear Information System (INIS)
Lucka, Felix
2012-01-01
Sparsity has become a key concept for solving of high-dimensional inverse problems using variational regularization techniques. Recently, using similar sparsity-constraints in the Bayesian framework for inverse problems by encoding them in the prior distribution has attracted attention. Important questions about the relation between regularization theory and Bayesian inference still need to be addressed when using sparsity promoting inversion. A practical obstacle for these examinations is the lack of fast posterior sampling algorithms for sparse, high-dimensional Bayesian inversion. Accessing the full range of Bayesian inference methods requires being able to draw samples from the posterior probability distribution in a fast and efficient way. This is usually done using Markov chain Monte Carlo (MCMC) sampling algorithms. In this paper, we develop and examine a new implementation of a single component Gibbs MCMC sampler for sparse priors relying on L1-norms. We demonstrate that the efficiency of our Gibbs sampler increases when the level of sparsity or the dimension of the unknowns is increased. This property is contrary to the properties of the most commonly applied Metropolis–Hastings (MH) sampling schemes. We demonstrate that the efficiency of MH schemes for L1-type priors dramatically decreases when the level of sparsity or the dimension of the unknowns is increased. Practically, Bayesian inversion for L1-type priors using MH samplers is not feasible at all. As this is commonly believed to be an intrinsic feature of MCMC sampling, the performance of our Gibbs sampler also challenges common beliefs about the applicability of sample based Bayesian inference. (paper)
Niwayama, Ritsuya; Nagao, Hiromichi; Kitajima, Tomoya S; Hufnagel, Lars; Shinohara, Kyosuke; Higuchi, Tomoyuki; Ishikawa, Takuji; Kimura, Akatsuki
2016-01-01
Cellular structures are hydrodynamically interconnected, such that force generation in one location can move distal structures. One example of this phenomenon is cytoplasmic streaming, whereby active forces at the cell cortex induce streaming of the entire cytoplasm. However, it is not known how the spatial distribution and magnitude of these forces move distant objects within the cell. To address this issue, we developed a computational method that used cytoplasm hydrodynamics to infer the spatial distribution of shear stress at the cell cortex induced by active force generators from experimentally obtained flow field of cytoplasmic streaming. By applying this method, we determined the shear-stress distribution that quantitatively reproduces in vivo flow fields in Caenorhabditis elegans embryos and mouse oocytes during meiosis II. Shear stress in mouse oocytes were predicted to localize to a narrower cortical region than that with a high cortical flow velocity and corresponded with the localization of the cortical actin cap. The predicted patterns of pressure gradient in both species were consistent with species-specific cytoplasmic streaming functions. The shear-stress distribution inferred by our method can contribute to the characterization of active force generation driving biological streaming.
Yildiz, Izzet B; von Kriegstein, Katharina; Kiebel, Stefan J
2013-01-01
Our knowledge about the computational mechanisms underlying human learning and recognition of sound sequences, especially speech, is still very limited. One difficulty in deciphering the exact means by which humans recognize speech is that there are scarce experimental findings at a neuronal, microscopic level. Here, we show that our neuronal-computational understanding of speech learning and recognition may be vastly improved by looking at an animal model, i.e., the songbird, which faces the same challenge as humans: to learn and decode complex auditory input, in an online fashion. Motivated by striking similarities between the human and songbird neural recognition systems at the macroscopic level, we assumed that the human brain uses the same computational principles at a microscopic level and translated a birdsong model into a novel human sound learning and recognition model with an emphasis on speech. We show that the resulting Bayesian model with a hierarchy of nonlinear dynamical systems can learn speech samples such as words rapidly and recognize them robustly, even in adverse conditions. In addition, we show that recognition can be performed even when words are spoken by different speakers and with different accents-an everyday situation in which current state-of-the-art speech recognition models often fail. The model can also be used to qualitatively explain behavioral data on human speech learning and derive predictions for future experiments.
Directory of Open Access Journals (Sweden)
Izzet B Yildiz
Full Text Available Our knowledge about the computational mechanisms underlying human learning and recognition of sound sequences, especially speech, is still very limited. One difficulty in deciphering the exact means by which humans recognize speech is that there are scarce experimental findings at a neuronal, microscopic level. Here, we show that our neuronal-computational understanding of speech learning and recognition may be vastly improved by looking at an animal model, i.e., the songbird, which faces the same challenge as humans: to learn and decode complex auditory input, in an online fashion. Motivated by striking similarities between the human and songbird neural recognition systems at the macroscopic level, we assumed that the human brain uses the same computational principles at a microscopic level and translated a birdsong model into a novel human sound learning and recognition model with an emphasis on speech. We show that the resulting Bayesian model with a hierarchy of nonlinear dynamical systems can learn speech samples such as words rapidly and recognize them robustly, even in adverse conditions. In addition, we show that recognition can be performed even when words are spoken by different speakers and with different accents-an everyday situation in which current state-of-the-art speech recognition models often fail. The model can also be used to qualitatively explain behavioral data on human speech learning and derive predictions for future experiments.
Systematic validation of non-equilibrium thermochemical models using Bayesian inference
Miki, Kenji
2015-10-01
© 2015 Elsevier Inc. The validation process proposed by Babuška et al. [1] is applied to thermochemical models describing post-shock flow conditions. In this validation approach, experimental data is involved only in the calibration of the models, and the decision process is based on quantities of interest (QoIs) predicted on scenarios that are not necessarily amenable experimentally. Moreover, uncertainties present in the experimental data, as well as those resulting from an incomplete physical model description, are propagated to the QoIs. We investigate four commonly used thermochemical models: a one-temperature model (which assumes thermal equilibrium among all inner modes), and two-temperature models developed by Macheret et al. [2], Marrone and Treanor [3], and Park [4]. Up to 16 uncertain parameters are estimated using Bayesian updating based on the latest absolute volumetric radiance data collected at the Electric Arc Shock Tube (EAST) installed inside the NASA Ames Research Center. Following the solution of the inverse problems, the forward problems are solved in order to predict the radiative heat flux, QoI, and examine the validity of these models. Our results show that all four models are invalid, but for different reasons: the one-temperature model simply fails to reproduce the data while the two-temperature models exhibit unacceptably large uncertainties in the QoI predictions.
Park, Taeyoung; Jeong, Jong-Hyeon; Lee, Jae Won
2012-08-15
There is often an interest in estimating a residual life function as a summary measure of survival data. For ease in presentation of the potential therapeutic effect of a new drug, investigators may summarize survival data in terms of the remaining life years of patients. Under heavy right censoring, however, some reasonably high quantiles (e.g., median) of a residual lifetime distribution cannot be always estimated via a popular nonparametric approach on the basis of the Kaplan-Meier estimator. To overcome the difficulties in dealing with heavily censored survival data, this paper develops a Bayesian nonparametric approach that takes advantage of a fully model-based but highly flexible probabilistic framework. We use a Dirichlet process mixture of Weibull distributions to avoid strong parametric assumptions on the unknown failure time distribution, making it possible to estimate any quantile residual life function under heavy censoring. Posterior computation through Markov chain Monte Carlo is straightforward and efficient because of conjugacy properties and partial collapse. We illustrate the proposed methods by using both simulated data and heavily censored survival data from a recent breast cancer clinical trial conducted by the National Surgical Adjuvant Breast and Bowel Project. Copyright © 2012 John Wiley & Sons, Ltd.
A framework for Bayesian nonparametric inference for causal effects of mediation.
Kim, Chanmin; Daniels, Michael J; Marcus, Bess H; Roy, Jason A
2017-06-01
We propose a Bayesian non-parametric (BNP) framework for estimating causal effects of mediation, the natural direct, and indirect, effects. The strategy is to do this in two parts. Part 1 is a flexible model (using BNP) for the observed data distribution. Part 2 is a set of uncheckable assumptions with sensitivity parameters that in conjunction with Part 1 allows identification and estimation of the causal parameters and allows for uncertainty about these assumptions via priors on the sensitivity parameters. For Part 1, we specify a Dirichlet process mixture of multivariate normals as a prior on the joint distribution of the outcome, mediator, and covariates. This approach allows us to obtain a (simple) closed form of each marginal distribution. For Part 2, we consider two sets of assumptions: (a) the standard sequential ignorability (Imai et al., 2010) and (b) weakened set of the conditional independence type assumptions introduced in Daniels et al. (2012) and propose sensitivity analyses for both. We use this approach to assess mediation in a physical activity promotion trial. © 2016, The International Biometric Society.
Baur, Brittany; Bozdag, Serdar
2015-04-01
One of the challenging and important computational problems in systems biology is to infer gene regulatory networks (GRNs) of biological systems. Several methods that exploit gene expression data have been developed to tackle this problem. In this study, we propose the use of copy number and DNA methylation data to infer GRNs. We developed an algorithm that scores regulatory interactions between genes based on canonical correlation analysis. In this algorithm, copy number or DNA methylation variables are treated as potential regulator variables, and expression variables are treated as potential target variables. We first validated that the canonical correlation analysis method is able to infer true interactions in high accuracy. We showed that the use of DNA methylation or copy number datasets leads to improved inference over steady-state expression. Our results also showed that epigenetic and structural information could be used to infer directionality of regulatory interactions. Additional improvements in GRN inference can be gleaned from incorporating the result in an informative prior in a dynamic Bayesian algorithm. This is the first study that incorporates copy number and DNA methylation into an informative prior in dynamic Bayesian framework. By closely examining top-scoring interactions with different sources of epigenetic or structural information, we also identified potential novel regulatory interactions.
Hernández, Mario R.; Francés, Félix
2015-04-01
One phase of the hydrological models implementation process, significantly contributing to the hydrological predictions uncertainty, is the calibration phase in which values of the unknown model parameters are tuned by optimizing an objective function. An unsuitable error model (e.g. Standard Least Squares or SLS) introduces noise into the estimation of the parameters. The main sources of this noise are the input errors and the hydrological model structural deficiencies. Thus, the biased calibrated parameters cause the divergence model phenomenon, where the errors variance of the (spatially and temporally) forecasted flows far exceeds the errors variance in the fitting period, and provoke the loss of part or all of the physical meaning of the modeled processes. In other words, yielding a calibrated hydrological model which works well, but not for the right reasons. Besides, an unsuitable error model yields a non-reliable predictive uncertainty assessment. Hence, with the aim of prevent all these undesirable effects, this research focuses on the Bayesian joint inference (BJI) of both the hydrological and error model parameters, considering a general additive (GA) error model that allows for correlation, non-stationarity (in variance and bias) and non-normality of model residuals. As hydrological model, it has been used a conceptual distributed model called TETIS, with a particular split structure of the effective model parameters. Bayesian inference has been performed with the aid of a Markov Chain Monte Carlo (MCMC) algorithm called Dream-ZS. MCMC algorithm quantifies the uncertainty of the hydrological and error model parameters by getting the joint posterior probability distribution, conditioned on the observed flows. The BJI methodology is a very powerful and reliable tool, but it must be used correctly this is, if non-stationarity in errors variance and bias is modeled, the Total Laws must be taken into account. The results of this research show that the
Bayesian inference for dynamic transcriptional regulation; the Hes1 system as a case study.
Heron, Elizabeth A; Finkenstädt, Bärbel; Rand, David A
2007-10-01
In this study, we address the problem of estimating the parameters of regulatory networks and provide the first application of Markov chain Monte Carlo (MCMC) methods to experimental data. As a case study, we consider a stochastic model of the Hes1 system expressed in terms of stochastic differential equations (SDEs) to which rigorous likelihood methods of inference can be applied. When fitting continuous-time stochastic models to discretely observed time series the lengths of the sampling intervals are important, and much of our study addresses the problem when the data are sparse. We estimate the parameters of an autoregulatory network providing results both for simulated and real experimental data from the Hes1 system. We develop an estimation algorithm using MCMC techniques which are flexible enough to allow for the imputation of latent data on a finer time scale and the presence of prior information about parameters which may be informed from other experiments as well as additional measurement error.
FPGA Acceleration of the phylogenetic likelihood function for Bayesian MCMC inference methods
Directory of Open Access Journals (Sweden)
Bakos Jason D
2010-04-01
Full Text Available Abstract Background Likelihood (ML-based phylogenetic inference has become a popular method for estimating the evolutionary relationships among species based on genomic sequence data. This method is used in applications such as RAxML, GARLI, MrBayes, PAML, and PAUP. The Phylogenetic Likelihood Function (PLF is an important kernel computation for this method. The PLF consists of a loop with no conditional behavior or dependencies between iterations. As such it contains a high potential for exploiting parallelism using micro-architectural techniques. In this paper, we describe a technique for mapping the PLF and supporting logic onto a Field Programmable Gate Array (FPGA-based co-processor. By leveraging the FPGA's on-chip DSP modules and the high-bandwidth local memory attached to the FPGA, the resultant co-processor can accelerate ML-based methods and outperform state-of-the-art multi-core processors. Results We use the MrBayes 3 tool as a framework for designing our co-processor. For large datasets, we estimate that our accelerated MrBayes, if run on a current-generation FPGA, achieves a 10× speedup relative to software running on a state-of-the-art server-class microprocessor. The FPGA-based implementation achieves its performance by deeply pipelining the likelihood computations, performing multiple floating-point operations in parallel, and through a natural log approximation that is chosen specifically to leverage a deeply pipelined custom architecture. Conclusions Heterogeneous computing, which combines general-purpose processors with special-purpose co-processors such as FPGAs and GPUs, is a promising approach for high-performance phylogeny inference as shown by the growing body of literature in this field. FPGAs in particular are well-suited for this task because of their low power consumption as compared to many-core processors and Graphics Processor Units (GPUs 1.
Rapid molecular evolution of human bocavirus revealed by Bayesian coalescent inference.
Zehender, Gianguglielmo; De Maddalena, Chiara; Canuti, Marta; Zappa, Alessandra; Amendola, Antonella; Lai, Alessia; Galli, Massimo; Tanzi, Elisabetta
2010-03-01
Human bocavirus (HBoV) is a linear single-stranded DNA virus belonging to the Parvoviridae family that has recently been isolated from the upper respiratory tract of children with acute respiratory infection. All of the strains observed so far segregate into two genotypes (1 and 2) with a low level of polymorphism. Given the recent description of the infection and the lack of epidemiological and molecular data, we estimated the virus's rates of molecular evolution and population dynamics. A dataset of forty-nine dated VP2 sequences, including also eight new isolates obtained from pharyngeal swabs of Italian patients with acute respiratory tract infections, was submitted to phylogenetic analysis. The model parameters, evolutionary rates and population dynamics were co-estimated using a Bayesian Markov Chain Monte Carlo approach, and site-specific positive and negative selection was also investigated. Recombination was investigated by seven different methods and one suspected recombinant strain was excluded from further analysis. The estimated mean evolutionary rate of HBoV was 8.6x10(-4)subs/site/year, and that of the 1st+2nd codon positions was more than 15 times less than that of the 3rd codon position. Viral population dynamics analysis revealed that the two known genotypes diverged recently (mean tMRCA: 24 years), and that the epidemic due to HBoV genotype 2 grew exponentially at a rate of 1.01year(-1). Selection analysis of the partial VP2 showed that 8.5% of sites were under significant negative pressure and the absence of positive selection. Our results show that, like other parvoviruses, HBoV is characterised by a rapid evolution. The low level of polymorphism is probably due to a relatively recent divergence between the circulating genotypes and strong purifying selection acting on viral antigens.
Directory of Open Access Journals (Sweden)
Pablo Fresia
Full Text Available Insect pest phylogeography might be shaped both by biogeographic events and by human influence. Here, we conducted an approximate Bayesian computation (ABC analysis to investigate the phylogeography of the New World screwworm fly, Cochliomyia hominivorax, with the aim of understanding its population history and its order and time of divergence. Our ABC analysis supports that populations spread from North to South in the Americas, in at least two different moments. The first split occurred between the North/Central American and South American populations in the end of the Last Glacial Maximum (15,300-19,000 YBP. The second split occurred between the North and South Amazonian populations in the transition between the Pleistocene and the Holocene eras (9,100-11,000 YBP. The species also experienced population expansion. Phylogenetic analysis likewise suggests this north to south colonization and Maxent models suggest an increase in the number of suitable areas in South America from the past to present. We found that the phylogeographic patterns observed in C. hominivorax cannot be explained only by climatic oscillations and can be connected to host population histories. Interestingly we found these patterns are very coincident with general patterns of ancient human movements in the Americas, suggesting that humans might have played a crucial role in shaping the distribution and population structure of this insect pest. This work presents the first hypothesis test regarding the processes that shaped the current phylogeographic structure of C. hominivorax and represents an alternate perspective on investigating the problem of insect pests.
Final Report: Large-Scale Optimization for Bayesian Inference in Complex Systems
Energy Technology Data Exchange (ETDEWEB)
Ghattas, Omar [The University of Texas at Austin
2013-10-15
The SAGUARO (Scalable Algorithms for Groundwater Uncertainty Analysis and Robust Optimiza- tion) Project focuses on the development of scalable numerical algorithms for large-scale Bayesian inversion in complex systems that capitalize on advances in large-scale simulation-based optimiza- tion and inversion methods. Our research is directed in three complementary areas: efficient approximations of the Hessian operator, reductions in complexity of forward simulations via stochastic spectral approximations and model reduction, and employing large-scale optimization concepts to accelerate sampling. Our efforts are integrated in the context of a challenging testbed problem that considers subsurface reacting flow and transport. The MIT component of the SAGUARO Project addresses the intractability of conventional sampling methods for large-scale statistical inverse problems by devising reduced-order models that are faithful to the full-order model over a wide range of parameter values; sampling then employs the reduced model rather than the full model, resulting in very large computational savings. Results indicate little effect on the computed posterior distribution. On the other hand, in the Texas-Georgia Tech component of the project, we retain the full-order model, but exploit inverse problem structure (adjoint-based gradients and partial Hessian information of the parameter-to- observation map) to implicitly extract lower dimensional information on the posterior distribution; this greatly speeds up sampling methods, so that fewer sampling points are needed. We can think of these two approaches as "reduce then sample" and "sample then reduce." In fact, these two approaches are complementary, and can be used in conjunction with each other. Moreover, they both exploit deterministic inverse problem structure, in the form of adjoint-based gradient and Hessian information of the underlying parameter-to-observation map, to achieve their speedups.
Davis, A. D.; Huan, X.; Heimbach, P.; Marzouk, Y.
2017-12-01
Borehole data are essential for calibrating ice sheet models. However, field expeditions for acquiring borehole data are often time-consuming, expensive, and dangerous. It is thus essential to plan the best sampling locations that maximize the value of data while minimizing costs and risks. We present an uncertainty quantification (UQ) workflow based on rigorous probability framework to achieve these objectives. First, we employ an optimal experimental design (OED) procedure to compute borehole locations that yield the highest expected information gain. We take into account practical considerations of location accessibility (e.g., proximity to research sites, terrain, and ice velocity may affect feasibility of drilling) and robustness (e.g., real-time constraints such as weather may force researchers to drill at sub-optimal locations near those originally planned), by incorporating a penalty reflecting accessibility as well as sensitivity to deviations from the optimal locations. Next, we extract vertical temperature profiles from these boreholes and formulate a Bayesian inverse problem to reconstruct past surface temperatures. Using a model of temperature advection/diffusion, the top boundary condition (corresponding to surface temperatures) is calibrated via efficient Markov chain Monte Carlo (MCMC). The overall procedure can then be iterated to choose new optimal borehole locations for the next expeditions.Through this work, we demonstrate powerful UQ methods for designing experiments, calibrating models, making predictions, and assessing sensitivity--all performed under an uncertain environment. We develop a theoretical framework as well as practical software within an intuitive workflow, and illustrate their usefulness for combining data and models for environmental and climate research.
Bayesian data analysis in population ecology: motivations, methods, and benefits
Dorazio, Robert
2016-01-01
During the 20th century ecologists largely relied on the frequentist system of inference for the analysis of their data. However, in the past few decades ecologists have become increasingly interested in the use of Bayesian methods of data analysis. In this article I provide guidance to ecologists who would like to decide whether Bayesian methods can be used to improve their conclusions and predictions. I begin by providing a concise summary of Bayesian methods of analysis, including a comparison of differences between Bayesian and frequentist approaches to inference when using hierarchical models. Next I provide a list of problems where Bayesian methods of analysis may arguably be preferred over frequentist methods. These problems are usually encountered in analyses based on hierarchical models of data. I describe the essentials required for applying modern methods of Bayesian computation, and I use real-world examples to illustrate these methods. I conclude by summarizing what I perceive to be the main strengths and weaknesses of using Bayesian methods to solve ecological inference problems.
Applied Bayesian hierarchical methods
National Research Council Canada - National Science Library
Congdon, P
2010-01-01
.... It also incorporates BayesX code, which is particularly useful in nonlinear regression. To demonstrate MCMC sampling from first principles, the author includes worked examples using the R package...
Energy Technology Data Exchange (ETDEWEB)
Weyant, Anja; Wood-Vasey, W. Michael [Pittsburgh Particle Physics, Astrophysics, and Cosmology Center (PITT PACC), Physics and Astronomy Department, University of Pittsburgh, Pittsburgh, PA 15260 (United States); Schafer, Chad, E-mail: anw19@pitt.edu [Department of Statistics, Carnegie Mellon University, Pittsburgh, PA 15213 (United States)
2013-02-20
Cosmological inference becomes increasingly difficult when complex data-generating processes cannot be modeled by simple probability distributions. With the ever-increasing size of data sets in cosmology, there is an increasing burden placed on adequate modeling; systematic errors in the model will dominate where previously these were swamped by statistical errors. For example, Gaussian distributions are an insufficient representation for errors in quantities like photometric redshifts. Likewise, it can be difficult to quantify analytically the distribution of errors that are introduced in complex fitting codes. Without a simple form for these distributions, it becomes difficult to accurately construct a likelihood function for the data as a function of parameters of interest. Approximate Bayesian computation (ABC) provides a means of probing the posterior distribution when direct calculation of a sufficiently accurate likelihood is intractable. ABC allows one to bypass direct calculation of the likelihood but instead relies upon the ability to simulate the forward process that generated the data. These simulations can naturally incorporate priors placed on nuisance parameters, and hence these can be marginalized in a natural way. We present and discuss ABC methods in the context of supernova cosmology using data from the SDSS-II Supernova Survey. Assuming a flat cosmology and constant dark energy equation of state, we demonstrate that ABC can recover an accurate posterior distribution. Finally, we show that ABC can still produce an accurate posterior distribution when we contaminate the sample with Type IIP supernovae.
International Nuclear Information System (INIS)
Weyant, Anja; Wood-Vasey, W. Michael; Schafer, Chad
2013-01-01
Cosmological inference becomes increasingly difficult when complex data-generating processes cannot be modeled by simple probability distributions. With the ever-increasing size of data sets in cosmology, there is an increasing burden placed on adequate modeling; systematic errors in the model will dominate where previously these were swamped by statistical errors. For example, Gaussian distributions are an insufficient representation for errors in quantities like photometric redshifts. Likewise, it can be difficult to quantify analytically the distribution of errors that are introduced in complex fitting codes. Without a simple form for these distributions, it becomes difficult to accurately construct a likelihood function for the data as a function of parameters of interest. Approximate Bayesian computation (ABC) provides a means of probing the posterior distribution when direct calculation of a sufficiently accurate likelihood is intractable. ABC allows one to bypass direct calculation of the likelihood but instead relies upon the ability to simulate the forward process that generated the data. These simulations can naturally incorporate priors placed on nuisance parameters, and hence these can be marginalized in a natural way. We present and discuss ABC methods in the context of supernova cosmology using data from the SDSS-II Supernova Survey. Assuming a flat cosmology and constant dark energy equation of state, we demonstrate that ABC can recover an accurate posterior distribution. Finally, we show that ABC can still produce an accurate posterior distribution when we contaminate the sample with Type IIP supernovae.
International Nuclear Information System (INIS)
Karras, D A; Mertzios, G B
2009-01-01
A novel approach is presented in this paper for improving anisotropic diffusion PDE models, based on the Perona–Malik equation. A solution is proposed from an engineering perspective to adaptively estimate the parameters of the regularizing function in this equation. The goal of such a new adaptive diffusion scheme is to better preserve edges when the anisotropic diffusion PDE models are applied to image enhancement tasks. The proposed adaptive parameter estimation in the anisotropic diffusion PDE model involves self-organizing maps and Bayesian inference to define edge probabilities accurately. The proposed modifications attempt to capture not only simple edges but also difficult textural edges and incorporate their probability in the anisotropic diffusion model. In the context of the application of PDE models to image processing such adaptive schemes are closely related to the discrete image representation problem and the investigation of more suitable discretization algorithms using constraints derived from image processing theory. The proposed adaptive anisotropic diffusion model illustrates these concepts when it is numerically approximated by various discretization schemes in a database of magnetic resonance images (MRI), where it is shown to be efficient in image filtering and restoration applications
Xing, Junliang; Ai, Haizhou; Liu, Liwei; Lao, Shihong
2011-06-01
Multiple object tracking (MOT) is a very challenging task yet of fundamental importance for many practical applications. In this paper, we focus on the problem of tracking multiple players in sports video which is even more difficult due to the abrupt movements of players and their complex interactions. To handle the difficulties in this problem, we present a new MOT algorithm which contributes both in the observation modeling level and in the tracking strategy level. For the observation modeling, we develop a progressive observation modeling process that is able to provide strong tracking observations and greatly facilitate the tracking task. For the tracking strategy, we propose a dual-mode two-way Bayesian inference approach which dynamically switches between an offline general model and an online dedicated model to deal with single isolated object tracking and multiple occluded object tracking integrally by forward filtering and backward smoothing. Extensive experiments on different kinds of sports videos, including football, basketball, as well as hockey, demonstrate the effectiveness and efficiency of the proposed method.
Höhna, Sebastian; May, Michael R; Moore, Brian R
2016-03-01
Many fundamental questions in evolutionary biology entail estimating rates of lineage diversification (speciation-extinction) that are modeled using birth-death branching processes. We leverage recent advances in branching-process theory to develop a flexible Bayesian framework for specifying diversification models-where rates are constant, vary continuously, or change episodically through time-and implement numerical methods to estimate parameters of these models from molecular phylogenies, even when species sampling is incomplete. We enable both statistical inference and efficient simulation under these models. We also provide robust methods for comparing the relative and absolute fit of competing branching-process models to a given tree, thereby providing rigorous tests of biological hypotheses regarding patterns and processes of lineage diversification. The source code for TESS is freely available at http://cran.r-project.org/web/packages/TESS/ CONTACT: Sebastian.Hoehna@gmail.com. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
von Hansen, Yann; Mehlich, Alexander; Pelz, Benjamin; Rief, Matthias; Netz, Roland R
2012-09-01
The thermal fluctuations of micron-sized beads in dual trap optical tweezer experiments contain complete dynamic information about the viscoelastic properties of the embedding medium and-if present-macromolecular constructs connecting the two beads. To quantitatively interpret the spectral properties of the measured signals, a detailed understanding of the instrumental characteristics is required. To this end, we present a theoretical description of the signal processing in a typical dual trap optical tweezer experiment accounting for polarization crosstalk and instrumental noise and discuss the effect of finite statistics. To infer the unknown parameters from experimental data, a maximum likelihood method based on the statistical properties of the stochastic signals is derived. In a first step, the method can be used for calibration purposes: We propose a scheme involving three consecutive measurements (both traps empty, first one occupied and second empty, and vice versa), by which all instrumental and physical parameters of the setup are determined. We test our approach for a simple model system, namely a pair of unconnected, but hydrodynamically interacting spheres. The comparison to theoretical predictions based on instantaneous as well as retarded hydrodynamics emphasizes the importance of hydrodynamic retardation effects due to vorticity diffusion in the fluid. For more complex experimental scenarios, where macromolecular constructs are tethered between the two beads, the same maximum likelihood method in conjunction with dynamic deconvolution theory will in a second step allow one to determine the viscoelastic properties of the tethered element connecting the two beads.
Native South American genetic structure and prehistory inferred from hierarchical modeling of mtDNA.
Lewis, Cecil M; Long, Jeffrey C
2008-03-01
Genetic diversity in Native South Americans forms a complex pattern at both the continental and local levels. In comparing the West to the East, there is more variation within groups and smaller genetic distances between groups. From this pattern, researchers have proposed that there is more variation in the West and that a larger, more genetically diverse, founding population entered the West than the East. Here, we question this characterization of South American genetic variation and its interpretation. Our concern arises because others have inferred regional variation from the mean variation within local populations without taking into account the variation among local populations within the same region. This failure produces a biased view of the actual variation in the East. In this study, we analyze the mitochondrial DNA sequence between positions 16040 and 16322 of the Cambridge reference sequence. Our sample represents a total of 886 people from 27 indigenous populations from South (22), Central (3), and North America (2). The basic unit of our analyses is nucleotide identity by descent, which is easily modeled and proportional to nucleotide diversity. We use a forward modeling strategy to fit a series of nested models to identity by descent within and between all pairs of local populations. This method provides estimates of identity by descent at different levels of population hierarchy without assuming homogeneity within populations, regions, or continents. Our main discovery is that Eastern South America harbors more genetic variation than has been recognized. We find no evidence that there is increased identity by descent in the East relative to the total for South America. By contrast, we discovered that populations in the Western region, as a group, harbor more identity by descent than has been previously recognized, despite the fact that average identity by descent within groups is lower. In this light, there is no need to postulate separate founding
Liu, Fang; Eugenio, Evercita C
2018-04-01
Beta regression is an increasingly popular statistical technique in medical research for modeling of outcomes that assume values in (0, 1), such as proportions and patient reported outcomes. When outcomes take values in the intervals [0,1), (0,1], or [0,1], zero-or-one-inflated beta (zoib) regression can be used. We provide a thorough review on beta regression and zoib regression in the modeling, inferential, and computational aspects via the likelihood-based and Bayesian approaches. We demonstrate the statistical and practical importance of correctly modeling the inflation at zero/one rather than ad hoc replacing them with values close to zero/one via simulation studies; the latter approach can lead to biased estimates and invalid inferences. We show via simulation studies that the likelihood-based approach is computationally faster in general than MCMC algorithms used in the Bayesian inferences, but runs the risk of non-convergence, large biases, and sensitivity to starting values in the optimization algorithm especially with clustered/correlated data, data with sparse inflation at zero and one, and data that warrant regularization of the likelihood. The disadvantages of the regular likelihood-based approach make the Bayesian approach an attractive alternative in these cases. Software packages and tools for fitting beta and zoib regressions in both the likelihood-based and Bayesian frameworks are also reviewed.
DEFF Research Database (Denmark)
Møller, Jesper
(This text written by Jesper Møller, Aalborg University, is submitted for the collection ‘Stochastic Geometry: Highlights, Interactions and New Perspectives', edited by Wilfrid S. Kendall and Ilya Molchanov, to be published by ClarendonPress, Oxford, and planned to appear as Section 4.1 with the ......(This text written by Jesper Møller, Aalborg University, is submitted for the collection ‘Stochastic Geometry: Highlights, Interactions and New Perspectives', edited by Wilfrid S. Kendall and Ilya Molchanov, to be published by ClarendonPress, Oxford, and planned to appear as Section 4.......1 with the title ‘Inference'.) This contribution concerns statistical inference for parametric models used in stochastic geometry and based on quick and simple simulation free procedures as well as more comprehensive methods using Markov chain Monte Carlo (MCMC) simulations. Due to space limitations the focus...
International Nuclear Information System (INIS)
Wu Wei; Biber, Patrick D; Peterson, Mark S; Gong Chongfeng
2012-01-01
To study the impact of the Deepwater Horizon oil spill on photosynthesis of coastal salt marsh plants in Mississippi, we developed a hierarchical Bayesian (HB) model based on field measurements collected from July 2010 to November 2011. We sampled three locations in Davis Bayou, Mississippi (30.375°N, 88.790°W) representative of a range of oil spill impacts. Measured photosynthesis was negative (respiration only) at the heavily oiled location in July 2010 only, and rates started to increase by August 2010. Photosynthesis at the medium oiling location was lower than at the control location in July 2010 and it continued to decrease in September 2010. During winter 2010–2011, the contrast between the control and the two impacted locations was not as obvious as in the growing season of 2010. Photosynthesis increased through spring 2011 at the three locations and decreased starting with October at the control location and a month earlier (September) at the impacted locations. Using the field data, we developed an HB model. The model simulations agreed well with the measured photosynthesis, capturing most of the variability of the measured data. On the basis of the posteriors of the parameters, we found that air temperature and photosynthetic active radiation positively influenced photosynthesis whereas the leaf stress level negatively affected photosynthesis. The photosynthesis rates at the heavily impacted location had recovered to the status of the control location about 140 days after the initial impact, while the impact at the medium impact location was never severe enough to make photosynthesis significantly lower than that at the control location over the study period. The uncertainty in modeling photosynthesis rates mainly came from the individual and micro-site scales, and to a lesser extent from the leaf scale. (letter)
Kim, Daesang
2016-01-06
A new Bayesian inference method has been developed and applied to Furan shock tube experimental data for efficient statistical inferences of the Arrhenius parameters of two OH radical consumption reactions. The collected experimental data, which consist of time series signals of OH radical concentrations of 14 shock tube experiments, may require several days for MCMC computations even with the support of a fast surrogate of the combustion simulation model, while the new method reduces it to several hours by splitting the process into two steps of MCMC: the first inference of rate constants and the second inference of the Arrhenius parameters. Each step has low dimensional parameter spaces and the second step does not need the executions of the combustion simulation. Furthermore, the new approach has more flexibility in choosing the ranges of the inference parameters, and the higher speed and flexibility enable the more accurate inferences and the analyses of the propagation of errors in the measured temperatures and the alignment of the experimental time to the inference results.
DEFF Research Database (Denmark)
Choi, Sang-Hyeon; Lee, Ikjin; Jeong, Cheol-Ho
2016-01-01
Sabine absorption coefficient is a widely used one deduced from reverberation time measurements via the Sabine equation. First- and second-level Bayesian analysis are used to estimate the flow resistivity of a sound absorber and the influences of the test chambers from Sabine absorption...... coefficients measured in 13 different reverberation chambers. The first-level Bayesian analysis is more general than the second-level Bayesian analysis. Sharper posterior distribution can be acquired by the second-level Bayesian analysis than the one by the first-level Bayesian analysis because more data...... are used to set more reliable prior distribution. The estimated room’s influences by the first- and the second-level Bayesian analyses are similar to the estimated results by the mean absolute error minimization....
Directory of Open Access Journals (Sweden)
Simon Boitard
2016-03-01
Full Text Available Inferring the ancestral dynamics of effective population size is a long-standing question in population genetics, which can now be tackled much more accurately thanks to the massive genomic data available in many species. Several promising methods that take advantage of whole-genome sequences have been recently developed in this context. However, they can only be applied to rather small samples, which limits their ability to estimate recent population size history. Besides, they can be very sensitive to sequencing or phasing errors. Here we introduce a new approximate Bayesian computation approach named PopSizeABC that allows estimating the evolution of the effective population size through time, using a large sample of complete genomes. This sample is summarized using the folded allele frequency spectrum and the average zygotic linkage disequilibrium at different bins of physical distance, two classes of statistics that are widely used in population genetics and can be easily computed from unphased and unpolarized SNP data. Our approach provides accurate estimations of past population sizes, from the very first generations before present back to the expected time to the most recent common ancestor of the sample, as shown by simulations under a wide range of demographic scenarios. When applied to samples of 15 or 25 complete genomes in four cattle breeds (Angus, Fleckvieh, Holstein and Jersey, PopSizeABC revealed a series of population declines, related to historical events such as domestication or modern breed creation. We further highlight that our approach is robust to sequencing errors, provided summary statistics are computed from SNPs with common alleles.
Gogoshin, Grigoriy; Boerwinkle, Eric; Rodin, Andrei S
2017-04-01
Bayesian network (BN) reconstruction is a prototypical systems biology data analysis approach that has been successfully used to reverse engineer and model networks reflecting different layers of biological organization (ranging from genetic to epigenetic to cellular pathway to metabolomic). It is especially relevant in the context of modern (ongoing and prospective) studies that generate heterogeneous high-throughput omics datasets. However, there are both theoretical and practical obstacles to the seamless application of BN modeling to such big data, including computational inefficiency of optimal BN structure search algorithms, ambiguity in data discretization, mixing data types, imputation and validation, and, in general, limited scalability in both reconstruction and visualization of BNs. To overcome these and other obstacles, we present BNOmics, an improved algorithm and software toolkit for inferring and analyzing BNs from omics datasets. BNOmics aims at comprehensive systems biology-type data exploration, including both generating new biological hypothesis and testing and validating the existing ones. Novel aspects of the algorithm center around increasing scalability and applicability to varying data types (with different explicit and implicit distributional assumptions) within the same analysis framework. An output and visualization interface to widely available graph-rendering software is also included. Three diverse applications are detailed. BNOmics was originally developed in the context of genetic epidemiology data and is being continuously optimized to keep pace with the ever-increasing inflow of available large-scale omics datasets. As such, the software scalability and usability on the less than exotic computer hardware are a priority, as well as the applicability of the algorithm and software to the heterogeneous datasets containing many data types-single-nucleotide polymorphisms and other genetic/epigenetic/transcriptome variables, metabolite
Introduction to Bayesian statistics
Bolstad, William M
2017-01-01
There is a strong upsurge in the use of Bayesian methods in applied statistical analysis, yet most introductory statistics texts only present frequentist methods. Bayesian statistics has many important advantages that students should learn about if they are going into fields where statistics will be used. In this Third Edition, four newly-added chapters address topics that reflect the rapid advances in the field of Bayesian staistics. The author continues to provide a Bayesian treatment of introductory statistical topics, such as scientific data gathering, discrete random variables, robust Bayesian methods, and Bayesian approaches to inferenfe cfor discrete random variables, bionomial proprotion, Poisson, normal mean, and simple linear regression. In addition, newly-developing topics in the field are presented in four new chapters: Bayesian inference with unknown mean and variance; Bayesian inference for Multivariate Normal mean vector; Bayesian inference for Multiple Linear RegressionModel; and Computati...
Statistical inference an integrated approach
Migon, Helio S; Louzada, Francisco
2014-01-01
Introduction Information The concept of probability Assessing subjective probabilities An example Linear algebra and probability Notation Outline of the bookElements of Inference Common statistical modelsLikelihood-based functions Bayes theorem Exchangeability Sufficiency and exponential family Parameter elimination Prior Distribution Entirely subjective specification Specification through functional forms Conjugacy with the exponential family Non-informative priors Hierarchical priors Estimation Introduction to decision theoryBayesian point estimation Classical point estimation Empirical Bayes estimation Comparison of estimators Interval estimation Estimation in the Normal model Approximating Methods The general problem of inference Optimization techniquesAsymptotic theory Other analytical approximations Numerical integration methods Simulation methods Hypothesis Testing Introduction Classical hypothesis testingBayesian hypothesis testing Hypothesis testing and confidence intervalsAsymptotic tests Prediction...
Hao, Haijing
2013-01-01
Information technology adoption and diffusion is currently a significant challenge in the healthcare delivery setting. This thesis includes three papers that explore social influence on information technology adoption and sustained use in the healthcare delivery environment using conventional regression models and novel hierarchical Bayesian…
Directory of Open Access Journals (Sweden)
Wills Rachael A
2009-05-01
Full Text Available Abstract Background The problem of silent multiple comparisons is one of the most difficult statistical problems faced by scientists. It is a particular problem for investigating a one-off cancer cluster reported to a health department because any one of hundreds, or possibly thousands, of neighbourhoods, schools, or workplaces could have reported a cluster, which could have been for any one of several types of cancer or any one of several time periods. Methods This paper contrasts the frequentist approach with a Bayesian approach for dealing with silent multiple comparisons in the context of a one-off cluster reported to a health department. Two published cluster investigations were re-analysed using the Dunn-Sidak method to adjust frequentist p-values and confidence intervals for silent multiple comparisons. Bayesian methods were based on the Gamma distribution. Results Bayesian analysis with non-informative priors produced results similar to the frequentist analysis, and suggested that both clusters represented a statistical excess. In the frequentist framework, the statistical significance of both clusters was extremely sensitive to the number of silent multiple comparisons, which can only ever be a subjective "guesstimate". The Bayesian approach is also subjective: whether there is an apparent statistical excess depends on the specified prior. Conclusion In cluster investigations, the frequentist approach is just as subjective as the Bayesian approach, but the Bayesian approach is less ambitious in that it treats the analysis as a synthesis of data and personal judgements (possibly poor ones, rather than objective reality. Bayesian analysis is (arguably a useful tool to support complicated decision-making, because it makes the uncertainty associated with silent multiple comparisons explicit.
International Nuclear Information System (INIS)
Davis, Sergio; Gutiérrez, Gonzalo
2013-01-01
We present a systematic implementation of the recently developed Z-method for computing melting points of solids, augmented by a Bayesian analysis of the data obtained from molecular dynamics simulations. The use of Bayesian inference allows us to extract valuable information from limited data, reducing the computational cost of drawing the isochoric curve. From this Bayesian Z-method we obtain posterior distributions for the melting temperature T m , the critical superheating temperature T LS and the slopes dT/dE of the liquid and solid phases. The method therefore gives full quantification of the errors in the prediction of the melting point. This procedure is applied to the estimation of the melting point of Ti 2 GaN (one of the so-called MAX phases), a complex, laminar material, by density functional theory molecular dynamics, finding an estimate T m of 2591.61 ± 89.61 K, which is in good agreement with melting points of similar ceramics. (paper)
Caticha, Ariel
2011-03-01
In this tutorial we review the essential arguments behing entropic inference. We focus on the epistemological notion of information and its relation to the Bayesian beliefs of rational agents. The problem of updating from a prior to a posterior probability distribution is tackled through an eliminative induction process that singles out the logarithmic relative entropy as the unique tool for inference. The resulting method of Maximum relative Entropy (ME), includes as special cases both MaxEnt and Bayes' rule, and therefore unifies the two themes of these workshops—the Maximum Entropy and the Bayesian methods—into a single general inference scheme.
Modelling dependable systems using hybrid Bayesian networks
International Nuclear Information System (INIS)
Neil, Martin; Tailor, Manesh; Marquez, David; Fenton, Norman; Hearty, Peter
2008-01-01
A hybrid Bayesian network (BN) is one that incorporates both discrete and continuous nodes. In our extensive applications of BNs for system dependability assessment, the models are invariably hybrid and the need for efficient and accurate computation is paramount. We apply a new iterative algorithm that efficiently combines dynamic discretisation with robust propagation algorithms on junction tree structures to perform inference in hybrid BNs. We illustrate its use in the field of dependability with two example of reliability estimation. Firstly we estimate the reliability of a simple single system and next we implement a hierarchical Bayesian model. In the hierarchical model we compute the reliability of two unknown subsystems from data collected on historically similar subsystems and then input the result into a reliability block model to compute system level reliability. We conclude that dynamic discretisation can be used as an alternative to analytical or Monte Carlo methods with high precision and can be applied to a wide range of dependability problems
Ross, Cody T; Winterhalder, Bruce
2016-01-01
We conduct a revaluation of the Thornhill and Fincher research project on parasites using finely-resolved geographic data on parasite prevalence, individual-level sociocultural data, and multilevel Bayesian modeling. In contrast to the evolutionary psychological mechanisms linking parasites to human behavior and cultural characteristics proposed by Thornhill and Fincher, we offer an alternative hypothesis that structural racism and differential access to sanitation systems drive both variation in parasite prevalence and differential behaviors and cultural characteristics. We adopt a Bayesian framework to estimate parasite prevalence rates in 51 districts in eight Latin American countries using the disease status of 170,220 individuals tested for infection with the intestinal roundworm Ascaris lumbricoides (Hürlimann et al., []: PLoS Negl Trop Dis 5:e1404). We then use district-level estimates of parasite prevalence and individual-level social data from 5,558 individuals in the same 51 districts (Latinobarómetro, 2008) to assess claims of causal associations between parasite prevalence and sociocultural characteristics. We find, contrary to Thornhill and Fincher, that parasite prevalence is positively associated with preferences for democracy, negatively associated with preferences for collectivism, and not associated with violent crime rates or gender inequality. A positive association between parasite prevalence and religiosity, as in Fincher and Thornhill (: Behav Brain Sci 35:61-79), and a negative association between parasite prevalence and achieved education, as predicted by Eppig et al. (: Proc R S B: Biol Sci 277:3801-3808), become negative and unreliable when reasonable controls are included in the model. We find support for all predictions derived from our hypothesis linking structural racism to both parasite prevalence and cultural outcomes. We conclude that best practices in biocultural modeling require examining more than one hypothesis, retaining
Gerbino, Martina; Lattanzi, Massimiliano; Mena, Olga; Freese, Katherine
2017-12-01
We present a novel approach to derive constraints on neutrino masses, as well as on other cosmological parameters, from cosmological data, while taking into account our ignorance of the neutrino mass ordering. We derive constraints from a combination of current as well as future cosmological datasets on the total neutrino mass Mν and on the mass fractions fν,i =mi /Mν (where the index i = 1 , 2 , 3 indicates the three mass eigenstates) carried by each of the mass eigenstates mi, after marginalizing over the (unknown) neutrino mass ordering, either normal ordering (NH) or inverted ordering (IH). The bounds on all the cosmological parameters, including those on the total neutrino mass, take therefore into account the uncertainty related to our ignorance of the mass hierarchy that is actually realized in nature. This novel approach is carried out in the framework of Bayesian analysis of a typical hierarchical problem, where the distribution of the parameters of the model depends on further parameters, the hyperparameters. In this context, the choice of the neutrino mass ordering is modeled via the discrete hyperparameterhtype, which we introduce in the usual Markov chain analysis. The preference from cosmological data for either the NH or the IH scenarios is then simply encoded in the posterior distribution of the hyperparameter itself. Current cosmic microwave background (CMB) measurements assign equal odds to the two hierarchies, and are thus unable to distinguish between them. However, after the addition of baryon acoustic oscillation (BAO) measurements, a weak preference for the normal hierarchical scenario appears, with odds of 4 : 3 from Planck temperature and large-scale polarization in combination with BAO (3 : 2 if small-scale polarization is also included). Concerning next-generation cosmological experiments, forecasts suggest that the combination of upcoming CMB (COrE) and BAO surveys (DESI) may determine the neutrino mass hierarchy at a high statistical
Directory of Open Access Journals (Sweden)
Ilaria A M Marino
Full Text Available A precise inference of past demographic histories including dating of demographic events using bayesian methods can only be achieved with the use of appropriate molecular rates and evolutionary models. Using a set of 596 mitochondrial cytochrome c oxidase I (COI sequences of two sister species of European green crabs of the genus Carcinus (C. maenas and C. aestuarii, our study shows how chronologies of past evolutionary events change significantly with the application of revised molecular rates that incorporate biogeographic events for calibration and appropriate demographic priors. A clear signal of demographic expansion was found for both species, dated between 10,000 and 20,000 years ago, which places the expansions events in a time frame following the Last Glacial Maximum (LGM. In the case of C. aestuarii, a population expansion was only inferred for the Adriatic-Ionian, suggestive of a colonization event following the flooding of the Adriatic Sea (18,000 years ago. For C. maenas, the demographic expansion inferred for the continental populations of West and North Europe might result from a northward recolonization from a southern refugium when the ice sheet retreated after the LGM. Collectively, our results highlight the importance of using adequate calibrations and demographic priors in order to avoid considerable overestimates of evolutionary time scales.
Yuan, Ying; MacKinnon, David P.
2009-01-01
This article proposes Bayesian analysis of mediation effects. Compared to conventional frequentist mediation analysis, the Bayesian approach has several advantages. First, it allows researchers to incorporate prior information into the mediation analysis, thus potentially improving the efficiency of estimates. Second, under the Bayesian mediation analysis, inference is straightforward and exact, which makes it appealing for studies with small samples. Third, the Bayesian approach is conceptua...
Bayesian artificial intelligence
Korb, Kevin B
2010-01-01
Updated and expanded, Bayesian Artificial Intelligence, Second Edition provides a practical and accessible introduction to the main concepts, foundation, and applications of Bayesian networks. It focuses on both the causal discovery of networks and Bayesian inference procedures. Adopting a causal interpretation of Bayesian networks, the authors discuss the use of Bayesian networks for causal modeling. They also draw on their own applied research to illustrate various applications of the technology.New to the Second EditionNew chapter on Bayesian network classifiersNew section on object-oriente
Kaufman, Jay S; MacLehose, Richard F; Torrone, Elizabeth A; Savitz, David A
2011-04-19
Previous research has documented heterogeneity in the effects of maternal education on adverse birth outcomes by nativity and Hispanic subgroup in the United States. In this article, we considered the risk of preterm birth (PTB) using 9 years of vital statistics birth data from New York City. We employed finer categorizations of exposure than used previously and estimated the risk dose-response across the range of education by nativity and ethnicity. Using Bayesian random effects logistic regression models with restricted quadratic spline terms for years of completed maternal education, we calculated and plotted the estimated posterior probabilities of PTB (gestational age education by ethnic and nativity subgroups adjusted for only maternal age, as well as with more extensive covariate adjustments. We then estimated the posterior risk difference between native and foreign born mothers by ethnicity over the continuous range of education exposures. The risk of PTB varied substantially by education, nativity and ethnicity. Native born groups showed higher absolute risk of PTB and declining risk associated with higher levels of education beyond about 10 years, as did foreign-born Puerto Ricans. For most other foreign born groups, however, risk of PTB was flatter across the education range. For Mexicans, Central Americans, Dominicans, South Americans and "Others", the protective effect of foreign birth diminished progressively across the educational range. Only for Puerto Ricans was there no nativity advantage for the foreign born, although small numbers of foreign born Cubans limited precision of estimates for that group. Using flexible Bayesian regression models with random effects allowed us to estimate absolute risks without strong modeling assumptions. Risk comparisons for any sub-groups at any exposure level were simple to calculate. Shrinkage of posterior estimates through the use of random effects allowed for finer categorization of exposures without
International Nuclear Information System (INIS)
Toman, Blaza; Nelson, Michael A.; Lippa, Katrice A.
2016-01-01
Chemical purity assessment using quantitative "1H-nuclear magnetic resonance spectroscopy is a method based on ratio references of mass and signal intensity of the analyte species to that of chemical standards of known purity. As such, it is an example of a calculation using a known measurement equation with multiple inputs. Though multiple samples are often analyzed during purity evaluations in order to assess measurement repeatability, the uncertainty evaluation must also account for contributions from inputs to the measurement equation. Furthermore, there may be other uncertainty components inherent in the experimental design, such as independent implementation of multiple calibration standards. As such, the uncertainty evaluation is not purely bottom up (based on the measurement equation) or top down (based on the experimental design), but inherently contains elements of both. This hybrid form of uncertainty analysis is readily implemented with Bayesian statistical analysis. In this article we describe this type of analysis in detail and illustrate it using data from an evaluation of chemical purity and its uncertainty for a folic acid material. (authors)
Learning with hierarchical-deep models.
Salakhutdinov, Ruslan; Tenenbaum, Joshua B; Torralba, Antonio
2013-08-01
We introduce HD (or “Hierarchical-Deep”) models, a new compositional learning architecture that integrates deep learning models with structured hierarchical Bayesian (HB) models. Specifically, we show how we can learn a hierarchical Dirichlet process (HDP) prior over the activities of the top-level features in a deep Boltzmann machine (DBM). This compound HDP-DBM model learns to learn novel concepts from very few training example by learning low-level generic features, high-level features that capture correlations among low-level features, and a category hierarchy for sharing priors over the high-level features that are typical of different kinds of concepts. We present efficient learning and inference algorithms for the HDP-DBM model and show that it is able to learn new concepts from very few examples on CIFAR-100 object recognition, handwritten character recognition, and human motion capture datasets.
Faurby, Søren; Svenning, Jens-Christian
2015-03-01
Across large clades, two problems are generally encountered in the estimation of species-level phylogenies: (a) the number of taxa involved is generally so high that computation-intensive approaches cannot readily be utilized and (b) even for clades that have received intense study (e.g., mammals), attention has been centered on relatively few selected species, and most taxa must therefore be positioned on the basis of very limited genetic data. Here, we describe a new heuristic-hierarchical Bayesian approach and use it to construct a species-level phylogeny for all extant and late Quaternary extinct mammals. In this approach, species with large quantities of genetic data are placed nearly freely in the mammalian phylogeny according to these data, whereas the placement of species with lower quantities of data is performed with steadily stricter restrictions for decreasing data quantities. The advantages of the proposed method include (a) an improved ability to incorporate phylogenetic uncertainty in downstream analyses based on the resulting phylogeny, (b) a reduced potential for long-branch attraction or other types of errors that place low-data taxa far from their true position, while maintaining minimal restrictions for better-studied taxa, and (c) likely improved placement of low-data taxa due to the use of closer outgroups. Copyright © 2014 Elsevier Inc. All rights reserved.
Directory of Open Access Journals (Sweden)
Julie Vercelloni
Full Text Available Recently, attempts to improve decision making in species management have focussed on uncertainties associated with modelling temporal fluctuations in populations. Reducing model uncertainty is challenging; while larger samples improve estimation of species trajectories and reduce statistical errors, they typically amplify variability in observed trajectories. In particular, traditional modelling approaches aimed at estimating population trajectories usually do not account well for nonlinearities and uncertainties associated with multi-scale observations characteristic of large spatio-temporal surveys. We present a Bayesian semi-parametric hierarchical model for simultaneously quantifying uncertainties associated with model structure and parameters, and scale-specific variability over time. We estimate uncertainty across a four-tiered spatial hierarchy of coral cover from the Great Barrier Reef. Coral variability is well described; however, our results show that, in the absence of additional model specifications, conclusions regarding coral trajectories become highly uncertain when considering multiple reefs, suggesting that management should focus more at the scale of individual reefs. The approach presented facilitates the description and estimation of population trajectories and associated uncertainties when variability cannot be attributed to specific causes and origins. We argue that our model can unlock value contained in large-scale datasets, provide guidance for understanding sources of uncertainty, and support better informed decision making.
Introduction to applied Bayesian statistics and estimation for social scientists
Lynch, Scott M
2007-01-01
""Introduction to Applied Bayesian Statistics and Estimation for Social Scientists"" covers the complete process of Bayesian statistical analysis in great detail from the development of a model through the process of making statistical inference. The key feature of this book is that it covers models that are most commonly used in social science research - including the linear regression model, generalized linear models, hierarchical models, and multivariate regression models - and it thoroughly develops each real-data example in painstaking detail.The first part of the book provides a detailed
Serang, Oliver
2015-08-01
Observations depending on sums of random variables are common throughout many fields; however, no efficient solution is currently known for performing max-product inference on these sums of general discrete distributions (max-product inference can be used to obtain maximum a posteriori estimates). The limiting step to max-product inference is the max-convolution problem (sometimes presented in log-transformed form and denoted as "infimal convolution," "min-convolution," or "convolution on the tropical semiring"), for which no O(k log(k)) method is currently known. Presented here is an O(k log(k)) numerical method for estimating the max-convolution of two nonnegative vectors (e.g., two probability mass functions), where k is the length of the larger vector. This numerical max-convolution method is then demonstrated by performing fast max-product inference on a convolution tree, a data structure for performing fast inference given information on the sum of n discrete random variables in O(nk log(nk)log(n)) steps (where each random variable has an arbitrary prior distribution on k contiguous possible states). The numerical max-convolution method can be applied to specialized classes of hidden Markov models to reduce the runtime of computing the Viterbi path from nk(2) to nk log(k), and has potential application to the all-pairs shortest paths problem.
McNally, Kevin; Cotton, Richard; Cocker, John; Jones, Kate; Bartels, Mike; Rick, David; Price, Paul; Loizou, George
2012-01-01
There are numerous biomonitoring programs, both recent and ongoing, to evaluate environmental exposure of humans to chemicals. Due to the lack of exposure and kinetic data, the correlation of biomarker levels with exposure concentrations leads to difficulty in utilizing biomonitoring data for biological guidance values. Exposure reconstruction or reverse dosimetry is the retrospective interpretation of external exposure consistent with biomonitoring data. We investigated the integration of physiologically based pharmacokinetic modelling, global sensitivity analysis, Bayesian inference, and Markov chain Monte Carlo simulation to obtain a population estimate of inhalation exposure to m-xylene. We used exhaled breath and venous blood m-xylene and urinary 3-methylhippuric acid measurements from a controlled human volunteer study in order to evaluate the ability of our computational framework to predict known inhalation exposures. We also investigated the importance of model structure and dimensionality with respect to its ability to reconstruct exposure. PMID:22719759
Directory of Open Access Journals (Sweden)
Kevin McNally
2012-01-01
Full Text Available There are numerous biomonitoring programs, both recent and ongoing, to evaluate environmental exposure of humans to chemicals. Due to the lack of exposure and kinetic data, the correlation of biomarker levels with exposure concentrations leads to difficulty in utilizing biomonitoring data for biological guid