Bayesian models: A statistical primer for ecologists
Hobbs, N. Thompson; Hooten, Mevin B.
2015-01-01
Bayesian modeling has become an indispensable tool for ecological research because it is uniquely suited to deal with complexity in a statistically coherent way. This textbook provides a comprehensive and accessible introduction to the latest Bayesian methods—in language ecologists can understand. Unlike other books on the subject, this one emphasizes the principles behind the computations, giving ecologists a big-picture understanding of how to implement this powerful statistical approach.Bayesian Models is an essential primer for non-statisticians. It begins with a definition of probability and develops a step-by-step sequence of connected ideas, including basic distribution theory, network diagrams, hierarchical models, Markov chain Monte Carlo, and inference from single and multiple models. This unique book places less emphasis on computer coding, favoring instead a concise presentation of the mathematical statistics needed to understand how and why Bayesian analysis works. It also explains how to write out properly formulated hierarchical Bayesian models and use them in computing, research papers, and proposals.This primer enables ecologists to understand the statistical principles behind Bayesian modeling and apply them to research, teaching, policy, and management.Presents the mathematical and statistical foundations of Bayesian modeling in language accessible to non-statisticiansCovers basic distribution theory, network diagrams, hierarchical models, Markov chain Monte Carlo, and moreDeemphasizes computer coding in favor of basic principlesExplains how to write out properly factored statistical expressions representing Bayesian models
Bayesian models a statistical primer for ecologists
Hobbs, N Thompson
2015-01-01
Bayesian modeling has become an indispensable tool for ecological research because it is uniquely suited to deal with complexity in a statistically coherent way. This textbook provides a comprehensive and accessible introduction to the latest Bayesian methods-in language ecologists can understand. Unlike other books on the subject, this one emphasizes the principles behind the computations, giving ecologists a big-picture understanding of how to implement this powerful statistical approach. Bayesian Models is an essential primer for non-statisticians. It begins with a definition of probabili
Introduction to Bayesian statistics
Bolstad, William M
2017-01-01
There is a strong upsurge in the use of Bayesian methods in applied statistical analysis, yet most introductory statistics texts only present frequentist methods. Bayesian statistics has many important advantages that students should learn about if they are going into fields where statistics will be used. In this Third Edition, four newly-added chapters address topics that reflect the rapid advances in the field of Bayesian staistics. The author continues to provide a Bayesian treatment of introductory statistical topics, such as scientific data gathering, discrete random variables, robust Bayesian methods, and Bayesian approaches to inferenfe cfor discrete random variables, bionomial proprotion, Poisson, normal mean, and simple linear regression. In addition, newly-developing topics in the field are presented in four new chapters: Bayesian inference with unknown mean and variance; Bayesian inference for Multivariate Normal mean vector; Bayesian inference for Multiple Linear RegressionModel; and Computati...
Bayesian statistic methods and theri application in probabilistic simulation models
Directory of Open Access Journals (Sweden)
Sergio Iannazzo
2007-03-01
Full Text Available Bayesian statistic methods are facing a rapidly growing level of interest and acceptance in the field of health economics. The reasons of this success are probably to be found on the theoretical fundaments of the discipline that make these techniques more appealing to decision analysis. To this point should be added the modern IT progress that has developed different flexible and powerful statistical software framework. Among them probably one of the most noticeably is the BUGS language project and its standalone application for MS Windows WinBUGS. Scope of this paper is to introduce the subject and to show some interesting applications of WinBUGS in developing complex economical models based on Markov chains. The advantages of this approach reside on the elegance of the code produced and in its capability to easily develop probabilistic simulations. Moreover an example of the integration of bayesian inference models in a Markov model is shown. This last feature let the analyst conduce statistical analyses on the available sources of evidence and exploit them directly as inputs in the economic model.
Bayesian Sensitivity Analysis of Statistical Models with Missing Data.
Zhu, Hongtu; Ibrahim, Joseph G; Tang, Niansheng
2014-04-01
Methods for handling missing data depend strongly on the mechanism that generated the missing values, such as missing completely at random (MCAR) or missing at random (MAR), as well as other distributional and modeling assumptions at various stages. It is well known that the resulting estimates and tests may be sensitive to these assumptions as well as to outlying observations. In this paper, we introduce various perturbations to modeling assumptions and individual observations, and then develop a formal sensitivity analysis to assess these perturbations in the Bayesian analysis of statistical models with missing data. We develop a geometric framework, called the Bayesian perturbation manifold, to characterize the intrinsic structure of these perturbations. We propose several intrinsic influence measures to perform sensitivity analysis and quantify the effect of various perturbations to statistical models. We use the proposed sensitivity analysis procedure to systematically investigate the tenability of the non-ignorable missing at random (NMAR) assumption. Simulation studies are conducted to evaluate our methods, and a dataset is analyzed to illustrate the use of our diagnostic measures.
Understanding Computational Bayesian Statistics
Bolstad, William M
2011-01-01
A hands-on introduction to computational statistics from a Bayesian point of view Providing a solid grounding in statistics while uniquely covering the topics from a Bayesian perspective, Understanding Computational Bayesian Statistics successfully guides readers through this new, cutting-edge approach. With its hands-on treatment of the topic, the book shows how samples can be drawn from the posterior distribution when the formula giving its shape is all that is known, and how Bayesian inferences can be based on these samples from the posterior. These ideas are illustrated on common statistic
Bayesian statistics an introduction
Lee, Peter M
2012-01-01
Bayesian Statistics is the school of thought that combines prior beliefs with the likelihood of a hypothesis to arrive at posterior beliefs. The first edition of Peter Lee’s book appeared in 1989, but the subject has moved ever onwards, with increasing emphasis on Monte Carlo based techniques. This new fourth edition looks at recent techniques such as variational methods, Bayesian importance sampling, approximate Bayesian computation and Reversible Jump Markov Chain Monte Carlo (RJMCMC), providing a concise account of the way in which the Bayesian approach to statistics develops as wel
Kleibergen, F.R.; Kleijn, R.; Paap, R.
2000-01-01
We propose a novel Bayesian test under a (noninformative) Jeffreys'priorspecification. We check whether the fixed scalar value of the so-calledBayesian Score Statistic (BSS) under the null hypothesis is aplausiblerealization from its known and standardized distribution under thealternative. Unlike
Statistics: a Bayesian perspective
National Research Council Canada - National Science Library
Berry, Donald A
1996-01-01
...: it is the only introductory textbook based on Bayesian ideas, it combines concepts and methods, it presents statistics as a means of integrating data into the significant process, it develops ideas...
Bayesian statistical inference
Directory of Open Access Journals (Sweden)
Bruno De Finetti
2017-04-01
Full Text Available This work was translated into English and published in the volume: Bruno De Finetti, Induction and Probability, Biblioteca di Statistica, eds. P. Monari, D. Cocchi, Clueb, Bologna, 1993.Bayesian statistical Inference is one of the last fundamental philosophical papers in which we can find the essential De Finetti's approach to the statistical inference.
Probability and Bayesian statistics
1987-01-01
This book contains selected and refereed contributions to the "Inter national Symposium on Probability and Bayesian Statistics" which was orga nized to celebrate the 80th birthday of Professor Bruno de Finetti at his birthplace Innsbruck in Austria. Since Professor de Finetti died in 1985 the symposium was dedicated to the memory of Bruno de Finetti and took place at Igls near Innsbruck from 23 to 26 September 1986. Some of the pa pers are published especially by the relationship to Bruno de Finetti's scientific work. The evolution of stochastics shows growing importance of probability as coherent assessment of numerical values as degrees of believe in certain events. This is the basis for Bayesian inference in the sense of modern statistics. The contributions in this volume cover a broad spectrum ranging from foundations of probability across psychological aspects of formulating sub jective probability statements, abstract measure theoretical considerations, contributions to theoretical statistics an...
Use of SAMC for Bayesian analysis of statistical models with intractable normalizing constants
Jin, Ick Hoon
2014-03-01
Statistical inference for the models with intractable normalizing constants has attracted much attention. During the past two decades, various approximation- or simulation-based methods have been proposed for the problem, such as the Monte Carlo maximum likelihood method and the auxiliary variable Markov chain Monte Carlo methods. The Bayesian stochastic approximation Monte Carlo algorithm specifically addresses this problem: It works by sampling from a sequence of approximate distributions with their average converging to the target posterior distribution, where the approximate distributions can be achieved using the stochastic approximation Monte Carlo algorithm. A strong law of large numbers is established for the Bayesian stochastic approximation Monte Carlo estimator under mild conditions. Compared to the Monte Carlo maximum likelihood method, the Bayesian stochastic approximation Monte Carlo algorithm is more robust to the initial guess of model parameters. Compared to the auxiliary variable MCMC methods, the Bayesian stochastic approximation Monte Carlo algorithm avoids the requirement for perfect samples, and thus can be applied to many models for which perfect sampling is not available or very expensive. The Bayesian stochastic approximation Monte Carlo algorithm also provides a general framework for approximate Bayesian analysis. © 2012 Elsevier B.V. All rights reserved.
A new model test in high energy physics in frequentist and bayesian statistical formalisms
International Nuclear Information System (INIS)
Kamenshchikov, A.
2017-01-01
The problem of a new physical model test using observed experimental data is a typical one for modern experiments in high energy physics (HEP). A solution of the problem may be provided with two alternative statistical formalisms, namely frequentist and Bayesian, which are widely spread in contemporary HEP searches. A characteristic experimental situation is modeled from general considerations, and both the approaches are utilized in order to test a new model. The results are juxtaposed, which demonstrates their consistency in this work. An effect of a systematic uncertainty treatment in the statistical analysis is also considered.
He, Yuning
2015-01-01
The behavior of complex aerospace systems is governed by numerous parameters. For safety analysis it is important to understand how the system behaves with respect to these parameter values. In particular, understanding the boundaries between safe and unsafe regions is of major importance. In this paper, we describe a hierarchical Bayesian statistical modeling approach for the online detection and characterization of such boundaries. Our method for classification with active learning uses a particle filter-based model and a boundary-aware metric for best performance. From a library of candidate shapes incorporated with domain expert knowledge, the location and parameters of the boundaries are estimated using advanced Bayesian modeling techniques. The results of our boundary analysis are then provided in a form understandable by the domain expert. We illustrate our approach using a simulation model of a NASA neuro-adaptive flight control system, as well as a system for the detection of separation violations in the terminal airspace.
Predictive data-derived Bayesian statistic-transport model and simulator of sunken oil mass
Echavarria Gregory, Maria Angelica
Sunken oil is difficult to locate because remote sensing techniques cannot as yet provide views of sunken oil over large areas. Moreover, the oil may re-suspend and sink with changes in salinity, sediment load, and temperature, making deterministic fate models difficult to deploy and calibrate when even the presence of sunken oil is difficult to assess. For these reasons, together with the expense of field data collection, there is a need for a statistical technique integrating limited data collection with stochastic transport modeling. Predictive Bayesian modeling techniques have been developed and demonstrated for exploiting limited information for decision support in many other applications. These techniques brought to a multi-modal Lagrangian modeling framework, representing a near-real time approach to locating and tracking sunken oil driven by intrinsic physical properties of field data collected following a spill after oil has begun collecting on a relatively flat bay bottom. Methods include (1) development of the conceptual predictive Bayesian model and multi-modal Gaussian computational approach based on theory and literature review; (2) development of an object-oriented programming and combinatorial structure capable of managing data, integration and computation over an uncertain and highly dimensional parameter space; (3) creating a new bi-dimensional approach of the method of images to account for curved shoreline boundaries; (4) confirmation of model capability for locating sunken oil patches using available (partial) real field data and capability for temporal projections near curved boundaries using simulated field data; and (5) development of a stand-alone open-source computer application with graphical user interface capable of calibrating instantaneous oil spill scenarios, obtaining sets maps of relative probability profiles at different prediction times and user-selected geographic areas and resolution, and capable of performing post
Philosophy and the practice of Bayesian statistics.
Gelman, Andrew; Shalizi, Cosma Rohilla
2013-02-01
A substantial school in the philosophy of science identifies Bayesian inference with inductive inference and even rationality as such, and seems to be strengthened by the rise and practical success of Bayesian statistics. We argue that the most successful forms of Bayesian statistics do not actually support that particular philosophy but rather accord much better with sophisticated forms of hypothetico-deductivism. We examine the actual role played by prior distributions in Bayesian models, and the crucial aspects of model checking and model revision, which fall outside the scope of Bayesian confirmation theory. We draw on the literature on the consistency of Bayesian updating and also on our experience of applied work in social science. Clarity about these matters should benefit not just philosophy of science, but also statistical practice. At best, the inductivist view has encouraged researchers to fit and compare models without checking them; at worst, theorists have actively discouraged practitioners from performing model checking because it does not fit into their framework. © 2012 The British Psychological Society.
Philosophy and the practice of Bayesian statistics
Gelman, Andrew; Shalizi, Cosma Rohilla
2015-01-01
A substantial school in the philosophy of science identifies Bayesian inference with inductive inference and even rationality as such, and seems to be strengthened by the rise and practical success of Bayesian statistics. We argue that the most successful forms of Bayesian statistics do not actually support that particular philosophy but rather accord much better with sophisticated forms of hypothetico-deductivism. We examine the actual role played by prior distributions in Bayesian models, and the crucial aspects of model checking and model revision, which fall outside the scope of Bayesian confirmation theory. We draw on the literature on the consistency of Bayesian updating and also on our experience of applied work in social science. Clarity about these matters should benefit not just philosophy of science, but also statistical practice. At best, the inductivist view has encouraged researchers to fit and compare models without checking them; at worst, theorists have actively discouraged practitioners from performing model checking because it does not fit into their framework. PMID:22364575
Onisko, Agnieszka; Druzdzel, Marek J; Austin, R Marshall
2016-01-01
Classical statistics is a well-established approach in the analysis of medical data. While the medical community seems to be familiar with the concept of a statistical analysis and its interpretation, the Bayesian approach, argued by many of its proponents to be superior to the classical frequentist approach, is still not well-recognized in the analysis of medical data. The goal of this study is to encourage data analysts to use the Bayesian approach, such as modeling with graphical probabilistic networks, as an insightful alternative to classical statistical analysis of medical data. This paper offers a comparison of two approaches to analysis of medical time series data: (1) classical statistical approach, such as the Kaplan-Meier estimator and the Cox proportional hazards regression model, and (2) dynamic Bayesian network modeling. Our comparison is based on time series cervical cancer screening data collected at Magee-Womens Hospital, University of Pittsburgh Medical Center over 10 years. The main outcomes of our comparison are cervical cancer risk assessments produced by the three approaches. However, our analysis discusses also several aspects of the comparison, such as modeling assumptions, model building, dealing with incomplete data, individualized risk assessment, results interpretation, and model validation. Our study shows that the Bayesian approach is (1) much more flexible in terms of modeling effort, and (2) it offers an individualized risk assessment, which is more cumbersome for classical statistical approaches.
2017-09-01
represented by the dispersion of the discrete forecast estimates (black curves). 6 The computational intractability of Epstein’s complete...that scales well with complicated systems, the posterior densities are often analytically intractable (G13). To this end, MCMC methods provide a...participation in this conflict . Nevertheless, the advantages of intuitive, common-sense Bayesian statistical conclusions detailed by Casella (2008), Gelman
Directory of Open Access Journals (Sweden)
Sarah Depaoli
2015-03-01
Full Text Available Background: After traumatic events, such as disaster, war trauma, and injuries including burns (which is the focus here, the risk to develop posttraumatic stress disorder (PTSD is approximately 10% (Breslau & Davis, 1992. Latent Growth Mixture Modeling can be used to classify individuals into distinct groups exhibiting different patterns of PTSD (Galatzer-Levy, 2015. Currently, empirical evidence points to four distinct trajectories of PTSD patterns in those who have experienced burn trauma. These trajectories are labeled as: resilient, recovery, chronic, and delayed onset trajectories (e.g., Bonanno, 2004; Bonanno, Brewin, Kaniasty, & Greca, 2010; Maercker, Gäbler, O'Neil, Schützwohl, & Müller, 2013; Pietrzak et al., 2013. The delayed onset trajectory affects only a small group of individuals, that is, about 4–5% (O'Donnell, Elliott, Lau, & Creamer, 2007. In addition to its low frequency, the later onset of this trajectory may contribute to the fact that these individuals can be easily overlooked by professionals. In this special symposium on Estimating PTSD trajectories (Van de Schoot, 2015a, we illustrate how to properly identify this small group of individuals through the Bayesian estimation framework using previous knowledge through priors (see, e.g., Depaoli & Boyajian, 2014; Van de Schoot, Broere, Perryck, Zondervan-Zwijnenburg, & Van Loey, 2015. Method: We used latent growth mixture modeling (LGMM (Van de Schoot, 2015b to estimate PTSD trajectories across 4 years that followed a traumatic burn. We demonstrate and compare results from traditional (maximum likelihood and Bayesian estimation using priors (see, Depaoli, 2012, 2013. Further, we discuss where priors come from and how to define them in the estimation process. Results: We demonstrate that only the Bayesian approach results in the desired theory-driven solution of PTSD trajectories. Since the priors are chosen subjectively, we also present a sensitivity analysis of the
Sahai, Swupnil
This thesis includes three parts. The overarching theme is how to analyze structured hierarchical data, with applications to astronomy and sociology. The first part discusses how expectation propagation can be used to parallelize the computation when fitting big hierarchical bayesian models. This methodology is then used to fit a novel, nonlinear mixture model to ultraviolet radiation from various regions of the observable universe. The second part discusses how the Stan probabilistic programming language can be used to numerically integrate terms in a hierarchical bayesian model. This technique is demonstrated on supernovae data to significantly speed up convergence to the posterior distribution compared to a previous study that used a Gibbs-type sampler. The third part builds a formal latent kernel representation for aggregate relational data as a way to more robustly estimate the mixing characteristics of agents in a network. In particular, the framework is applied to sociology surveys to estimate, as a function of ego age, the age and sex composition of the personal networks of individuals in the United States.
12th Brazilian Meeting on Bayesian Statistics
Louzada, Francisco; Rifo, Laura; Stern, Julio; Lauretto, Marcelo
2015-01-01
Through refereed papers, this volume focuses on the foundations of the Bayesian paradigm; their comparison to objectivistic or frequentist Statistics counterparts; and the appropriate application of Bayesian foundations. This research in Bayesian Statistics is applicable to data analysis in biostatistics, clinical trials, law, engineering, and the social sciences. EBEB, the Brazilian Meeting on Bayesian Statistics, is held every two years by the ISBrA, the International Society for Bayesian Analysis, one of the most active chapters of the ISBA. The 12th meeting took place March 10-14, 2014 in Atibaia. Interest in foundations of inductive Statistics has grown recently in accordance with the increasing availability of Bayesian methodological alternatives. Scientists need to deal with the ever more difficult choice of the optimal method to apply to their problem. This volume shows how Bayes can be the answer. The examination and discussion on the foundations work towards the goal of proper application of Bayesia...
Congdon, Peter
2014-01-01
This book provides an accessible approach to Bayesian computing and data analysis, with an emphasis on the interpretation of real data sets. Following in the tradition of the successful first edition, this book aims to make a wide range of statistical modeling applications accessible using tested code that can be readily adapted to the reader's own applications. The second edition has been thoroughly reworked and updated to take account of advances in the field. A new set of worked examples is included. The novel aspect of the first edition was the coverage of statistical modeling using WinBU
新家, 健精
1991-01-01
© 2012 Springer Science+Business Media, LLC. All rights reserved. Article Outline: Glossary Definition of the Subject and Introduction The Bayesian Statistical Paradigm Three Examples Comparison with the Frequentist Statistical Paradigm Future Directions Bibliography
Topics in Bayesian statistics and maximum entropy
International Nuclear Information System (INIS)
Mutihac, R.; Cicuttin, A.; Cerdeira, A.; Stanciulescu, C.
1998-12-01
Notions of Bayesian decision theory and maximum entropy methods are reviewed with particular emphasis on probabilistic inference and Bayesian modeling. The axiomatic approach is considered as the best justification of Bayesian analysis and maximum entropy principle applied in natural sciences. Particular emphasis is put on solving the inverse problem in digital image restoration and Bayesian modeling of neural networks. Further topics addressed briefly include language modeling, neutron scattering, multiuser detection and channel equalization in digital communications, genetic information, and Bayesian court decision-making. (author)
2016-05-31
Distribution Unlimited UU UU UU UU 31-05-2016 15-Apr-2014 14-Jan-2015 Final Report: Technical Topic 3.2.2.d Bayesian and Non- parametric Statistics...of Papers published in non peer-reviewed journals: Final Report: Technical Topic 3.2.2.d Bayesian and Non- parametric Statistics: Integration of Neural...Transfer N/A Number of graduating undergraduates who achieved a 3.5 GPA to 4.0 (4.0 max scale ): Number of graduating undergraduates funded by a DoD funded
Directory of Open Access Journals (Sweden)
Seifu Hagos
Full Text Available Understanding the spatial distribution of stunting and underlying factors operating at meso-scale is of paramount importance for intervention designing and implementations. Yet, little is known about the spatial distribution of stunting and some discrepancies are documented on the relative importance of reported risk factors. Therefore, the present study aims at exploring the spatial distribution of stunting at meso- (district scale, and evaluates the effect of spatial dependency on the identification of risk factors and their relative contribution to the occurrence of stunting and severe stunting in a rural area of Ethiopia.A community based cross sectional study was conducted to measure the occurrence of stunting and severe stunting among children aged 0-59 months. Additionally, we collected relevant information on anthropometric measures, dietary habits, parent and child-related demographic and socio-economic status. Latitude and longitude of surveyed households were also recorded. Local Anselin Moran's I was calculated to investigate the spatial variation of stunting prevalence and identify potential local pockets (hotspots of high prevalence. Finally, we employed a Bayesian geo-statistical model, which accounted for spatial dependency structure in the data, to identify potential risk factors for stunting in the study area.Overall, the prevalence of stunting and severe stunting in the district was 43.7% [95%CI: 40.9, 46.4] and 21.3% [95%CI: 19.5, 23.3] respectively. We identified statistically significant clusters of high prevalence of stunting (hotspots in the eastern part of the district and clusters of low prevalence (cold spots in the western. We found out that the inclusion of spatial structure of the data into the Bayesian model has shown to improve the fit for stunting model. The Bayesian geo-statistical model indicated that the risk of stunting increased as the child's age increased (OR 4.74; 95% Bayesian credible interval [BCI]:3
Application of a Bayesian algorithm for the Statistical Energy model updating of a railway coach
DEFF Research Database (Denmark)
Sadri, Mehran; Brunskog, Jonas; Younesian, Davood
2016-01-01
The classical statistical energy analysis (SEA) theory is a common approach for vibroacoustic analysis of coupled complex structures, being efficient to predict high-frequency noise and vibration of engineering systems. There are however some limitations in applying the conventional SEA. The pres......The classical statistical energy analysis (SEA) theory is a common approach for vibroacoustic analysis of coupled complex structures, being efficient to predict high-frequency noise and vibration of engineering systems. There are however some limitations in applying the conventional SEA...... the performance of the proposed strategy, the SEA model updating of a railway passenger coach is carried out. First, a sensitivity analysis is carried out to select the most sensitive parameters of the SEA model. For the selected parameters of the model, prior probability density functions are then taken...
Probabilistic model for untargeted peak detection in LC-MS using Bayesian statistics
Woldegebriel, M.; Vivó-Truyols, G.
2015-01-01
We introduce a novel Bayesian probabilistic peak detection algorithm for liquid chromatography mass spectroscopy (LC-MS). The final probabilistic result allows the user to make a final decision about which points in a 2 chromatogram are affected by a chromatographic peak and which ones are only
Calibration in a Bayesian modelling framework
Jansen, M.J.W.; Hagenaars, T.H.J.
2004-01-01
Bayesian statistics may constitute the core of a consistent and comprehensive framework for the statistical aspects of modelling complex processes that involve many parameters whose values are derived from many sources. Bayesian statistics holds great promises for model calibration, provides the
An introduction to Bayesian statistics in health psychology.
Depaoli, Sarah; Rus, Holly M; Clifton, James P; van de Schoot, Rens; Tiemensma, Jitske
2017-09-01
The aim of the current article is to provide a brief introduction to Bayesian statistics within the field of health psychology. Bayesian methods are increasing in prevalence in applied fields, and they have been shown in simulation research to improve the estimation accuracy of structural equation models, latent growth curve (and mixture) models, and hierarchical linear models. Likewise, Bayesian methods can be used with small sample sizes since they do not rely on large sample theory. In this article, we discuss several important components of Bayesian statistics as they relate to health-based inquiries. We discuss the incorporation and impact of prior knowledge into the estimation process and the different components of the analysis that should be reported in an article. We present an example implementing Bayesian estimation in the context of blood pressure changes after participants experienced an acute stressor. We conclude with final thoughts on the implementation of Bayesian statistics in health psychology, including suggestions for reviewing Bayesian manuscripts and grant proposals. We have also included an extensive amount of online supplementary material to complement the content presented here, including Bayesian examples using many different software programmes and an extensive sensitivity analysis examining the impact of priors.
Introduction to applied Bayesian statistics and estimation for social scientists
Lynch, Scott M
2007-01-01
""Introduction to Applied Bayesian Statistics and Estimation for Social Scientists"" covers the complete process of Bayesian statistical analysis in great detail from the development of a model through the process of making statistical inference. The key feature of this book is that it covers models that are most commonly used in social science research - including the linear regression model, generalized linear models, hierarchical models, and multivariate regression models - and it thoroughly develops each real-data example in painstaking detail.The first part of the book provides a detailed
DEFF Research Database (Denmark)
Jensen, Finn Verner; Nielsen, Thomas Dyhre
2016-01-01
is largely due to the availability of efficient inference algorithms for answering probabilistic queries about the states of the variables in the network. Furthermore, to support the construction of Bayesian network models, learning algorithms are also available. We give an overview of the Bayesian network...
Bayesian analysis of CCDM models
Energy Technology Data Exchange (ETDEWEB)
Jesus, J.F. [Universidade Estadual Paulista (Unesp), Câmpus Experimental de Itapeva, Rua Geraldo Alckmin 519, Vila N. Sra. de Fátima, Itapeva, SP, 18409-010 Brazil (Brazil); Valentim, R. [Departamento de Física, Instituto de Ciências Ambientais, Químicas e Farmacêuticas—ICAQF, Universidade Federal de São Paulo (UNIFESP), Unidade José Alencar, Rua São Nicolau No. 210, Diadema, SP, 09913-030 Brazil (Brazil); Andrade-Oliveira, F., E-mail: jfjesus@itapeva.unesp.br, E-mail: valentim.rodolfo@unifesp.br, E-mail: felipe.oliveira@port.ac.uk [Institute of Cosmology and Gravitation—University of Portsmouth, Burnaby Road, Portsmouth, PO1 3FX United Kingdom (United Kingdom)
2017-09-01
Creation of Cold Dark Matter (CCDM), in the context of Einstein Field Equations, produces a negative pressure term which can be used to explain the accelerated expansion of the Universe. In this work we tested six different spatially flat models for matter creation using statistical criteria, in light of SNe Ia data: Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC) and Bayesian Evidence (BE). These criteria allow to compare models considering goodness of fit and number of free parameters, penalizing excess of complexity. We find that JO model is slightly favoured over LJO/ΛCDM model, however, neither of these, nor Γ = 3α H {sub 0} model can be discarded from the current analysis. Three other scenarios are discarded either because poor fitting or because of the excess of free parameters. A method of increasing Bayesian evidence through reparameterization in order to reducing parameter degeneracy is also developed.
The humble Bayesian : Model checking from a fully Bayesian perspective
Morey, Richard D.; Romeijn, Jan-Willem; Rouder, Jeffrey N.
Gelman and Shalizi (2012) criticize what they call the usual story in Bayesian statistics: that the distribution over hypotheses or models is the sole means of statistical inference, thus excluding model checking and revision, and that inference is inductivist rather than deductivist. They present
Bayesian Inference in Statistical Analysis
Box, George E P
2011-01-01
The Wiley Classics Library consists of selected books that have become recognized classics in their respective fields. With these new unabridged and inexpensive editions, Wiley hopes to extend the life of these important works by making them available to future generations of mathematicians and scientists. Currently available in the Series: T. W. Anderson The Statistical Analysis of Time Series T. S. Arthanari & Yadolah Dodge Mathematical Programming in Statistics Emil Artin Geometric Algebra Norman T. J. Bailey The Elements of Stochastic Processes with Applications to the Natural Sciences Rob
Theory change and Bayesian statistical inference
Romeijn, Jan-Willem
2005-01-01
This paper addresses the problem that Bayesian statistical inference cannot accommodate theory change, and proposes a framework for dealing with such changes. It first presents a scheme for generating predictions from observations by means of hypotheses. An example shows how the hypotheses represent
Theory Change and Bayesian Statistical Inference
Romeyn, Jan-Willem
2008-01-01
This paper addresses the problem that Bayesian statistical inference cannot accommodate theory change, and proposes a framework for dealing with such changes. It first presents a scheme for generating predictions from observations by means of hypotheses. An example shows how the hypotheses represent
Statistical Information:. a Bayesian Perspective
Stern, R. B.; Pereira, C. A. De B.
2012-12-01
We explore the concept of information in statistics: information about unknown quantities of interest, the parameters. We discuss intuitive ideas of what should be information in statistics. Our approach on information is divided in two scenarios: observed data and planning of an experiment. On the first scenario, we discuss the Sufficiency Principle, the Conditionality Principle, the Likelihood Principle and their relationship with trivial experiments. We also provide applications of some measures of information to an intuitive example. On the second scenario, the definition and new applications of Blackwell Sufficiency are presented. We discuss a new relationship between Blackwell Equivalence and the Likelihood Principle. Finally, the expected values of some measures of information are calculated.
How few? Bayesian statistics in injury biomechanics.
Cutcliffe, Hattie C; Schmidt, Allison L; Lucas, Joseph E; Bass, Cameron R
2012-10-01
In injury biomechanics, there are currently no general a priori estimates of how few specimens are necessary to obtain sufficiently accurate injury risk curves for a given underlying distribution. Further, several methods are available for constructing these curves, and recent methods include Bayesian survival analysis. This study used statistical simulations to evaluate the fidelity of different injury risk methods using limited sample sizes across four different underlying distributions. Five risk curve techniques were evaluated, including Bayesian techniques. For the Bayesian analyses, various prior distributions were assessed, each incorporating more accurate information. Simulated subject injury and biomechanical input values were randomly sampled from each underlying distribution, and injury status was determined by comparing these values. Injury risk curves were developed for this data using each technique for various small sample sizes; for each, analyses on 2000 simulated data sets were performed. Resulting median predicted risk values and confidence intervals were compared with the underlying distributions. Across conditions, the standard and Bayesian survival analyses better represented the underlying distributions included in this study, especially for extreme (1, 10, and 90%) risk. This study demonstrates that the value of the Bayesian analysis is the use of informed priors. As the mean of the prior approaches the actual value, the sample size necessary for good reproduction of the underlying distribution with small confidence intervals can be as small as 2. This study provides estimates of confidence intervals and number of samples to allow the selection of the most appropriate sample sizes given known information.
Directory of Open Access Journals (Sweden)
Erfan Ayubi
2017-05-01
Full Text Available OBJECTIVES The aim of this study was to explore the spatial pattern of female breast cancer (BC incidence at the neighborhood level in Tehran, Iran. METHODS The present study included all registered incident cases of female BC from March 2008 to March 2011. The raw standardized incidence ratio (SIR of BC for each neighborhood was estimated by comparing observed cases relative to expected cases. The estimated raw SIRs were smoothed by a Besag, York, and Mollie spatial model and the spatial empirical Bayesian method. The purely spatial scan statistic was used to identify spatial clusters. RESULTS There were 4,175 incident BC cases in the study area from 2008 to 2011, of which 3,080 were successfully geocoded to the neighborhood level. Higher than expected rates of BC were found in neighborhoods located in northern and central Tehran, whereas lower rates appeared in southern areas. The most likely cluster of higher than expected BC incidence involved neighborhoods in districts 3 and 6, with an observed-to-expected ratio of 3.92 (p<0.001, whereas the most likely cluster of lower than expected rates involved neighborhoods in districts 17, 18, and 19, with an observed-to-expected ratio of 0.05 (p<0.001. CONCLUSIONS Neighborhood-level inequality in the incidence of BC exists in Tehran. These findings can serve as a basis for resource allocation and preventive strategies in at-risk areas.
Bayesian statistical models to estimate EQ-5D utility scores from EORTC QLQ data in myeloma.
Kharroubi, Samer A; Edlin, Richard; Meads, David; McCabe, Christopher
2018-02-20
It is well documented that the modelling of health-related quality of life data is difficult as the distribution of such data is often strongly right/left skewed and it includes a significant percentage of observations at one. The objective of this study is to develop a series of two-part models (TPMs) that deal with these issues. Data from the UK Medical Research Council Myeloma IX trial were used to examine the relationship between the European Organization for Research and Treatment of Cancer (EORTC) QLQ-C30/QLQ-MY20 scores and the European QoL-5 Dimensions (EQ-5D) utility score. Four different TPMs were developed. The models fitted included TPM with normal regression, TPM with normal regression with variance a function of participant characteristics, TPM with log-transformed data, and TPM with gamma regression and a log link. The cohort of 1839 patients was divided into 75% derivation sample, to fit the different models, and 25% validation sample to assess the predictive ability of these models by comparing predicted and observed mean EQ-5D scores in the validation set, unadjusted R 2 , and root mean square error. Predictive performance in the derivation dataset depended on the criterion used, with R 2 /adjusted-R 2 favouring the TPM with normal regression and mean predicted error favouring the TPM with gamma regression. The TPM with gamma regression performs best within the validation dataset under all criteria. TPM regression models provide flexible approaches to estimate mean EQ-5D utility weights from the EORTC QLQ-C30/QLQ-MY20 for use in economic evaluation. Copyright © 2018 John Wiley & Sons, Ltd.
Norris, P. M.; da Silva, A. M., Jr.
2016-12-01
Norris and da Silva recently published a method to constrain a statistical model of sub-gridcolumn moisture variability using high-resolution satellite cloud data. The method can be used for large-scale model parameter estimation or cloud data assimilation (CDA). The gridcolumn model includes assumed-PDF intra-layer horizontal variability and a copula-based inter-layer correlation model. The observables used are MODIS cloud-top pressure, brightness temperature and cloud optical thickness, but the method should be extensible to direct cloudy radiance assimilation for a small number of channels. The algorithm is a form of Bayesian inference with a Markov chain Monte Carlo (MCMC) approach to characterizing the posterior distribution. This approach is especially useful in cases where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach is not gradient-based and allows jumps into regions of non-zero cloud probability. In the example provided, the method is able to restore marine stratocumulus near the Californian coast where the background state has a clear swath. The new approach not only significantly reduces mean and standard deviation biases with respect to the assimilated observables, but also improves the simulated rotational-Ramman scattering cloud optical centroid pressure against independent (non-assimilated) retrievals from the OMI instrument. One obvious difficulty for the method, and other CDA methods, is the lack of information content in passive cloud observables on cloud vertical structure, beyond cloud-top and thickness, thus necessitating strong dependence on the background vertical moisture structure. It is found that a simple flow-dependent correlation modification due to Riishojgaard is helpful, better honoring inversion structures in the background state.
Kittisuwan, Pichid
2015-03-01
The application of image processing in industry has shown remarkable success over the last decade, for example, in security and telecommunication systems. The denoising of natural image corrupted by Gaussian noise is a classical problem in image processing. So, image denoising is an indispensable step during image processing. This paper is concerned with dual-tree complex wavelet-based image denoising using Bayesian techniques. One of the cruxes of the Bayesian image denoising algorithms is to estimate the statistical parameter of the image. Here, we employ maximum a posteriori (MAP) estimation to calculate local observed variance with generalized Gamma density prior for local observed variance and Laplacian or Gaussian distribution for noisy wavelet coefficients. Evidently, our selection of prior distribution is motivated by efficient and flexible properties of generalized Gamma density. The experimental results show that the proposed method yields good denoising results.
Bayesian statistics and Monte Carlo methods
Koch, K. R.
2018-03-01
The Bayesian approach allows an intuitive way to derive the methods of statistics. Probability is defined as a measure of the plausibility of statements or propositions. Three rules are sufficient to obtain the laws of probability. If the statements refer to the numerical values of variables, the so-called random variables, univariate and multivariate distributions follow. They lead to the point estimation by which unknown quantities, i.e. unknown parameters, are computed from measurements. The unknown parameters are random variables, they are fixed quantities in traditional statistics which is not founded on Bayes' theorem. Bayesian statistics therefore recommends itself for Monte Carlo methods, which generate random variates from given distributions. Monte Carlo methods, of course, can also be applied in traditional statistics. The unknown parameters, are introduced as functions of the measurements, and the Monte Carlo methods give the covariance matrix and the expectation of these functions. A confidence region is derived where the unknown parameters are situated with a given probability. Following a method of traditional statistics, hypotheses are tested by determining whether a value for an unknown parameter lies inside or outside the confidence region. The error propagation of a random vector by the Monte Carlo methods is presented as an application. If the random vector results from a nonlinearly transformed vector, its covariance matrix and its expectation follow from the Monte Carlo estimate. This saves a considerable amount of derivatives to be computed, and errors of the linearization are avoided. The Monte Carlo method is therefore efficient. If the functions of the measurements are given by a sum of two or more random vectors with different multivariate distributions, the resulting distribution is generally not known. TheMonte Carlo methods are then needed to obtain the covariance matrix and the expectation of the sum.
A Bayesian statistical method for particle identification in shower counters
International Nuclear Information System (INIS)
Takashimizu, N.; Kimura, A.; Shibata, A.; Sasaki, T.
2004-01-01
We report an attempt on identifying particles using a Bayesian statistical method. We have developed the mathematical model and software for this purpose. We tried to identify electrons and charged pions in shower counters using this method. We designed an ideal shower counter and studied the efficiency of identification using Monte Carlo simulation based on Geant4. Without having any other information, e.g. charges of particles which are given by tracking detectors, we have achieved 95% identifications of both particles
Bayesian nonparametric hierarchical modeling.
Dunson, David B
2009-04-01
In biomedical research, hierarchical models are very widely used to accommodate dependence in multivariate and longitudinal data and for borrowing of information across data from different sources. A primary concern in hierarchical modeling is sensitivity to parametric assumptions, such as linearity and normality of the random effects. Parametric assumptions on latent variable distributions can be challenging to check and are typically unwarranted, given available prior knowledge. This article reviews some recent developments in Bayesian nonparametric methods motivated by complex, multivariate and functional data collected in biomedical studies. The author provides a brief review of flexible parametric approaches relying on finite mixtures and latent class modeling. Dirichlet process mixture models are motivated by the need to generalize these approaches to avoid assuming a fixed finite number of classes. Focusing on an epidemiology application, the author illustrates the practical utility and potential of nonparametric Bayes methods.
Fully probabilistic design of hierarchical Bayesian models
Czech Academy of Sciences Publication Activity Database
Quinn, A.; Kárný, Miroslav; Guy, Tatiana Valentine
2016-01-01
Roč. 369, č. 1 (2016), s. 532-547 ISSN 0020-0255 R&D Projects: GA ČR GA13-13502S Institutional support: RVO:67985556 Keywords : Fully probabilistic design * Ideal distribution * Minimum cross- entropy principle * Bayesian conditioning * Kullback-Leibler divergence * Bayesian nonparametric modelling Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 4.832, year: 2016 http://library.utia.cas.cz/separaty/2016/AS/karny-0463052.pdf
Bayesian statistics in environmental engineering planning
Energy Technology Data Exchange (ETDEWEB)
Englehardt, J.D.; Simon, T.W.
1999-07-01
Today's engineer must be able to quantify both uncertainty due to information limitations, and the variability of natural processes, in order to determine risk. Nowhere is this emphasis on risk assessment more evident than in environmental engineering. The use of Bayesian inference for the rigorous assessment of risk based on available information is reviewed in this paper. Several example environmental engineering planning applications are presented: (1) assessment of losses involving the evaluation of proposed revisions to the South Florida Building Code after Hurricane Andrew; (2) development of a model to predict oil spill consequences due to proposed changes in the oil transportation network in the Gulf of Mexico; (3) studies of ambient concentrations of perchloroethylene surrounding dry cleaners and of tire particulates in residential areas near roadways in Miami, FL; (4) risk assessment from contaminated soils at a cleanup of an old transformer dump site.
Bayesian modelling of fusion diagnostics
Fischer, R.; Dinklage, A.; Pasch, E.
2003-07-01
Integrated data analysis of fusion diagnostics is the combination of different, heterogeneous diagnostics in order to improve physics knowledge and reduce the uncertainties of results. One example is the validation of profiles of plasma quantities. Integration of different diagnostics requires systematic and formalized error analysis for all uncertainties involved. The Bayesian probability theory (BPT) allows a systematic combination of all information entering the measurement descriptive model that considers all uncertainties of the measured data, calibration measurements, physical model parameters and measurement nuisance parameters. A sensitivity analysis of model parameters allows crucial uncertainties to be found, which has an impact on both diagnostic improvement and design. The systematic statistical modelling within the BPT is used for reconstructing electron density and electron temperature profiles from Thomson scattering data from the Wendelstein 7-AS stellarator. The inclusion of different diagnostics and first-principle information is discussed in terms of improvements.
Olugboji, T. M.; Lekic, V.; McDonough, W.
2017-07-01
We present a new approach for evaluating existing crustal models using ambient noise data sets and its associated uncertainties. We use a transdimensional hierarchical Bayesian inversion approach to invert ambient noise surface wave phase dispersion maps for Love and Rayleigh waves using measurements obtained from Ekström (2014). Spatiospectral analysis shows that our results are comparable to a linear least squares inverse approach (except at higher harmonic degrees), but the procedure has additional advantages: (1) it yields an autoadaptive parameterization that follows Earth structure without making restricting assumptions on model resolution (regularization or damping) and data errors; (2) it can recover non-Gaussian phase velocity probability distributions while quantifying the sources of uncertainties in the data measurements and modeling procedure; and (3) it enables statistical assessments of different crustal models (e.g., CRUST1.0, LITHO1.0, and NACr14) using variable resolution residual and standard deviation maps estimated from the ensemble. These assessments show that in the stable old crust of the Archean, the misfits are statistically negligible, requiring no significant update to crustal models from the ambient noise data set. In other regions of the U.S., significant updates to regionalization and crustal structure are expected especially in the shallow sedimentary basins and the tectonically active regions, where the differences between model predictions and data are statistically significant.
Bayesian Statistics: Concepts and Applications in Animal Breeding – A Review
Directory of Open Access Journals (Sweden)
Lsxmikant-Sambhaji Kokate
2011-07-01
Full Text Available Statistics uses two major approaches- conventional (or frequentist and Bayesian approach. Bayesian approach provides a complete paradigm for both statistical inference and decision making under uncertainty. Bayesian methods solve many of the difficulties faced by conventional statistical methods, and extend the applicability of statistical methods. It exploits the use of probabilistic models to formulate scientific problems. To use Bayesian statistics, there is computational difficulty and secondly, Bayesian methods require specifying prior probability distributions. Markov Chain Monte-Carlo (MCMC methods were applied to overcome the computational difficulty, and interest in Bayesian methods was renewed. In Bayesian statistics, Bayesian structural equation model (SEM is used. It provides a powerful and flexible approach for studying quantitative traits for wide spectrum problems and thus it has no operational difficulties, with the exception of some complex cases. In this method, the problems are solved at ease, and the statisticians feel it comfortable with the particular way of expressing the results and employing the software available to analyze a large variety of problems.
An introduction to Bayesian statistics in health psychology
Depaoli, Sarah; Rus, Holly; Clifton, James; van de Schoot, A.G.J.; Tiemensma, Jitske
2017-01-01
The aim of the current article is to provide a brief introduction to Bayesian statistics within the field of Health Psychology. Bayesian methods are increasing in prevalence in applied fields, and they have been shown in simulation research to improve the estimation accuracy of structural equation
Daniel Goodman’s empirical approach to Bayesian statistics
Gerrodette, Tim; Ward, Eric; Taylor, Rebecca L.; Schwarz, Lisa K.; Eguchi, Tomoharu; Wade, Paul; Himes Boor, Gina
2016-01-01
Bayesian statistics, in contrast to classical statistics, uses probability to represent uncertainty about the state of knowledge. Bayesian statistics has often been associated with the idea that knowledge is subjective and that a probability distribution represents a personal degree of belief. Dr. Daniel Goodman considered this viewpoint problematic for issues of public policy. He sought to ground his Bayesian approach in data, and advocated the construction of a prior as an empirical histogram of “similar” cases. In this way, the posterior distribution that results from a Bayesian analysis combined comparable previous data with case-specific current data, using Bayes’ formula. Goodman championed such a data-based approach, but he acknowledged that it was difficult in practice. If based on a true representation of our knowledge and uncertainty, Goodman argued that risk assessment and decision-making could be an exact science, despite the uncertainties. In his view, Bayesian statistics is a critical component of this science because a Bayesian analysis produces the probabilities of future outcomes. Indeed, Goodman maintained that the Bayesian machinery, following the rules of conditional probability, offered the best legitimate inference from available data. We give an example of an informative prior in a recent study of Steller sea lion spatial use patterns in Alaska.
Bayesian estimation and modeling: Editorial to the second special issue on Bayesian data analysis.
Chow, Sy-Miin; Hoijtink, Herbert
2017-12-01
This editorial accompanies the second special issue on Bayesian data analysis published in this journal. The emphases of this issue are on Bayesian estimation and modeling. In this editorial, we outline the basics of current Bayesian estimation techniques and some notable developments in the statistical literature, as well as adaptations and extensions by psychological researchers to better tailor to the modeling applications in psychology. We end with a discussion on future outlooks of Bayesian data analysis in psychology. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Bayesian statistics and information fusion for GPS-denied navigation
Copp, Brian Lee
It is well known that satellite navigation systems are vulnerable to disruption due to jamming, spoofing, or obstruction of the signal. The desire for robust navigation of aircraft in GPS-denied environments has motivated the development of feature-aided navigation systems, in which measurements of environmental features are used to complement the dead reckoning solution produced by an inertial navigation system. Examples of environmental features which can be exploited for navigation include star positions, terrain elevation, terrestrial wireless signals, and features extracted from photographic data. Feature-aided navigation represents a particularly challenging estimation problem because the measurements are often strongly nonlinear, and the quality of the navigation solution is limited by the knowledge of nuisance parameters which may be difficult to model accurately. As a result, integration approaches based on the Kalman filter and its variants may fail to give adequate performance. This project develops a framework for the integration of feature-aided navigation techniques using Bayesian statistics. In this approach, the probability density function for aircraft horizontal position (latitude and longitude) is approximated by a two-dimensional point mass function defined on a rectangular grid. Nuisance parameters are estimated using a hypothesis based approach (Multiple Model Adaptive Estimation) which continuously maintains an accurate probability density even in the presence of strong nonlinearities. The effectiveness of the proposed approach is illustrated by the simulated use of terrain referenced navigation and wireless time-of-arrival positioning to estimate a reference aircraft trajectory. Monte Carlo simulations have shown that accurate position estimates can be obtained in terrain referenced navigation even with a strongly nonlinear altitude bias. The integration of terrain referenced and wireless time-of-arrival measurements is described along with
Bayesian structural equation modeling in sport and exercise psychology.
Stenling, Andreas; Ivarsson, Andreas; Johnson, Urban; Lindwall, Magnus
2015-08-01
Bayesian statistics is on the rise in mainstream psychology, but applications in sport and exercise psychology research are scarce. In this article, the foundations of Bayesian analysis are introduced, and we will illustrate how to apply Bayesian structural equation modeling in a sport and exercise psychology setting. More specifically, we contrasted a confirmatory factor analysis on the Sport Motivation Scale II estimated with the most commonly used estimator, maximum likelihood, and a Bayesian approach with weakly informative priors for cross-loadings and correlated residuals. The results indicated that the model with Bayesian estimation and weakly informative priors provided a good fit to the data, whereas the model estimated with a maximum likelihood estimator did not produce a well-fitting model. The reasons for this discrepancy between maximum likelihood and Bayesian estimation are discussed as well as potential advantages and caveats with the Bayesian approach.
Fully Bayesian tests of neutrality using genealogical summary statistics
Directory of Open Access Journals (Sweden)
Drummond Alexei J
2008-10-01
Full Text Available Abstract Background Many data summary statistics have been developed to detect departures from neutral expectations of evolutionary models. However questions about the neutrality of the evolution of genetic loci within natural populations remain difficult to assess. One critical cause of this difficulty is that most methods for testing neutrality make simplifying assumptions simultaneously about the mutational model and the population size model. Consequentially, rejecting the null hypothesis of neutrality under these methods could result from violations of either or both assumptions, making interpretation troublesome. Results Here we harness posterior predictive simulation to exploit summary statistics of both the data and model parameters to test the goodness-of-fit of standard models of evolution. We apply the method to test the selective neutrality of molecular evolution in non-recombining gene genealogies and we demonstrate the utility of our method on four real data sets, identifying significant departures of neutrality in human influenza A virus, even after controlling for variation in population size. Conclusion Importantly, by employing a full model-based Bayesian analysis, our method separates the effects of demography from the effects of selection. The method also allows multiple summary statistics to be used in concert, thus potentially increasing sensitivity. Furthermore, our method remains useful in situations where analytical expectations and variances of summary statistics are not available. This aspect has great potential for the analysis of temporally spaced data, an expanding area previously ignored for limited availability of theory and methods.
Norris, Peter M.; Da Silva, Arlindo M.
2016-01-01
A method is presented to constrain a statistical model of sub-gridcolumn moisture variability using high-resolution satellite cloud data. The method can be used for large-scale model parameter estimation or cloud data assimilation. The gridcolumn model includes assumed probability density function (PDF) intra-layer horizontal variability and a copula-based inter-layer correlation model. The observables used in the current study are Moderate Resolution Imaging Spectroradiometer (MODIS) cloud-top pressure, brightness temperature and cloud optical thickness, but the method should be extensible to direct cloudy radiance assimilation for a small number of channels. The algorithm is a form of Bayesian inference with a Markov chain Monte Carlo (MCMC) approach to characterizing the posterior distribution. This approach is especially useful in cases where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach is not gradient-based and allows jumps into regions of non-zero cloud probability. The current study uses a skewed-triangle distribution for layer moisture. The article also includes a discussion of the Metropolis and multiple-try Metropolis versions of MCMC.
A Bayesian statistics approach to multiscale coarse graining
Liu, Pu; Shi, Qiang; Daumé, Hal; Voth, Gregory A.
2008-12-01
Coarse-grained (CG) modeling provides a promising way to investigate many important physical and biological phenomena over large spatial and temporal scales. The multiscale coarse-graining (MS-CG) method has been proven to be a thermodynamically consistent way to systematically derive a CG model from atomistic force information, as shown in a variety of systems, ranging from simple liquids to proteins embedded in lipid bilayers. In the present work, Bayes' theorem, an advanced statistical tool widely used in signal processing and pattern recognition, is adopted to further improve the MS-CG force field obtained from the CG modeling. This approach can regularize the linear equation resulting from the underlying force-matching methodology, therefore substantially improving the quality of the MS-CG force field, especially for the regions with limited sampling. Moreover, this Bayesian approach can naturally provide an error estimation for each force field parameter, from which one can know the extent the results can be trusted. The robustness and accuracy of the Bayesian MS-CG algorithm is demonstrated for three different systems, including simple liquid methanol, polyalanine peptide solvated in explicit water, and a much more complicated peptide assembly with 32 NNQQNY hexapeptides.
Howard B. Stauffer; Cynthia J. Zabel; Jeffrey R. Dunk
2005-01-01
We compared a set of competing logistic regression habitat selection models for Northern Spotted Owls (Strix occidentalis caurina) in California. The habitat selection models were estimated, compared, evaluated, and tested using multiple sample datasets collected on federal forestlands in northern California. We used Bayesian methods in interpreting...
Bayesian statistics in radionuclide metrology: measurement of a decaying source
International Nuclear Information System (INIS)
Bochud, F. O.; Bailat, C.J.; Laedermann, J.P.
2007-01-01
The most intuitive way of defining a probability is perhaps through the frequency at which it appears when a large number of trials are realized in identical conditions. The probability derived from the obtained histogram characterizes the so-called frequentist or conventional statistical approach. In this sense, probability is defined as a physical property of the observed system. By contrast, in Bayesian statistics, a probability is not a physical property or a directly observable quantity, but a degree of belief or an element of inference. The goal of this paper is to show how Bayesian statistics can be used in radionuclide metrology and what its advantages and disadvantages are compared with conventional statistics. This is performed through the example of an yttrium-90 source typically encountered in environmental surveillance measurement. Because of the very low activity of this kind of source and the small half-life of the radionuclide, this measurement takes several days, during which the source decays significantly. Several methods are proposed to compute simultaneously the number of unstable nuclei at a given reference time, the decay constant and the background. Asymptotically, all approaches give the same result. However, Bayesian statistics produces coherent estimates and confidence intervals in a much smaller number of measurements. Apart from the conceptual understanding of statistics, the main difficulty that could deter radionuclide metrologists from using Bayesian statistics is the complexity of the computation. (authors)
Bayesian methodology for reliability model acceptance
International Nuclear Information System (INIS)
Zhang Ruoxue; Mahadevan, Sankaran
2003-01-01
This paper develops a methodology to assess the reliability computation model validity using the concept of Bayesian hypothesis testing, by comparing the model prediction and experimental observation, when there is only one computational model available to evaluate system behavior. Time-independent and time-dependent problems are investigated, with consideration of both cases: with and without statistical uncertainty in the model. The case of time-independent failure probability prediction with no statistical uncertainty is a straightforward application of Bayesian hypothesis testing. However, for the life prediction (time-dependent reliability) problem, a new methodology is developed in this paper to make the same Bayesian hypothesis testing concept applicable. With the existence of statistical uncertainty in the model, in addition to the application of a predictor estimator of the Bayes factor, the uncertainty in the Bayes factor is explicitly quantified through treating it as a random variable and calculating the probability that it exceeds a specified value. The developed method provides a rational criterion to decision-makers for the acceptance or rejection of the computational model
Bayesian Model Averaging for Propensity Score Analysis
Kaplan, David; Chen, Jianshen
2013-01-01
The purpose of this study is to explore Bayesian model averaging in the propensity score context. Previous research on Bayesian propensity score analysis does not take into account model uncertainty. In this regard, an internally consistent Bayesian framework for model building and estimation must also account for model uncertainty. The…
Bayesian models in cognitive neuroscience: A tutorial
O'Reilly, J.X.; Mars, R.B.
2015-01-01
This chapter provides an introduction to Bayesian models and their application in cognitive neuroscience. The central feature of Bayesian models, as opposed to other classes of models, is that Bayesian models represent the beliefs of an observer as probability distributions, allowing them to
Statistical assignment of DNA sequences using Bayesian phylogenetics
DEFF Research Database (Denmark)
Terkelsen, Kasper Munch; Boomsma, Wouter Krogh; Huelsenbeck, John P.
2008-01-01
We provide a new automated statistical method for DNA barcoding based on a Bayesian phylogenetic analysis. The method is based on automated database sequence retrieval, alignment, and phylogenetic analysis using a custom-built program for Bayesian phylogenetic analysis. We show on real data......-analysis of previously published ancient DNA data and show that, with high statistical confidence, most of the published sequences are in fact of Neanderthal origin. However, there are several cases of chimeric sequences that are comprised of a combination of both Neanderthal and modern human DNA....
Bayesian Modelling of Functional Whole Brain Connectivity
DEFF Research Database (Denmark)
Røge, Rasmus
This thesis deals with parcellation of whole-brain functional magnetic resonance imaging (fMRI) using Bayesian inference with mixture models tailored to the fMRI data. In the three included papers and manuscripts, we analyze two different approaches to modeling fMRI signal; either we accept...... the prevalent strategy of standardizing of fMRI time series and model data using directional statistics or we model the variability in the signal across the brain and across multiple subjects. In either case, we use Bayesian nonparametric modeling to automatically learn from the fMRI data the number...... of funcional units, i.e. parcels. We benchmark the proposed mixture models against state of the art methods of brain parcellation, both probabilistic and non-probabilistic. The time series of each voxel are most often standardized using z-scoring which projects the time series data onto a hypersphere...
Bayesian modeling using WinBUGS
Ntzoufras, Ioannis
2009-01-01
A hands-on introduction to the principles of Bayesian modeling using WinBUGS Bayesian Modeling Using WinBUGS provides an easily accessible introduction to the use of WinBUGS programming techniques in a variety of Bayesian modeling settings. The author provides an accessible treatment of the topic, offering readers a smooth introduction to the principles of Bayesian modeling with detailed guidance on the practical implementation of key principles. The book begins with a basic introduction to Bayesian inference and the WinBUGS software and goes on to cover key topics, including: Markov Chain Monte Carlo algorithms in Bayesian inference Generalized linear models Bayesian hierarchical models Predictive distribution and model checking Bayesian model and variable evaluation Computational notes and screen captures illustrate the use of both WinBUGS as well as R software to apply the discussed techniques. Exercises at the end of each chapter allow readers to test their understanding of the presented concepts and all ...
Norris, Peter M.; da Silva, Arlindo M.
2016-01-01
Part 1 of this series presented a Monte Carlo Bayesian method for constraining a complex statistical model of global circulation model (GCM) sub-gridcolumn moisture variability using high-resolution Moderate Resolution Imaging Spectroradiometer (MODIS) cloud data, thereby permitting parameter estimation and cloud data assimilation for large-scale models. This article performs some basic testing of this new approach, verifying that it does indeed reduce mean and standard deviation biases significantly with respect to the assimilated MODIS cloud optical depth, brightness temperature and cloud-top pressure and that it also improves the simulated rotational-Raman scattering cloud optical centroid pressure (OCP) against independent (non-assimilated) retrievals from the Ozone Monitoring Instrument (OMI). Of particular interest, the Monte Carlo method does show skill in the especially difficult case where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach allows non-gradient-based jumps into regions of non-zero cloud probability. In the example provided, the method is able to restore marine stratocumulus near the Californian coast, where the background state has a clear swath. This article also examines a number of algorithmic and physical sensitivities of the new method and provides guidance for its cost-effective implementation. One obvious difficulty for the method, and other cloud data assimilation methods as well, is the lack of information content in passive-radiometer-retrieved cloud observables on cloud vertical structure, beyond cloud-top pressure and optical thickness, thus necessitating strong dependence on the background vertical moisture structure. It is found that a simple flow-dependent correlation modification from Riishojgaard provides some help in this respect, by
da Silva, Arlindo M.; Norris, Peter M.
2013-01-01
Part I presented a Monte Carlo Bayesian method for constraining a complex statistical model of GCM sub-gridcolumn moisture variability using high-resolution MODIS cloud data, thereby permitting large-scale model parameter estimation and cloud data assimilation. This part performs some basic testing of this new approach, verifying that it does indeed significantly reduce mean and standard deviation biases with respect to the assimilated MODIS cloud optical depth, brightness temperature and cloud top pressure, and that it also improves the simulated rotational-Ramman scattering cloud optical centroid pressure (OCP) against independent (non-assimilated) retrievals from the OMI instrument. Of particular interest, the Monte Carlo method does show skill in the especially difficult case where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach allows finite jumps into regions of non-zero cloud probability. In the example provided, the method is able to restore marine stratocumulus near the Californian coast where the background state has a clear swath. This paper also examines a number of algorithmic and physical sensitivities of the new method and provides guidance for its cost-effective implementation. One obvious difficulty for the method, and other cloud data assimilation methods as well, is the lack of information content in the cloud observables on cloud vertical structure, beyond cloud top pressure and optical thickness, thus necessitating strong dependence on the background vertical moisture structure. It is found that a simple flow-dependent correlation modification due to Riishojgaard (1998) provides some help in this respect, by better honoring inversion structures in the background state.
Practical Statistics for LHC Physicists: Bayesian Inference (3/3)
CERN. Geneva
2015-01-01
These lectures cover those principles and practices of statistics that are most relevant for work at the LHC. The first lecture discusses the basic ideas of descriptive statistics, probability and likelihood. The second lecture covers the key ideas in the frequentist approach, including confidence limits, profile likelihoods, p-values, and hypothesis testing. The third lecture covers inference in the Bayesian approach. Throughout, real-world examples will be used to illustrate the practical application of the ideas. No previous knowledge is assumed.
Bootstrap prediction and Bayesian prediction under misspecified models
Fushiki, Tadayoshi
2005-01-01
We consider a statistical prediction problem under misspecified models. In a sense, Bayesian prediction is an optimal prediction method when an assumed model is true. Bootstrap prediction is obtained by applying Breiman's `bagging' method to a plug-in prediction. Bootstrap prediction can be considered to be an approximation to the Bayesian prediction under the assumption that the model is true. However, in applications, there are frequently deviations from the assumed model. In this paper, bo...
Bayesian Model Averaging for Propensity Score Analysis.
Kaplan, David; Chen, Jianshen
2014-01-01
This article considers Bayesian model averaging as a means of addressing uncertainty in the selection of variables in the propensity score equation. We investigate an approximate Bayesian model averaging approach based on the model-averaged propensity score estimates produced by the R package BMA but that ignores uncertainty in the propensity score. We also provide a fully Bayesian model averaging approach via Markov chain Monte Carlo sampling (MCMC) to account for uncertainty in both parameters and models. A detailed study of our approach examines the differences in the causal estimate when incorporating noninformative versus informative priors in the model averaging stage. We examine these approaches under common methods of propensity score implementation. In addition, we evaluate the impact of changing the size of Occam's window used to narrow down the range of possible models. We also assess the predictive performance of both Bayesian model averaging propensity score approaches and compare it with the case without Bayesian model averaging. Overall, results show that both Bayesian model averaging propensity score approaches recover the treatment effect estimates well and generally provide larger uncertainty estimates, as expected. Both Bayesian model averaging approaches offer slightly better prediction of the propensity score compared with the Bayesian approach with a single propensity score equation. Covariate balance checks for the case study show that both Bayesian model averaging approaches offer good balance. The fully Bayesian model averaging approach also provides posterior probability intervals of the balance indices.
Bayesian inference – a way to combine statistical data and semantic analysis meaningfully
Directory of Open Access Journals (Sweden)
Eila Lindfors
2011-11-01
Full Text Available This article focuses on presenting the possibilities of Bayesian modelling (Finite Mixture Modelling in the semantic analysis of statistically modelled data. The probability of a hypothesis in relation to the data available is an important question in inductive reasoning. Bayesian modelling allows the researcher to use many models at a time and provides tools to evaluate the goodness of different models. The researcher should always be aware that there is no such thing as the exact probability of an exact event. This is the reason for using probabilistic models. Each model presents a different perspective on the phenomenon in focus, and the researcher has to choose the most probable model with a view to previous research and the knowledge available.The idea of Bayesian modelling is illustrated here by presenting two different sets of data, one from craft science research (n=167 and the other (n=63 from educational research (Lindfors, 2007, 2002. The principles of how to build models and how to combine different profiles are described in the light of the research mentioned.Bayesian modelling is an analysis based on calculating probabilities in relation to a specific set of quantitative data. It is a tool for handling data and interpreting it semantically. The reliability of the analysis arises from an argumentation of which model can be selected from the model space as the basis for an interpretation, and on which arguments.Keywords: method, sloyd, Bayesian modelling, student teachersURN:NBN:no-29959
Flexible Bayesian Human Fecundity Models.
Kim, Sungduk; Sundaram, Rajeshwari; Buck Louis, Germaine M; Pyper, Cecilia
2012-12-01
Human fecundity is an issue of considerable interest for both epidemiological and clinical audiences, and is dependent upon a couple's biologic capacity for reproduction coupled with behaviors that place a couple at risk for pregnancy. Bayesian hierarchical models have been proposed to better model the conception probabilities by accounting for the acts of intercourse around the day of ovulation, i.e., during the fertile window. These models can be viewed in the framework of a generalized nonlinear model with an exponential link. However, a fixed choice of link function may not always provide the best fit, leading to potentially biased estimates for probability of conception. Motivated by this, we propose a general class of models for fecundity by relaxing the choice of the link function under the generalized nonlinear model framework. We use a sample from the Oxford Conception Study (OCS) to illustrate the utility and fit of this general class of models for estimating human conception. Our findings reinforce the need for attention to be paid to the choice of link function in modeling conception, as it may bias the estimation of conception probabilities. Various properties of the proposed models are examined and a Markov chain Monte Carlo sampling algorithm was developed for implementing the Bayesian computations. The deviance information criterion measure and logarithm of pseudo marginal likelihood are used for guiding the choice of links. The supplemental material section contains technical details of the proof of the theorem stated in the paper, and contains further simulation results and analysis.
Bayesian operational risk models
Silvia Figini; Lijun Gao; Paolo Giudici
2013-01-01
Operational risk is hard to quantify, for the presence of heavy tailed loss distributions. Extreme value distributions, used in this context, are very sensitive to the data, and this is a problem in the presence of rare loss data. Self risk assessment questionnaires, if properly modelled, may provide the missing piece of information that is necessary to adequately estimate op- erational risks. In this paper we propose to embody self risk assessment data into suitable prior distributions, and ...
To P or Not to P: Backing Bayesian Statistics.
Buchinsky, Farrel J; Chadha, Neil K
2017-12-01
In biomedical research, it is imperative to differentiate chance variation from truth before we generalize what we see in a sample of subjects to the wider population. For decades, we have relied on null hypothesis significance testing, where we calculate P values for our data to decide whether to reject a null hypothesis. This methodology is subject to substantial misinterpretation and errant conclusions. Instead of working backward by calculating the probability of our data if the null hypothesis were true, Bayesian statistics allow us instead to work forward, calculating the probability of our hypothesis given the available data. This methodology gives us a mathematical means of incorporating our "prior probabilities" from previous study data (if any) to produce new "posterior probabilities." Bayesian statistics tell us how confidently we should believe what we believe. It is time to embrace and encourage their use in our otolaryngology research.
Bayesian variable order Markov models: Towards Bayesian predictive state representations
Dimitrakakis, C.
2009-01-01
We present a Bayesian variable order Markov model that shares many similarities with predictive state representations. The resulting models are compact and much easier to specify and learn than classical predictive state representations. Moreover, we show that they significantly outperform a more
Modeling Diagnostic Assessments with Bayesian Networks
Almond, Russell G.; DiBello, Louis V.; Moulder, Brad; Zapata-Rivera, Juan-Diego
2007-01-01
This paper defines Bayesian network models and examines their applications to IRT-based cognitive diagnostic modeling. These models are especially suited to building inference engines designed to be synchronous with the finer grained student models that arise in skills diagnostic assessment. Aspects of the theory and use of Bayesian network models…
Bayesian analysis of a correlated binomial model
Diniz, Carlos A. R.; Tutia, Marcelo H.; Leite, Jose G.
2010-01-01
In this paper a Bayesian approach is applied to the correlated binomial model, CB(n, p, ρ), proposed by Luceño (Comput. Statist. Data Anal. 20 (1995) 511–520). The data augmentation scheme is used in order to overcome the complexity of the mixture likelihood. MCMC methods, including Gibbs sampling and Metropolis within Gibbs, are applied to estimate the posterior marginal for the probability of success p and for the correlation coefficient ρ. The sensitivity of the posterior is studied taking...
Improving Transparency and Replication in Bayesian Statistics : The WAMBS-Checklist
Depaoli, Sarah; van de Schoot, Rens
2017-01-01
Bayesian statistical methods are slowly creeping into all fields of science and are becoming ever more popular in applied research. Although it is very attractive to use Bayesian statistics, our personal experience has led us to believe that naively applying Bayesian methods can be dangerous for at
Biedermann, A; Taroni, F; Bozza, S
2009-12-15
As a thorough aggregation of probability and graph theory, Bayesian networks currently enjoy widespread interest as a means for studying factors that affect the coherent evaluation of scientific evidence in forensic science. Paper I of this series of papers intends to contribute to the discussion of Bayesian networks as a framework that is helpful for both illustrating and implementing statistical procedures that are commonly employed for the study of uncertainties (e.g. the estimation of unknown quantities). While the respective statistical procedures are widely described in literature, the primary aim of this paper is to offer an essentially non-technical introduction on how interested readers may use these analytical approaches--with the help of Bayesian networks--for processing their own forensic science data. Attention is mainly drawn to the structure and underlying rationale of a series of basic and context-independent network fragments that users may incorporate as building blocs while constructing larger inference models. As an example of how this may be done, the proposed concepts will be used in a second paper (Part II) for specifying graphical probability networks whose purpose is to assist forensic scientists in the evaluation of scientific evidence encountered in the context of forensic document examination (i.e. results of the analysis of black toners present on printed or copied documents).
Properties of the Bayesian Knowledge Tracing Model
van de Sande, Brett
2013-01-01
Bayesian Knowledge Tracing is used very widely to model student learning. It comes in two different forms: The first form is the Bayesian Knowledge Tracing "hidden Markov model" which predicts the probability of correct application of a skill as a function of the number of previous opportunities to apply that skill and the model…
Bayesians versus frequentists a philosophical debate on statistical reasoning
Vallverdú, Jordi
2016-01-01
This book analyzes the origins of statistical thinking as well as its related philosophical questions, such as causality, determinism or chance. Bayesian and frequentist approaches are subjected to a historical, cognitive and epistemological analysis, making it possible to not only compare the two competing theories, but to also find a potential solution. The work pursues a naturalistic approach, proceeding from the existence of numerosity in natural environments to the existence of contemporary formulas and methodologies to heuristic pragmatism, a concept introduced in the book’s final section. This monograph will be of interest to philosophers and historians of science and students in related fields. Despite the mathematical nature of the topic, no statistical background is required, making the book a valuable read for anyone interested in the history of statistics and human cognition.
2012-01-01
Background A statistical analysis plan (SAP) is a critical link between how a clinical trial is conducted and the clinical study report. To secure objective study results, regulatory bodies expect that the SAP will meet requirements in pre-specifying inferential analyses and other important statistical techniques. To write a good SAP for model-based sensitivity and ancillary analyses involves non-trivial decisions on and justification of many aspects of the chosen setting. In particular, trials with longitudinal count data as primary endpoints pose challenges for model choice and model validation. In the random effects setting, frequentist strategies for model assessment and model diagnosis are complex and not easily implemented and have several limitations. Therefore, it is of interest to explore Bayesian alternatives which provide the needed decision support to finalize a SAP. Methods We focus on generalized linear mixed models (GLMMs) for the analysis of longitudinal count data. A series of distributions with over- and under-dispersion is considered. Additionally, the structure of the variance components is modified. We perform a simulation study to investigate the discriminatory power of Bayesian tools for model criticism in different scenarios derived from the model setting. We apply the findings to the data from an open clinical trial on vertigo attacks. These data are seen as pilot data for an ongoing phase III trial. To fit GLMMs we use a novel Bayesian computational approach based on integrated nested Laplace approximations (INLAs). The INLA methodology enables the direct computation of leave-one-out predictive distributions. These distributions are crucial for Bayesian model assessment. We evaluate competing GLMMs for longitudinal count data according to the deviance information criterion (DIC) or probability integral transform (PIT), and by using proper scoring rules (e.g. the logarithmic score). Results The instruments under study provide excellent
Directory of Open Access Journals (Sweden)
Adrion Christine
2012-09-01
Full Text Available Abstract Background A statistical analysis plan (SAP is a critical link between how a clinical trial is conducted and the clinical study report. To secure objective study results, regulatory bodies expect that the SAP will meet requirements in pre-specifying inferential analyses and other important statistical techniques. To write a good SAP for model-based sensitivity and ancillary analyses involves non-trivial decisions on and justification of many aspects of the chosen setting. In particular, trials with longitudinal count data as primary endpoints pose challenges for model choice and model validation. In the random effects setting, frequentist strategies for model assessment and model diagnosis are complex and not easily implemented and have several limitations. Therefore, it is of interest to explore Bayesian alternatives which provide the needed decision support to finalize a SAP. Methods We focus on generalized linear mixed models (GLMMs for the analysis of longitudinal count data. A series of distributions with over- and under-dispersion is considered. Additionally, the structure of the variance components is modified. We perform a simulation study to investigate the discriminatory power of Bayesian tools for model criticism in different scenarios derived from the model setting. We apply the findings to the data from an open clinical trial on vertigo attacks. These data are seen as pilot data for an ongoing phase III trial. To fit GLMMs we use a novel Bayesian computational approach based on integrated nested Laplace approximations (INLAs. The INLA methodology enables the direct computation of leave-one-out predictive distributions. These distributions are crucial for Bayesian model assessment. We evaluate competing GLMMs for longitudinal count data according to the deviance information criterion (DIC or probability integral transform (PIT, and by using proper scoring rules (e.g. the logarithmic score. Results The instruments under study
A Bayesian approach to model uncertainty
International Nuclear Information System (INIS)
Buslik, A.
1994-01-01
A Bayesian approach to model uncertainty is taken. For the case of a finite number of alternative models, the model uncertainty is equivalent to parameter uncertainty. A derivation based on Savage's partition problem is given
Inventory model using bayesian dynamic linear model for demand forecasting
Directory of Open Access Journals (Sweden)
Marisol Valencia-Cárdenas
2014-12-01
Full Text Available An important factor of manufacturing process is the inventory management of terminated product. Constantly, industry is looking for better alternatives to establish an adequate plan of production and stored quantities, with optimal cost, getting quantities in a time horizon, which permits to define resources and logistics with anticipation, needed to distribute products on time. Total absence of historical data, required by many statistical models to forecast, demands the search for other kind of accurate techniques. This work presents an alternative that not only permits to forecast, in an adjusted way, but also, to provide optimal quantities to produce and store with an optimal cost, using Bayesian statistics. The proposal is illustrated with real data. Palabras clave: estadística bayesiana, optimización, modelo de inventarios, modelo lineal dinámico bayesiano. Keywords: Bayesian statistics, opti
Bayesian estimation of parameters in a regional hydrological model
Directory of Open Access Journals (Sweden)
K. Engeland
2002-01-01
Full Text Available This study evaluates the applicability of the distributed, process-oriented Ecomag model for prediction of daily streamflow in ungauged basins. The Ecomag model is applied as a regional model to nine catchments in the NOPEX area, using Bayesian statistics to estimate the posterior distribution of the model parameters conditioned on the observed streamflow. The distribution is calculated by Markov Chain Monte Carlo (MCMC analysis. The Bayesian method requires formulation of a likelihood function for the parameters and three alternative formulations are used. The first is a subjectively chosen objective function that describes the goodness of fit between the simulated and observed streamflow, as defined in the GLUE framework. The second and third formulations are more statistically correct likelihood models that describe the simulation errors. The full statistical likelihood model describes the simulation errors as an AR(1 process, whereas the simple model excludes the auto-regressive part. The statistical parameters depend on the catchments and the hydrological processes and the statistical and the hydrological parameters are estimated simultaneously. The results show that the simple likelihood model gives the most robust parameter estimates. The simulation error may be explained to a large extent by the catchment characteristics and climatic conditions, so it is possible to transfer knowledge about them to ungauged catchments. The statistical models for the simulation errors indicate that structural errors in the model are more important than parameter uncertainties. Keywords: regional hydrological model, model uncertainty, Bayesian analysis, Markov Chain Monte Carlo analysis
WebBUGS: Conducting Bayesian Statistical Analysis Online
Directory of Open Access Journals (Sweden)
Zhiyong Zhang
2014-11-01
Full Text Available A web interface, named WebBUGS, is developed to conduct Bayesian analysis online over the Internet through OpenBUGS and R. WebBUGS can be used with the minimum requirement of a web browser both remotely and locally. WebBUGS has many collaborative features such as email notification and sharing. WebBUGS also eases the use of OpenBUGS by providing built-in model templates, data management module, and other useful modules. In this paper, the use of WebBUGS is illustrated and discussed.
Bayesian Statistics-The Theory of Inverse Probability
Indian Academy of Sciences (India)
Statistical inference; inductive inference; probability model; likelihood function; prior probability; posterior probability; estimation; estimation error; maximum likelihood estimate; maximum a posteriori estimate; penalized likelihood; statistical computing; Bayes theorem; confidence interval.
Nonparametric Bayesian Modeling of Complex Networks
DEFF Research Database (Denmark)
Schmidt, Mikkel Nørgaard; Mørup, Morten
2013-01-01
an infinite mixture model as running example, we go through the steps of deriving the model as an infinite limit of a finite parametric model, inferring the model parameters by Markov chain Monte Carlo, and checking the model?s fit and predictive performance. We explain how advanced nonparametric models......Modeling structure in complex networks using Bayesian nonparametrics makes it possible to specify flexible model structures and infer the adequate model complexity from the observed data. This article provides a gentle introduction to nonparametric Bayesian modeling of complex networks: Using...
Quantification and propagation of disciplinary uncertainty via Bayesian statistics
Mantis, George Constantine
2002-08-01
Several needs exist in the military, commercial, and civil sectors for new hypersonic systems. These needs remain unfulfilled, due in part to the uncertainty encountered in designing these systems. This uncertainty takes a number of forms, including disciplinary uncertainty, that which is inherent in the analytical tools utilized during the design process. Yet, few efforts to date empower the designer with the means to account for this uncertainty within the disciplinary analyses. In the current state-of-the-art in design, the effects of this unquantifiable uncertainty significantly increase the risks associated with new design efforts. Typically, the risk proves too great to allow a given design to proceed beyond the conceptual stage. To that end, the research encompasses the formulation and validation of a new design method, a systematic process for probabilistically assessing the impact of disciplinary uncertainty. The method implements Bayesian Statistics theory to quantify this source of uncertainty, and propagate its effects to the vehicle system level. Comparison of analytical and physical data for existing systems, modeled a priori in the given analysis tools, leads to quantification of uncertainty in those tools' calculation of discipline-level metrics. Then, after exploration of the new vehicle's design space, the quantified uncertainty is propagated probabilistically through the design space. This ultimately results in the assessment of the impact of disciplinary uncertainty on the confidence in the design solution: the final shape and variability of the probability functions defining the vehicle's system-level metrics. Although motivated by the hypersonic regime, the proposed treatment of uncertainty applies to any class of aerospace vehicle, just as the problem itself affects the design process of any vehicle. A number of computer programs comprise the environment constructed for the implementation of this work. Application to a single
Conjunction analysis and propositional logic in fMRI data analysis using Bayesian statistics.
Rudert, Thomas; Lohmann, Gabriele
2008-12-01
To evaluate logical expressions over different effects in data analyses using the general linear model (GLM) and to evaluate logical expressions over different posterior probability maps (PPMs). In functional magnetic resonance imaging (fMRI) data analysis, the GLM was applied to estimate unknown regression parameters. Based on the GLM, Bayesian statistics can be used to determine the probability of conjunction, disjunction, implication, or any other arbitrary logical expression over different effects or contrast. For second-level inferences, PPMs from individual sessions or subjects are utilized. These PPMs can be combined to a logical expression and its probability can be computed. The methods proposed in this article are applied to data from a STROOP experiment and the methods are compared to conjunction analysis approaches for test-statistics. The combination of Bayesian statistics with propositional logic provides a new approach for data analyses in fMRI. Two different methods are introduced for propositional logic: the first for analyses using the GLM and the second for common inferences about different probability maps. The methods introduced extend the idea of conjunction analysis to a full propositional logic and adapt it from test-statistics to Bayesian statistics. The new approaches allow inferences that are not possible with known standard methods in fMRI. (c) 2008 Wiley-Liss, Inc.
MERGING DIGITAL SURFACE MODELS IMPLEMENTING BAYESIAN APPROACHES
Directory of Open Access Journals (Sweden)
H. Sadeq
2016-06-01
Full Text Available In this research different DSMs from different sources have been merged. The merging is based on a probabilistic model using a Bayesian Approach. The implemented data have been sourced from very high resolution satellite imagery sensors (e.g. WorldView-1 and Pleiades. It is deemed preferable to use a Bayesian Approach when the data obtained from the sensors are limited and it is difficult to obtain many measurements or it would be very costly, thus the problem of the lack of data can be solved by introducing a priori estimations of data. To infer the prior data, it is assumed that the roofs of the buildings are specified as smooth, and for that purpose local entropy has been implemented. In addition to the a priori estimations, GNSS RTK measurements have been collected in the field which are used as check points to assess the quality of the DSMs and to validate the merging result. The model has been applied in the West-End of Glasgow containing different kinds of buildings, such as flat roofed and hipped roofed buildings. Both quantitative and qualitative methods have been employed to validate the merged DSM. The validation results have shown that the model was successfully able to improve the quality of the DSMs and improving some characteristics such as the roof surfaces, which consequently led to better representations. In addition to that, the developed model has been compared with the well established Maximum Likelihood model and showed similar quantitative statistical results and better qualitative results. Although the proposed model has been applied on DSMs that were derived from satellite imagery, it can be applied to any other sourced DSMs.
Chemical identification using Bayesian model selection
Energy Technology Data Exchange (ETDEWEB)
Burr, Tom; Fry, H. A. (Herbert A.); McVey, B. D. (Brian D.); Sander, E. (Eric)
2002-01-01
Remote detection and identification of chemicals in a scene is a challenging problem. We introduce an approach that uses some of the image's pixels to establish the background characteristics while other pixels represent the target for which we seek to identify all chemical species present. This leads to a generalized least squares problem in which we focus on 'subset selection' to identify the chemicals thought to be present. Bayesian model selection allows us to approximate the posterior probability that each chemical in the library is present by adding the posterior probabilities of all the subsets which include the chemical. We present results using realistic simulated data for the case with 1 to 5 chemicals present in each target and compare performance to a hybrid of forward and backward stepwise selection procedure using the F statistic.
Combination of Bayesian Network and Overlay Model in User Modeling
Directory of Open Access Journals (Sweden)
Loc Nguyen
2009-12-01
Full Text Available The core of adaptive system is user model containing personal information such as knowledge, learning styles, goals… which is requisite for learning personalized process. There are many modeling approaches, for example: stereotype, overlay, plan recognition… but they don’t bring out the solid method for reasoning from user model. This paper introduces the statistical method that combines Bayesian network and overlay modeling so that it is able to infer user’s knowledge from evidences collected during user’s learning process.
Bayesian nonparametric estimation of hazard rate in monotone Aalen model
Czech Academy of Sciences Publication Activity Database
Timková, Jana
2014-01-01
Roč. 50, č. 6 (2014), s. 849-868 ISSN 0023-5954 Institutional support: RVO:67985556 Keywords : Aalen model * Bayesian estimation * MCMC Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.541, year: 2014 http://library.utia.cas.cz/separaty/2014/SI/timkova-0438210.pdf
Kruschke, John K; Liddell, Torrin M
2018-02-01
In the practice of data analysis, there is a conceptual distinction between hypothesis testing, on the one hand, and estimation with quantified uncertainty on the other. Among frequentists in psychology, a shift of emphasis from hypothesis testing to estimation has been dubbed "the New Statistics" (Cumming 2014). A second conceptual distinction is between frequentist methods and Bayesian methods. Our main goal in this article is to explain how Bayesian methods achieve the goals of the New Statistics better than frequentist methods. The article reviews frequentist and Bayesian approaches to hypothesis testing and to estimation with confidence or credible intervals. The article also describes Bayesian approaches to meta-analysis, randomized controlled trials, and power analysis.
Bayesian statistics applied to neutron activation data for reactor flux spectrum analysis
International Nuclear Information System (INIS)
Chiesa, Davide; Previtali, Ezio; Sisti, Monica
2014-01-01
Highlights: • Bayesian statistics to analyze the neutron flux spectrum from activation data. • Rigorous statistical approach for accurate evaluation of the neutron flux groups. • Cross section and activation data uncertainties included for the problem solution. • Flexible methodology applied to analyze different nuclear reactor flux spectra. • The results are in good agreement with the MCNP simulations of neutron fluxes. - Abstract: In this paper, we present a statistical method, based on Bayesian statistics, to analyze the neutron flux spectrum from the activation data of different isotopes. The experimental data were acquired during a neutron activation experiment performed at the TRIGA Mark II reactor of Pavia University (Italy) in four irradiation positions characterized by different neutron spectra. In order to evaluate the neutron flux spectrum, subdivided in energy groups, a system of linear equations, containing the group effective cross sections and the activation rate data, has to be solved. However, since the system’s coefficients are experimental data affected by uncertainties, a rigorous statistical approach is fundamental for an accurate evaluation of the neutron flux groups. For this purpose, we applied the Bayesian statistical analysis, that allows to include the uncertainties of the coefficients and the a priori information about the neutron flux. A program for the analysis of Bayesian hierarchical models, based on Markov Chain Monte Carlo (MCMC) simulations, was used to define the problem statistical model and solve it. The first analysis involved the determination of the thermal, resonance-intermediate and fast flux components and the dependence of the results on the Prior distribution choice was investigated to confirm the reliability of the Bayesian analysis. After that, the main resonances of the activation cross sections were analyzed to implement multi-group models with finer energy subdivisions that would allow to determine the
Kwon, Deukwoo; Reis, Isildinha M.
2016-01-01
Background: We proposed approximate Bayesian computation with single distribution selection (ABC-SD) for estimating mean and standard deviation from other reported summary statistics. The ABC-SD generates pseudo data from a single parametric distribution thought to be the true distribution of underlying study data. This single distribution is either an educated guess, or it is selected via model selection using posterior probability criterion for testing two or more candidate distributions. F...
Spatial and spatio-temporal bayesian models with R - INLA
Blangiardo, Marta
2015-01-01
Dedication iiiPreface ix1 Introduction 11.1 Why spatial and spatio-temporal statistics? 11.2 Why do we use Bayesian methods for modelling spatial and spatio-temporal structures? 21.3 Why INLA? 31.4 Datasets 32 Introduction to 212.1 The language 212.2 objects 222.3 Data and session management 342.4 Packages 352.5 Programming in 362.6 Basic statistical analysis with 393 Introduction to Bayesian Methods 533.1 Bayesian Philosophy 533.2 Basic Probability Elements 573.3 Bayes Theorem 623.4 Prior and Posterior Distributions 643.5 Working with the Posterior Distribution 663.6 Choosing the Prior Distr
Accurate phenotyping: Reconciling approaches through Bayesian model averaging.
Directory of Open Access Journals (Sweden)
Carla Chia-Ming Chen
Full Text Available Genetic research into complex diseases is frequently hindered by a lack of clear biomarkers for phenotype ascertainment. Phenotypes for such diseases are often identified on the basis of clinically defined criteria; however such criteria may not be suitable for understanding the genetic composition of the diseases. Various statistical approaches have been proposed for phenotype definition; however our previous studies have shown that differences in phenotypes estimated using different approaches have substantial impact on subsequent analyses. Instead of obtaining results based upon a single model, we propose a new method, using Bayesian model averaging to overcome problems associated with phenotype definition. Although Bayesian model averaging has been used in other fields of research, this is the first study that uses Bayesian model averaging to reconcile phenotypes obtained using multiple models. We illustrate the new method by applying it to simulated genetic and phenotypic data for Kofendred personality disorder-an imaginary disease with several sub-types. Two separate statistical methods were used to identify clusters of individuals with distinct phenotypes: latent class analysis and grade of membership. Bayesian model averaging was then used to combine the two clusterings for the purpose of subsequent linkage analyses. We found that causative genetic loci for the disease produced higher LOD scores using model averaging than under either individual model separately. We attribute this improvement to consolidation of the cores of phenotype clusters identified using each individual method.
Model parameter updating using Bayesian networks
Energy Technology Data Exchange (ETDEWEB)
Treml, C. A. (Christine A.); Ross, Timothy J.
2004-01-01
This paper outlines a model parameter updating technique for a new method of model validation using a modified model reference adaptive control (MRAC) framework with Bayesian Networks (BNs). The model parameter updating within this method is generic in the sense that the model/simulation to be validated is treated as a black box. It must have updateable parameters to which its outputs are sensitive, and those outputs must have metrics that can be compared to that of the model reference, i.e., experimental data. Furthermore, no assumptions are made about the statistics of the model parameter uncertainty, only upper and lower bounds need to be specified. This method is designed for situations where a model is not intended to predict a complete point-by-point time domain description of the item/system behavior; rather, there are specific points, features, or events of interest that need to be predicted. These specific points are compared to the model reference derived from actual experimental data. The logic for updating the model parameters to match the model reference is formed via a BN. The nodes of this BN consist of updateable model input parameters and the specific output values or features of interest. Each time the model is executed, the input/output pairs are used to adapt the conditional probabilities of the BN. Each iteration further refines the inferred model parameters to produce the desired model output. After parameter updating is complete and model inputs are inferred, reliabilities for the model output are supplied. Finally, this method is applied to a simulation of a resonance control cooling system for a prototype coupled cavity linac. The results are compared to experimental data.
Konijn, Elly A.; van de Schoot, Rens; Winter, Sonja D.; Ferguson, Christopher J.
2015-01-01
The present paper argues that an important cause of publication bias resides in traditional frequentist statistics forcing binary decisions. An alternative approach through Bayesian statistics provides various degrees of support for any hypothesis allowing balanced decisions and proper null
Bayesian modeling of unknown diseases for biosurveillance.
Shen, Yanna; Cooper, Gregory F
2009-11-14
This paper investigates Bayesian modeling of unknown causes of events in the context of disease-outbreak detection. We introduce a Bayesian approach that models and detects both (1) known diseases (e.g., influenza and anthrax) by using informative prior probabilities and (2) unknown diseases (e.g., a new, highly contagious respiratory virus that has never been seen before) by using relatively non-informative prior probabilities. We report the results of simulation experiments which support that this modeling method can improve the detection of new disease outbreaks in a population. A key contribution of this paper is that it introduces a Bayesian approach for jointly modeling both known and unknown causes of events. Such modeling has broad applicability in medical informatics, where the space of known causes of outcomes of interest is seldom complete.
A Bayesian Markov geostatistical model for estimation of hydrogeological properties
International Nuclear Information System (INIS)
Rosen, L.; Gustafson, G.
1996-01-01
A geostatistical methodology based on Markov-chain analysis and Bayesian statistics was developed for probability estimations of hydrogeological and geological properties in the siting process of a nuclear waste repository. The probability estimates have practical use in decision-making on issues such as siting, investigation programs, and construction design. The methodology is nonparametric which makes it possible to handle information that does not exhibit standard statistical distributions, as is often the case for classified information. Data do not need to meet the requirements on additivity and normality as with the geostatistical methods based on regionalized variable theory, e.g., kriging. The methodology also has a formal way for incorporating professional judgments through the use of Bayesian statistics, which allows for updating of prior estimates to posterior probabilities each time new information becomes available. A Bayesian Markov Geostatistical Model (BayMar) software was developed for implementation of the methodology in two and three dimensions. This paper gives (1) a theoretical description of the Bayesian Markov Geostatistical Model; (2) a short description of the BayMar software; and (3) an example of application of the model for estimating the suitability for repository establishment with respect to the three parameters of lithology, hydraulic conductivity, and rock quality designation index (RQD) at 400--500 meters below ground surface in an area around the Aespoe Hard Rock Laboratory in southeastern Sweden
Posterior Predictive Model Checking in Bayesian Networks
Crawford, Aaron
2014-01-01
This simulation study compared the utility of various discrepancy measures within a posterior predictive model checking (PPMC) framework for detecting different types of data-model misfit in multidimensional Bayesian network (BN) models. The investigated conditions were motivated by an applied research program utilizing an operational complex…
Robust bayesian analysis of an autoregressive model with ...
African Journals Online (AJOL)
In this work, robust Bayesian analysis of the Bayesian estimation of an autoregressive model with exponential innovations is performed. Using a Bayesian robustness methodology, we show that, using a suitable generalized quadratic loss, we obtain optimal Bayesian estimators of the parameters corresponding to the ...
Empirical Bayesian inference and model uncertainty
International Nuclear Information System (INIS)
Poern, K.
1994-01-01
This paper presents a hierarchical or multistage empirical Bayesian approach for the estimation of uncertainty concerning the intensity of a homogeneous Poisson process. A class of contaminated gamma distributions is considered to describe the uncertainty concerning the intensity. These distributions in turn are defined through a set of secondary parameters, the knowledge of which is also described and updated via Bayes formula. This two-stage Bayesian approach is an example where the modeling uncertainty is treated in a comprehensive way. Each contaminated gamma distributions, represented by a point in the 3D space of secondary parameters, can be considered as a specific model of the uncertainty about the Poisson intensity. Then, by the empirical Bayesian method each individual model is assigned a posterior probability
2015-10-24
Taroni, F, Aitken, C, Garbolino, P, Biedermann, A, Bayesian Networks and Probabilistic Inference in Forensic Science (Statistics in Practice), Wiley...were transformed into a Bayesian network . Bayesian networks allow for the assessment of evidence based upon two propositions (same gun or different...gun). This allows a forensic scientist to provide insight to courts and investigators as to the value of the evidence. The breech face (BF) and
BDgraph: An R Package for Bayesian Structure Learning in Graphical Models
Mohammadi, A.; Wit, E.C.
2017-01-01
Graphical models provide powerful tools to uncover complicated patterns in multivariate data and are commonly used in Bayesian statistics and machine learning. In this paper, we introduce an R package BDgraph which performs Bayesian structure learning for general undirected graphical models with
Bayesian mixture models for partially verified data
DEFF Research Database (Denmark)
Kostoulas, Polychronis; Browne, William J.; Nielsen, Søren Saxmose
2013-01-01
Bayesian mixture models can be used to discriminate between the distributions of continuous test responses for different infection stages. These models are particularly useful in case of chronic infections with a long latent period, like Mycobacterium avium subsp. paratuberculosis (MAP) infection...
Bayesian Dimensionality Assessment for the Multidimensional Nominal Response Model
Directory of Open Access Journals (Sweden)
Javier Revuelta
2017-06-01
Full Text Available This article introduces Bayesian estimation and evaluation procedures for the multidimensional nominal response model. The utility of this model is to perform a nominal factor analysis of items that consist of a finite number of unordered response categories. The key aspect of the model, in comparison with traditional factorial model, is that there is a slope for each response category on the latent dimensions, instead of having slopes associated to the items. The extended parameterization of the multidimensional nominal response model requires large samples for estimation. When sample size is of a moderate or small size, some of these parameters may be weakly empirically identifiable and the estimation algorithm may run into difficulties. We propose a Bayesian MCMC inferential algorithm to estimate the parameters and the number of dimensions underlying the multidimensional nominal response model. Two Bayesian approaches to model evaluation were compared: discrepancy statistics (DIC, WAICC, and LOO that provide an indication of the relative merit of different models, and the standardized generalized discrepancy measure that requires resampling data and is computationally more involved. A simulation study was conducted to compare these two approaches, and the results show that the standardized generalized discrepancy measure can be used to reliably estimate the dimensionality of the model whereas the discrepancy statistics are questionable. The paper also includes an example with real data in the context of learning styles, in which the model is used to conduct an exploratory factor analysis of nominal data.
Applying Bayesian statistics to the study of psychological trauma: A suggestion for future research.
Yalch, Matthew M
2016-03-01
Several contemporary researchers have noted the virtues of Bayesian methods of data analysis. Although debates continue about whether conventional or Bayesian statistics is the "better" approach for researchers in general, there are reasons why Bayesian methods may be well suited to the study of psychological trauma in particular. This article describes how Bayesian statistics offers practical solutions to the problems of data non-normality, small sample size, and missing data common in research on psychological trauma. After a discussion of these problems and the effects they have on trauma research, this article explains the basic philosophical and statistical foundations of Bayesian statistics and how it provides solutions to these problems using an applied example. Results of the literature review and the accompanying example indicates the utility of Bayesian statistics in addressing problems common in trauma research. Bayesian statistics provides a set of methodological tools and a broader philosophical framework that is useful for trauma researchers. Methodological resources are also provided so that interested readers can learn more. (c) 2016 APA, all rights reserved).
Kaplan, David; Lee, Chansoon
2018-01-01
This article provides a review of Bayesian model averaging as a means of optimizing the predictive performance of common statistical models applied to large-scale educational assessments. The Bayesian framework recognizes that in addition to parameter uncertainty, there is uncertainty in the choice of models themselves. A Bayesian approach to addressing the problem of model uncertainty is the method of Bayesian model averaging. Bayesian model averaging searches the space of possible models for a set of submodels that satisfy certain scientific principles and then averages the coefficients across these submodels weighted by each model's posterior model probability (PMP). Using the weighted coefficients for prediction has been shown to yield optimal predictive performance according to certain scoring rules. We demonstrate the utility of Bayesian model averaging for prediction in education research with three examples: Bayesian regression analysis, Bayesian logistic regression, and a recently developed approach for Bayesian structural equation modeling. In each case, the model-averaged estimates are shown to yield better prediction of the outcome of interest than any submodel based on predictive coverage and the log-score rule. Implications for the design of large-scale assessments when the goal is optimal prediction in a policy context are discussed.
Quantifying Registration Uncertainty With Sparse Bayesian Modelling.
Le Folgoc, Loic; Delingette, Herve; Criminisi, Antonio; Ayache, Nicholas
2017-02-01
We investigate uncertainty quantification under a sparse Bayesian model of medical image registration. Bayesian modelling has proven powerful to automate the tuning of registration hyperparameters, such as the trade-off between the data and regularization functionals. Sparsity-inducing priors have recently been used to render the parametrization itself adaptive and data-driven. The sparse prior on transformation parameters effectively favors the use of coarse basis functions to capture the global trends in the visible motion while finer, highly localized bases are introduced only in the presence of coherent image information and motion. In earlier work, approximate inference under the sparse Bayesian model was tackled in an efficient Variational Bayes (VB) framework. In this paper we are interested in the theoretical and empirical quality of uncertainty estimates derived under this approximate scheme vs. under the exact model. We implement an (asymptotically) exact inference scheme based on reversible jump Markov Chain Monte Carlo (MCMC) sampling to characterize the posterior distribution of the transformation and compare the predictions of the VB and MCMC based methods. The true posterior distribution under the sparse Bayesian model is found to be meaningful: orders of magnitude for the estimated uncertainty are quantitatively reasonable, the uncertainty is higher in textureless regions and lower in the direction of strong intensity gradients.
Rationalizing method of replacement intervals by using Bayesian statistics
International Nuclear Information System (INIS)
Kasai, Masao; Notoya, Junichi; Kusakari, Yoshiyuki
2007-01-01
This study represents the formulations for rationalizing the replacement intervals of equipments and/or parts taking into account the probability density functions (PDF) of the parameters of failure distribution functions (FDF) and compares the optimized intervals by our formulations with those by conventional formulations which uses only representative values of the parameters of FDF instead of using these PDFs. The failure data are generated by Monte Carlo simulations since the real failure data can not be available for us. The PDF of PDF parameters are obtained by Bayesian method and the representative values are obtained by likelihood estimation and Bayesian method. We found that the method using PDF by Bayesian method brings longer replacement intervals than one using the representative of the parameters. (author)
Lee, Sik-Yum; Song, Xin-Yuan; Tang, Nian-Sheng
2007-01-01
The analysis of interaction among latent variables has received much attention. This article introduces a Bayesian approach to analyze a general structural equation model that accommodates the general nonlinear terms of latent variables and covariates. This approach produces a Bayesian estimate that has the same statistical optimal properties as a…
Bayesian Statistics at Work: the Troublesome Extraction of the CKM Phase alpha
Charles, J; Lacker, H; Le Diberder, F R; T'Jampens, S
2006-01-01
In Bayesian statistics, one's prior beliefs about underlying model parameters are revised with the information content of observed data from which, using Bayes' rule, a posterior belief is obtained. A non-trivial example taken from the isospin analysis of B-->PP (P = pi or rho) decays in heavy-flavor physics is chosen to illustrate the effect of the naive "objective" choice of flat priors in a multi-dimensional parameter space in presence of mirror solutions. It is demonstrated that the posterior distribution for the parameter of interest, the phase alpha, strongly depends on the choice of the parameterization in which the priors are uniform, and on the validity range in which the (un-normalizable) priors are truncated. We prove that the most probable values found by the Bayesian treatment do not coincide with the explicit analytical solution, in contrast to the frequentist approach. It is also shown in the appendix that the alpha-->0 limit cannot be consistently treated in the Bayesian paradigm, because the ...
Bayesian Statistics at Work: the Troublesome Extraction of the CKM Phase α
International Nuclear Information System (INIS)
Charles, J.; Hoecker, A.; Lacker, H.; Le Diberder, F.R.; T'Jampens, S.
2007-04-01
In Bayesian statistics, one's prior beliefs about underlying model parameters are revised with the information content of observed data from which, using Bayes' rule, a posterior belief is obtained. A non-trivial example taken from the isospin analysis of B → PP (P = π or ρ) decays in heavy-flavor physics is chosen to illustrate the effect of the naive 'objective' choice of flat priors in a multi- dimensional parameter space in presence of mirror solutions. It is demonstrated that the posterior distribution for the parameter of interest, the phase α, strongly depends on the choice of the parameterization in which the priors are uniform, and on the validity range in which the (un-normalizable) priors are truncated. We prove that the most probable values found by the Bayesian treatment do not coincide with the explicit analytical solutions, in contrast to the frequentist approach. It is also shown in the appendix that the α → 0 limit cannot be consistently treated in the Bayesian paradigm, because the latter violates the physical symmetries of the problem. (authors)
Distributed Bayesian Networks for User Modeling
DEFF Research Database (Denmark)
Tedesco, Roberto; Dolog, Peter; Nejdl, Wolfgang
2006-01-01
The World Wide Web is a popular platform for providing eLearning applications to a wide spectrum of users. However – as users differ in their preferences, background, requirements, and goals – applications should provide personalization mechanisms. In the Web context, user models used...... by such adaptive applications are often partial fragments of an overall user model. The fragments have then to be collected and merged into a global user profile. In this paper we investigate and present algorithms able to cope with distributed, fragmented user models – based on Bayesian Networks – in the context...... of Web-based eLearning platforms. The scenario we are tackling assumes learners who use several systems over time, which are able to create partial Bayesian Networks for user models based on the local system context. In particular, we focus on how to merge these partial user models. Our merge mechanism...
Hierarchical Bayesian spatial models for multispecies conservation planning and monitoring.
Carroll, Carlos; Johnson, Devin S; Dunk, Jeffrey R; Zielinski, William J
2010-12-01
Biologists who develop and apply habitat models are often familiar with the statistical challenges posed by their data's spatial structure but are unsure of whether the use of complex spatial models will increase the utility of model results in planning. We compared the relative performance of nonspatial and hierarchical Bayesian spatial models for three vertebrate and invertebrate taxa of conservation concern (Church's sideband snails [Monadenia churchi], red tree voles [Arborimus longicaudus], and Pacific fishers [Martes pennanti pacifica]) that provide examples of a range of distributional extents and dispersal abilities. We used presence-absence data derived from regional monitoring programs to develop models with both landscape and site-level environmental covariates. We used Markov chain Monte Carlo algorithms and a conditional autoregressive or intrinsic conditional autoregressive model framework to fit spatial models. The fit of Bayesian spatial models was between 35 and 55% better than the fit of nonspatial analogue models. Bayesian spatial models outperformed analogous models developed with maximum entropy (Maxent) methods. Although the best spatial and nonspatial models included similar environmental variables, spatial models provided estimates of residual spatial effects that suggested how ecological processes might structure distribution patterns. Spatial models built from presence-absence data improved fit most for localized endemic species with ranges constrained by poorly known biogeographic factors and for widely distributed species suspected to be strongly affected by unmeasured environmental variables or population processes. By treating spatial effects as a variable of interest rather than a nuisance, hierarchical Bayesian spatial models, especially when they are based on a common broad-scale spatial lattice (here the national Forest Inventory and Analysis grid of 24 km(2) hexagons), can increase the relevance of habitat models to multispecies
Bayesian Subset Modeling for High-Dimensional Generalized Linear Models
Liang, Faming
2013-06-01
This article presents a new prior setting for high-dimensional generalized linear models, which leads to a Bayesian subset regression (BSR) with the maximum a posteriori model approximately equivalent to the minimum extended Bayesian information criterion model. The consistency of the resulting posterior is established under mild conditions. Further, a variable screening procedure is proposed based on the marginal inclusion probability, which shares the same properties of sure screening and consistency with the existing sure independence screening (SIS) and iterative sure independence screening (ISIS) procedures. However, since the proposed procedure makes use of joint information from all predictors, it generally outperforms SIS and ISIS in real applications. This article also makes extensive comparisons of BSR with the popular penalized likelihood methods, including Lasso, elastic net, SIS, and ISIS. The numerical results indicate that BSR can generally outperform the penalized likelihood methods. The models selected by BSR tend to be sparser and, more importantly, of higher prediction ability. In addition, the performance of the penalized likelihood methods tends to deteriorate as the number of predictors increases, while this is not significant for BSR. Supplementary materials for this article are available online. © 2013 American Statistical Association.
Bayesian non parametric modelling of Higgs pair production
Directory of Open Access Journals (Sweden)
Scarpa Bruno
2017-01-01
Full Text Available Statistical classification models are commonly used to separate a signal from a background. In this talk we face the problem of isolating the signal of Higgs pair production using the decay channel in which each boson decays into a pair of b-quarks. Typically in this context non parametric methods are used, such as Random Forests or different types of boosting tools. We remain in the same non-parametric framework, but we propose to face the problem following a Bayesian approach. A Dirichlet process is used as prior for the random effects in a logit model which is fitted by leveraging the Polya-Gamma data augmentation. Refinements of the model include the insertion in the simple model of P-splines to relate explanatory variables with the response and the use of Bayesian trees (BART to describe the atoms in the Dirichlet process.
Modelling dependable systems using hybrid Bayesian networks
International Nuclear Information System (INIS)
Neil, Martin; Tailor, Manesh; Marquez, David; Fenton, Norman; Hearty, Peter
2008-01-01
A hybrid Bayesian network (BN) is one that incorporates both discrete and continuous nodes. In our extensive applications of BNs for system dependability assessment, the models are invariably hybrid and the need for efficient and accurate computation is paramount. We apply a new iterative algorithm that efficiently combines dynamic discretisation with robust propagation algorithms on junction tree structures to perform inference in hybrid BNs. We illustrate its use in the field of dependability with two example of reliability estimation. Firstly we estimate the reliability of a simple single system and next we implement a hierarchical Bayesian model. In the hierarchical model we compute the reliability of two unknown subsystems from data collected on historically similar subsystems and then input the result into a reliability block model to compute system level reliability. We conclude that dynamic discretisation can be used as an alternative to analytical or Monte Carlo methods with high precision and can be applied to a wide range of dependability problems
Bayesian disease mapping: hierarchical modeling in spatial epidemiology
National Research Council Canada - National Science Library
Lawson, Andrew
2013-01-01
.... Exploring these new developments, Bayesian Disease Mapping: Hierarchical Modeling in Spatial Epidemiology, Second Edition provides an up-to-date, cohesive account of the full range of Bayesian disease mapping methods and applications...
Constrained bayesian inference of project performance models
Sunmola, Funlade
2013-01-01
Project performance models play an important role in the management of project success. When used for monitoring projects, they can offer predictive ability such as indications of possible delivery problems. Approaches for monitoring project performance relies on available project information including restrictions imposed on the project, particularly the constraints of cost, quality, scope and time. We study in this paper a Bayesian inference methodology for project performance modelling in ...
Modeling error distributions of growth curve models through Bayesian methods.
Zhang, Zhiyong
2016-06-01
Growth curve models are widely used in social and behavioral sciences. However, typical growth curve models often assume that the errors are normally distributed although non-normal data may be even more common than normal data. In order to avoid possible statistical inference problems in blindly assuming normality, a general Bayesian framework is proposed to flexibly model normal and non-normal data through the explicit specification of the error distributions. A simulation study shows when the distribution of the error is correctly specified, one can avoid the loss in the efficiency of standard error estimates. A real example on the analysis of mathematical ability growth data from the Early Childhood Longitudinal Study, Kindergarten Class of 1998-99 is used to show the application of the proposed methods. Instructions and code on how to conduct growth curve analysis with both normal and non-normal error distributions using the the MCMC procedure of SAS are provided.
Bayesian methods for data analysis
Carlin, Bradley P.
2009-01-01
Approaches for statistical inference Introduction Motivating Vignettes Defining the Approaches The Bayes-Frequentist Controversy Some Basic Bayesian Models The Bayes approach Introduction Prior Distributions Bayesian Inference Hierarchical Modeling Model Assessment Nonparametric Methods Bayesian computation Introduction Asymptotic Methods Noniterative Monte Carlo Methods Markov Chain Monte Carlo Methods Model criticism and selection Bayesian Modeling Bayesian Robustness Model Assessment Bayes Factors via Marginal Density Estimation Bayes Factors
Embracing Uncertainty: The Interface of Bayesian Statistics and Cognitive Psychology
Directory of Open Access Journals (Sweden)
Judith L. Anderson
1998-06-01
Full Text Available Ecologists working in conservation and resource management are discovering the importance of using Bayesian analytic methods to deal explicitly with uncertainty in data analyses and decision making. However, Bayesian procedures require, as inputs and outputs, an idea that is problematic for the human brain: the probability of a hypothesis ("single-event probability". I describe several cognitive concepts closely related to single-event probabilities, and discuss how their interchangeability in the human mind results in "cognitive illusions," apparent deficits in reasoning about uncertainty. Each cognitive illusion implies specific possible pitfalls for the use of single-event probabilities in ecology and resource management. I then discuss recent research in cognitive psychology showing that simple tactics of communication, suggested by an evolutionary perspective on human cognition, help people to process uncertain information more effectively as they read and talk about probabilities. In addition, I suggest that carefully considered standards for methodology and conventions for presentation may also make Bayesian analyses easier to understand.
Network structure exploration via Bayesian nonparametric models
International Nuclear Information System (INIS)
Chen, Y; Wang, X L; Xiang, X; Tang, B Z; Bu, J Z
2015-01-01
Complex networks provide a powerful mathematical representation of complex systems in nature and society. To understand complex networks, it is crucial to explore their internal structures, also called structural regularities. The task of network structure exploration is to determine how many groups there are in a complex network and how to group the nodes of the network. Most existing structure exploration methods need to specify either a group number or a certain type of structure when they are applied to a network. In the real world, however, the group number and also the certain type of structure that a network has are usually unknown in advance. To explore structural regularities in complex networks automatically, without any prior knowledge of the group number or the certain type of structure, we extend a probabilistic mixture model that can handle networks with any type of structure but needs to specify a group number using Bayesian nonparametric theory. We also propose a novel Bayesian nonparametric model, called the Bayesian nonparametric mixture (BNPM) model. Experiments conducted on a large number of networks with different structures show that the BNPM model is able to explore structural regularities in networks automatically with a stable, state-of-the-art performance. (paper)
Bayesian Recurrent Neural Network for Language Modeling.
Chien, Jen-Tzung; Ku, Yuan-Chu
2016-02-01
A language model (LM) is calculated as the probability of a word sequence that provides the solution to word prediction for a variety of information systems. A recurrent neural network (RNN) is powerful to learn the large-span dynamics of a word sequence in the continuous space. However, the training of the RNN-LM is an ill-posed problem because of too many parameters from a large dictionary size and a high-dimensional hidden layer. This paper presents a Bayesian approach to regularize the RNN-LM and apply it for continuous speech recognition. We aim to penalize the too complicated RNN-LM by compensating for the uncertainty of the estimated model parameters, which is represented by a Gaussian prior. The objective function in a Bayesian classification network is formed as the regularized cross-entropy error function. The regularized model is constructed not only by calculating the regularized parameters according to the maximum a posteriori criterion but also by estimating the Gaussian hyperparameter by maximizing the marginal likelihood. A rapid approximation to a Hessian matrix is developed to implement the Bayesian RNN-LM (BRNN-LM) by selecting a small set of salient outer-products. The proposed BRNN-LM achieves a sparser model than the RNN-LM. Experiments on different corpora show the robustness of system performance by applying the rapid BRNN-LM under different conditions.
Centralized Bayesian reliability modelling with sensor networks
Czech Academy of Sciences Publication Activity Database
Dedecius, Kamil; Sečkárová, Vladimíra
2013-01-01
Roč. 19, č. 5 (2013), s. 471-482 ISSN 1387-3954 R&D Projects: GA MŠk 7D12004 Grant - others:GA MŠk(CZ) SVV-265315 Keywords : Bayesian modelling * Sensor network * Reliability Subject RIV: BD - Theory of Information Impact factor: 0.984, year: 2013 http://library.utia.cas.cz/separaty/2013/AS/dedecius-0392551.pdf
Operational modal analysis modeling, Bayesian inference, uncertainty laws
Au, Siu-Kui
2017-01-01
This book presents operational modal analysis (OMA), employing a coherent and comprehensive Bayesian framework for modal identification and covering stochastic modeling, theoretical formulations, computational algorithms, and practical applications. Mathematical similarities and philosophical differences between Bayesian and classical statistical approaches to system identification are discussed, allowing their mathematical tools to be shared and their results correctly interpreted. Many chapters can be used as lecture notes for the general topic they cover beyond the OMA context. After an introductory chapter (1), Chapters 2–7 present the general theory of stochastic modeling and analysis of ambient vibrations. Readers are first introduced to the spectral analysis of deterministic time series (2) and structural dynamics (3), which do not require the use of probability concepts. The concepts and techniques in these chapters are subsequently extended to a probabilistic context in Chapter 4 (on stochastic pro...
Goldstein, Harvey
2011-01-01
This book provides a clear introduction to this important area of statistics. The author provides a wide of coverage of different kinds of multilevel models, and how to interpret different statistical methodologies and algorithms applied to such models. This 4th edition reflects the growth and interest in this area and is updated to include new chapters on multilevel models with mixed response types, smoothing and multilevel data, models with correlated random effects and modeling with variance.
Optimization of inspection and replacement period by using Bayesian statistics
International Nuclear Information System (INIS)
Kasai, Masao; Watanabe, Yasushi; Kusakari, Yoshiyuki; Notoya, Junichi
2006-01-01
This study describes the formulations to optimize the time interval of inspections and/or replacements of equipment/parts taking into account the probability density functions (PDF) for failure rates and parameters of failure distribution functions (FDF) and evaluates the optimized results of these time intervals using our formulations by comparing with those using only representative values of failure rates and the parameters of FDF instead of using these PDFs. The PDFs are obtained with Bayesian method and the representative values are obtained with likelihood estimation method. However, any significant difference is not observed between both optimized results within our preliminary calculations. (author)
Bayesian Inference of a Multivariate Regression Model
Directory of Open Access Journals (Sweden)
Marick S. Sinay
2014-01-01
Full Text Available We explore Bayesian inference of a multivariate linear regression model with use of a flexible prior for the covariance structure. The commonly adopted Bayesian setup involves the conjugate prior, multivariate normal distribution for the regression coefficients and inverse Wishart specification for the covariance matrix. Here we depart from this approach and propose a novel Bayesian estimator for the covariance. A multivariate normal prior for the unique elements of the matrix logarithm of the covariance matrix is considered. Such structure allows for a richer class of prior distributions for the covariance, with respect to strength of beliefs in prior location hyperparameters, as well as the added ability, to model potential correlation amongst the covariance structure. The posterior moments of all relevant parameters of interest are calculated based upon numerical results via a Markov chain Monte Carlo procedure. The Metropolis-Hastings-within-Gibbs algorithm is invoked to account for the construction of a proposal density that closely matches the shape of the target posterior distribution. As an application of the proposed technique, we investigate a multiple regression based upon the 1980 High School and Beyond Survey.
Exploring the Connection Between Sampling Problems in Bayesian Inference and Statistical Mechanics
Pohorille, Andrew
2006-01-01
The Bayesian and statistical mechanical communities often share the same objective in their work - estimating and integrating probability distribution functions (pdfs) describing stochastic systems, models or processes. Frequently, these pdfs are complex functions of random variables exhibiting multiple, well separated local minima. Conventional strategies for sampling such pdfs are inefficient, sometimes leading to an apparent non-ergodic behavior. Several recently developed techniques for handling this problem have been successfully applied in statistical mechanics. In the multicanonical and Wang-Landau Monte Carlo (MC) methods, the correct pdfs are recovered from uniform sampling of the parameter space by iteratively establishing proper weighting factors connecting these distributions. Trivial generalizations allow for sampling from any chosen pdf. The closely related transition matrix method relies on estimating transition probabilities between different states. All these methods proved to generate estimates of pdfs with high statistical accuracy. In another MC technique, parallel tempering, several random walks, each corresponding to a different value of a parameter (e.g. "temperature"), are generated and occasionally exchanged using the Metropolis criterion. This method can be considered as a statistically correct version of simulated annealing. An alternative approach is to represent the set of independent variables as a Hamiltonian system. Considerab!e progress has been made in understanding how to ensure that the system obeys the equipartition theorem or, equivalently, that coupling between the variables is correctly described. Then a host of techniques developed for dynamical systems can be used. Among them, probably the most powerful is the Adaptive Biasing Force method, in which thermodynamic integration and biased sampling are combined to yield very efficient estimates of pdfs. The third class of methods deals with transitions between states described
Sampling, Probability Models and Statistical Reasoning Statistical ...
Indian Academy of Sciences (India)
Home; Journals; Resonance – Journal of Science Education; Volume 1; Issue 5. Sampling, Probability Models and Statistical Reasoning Statistical Inference. Mohan Delampady V R Padmawar. General Article Volume 1 Issue 5 May 1996 pp 49-58 ...
Sampling, Probability Models and Statistical Reasoning Statistical
Indian Academy of Sciences (India)
Home; Journals; Resonance – Journal of Science Education; Volume 1; Issue 5. Sampling, Probability Models and Statistical Reasoning Statistical Inference. Mohan Delampady V R Padmawar. General Article Volume 1 Issue 5 May 1996 pp 49-58 ...
International Nuclear Information System (INIS)
Procaccia, H.; Cordier, R.; Muller, S.
1994-07-01
Statistical decision theory could be a alternative for the optimization of preventive maintenance periodicity. In effect, this theory concerns the situation in which a decision maker has to make a choice between a set of reasonable decisions, and where the loss associated to a given decision depends on a probabilistic risk, called state of nature. In the case of maintenance optimization, the decisions to be analyzed are different periodicities proposed by the experts, given the observed feedback experience, the states of nature are the associated failure probabilities, and the losses are the expectations of the induced cost of maintenance and of consequences of the failures. As failure probabilities concern rare events, at the ultimate state of RCM analysis (failure of sub-component), and as expected foreseeable behaviour of equipment has to be evaluated by experts, Bayesian approach is successfully used to compute states of nature. In Bayesian decision theory, a prior distribution for failure probabilities is modeled from expert knowledge, and is combined with few stochastic information provided by feedback experience, giving a posterior distribution of failure probabilities. The optimized decision is the decision that minimizes the expected loss over the posterior distribution. This methodology has been applied to inspection and maintenance optimization of cylinders of diesel generator engines of 900 MW nuclear plants. In these plants, auxiliary electric power is supplied by 2 redundant diesel generators which are tested every 2 weeks during about 1 hour. Until now, during yearly refueling of each plant, one endoscopic inspection of diesel cylinders is performed, and every 5 operating years, all cylinders are replaced. RCM has shown that cylinder failures could be critical. So Bayesian decision theory has been applied, taking into account expert opinions, and possibility of aging when maintenance periodicity is extended. (authors). 8 refs., 5 figs., 1 tab
Advances in Applications of Hierarchical Bayesian Methods with Hydrological Models
Alexander, R. B.; Schwarz, G. E.; Boyer, E. W.
2017-12-01
Mechanistic and empirical watershed models are increasingly used to inform water resource decisions. Growing access to historical stream measurements and data from in-situ sensor technologies has increased the need for improved techniques for coupling models with hydrological measurements. Techniques that account for the intrinsic uncertainties of both models and measurements are especially needed. Hierarchical Bayesian methods provide an efficient modeling tool for quantifying model and prediction uncertainties, including those associated with measurements. Hierarchical methods can also be used to explore spatial and temporal variations in model parameters and uncertainties that are informed by hydrological measurements. We used hierarchical Bayesian methods to develop a hybrid (statistical-mechanistic) SPARROW (SPAtially Referenced Regression On Watershed attributes) model of long-term mean annual streamflow across diverse environmental and climatic drainages in 18 U.S. hydrological regions. Our application illustrates the use of a new generation of Bayesian methods that offer more advanced computational efficiencies than the prior generation. Evaluations of the effects of hierarchical (regional) variations in model coefficients and uncertainties on model accuracy indicates improved prediction accuracies (median of 10-50%) but primarily in humid eastern regions, where model uncertainties are one-third of those in arid western regions. Generally moderate regional variability is observed for most hierarchical coefficients. Accounting for measurement and structural uncertainties, using hierarchical state-space techniques, revealed the effects of spatially-heterogeneous, latent hydrological processes in the "localized" drainages between calibration sites; this improved model precision, with only minor changes in regional coefficients. Our study can inform advances in the use of hierarchical methods with hydrological models to improve their integration with stream
Targeted search for continuous gravitational waves: Bayesian versus maximum-likelihood statistics
International Nuclear Information System (INIS)
Prix, Reinhard; Krishnan, Badri
2009-01-01
We investigate the Bayesian framework for detection of continuous gravitational waves (GWs) in the context of targeted searches, where the phase evolution of the GW signal is assumed to be known, while the four amplitude parameters are unknown. We show that the orthodox maximum-likelihood statistic (known as F-statistic) can be rediscovered as a Bayes factor with an unphysical prior in amplitude parameter space. We introduce an alternative detection statistic ('B-statistic') using the Bayes factor with a more natural amplitude prior, namely an isotropic probability distribution for the orientation of GW sources. Monte Carlo simulations of targeted searches show that the resulting Bayesian B-statistic is more powerful in the Neyman-Pearson sense (i.e., has a higher expected detection probability at equal false-alarm probability) than the frequentist F-statistic.
Statistical Modeling for Radiation Hardness Assurance
Ladbury, Raymond L.
2014-01-01
We cover the models and statistics associated with single event effects (and total ionizing dose), why we need them, and how to use them: What models are used, what errors exist in real test data, and what the model allows us to say about the DUT will be discussed. In addition, how to use other sources of data such as historical, heritage, and similar part and how to apply experience, physics, and expert opinion to the analysis will be covered. Also included will be concepts of Bayesian statistics, data fitting, and bounding rates.
Bayesian Test of Significance for Conditional Independence: The Multinomial Model
Directory of Open Access Journals (Sweden)
Pablo de Morais Andrade
2014-03-01
Full Text Available Conditional independence tests have received special attention lately in machine learning and computational intelligence related literature as an important indicator of the relationship among the variables used by their models. In the field of probabilistic graphical models, which includes Bayesian network models, conditional independence tests are especially important for the task of learning the probabilistic graphical model structure from data. In this paper, we propose the full Bayesian significance test for tests of conditional independence for discrete datasets. The full Bayesian significance test is a powerful Bayesian test for precise hypothesis, as an alternative to the frequentist’s significance tests (characterized by the calculation of the p-value.
Diffeomorphic Statistical Deformation Models
DEFF Research Database (Denmark)
Hansen, Michael Sass; Hansen, Mads/Fogtman; Larsen, Rasmus
2007-01-01
In this paper we present a new method for constructing diffeomorphic statistical deformation models in arbitrary dimensional images with a nonlinear generative model and a linear parameter space. Our deformation model is a modified version of the diffeomorphic model introduced by Cootes et al....... The modifications ensure that no boundary restriction has to be enforced on the parameter space to prevent folds or tears in the deformation field. For straightforward statistical analysis, principal component analysis and sparse methods, we assume that the parameters for a class of deformations lie on a linear...
Accelerating Bayesian inference for evolutionary biology models.
Meyer, Xavier; Chopard, Bastien; Salamin, Nicolas
2017-03-01
Bayesian inference is widely used nowadays and relies largely on Markov chain Monte Carlo (MCMC) methods. Evolutionary biology has greatly benefited from the developments of MCMC methods, but the design of more complex and realistic models and the ever growing availability of novel data is pushing the limits of the current use of these methods. We present a parallel Metropolis-Hastings (M-H) framework built with a novel combination of enhancements aimed towards parameter-rich and complex models. We show on a parameter-rich macroevolutionary model increases of the sampling speed up to 35 times with 32 processors when compared to a sequential M-H process. More importantly, our framework achieves up to a twentyfold faster convergence to estimate the posterior probability of phylogenetic trees using 32 processors when compared to the well-known software MrBayes for Bayesian inference of phylogenetic trees. https://bitbucket.org/XavMeyer/hogan. nicolas.salamin@unil.ch. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
Quantum-Like Bayesian Networks for Modeling Decision Making
Directory of Open Access Journals (Sweden)
Catarina eMoreira
2016-01-01
Full Text Available In this work, we explore an alternative quantum structure to perform quantum probabilistic inferences to accommodate the paradoxical findings of the Sure Thing Principle. We propose a Quantum-Like Bayesian Network, which consists in replacing classical probabilities by quantum probability amplitudes. However, since this approach suffers from the problem of exponential growth of quantum parameters, we also propose a similarity heuristic that automatically fits quantum parameters through vector similarities. This makes the proposed model general and predictive in contrast to the current state of the art models, which cannot be generalized for more complex decision scenarios and that only provide an explanatory nature for the observed paradoxes. In the end, the model that we propose consists in a nonparametric method for estimating inference effects from a statistical point of view. It is a statistical model that is simpler than the previous quantum dynamic and quantum-like models proposed in the literature. We tested the proposed network with several empirical data from the literature, mainly from the Prisoner's Dilemma game and the Two Stage Gambling game. The results obtained show that the proposed quantum Bayesian Network is a general method that can accommodate violations of the laws of classical probability theory and make accurate predictions regarding human decision-making in these scenarios.
Bayesian Predictive Models for Rayleigh Wind Speed
DEFF Research Database (Denmark)
Shahirinia, Amir; Hajizadeh, Amin; Yu, David C
2017-01-01
predictive model of the wind speed aggregates the non-homogeneous distributions into a single continuous distribution. Therefore, the result is able to capture the variation among the probability distributions of the wind speeds at the turbines’ locations in a wind farm. More specifically, instead of using...... a wind speed distribution whose parameters are known or estimated, the parameters are considered as random whose variations are according to probability distributions. The Bayesian predictive model for a Rayleigh which only has a single model scale parameter has been proposed. Also closed-form posterior......One of the major challenges with the increase in wind power generation is the uncertain nature of wind speed. So far the uncertainty about wind speed has been presented through probability distributions. Also the existing models that consider the uncertainty of the wind speed primarily view...
Bayesian inference and model comparison for metallic fatigue data
Babuška, Ivo
2016-02-23
In this work, we present a statistical treatment of stress-life (S-N) data drawn from a collection of records of fatigue experiments that were performed on 75S-T6 aluminum alloys. Our main objective is to predict the fatigue life of materials by providing a systematic approach to model calibration, model selection and model ranking with reference to S-N data. To this purpose, we consider fatigue-limit models and random fatigue-limit models that are specially designed to allow the treatment of the run-outs (right-censored data). We first fit the models to the data by maximum likelihood methods and estimate the quantiles of the life distribution of the alloy specimen. To assess the robustness of the estimation of the quantile functions, we obtain bootstrap confidence bands by stratified resampling with respect to the cycle ratio. We then compare and rank the models by classical measures of fit based on information criteria. We also consider a Bayesian approach that provides, under the prior distribution of the model parameters selected by the user, their simulation-based posterior distributions. We implement and apply Bayesian model comparison methods, such as Bayes factor ranking and predictive information criteria based on cross-validation techniques under various a priori scenarios.
Bayesian inference with information content model check for Langevin equations
DEFF Research Database (Denmark)
Krog, Jens F. C.; Lomholt, Michael Andersen
2017-01-01
The Bayesian data analysis framework has been proven to be a systematic and effective method of parameter inference and model selection for stochastic processes. In this work we introduce an information content model check which may serve as a goodness-of-fit, like the chi-square procedure......, to complement conventional Bayesian analysis. We demonstrate this extended Bayesian framework on a system of Langevin equations, where coordinate dependent mobilities and measurement noise hinder the normal mean squared displacement approach....
Jones, Matt; Love, Bradley C
2011-08-01
The prominence of Bayesian modeling of cognition has increased recently largely because of mathematical advances in specifying and deriving predictions from complex probabilistic models. Much of this research aims to demonstrate that cognitive behavior can be explained from rational principles alone, without recourse to psychological or neurological processes and representations. We note commonalities between this rational approach and other movements in psychology - namely, Behaviorism and evolutionary psychology - that set aside mechanistic explanations or make use of optimality assumptions. Through these comparisons, we identify a number of challenges that limit the rational program's potential contribution to psychological theory. Specifically, rational Bayesian models are significantly unconstrained, both because they are uninformed by a wide range of process-level data and because their assumptions about the environment are generally not grounded in empirical measurement. The psychological implications of most Bayesian models are also unclear. Bayesian inference itself is conceptually trivial, but strong assumptions are often embedded in the hypothesis sets and the approximation algorithms used to derive model predictions, without a clear delineation between psychological commitments and implementational details. Comparing multiple Bayesian models of the same task is rare, as is the realization that many Bayesian models recapitulate existing (mechanistic level) theories. Despite the expressive power of current Bayesian models, we argue they must be developed in conjunction with mechanistic considerations to offer substantive explanations of cognition. We lay out several means for such an integration, which take into account the representations on which Bayesian inference operates, as well as the algorithms and heuristics that carry it out. We argue this unification will better facilitate lasting contributions to psychological theory, avoiding the pitfalls
Energy Technology Data Exchange (ETDEWEB)
Kwag, Shinyoung [North Carolina State University, Raleigh, NC 27695 (United States); Korea Atomic Energy Research Institute, Daejeon 305-353 (Korea, Republic of); Gupta, Abhinav, E-mail: agupta1@ncsu.edu [North Carolina State University, Raleigh, NC 27695 (United States)
2017-04-15
Highlights: • This study presents the development of Bayesian framework for probabilistic risk assessment (PRA) of structural systems under multiple hazards. • The concepts of Bayesian network and Bayesian inference are combined by mapping the traditionally used fault trees into a Bayesian network. • The proposed mapping allows for consideration of dependencies as well as correlations between events. • Incorporation of Bayesian inference permits a novel way for exploration of a scenario that is likely to result in a system level “vulnerability.” - Abstract: Conventional probabilistic risk assessment (PRA) methodologies (USNRC, 1983; IAEA, 1992; EPRI, 1994; Ellingwood, 2001) conduct risk assessment for different external hazards by considering each hazard separately and independent of each other. The risk metric for a specific hazard is evaluated by a convolution of the fragility and the hazard curves. The fragility curve for basic event is obtained by using empirical, experimental, and/or numerical simulation data for a particular hazard. Treating each hazard as an independently can be inappropriate in some cases as certain hazards are statistically correlated or dependent. Examples of such correlated events include but are not limited to flooding induced fire, seismically induced internal or external flooding, or even seismically induced fire. In the current practice, system level risk and consequence sequences are typically calculated using logic trees to express the causative relationship between events. In this paper, we present the results from a study on multi-hazard risk assessment that is conducted using a Bayesian network (BN) with Bayesian inference. The framework can consider statistical dependencies among risks from multiple hazards, allows updating by considering the newly available data/information at any level, and provide a novel way to explore alternative failure scenarios that may exist due to vulnerabilities.
International Nuclear Information System (INIS)
Kwag, Shinyoung; Gupta, Abhinav
2017-01-01
Highlights: • This study presents the development of Bayesian framework for probabilistic risk assessment (PRA) of structural systems under multiple hazards. • The concepts of Bayesian network and Bayesian inference are combined by mapping the traditionally used fault trees into a Bayesian network. • The proposed mapping allows for consideration of dependencies as well as correlations between events. • Incorporation of Bayesian inference permits a novel way for exploration of a scenario that is likely to result in a system level “vulnerability.” - Abstract: Conventional probabilistic risk assessment (PRA) methodologies (USNRC, 1983; IAEA, 1992; EPRI, 1994; Ellingwood, 2001) conduct risk assessment for different external hazards by considering each hazard separately and independent of each other. The risk metric for a specific hazard is evaluated by a convolution of the fragility and the hazard curves. The fragility curve for basic event is obtained by using empirical, experimental, and/or numerical simulation data for a particular hazard. Treating each hazard as an independently can be inappropriate in some cases as certain hazards are statistically correlated or dependent. Examples of such correlated events include but are not limited to flooding induced fire, seismically induced internal or external flooding, or even seismically induced fire. In the current practice, system level risk and consequence sequences are typically calculated using logic trees to express the causative relationship between events. In this paper, we present the results from a study on multi-hazard risk assessment that is conducted using a Bayesian network (BN) with Bayesian inference. The framework can consider statistical dependencies among risks from multiple hazards, allows updating by considering the newly available data/information at any level, and provide a novel way to explore alternative failure scenarios that may exist due to vulnerabilities.
A Bayesian Statistical Analysis of the Enhanced Greenhouse Effect
de Vos, A.F.; Tol, R.S.J.
1998-01-01
This paper demonstrates that there is a robust statistical relationship between the records of the global mean surface air temperature and the atmospheric concentration of carbon dioxide over the period 1870-1991. As such, the enhanced greenhouse effect is a plausible explanation for the observed
Effect on Prediction when Modeling Covariates in Bayesian Nonparametric Models.
Cruz-Marcelo, Alejandro; Rosner, Gary L; Müller, Peter; Stewart, Clinton F
2013-04-01
In biomedical research, it is often of interest to characterize biologic processes giving rise to observations and to make predictions of future observations. Bayesian nonparametric methods provide a means for carrying out Bayesian inference making as few assumptions about restrictive parametric models as possible. There are several proposals in the literature for extending Bayesian nonparametric models to include dependence on covariates. Limited attention, however, has been directed to the following two aspects. In this article, we examine the effect on fitting and predictive performance of incorporating covariates in a class of Bayesian nonparametric models by one of two primary ways: either in the weights or in the locations of a discrete random probability measure. We show that different strategies for incorporating continuous covariates in Bayesian nonparametric models can result in big differences when used for prediction, even though they lead to otherwise similar posterior inferences. When one needs the predictive density, as in optimal design, and this density is a mixture, it is better to make the weights depend on the covariates. We demonstrate these points via a simulated data example and in an application in which one wants to determine the optimal dose of an anticancer drug used in pediatric oncology.
Bayesian statistics in educational research: a look at the current state of affairs
König, Christoph; van de Schoot, Rens
2017-01-01
The ability of a scientific discipline to build cumulative knowledge depends on its predominant method of data analysis. A steady accumulation of knowledge requires approaches which allow researchers to consider results from comparable prior research. Bayesian statistics is especially relevant for
Predicting coastal cliff erosion using a Bayesian probabilistic model
Hapke, Cheryl J.; Plant, Nathaniel G.
2010-01-01
Regional coastal cliff retreat is difficult to model due to the episodic nature of failures and the along-shore variability of retreat events. There is a growing demand, however, for predictive models that can be used to forecast areas vulnerable to coastal erosion hazards. Increasingly, probabilistic models are being employed that require data sets of high temporal density to define the joint probability density function that relates forcing variables (e.g. wave conditions) and initial conditions (e.g. cliff geometry) to erosion events. In this study we use a multi-parameter Bayesian network to investigate correlations between key variables that control and influence variations in cliff retreat processes. The network uses Bayesian statistical methods to estimate event probabilities using existing observations. Within this framework, we forecast the spatial distribution of cliff retreat along two stretches of cliffed coast in Southern California. The input parameters are the height and slope of the cliff, a descriptor of material strength based on the dominant cliff-forming lithology, and the long-term cliff erosion rate that represents prior behavior. The model is forced using predicted wave impact hours. Results demonstrate that the Bayesian approach is well-suited to the forward modeling of coastal cliff retreat, with the correct outcomes forecast in 70–90% of the modeled transects. The model also performs well in identifying specific locations of high cliff erosion, thus providing a foundation for hazard mapping. This approach can be employed to predict cliff erosion at time-scales ranging from storm events to the impacts of sea-level rise at the century-scale.
BiomeNet: a Bayesian model for inference of metabolic divergence among microbial communities.
Mahdi Shafiei; Katherine A Dunn; Hugh Chipman; Hong Gu; Joseph P Bielawski
2014-01-01
Metagenomics yields enormous numbers of microbial sequences that can be assigned a metabolic function. Using such data to infer community-level metabolic divergence is hindered by the lack of a suitable statistical framework. Here, we describe a novel hierarchical Bayesian model, called BiomeNet (Bayesian inference of metabolic networks), for inferring differential prevalence of metabolic subnetworks among microbial communities. To infer the structure of community-level metabolic interactions...
Applied Bayesian statistical studies in biology and medicine
D’Amore, G; Scalfari, F
2004-01-01
It was written on another occasion· that "It is apparent that the scientific culture, if one means production of scientific papers, is growing exponentially, and chaotically, in almost every field of investigation". The biomedical sciences sensu lato and mathematical statistics are no exceptions. One might say then, and with good reason, that another collection of bio statistical papers would only add to the overflow and cause even more confusion. Nevertheless, this book may be greeted with some interest if we state that most of the papers in it are the result of a collaboration between biologists and statisticians, and partly the product of the Summer School th "Statistical Inference in Human Biology" which reaches its 10 edition in 2003 (information about the School can be obtained at the Web site http://www2. stat. unibo. itleventilSito%20scuolalindex. htm). is common experience - and not only This is rather important. Indeed, it in Italy - that encounters between statisticians and researchers are spora...
THOMAS: Building Bayesian Statistical Expert Systems to Aid in Clinical Decision Making
Lehmann, Harold P.; Shortliffe, Edward H.
1990-01-01
Previous knowledge-based systems for statistical analysis separate the numeric knowledge needed for data analysis from the heuristic knowledge employed in using the results of the analysis. In contrast, a Bayesian framework for building biostatistical expert systems allows for the integration of the data-analytic and decision-making tasks. The architecture of such a framework entails enabling the system (1) to make its recommendations on decision-analytic grounds; (2) to update a statistical ...
Bayesian networks and statistical analysis application to analyze the diagnostic test accuracy
Orzechowski, P.; Makal, Jaroslaw; Onisko, A.
2005-02-01
The computer aided BPH diagnosis system based on Bayesian network is described in the paper. First result are compared to a given statistical method. Different statistical methods are used successfully in medicine for years. However, the undoubted advantages of probabilistic methods make them useful in application in newly created systems which are frequent in medicine, but do not have full and competent knowledge. The article presents advantages of the computer aided BPH diagnosis system in clinical practice for urologists.
Bayesian statistical analysis of censored data in geotechnical engineering
DEFF Research Database (Denmark)
Ditlevsen, Ove Dalager; Tarp-Johansen, Niels Jacob; Denver, Hans
2000-01-01
The geotechnical engineer is often faced with the problem ofhow to assess the statistical properties of a soil parameter on the basis ofa sample measured in-situ or in the laboratory with the defect that somevalues have been replaced by interval bounds because the corresponding soilparameter values...... is available about the soil parameter distribution.The present paper shows how a characteristic value by computer calcula-tions can be assessed systematically from the actual sample of censored datasupplemented with prior information from a soil parameter data base....
Bayesian approach to errors-in-variables in regression models
Rozliman, Nur Aainaa; Ibrahim, Adriana Irawati Nur; Yunus, Rossita Mohammad
2017-05-01
In many applications and experiments, data sets are often contaminated with error or mismeasured covariates. When at least one of the covariates in a model is measured with error, Errors-in-Variables (EIV) model can be used. Measurement error, when not corrected, would cause misleading statistical inferences and analysis. Therefore, our goal is to examine the relationship of the outcome variable and the unobserved exposure variable given the observed mismeasured surrogate by applying the Bayesian formulation to the EIV model. We shall extend the flexible parametric method proposed by Hossain and Gustafson (2009) to another nonlinear regression model which is the Poisson regression model. We shall then illustrate the application of this approach via a simulation study using Markov chain Monte Carlo sampling methods.
To be certain about the uncertainty: Bayesian statistics for 13 C metabolic flux analysis.
Theorell, Axel; Leweke, Samuel; Wiechert, Wolfgang; Nöh, Katharina
2017-11-01
13 C Metabolic Fluxes Analysis ( 13 C MFA) remains to be the most powerful approach to determine intracellular metabolic reaction rates. Decisions on strain engineering and experimentation heavily rely upon the certainty with which these fluxes are estimated. For uncertainty quantification, the vast majority of 13 C MFA studies relies on confidence intervals from the paradigm of Frequentist statistics. However, it is well known that the confidence intervals for a given experimental outcome are not uniquely defined. As a result, confidence intervals produced by different methods can be different, but nevertheless equally valid. This is of high relevance to 13 C MFA, since practitioners regularly use three different approximate approaches for calculating confidence intervals. By means of a computational study with a realistic model of the central carbon metabolism of E. coli, we provide strong evidence that confidence intervals used in the field depend strongly on the technique with which they were calculated and, thus, their use leads to misinterpretation of the flux uncertainty. In order to provide a better alternative to confidence intervals in 13 C MFA, we demonstrate that credible intervals from the paradigm of Bayesian statistics give more reliable flux uncertainty quantifications which can be readily computed with high accuracy using Markov chain Monte Carlo. In addition, the widely applied chi-square test, as a means of testing whether the model reproduces the data, is examined closer. © 2017 Wiley Periodicals, Inc.
A novel approach for choosing summary statistics in approximate Bayesian computation.
Aeschbacher, Simon; Beaumont, Mark A; Futschik, Andreas
2012-11-01
The choice of summary statistics is a crucial step in approximate Bayesian computation (ABC). Since statistics are often not sufficient, this choice involves a trade-off between loss of information and reduction of dimensionality. The latter may increase the efficiency of ABC. Here, we propose an approach for choosing summary statistics based on boosting, a technique from the machine-learning literature. We consider different types of boosting and compare them to partial least-squares regression as an alternative. To mitigate the lack of sufficiency, we also propose an approach for choosing summary statistics locally, in the putative neighborhood of the true parameter value. We study a demographic model motivated by the reintroduction of Alpine ibex (Capra ibex) into the Swiss Alps. The parameters of interest are the mean and standard deviation across microsatellites of the scaled ancestral mutation rate (θ(anc) = 4N(e)u) and the proportion of males obtaining access to matings per breeding season (ω). By simulation, we assess the properties of the posterior distribution obtained with the various methods. According to our criteria, ABC with summary statistics chosen locally via boosting with the L(2)-loss performs best. Applying that method to the ibex data, we estimate θ(anc)≈ 1.288 and find that most of the variation across loci of the ancestral mutation rate u is between 7.7 × 10(-4) and 3.5 × 10(-3) per locus per generation. The proportion of males with access to matings is estimated as ω≈ 0.21, which is in good agreement with recent independent estimates.
Li, Zhijun; Feng, Maria Q.; Luo, Longxi; Feng, Dongming; Xu, Xiuli
2018-01-01
Uncertainty of modal parameters estimation appear in structural health monitoring (SHM) practice of civil engineering to quite some significant extent due to environmental influences and modeling errors. Reasonable methodologies are needed for processing the uncertainty. Bayesian inference can provide a promising and feasible identification solution for the purpose of SHM. However, there are relatively few researches on the application of Bayesian spectral method in the modal identification using SHM data sets. To extract modal parameters from large data sets collected by SHM system, the Bayesian spectral density algorithm was applied to address the uncertainty of mode extraction from output-only response of a long-span suspension bridge. The posterior most possible values of modal parameters and their uncertainties were estimated through Bayesian inference. A long-term variation and statistical analysis was performed using the sensor data sets collected from the SHM system of the suspension bridge over a one-year period. The t location-scale distribution was shown to be a better candidate function for frequencies of lower modes. On the other hand, the burr distribution provided the best fitting to the higher modes which are sensitive to the temperature. In addition, wind-induced variation of modal parameters was also investigated. It was observed that both the damping ratios and modal forces increased during the period of typhoon excitations. Meanwhile, the modal damping ratios exhibit significant correlation with the spectral intensities of the corresponding modal forces.
Modelling crime linkage with Bayesian networks.
de Zoete, Jacob; Sjerps, Marjan; Lagnado, David; Fenton, Norman
2015-05-01
When two or more crimes show specific similarities, such as a very distinct modus operandi, the probability that they were committed by the same offender becomes of interest. This probability depends on the degree of similarity and distinctiveness. We show how Bayesian networks can be used to model different evidential structures that can occur when linking crimes, and how they assist in understanding the complex underlying dependencies. That is, how evidence that is obtained in one case can be used in another and vice versa. The flip side of this is that the intuitive decision to "unlink" a case in which exculpatory evidence is obtained leads to serious overestimation of the strength of the remaining cases. Copyright © 2014 Forensic Science Society. Published by Elsevier Ireland Ltd. All rights reserved.
Distributions with given marginals and statistical modelling
Fortiana, Josep; Rodriguez-Lallena, José
2002-01-01
This book contains a selection of the papers presented at the meeting `Distributions with given marginals and statistical modelling', held in Barcelona (Spain), July 17-20, 2000. In 24 chapters, this book covers topics such as the theory of copulas and quasi-copulas, the theory and compatibility of distributions, models for survival distributions and other well-known distributions, time series, categorical models, definition and estimation of measures of dependence, monotonicity and stochastic ordering, shape and separability of distributions, hidden truncation models, diagonal families, orthogonal expansions, tests of independence, and goodness of fit assessment. These topics share the use and properties of distributions with given marginals, this being the fourth specialised text on this theme. The innovative aspect of the book is the inclusion of statistical aspects such as modelling, Bayesian statistics, estimation, and tests.
Directory of Open Access Journals (Sweden)
M.M. Mohie El-Din
2011-10-01
Full Text Available In this paper, two sample Bayesian prediction intervals for order statistics (OS are obtained. This prediction is based on a certain class of the inverse exponential-type distributions using a right censored sample. A general class of prior density functions is used and the predictive cumulative function is obtained in the two samples case. The class of the inverse exponential-type distributions includes several important distributions such the inverse Weibull distribution, the inverse Burr distribution, the loglogistic distribution, the inverse Pareto distribution and the inverse paralogistic distribution. Special cases of the inverse Weibull model such as the inverse exponential model and the inverse Rayleigh model are considered.
Bayesian Dose-Response Modeling in Sparse Data
Kim, Steven B.
This book discusses Bayesian dose-response modeling in small samples applied to two different settings. The first setting is early phase clinical trials, and the second setting is toxicology studies in cancer risk assessment. In early phase clinical trials, experimental units are humans who are actual patients. Prior to a clinical trial, opinions from multiple subject area experts are generally more informative than the opinion of a single expert, but we may face a dilemma when they have disagreeing prior opinions. In this regard, we consider compromising the disagreement and compare two different approaches for making a decision. In addition to combining multiple opinions, we also address balancing two levels of ethics in early phase clinical trials. The first level is individual-level ethics which reflects the perspective of trial participants. The second level is population-level ethics which reflects the perspective of future patients. We extensively compare two existing statistical methods which focus on each perspective and propose a new method which balances the two conflicting perspectives. In toxicology studies, experimental units are living animals. Here we focus on a potential non-monotonic dose-response relationship which is known as hormesis. Briefly, hormesis is a phenomenon which can be characterized by a beneficial effect at low doses and a harmful effect at high doses. In cancer risk assessments, the estimation of a parameter, which is known as a benchmark dose, can be highly sensitive to a class of assumptions, monotonicity or hormesis. In this regard, we propose a robust approach which considers both monotonicity and hormesis as a possibility. In addition, We discuss statistical hypothesis testing for hormesis and consider various experimental designs for detecting hormesis based on Bayesian decision theory. Past experiments have not been optimally designed for testing for hormesis, and some Bayesian optimal designs may not be optimal under a
Statistical detection of EEG synchrony using empirical bayesian inference.
Singh, Archana K; Asoh, Hideki; Takeda, Yuji; Phillips, Steven
2015-01-01
There is growing interest in understanding how the brain utilizes synchronized oscillatory activity to integrate information across functionally connected regions. Computing phase-locking values (PLV) between EEG signals is a popular method for quantifying such synchronizations and elucidating their role in cognitive tasks. However, high-dimensionality in PLV data incurs a serious multiple testing problem. Standard multiple testing methods in neuroimaging research (e.g., false discovery rate, FDR) suffer severe loss of power, because they fail to exploit complex dependence structure between hypotheses that vary in spectral, temporal and spatial dimension. Previously, we showed that a hierarchical FDR and optimal discovery procedures could be effectively applied for PLV analysis to provide better power than FDR. In this article, we revisit the multiple comparison problem from a new Empirical Bayes perspective and propose the application of the local FDR method (locFDR; Efron, 2001) for PLV synchrony analysis to compute FDR as a posterior probability that an observed statistic belongs to a null hypothesis. We demonstrate the application of Efron's Empirical Bayes approach for PLV synchrony analysis for the first time. We use simulations to validate the specificity and sensitivity of locFDR and a real EEG dataset from a visual search study for experimental validation. We also compare locFDR with hierarchical FDR and optimal discovery procedures in both simulation and experimental analyses. Our simulation results showed that the locFDR can effectively control false positives without compromising on the power of PLV synchrony inference. Our results from the application locFDR on experiment data detected more significant discoveries than our previously proposed methods whereas the standard FDR method failed to detect any significant discoveries.
Statistical detection of EEG synchrony using empirical bayesian inference.
Directory of Open Access Journals (Sweden)
Archana K Singh
Full Text Available There is growing interest in understanding how the brain utilizes synchronized oscillatory activity to integrate information across functionally connected regions. Computing phase-locking values (PLV between EEG signals is a popular method for quantifying such synchronizations and elucidating their role in cognitive tasks. However, high-dimensionality in PLV data incurs a serious multiple testing problem. Standard multiple testing methods in neuroimaging research (e.g., false discovery rate, FDR suffer severe loss of power, because they fail to exploit complex dependence structure between hypotheses that vary in spectral, temporal and spatial dimension. Previously, we showed that a hierarchical FDR and optimal discovery procedures could be effectively applied for PLV analysis to provide better power than FDR. In this article, we revisit the multiple comparison problem from a new Empirical Bayes perspective and propose the application of the local FDR method (locFDR; Efron, 2001 for PLV synchrony analysis to compute FDR as a posterior probability that an observed statistic belongs to a null hypothesis. We demonstrate the application of Efron's Empirical Bayes approach for PLV synchrony analysis for the first time. We use simulations to validate the specificity and sensitivity of locFDR and a real EEG dataset from a visual search study for experimental validation. We also compare locFDR with hierarchical FDR and optimal discovery procedures in both simulation and experimental analyses. Our simulation results showed that the locFDR can effectively control false positives without compromising on the power of PLV synchrony inference. Our results from the application locFDR on experiment data detected more significant discoveries than our previously proposed methods whereas the standard FDR method failed to detect any significant discoveries.
Hierarchical Bayesian models of subtask learning.
Anglim, Jeromy; Wynton, Sarah K A
2015-07-01
The current study used Bayesian hierarchical methods to challenge and extend previous work on subtask learning consistency. A general model of individual-level subtask learning was proposed focusing on power and exponential functions with constraints to test for inconsistency. To study subtask learning, we developed a novel computer-based booking task, which logged participant actions, enabling measurement of strategy use and subtask performance. Model comparison was performed using deviance information criterion (DIC), posterior predictive checks, plots of model fits, and model recovery simulations. Results showed that although learning tended to be monotonically decreasing and decelerating, and approaching an asymptote for all subtasks, there was substantial inconsistency in learning curves both at the group- and individual-levels. This inconsistency was most apparent when constraining both the rate and the ratio of learning to asymptote to be equal across subtasks, thereby giving learning curves only 1 parameter for scaling. The inclusion of 6 strategy covariates provided improved prediction of subtask performance capturing different subtask learning processes and subtask trade-offs. In addition, strategy use partially explained the inconsistency in subtask learning. Overall, the model provided a more nuanced representation of how complex tasks can be decomposed in terms of simpler learning mechanisms. (c) 2015 APA, all rights reserved.
Applications of Bayesian Statistics to Problems in Gamma-Ray Bursts
Meegan, Charles A.
1997-01-01
This presentation will describe two applications of Bayesian statistics to Gamma Ray Bursts (GRBS). The first attempts to quantify the evidence for a cosmological versus galactic origin of GRBs using only the observations of the dipole and quadrupole moments of the angular distribution of bursts. The cosmological hypothesis predicts isotropy, while the galactic hypothesis is assumed to produce a uniform probability distribution over positive values for these moments. The observed isotropic distribution indicates that the Bayes factor for the cosmological hypothesis over the galactic hypothesis is about 300. Another application of Bayesian statistics is in the estimation of chance associations of optical counterparts with galaxies. The Bayesian approach is preferred to frequentist techniques here because the Bayesian approach easily accounts for galaxy mass distributions and because one can incorporate three disjoint hypotheses: (1) bursts come from galactic centers, (2) bursts come from galaxies in proportion to luminosity, and (3) bursts do not come from external galaxies. This technique was used in the analysis of the optical counterpart to GRB970228.
Advances in Bayesian Model Based Clustering Using Particle Learning
Energy Technology Data Exchange (ETDEWEB)
Merl, D M
2009-11-19
Recent work by Carvalho, Johannes, Lopes and Polson and Carvalho, Lopes, Polson and Taddy introduced a sequential Monte Carlo (SMC) alternative to traditional iterative Monte Carlo strategies (e.g. MCMC and EM) for Bayesian inference for a large class of dynamic models. The basis of SMC techniques involves representing the underlying inference problem as one of state space estimation, thus giving way to inference via particle filtering. The key insight of Carvalho et al was to construct the sequence of filtering distributions so as to make use of the posterior predictive distribution of the observable, a distribution usually only accessible in certain Bayesian settings. Access to this distribution allows a reversal of the usual propagate and resample steps characteristic of many SMC methods, thereby alleviating to a large extent many problems associated with particle degeneration. Furthermore, Carvalho et al point out that for many conjugate models the posterior distribution of the static variables can be parametrized in terms of [recursively defined] sufficient statistics of the previously observed data. For models where such sufficient statistics exist, particle learning as it is being called, is especially well suited for the analysis of streaming data do to the relative invariance of its algorithmic complexity with the number of data observations. Through a particle learning approach, a statistical model can be fit to data as the data is arriving, allowing at any instant during the observation process direct quantification of uncertainty surrounding underlying model parameters. Here we describe the use of a particle learning approach for fitting a standard Bayesian semiparametric mixture model as described in Carvalho, Lopes, Polson and Taddy. In Section 2 we briefly review the previously presented particle learning algorithm for the case of a Dirichlet process mixture of multivariate normals. In Section 3 we describe several novel extensions to the original
Development of a Bayesian Belief Network Runway Incursion Model
Green, Lawrence L.
2014-01-01
In a previous paper, a statistical analysis of runway incursion (RI) events was conducted to ascertain their relevance to the top ten Technical Challenges (TC) of the National Aeronautics and Space Administration (NASA) Aviation Safety Program (AvSP). The study revealed connections to perhaps several of the AvSP top ten TC. That data also identified several primary causes and contributing factors for RI events that served as the basis for developing a system-level Bayesian Belief Network (BBN) model for RI events. The system-level BBN model will allow NASA to generically model the causes of RI events and to assess the effectiveness of technology products being developed under NASA funding. These products are intended to reduce the frequency of RI events in particular, and to improve runway safety in general. The development, structure and assessment of that BBN for RI events by a Subject Matter Expert panel are documented in this paper.
Prior sensitivity analysis in default Bayesian structural equation modeling
van Erp, S.J.; Mulder, J.; Oberski, Daniel L.
2018-01-01
Bayesian structural equation modeling (BSEM) has recently gained popularity because it enables researchers to fit complex models while solving some of the issues often encountered in classical maximum likelihood (ML) estimation, such as nonconvergence and inadmissible solutions. An important
A guide to Bayesian model selection for ecologists
Hooten, Mevin B.; Hobbs, N.T.
2015-01-01
The steady upward trend in the use of model selection and Bayesian methods in ecological research has made it clear that both approaches to inference are important for modern analysis of models and data. However, in teaching Bayesian methods and in working with our research colleagues, we have noticed a general dissatisfaction with the available literature on Bayesian model selection and multimodel inference. Students and researchers new to Bayesian methods quickly find that the published advice on model selection is often preferential in its treatment of options for analysis, frequently advocating one particular method above others. The recent appearance of many articles and textbooks on Bayesian modeling has provided welcome background on relevant approaches to model selection in the Bayesian framework, but most of these are either very narrowly focused in scope or inaccessible to ecologists. Moreover, the methodological details of Bayesian model selection approaches are spread thinly throughout the literature, appearing in journals from many different fields. Our aim with this guide is to condense the large body of literature on Bayesian approaches to model selection and multimodel inference and present it specifically for quantitative ecologists as neutrally as possible. We also bring to light a few important and fundamental concepts relating directly to model selection that seem to have gone unnoticed in the ecological literature. Throughout, we provide only a minimal discussion of philosophy, preferring instead to examine the breadth of approaches as well as their practical advantages and disadvantages. This guide serves as a reference for ecologists using Bayesian methods, so that they can better understand their options and can make an informed choice that is best aligned with their goals for inference.
Multi-objective calibration of forecast ensembles using Bayesian model averaging
Vrugt, J.A.; Clark, M.P.; Diks, C.G.H.; Duan, Q.; Robinson, B.A.
2006-01-01
Bayesian Model Averaging (BMA) has recently been proposed as a method for statistical postprocessing of forecast ensembles from numerical weather prediction models. The BMA predictive probability density function (PDF) of any weather quantity of interest is a weighted average of PDFs centered on the
Osei, Frank B.; Osei, F.B.; Duker, Alfred A.; Stein, A.
2011-01-01
This study analyses the joint effects of the two transmission routes of cholera on the space-time diffusion dynamics. Statistical models are developed and presented to investigate the transmission network routes of cholera diffusion. A hierarchical Bayesian modelling approach is employed for a joint
Reasoning with data an introduction to traditional and Bayesian statistics using R
Stanton, Jeffrey M
2017-01-01
Engaging and accessible, this book teaches readers how to use inferential statistical thinking to check their assumptions, assess evidence about their beliefs, and avoid overinterpreting results that may look more promising than they really are. It provides step-by-step guidance for using both classical (frequentist) and Bayesian approaches to inference. Statistical techniques covered side by side from both frequentist and Bayesian approaches include hypothesis testing, replication, analysis of variance, calculation of effect sizes, regression, time series analysis, and more. Students also get a complete introduction to the open-source R programming language and its key packages. Throughout the text, simple commands in R demonstrate essential data analysis skills using real-data examples. The companion website provides annotated R code for the book's examples, in-class exercises, supplemental reading lists, and links to online videos, interactive materials, and other resources.
Hierarchical Bayesian Modeling of Fluid-Induced Seismicity
Broccardo, M.; Mignan, A.; Wiemer, S.; Stojadinovic, B.; Giardini, D.
2017-11-01
In this study, we present a Bayesian hierarchical framework to model fluid-induced seismicity. The framework is based on a nonhomogeneous Poisson process with a fluid-induced seismicity rate proportional to the rate of injected fluid. The fluid-induced seismicity rate model depends upon a set of physically meaningful parameters and has been validated for six fluid-induced case studies. In line with the vision of hierarchical Bayesian modeling, the rate parameters are considered as random variables. We develop both the Bayesian inference and updating rules, which are used to develop a probabilistic forecasting model. We tested the Basel 2006 fluid-induced seismic case study to prove that the hierarchical Bayesian model offers a suitable framework to coherently encode both epistemic uncertainty and aleatory variability. Moreover, it provides a robust and consistent short-term seismic forecasting model suitable for online risk quantification and mitigation.
Bayesian Model Selection in Geophysics: The evidence
Vrugt, J. A.
2016-12-01
Bayesian inference has found widespread application and use in science and engineering to reconcile Earth system models with data, including prediction in space (interpolation), prediction in time (forecasting), assimilation of observations and deterministic/stochastic model output, and inference of the model parameters. Per Bayes theorem, the posterior probability, , P(H|D), of a hypothesis, H, given the data D, is equivalent to the product of its prior probability, P(H), and likelihood, L(H|D), divided by a normalization constant, P(D). In geophysics, the hypothesis, H, often constitutes a description (parameterization) of the subsurface for some entity of interest (e.g. porosity, moisture content). The normalization constant, P(D), is not required for inference of the subsurface structure, yet of great value for model selection. Unfortunately, it is not particularly easy to estimate P(D) in practice. Here, I will introduce the various building blocks of a general purpose method which provides robust and unbiased estimates of the evidence, P(D). This method uses multi-dimensional numerical integration of the posterior (parameter) distribution. I will then illustrate this new estimator by application to three competing subsurface models (hypothesis) using GPR travel time data from the South Oyster Bacterial Transport Site, in Virginia, USA. The three subsurface models differ in their treatment of the porosity distribution and use (a) horizontal layering with fixed layer thicknesses, (b) vertical layering with fixed layer thicknesses and (c) a multi-Gaussian field. The results of the new estimator are compared against the brute force Monte Carlo method, and the Laplace-Metropolis method.
Ridge, Lasso and Bayesian additive-dominance genomic models.
Azevedo, Camila Ferreira; de Resende, Marcos Deon Vilela; E Silva, Fabyano Fonseca; Viana, José Marcelo Soriano; Valente, Magno Sávio Ferreira; Resende, Márcio Fernando Ribeiro; Muñoz, Patricio
2015-08-25
A complete approach for genome-wide selection (GWS) involves reliable statistical genetics models and methods. Reports on this topic are common for additive genetic models but not for additive-dominance models. The objective of this paper was (i) to compare the performance of 10 additive-dominance predictive models (including current models and proposed modifications), fitted using Bayesian, Lasso and Ridge regression approaches; and (ii) to decompose genomic heritability and accuracy in terms of three quantitative genetic information sources, namely, linkage disequilibrium (LD), co-segregation (CS) and pedigree relationships or family structure (PR). The simulation study considered two broad sense heritability levels (0.30 and 0.50, associated with narrow sense heritabilities of 0.20 and 0.35, respectively) and two genetic architectures for traits (the first consisting of small gene effects and the second consisting of a mixed inheritance model with five major genes). G-REML/G-BLUP and a modified Bayesian/Lasso (called BayesA*B* or t-BLASSO) method performed best in the prediction of genomic breeding as well as the total genotypic values of individuals in all four scenarios (two heritabilities x two genetic architectures). The BayesA*B*-type method showed a better ability to recover the dominance variance/additive variance ratio. Decomposition of genomic heritability and accuracy revealed the following descending importance order of information: LD, CS and PR not captured by markers, the last two being very close. Amongst the 10 models/methods evaluated, the G-BLUP, BAYESA*B* (-2,8) and BAYESA*B* (4,6) methods presented the best results and were found to be adequate for accurately predicting genomic breeding and total genotypic values as well as for estimating additive and dominance in additive-dominance genomic models.
DEFF Research Database (Denmark)
Jacobsen, Christian Robert Dahl; Møller, Jesper
2017-01-01
We introduce new estimation methods for a subclass of the Gaussian scale mixture models for wavelet trees by Wainwright, Simoncelli and Willsky that rely on modern results for composite likelihoods and approximate Bayesian inference. Our methodology is illustrated for denoising and edge detection...... problems in two-dimensional images....
Involving stakeholders in building integrated fisheries models using Bayesian methods
DEFF Research Database (Denmark)
Haapasaari, Päivi Elisabet; Mäntyniemi, Samu; Kuikka, Sakari
2013-01-01
A participatory Bayesian approach was used to investigate how the views of stakeholders could be utilized to develop models to help understand the Central Baltic herring ﬁshery. In task one, we applied the Bayesian belief network methodology to elicit the causal assumptions of six stakeholders...... on factors that inﬂuence natural mortality, growth, and egg survival of the herring stock in probabilistic terms. We also integrated the expressed views into a meta-model using the Bayesian model averaging (BMA) method. In task two, we used inﬂuence diagrams to study qualitatively how the stakeholders frame...... the potential of the study to contribute to the development of participatory modeling practices. It is concluded that the subjective perspective to knowledge, that is fundamental in Bayesian theory, suits participatory modeling better than a positivist paradigm that seeks the objective truth. The methodology...
Bayesian nonparametric meta-analysis using Polya tree mixture models.
Branscum, Adam J; Hanson, Timothy E
2008-09-01
Summary. A common goal in meta-analysis is estimation of a single effect measure using data from several studies that are each designed to address the same scientific inquiry. Because studies are typically conducted in geographically disperse locations, recent developments in the statistical analysis of meta-analytic data involve the use of random effects models that account for study-to-study variability attributable to differences in environments, demographics, genetics, and other sources that lead to heterogeneity in populations. Stemming from asymptotic theory, study-specific summary statistics are modeled according to normal distributions with means representing latent true effect measures. A parametric approach subsequently models these latent measures using a normal distribution, which is strictly a convenient modeling assumption absent of theoretical justification. To eliminate the influence of overly restrictive parametric models on inferences, we consider a broader class of random effects distributions. We develop a novel hierarchical Bayesian nonparametric Polya tree mixture (PTM) model. We present methodology for testing the PTM versus a normal random effects model. These methods provide researchers a straightforward approach for conducting a sensitivity analysis of the normality assumption for random effects. An application involving meta-analysis of epidemiologic studies designed to characterize the association between alcohol consumption and breast cancer is presented, which together with results from simulated data highlight the performance of PTMs in the presence of nonnormality of effect measures in the source population.
Integration of three strucutally different stock assessment models in a Bayesian framework
Kraak, S.B.M.; Bogaards, H.; Borges, L.; Machiels, M.A.M.; Keeken, van O.A.
2007-01-01
Bayesian statistics provide a method for expressing uncertainty of an unknown parameter value probabilistically (www.bayesian.org). Bayesian methods have been widely used in biological sciences, and recently in fisheries science applied to stock assessment. In our previous studies on Bayesian
Bayesian disease mapping: hierarchical modeling in spatial epidemiology
National Research Council Canada - National Science Library
Lawson, Andrew
2013-01-01
Since the publication of the first edition, many new Bayesian tools and methods have been developed for space-time data analysis, the predictive modeling of health outcomes, and other spatial biostatistical areas...
Bayesian network modeling of operator's state recognition process
International Nuclear Information System (INIS)
Hatakeyama, Naoki; Furuta, Kazuo
2000-01-01
Nowadays we are facing a difficult problem of establishing a good relation between humans and machines. To solve this problem, we suppose that machine system need to have a model of human behavior. In this study we model the state cognition process of a PWR plant operator as an example. We use a Bayesian network as an inference engine. We incorporate the knowledge hierarchy in the Bayesian network and confirm its validity using the example of PWR plant operator. (author)
Time independent seismic hazard analysis of Greece deduced from Bayesian statistics
Directory of Open Access Journals (Sweden)
T. M. Tsapanos
2003-01-01
Full Text Available A Bayesian statistics approach is applied in the seismogenic sources of Greece and the surrounding area in order to assess seismic hazard, assuming that the earthquake occurrence follows the Poisson process. The Bayesian approach applied supplies the probability that a certain cut-off magnitude of Ms = 6.0 will be exceeded in time intervals of 10, 20 and 75 years. We also produced graphs which present the different seismic hazard in the seismogenic sources examined in terms of varying probability which is useful for engineering and civil protection purposes, allowing the designation of priority sources for earthquake-resistant design. It is shown that within the above time intervals the seismogenic source (4 called Igoumenitsa (in NW Greece and west Albania has the highest probability to experience an earthquake with magnitude M > 6.0. High probabilities are found also for Ochrida (source 22, Samos (source 53 and Chios (source 56.
Czech Academy of Sciences Publication Activity Database
Fernandes, R.; Millard, A.R.; Brabec, Marek; Nadeau, M.J.; Grootes, P.
2014-01-01
Roč. 9, č. 2 (2014), Art . no. e87436 E-ISSN 1932-6203 Institutional support: RVO:67985807 Keywords : ancienit diet reconstruction * stable isotope measurements * mixture model * Bayesian estimation * Dirichlet prior Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 3.234, year: 2014
Bayesian graphical models for genomewide association studies.
Verzilli, Claudio J; Stallard, Nigel; Whittaker, John C
2006-07-01
As the extent of human genetic variation becomes more fully characterized, the research community is faced with the challenging task of using this information to dissect the heritable components of complex traits. Genomewide association studies offer great promise in this respect, but their analysis poses formidable difficulties. In this article, we describe a computationally efficient approach to mining genotype-phenotype associations that scales to the size of the data sets currently being collected in such studies. We use discrete graphical models as a data-mining tool, searching for single- or multilocus patterns of association around a causative site. The approach is fully Bayesian, allowing us to incorporate prior knowledge on the spatial dependencies around each marker due to linkage disequilibrium, which reduces considerably the number of possible graphical structures. A Markov chain-Monte Carlo scheme is developed that yields samples from the posterior distribution of graphs conditional on the data from which probabilistic statements about the strength of any genotype-phenotype association can be made. Using data simulated under scenarios that vary in marker density, genotype relative risk of a causative allele, and mode of inheritance, we show that the proposed approach has better localization properties and leads to lower false-positive rates than do single-locus analyses. Finally, we present an application of our method to a quasi-synthetic data set in which data from the CYP2D6 region are embedded within simulated data on 100K single-nucleotide polymorphisms. Analysis is quick (<5 min), and we are able to localize the causative site to a very short interval.
A tutorial introduction to Bayesian models of cognitive development.
Perfors, Amy; Tenenbaum, Joshua B; Griffiths, Thomas L; Xu, Fei
2011-09-01
We present an introduction to Bayesian inference as it is used in probabilistic models of cognitive development. Our goal is to provide an intuitive and accessible guide to the what, the how, and the why of the Bayesian approach: what sorts of problems and data the framework is most relevant for, and how and why it may be useful for developmentalists. We emphasize a qualitative understanding of Bayesian inference, but also include information about additional resources for those interested in the cognitive science applications, mathematical foundations, or machine learning details in more depth. In addition, we discuss some important interpretation issues that often arise when evaluating Bayesian models in cognitive science. Copyright © 2010 Elsevier B.V. All rights reserved.
Biagini, Francesca
2016-01-01
This book provides an introduction to elementary probability and to Bayesian statistics using de Finetti's subjectivist approach. One of the features of this approach is that it does not require the introduction of sample space – a non-intrinsic concept that makes the treatment of elementary probability unnecessarily complicate – but introduces as fundamental the concept of random numbers directly related to their interpretation in applications. Events become a particular case of random numbers and probability a particular case of expectation when it is applied to events. The subjective evaluation of expectation and of conditional expectation is based on an economic choice of an acceptable bet or penalty. The properties of expectation and conditional expectation are derived by applying a coherence criterion that the evaluation has to follow. The book is suitable for all introductory courses in probability and statistics for students in Mathematics, Informatics, Engineering, and Physics.
Bayesian analysis of a reduced-form air quality model.
Foley, Kristen M; Reich, Brian J; Napelenok, Sergey L
2012-07-17
Numerical air quality models are being used for assessing emission control strategies for improving ambient pollution levels across the globe. This paper applies probabilistic modeling to evaluate the effectiveness of emission reduction scenarios aimed at lowering ground-level ozone concentrations. A Bayesian hierarchical model is used to combine air quality model output and monitoring data in order to characterize the impact of emissions reductions while accounting for different degrees of uncertainty in the modeled emissions inputs. The probabilistic model predictions are weighted based on population density in order to better quantify the societal benefits/disbenefits of four hypothetical emission reduction scenarios in which domain-wide NO(x) emissions from various sectors are reduced individually and then simultaneously. Cross validation analysis shows the statistical model performs well compared to observed ozone levels. Accounting for the variability and uncertainty in the emissions and atmospheric systems being modeled is shown to impact how emission reduction scenarios would be ranked, compared to standard methodology.
Bayesian inference and model comparison for metallic fatigue data
Babuska, Ivo
2016-01-06
In this work, we present a statistical treatment of stress-life (S-N) data drawn from a collection of records of fatigue experiments that were performed on 75S-T6 aluminum alloys. Our main objective is to predict the fatigue life of materials by providing a systematic approach to model calibration, model selection and model ranking with reference to S-N data. To this purpose, we consider fatigue-limit models and random fatigue-limit models that are specially designed to allow the treatment of the run-outs (right-censored data). We first fit the models to the data by maximum likelihood methods and estimate the quantiles of the life distribution of the alloy specimen. We then compare and rank the models by classical measures of fit based on information criteria. We also consider a Bayesian approach that provides, under the prior distribution of the model parameters selected by the user, their simulation-based posterior distributions.
Bayesian conditional-independence modeling of the AIDS epidemic in England and Wales
Gilks, Walter R.; De Angelis, Daniela; Day, Nicholas E.
We describe the use of conditional-independence modeling, Bayesian inference and Markov chain Monte Carlo, to model and project the HIV-AIDS epidemic in homosexual/bisexual males in England and Wales. Complexity in this analysis arises through selectively missing data, indirectly observed underlying processes, and measurement error. Our emphasis is on presentation and discussion of the concepts, not on the technicalities of this analysis, which can be found elsewhere [D. De Angelis, W.R. Gilks, N.E. Day, Bayesian projection of the the acquired immune deficiency syndrome epidemic (with discussion), Applied Statistics, in press].
Statistical models for expert judgement and wear prediction
International Nuclear Information System (INIS)
Pulkkinen, U.
1994-01-01
This thesis studies the statistical analysis of expert judgements and prediction of wear. The point of view adopted is the one of information theory and Bayesian statistics. A general Bayesian framework for analyzing both the expert judgements and wear prediction is presented. Information theoretic interpretations are given for some averaging techniques used in the determination of consensus distributions. Further, information theoretic models are compared with a Bayesian model. The general Bayesian framework is then applied in analyzing expert judgements based on ordinal comparisons. In this context, the value of information lost in the ordinal comparison process is analyzed by applying decision theoretic concepts. As a generalization of the Bayesian framework, stochastic filtering models for wear prediction are formulated. These models utilize the information from condition monitoring measurements in updating the residual life distribution of mechanical components. Finally, the application of stochastic control models in optimizing operational strategies for inspected components are studied. Monte-Carlo simulation methods, such as the Gibbs sampler and the stochastic quasi-gradient method, are applied in the determination of posterior distributions and in the solution of stochastic optimization problems. (orig.) (57 refs., 7 figs., 1 tab.)
Model Criticism of Bayesian Networks with Latent Variables.
Williamson, David M.; Mislevy, Robert J.; Almond, Russell G.
This study investigated statistical methods for identifying errors in Bayesian networks (BN) with latent variables, as found in intelligent cognitive assessments. BN, commonly used in artificial intelligence systems, are promising mechanisms for scoring constructed-response examinations. The success of an intelligent assessment or tutoring system…
A Bayesian alternative for multi-objective ecohydrological model specification
Tang, Yating; Marshall, Lucy; Sharma, Ashish; Ajami, Hoori
2018-01-01
Recent studies have identified the importance of vegetation processes in terrestrial hydrologic systems. Process-based ecohydrological models combine hydrological, physical, biochemical and ecological processes of the catchments, and as such are generally more complex and parametric than conceptual hydrological models. Thus, appropriate calibration objectives and model uncertainty analysis are essential for ecohydrological modeling. In recent years, Bayesian inference has become one of the most popular tools for quantifying the uncertainties in hydrological modeling with the development of Markov chain Monte Carlo (MCMC) techniques. The Bayesian approach offers an appealing alternative to traditional multi-objective hydrologic model calibrations by defining proper prior distributions that can be considered analogous to the ad-hoc weighting often prescribed in multi-objective calibration. Our study aims to develop appropriate prior distributions and likelihood functions that minimize the model uncertainties and bias within a Bayesian ecohydrological modeling framework based on a traditional Pareto-based model calibration technique. In our study, a Pareto-based multi-objective optimization and a formal Bayesian framework are implemented in a conceptual ecohydrological model that combines a hydrological model (HYMOD) and a modified Bucket Grassland Model (BGM). Simulations focused on one objective (streamflow/LAI) and multiple objectives (streamflow and LAI) with different emphasis defined via the prior distribution of the model error parameters. Results show more reliable outputs for both predicted streamflow and LAI using Bayesian multi-objective calibration with specified prior distributions for error parameters based on results from the Pareto front in the ecohydrological modeling. The methodology implemented here provides insight into the usefulness of multiobjective Bayesian calibration for ecohydrologic systems and the importance of appropriate prior
A Bayesian joint model of menstrual cycle length and fecundity.
Lum, Kirsten J; Sundaram, Rajeshwari; Buck Louis, Germaine M; Louis, Thomas A
2016-03-01
Menstrual cycle length (MCL) has been shown to play an important role in couple fecundity, which is the biologic capacity for reproduction irrespective of pregnancy intentions. However, a comprehensive assessment of its role requires a fecundity model that accounts for male and female attributes and the couple's intercourse pattern relative to the ovulation day. To this end, we employ a Bayesian joint model for MCL and pregnancy. MCLs follow a scale multiplied (accelerated) mixture model with Gaussian and Gumbel components; the pregnancy model includes MCL as a covariate and computes the cycle-specific probability of pregnancy in a menstrual cycle conditional on the pattern of intercourse and no previous fertilization. Day-specific fertilization probability is modeled using natural, cubic splines. We analyze data from the Longitudinal Investigation of Fertility and the Environment Study (the LIFE Study), a couple based prospective pregnancy study, and find a statistically significant quadratic relation between fecundity and menstrual cycle length, after adjustment for intercourse pattern and other attributes, including male semen quality, both partner's age, and active smoking status (determined by baseline cotinine level 100 ng/mL). We compare results to those produced by a more basic model and show the advantages of a more comprehensive approach. © 2015, The International Biometric Society.
Metrics for evaluating performance and uncertainty of Bayesian network models
Bruce G. Marcot
2012-01-01
This paper presents a selected set of existing and new metrics for gauging Bayesian network model performance and uncertainty. Selected existing and new metrics are discussed for conducting model sensitivity analysis (variance reduction, entropy reduction, case file simulation); evaluating scenarios (influence analysis); depicting model complexity (numbers of model...
Bayesian Estimation of the Logistic Positive Exponent IRT Model
Bolfarine, Heleno; Bazan, Jorge Luis
2010-01-01
A Bayesian inference approach using Markov Chain Monte Carlo (MCMC) is developed for the logistic positive exponent (LPE) model proposed by Samejima and for a new skewed Logistic Item Response Theory (IRT) model, named Reflection LPE model. Both models lead to asymmetric item characteristic curves (ICC) and can be appropriate because a symmetric…
Kaewprag, Pacharmon; Newton, Cheryl; Vermillion, Brenda; Hyun, Sookyung; Huang, Kun; Machiraju, Raghu
2017-07-05
We develop predictive models enabling clinicians to better understand and explore patient clinical data along with risk factors for pressure ulcers in intensive care unit patients from electronic health record data. Identifying accurate risk factors of pressure ulcers is essential to determining appropriate prevention strategies; in this work we examine medication, diagnosis, and traditional Braden pressure ulcer assessment scale measurements as patient features. In order to predict pressure ulcer incidence and better understand the structure of related risk factors, we construct Bayesian networks from patient features. Bayesian network nodes (features) and edges (conditional dependencies) are simplified with statistical network techniques. Upon reviewing a network visualization of our model, our clinician collaborators were able to identify strong relationships between risk factors widely recognized as associated with pressure ulcers. We present a three-stage framework for predictive analysis of patient clinical data: 1) Developing electronic health record feature extraction functions with assistance of clinicians, 2) simplifying features, and 3) building Bayesian network predictive models. We evaluate all combinations of Bayesian network models from different search algorithms, scoring functions, prior structure initializations, and sets of features. From the EHRs of 7,717 ICU patients, we construct Bayesian network predictive models from 86 medication, diagnosis, and Braden scale features. Our model not only identifies known and suspected high PU risk factors, but also substantially increases sensitivity of the prediction - nearly three times higher comparing to logistical regression models - without sacrificing the overall accuracy. We visualize a representative model with which our clinician collaborators identify strong relationships between risk factors widely recognized as associated with pressure ulcers. Given the strong adverse effect of pressure ulcers
Using consensus bayesian network to model the reactive oxygen species regulatory pathway.
Directory of Open Access Journals (Sweden)
Liangdong Hu
Full Text Available Bayesian network is one of the most successful graph models for representing the reactive oxygen species regulatory pathway. With the increasing number of microarray measurements, it is possible to construct the bayesian network from microarray data directly. Although large numbers of bayesian network learning algorithms have been developed, when applying them to learn bayesian networks from microarray data, the accuracies are low due to that the databases they used to learn bayesian networks contain too few microarray data. In this paper, we propose a consensus bayesian network which is constructed by combining bayesian networks from relevant literatures and bayesian networks learned from microarray data. It would have a higher accuracy than the bayesian networks learned from one database. In the experiment, we validated the bayesian network combination algorithm on several classic machine learning databases and used the consensus bayesian network to model the Escherichia coli's ROS pathway.
Energy Technology Data Exchange (ETDEWEB)
Brown, Justin [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Hund, Lauren [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
2017-02-01
Dynamic compression experiments are being performed on complicated materials using increasingly complex drivers. The data produced in these experiments are beginning to reach a regime where traditional analysis techniques break down; requiring the solution of an inverse problem. A common measurement in dynamic experiments is an interface velocity as a function of time, and often this functional output can be simulated using a hydrodynamics code. Bayesian model calibration is a statistical framework to estimate inputs into a computational model in the presence of multiple uncertainties, making it well suited to measurements of this type. In this article, we apply Bayesian model calibration to high pressure (250 GPa) ramp compression measurements in tantalum. We address several issues speci c to this calibration including the functional nature of the output as well as parameter and model discrepancy identi ability. Speci cally, we propose scaling the likelihood function by an e ective sample size rather than modeling the autocorrelation function to accommodate the functional output and propose sensitivity analyses using the notion of `modularization' to assess the impact of experiment-speci c nuisance input parameters on estimates of material properties. We conclude that the proposed Bayesian model calibration procedure results in simple, fast, and valid inferences on the equation of state parameters for tantalum.
Ensemble bayesian model averaging using markov chain Monte Carlo sampling
Energy Technology Data Exchange (ETDEWEB)
Vrugt, Jasper A [Los Alamos National Laboratory; Diks, Cees G H [NON LANL; Clark, Martyn P [NON LANL
2008-01-01
Bayesian model averaging (BMA) has recently been proposed as a statistical method to calibrate forecast ensembles from numerical weather models. Successful implementation of BMA however, requires accurate estimates of the weights and variances of the individual competing models in the ensemble. In their seminal paper (Raftery etal. Mon Weather Rev 133: 1155-1174, 2(05)) has recommended the Expectation-Maximization (EM) algorithm for BMA model training, even though global convergence of this algorithm cannot be guaranteed. In this paper, we compare the performance of the EM algorithm and the recently developed Differential Evolution Adaptive Metropolis (DREAM) Markov Chain Monte Carlo (MCMC) algorithm for estimating the BMA weights and variances. Simulation experiments using 48-hour ensemble data of surface temperature and multi-model stream-flow forecasts show that both methods produce similar results, and that their performance is unaffected by the length of the training data set. However, MCMC simulation with DREAM is capable of efficiently handling a wide variety of BMA predictive distributions, and provides useful information about the uncertainty associated with the estimated BMA weights and variances.
Bayesian calibration of the Community Land Model using surrogates
Energy Technology Data Exchange (ETDEWEB)
Ray, Jaideep; Hou, Zhangshuan; Huang, Maoyi; Swiler, Laura Painton
2014-02-01
We present results from the Bayesian calibration of hydrological parameters of the Community Land Model (CLM), which is often used in climate simulations and Earth system models. A statistical inverse problem is formulated for three hydrological parameters, conditional on observations of latent heat surface fluxes over 48 months. Our calibration method uses polynomial and Gaussian process surrogates of the CLM, and solves the parameter estimation problem using a Markov chain Monte Carlo sampler. Posterior probability densities for the parameters are developed for two sites with different soil and vegetation covers. Our method also allows us to examine the structural error in CLM under two error models. We find that surrogate models can be created for CLM in most cases. The posterior distributions are more predictive than the default parameter values in CLM. Climatologically averaging the observations does not modify the parameters' distributions significantly. The structural error model reveals a correlation time-scale which can be used to identify the physical process that could be contributing to it. While the calibrated CLM has a higher predictive skill, the calibration is under-dispersive.
Chen, Mingjie; Izady, Azizallah; Abdalla, Osman A.; Amerjeed, Mansoor
2018-02-01
Bayesian inference using Markov Chain Monte Carlo (MCMC) provides an explicit framework for stochastic calibration of hydrogeologic models accounting for uncertainties; however, the MCMC sampling entails a large number of model calls, and could easily become computationally unwieldy if the high-fidelity hydrogeologic model simulation is time consuming. This study proposes a surrogate-based Bayesian framework to address this notorious issue, and illustrates the methodology by inverse modeling a regional MODFLOW model. The high-fidelity groundwater model is approximated by a fast statistical model using Bagging Multivariate Adaptive Regression Spline (BMARS) algorithm, and hence the MCMC sampling can be efficiently performed. In this study, the MODFLOW model is developed to simulate the groundwater flow in an arid region of Oman consisting of mountain-coast aquifers, and used to run representative simulations to generate training dataset for BMARS model construction. A BMARS-based Sobol' method is also employed to efficiently calculate input parameter sensitivities, which are used to evaluate and rank their importance for the groundwater flow model system. According to sensitivity analysis, insensitive parameters are screened out of Bayesian inversion of the MODFLOW model, further saving computing efforts. The posterior probability distribution of input parameters is efficiently inferred from the prescribed prior distribution using observed head data, demonstrating that the presented BMARS-based Bayesian framework is an efficient tool to reduce parameter uncertainties of a groundwater system.
Bayesian Proteoform Modeling Improves Protein Quantification of Global Proteomic Measurements
Energy Technology Data Exchange (ETDEWEB)
Webb-Robertson, Bobbie-Jo M.; Matzke, Melissa M.; Datta, Susmita; Payne, Samuel H.; Kang, Jiyun; Bramer, Lisa M.; Nicora, Carrie D.; Shukla, Anil K.; Metz, Thomas O.; Rodland, Karin D.; Smith, Richard D.; Tardiff, Mark F.; McDermott, Jason E.; Pounds, Joel G.; Waters, Katrina M.
2014-12-01
As the capability of mass spectrometry-based proteomics has matured, tens of thousands of peptides can be measured simultaneously, which has the benefit of offering a systems view of protein expression. However, a major challenge is that with an increase in throughput, protein quantification estimation from the native measured peptides has become a computational task. A limitation to existing computationally-driven protein quantification methods is that most ignore protein variation, such as alternate splicing of the RNA transcript and post-translational modifications or other possible proteoforms, which will affect a significant fraction of the proteome. The consequence of this assumption is that statistical inference at the protein level, and consequently downstream analyses, such as network and pathway modeling, have only limited power for biomarker discovery. Here, we describe a Bayesian model (BP-Quant) that uses statistically derived peptides signatures to identify peptides that are outside the dominant pattern, or the existence of multiple over-expressed patterns to improve relative protein abundance estimates. It is a research-driven approach that utilizes the objectives of the experiment, defined in the context of a standard statistical hypothesis, to identify a set of peptides exhibiting similar statistical behavior relating to a protein. This approach infers that changes in relative protein abundance can be used as a surrogate for changes in function, without necessarily taking into account the effect of differential post-translational modifications, processing, or splicing in altering protein function. We verify the approach using a dilution study from mouse plasma samples and demonstrate that BP-Quant achieves similar accuracy as the current state-of-the-art methods at proteoform identification with significantly better specificity. BP-Quant is available as a MatLab ® and R packages at https://github.com/PNNL-Comp-Mass-Spec/BP-Quant.
The application of a hierarchical Bayesian spatiotemporal model for ...
Indian Academy of Sciences (India)
2005.09.070. Sahu S K and Bakar K S 2012 Hierarchical bayesian autore- gressive models for large space-time data with application to ozone concentration modeling; Appl. Stochastic Models. Bus. Ind. 28 395–415, doi: 10.1002/asmb.1951.
A Bayesian Infinite Hidden Markov Vector Autoregressive Model
D. Nibbering (Didier); R. Paap (Richard); M. van der Wel (Michel)
2016-01-01
textabstractWe propose a Bayesian infinite hidden Markov model to estimate time-varying parameters in a vector autoregressive model. The Markov structure allows for heterogeneity over time while accounting for state-persistence. By modelling the transition distribution as a Dirichlet process mixture
Maritime piracy situation modelling with dynamic Bayesian networks
CSIR Research Space (South Africa)
Dabrowski, James M
2015-05-01
Full Text Available A generative model for modelling maritime vessel behaviour is proposed. The model is a novel variant of the dynamic Bayesian network (DBN). The proposed DBN is in the form of a switching linear dynamic system (SLDS) that has been extended into a...
Bayesian Plackett-Luce Mixture Models for Partially Ranked Data.
Mollica, Cristina; Tardella, Luca
2017-06-01
The elicitation of an ordinal judgment on multiple alternatives is often required in many psychological and behavioral experiments to investigate preference/choice orientation of a specific population. The Plackett-Luce model is one of the most popular and frequently applied parametric distributions to analyze rankings of a finite set of items. The present work introduces a Bayesian finite mixture of Plackett-Luce models to account for unobserved sample heterogeneity of partially ranked data. We describe an efficient way to incorporate the latent group structure in the data augmentation approach and the derivation of existing maximum likelihood procedures as special instances of the proposed Bayesian method. Inference can be conducted with the combination of the Expectation-Maximization algorithm for maximum a posteriori estimation and the Gibbs sampling iterative procedure. We additionally investigate several Bayesian criteria for selecting the optimal mixture configuration and describe diagnostic tools for assessing the fitness of ranking distributions conditionally and unconditionally on the number of ranked items. The utility of the novel Bayesian parametric Plackett-Luce mixture for characterizing sample heterogeneity is illustrated with several applications to simulated and real preference ranked data. We compare our method with the frequentist approach and a Bayesian nonparametric mixture model both assuming the Plackett-Luce model as a mixture component. Our analysis on real datasets reveals the importance of an accurate diagnostic check for an appropriate in-depth understanding of the heterogenous nature of the partial ranking data.
Comparative performance of Bayesian and AIC-based measures of phylogenetic model uncertainty.
Alfaro, Michael E; Huelsenbeck, John P
2006-02-01
Reversible-jump Markov chain Monte Carlo (RJ-MCMC) is a technique for simultaneously evaluating multiple related (but not necessarily nested) statistical models that has recently been applied to the problem of phylogenetic model selection. Here we use a simulation approach to assess the performance of this method and compare it to Akaike weights, a measure of model uncertainty that is based on the Akaike information criterion. Under conditions where the assumptions of the candidate models matched the generating conditions, both Bayesian and AIC-based methods perform well. The 95% credible interval contained the generating model close to 95% of the time. However, the size of the credible interval differed with the Bayesian credible set containing approximately 25% to 50% fewer models than an AIC-based credible interval. The posterior probability was a better indicator of the correct model than the Akaike weight when all assumptions were met but both measures performed similarly when some model assumptions were violated. Models in the Bayesian posterior distribution were also more similar to the generating model in their number of parameters and were less biased in their complexity. In contrast, Akaike-weighted models were more distant from the generating model and biased towards slightly greater complexity. The AIC-based credible interval appeared to be more robust to the violation of the rate homogeneity assumption. Both AIC and Bayesian approaches suggest that substantial uncertainty can accompany the choice of model for phylogenetic analyses, suggesting that alternative candidate models should be examined in analysis of phylogenetic data. [AIC; Akaike weights; Bayesian phylogenetics; model averaging; model selection; model uncertainty; posterior probability; reversible jump.].
Assessing fit in Bayesian models for spatial processes
Jun, M.
2014-09-16
© 2014 John Wiley & Sons, Ltd. Gaussian random fields are frequently used to model spatial and spatial-temporal data, particularly in geostatistical settings. As much of the attention of the statistics community has been focused on defining and estimating the mean and covariance functions of these processes, little effort has been devoted to developing goodness-of-fit tests to allow users to assess the models\\' adequacy. We describe a general goodness-of-fit test and related graphical diagnostics for assessing the fit of Bayesian Gaussian process models using pivotal discrepancy measures. Our method is applicable for both regularly and irregularly spaced observation locations on planar and spherical domains. The essential idea behind our method is to evaluate pivotal quantities defined for a realization of a Gaussian random field at parameter values drawn from the posterior distribution. Because the nominal distribution of the resulting pivotal discrepancy measures is known, it is possible to quantitatively assess model fit directly from the output of Markov chain Monte Carlo algorithms used to sample from the posterior distribution on the parameter space. We illustrate our method in a simulation study and in two applications.
Image Segmentation Using Disjunctive Normal Bayesian Shape and Appearance Models.
Mesadi, Fitsum; Erdil, Ertunc; Cetin, Mujdat; Tasdizen, Tolga
2018-01-01
The use of appearance and shape priors in image segmentation is known to improve accuracy; however, existing techniques have several drawbacks. For instance, most active shape and appearance models require landmark points and assume unimodal shape and appearance distributions, and the level set representation does not support construction of local priors. In this paper, we present novel appearance and shape models for image segmentation based on a differentiable implicit parametric shape representation called a disjunctive normal shape model (DNSM). The DNSM is formed by the disjunction of polytopes, which themselves are formed by the conjunctions of half-spaces. The DNSM's parametric nature allows the use of powerful local prior statistics, and its implicit nature removes the need to use landmarks and easily handles topological changes. In a Bayesian inference framework, we model arbitrary shape and appearance distributions using nonparametric density estimations, at any local scale. The proposed local shape prior results in accurate segmentation even when very few training shapes are available, because the method generates a rich set of shape variations by locally combining training samples. We demonstrate the performance of the framework by applying it to both 2-D and 3-D data sets with emphasis on biomedical image segmentation applications.
Bayesian inference of chemical kinetic models from proposed reactions
Galagali, Nikhil
2015-02-01
© 2014 Elsevier Ltd. Bayesian inference provides a natural framework for combining experimental data with prior knowledge to develop chemical kinetic models and quantify the associated uncertainties, not only in parameter values but also in model structure. Most existing applications of Bayesian model selection methods to chemical kinetics have been limited to comparisons among a small set of models, however. The significant computational cost of evaluating posterior model probabilities renders traditional Bayesian methods infeasible when the model space becomes large. We present a new framework for tractable Bayesian model inference and uncertainty quantification using a large number of systematically generated model hypotheses. The approach involves imposing point-mass mixture priors over rate constants and exploring the resulting posterior distribution using an adaptive Markov chain Monte Carlo method. The posterior samples are used to identify plausible models, to quantify rate constant uncertainties, and to extract key diagnostic information about model structure-such as the reactions and operating pathways most strongly supported by the data. We provide numerical demonstrations of the proposed framework by inferring kinetic models for catalytic steam and dry reforming of methane using available experimental data.
Statistical Model for Content Extraction
DEFF Research Database (Denmark)
Qureshi, Pir Abdul Rasool; Memon, Nasrullah
2011-01-01
We present a statistical model for content extraction from HTML documents. The model operates on Document Object Model (DOM) tree of the corresponding HTML document. It evaluates each tree node and associated statistical features to predict significance of the node towards overall content of the ...... also describe the significance of the model in the domain of counterterrorism and open source intelligence....
A mixture copula Bayesian network model for multimodal genomic data
Directory of Open Access Journals (Sweden)
Qingyang Zhang
2017-04-01
Full Text Available Gaussian Bayesian networks have become a widely used framework to estimate directed associations between joint Gaussian variables, where the network structure encodes the decomposition of multivariate normal density into local terms. However, the resulting estimates can be inaccurate when the normality assumption is moderately or severely violated, making it unsuitable for dealing with recent genomic data such as the Cancer Genome Atlas data. In the present paper, we propose a mixture copula Bayesian network model which provides great flexibility in modeling non-Gaussian and multimodal data for causal inference. The parameters in mixture copula functions can be efficiently estimated by a routine expectation–maximization algorithm. A heuristic search algorithm based on Bayesian information criterion is developed to estimate the network structure, and prediction can be further improved by the best-scoring network out of multiple predictions from random initial values. Our method outperforms Gaussian Bayesian networks and regular copula Bayesian networks in terms of modeling flexibility and prediction accuracy, as demonstrated using a cell signaling data set. We apply the proposed methods to the Cancer Genome Atlas data to study the genetic and epigenetic pathways that underlie serous ovarian cancer.
Parameter Estimation of Structural Equation Modeling Using Bayesian Approach
Directory of Open Access Journals (Sweden)
Dewi Kurnia Sari
2016-05-01
Full Text Available Leadership is a process of influencing, directing or giving an example of employees in order to achieve the objectives of the organization and is a key element in the effectiveness of the organization. In addition to the style of leadership, the success of an organization or company in achieving its objectives can also be influenced by the commitment of the organization. Where organizational commitment is a commitment created by each individual for the betterment of the organization. The purpose of this research is to obtain a model of leadership style and organizational commitment to job satisfaction and employee performance, and determine the factors that influence job satisfaction and employee performance using SEM with Bayesian approach. This research was conducted at Statistics FNI employees in Malang, with 15 people. The result of this study showed that the measurement model, all significant indicators measure each latent variable. Meanwhile in the structural model, it was concluded there are a significant difference between the variables of Leadership Style and Organizational Commitment toward Job Satisfaction directly as well as a significant difference between Job Satisfaction on Employee Performance. As for the influence of Leadership Style and variable Organizational Commitment on Employee Performance directly declared insignificant.
Involving stakeholders in building integrated fisheries models using Bayesian methods
DEFF Research Database (Denmark)
Haapasaari, Päivi Elisabet; Mäntyniemi, Samu; Kuikka, Sakari
2013-01-01
the potential of the study to contribute to the development of participatory modeling practices. It is concluded that the subjective perspective to knowledge, that is fundamental in Bayesian theory, suits participatory modeling better than a positivist paradigm that seeks the objective truth. The methodology...
Validation & verification of a Bayesian network model for aircraft vulnerability
CSIR Research Space (South Africa)
Schietekat, Sunelle
2016-09-01
Full Text Available This paper provides a methodology for Validation and Verification (V&V) of a Bayesian Network (BN) model for aircraft vulnerability against Infrared (IR) missile threats. The model considers that the aircraft vulnerability depends both on a missile...
Bayesian Network Models in Cyber Security: A Systematic Review
Chockalingam, S.; Pieters, W.; Herdeiro Teixeira, A.M.; van Gelder, P.H.A.J.M.; Lipmaa, Helger; Mitrokotsa, Aikaterini; Matulevicius, Raimundas
2017-01-01
Bayesian Networks (BNs) are an increasingly popular modelling technique in cyber security especially due to their capability to overcome data limitations. This is also instantiated by the growth of BN models development in cyber security. However, a comprehensive comparison and analysis of these
Prudhomme, Serge
2015-09-17
Parameter estimation for complex models using Bayesian inference is usually a very costly process as it requires a large number of solves of the forward problem. We show here how the construction of adaptive surrogate models using a posteriori error estimates for quantities of interest can significantly reduce the computational cost in problems of statistical inference. As surrogate models provide only approximations of the true solutions of the forward problem, it is nevertheless necessary to control these errors in order to construct an accurate reduced model with respect to the observables utilized in the identification of the model parameters. Effectiveness of the proposed approach is demonstrated on a numerical example dealing with the Spalart–Allmaras model for the simulation of turbulent channel flows. In particular, we illustrate how Bayesian model selection using the adapted surrogate model in place of solving the coupled nonlinear equations leads to the same quality of results while requiring fewer nonlinear PDE solves.
Research & development and growth: A Bayesian model averaging analysis
Czech Academy of Sciences Publication Activity Database
Horváth, Roman
2011-01-01
Roč. 28, č. 6 (2011), s. 2669-2673 ISSN 0264-9993. [Society for Non-linear Dynamics and Econometrics Annual Conferencen. Washington DC, 16.03.2011-18.03.2011] R&D Projects: GA ČR GA402/09/0965 Institutional research plan: CEZ:AV0Z10750506 Keywords : Research and development * Growth * Bayesian model averaging Subject RIV: AH - Economics Impact factor: 0.701, year: 2011 http://library.utia.cas.cz/separaty/2011/E/horvath-research & development and growth a bayesian model averaging analysis.pdf
Bayesian log-periodic model for financial crashes
Rodríguez-Caballero, Carlos Vladimir; Knapik, Oskar
2014-10-01
This paper introduces a Bayesian approach in econophysics literature about financial bubbles in order to estimate the most probable time for a financial crash to occur. To this end, we propose using noninformative prior distributions to obtain posterior distributions. Since these distributions cannot be performed analytically, we develop a Markov Chain Monte Carlo algorithm to draw from posterior distributions. We consider three Bayesian models that involve normal and Student's t-distributions in the disturbances and an AR(1)-GARCH(1,1) structure only within the first case. In the empirical part of the study, we analyze a well-known example of financial bubble - the S&P 500 1987 crash - to show the usefulness of the three methods under consideration and crashes of Merval-94, Bovespa-97, IPCMX-94, Hang Seng-97 using the simplest method. The novelty of this research is that the Bayesian models provide 95% credible intervals for the estimated crash time.
Methods of statistical model estimation
Hilbe, Joseph
2013-01-01
Methods of Statistical Model Estimation examines the most important and popular methods used to estimate parameters for statistical models and provide informative model summary statistics. Designed for R users, the book is also ideal for anyone wanting to better understand the algorithms used for statistical model fitting. The text presents algorithms for the estimation of a variety of regression procedures using maximum likelihood estimation, iteratively reweighted least squares regression, the EM algorithm, and MCMC sampling. Fully developed, working R code is constructed for each method. Th
Exclusion statistics and integrable models
International Nuclear Information System (INIS)
Mashkevich, S.
1998-01-01
The definition of exclusion statistics that was given by Haldane admits a 'statistical interaction' between distinguishable particles (multispecies statistics). For such statistics, thermodynamic quantities can be evaluated exactly; explicit expressions are presented here for cluster coefficients. Furthermore, single-species exclusion statistics is realized in one-dimensional integrable models of the Calogero-Sutherland type. The interesting questions of generalizing this correspondence to the higher-dimensional and the multispecies cases remain essentially open; however, our results provide some hints as to searches for the models in question
Directory of Open Access Journals (Sweden)
Wills Rachael A
2009-05-01
Full Text Available Abstract Background The problem of silent multiple comparisons is one of the most difficult statistical problems faced by scientists. It is a particular problem for investigating a one-off cancer cluster reported to a health department because any one of hundreds, or possibly thousands, of neighbourhoods, schools, or workplaces could have reported a cluster, which could have been for any one of several types of cancer or any one of several time periods. Methods This paper contrasts the frequentist approach with a Bayesian approach for dealing with silent multiple comparisons in the context of a one-off cluster reported to a health department. Two published cluster investigations were re-analysed using the Dunn-Sidak method to adjust frequentist p-values and confidence intervals for silent multiple comparisons. Bayesian methods were based on the Gamma distribution. Results Bayesian analysis with non-informative priors produced results similar to the frequentist analysis, and suggested that both clusters represented a statistical excess. In the frequentist framework, the statistical significance of both clusters was extremely sensitive to the number of silent multiple comparisons, which can only ever be a subjective "guesstimate". The Bayesian approach is also subjective: whether there is an apparent statistical excess depends on the specified prior. Conclusion In cluster investigations, the frequentist approach is just as subjective as the Bayesian approach, but the Bayesian approach is less ambitious in that it treats the analysis as a synthesis of data and personal judgements (possibly poor ones, rather than objective reality. Bayesian analysis is (arguably a useful tool to support complicated decision-making, because it makes the uncertainty associated with silent multiple comparisons explicit.
Development of dynamic Bayesian models for web application test management
Azarnova, T. V.; Polukhin, P. V.; Bondarenko, Yu V.; Kashirina, I. L.
2018-03-01
The mathematical apparatus of dynamic Bayesian networks is an effective and technically proven tool that can be used to model complex stochastic dynamic processes. According to the results of the research, mathematical models and methods of dynamic Bayesian networks provide a high coverage of stochastic tasks associated with error testing in multiuser software products operated in a dynamically changing environment. Formalized representation of the discrete test process as a dynamic Bayesian model allows us to organize the logical connection between individual test assets for multiple time slices. This approach gives an opportunity to present testing as a discrete process with set structural components responsible for the generation of test assets. Dynamic Bayesian network-based models allow us to combine in one management area individual units and testing components with different functionalities and a direct influence on each other in the process of comprehensive testing of various groups of computer bugs. The application of the proposed models provides an opportunity to use a consistent approach to formalize test principles and procedures, methods used to treat situational error signs, and methods used to produce analytical conclusions based on test results.
Bayesian log-periodic model for financial crashes
DEFF Research Database (Denmark)
Rodríguez-Caballero, Carlos Vladimir; Knapik, Oskar
2014-01-01
This paper introduces a Bayesian approach in econophysics literature about financial bubbles in order to estimate the most probable time for a financial crash to occur. To this end, we propose using noninformative prior distributions to obtain posterior distributions. Since these distributions...... cannot be performed analytically, we develop a Markov Chain Monte Carlo algorithm to draw from posterior distributions. We consider three Bayesian models that involve normal and Student’s t-distributions in the disturbances and an AR(1)-GARCH(1,1) structure only within the first case. In the empirical...... part of the study, we analyze a well-known example of financial bubble – the S&P 500 1987 crash – to show the usefulness of the three methods under consideration and crashes of Merval-94, Bovespa-97, IPCMX-94, Hang Seng-97 using the simplest method. The novelty of this research is that the Bayesian...
Bayesian inference model for fatigue life of laminated composites
DEFF Research Database (Denmark)
Dimitrov, Nikolay Krasimirov; Kiureghian, Armen Der; Berggreen, Christian
2016-01-01
A probabilistic model for estimating the fatigue life of laminated composite plates is developed. The model is based on lamina-level input data, making it possible to predict fatigue properties for a wide range of laminate configurations. Model parameters are estimated by Bayesian inference....... The reference data used consists of constant-amplitude cycle test results for four laminates with different layup configurations. The paper describes the modeling techniques and the parameter estimation procedure, supported by an illustrative application....
Bayesian Model Comparison With the g-Prior
DEFF Research Database (Denmark)
Nielsen, Jesper Kjær; Christensen, Mads Græsbøll; Cemgil, Ali Taylan
2014-01-01
Model comparison and selection is an important problem in many model-based signal processing applications. Often, very simple information criteria such as the Akaike information criterion or the Bayesian information criterion are used despite their shortcomings. Compared to these methods, Djuric...... demonstrate that our proposed model comparison and selection rules outperform the traditional information criteria both in terms of detecting the true model and in terms of predicting unobserved data. The simulation code is available online....
Uncertainty analysis of depth predictions from seismic reflection data using Bayesian statistics
Michelioudakis, Dimitrios G.; Hobbs, Richard W.; Caiado, Camila C. S.
2018-03-01
Estimating the depths of target horizons from seismic reflection data is an important task in exploration geophysics. To constrain these depths we need a reliable and accurate velocity model. Here, we build an optimum 2D seismic reflection data processing flow focused on pre - stack deghosting filters and velocity model building and apply Bayesian methods, including Gaussian process emulation and Bayesian History Matching (BHM), to estimate the uncertainties of the depths of key horizons near the borehole DSDP-258 located in the Mentelle Basin, south west of Australia, and compare the results with the drilled core from that well. Following this strategy, the tie between the modelled and observed depths from DSDP-258 core was in accordance with the ± 2σ posterior credibility intervals and predictions for depths to key horizons were made for the two new drill sites, adjacent the existing borehole of the area. The probabilistic analysis allowed us to generate multiple realizations of pre-stack depth migrated images, these can be directly used to better constrain interpretation and identify potential risk at drill sites. The method will be applied to constrain the drilling targets for the upcoming International Ocean Discovery Program (IODP), leg 369.
A Bayesian network approach to coastal storm impact modeling
Jäger, W.S.; Den Heijer, C.; Bolle, A.; Hanea, A.M.
2015-01-01
In this paper we develop a Bayesian network (BN) that relates offshore storm conditions to their accompagnying flood characteristics and damages to residential buildings, following on the trend of integrated flood impact modeling. It is based on data from hydrodynamic storm simulations, information
Bayesian model discrimination for glucose-insulin homeostasis
DEFF Research Database (Denmark)
Andersen, Kim Emil; Brooks, Stephen P.; Højbjerre, Malene
In this paper we analyse a set of experimental data on a number of healthy and diabetic patients and discuss a variety of models for describing the physiological processes involved in glucose absorption and insulin secretion within the human body. We adopt a Bayesian approach which facilitates th...
Bayesian Modelling of fMRI Time Series
DEFF Research Database (Denmark)
Højen-Sørensen, Pedro; Hansen, Lars Kai; Rasmussen, Carl Edward
2000-01-01
We present a Hidden Markov Model (HMM) for inferring the hidden psychological state (or neural activity) during single trial fMRI activation experiments with blocked task paradigms. Inference is based on Bayesian methodology, using a combination of analytical and a variety of Markov Chain Monte...
Shortlist B: A Bayesian model of continuous speech recognition
Norris, D.; McQueen, J.M.
2008-01-01
A Bayesian model of continuous speech recognition is presented. It is based on Shortlist (D. Norris, 1994; D. Norris, J. M. McQueen, A. Cutler, & S. Butterfield, 1997) and shares many of its key assumptions: parallel competitive evaluation of multiple lexical hypotheses, phonologically abstract
Shortlist B: A Bayesian Model of Continuous Speech Recognition
Norris, Dennis; McQueen, James M.
2008-01-01
A Bayesian model of continuous speech recognition is presented. It is based on Shortlist (D. Norris, 1994; D. Norris, J. M. McQueen, A. Cutler, & S. Butterfield, 1997) and shares many of its key assumptions: parallel competitive evaluation of multiple lexical hypotheses, phonologically abstract prelexical and lexical representations, a feedforward…
Efficient Bayesian Estimation and Combination of GARCH-Type Models
D. David (David); L.F. Hoogerheide (Lennart)
2010-01-01
textabstractThis paper proposes an up-to-date review of estimation strategies available for the Bayesian inference of GARCH-type models. The emphasis is put on a novel efficient procedure named AdMitIS. The methodology automatically constructs a mixture of Student-t distributions as an approximation
International Nuclear Information System (INIS)
Coolen, F.P.A.
1997-01-01
This paper is intended to make researchers in reliability theory aware of a recently introduced Bayesian model with imprecise prior distributions for statistical inference on failure data, that can also be considered as a robust Bayesian model. The model consists of a multinomial distribution with Dirichlet priors, making the approach basically nonparametric. New results for the model are presented, related to right-censored observations, where estimation based on this model is closely related to the product-limit estimator, which is an important statistical method to deal with reliability or survival data including right-censored observations. As for the product-limit estimator, the model considered in this paper aims at not using any information other than that provided by observed data, but our model fits into the robust Bayesian context which has the advantage that all inferences can be based on probabilities or expectations, or bounds for probabilities or expectations. The model uses a finite partition of the time-axis, and as such it is also related to life-tables
Sensometrics: Thurstonian and Statistical Models
DEFF Research Database (Denmark)
Christensen, Rune Haubo Bojesen
This thesis is concerned with the development and bridging of Thurstonian and statistical models for sensory discrimination testing as applied in the scientific discipline of sensometrics. In sensory discrimination testing sensory differences between products are detected and quantified by the us...... of generalized linear mixed models, cumulative link models and cumulative link mixed models. The relation between the Wald, likelihood and score statistics is expanded upon using the shape of the (profile) likelihood function as common reference....
Joshi, Deepti; St-Hilaire, André; Daigle, Anik; Ouarda, Taha B. M. J.
2013-04-01
SummaryThis study attempts to compare the performance of two statistical downscaling frameworks in downscaling hydrological indices (descriptive statistics) characterizing the low flow regimes of three rivers in Eastern Canada - Moisie, Romaine and Ouelle. The statistical models selected are Relevance Vector Machine (RVM), an implementation of Sparse Bayesian Learning, and the Automated Statistical Downscaling tool (ASD), an implementation of Multiple Linear Regression. Inputs to both frameworks involve climate variables significantly (α = 0.05) correlated with the indices. These variables were processed using Canonical Correlation Analysis and the resulting canonical variates scores were used as input to RVM to estimate the selected low flow indices. In ASD, the significantly correlated climate variables were subjected to backward stepwise predictor selection and the selected predictors were subsequently used to estimate the selected low flow indices using Multiple Linear Regression. With respect to the correlation between climate variables and the selected low flow indices, it was observed that all indices are influenced, primarily, by wind components (Vertical, Zonal and Meridonal) and humidity variables (Specific and Relative Humidity). The downscaling performance of the framework involving RVM was found to be better than ASD in terms of Relative Root Mean Square Error, Relative Mean Absolute Bias and Coefficient of Determination. In all cases, the former resulted in less variability of the performance indices between calibration and validation sets, implying better generalization ability than for the latter.
Bayesian Uncertainty Quantification for Subsurface Inversion Using a Multiscale Hierarchical Model
Mondal, Anirban
2014-07-03
We consider a Bayesian approach to nonlinear inverse problems in which the unknown quantity is a random field (spatial or temporal). The Bayesian approach contains a natural mechanism for regularization in the form of prior information, can incorporate information from heterogeneous sources and provide a quantitative assessment of uncertainty in the inverse solution. The Bayesian setting casts the inverse solution as a posterior probability distribution over the model parameters. The Karhunen-Loeve expansion is used for dimension reduction of the random field. Furthermore, we use a hierarchical Bayes model to inject multiscale data in the modeling framework. In this Bayesian framework, we show that this inverse problem is well-posed by proving that the posterior measure is Lipschitz continuous with respect to the data in total variation norm. Computational challenges in this construction arise from the need for repeated evaluations of the forward model (e.g., in the context of MCMC) and are compounded by high dimensionality of the posterior. We develop two-stage reversible jump MCMC that has the ability to screen the bad proposals in the first inexpensive stage. Numerical results are presented by analyzing simulated as well as real data from hydrocarbon reservoir. This article has supplementary material available online. © 2014 American Statistical Association and the American Society for Quality.
Statistical modeling for degradation data
Lio, Yuhlong; Ng, Hon; Tsai, Tzong-Ru
2017-01-01
This book focuses on the statistical aspects of the analysis of degradation data. In recent years, degradation data analysis has come to play an increasingly important role in different disciplines such as reliability, public health sciences, and finance. For example, information on products’ reliability can be obtained by analyzing degradation data. In addition, statistical modeling and inference techniques have been developed on the basis of different degradation measures. The book brings together experts engaged in statistical modeling and inference, presenting and discussing important recent advances in degradation data analysis and related applications. The topics covered are timely and have considerable potential to impact both statistics and reliability engineering.
Climatic Models Ensemble-based Mid-21st Century Runoff Projections: A Bayesian Framework
Achieng, K. O.; Zhu, J.
2017-12-01
There are a number of North American Regional Climate Change Assessment Program (NARCCAP) climatic models that have been used to project surface runoff in the mid-21st century. Statistical model selection techniques are often used to select the model that best fits data. However, model selection techniques often lead to different conclusions. In this study, ten models are averaged in Bayesian paradigm to project runoff. Bayesian Model Averaging (BMA) is used to project and identify effect of model uncertainty on future runoff projections. Baseflow separation - a two-digital filter which is also called Eckhardt filter - is used to separate USGS streamflow (total runoff) into two components: baseflow and surface runoff. We use this surface runoff as the a priori runoff when conducting BMA of runoff simulated from the ten RCM models. The primary objective of this study is to evaluate how well RCM multi-model ensembles simulate surface runoff, in a Bayesian framework. Specifically, we investigate and discuss the following questions: How well do ten RCM models ensemble jointly simulate surface runoff by averaging over all the models using BMA, given a priori surface runoff? What are the effects of model uncertainty on surface runoff simulation?
Copula Based Factorization in Bayesian Multivariate Infinite Mixture Models
Martin Burda; Artem Prokhorov
2012-01-01
Bayesian nonparametric models based on infinite mixtures of density kernels have been recently gaining in popularity due to their flexibility and feasibility of implementation even in complicated modeling scenarios. In economics, they have been particularly useful in estimating nonparametric distributions of latent variables. However, these models have been rarely applied in more than one dimension. Indeed, the multivariate case suffers from the curse of dimensionality, with a rapidly increas...
Non-parametric Bayesian models of response function in dynamic image sequences
Czech Academy of Sciences Publication Activity Database
Tichý, Ondřej; Šmídl, Václav
2016-01-01
Roč. 151, č. 1 (2016), s. 90-100 ISSN 1077-3142 R&D Projects: GA ČR GA13-29225S Institutional support: RVO:67985556 Keywords : Response function * Blind source separation * Dynamic medical imaging * Probabilistic models * Bayesian methods Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 2.498, year: 2016 http://library.utia.cas.cz/separaty/2016/AS/tichy-0456983.pdf
Bayat, Sahar; Cuggia, Marc; Kessler, Michel; Briançon, Serge; Le Beux, Pierre; Frimat, Luc
2008-01-01
Evaluation of adult candidates for kidney transplantation diverges from one centre to another. Our purpose was to assess the suitability of Bayesian method for describing the factors associated to registration on the waiting list in a French healthcare network. We have found no published paper using Bayesian method in this domain. Eight hundred and nine patients starting renal replacement therapy were included in the analysis. The data were extracted from the information system of the healthcare network. We performed conventional statistical analysis and data mining analysis using mainly Bayesian networks. The Bayesian model showed that the probability of registration on the waiting list is associated to age, cardiovascular disease, diabetes, serum albumin level, respiratory disease, physical impairment, follow-up in the department performing transplantation and past history of malignancy. These results are similar to conventional statistical method. The comparison between conventional analysis and data mining analysis showed us the contribution of the data mining method for sorting variables and having a global view of the variables' associations. Moreover theses approaches constitute an essential step toward a decisional information system for healthcare networks.
A Bayesian posterior predictive framework for weighting ensemble regional climate models
Directory of Open Access Journals (Sweden)
Y. Fan
2017-06-01
Full Text Available We present a novel Bayesian statistical approach to computing model weights in climate change projection ensembles in order to create probabilistic projections. The weight of each climate model is obtained by weighting the current day observed data under the posterior distribution admitted under competing climate models. We use a linear model to describe the model output and observations. The approach accounts for uncertainty in model bias, trend and internal variability, including error in the observations used. Our framework is general, requires very little problem-specific input, and works well with default priors. We carry out cross-validation checks that confirm that the method produces the correct coverage.
Statistical modelling with quantile functions
Gilchrist, Warren
2000-01-01
Galton used quantiles more than a hundred years ago in describing data. Tukey and Parzen used them in the 60s and 70s in describing populations. Since then, the authors of many papers, both theoretical and practical, have used various aspects of quantiles in their work. Until now, however, no one put all the ideas together to form what turns out to be a general approach to statistics.Statistical Modelling with Quantile Functions does just that. It systematically examines the entire process of statistical modelling, starting with using the quantile function to define continuous distributions. The author shows that by using this approach, it becomes possible to develop complex distributional models from simple components. A modelling kit can be developed that applies to the whole model - deterministic and stochastic components - and this kit operates by adding, multiplying, and transforming distributions rather than data.Statistical Modelling with Quantile Functions adds a new dimension to the practice of stati...
Bayesian model and spatial analysis of oral and oropharynx cancer mortality in Minas Gerais, Brazil.
Fonseca, Emílio Prado da; Oliveira, Cláudia Di Lorenzo; Chiaravalloti, Francisco; Pereira, Antonio Carlos; Vedovello, Silvia Amélia Scudeler; Meneghim, Marcelo de Castro
2018-01-01
The objective of this study was to determine of oral and oropharynx cancer mortality rate and the results were analyzed by applying the Spatial Analysis of Empirical Bayesian Model. To this end, we used the information contained in the International Classification of Diseases (ICD-10), Chapter II, Category C00 to C14 and Brazilian Mortality Information System (SIM) of Minas Gerais State. Descriptive statistics were observed and the gross rate of mortality was calculated for each municipality. Then Empirical Bayesian estimators were applied. The results showed that, in 2012, in the state of Minas Gerais, were registered 769 deaths of patients with cancer of oral and oropharynx, with 607 (78.96%) men and 162 (21.04%) women. There was a wide variation in spatial distribution of crude mortality rate and were identified agglomeration in the South, Central and North more accurately by Bayesian Estimator Global and Local Model. Through Bayesian models was possible to map the spatial clustering of deaths from oral cancer more accurately, and with the application of the method of spatial epidemiology, it was possible to obtain more accurate results and provide subsidies to reduce the number of deaths from this type of cancer.
Equifinality of formal (DREAM) and informal (GLUE) bayesian approaches in hydrologic modeling?
Energy Technology Data Exchange (ETDEWEB)
Vrugt, Jasper A [Los Alamos National Laboratory; Robinson, Bruce A [Los Alamos National Laboratory; Ter Braak, Cajo J F [NON LANL; Gupta, Hoshin V [NON LANL
2008-01-01
In recent years, a strong debate has emerged in the hydrologic literature regarding what constitutes an appropriate framework for uncertainty estimation. Particularly, there is strong disagreement whether an uncertainty framework should have its roots within a proper statistical (Bayesian) context, or whether such a framework should be based on a different philosophy and implement informal measures and weaker inference to summarize parameter and predictive distributions. In this paper, we compare a formal Bayesian approach using Markov Chain Monte Carlo (MCMC) with generalized likelihood uncertainty estimation (GLUE) for assessing uncertainty in conceptual watershed modeling. Our formal Bayesian approach is implemented using the recently developed differential evolution adaptive metropolis (DREAM) MCMC scheme with a likelihood function that explicitly considers model structural, input and parameter uncertainty. Our results demonstrate that DREAM and GLUE can generate very similar estimates of total streamflow uncertainty. This suggests that formal and informal Bayesian approaches have more common ground than the hydrologic literature and ongoing debate might suggest. The main advantage of formal approaches is, however, that they attempt to disentangle the effect of forcing, parameter and model structural error on total predictive uncertainty. This is key to improving hydrologic theory and to better understand and predict the flow of water through catchments.
Bayesian Network Model with Application to Smart Power Semiconductor Lifetime Data.
Plankensteiner, Kathrin; Bluder, Olivia; Pilz, Jürgen
2015-09-01
In this article, Bayesian networks are used to model semiconductor lifetime data obtained from a cyclic stress test system. The data of interest are a mixture of log-normal distributions, representing two dominant physical failure mechanisms. Moreover, the data can be censored due to limited test resources. For a better understanding of the complex lifetime behavior, interactions between test settings, geometric designs, material properties, and physical parameters of the semiconductor device are modeled by a Bayesian network. Statistical toolboxes in MATLAB® have been extended and applied to find the best structure of the Bayesian network and to perform parameter learning. Due to censored observations Markov chain Monte Carlo (MCMC) simulations are employed to determine the posterior distributions. For model selection the automatic relevance determination (ARD) algorithm and goodness-of-fit criteria such as marginal likelihoods, Bayes factors, posterior predictive density distributions, and sum of squared errors of prediction (SSEP) are applied and evaluated. The results indicate that the application of Bayesian networks to semiconductor reliability provides useful information about the interactions between the significant covariates and serves as a reliable alternative to currently applied methods. © 2015 Society for Risk Analysis.
Bayesian modelling of multiple diagnostics at Wendelstein 7-X using the Minerva framework
Kwak, Sehyun; Svensson, Jakob; Bozhenkov, Sergey; Trimino Mora, Humberto; Hoefel, Udo; Pavone, Andrea; Krychowiak, Maciej; Langenberg, Andreas; Ghim, Young-Chul; W7-X Team Team
2017-10-01
Wendelstein 7-X (W7-X) is a large scale optimised stellarator designed for steady-state operation with fusion reactor relevant conditions. Consistent inference of physics parameters and their associated uncertainties requires the capability to handle the complexity of the entire system, including physics models of multiple diagnostics. A Bayesian model has been developed in the Minerva framework to infer electron temperature and density profiles from multiple diagnostics in a consistent way. Here, the physics models predict the data of multiple diagnostics in a joint Bayesian analysis. The electron temperature and density profiles are modelled by Gaussian processes with hyperparameters. Markov chain Monte Carlo methods explore the full posterior of electron temperature and density profiles as well as possible combinations of hyperparameters and calibration factors. This results in a profile inference with proper uncertainties reflecting both statistical error and the automatic calibration for diagnostics.
Bayesian models for comparative analysis integrating phylogenetic uncertainty
2012-01-01
Background Uncertainty in comparative analyses can come from at least two sources: a) phylogenetic uncertainty in the tree topology or branch lengths, and b) uncertainty due to intraspecific variation in trait values, either due to measurement error or natural individual variation. Most phylogenetic comparative methods do not account for such uncertainties. Not accounting for these sources of uncertainty leads to false perceptions of precision (confidence intervals will be too narrow) and inflated significance in hypothesis testing (e.g. p-values will be too small). Although there is some application-specific software for fitting Bayesian models accounting for phylogenetic error, more general and flexible software is desirable. Methods We developed models to directly incorporate phylogenetic uncertainty into a range of analyses that biologists commonly perform, using a Bayesian framework and Markov Chain Monte Carlo analyses. Results We demonstrate applications in linear regression, quantification of phylogenetic signal, and measurement error models. Phylogenetic uncertainty was incorporated by applying a prior distribution for the phylogeny, where this distribution consisted of the posterior tree sets from Bayesian phylogenetic tree estimation programs. The models were analysed using simulated data sets, and applied to a real data set on plant traits, from rainforest plant species in Northern Australia. Analyses were performed using the free and open source software OpenBUGS and JAGS. Conclusions Incorporating phylogenetic uncertainty through an empirical prior distribution of trees leads to more precise estimation of regression model parameters than using a single consensus tree and enables a more realistic estimation of confidence intervals. In addition, models incorporating measurement errors and/or individual variation, in one or both variables, are easily formulated in the Bayesian framework. We show that BUGS is a useful, flexible general purpose tool for
Technical note: Bayesian calibration of dynamic ruminant nutrition models.
Reed, K F; Arhonditsis, G B; France, J; Kebreab, E
2016-08-01
Mechanistic models of ruminant digestion and metabolism have advanced our understanding of the processes underlying ruminant animal physiology. Deterministic modeling practices ignore the inherent variation within and among individual animals and thus have no way to assess how sources of error influence model outputs. We introduce Bayesian calibration of mathematical models to address the need for robust mechanistic modeling tools that can accommodate error analysis by remaining within the bounds of data-based parameter estimation. For the purpose of prediction, the Bayesian approach generates a posterior predictive distribution that represents the current estimate of the value of the response variable, taking into account both the uncertainty about the parameters and model residual variability. Predictions are expressed as probability distributions, thereby conveying significantly more information than point estimates in regard to uncertainty. Our study illustrates some of the technical advantages of Bayesian calibration and discusses the future perspectives in the context of animal nutrition modeling. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Organism-level models: When mechanisms and statistics fail us
Phillips, M. H.; Meyer, J.; Smith, W. P.; Rockhill, J. K.
2014-03-01
Purpose: To describe the unique characteristics of models that represent the entire course of radiation therapy at the organism level and to highlight the uses to which such models can be put. Methods: At the level of an organism, traditional model-building runs into severe difficulties. We do not have sufficient knowledge to devise a complete biochemistry-based model. Statistical model-building fails due to the vast number of variables and the inability to control many of them in any meaningful way. Finally, building surrogate models, such as animal-based models, can result in excluding some of the most critical variables. Bayesian probabilistic models (Bayesian networks) provide a useful alternative that have the advantages of being mathematically rigorous, incorporating the knowledge that we do have, and being practical. Results: Bayesian networks representing radiation therapy pathways for prostate cancer and head & neck cancer were used to highlight the important aspects of such models and some techniques of model-building. A more specific model representing the treatment of occult lymph nodes in head & neck cancer were provided as an example of how such a model can inform clinical decisions. A model of the possible role of PET imaging in brain cancer was used to illustrate the means by which clinical trials can be modelled in order to come up with a trial design that will have meaningful outcomes. Conclusions: Probabilistic models are currently the most useful approach to representing the entire therapy outcome process.
Statistical validation of stochastic models
Energy Technology Data Exchange (ETDEWEB)
Hunter, N.F. [Los Alamos National Lab., NM (United States). Engineering Science and Analysis Div.; Barney, P.; Paez, T.L. [Sandia National Labs., Albuquerque, NM (United States). Experimental Structural Dynamics Dept.; Ferregut, C.; Perez, L. [Univ. of Texas, El Paso, TX (United States). Dept. of Civil Engineering
1996-12-31
It is common practice in structural dynamics to develop mathematical models for system behavior, and the authors are now capable of developing stochastic models, i.e., models whose parameters are random variables. Such models have random characteristics that are meant to simulate the randomness in characteristics of experimentally observed systems. This paper suggests a formal statistical procedure for the validation of mathematical models of stochastic systems when data taken during operation of the stochastic system are available. The statistical characteristics of the experimental system are obtained using the bootstrap, a technique for the statistical analysis of non-Gaussian data. The authors propose a procedure to determine whether or not a mathematical model is an acceptable model of a stochastic system with regard to user-specified measures of system behavior. A numerical example is presented to demonstrate the application of the technique.
Bayesian interference in heterogeneous dynamic panel data models: three essays.
Ciccarelli, Matteo
2001-01-01
The task of this work is to discuss issues conceming the specification, estimation, inference and forecasting in multivariate dynamic heterogeneous panel data models from a Bayesian perspective. Three essays linked by a few conraion ideas compose the work. Multivariate dynamic models (mainly VARs) based on micro or macro panel data sets have become increasingly popular in macroeconomics, especially to study the transmission of real and monetary shocks across economies. This great use...
Bayesian Age-Period-Cohort Modeling and Prediction - BAMP
Directory of Open Access Journals (Sweden)
Volker J. Schmid
2007-10-01
Full Text Available The software package BAMP provides a method of analyzing incidence or mortality data on the Lexis diagram, using a Bayesian version of an age-period-cohort model. A hierarchical model is assumed with a binomial model in the first-stage. As smoothing priors for the age, period and cohort parameters random walks of first and second order, with and without an additional unstructured component are available. Unstructured heterogeneity can also be included in the model. In order to evaluate the model fit, posterior deviance, DIC and predictive deviances are computed. By projecting the random walk prior into the future, future death rates can be predicted.
Kypraios, Theodore; Neal, Peter; Prangle, Dennis
2017-05-01
Likelihood-based inference for disease outbreak data can be very challenging due to the inherent dependence of the data and the fact that they are usually incomplete. In this paper we review recent Approximate Bayesian Computation (ABC) methods for the analysis of such data by fitting to them stochastic epidemic models without having to calculate the likelihood of the observed data. We consider both non-temporal and temporal-data and illustrate the methods with a number of examples featuring different models and datasets. In addition, we present extensions to existing algorithms which are easy to implement and provide an improvement to the existing methodology. Finally, R code to implement the algorithms presented in the paper is available on https://github.com/kypraios/epiABC. Copyright © 2016 Elsevier Inc. All rights reserved.
Detecting Multiple Random Changepoints in Bayesian Piecewise Growth Mixture Models.
Lock, Eric F; Kohli, Nidhi; Bose, Maitreyee
2017-11-17
Piecewise growth mixture models are a flexible and useful class of methods for analyzing segmented trends in individual growth trajectory over time, where the individuals come from a mixture of two or more latent classes. These models allow each segment of the overall developmental process within each class to have a different functional form; examples include two linear phases of growth, or a quadratic phase followed by a linear phase. The changepoint (knot) is the time of transition from one developmental phase (segment) to another. Inferring the location of the changepoint(s) is often of practical interest, along with inference for other model parameters. A random changepoint allows for individual differences in the transition time within each class. The primary objectives of our study are as follows: (1) to develop a PGMM using a Bayesian inference approach that allows the estimation of multiple random changepoints within each class; (2) to develop a procedure to empirically detect the number of random changepoints within each class; and (3) to empirically investigate the bias and precision of the estimation of the model parameters, including the random changepoints, via a simulation study. We have developed the user-friendly package BayesianPGMM for R to facilitate the adoption of this methodology in practice, which is available at https://github.com/lockEF/BayesianPGMM . We describe an application to mouse-tracking data for a visual recognition task.
PDS-Modelling and Regional Bayesian Estimation of Extreme Rainfalls
DEFF Research Database (Denmark)
Madsen, Henrik; Rosbjerg, Dan; Harremoës, Poul
1994-01-01
Since 1979 a country-wide system of raingauges has been operated in Denmark in order to obtain a better basis for design and analysis of urban drainage systems. As an alternative to the traditional non-parametric approach the Partial Duration Series method is employed in the modelling of extreme ....... The application of the Bayesian approach is derived in case of both exponential and generalized Pareto distributed exceedances. Finally, the aspect of including economic perspectives in the estimation of the design events is briefly discussed....... in Denmark cannot be justified. In order to obtain an estimation procedure at non-monitored sites and to improve at-site estimates a regional Bayesian approach is adopted. The empirical regional distributions of the parameters in the Partial Duration Series model are used as prior information...
Approximate Bayesian computation for spatial SEIR(S) epidemic models.
Brown, Grant D; Porter, Aaron T; Oleson, Jacob J; Hinman, Jessica A
2018-02-01
Approximate Bayesia n Computation (ABC) provides an attractive approach to estimation in complex Bayesian inferential problems for which evaluation of the kernel of the posterior distribution is impossible or computationally expensive. These highly parallelizable techniques have been successfully applied to many fields, particularly in cases where more traditional approaches such as Markov chain Monte Carlo (MCMC) are impractical. In this work, we demonstrate the application of approximate Bayesian inference to spatially heterogeneous Susceptible-Exposed-Infectious-Removed (SEIR) stochastic epidemic models. These models have a tractable posterior distribution, however MCMC techniques nevertheless become computationally infeasible for moderately sized problems. We discuss the practical implementation of these techniques via the open source ABSEIR package for R. The performance of ABC relative to traditional MCMC methods in a small problem is explored under simulation, as well as in the spatially heterogeneous context of the 2014 epidemic of Chikungunya in the Americas. Copyright © 2017 Elsevier Ltd. All rights reserved.
Bayesian Predictive Modeling Based on Multidimensional Connectivity Profiling
Herskovits, Edward
2015-01-01
Dysfunction of brain structural and functional connectivity is increasingly being recognized as playing an important role in many brain disorders. Diffusion tensor imaging (DTI) and functional magnetic resonance (fMR) imaging are widely used to infer structural and functional connectivity, respectively. How to combine structural and functional connectivity patterns for predictive modeling is an important, yet open, problem. We propose a new method, called Bayesian prediction based on multidimensional connectivity profiling (BMCP), to distinguish subjects at the individual level based on structural and functional connectivity patterns. BMCP combines finite mixture modeling and Bayesian network classification. We demonstrate its use in distinguishing young and elderly adults based on DTI and resting-state fMR data. PMID:25924166
Comparison of evidence theory and Bayesian theory for uncertainty modeling
International Nuclear Information System (INIS)
Soundappan, Prabhu; Nikolaidis, Efstratios; Haftka, Raphael T.; Grandhi, Ramana; Canfield, Robert
2004-01-01
This paper compares Evidence Theory (ET) and Bayesian Theory (BT) for uncertainty modeling and decision under uncertainty, when the evidence about uncertainty is imprecise. The basic concepts of ET and BT are introduced and the ways these theories model uncertainties, propagate them through systems and assess the safety of these systems are presented. ET and BT approaches are demonstrated and compared on challenge problems involving an algebraic function whose input variables are uncertain. The evidence about the input variables consists of intervals provided by experts. It is recommended that a decision-maker compute both the Bayesian probabilities of the outcomes of alternative actions and their plausibility and belief measures when evidence about uncertainty is imprecise, because this helps assess the importance of imprecision and the value of additional information. Finally, the paper presents and demonstrates a method for testing approaches for decision under uncertainty in terms of their effectiveness in making decisions
A Unified Bayesian Inference Framework for Generalized Linear Models
Meng, Xiangming; Wu, Sheng; Zhu, Jiang
2018-03-01
In this letter, we present a unified Bayesian inference framework for generalized linear models (GLM) which iteratively reduces the GLM problem to a sequence of standard linear model (SLM) problems. This framework provides new perspectives on some established GLM algorithms derived from SLM ones and also suggests novel extensions for some other SLM algorithms. Specific instances elucidated under such framework are the GLM versions of approximate message passing (AMP), vector AMP (VAMP), and sparse Bayesian learning (SBL). It is proved that the resultant GLM version of AMP is equivalent to the well-known generalized approximate message passing (GAMP). Numerical results for 1-bit quantized compressed sensing (CS) demonstrate the effectiveness of this unified framework.
A Bayesian hierarchical model for climate change detection and attribution
Katzfuss, Matthias; Hammerling, Dorit; Smith, Richard L.
2017-06-01
Regression-based detection and attribution methods continue to take a central role in the study of climate change and its causes. Here we propose a novel Bayesian hierarchical approach to this problem, which allows us to address several open methodological questions. Specifically, we take into account the uncertainties in the true temperature change due to imperfect measurements, the uncertainty in the true climate signal under different forcing scenarios due to the availability of only a small number of climate model simulations, and the uncertainty associated with estimating the climate variability covariance matrix, including the truncation of the number of empirical orthogonal functions (EOFs) in this covariance matrix. We apply Bayesian model averaging to assign optimal probabilistic weights to different possible truncations and incorporate all uncertainties into the inference on the regression coefficients. We provide an efficient implementation of our method in a software package and illustrate its use with a realistic application.
Bayesian Statistics and Uncertainty Quantification for Safety Boundary Analysis in Complex Systems
He, Yuning; Davies, Misty Dawn
2014-01-01
The analysis of a safety-critical system often requires detailed knowledge of safe regions and their highdimensional non-linear boundaries. We present a statistical approach to iteratively detect and characterize the boundaries, which are provided as parameterized shape candidates. Using methods from uncertainty quantification and active learning, we incrementally construct a statistical model from only few simulation runs and obtain statistically sound estimates of the shape parameters for safety boundaries.
Genome scans for detecting footprints of local adaptation using a Bayesian factor model.
Duforet-Frebourg, Nicolas; Bazin, Eric; Blum, Michael G B
2014-09-01
There is a considerable impetus in population genomics to pinpoint loci involved in local adaptation. A powerful approach to find genomic regions subject to local adaptation is to genotype numerous molecular markers and look for outlier loci. One of the most common approaches for selection scans is based on statistics that measure population differentiation such as FST. However, there are important caveats with approaches related to FST because they require grouping individuals into populations and they additionally assume a particular model of population structure. Here, we implement a more flexible individual-based approach based on Bayesian factor models. Factor models capture population structure with latent variables called factors, which can describe clustering of individuals into populations or isolation-by-distance patterns. Using hierarchical Bayesian modeling, we both infer population structure and identify outlier loci that are candidates for local adaptation. In order to identify outlier loci, the hierarchical factor model searches for loci that are atypically related to population structure as measured by the latent factors. In a model of population divergence, we show that it can achieve a 2-fold or more reduction of false discovery rate compared with the software BayeScan or with an FST approach. We show that our software can handle large data sets by analyzing the single nucleotide polymorphisms of the Human Genome Diversity Project. The Bayesian factor model is implemented in the open-source PCAdapt software. © The Author 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Nonparametric Bayesian models through probit stick-breaking processes.
Rodríguez, Abel; Dunson, David B
2011-03-01
We describe a novel class of Bayesian nonparametric priors based on stick-breaking constructions where the weights of the process are constructed as probit transformations of normal random variables. We show that these priors are extremely flexible, allowing us to generate a great variety of models while preserving computational simplicity. Particular emphasis is placed on the construction of rich temporal and spatial processes, which are applied to two problems in finance and ecology.
Bayesian Modelling of fMRI Time Series
DEFF Research Database (Denmark)
Højen-Sørensen, Pedro; Hansen, Lars Kai; Rasmussen, Carl Edward
2000-01-01
We present a Hidden Markov Model (HMM) for inferring the hidden psychological state (or neural activity) during single trial fMRI activation experiments with blocked task paradigms. Inference is based on Bayesian methodology, using a combination of analytical and a variety of Markov Chain Monte...... Carlo (MCMC) sampling techniques. The advantage of this method is that detection of short time learning effects between repeated trials is possible since inference is based only on single trial experiments....
Bayesian tsunami fragility modeling considering input data uncertainty
De Risi, Raffaele; Goda, Katsu; Mori, Nobuhito; Yasuda, Tomohiro
2017-01-01
Empirical tsunami fragility curves are developed based on a Bayesian framework by accounting for uncertainty of input tsunami hazard data in a systematic and comprehensive manner. Three fragility modeling approaches, i.e. lognormal method, binomial logistic method, and multinomial logistic method, are considered, and are applied to extensive tsunami damage data for the 2011 Tohoku earthquake. A unique aspect of this study is that uncertainty of tsunami inundation data (i.e. input hazard data ...
Shortlist B: A Bayesian model of continuous speech recognition
Norris, D.; McQueen, J.
2008-01-01
A Bayesian model of continuous speech recognition is presented. It is based on Shortlist ( D. Norris, 1994; D. Norris, J. M. McQueen, A. Cutler, & S. Butterfield, 1997) and shares many of its key assumptions: parallel competitive evaluation of multiple lexical hypotheses, phonologically abstract prelexical and lexical representations, a feedforward architecture with no online feedback, and a lexical segmentation algorithm based on the viability of chunks of the input as possible words. Shortl...
Bayesian Modeling of Cerebral Information Processing
Labatut, Vincent; Pastor, Josette
2001-01-01
International audience; Modeling explicitly the links between cognitive functions and networks of cerebral areas is necessitated both by the understanding of the clinical outcomes of brain lesions and by the interpretation of activation data provided by functional neuroimaging techniques. At this global level of representation, the human brain can be best modeled by a probabilistic functional causal network. Our modeling approach is based on the anatomical connection pattern, the information ...
DISSECTING MAGNETAR VARIABILITY WITH BAYESIAN HIERARCHICAL MODELS
Energy Technology Data Exchange (ETDEWEB)
Huppenkothen, Daniela; Elenbaas, Chris; Watts, Anna L.; Horst, Alexander J. van der [Anton Pannekoek Institute for Astronomy, University of Amsterdam, Postbus 94249, 1090 GE Amsterdam (Netherlands); Brewer, Brendon J. [Department of Statistics, The University of Auckland, Private Bag 92019, Auckland 1142 (New Zealand); Hogg, David W. [Center for Data Science, New York University, 726 Broadway, 7th Floor, New York, NY 10003 (United States); Murray, Iain [School of Informatics, University of Edinburgh, Edinburgh EH8 9AB (United Kingdom); Frean, Marcus [School of Engineering and Computer Science, Victoria University of Wellington (New Zealand); Levin, Yuri [Monash Center for Astrophysics and School of Physics, Monash University, Clayton, Victoria 3800 (Australia); Kouveliotou, Chryssa, E-mail: daniela.huppenkothen@nyu.edu [Astrophysics Office, ZP 12, NASA/Marshall Space Flight Center, Huntsville, AL 35812 (United States)
2015-09-01
Neutron stars are a prime laboratory for testing physical processes under conditions of strong gravity, high density, and extreme magnetic fields. Among the zoo of neutron star phenomena, magnetars stand out for their bursting behavior, ranging from extremely bright, rare giant flares to numerous, less energetic recurrent bursts. The exact trigger and emission mechanisms for these bursts are not known; favored models involve either a crust fracture and subsequent energy release into the magnetosphere, or explosive reconnection of magnetic field lines. In the absence of a predictive model, understanding the physical processes responsible for magnetar burst variability is difficult. Here, we develop an empirical model that decomposes magnetar bursts into a superposition of small spike-like features with a simple functional form, where the number of model components is itself part of the inference problem. The cascades of spikes that we model might be formed by avalanches of reconnection, or crust rupture aftershocks. Using Markov Chain Monte Carlo sampling augmented with reversible jumps between models with different numbers of parameters, we characterize the posterior distributions of the model parameters and the number of components per burst. We relate these model parameters to physical quantities in the system, and show for the first time that the variability within a burst does not conform to predictions from ideas of self-organized criticality. We also examine how well the properties of the spikes fit the predictions of simplified cascade models for the different trigger mechanisms.
Statistical Models for Social Networks
Snijders, Tom A. B.; Cook, KS; Massey, DS
2011-01-01
Statistical models for social networks as dependent variables must represent the typical network dependencies between tie variables such as reciprocity, homophily, transitivity, etc. This review first treats models for single (cross-sectionally observed) networks and then for network dynamics. For
Mixed deterministic statistical modelling of regional ozone air pollution
Kalenderski, Stoitchko
2011-03-17
We develop a physically motivated statistical model for regional ozone air pollution by separating the ground-level pollutant concentration field into three components, namely: transport, local production and large-scale mean trend mostly dominated by emission rates. The model is novel in the field of environmental spatial statistics in that it is a combined deterministic-statistical model, which gives a new perspective to the modelling of air pollution. The model is presented in a Bayesian hierarchical formalism, and explicitly accounts for advection of pollutants, using the advection equation. We apply the model to a specific case of regional ozone pollution-the Lower Fraser valley of British Columbia, Canada. As a predictive tool, we demonstrate that the model vastly outperforms existing, simpler modelling approaches. Our study highlights the importance of simultaneously considering different aspects of an air pollution problem as well as taking into account the physical bases that govern the processes of interest. © 2011 John Wiley & Sons, Ltd..
A Bayesian Nonparametric Meta-Analysis Model
Karabatsos, George; Talbott, Elizabeth; Walker, Stephen G.
2015-01-01
In a meta-analysis, it is important to specify a model that adequately describes the effect-size distribution of the underlying population of studies. The conventional normal fixed-effect and normal random-effects models assume a normal effect-size population distribution, conditionally on parameters and covariates. For estimating the mean overall…
Bayesian Inference using Neural Net Likelihood Models for Protein Secondary Structure Prediction
Directory of Open Access Journals (Sweden)
Seong-Gon Kim
2011-06-01
Full Text Available Several techniques such as Neural Networks, Genetic Algorithms, Decision Trees and other statistical or heuristic methods have been used to approach the complex non-linear task of predicting Alpha-helicies, Beta-sheets and Turns of a proteins secondary structure in the past. This project introduces a new machine learning method by using an offline trained Multilayered Perceptrons (MLP as the likelihood models within a Bayesian Inference framework to predict secondary structures proteins. Varying window sizes are used to extract neighboring amino acid information and passed back and forth between the Neural Net models and the Bayesian Inference process until there is a convergence of the posterior secondary structure probability.
Bayesian Computation Methods for Inferring Regulatory Network Models Using Biomedical Data.
Tian, Tianhai
2016-01-01
The rapid advancement of high-throughput technologies provides huge amounts of information for gene expression and protein activity in the genome-wide scale. The availability of genomics, transcriptomics, proteomics, and metabolomics dataset gives an unprecedented opportunity to study detailed molecular regulations that is very important to precision medicine. However, it is still a significant challenge to design effective and efficient method to infer the network structure and dynamic property of regulatory networks. In recent years a number of computing methods have been designed to explore the regulatory mechanisms as well as estimate unknown model parameters. Among them, the Bayesian inference method can combine both prior knowledge and experimental data to generate updated information regarding the regulatory mechanisms. This chapter gives a brief review for Bayesian statistical methods that are used to infer the network structure and estimate model parameters based on experimental data.
AIC, BIC, Bayesian evidence against the interacting dark energy model
International Nuclear Information System (INIS)
Szydlowski, Marek; Krawiec, Adam; Kurek, Aleksandra; Kamionka, Michal
2015-01-01
Recent astronomical observations have indicated that the Universe is in a phase of accelerated expansion. While there are many cosmological models which try to explain this phenomenon, we focus on the interacting ΛCDM model where an interaction between the dark energy and dark matter sectors takes place. This model is compared to its simpler alternative - the ΛCDM model. To choose between these models the likelihood ratio test was applied as well as the model comparison methods (employing Occam's principle): the Akaike information criterion (AIC), the Bayesian information criterion (BIC) and the Bayesian evidence. Using the current astronomical data: type Ia supernova (Union2.1), h(z), baryon acoustic oscillation, the Alcock- Paczynski test, and the cosmic microwave background data, we evaluated both models. The analyses based on the AIC indicated that there is less support for the interacting ΛCDM model when compared to the ΛCDM model, while those based on the BIC indicated that there is strong evidence against it in favor of the ΛCDM model. Given the weak or almost non-existing support for the interacting ΛCDM model and bearing in mind Occam's razor we are inclined to reject this model. (orig.)
A Bayesian Framework for Multiple Trait Colo-calization from Summary Association Statistics.
Giambartolomei, Claudia; Zhenli Liu, Jimmy; Zhang, Wen; Hauberg, Mads; Shi, Huwenbo; Boocock, James; Pickrell, Joe; Jaffe, Andrew E; Pasaniuc, Bogdan; Roussos, Panos
2018-03-19
Most genetic variants implicated in complex diseases by genome-wide association studies (GWAS) are non-coding, making it challenging to understand the causative genes involved in disease. Integrating external information such as quantitative trait locus (QTL) mapping of molecular traits (e.g., expression, methylation) is a powerful approach to identify the subset of GWAS signals explained by regulatory effects. In particular, expression QTLs (eQTLs) help pinpoint the responsible gene among the GWAS regions that harbor many genes, while methylation QTLs (mQTLs) help identify the epigenetic mechanisms that impact gene expression which in turn affect disease risk. In this work we propose multiple-trait-coloc (moloc), a Bayesian statistical framework that integrates GWAS summary data with multiple molecular QTL data to identify regulatory effects at GWAS risk loci. We applied moloc to schizophrenia (SCZ) and eQTL/mQTL data derived from human brain tissue and identified 52 candidate genes that influence SCZ through methylation. Our method can be applied to any GWAS and relevant functional data to help prioritize disease associated genes. moloc is available for download as an R package (https://github.com/clagiamba/moloc). We also developed a web site to visualize the biological findings (icahn.mssm.edu/moloc). The browser allows searches by gene, methylation probe, and scenario of interest. claudia.giambartolomei@gmail.com. Supplementary data are available at Bioinformatics online.
Application of Bayesian statistical decision theory for a maintenance optimization problem
International Nuclear Information System (INIS)
Procaccia, H.; Cordier, R.; Muller, S.
1997-01-01
Reliability-centered maintenance (RCM) is a rational approach that can be used to identify the equipment of facilities that may turn out to be critical with respect to safety, to availability, or to maintenance costs. Is is dor these critical pieces of equipment alone that a corrective (one waits for a failure) or preventive (the type and frequency are specified) maintenance policy is established. But this approach has limitations: - when there is little operating feedback and it concerns rare events affecting a piece of equipment judged critical on a priori grounds (how is it possible, in this case, to decide whether or not it is critical, since there is conflict between the gravity of the potential failure and its frequency?); - when the aim is propose an optimal maintenance frequency for a critical piece of equipment - changing the maintenance frequency hitherto applied may cause a significant drift in the observed reliability of the equipment, an aspect not generally taken into account in the RCM approach. In these two situations, expert judgments can be combined with the available operating feedback (Bayesian approach) and the combination of risk of failure and economic consequences taken into account (statistical decision theory) to achieve a true optimization of maintenance policy choices. This paper presents an application on the maintenance of diesel generator component
Towards port sustainability through probabilistic models: Bayesian networks
Directory of Open Access Journals (Sweden)
B. Molina
2018-04-01
Full Text Available It is necessary that a manager of an infrastructure knows relations between variables. Using Bayesian networks, variables can be classified, predicted and diagnosed, being able to estimate posterior probability of the unknown ones based on known ones. The proposed methodology has generated a database with port variables, which have been classified as economic, social, environmental and institutional, as addressed in of smart ports studies made in all Spanish Port System. Network has been developed using an acyclic directed graph, which have let us know relationships in terms of parents and sons. In probabilistic terms, it can be concluded from the constructed network that the most decisive variables for port sustainability are those that are part of the institutional dimension. It has been concluded that Bayesian networks allow modeling uncertainty probabilistically even when the number of variables is high as it occurs in port planning and exploitation.
Directory of Open Access Journals (Sweden)
Thanoon Y. Thanoon
2016-03-01
Full Text Available In this paper, ordered categorical variables are used to compare between linear and nonlinear interactions of fixed covariate and latent variables Bayesian structural equation models. Gibbs sampling method is applied for estimation and model comparison. Hidden continuous normal distribution (censored normal distribution is used to handle the problem of ordered categorical data. Statistical inferences, which involve estimation of parameters and their standard deviations, and residuals analyses for testing the selected model, are discussed. The proposed procedure is illustrated by a simulation data obtained from R program. Analysis are done by using OpenBUGS program.
Bayesian geostatistical modeling of leishmaniasis incidence in Brazil.
Directory of Open Access Journals (Sweden)
Dimitrios-Alexios Karagiannis-Voules
Full Text Available BACKGROUND: Leishmaniasis is endemic in 98 countries with an estimated 350 million people at risk and approximately 2 million cases annually. Brazil is one of the most severely affected countries. METHODOLOGY: We applied Bayesian geostatistical negative binomial models to analyze reported incidence data of cutaneous and visceral leishmaniasis in Brazil covering a 10-year period (2001-2010. Particular emphasis was placed on spatial and temporal patterns. The models were fitted using integrated nested Laplace approximations to perform fast approximate Bayesian inference. Bayesian variable selection was employed to determine the most important climatic, environmental, and socioeconomic predictors of cutaneous and visceral leishmaniasis. PRINCIPAL FINDINGS: For both types of leishmaniasis, precipitation and socioeconomic proxies were identified as important risk factors. The predicted number of cases in 2010 were 30,189 (standard deviation [SD]: 7,676 for cutaneous leishmaniasis and 4,889 (SD: 288 for visceral leishmaniasis. Our risk maps predicted the highest numbers of infected people in the states of Minas Gerais and Pará for visceral and cutaneous leishmaniasis, respectively. CONCLUSIONS/SIGNIFICANCE: Our spatially explicit, high-resolution incidence maps identified priority areas where leishmaniasis control efforts should be targeted with the ultimate goal to reduce disease incidence.
A Bayesian Network View on Nested Effects Models
Directory of Open Access Journals (Sweden)
Fröhlich Holger
2009-01-01
Full Text Available Nested effects models (NEMs are a class of probabilistic models that were designed to reconstruct a hidden signalling structure from a large set of observable effects caused by active interventions into the signalling pathway. We give a more flexible formulation of NEMs in the language of Bayesian networks. Our framework constitutes a natural generalization of the original NEM model, since it explicitly states the assumptions that are tacitly underlying the original version. Our approach gives rise to new learning methods for NEMs, which have been implemented in the /Bioconductor package nem. We validate these methods in a simulation study and apply them to a synthetic lethality dataset in yeast.
Bayesian methods for proteomic biomarker development
Directory of Open Access Journals (Sweden)
Belinda Hernández
2015-12-01
In this review we provide an introduction to Bayesian inference and demonstrate some of the advantages of using a Bayesian framework. We summarize how Bayesian methods have been used previously in proteomics and other areas of bioinformatics. Finally, we describe some popular and emerging Bayesian models from the statistical literature and provide a worked tutorial including code snippets to show how these methods may be applied for the evaluation of proteomic biomarkers.
Neural network uncertainty assessment using Bayesian statistics: a remote sensing application
Aires, F.; Prigent, C.; Rossow, W. B.
2004-01-01
Neural network (NN) techniques have proved successful for many regression problems, in particular for remote sensing; however, uncertainty estimates are rarely provided. In this article, a Bayesian technique to evaluate uncertainties of the NN parameters (i.e., synaptic weights) is first presented. In contrast to more traditional approaches based on point estimation of the NN weights, we assess uncertainties on such estimates to monitor the robustness of the NN model. These theoretical developments are illustrated by applying them to the problem of retrieving surface skin temperature, microwave surface emissivities, and integrated water vapor content from a combined analysis of satellite microwave and infrared observations over land. The weight uncertainty estimates are then used to compute analytically the uncertainties in the network outputs (i.e., error bars and correlation structure of these errors). Such quantities are very important for evaluating any application of an NN model. The uncertainties on the NN Jacobians are then considered in the third part of this article. Used for regression fitting, NN models can be used effectively to represent highly nonlinear, multivariate functions. In this situation, most emphasis is put on estimating the output errors, but almost no attention has been given to errors associated with the internal structure of the regression model. The complex structure of dependency inside the NN is the essence of the model, and assessing its quality, coherency, and physical character makes all the difference between a blackbox model with small output errors and a reliable, robust, and physically coherent model. Such dependency structures are described to the first order by the NN Jacobians: they indicate the sensitivity of one output with respect to the inputs of the model for given input data. We use a Monte Carlo integration procedure to estimate the robustness of the NN Jacobians. A regularization strategy based on principal component
Statistical Model of Extreme Shear
DEFF Research Database (Denmark)
Hansen, Kurt Schaldemose; Larsen, Gunner Chr.
2005-01-01
In order to continue cost-optimisation of modern large wind turbines, it is important to continuously increase the knowledge of wind field parameters relevant to design loads. This paper presents a general statistical model that offers site-specific prediction of the probability density function...... (PDF) of turbulence driven short-term extreme wind shear events, conditioned on the mean wind speed, for an arbitrary recurrence period. The model is based on an asymptotic expansion, and only a few and easily accessible parameters are needed as input. The model of the extreme PDF is supplemented...... by a model that, on a statistically consistent basis, describes the most likely spatial shape of an extreme wind shear event. Predictions from the model have been compared with results from an extreme value data analysis, based on a large number of full-scale measurements recorded with a high sampling rate...
Statistical Model of Extreme Shear
DEFF Research Database (Denmark)
Larsen, Gunner Chr.; Hansen, Kurt Schaldemose
2004-01-01
In order to continue cost-optimisation of modern large wind turbines, it is important to continously increase the knowledge on wind field parameters relevant to design loads. This paper presents a general statistical model that offers site-specific prediction of the probability density function...... (PDF) of turbulence driven short-term extreme wind shear events, conditioned on the mean wind speed, for an arbitrary recurrence period. The model is based on an asymptotic expansion, and only a few and easily accessible parameters are needed as input. The model of the extreme PDF is supplemented...... by a model that, on a statistically consistent basis, describe the most likely spatial shape of an extreme wind shear event. Predictions from the model have been compared with results from an extreme value data analysis, based on a large number of high-sampled full-scale time series measurements...
Directory of Open Access Journals (Sweden)
Kostas Alexandridis
2013-06-01
Full Text Available Assessing spatial model performance often presents challenges related to the choice and suitability of traditional statistical methods in capturing the true validity and dynamics of the predicted outcomes. The stochastic nature of many of our contemporary spatial models of land use change necessitate the testing and development of new and innovative methodologies in statistical spatial assessment. In many cases, spatial model performance depends critically on the spatially-explicit prior distributions, characteristics, availability and prevalence of the variables and factors under study. This study explores the statistical spatial characteristics of statistical model assessment of modeling land use change dynamics in a seven-county study area in South-Eastern Wisconsin during the historical period of 1963–1990. The artificial neural network-based Land Transformation Model (LTM predictions are used to compare simulated with historical land use transformations in urban/suburban landscapes. We introduce a range of Bayesian information entropy statistical spatial metrics for assessing the model performance across multiple simulation testing runs. Bayesian entropic estimates of model performance are compared against information-theoretic stochastic entropy estimates and theoretically-derived accuracy assessments. We argue for the critical role of informational uncertainty across different scales of spatial resolution in informing spatial landscape model assessment. Our analysis reveals how incorporation of spatial and landscape information asymmetry estimates can improve our stochastic assessments of spatial model predictions. Finally our study shows how spatially-explicit entropic classification accuracy estimates can work closely with dynamic modeling methodologies in improving our scientific understanding of landscape change as a complex adaptive system and process.
Risk prediction model for knee pain in the Nottingham community: a Bayesian modelling approach.
Fernandes, G S; Bhattacharya, A; McWilliams, D F; Ingham, S L; Doherty, M; Zhang, W
2017-03-20
Twenty-five percent of the British population over the age of 50 years experiences knee pain. Knee pain can limit physical ability and cause distress and bears significant socioeconomic costs. The objectives of this study were to develop and validate the first risk prediction model for incident knee pain in the Nottingham community and validate this internally within the Nottingham cohort and externally within the Osteoarthritis Initiative (OAI) cohort. A total of 1822 participants from the Nottingham community who were at risk for knee pain were followed for 12 years. Of this cohort, two-thirds (n = 1203) were used to develop the risk prediction model, and one-third (n = 619) were used to validate the model. Incident knee pain was defined as pain on most days for at least 1 month in the past 12 months. Predictors were age, sex, body mass index, pain elsewhere, prior knee injury and knee alignment. A Bayesian logistic regression model was used to determine the probability of an OR >1. The Hosmer-Lemeshow χ 2 statistic (HLS) was used for calibration, and ROC curve analysis was used for discrimination. The OAI cohort from the United States was also used to examine the performance of the model. A risk prediction model for knee pain incidence was developed using a Bayesian approach. The model had good calibration, with an HLS of 7.17 (p = 0.52) and moderate discriminative ability (ROC 0.70) in the community. Individual scenarios are given using the model. However, the model had poor calibration (HLS 5866.28, p prediction model for knee pain, regardless of underlying structural changes of knee osteoarthritis, in the community using a Bayesian modelling approach. The model appears to work well in a community-based population but not in individuals with a higher risk for knee osteoarthritis, and it may provide a convenient tool for use in primary care to predict the risk of knee pain in the general population.
DPpackage: Bayesian Semi- and Nonparametric Modeling in R
Directory of Open Access Journals (Sweden)
Alejandro Jara
2011-04-01
Full Text Available Data analysis sometimes requires the relaxation of parametric assumptions in order to gain modeling flexibility and robustness against mis-specification of the probability model. In the Bayesian context, this is accomplished by placing a prior distribution on a function space, such as the space of all probability distributions or the space of all regression functions. Unfortunately, posterior distributions ranging over function spaces are highly complex and hence sampling methods play a key role. This paper provides an introduction to a simple, yet comprehensive, set of programs for the implementation of some Bayesian nonparametric and semiparametric models in R, DPpackage. Currently, DPpackage includes models for marginal and conditional density estimation, receiver operating characteristic curve analysis, interval-censored data, binary regression data, item response data, longitudinal and clustered data using generalized linear mixed models, and regression data using generalized additive models. The package also contains functions to compute pseudo-Bayes factors for model comparison and for eliciting the precision parameter of the Dirichlet process prior, and a general purpose Metropolis sampling algorithm. To maximize computational efficiency, the actual sampling for each model is carried out using compiled C, C++ or Fortran code.
Kwak, Sehyun; Svensson, J; Brix, M; Ghim, Y-C
2016-02-01
A Bayesian model of the emission spectrum of the JET lithium beam has been developed to infer the intensity of the Li I (2p-2s) line radiation and associated uncertainties. The detected spectrum for each channel of the lithium beam emission spectroscopy system is here modelled by a single Li line modified by an instrumental function, Bremsstrahlung background, instrumental offset, and interference filter curve. Both the instrumental function and the interference filter curve are modelled with non-parametric Gaussian processes. All free parameters of the model, the intensities of the Li line, Bremsstrahlung background, and instrumental offset, are inferred using Bayesian probability theory with a Gaussian likelihood for photon statistics and electronic background noise. The prior distributions of the free parameters are chosen as Gaussians. Given these assumptions, the intensity of the Li line and corresponding uncertainties are analytically available using a Bayesian linear inversion technique. The proposed approach makes it possible to extract the intensity of Li line without doing a separate background subtraction through modulation of the Li beam.
A Statistical Programme Assignment Model
DEFF Research Database (Denmark)
Rosholm, Michael; Staghøj, Jonas; Svarer, Michael
When treatment effects of active labour market programmes are heterogeneous in an observable way across the population, the allocation of the unemployed into different programmes becomes a particularly important issue. In this paper, we present a statistical model designed to improve the present...
International Nuclear Information System (INIS)
Ainsbury, Elizabeth A.; Lloyd, David C.; Rothkamm, Kai; Vinnikov, Volodymyr A.; Maznyk, Nataliya A.; Puig, Pedro; Higueras, Manuel
2014-01-01
Classical methods of assessing the uncertainty associated with radiation doses estimated using cytogenetic techniques are now extremely well defined. However, several authors have suggested that a Bayesian approach to uncertainty estimation may be more suitable for cytogenetic data, which are inherently stochastic in nature. The Bayesian analysis framework focuses on identification of probability distributions (for yield of aberrations or estimated dose), which also means that uncertainty is an intrinsic part of the analysis, rather than an 'afterthought'. In this paper Bayesian, as well as some more advanced classical, data analysis methods for radiation cytogenetics are reviewed that have been proposed in the literature. A practical overview of Bayesian cytogenetic dose estimation is also presented, with worked examples from the literature. (authors)
Textual information access statistical models
Gaussier, Eric
2013-01-01
This book presents statistical models that have recently been developed within several research communities to access information contained in text collections. The problems considered are linked to applications aiming at facilitating information access:- information extraction and retrieval;- text classification and clustering;- opinion mining;- comprehension aids (automatic summarization, machine translation, visualization).In order to give the reader as complete a description as possible, the focus is placed on the probability models used in the applications
Bayesian analysis of physiologically based toxicokinetic and toxicodynamic models.
Hack, C Eric
2006-04-17
Physiologically based toxicokinetic (PBTK) and toxicodynamic (TD) models of bromate in animals and humans would improve our ability to accurately estimate the toxic doses in humans based on available animal studies. These mathematical models are often highly parameterized and must be calibrated in order for the model predictions of internal dose to adequately fit the experimentally measured doses. Highly parameterized models are difficult to calibrate and it is difficult to obtain accurate estimates of uncertainty or variability in model parameters with commonly used frequentist calibration methods, such as maximum likelihood estimation (MLE) or least squared error approaches. The Bayesian approach called Markov chain Monte Carlo (MCMC) analysis can be used to successfully calibrate these complex models. Prior knowledge about the biological system and associated model parameters is easily incorporated in this approach in the form of prior parameter distributions, and the distributions are refined or updated using experimental data to generate posterior distributions of parameter estimates. The goal of this paper is to give the non-mathematician a brief description of the Bayesian approach and Markov chain Monte Carlo analysis, how this technique is used in risk assessment, and the issues associated with this approach.
Kim, Inyoung; Pang, Herbert; Zhao, Hongyu
2013-01-01
Many statistical methods for microarray data analysis consider one gene at a time, and they may miss subtle changes at the single gene level. This limitation may be overcome by considering a set of genes simultaneously where the gene sets are derived from prior biological knowledge. Limited work has been carried out in the regression setting to study the effects of clinical covariates and expression levels of genes in a pathway either on a continuous or on a binary clinical outcome. Hence, we propose a Bayesian approach for identifying pathways related to both types of outcomes. We compare our Bayesian approaches with a likelihood-based approach that was developed by relating a least squares kernel machine for nonparametric pathway effect with a restricted maximum likelihood for variance components. Unlike the likelihood-based approach, the Bayesian approach allows us to directly estimate all parameters and pathway effects. It can incorporate prior knowledge into Bayesian hierarchical model formulation and makes inference by using the posterior samples without asymptotic theory. We consider several kernels (Gaussian, polynomial, and neural network kernels) to characterize gene expression effects in a pathway on clinical outcomes. Our simulation results suggest that the Bayesian approach has more accurate coverage probability than the likelihood-based approach, and this is especially so when the sample size is small compared with the number of genes being studied in a pathway. We demonstrate the usefulness of our approaches through its applications to a type II diabetes mellitus data set. Our approaches can also be applied to other settings where a large number of strongly correlated predictors are present. PMID:22438129
Bayesian inference in a discrete shock model using confounded common cause data
International Nuclear Information System (INIS)
Kvam, Paul H.; Martz, Harry F.
1995-01-01
We consider redundant systems of identical components for which reliability is assessed statistically using only demand-based failures and successes. Direct assessment of system reliability can lead to gross errors in estimation if there exist external events in the working environment that cause two or more components in the system to fail in the same demand period which have not been included in the reliability model. We develop a simple Bayesian model for estimating component reliability and the corresponding probability of common cause failure in operating systems for which the data is confounded; that is, the common cause failures cannot be distinguished from multiple independent component failures in the narrative event descriptions
Karabatsos, George
2017-02-01
Most of applied statistics involves regression analysis of data. In practice, it is important to specify a regression model that has minimal assumptions which are not violated by data, to ensure that statistical inferences from the model are informative and not misleading. This paper presents a stand-alone and menu-driven software package, Bayesian Regression: Nonparametric and Parametric Models, constructed from MATLAB Compiler. Currently, this package gives the user a choice from 83 Bayesian models for data analysis. They include 47 Bayesian nonparametric (BNP) infinite-mixture regression models; 5 BNP infinite-mixture models for density estimation; and 31 normal random effects models (HLMs), including normal linear models. Each of the 78 regression models handles either a continuous, binary, or ordinal dependent variable, and can handle multi-level (grouped) data. All 83 Bayesian models can handle the analysis of weighted observations (e.g., for meta-analysis), and the analysis of left-censored, right-censored, and/or interval-censored data. Each BNP infinite-mixture model has a mixture distribution assigned one of various BNP prior distributions, including priors defined by either the Dirichlet process, Pitman-Yor process (including the normalized stable process), beta (two-parameter) process, normalized inverse-Gaussian process, geometric weights prior, dependent Dirichlet process, or the dependent infinite-probits prior. The software user can mouse-click to select a Bayesian model and perform data analysis via Markov chain Monte Carlo (MCMC) sampling. After the sampling completes, the software automatically opens text output that reports MCMC-based estimates of the model's posterior distribution and model predictive fit to the data. Additional text and/or graphical output can be generated by mouse-clicking other menu options. This includes output of MCMC convergence analyses, and estimates of the model's posterior predictive distribution, for selected
Forecasting natural gas consumption in China by Bayesian Model Averaging
Directory of Open Access Journals (Sweden)
Wei Zhang
2015-11-01
Full Text Available With rapid growth of natural gas consumption in China, it is in urgent need of more accurate and reliable models to make a reasonable forecast. Considering the limitations of the single model and the model uncertainty, this paper presents a combinative method to forecast natural gas consumption by Bayesian Model Averaging (BMA. It can effectively handle the uncertainty associated with model structure and parameters, and thus improves the forecasting accuracy. This paper chooses six variables for forecasting the natural gas consumption, including GDP, urban population, energy consumption structure, industrial structure, energy efficiency and exports of goods and services. The results show that comparing to Gray prediction model, Linear regression model and Artificial neural networks, the BMA method provides a flexible tool to forecast natural gas consumption that will have a rapid growth in the future. This study can provide insightful information on natural gas consumption in the future.
Bayesian analysis for uncertainty estimation of a canopy transpiration model
Samanta, S.; Mackay, D. S.; Clayton, M. K.; Kruger, E. L.; Ewers, B. E.
2007-04-01
A Bayesian approach was used to fit a conceptual transpiration model to half-hourly transpiration rates for a sugar maple (Acer saccharum) stand collected over a 5-month period and probabilistically estimate its parameter and prediction uncertainties. The model used the Penman-Monteith equation with the Jarvis model for canopy conductance. This deterministic model was extended by adding a normally distributed error term. This extension enabled using Markov chain Monte Carlo simulations to sample the posterior parameter distributions. The residuals revealed approximate conformance to the assumption of normally distributed errors. However, minor systematic structures in the residuals at fine timescales suggested model changes that would potentially improve the modeling of transpiration. Results also indicated considerable uncertainties in the parameter and transpiration estimates. This simple methodology of uncertainty analysis would facilitate the deductive step during the development cycle of deterministic conceptual models by accounting for these uncertainties while drawing inferences from data.
Modeling operational risks of the nuclear industry with Bayesian networks
International Nuclear Information System (INIS)
Wieland, Patricia; Lustosa, Leonardo J.
2009-01-01
Basically, planning a new industrial plant requires information on the industrial management, regulations, site selection, definition of initial and planned capacity, and on the estimation of the potential demand. However, this is far from enough to assure the success of an industrial enterprise. Unexpected and extremely damaging events may occur that deviates from the original plan. The so-called operational risks are not only in the system, equipment, process or human (technical or managerial) failures. They are also in intentional events such as frauds and sabotage, or extreme events like terrorist attacks or radiological accidents and even on public reaction to perceived environmental or future generation impacts. For the nuclear industry, it is a challenge to identify and to assess the operational risks and their various sources. Early identification of operational risks can help in preparing contingency plans, to delay the decision to invest or to approve a project that can, at an extreme, affect the public perception of the nuclear energy. A major problem in modeling operational risk losses is the lack of internal data that are essential, for example, to apply the loss distribution approach. As an alternative, methods that consider qualitative and subjective information can be applied, for example, fuzzy logic, neural networks, system dynamic or Bayesian networks. An advantage of applying Bayesian networks to model operational risk is the possibility to include expert opinions and variables of interest, to structure the model via causal dependencies among these variables, and to specify subjective prior and conditional probabilities distributions at each step or network node. This paper suggests a classification of operational risks in industry and discusses the benefits and obstacles of the Bayesian networks approach to model those risks. (author)
Variational Bayesian labeled multi-Bernoulli filter with unknown sensor noise statistics
Directory of Open Access Journals (Sweden)
Qiu Hao
2016-10-01
Full Text Available It is difficult to build accurate model for measurement noise covariance in complex backgrounds. For the scenarios of unknown sensor noise variances, an adaptive multi-target tracking algorithm based on labeled random finite set and variational Bayesian (VB approximation is proposed. The variational approximation technique is introduced to the labeled multi-Bernoulli (LMB filter to jointly estimate the states of targets and sensor noise variances. Simulation results show that the proposed method can give unbiased estimation of cardinality and has better performance than the VB probability hypothesis density (VB-PHD filter and the VB cardinality balanced multi-target multi-Bernoulli (VB-CBMeMBer filter in harsh situations. The simulations also confirm the robustness of the proposed method against the time-varying noise variances. The computational complexity of proposed method is higher than the VB-PHD and VB-CBMeMBer in extreme cases, while the mean execution times of the three methods are close when targets are well separated.
Directory of Open Access Journals (Sweden)
Hashem Salarzadeh Jenatabadi
2016-11-01
Full Text Available There are many factors which could influence the sustainability of airlines. The main purpose of this study is to introduce a framework for a financial sustainability index and model it based on structural equation modeling (SEM with maximum likelihood and Bayesian predictors. The introduced framework includes economic performance, operational performance, cost performance, and financial performance. Based on both Bayesian SEM (Bayesian-SEM and Classical SEM (Classical-SEM, it was found that economic performance with both operational performance and cost performance are significantly related to the financial performance index. The four mathematical indices employed are root mean square error, coefficient of determination, mean absolute error, and mean absolute percentage error to compare the efficiency of Bayesian-SEM and Classical-SEM in predicting the airline financial performance. The outputs confirmed that the framework with Bayesian prediction delivered a good fit with the data, although the framework predicted with a Classical-SEM approach did not prepare a well-fitting model. The reasons for this discrepancy between Classical and Bayesian predictions, as well as the potential advantages and caveats with the application of Bayesian approach in airline sustainability studies, are debated.
Bayesian comparison of Markov models of molecular dynamics with detailed balance constraint
Bacallado, Sergio; Chodera, John D.; Pande, Vijay
2009-07-01
Discrete-space Markov models are a convenient way of describing the kinetics of biomolecules. The most common strategies used to validate these models employ statistics from simulation data, such as the eigenvalue spectrum of the inferred rate matrix, which are often associated with large uncertainties. Here, we propose a Bayesian approach, which makes it possible to differentiate between models at a fixed lag time making use of short trajectories. The hierarchical definition of the models allows one to compare instances with any number of states. We apply a conjugate prior for reversible Markov chains, which was recently introduced in the statistics literature. The method is tested in two different systems, a Monte Carlo dynamics simulation of a two-dimensional model system and molecular dynamics simulations of the terminally blocked alanine dipeptide.
Improved model for statistical alignment
Energy Technology Data Exchange (ETDEWEB)
Miklos, I.; Toroczkai, Z. (Zoltan)
2001-01-01
The statistical approach to molecular sequence evolution involves the stochastic modeling of the substitution, insertion and deletion processes. Substitution has been modeled in a reliable way for more than three decades by using finite Markov-processes. Insertion and deletion, however, seem to be more difficult to model, and thc recent approaches cannot acceptably deal with multiple insertions and deletions. A new method based on a generating function approach is introduced to describe the multiple insertion process. The presented algorithm computes the approximate joint probability of two sequences in 0(13) running time where 1 is the geometric mean of the sequence lengths.
From qualitative reasoning models to Bayesian-based learner modeling
Milošević, U.; Bredeweg, B.; de Kleer, J.; Forbus, K.D.
2010-01-01
Assessing the knowledge of a student is a fundamental part of intelligent learning environments. We present a Bayesian network based approach to dealing with uncertainty when estimating a learner’s state of knowledge in the context of Qualitative Reasoning (QR). A proposal for a global architecture
Development of a cyber security risk model using Bayesian networks
International Nuclear Information System (INIS)
Shin, Jinsoo; Son, Hanseong; Khalil ur, Rahman; Heo, Gyunyoung
2015-01-01
Cyber security is an emerging safety issue in the nuclear industry, especially in the instrumentation and control (I and C) field. To address the cyber security issue systematically, a model that can be used for cyber security evaluation is required. In this work, a cyber security risk model based on a Bayesian network is suggested for evaluating cyber security for nuclear facilities in an integrated manner. The suggested model enables the evaluation of both the procedural and technical aspects of cyber security, which are related to compliance with regulatory guides and system architectures, respectively. The activity-quality analysis model was developed to evaluate how well people and/or organizations comply with the regulatory guidance associated with cyber security. The architecture analysis model was created to evaluate vulnerabilities and mitigation measures with respect to their effect on cyber security. The two models are integrated into a single model, which is called the cyber security risk model, so that cyber security can be evaluated from procedural and technical viewpoints at the same time. The model was applied to evaluate the cyber security risk of the reactor protection system (RPS) of a research reactor and to demonstrate its usefulness and feasibility. - Highlights: • We developed the cyber security risk model can be find the weak point of cyber security integrated two cyber analysis models by using Bayesian Network. • One is the activity-quality model signifies how people and/or organization comply with the cyber security regulatory guide. • Other is the architecture model represents the probability of cyber-attack on RPS architecture. • The cyber security risk model can provide evidence that is able to determine the key element for cyber security for RPS of a research reactor
Two-stage Bayesian models-application to ZEDB project
International Nuclear Information System (INIS)
Bunea, C.; Charitos, T.; Cooke, R.M.; Becker, G.
2005-01-01
A well-known mathematical tool to analyze plant specific reliability data for nuclear power facilities is the two-stage Bayesian model. Such two-stage Bayesian models are standard practice nowadays, for example in the German ZEDB project or in the Swedish T-Book, although they may differ in their mathematical models and software implementation. In this paper, we review the mathematical model, its underlying assumptions and supporting arguments. Reasonable conditional assumptions are made to yield tractable and mathematically valid form for the failure rate at plant of interest, given failures and operational times at other plants in the population. The posterior probability of failure rate at plant of interest is sensitive to the choice of hyperprior parameters since the effect of hyperprior distribution will never be dominated by the effect of observation. The methods of Poern and Jeffrey for choosing distributions over hyperparameters are discussed. Furthermore, we will perform verification tasks associated with the theoretical model presented in this paper. The present software implementation produces good agreement with ZEDB results for various prior distributions. The difference between our results and those of ZEDB reflect differences that may arise from numerical implementation, as that would use different step size and truncation bounds
Prior Sensitivity Analysis in Default Bayesian Structural Equation Modeling.
van Erp, Sara; Mulder, Joris; Oberski, Daniel L
2017-11-27
Bayesian structural equation modeling (BSEM) has recently gained popularity because it enables researchers to fit complex models and solve some of the issues often encountered in classical maximum likelihood estimation, such as nonconvergence and inadmissible solutions. An important component of any Bayesian analysis is the prior distribution of the unknown model parameters. Often, researchers rely on default priors, which are constructed in an automatic fashion without requiring substantive prior information. However, the prior can have a serious influence on the estimation of the model parameters, which affects the mean squared error, bias, coverage rates, and quantiles of the estimates. In this article, we investigate the performance of three different default priors: noninformative improper priors, vague proper priors, and empirical Bayes priors-with the latter being novel in the BSEM literature. Based on a simulation study, we find that these three default BSEM methods may perform very differently, especially with small samples. A careful prior sensitivity analysis is therefore needed when performing a default BSEM analysis. For this purpose, we provide a practical step-by-step guide for practitioners to conducting a prior sensitivity analysis in default BSEM. Our recommendations are illustrated using a well-known case study from the structural equation modeling literature, and all code for conducting the prior sensitivity analysis is available in the online supplemental materials. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Bayesian Networks An Introduction
Koski, Timo
2009-01-01
Bayesian Networks: An Introduction provides a self-contained introduction to the theory and applications of Bayesian networks, a topic of interest and importance for statisticians, computer scientists and those involved in modelling complex data sets. The material has been extensively tested in classroom teaching and assumes a basic knowledge of probability, statistics and mathematics. All notions are carefully explained and feature exercises throughout. Features include:.: An introduction to Dirichlet Distribution, Exponential Families and their applications.; A detailed description of learni
Di Giuseppe, Edmondo; Lasinio, Giovanna Jona; Pasqui, Massimiliano; Esposito, Stanislao
2015-01-01
We propose a new statistical protocol for the estimation of precipitation using lightning data. We first identify rainy events using a scan statistics, then we estimate Rainfall Lighting Ratio (RLR) to convert lightning number into rain volume given the storm intensity. Then we build a hierarchical Bayesian model aiming at the prediction of 15- and 30-minutes cumulated precipitation at unobserved locations and time using information on lightning in the same area. More specifically, we build a...
Czech Academy of Sciences Publication Activity Database
Tichý, Ondřej; Šmídl, Václav; Hofman, Radek; Šindelářová, Kateřina; Hýža, M.; Stohl, A.
2017-01-01
Roč. 17, č. 20 (2017), s. 12677-12696 ISSN 1680-7316 R&D Projects: GA MŠk(CZ) 7F14287 Institutional support: RVO:67985556 Keywords : Bayesian inverse modeling * iodine-131 * consequences of the iodine release Subject RIV: BB - Applied Statistics, Operational Research OBOR OECD: Statistics and probability Impact factor: 5.318, year: 2016 http://library.utia.cas.cz/separaty/2017/AS/tichy-0480506.pdf
Directory of Open Access Journals (Sweden)
Kelemen Arpad
2008-08-01
Full Text Available Abstract Background This paper addresses key biological problems and statistical issues in the analysis of large gene expression data sets that describe systemic temporal response cascades to therapeutic doses in multiple tissues such as liver, skeletal muscle, and kidney from the same animals. Affymetrix time course gene expression data U34A are obtained from three different tissues including kidney, liver and muscle. Our goal is not only to find the concordance of gene in different tissues, identify the common differentially expressed genes over time and also examine the reproducibility of the findings by integrating the results through meta analysis from multiple tissues in order to gain a significant increase in the power of detecting differentially expressed genes over time and to find the differential differences of three tissues responding to the drug. Results and conclusion Bayesian categorical model for estimating the proportion of the 'call' are used for pre-screening genes. Hierarchical Bayesian Mixture Model is further developed for the identifications of differentially expressed genes across time and dynamic clusters. Deviance information criterion is applied to determine the number of components for model comparisons and selections. Bayesian mixture model produces the gene-specific posterior probability of differential/non-differential expression and the 95% credible interval, which is the basis for our further Bayesian meta-inference. Meta-analysis is performed in order to identify commonly expressed genes from multiple tissues that may serve as ideal targets for novel treatment strategies and to integrate the results across separate studies. We have found the common expressed genes in the three tissues. However, the up/down/no regulations of these common genes are different at different time points. Moreover, the most differentially expressed genes were found in the liver, then in kidney, and then in muscle.
Bayesian Variable Selection on Model Spaces Constrained by Heredity Conditions.
Taylor-Rodriguez, Daniel; Womack, Andrew; Bliznyuk, Nikolay
2016-01-01
This paper investigates Bayesian variable selection when there is a hierarchical dependence structure on the inclusion of predictors in the model. In particular, we study the type of dependence found in polynomial response surfaces of orders two and higher, whose model spaces are required to satisfy weak or strong heredity conditions. These conditions restrict the inclusion of higher-order terms depending upon the inclusion of lower-order parent terms. We develop classes of priors on the model space, investigate their theoretical and finite sample properties, and provide a Metropolis-Hastings algorithm for searching the space of models. The tools proposed allow fast and thorough exploration of model spaces that account for hierarchical polynomial structure in the predictors and provide control of the inclusion of false positives in high posterior probability models.
Experimental validation of a Bayesian model of visual acuity.
LENUS (Irish Health Repository)
Dalimier, Eugénie
2009-01-01
Based on standard procedures used in optometry clinics, we compare measurements of visual acuity for 10 subjects (11 eyes tested) in the presence of natural ocular aberrations and different degrees of induced defocus, with the predictions given by a Bayesian model customized with aberrometric data of the eye. The absolute predictions of the model, without any adjustment, show good agreement with the experimental data, in terms of correlation and absolute error. The efficiency of the model is discussed in comparison with image quality metrics and other customized visual process models. An analysis of the importance and customization of each stage of the model is also given; it stresses the potential high predictive power from precise modeling of ocular and neural transfer functions.
A flexible Bayesian model for studying gene-environment interaction.
Directory of Open Access Journals (Sweden)
Kai Yu
2012-01-01
Full Text Available An important follow-up step after genetic markers are found to be associated with a disease outcome is a more detailed analysis investigating how the implicated gene or chromosomal region and an established environment risk factor interact to influence the disease risk. The standard approach to this study of gene-environment interaction considers one genetic marker at a time and therefore could misrepresent and underestimate the genetic contribution to the joint effect when one or more functional loci, some of which might not be genotyped, exist in the region and interact with the environment risk factor in a complex way. We develop a more global approach based on a Bayesian model that uses a latent genetic profile variable to capture all of the genetic variation in the entire targeted region and allows the environment effect to vary across different genetic profile categories. We also propose a resampling-based test derived from the developed Bayesian model for the detection of gene-environment interaction. Using data collected in the Environment and Genetics in Lung Cancer Etiology (EAGLE study, we apply the Bayesian model to evaluate the joint effect of smoking intensity and genetic variants in the 15q25.1 region, which contains a cluster of nicotinic acetylcholine receptor genes and has been shown to be associated with both lung cancer and smoking behavior. We find evidence for gene-environment interaction (P-value = 0.016, with the smoking effect appearing to be stronger in subjects with a genetic profile associated with a higher lung cancer risk; the conventional test of gene-environment interaction based on the single-marker approach is far from significant.
CONSTRAINTS ON COSMIC-RAY PROPAGATION MODELS FROM A GLOBAL BAYESIAN ANALYSIS
International Nuclear Information System (INIS)
Trotta, R.; Johannesson, G.; Moskalenko, I. V.; Porter, T. A.; Ruiz de Austri, R.; Strong, A. W.
2011-01-01
Research in many areas of modern physics such as, e.g., indirect searches for dark matter and particle acceleration in supernova remnant shocks rely heavily on studies of cosmic rays (CRs) and associated diffuse emissions (radio, microwave, X-rays, γ-rays). While very detailed numerical models of CR propagation exist, a quantitative statistical analysis of such models has been so far hampered by the large computational effort that those models require. Although statistical analyses have been carried out before using semi-analytical models (where the computation is much faster), the evaluation of the results obtained from such models is difficult, as they necessarily suffer from many simplifying assumptions. The main objective of this paper is to present a working method for a full Bayesian parameter estimation for a numerical CR propagation model. For this study, we use the GALPROP code, the most advanced of its kind, which uses astrophysical information, and nuclear and particle data as inputs to self-consistently predict CRs, γ-rays, synchrotron, and other observables. We demonstrate that a full Bayesian analysis is possible using nested sampling and Markov Chain Monte Carlo methods (implemented in the SuperBayeS code) despite the heavy computational demands of a numerical propagation code. The best-fit values of parameters found in this analysis are in agreement with previous, significantly simpler, studies also based on GALPROP.
Bayesian models for cost-effectiveness analysis in the presence of structural zero costs.
Baio, Gianluca
2014-05-20
Bayesian modelling for cost-effectiveness data has received much attention in both the health economics and the statistical literature, in recent years. Cost-effectiveness data are characterised by a relatively complex structure of relationships linking a suitable measure of clinical benefit (e.g. quality-adjusted life years) and the associated costs. Simplifying assumptions, such as (bivariate) normality of the underlying distributions, are usually not granted, particularly for the cost variable, which is characterised by markedly skewed distributions. In addition, individual-level data sets are often characterised by the presence of structural zeros in the cost variable. Hurdle models can be used to account for the presence of excess zeros in a distribution and have been applied in the context of cost data. We extend their application to cost-effectiveness data, defining a full Bayesian specification, which consists of a model for the individual probability of null costs, a marginal model for the costs and a conditional model for the measure of effectiveness (given the observed costs). We presented the model using a working example to describe its main features. © 2013 The Authors. Statistics in Medicine published by John Wiley & Sons, Ltd.
On-line Bayesian model updating for structural health monitoring
Rocchetta, Roberto; Broggi, Matteo; Huchet, Quentin; Patelli, Edoardo
2018-03-01
Fatigue induced cracks is a dangerous failure mechanism which affects mechanical components subject to alternating load cycles. System health monitoring should be adopted to identify cracks which can jeopardise the structure. Real-time damage detection may fail in the identification of the cracks due to different sources of uncertainty which have been poorly assessed or even fully neglected. In this paper, a novel efficient and robust procedure is used for the detection of cracks locations and lengths in mechanical components. A Bayesian model updating framework is employed, which allows accounting for relevant sources of uncertainty. The idea underpinning the approach is to identify the most probable crack consistent with the experimental measurements. To tackle the computational cost of the Bayesian approach an emulator is adopted for replacing the computationally costly Finite Element model. To improve the overall robustness of the procedure, different numerical likelihoods, measurement noises and imprecision in the value of model parameters are analysed and their effects quantified. The accuracy of the stochastic updating and the efficiency of the numerical procedure are discussed. An experimental aluminium frame and on a numerical model of a typical car suspension arm are used to demonstrate the applicability of the approach.
Modeling the Frequency of Cyclists’ Red-Light Running Behavior Using Bayesian PG Model and PLN Model
Directory of Open Access Journals (Sweden)
Yao Wu
2016-01-01
Full Text Available Red-light running behaviors of bicycles at signalized intersection lead to a large number of traffic conflicts and high collision potentials. The primary objective of this study is to model the cyclists’ red-light running frequency within the framework of Bayesian statistics. Data was collected at twenty-five approaches at seventeen signalized intersections. The Poisson-gamma (PG and Poisson-lognormal (PLN model were developed and compared. The models were validated using Bayesian p values based on posterior predictive checking indicators. It was found that the two models have a good fit of the observed cyclists’ red-light running frequency. Furthermore, the PLN model outperformed the PG model. The model estimated results showed that the amount of cyclists’ red-light running is significantly influenced by bicycle flow, conflict traffic flow, pedestrian signal type, vehicle speed, and e-bike rate. The validation result demonstrated the reliability of the PLN model. The research results can help transportation professionals to predict the expected amount of the cyclists’ red-light running and develop effective guidelines or policies to reduce red-light running frequency of bicycles at signalized intersections.
Holtmann, Jana; Koch, Tobias; Lochner, Katharina; Eid, Michael
2016-01-01
Multilevel structural equation models are increasingly applied in psychological research. With increasing model complexity, estimation becomes computationally demanding, and small sample sizes pose further challenges on estimation methods relying on asymptotic theory. Recent developments of Bayesian estimation techniques may help to overcome the shortcomings of classical estimation techniques. The use of potentially inaccurate prior information may, however, have detrimental effects, especially in small samples. The present Monte Carlo simulation study compares the statistical performance of classical estimation techniques with Bayesian estimation using different prior specifications for a two-level SEM with either continuous or ordinal indicators. Using two software programs (Mplus and Stan), differential effects of between- and within-level sample sizes on estimation accuracy were investigated. Moreover, it was tested to which extent inaccurate priors may have detrimental effects on parameter estimates in categorical indicator models. For continuous indicators, Bayesian estimation did not show performance advantages over ML. For categorical indicators, Bayesian estimation outperformed WLSMV solely in case of strongly informative accurate priors. Weakly informative inaccurate priors did not deteriorate performance of the Bayesian approach, while strong informative inaccurate priors led to severely biased estimates even with large sample sizes. With diffuse priors, Stan yielded better results than Mplus in terms of parameter estimates.
Non-stationary magnetoencephalography by Bayesian filtering of dipole models
Somersalo, E.; Voutilainen, A.; Kaipio, J. P.
2003-10-01
In this paper, we consider the biomagnetic inverse problem of estimating a time-varying source current from magnetic field measurements. It is assumed that the data are severely corrupted by measurement noise. This setting is a model for magnetoencephalography (MEG) when the dynamic nature of the source prevents us from effecting noise reduction by averaging over consecutive measurements. Thus, the potential applications of this approach include the single trial estimation of the brain activity, in particular from the spontaneous MEG data. Our approach is based on non-stationary Bayesian estimation, and we propose the use of particle filters. The source model in this work is either a single dipole or multiple dipole model. Part of the problem consists of the model determination. Numerical simulations are presented.
A kinematic model for Bayesian tracking of cyclic human motion
Greif, Thomas; Lienhart, Rainer
2010-01-01
We introduce a two-dimensional kinematic model for cyclic motions of humans, which is suitable for the use as temporal prior in any Bayesian tracking framework. This human motion model is solely based on simple kinematic properties: the joint accelerations. Distributions of joint accelerations subject to the cycle progress are learned from training data. We present results obtained by applying the introduced model to the cyclic motion of backstroke swimming in a Kalman filter framework that represents the posterior distribution by a Gaussian. We experimentally evaluate the sensitivity of the motion model with respect to the frequency and noise level of assumed appearance-based pose measurements by simulating various fidelities of the pose measurements using ground truth data.
Markov chain Monte Carlo simulation for Bayesian Hidden Markov Models
Chan, Lay Guat; Ibrahim, Adriana Irawati Nur Binti
2016-10-01
A hidden Markov model (HMM) is a mixture model which has a Markov chain with finite states as its mixing distribution. HMMs have been applied to a variety of fields, such as speech and face recognitions. The main purpose of this study is to investigate the Bayesian approach to HMMs. Using this approach, we can simulate from the parameters' posterior distribution using some Markov chain Monte Carlo (MCMC) sampling methods. HMMs seem to be useful, but there are some limitations. Therefore, by using the Mixture of Dirichlet processes Hidden Markov Model (MDPHMM) based on Yau et. al (2011), we hope to overcome these limitations. We shall conduct a simulation study using MCMC methods to investigate the performance of this model.
Dynamic Bayesian networks as prognostic models for clinical patient management.
van Gerven, Marcel A J; Taal, Babs G; Lucas, Peter J F
2008-08-01
Prognostic models in medicine are usually been built using simple decision rules, proportional hazards models, or Markov models. Dynamic Bayesian networks (DBNs) offer an approach that allows for the incorporation of the causal and temporal nature of medical domain knowledge as elicited from domain experts, thereby allowing for detailed prognostic predictions. The aim of this paper is to describe the considerations that must be taken into account when constructing a DBN for complex medical domains and to demonstrate their usefulness in practice. To this end, we focus on the construction of a DBN for prognosis of carcinoid patients, compare performance with that of a proportional hazards model, and describe predictions for three individual patients. We show that the DBN can make detailed predictions, about not only patient survival, but also other variables of interest, such as disease progression, the effect of treatment, and the development of complications. Strengths and limitations of our approach are discussed and compared with those offered by traditional methods.
One-Stage and Bayesian Two-Stage Optimal Designs for Mixture Models
Lin, Hefang
1999-01-01
In this research, Bayesian two-stage D-D optimal designs for mixture experiments with or without process variables under model uncertainty are developed. A Bayesian optimality criterion is used in the first stage to minimize the determinant of the posterior variances of the parameters. The second stage design is then generated according to an optimality procedure that collaborates with the improved model from first stage data. Our results show that the Bayesian two-stage D-D optimal design...
Bayesian uncertainty analysis with applications to turbulence modeling
International Nuclear Information System (INIS)
Cheung, Sai Hung; Oliver, Todd A.; Prudencio, Ernesto E.; Prudhomme, Serge; Moser, Robert D.
2011-01-01
In this paper, we apply Bayesian uncertainty quantification techniques to the processes of calibrating complex mathematical models and predicting quantities of interest (QoI's) with such models. These techniques also enable the systematic comparison of competing model classes. The processes of calibration and comparison constitute the building blocks of a larger validation process, the goal of which is to accept or reject a given mathematical model for the prediction of a particular QoI for a particular scenario. In this work, we take the first step in this process by applying the methodology to the analysis of the Spalart-Allmaras turbulence model in the context of incompressible, boundary layer flows. Three competing model classes based on the Spalart-Allmaras model are formulated, calibrated against experimental data, and used to issue a prediction with quantified uncertainty. The model classes are compared in terms of their posterior probabilities and their prediction of QoI's. The model posterior probability represents the relative plausibility of a model class given the data. Thus, it incorporates the model's ability to fit experimental observations. Alternatively, comparing models using the predicted QoI connects the process to the needs of decision makers that use the results of the model. We show that by using both the model plausibility and predicted QoI, one has the opportunity to reject some model classes after calibration, before subjecting the remaining classes to additional validation challenges.
Directory of Open Access Journals (Sweden)
David W Redding
Full Text Available Statistical approaches for inferring the spatial distribution of taxa (Species Distribution Models, SDMs commonly rely on available occurrence data, which is often clumped and geographically restricted. Although available SDM methods address some of these factors, they could be more directly and accurately modelled using a spatially-explicit approach. Software to fit models with spatial autocorrelation parameters in SDMs are now widely available, but whether such approaches for inferring SDMs aid predictions compared to other methodologies is unknown. Here, within a simulated environment using 1000 generated species' ranges, we compared the performance of two commonly used non-spatial SDM methods (Maximum Entropy Modelling, MAXENT and boosted regression trees, BRT, to a spatial Bayesian SDM method (fitted using R-INLA, when the underlying data exhibit varying combinations of clumping and geographic restriction. Finally, we tested how any recommended methodological settings designed to account for spatially non-random patterns in the data impact inference. Spatial Bayesian SDM method was the most consistently accurate method, being in the top 2 most accurate methods in 7 out of 8 data sampling scenarios. Within high-coverage sample datasets, all methods performed fairly similarly. When sampling points were randomly spread, BRT had a 1-3% greater accuracy over the other methods and when samples were clumped, the spatial Bayesian SDM method had a 4%-8% better AUC score. Alternatively, when sampling points were restricted to a small section of the true range all methods were on average 10-12% less accurate, with greater variation among the methods. Model inference under the recommended settings to account for autocorrelation was not impacted by clumping or restriction of data, except for the complexity of the spatial regression term in the spatial Bayesian model. Methods, such as those made available by R-INLA, can be successfully used to account
MATHEMATICAL RISK ANALYSIS: VIA NICHOLAS RISK MODEL AND BAYESIAN ANALYSIS
Directory of Open Access Journals (Sweden)
Anass BAYAGA
2010-07-01
Full Text Available The objective of this second part of a two-phased study was to explorethe predictive power of quantitative risk analysis (QRA method andprocess within Higher Education Institution (HEI. The method and process investigated the use impact analysis via Nicholas risk model and Bayesian analysis, with a sample of hundred (100 risk analysts in a historically black South African University in the greater Eastern Cape Province.The first findings supported and confirmed previous literature (KingIII report, 2009: Nicholas and Steyn, 2008: Stoney, 2007: COSA, 2004 that there was a direct relationship between risk factor, its likelihood and impact, certiris paribus. The second finding in relation to either controlling the likelihood or the impact of occurrence of risk (Nicholas risk model was that to have a brighter risk reward, it was important to control the likelihood ofoccurrence of risks as compared with its impact so to have a direct effect on entire University. On the Bayesian analysis, thus third finding, the impact of risk should be predicted along three aspects. These aspects included the human impact (decisions made, the property impact (students and infrastructural based and the business impact. Lastly, the study revealed that although in most business cases, where as business cycles considerably vary dependingon the industry and or the institution, this study revealed that, most impacts in HEI (University was within the period of one academic.The recommendation was that application of quantitative risk analysisshould be related to current legislative framework that affects HEI.
Bayesian Age-Period-Cohort Model of Lung Cancer Mortality
Directory of Open Access Journals (Sweden)
Bhikhari P. Tharu
2015-09-01
Full Text Available Background The objective of this study was to analyze the time trend for lung cancer mortality in the population of the USA by 5 years based on most recent available data namely to 2010. The knowledge of the mortality rates in the temporal trends is necessary to understand cancer burden.Methods Bayesian Age-Period-Cohort model was fitted using Poisson regression with histogram smoothing prior to decompose mortality rates based on age at death, period at death, and birth-cohort.Results Mortality rates from lung cancer increased more rapidly from age 52 years. It ended up to 325 deaths annually for 82 years on average. The mortality of younger cohorts was lower than older cohorts. The risk of lung cancer was lowered from period 1993 to recent periods.Conclusions The fitted Bayesian Age-Period-Cohort model with histogram smoothing prior is capable of explaining mortality rate of lung cancer. The reduction in carcinogens in cigarettes and increase in smoking cessation from around 1960 might led to decreasing trend of lung cancer mortality after calendar period 1993.
Quantitative comparison of canopy conductance models using a Bayesian approach
Samanta, S.; Clayton, M. K.; Mackay, D. S.; Kruger, E. L.; Ewers, B. E.
2008-09-01
A quantitative model comparison methodology based on deviance information criterion, a Bayesian measure of the trade-off between model complexity and goodness of fit, is developed and demonstrated by comparing semiempirical transpiration models. This methodology accounts for parameter and prediction uncertainties associated with such models and facilitates objective selection of the simplest model, out of available alternatives, which does not significantly compromise the ability to accurately model observations. We use this methodology to compare various Jarvis canopy conductance model configurations, embedded within a larger transpiration model, against canopy transpiration measured by sap flux. The results indicate that descriptions of the dependence of stomatal conductance on vapor pressure deficit, photosynthetic radiation, and temperature, as well as the gradual variation in canopy conductance through the season are essential in the transpiration model. Use of soil moisture was moderately significant, but only when used with a hyperbolic vapor pressure deficit relationship. Subtle differences in model quality could be clearly associated with small structural changes through the use of this methodology. The results also indicate that increments in model complexity are not always accompanied by improvements in model quality and that such improvements are conditional on model structure. Possible application of this methodology to compare complex semiempirical models of natural systems in general is also discussed.
Application of Bayesian Model Selection for Metal Yield Models using ALEGRA and Dakota.
Energy Technology Data Exchange (ETDEWEB)
Portone, Teresa; Niederhaus, John Henry; Sanchez, Jason James; Swiler, Laura Painton
2018-02-01
This report introduces the concepts of Bayesian model selection, which provides a systematic means of calibrating and selecting an optimal model to represent a phenomenon. This has many potential applications, including for comparing constitutive models. The ideas described herein are applied to a model selection problem between different yield models for hardened steel under extreme loading conditions.
Wallace, D. J.; Rosenheim, B. E.; Roberts, M. L.; Burton, J. R.; Donnelly, J. P.; Woodruff, J. D.
2014-12-01
Is a small quantity of high-precision ages more robust than a higher quantity of lower-precision ages for sediment core chronologies? AMS Radiocarbon ages have been available to researchers for several decades now, and precision of the technique has continued to improve. Analysis and time cost is high, though, and projects are often limited in terms of the number of dates that can be used to develop a chronology. The Gas Ion Source at the National Ocean Sciences Accelerator Mass Spectrometry Facility (NOSAMS), while providing lower-precision (uncertainty of order 100 14C y for a sample), is significantly less expensive and far less time consuming than conventional age dating and offers the unique opportunity for large amounts of ages. Here we couple two approaches, one analytical and one statistical, to investigate the utility of an age model comprised of these lower-precision ages for paleotempestology. We use a gas ion source interfaced to a gas-bench type device to generate radiocarbon dates approximately every 5 minutes while determining the order of sample analysis using the published Bayesian accumulation histories for deposits (Bacon). During two day-long sessions, several dates were obtained from carbonate shells in living position in a sediment core comprised of sapropel gel from Mangrove Lake, Bermuda. Samples were prepared where large shells were available, and the order of analysis was determined by the depth with the highest uncertainty according to Bacon. We present the results of these analyses as well as a prognosis for a future where such age models can be constructed from many dates that are quickly obtained relative to conventional radiocarbon dates. This technique currently is limited to carbonates, but development of a system for organic material dating is underway. We will demonstrate the extent to which sacrificing some analytical precision in favor of more dates improves age models.
A Framework for the Statistical Analysis of Probability of Mission Success Based on Bayesian Theory
2014-06-01
regression, and two non-probabilistic methods, fuzzy logic and neural networks, are discussed and compared below to determine which gives the best...2 2.3 Fuzzy Logic ...accurate measure than possibilistic methods, such as fuzzy logic discussed below [5]. Bayesian inference easily accounts for subjectivity and
Festa, Roberto
1992-01-01
According to the Bayesian view, scientific hypotheses must be appraised in terms of their posterior probabilities relative to the available experimental data. Such posterior probabilities are derived from the prior probabilities of the hypotheses by applying Bayes'theorem. One of the most important
A simulated annealing-based method for learning Bayesian networks from statistical data
Czech Academy of Sciences Publication Activity Database
Janžura, Martin; Nielsen, Jan
2006-01-01
Roč. 21, č. 3 (2006), s. 335-348 ISSN 0884-8173 R&D Projects: GA ČR GA201/03/0478 Institutional research plan: CEZ:AV0Z10750506 Keywords : Bayesian network * simulated annealing * Markov Chain Monte Carlo Subject RIV: BA - General Mathematics Impact factor: 0.429, year: 2006
Directory of Open Access Journals (Sweden)
David Lunn
Full Text Available The advantages of Bayesian statistical approaches, such as flexibility and the ability to acknowledge uncertainty in all parameters, have made them the prevailing method for analysing the spread of infectious diseases in human or animal populations. We introduce a Bayesian approach to experimental host-pathogen systems that shares these attractive features. Since uncertainty in all parameters is acknowledged, existing information can be accounted for through prior distributions, rather than through fixing some parameter values. The non-linear dynamics, multi-factorial design, multiple measurements of responses over time and sampling error that are typical features of experimental host-pathogen systems can also be naturally incorporated. We analyse the dynamics of the free-living protozoan Paramecium caudatum and its specialist bacterial parasite Holospora undulata. Our analysis provides strong evidence for a saturable infection function, and we were able to reproduce the two waves of infection apparent in the data by separating the initial inoculum from the parasites released after the first cycle of infection. In addition, the parameter estimates from the hierarchical model can be combined to infer variations in the parasite's basic reproductive ratio across experimental groups, enabling us to make predictions about the effect of resources and host genotype on the ability of the parasite to spread. Even though the high level of variability between replicates limited the resolution of the results, this Bayesian framework has strong potential to be used more widely in experimental ecology.
A Bayesian Analysis of Unobserved Component Models Using Ox
Directory of Open Access Journals (Sweden)
Charles S. Bos
2011-05-01
Full Text Available This article details a Bayesian analysis of the Nile river flow data, using a similar state space model as other articles in this volume. For this data set, Metropolis-Hastings and Gibbs sampling algorithms are implemented in the programming language Ox. These Markov chain Monte Carlo methods only provide output conditioned upon the full data set. For filtered output, conditioning only on past observations, the particle filter is introduced. The sampling methods are flexible, and this advantage is used to extend the model to incorporate a stochastic volatility process. The volatility changes both in the Nile data and also in daily S&P 500 return data are investigated. The posterior density of parameters and states is found to provide information on which elements of the model are easily identifiable, and which elements are estimated with less precision.
Fast Bayesian Inference in Dirichlet Process Mixture Models.
Wang, Lianming; Dunson, David B
2011-01-01
There has been increasing interest in applying Bayesian nonparametric methods in large samples and high dimensions. As Markov chain Monte Carlo (MCMC) algorithms are often infeasible, there is a pressing need for much faster algorithms. This article proposes a fast approach for inference in Dirichlet process mixture (DPM) models. Viewing the partitioning of subjects into clusters as a model selection problem, we propose a sequential greedy search algorithm for selecting the partition. Then, when conjugate priors are chosen, the resulting posterior conditionally on the selected partition is available in closed form. This approach allows testing of parametric models versus nonparametric alternatives based on Bayes factors. We evaluate the approach using simulation studies and compare it with four other fast nonparametric methods in the literature. We apply the proposed approach to three datasets including one from a large epidemiologic study. Matlab codes for the simulation and data analyses using the proposed approach are available online in the supplemental materials.
Aggregated Residential Load Modeling Using Dynamic Bayesian Networks
Energy Technology Data Exchange (ETDEWEB)
Vlachopoulou, Maria; Chin, George; Fuller, Jason C.; Lu, Shuai
2014-09-28
Abstract—It is already obvious that the future power grid will have to address higher demand for power and energy, and to incorporate renewable resources of different energy generation patterns. Demand response (DR) schemes could successfully be used to manage and balance power supply and demand under operating conditions of the future power grid. To achieve that, more advanced tools for DR management of operations and planning are necessary that can estimate the available capacity from DR resources. In this research, a Dynamic Bayesian Network (DBN) is derived, trained, and tested that can model aggregated load of Heating, Ventilation, and Air Conditioning (HVAC) systems. DBNs can provide flexible and powerful tools for both operations and planing, due to their unique analytical capabilities. The DBN model accuracy and flexibility of use is demonstrated by testing the model under different operational scenarios.
Road network safety evaluation using Bayesian hierarchical joint model.
Wang, Jie; Huang, Helai
2016-05-01
Safety and efficiency are commonly regarded as two significant performance indicators of transportation systems. In practice, road network planning has focused on road capacity and transport efficiency whereas the safety level of a road network has received little attention in the planning stage. This study develops a Bayesian hierarchical joint model for road network safety evaluation to help planners take traffic safety into account when planning a road network. The proposed model establishes relationships between road network risk and micro-level variables related to road entities and traffic volume, as well as socioeconomic, trip generation and network density variables at macro level which are generally used for long term transportation plans. In addition, network spatial correlation between intersections and their connected road segments is also considered in the model. A road network is elaborately selected in order to compare the proposed hierarchical joint model with a previous joint model and a negative binomial model. According to the results of the model comparison, the hierarchical joint model outperforms the joint model and negative binomial model in terms of the goodness-of-fit and predictive performance, which indicates the reasonableness of considering the hierarchical data structure in crash prediction and analysis. Moreover, both random effects at the TAZ level and the spatial correlation between intersections and their adjacent segments are found to be significant, supporting the employment of the hierarchical joint model as an alternative in road-network-level safety modeling as well. Copyright © 2016 Elsevier Ltd. All rights reserved.
Integrated Bayesian network framework for modeling complex ecological issues.
Johnson, Sandra; Mengersen, Kerrie
2012-07-01
The management of environmental problems is multifaceted, requiring varied and sometimes conflicting objectives and perspectives to be considered. Bayesian network (BN) modeling facilitates the integration of information from diverse sources and is well suited to tackling the management challenges of complex environmental problems. However, combining several perspectives in one model can lead to large, unwieldy BNs that are difficult to maintain and understand. Conversely, an oversimplified model may lead to an unrealistic representation of the environmental problem. Environmental managers require the current research and available knowledge about an environmental problem of interest to be consolidated in a meaningful way, thereby enabling the assessment of potential impacts and different courses of action. Previous investigations of the environmental problem of interest may have already resulted in the construction of several disparate ecological models. On the other hand, the opportunity may exist to initiate this modeling. In the first instance, the challenge is to integrate existing models and to merge the information and perspectives from these models. In the second instance, the challenge is to include different aspects of the environmental problem incorporating both the scientific and management requirements. Although the paths leading to the combined model may differ for these 2 situations, the common objective is to design an integrated model that captures the available information and research, yet is simple to maintain, expand, and refine. BN modeling is typically an iterative process, and we describe a heuristic method, the iterative Bayesian network development cycle (IBNDC), for the development of integrated BN models that are suitable for both situations outlined above. The IBNDC approach facilitates object-oriented BN (OOBN) modeling, arguably viewed as the next logical step in adaptive management modeling, and that embraces iterative development
Irving, J.; Koepke, C.; Elsheikh, A. H.
2017-12-01
Bayesian solutions to geophysical and hydrological inverse problems are dependent upon a forward process model linking subsurface parameters to measured data, which is typically assumed to be known perfectly in the inversion procedure. However, in order to make the stochastic solution of the inverse problem computationally tractable using, for example, Markov-chain-Monte-Carlo (MCMC) methods, fast approximations of the forward model are commonly employed. This introduces model error into the problem, which has the potential to significantly bias posterior statistics and hamper data integration efforts if not properly accounted for. Here, we present a new methodology for addressing the issue of model error in Bayesian solutions to hydrogeophysical inverse problems that is geared towards the common case where these errors cannot be effectively characterized globally through some parametric statistical distribution or locally based on interpolation between a small number of computed realizations. Rather than focusing on the construction of a global or local error model, we instead work towards identification of the model-error component of the residual through a projection-based approach. In this regard, pairs of approximate and detailed model runs are stored in a dictionary that grows at a specified rate during the MCMC inversion procedure. At each iteration, a local model-error basis is constructed for the current test set of model parameters using the K-nearest neighbour entries in the dictionary, which is then used to separate the model error from the other error sources before computing the likelihood of the proposed set of model parameters. We demonstrate the performance of our technique on the inversion of synthetic crosshole ground-penetrating radar traveltime data for three different subsurface parameterizations of varying complexity. The synthetic data are generated using the eikonal equation, whereas a straight-ray forward model is assumed in the inversion
Calibration of complex models through Bayesian evidence synthesis: a demonstration and tutorial
Jackson, Christopher; Jit, Mark; Sharples, Linda; DeAngelis, Daniela
2016-01-01
Summary Decision-analytic models must often be informed using data which are only indirectly related to the main model parameters. The authors outline how to implement a Bayesian synthesis of diverse sources of evidence to calibrate the parameters of a complex model. A graphical model is built to represent how observed data are generated from statistical models with unknown parameters, and how those parameters are related to quantities of interest for decision-making. This forms the basis of an algorithm to estimate a posterior probability distribution, which represents the updated state of evidence for all unknowns given all data and prior beliefs. This process calibrates the quantities of interest against data, and at the same time, propagates all parameter uncertainties to the results used for decision-making. To illustrate these methods, the authors demonstrate how a previously-developed Markov model for the progression of human papillomavirus (HPV16) infection was rebuilt in a Bayesian framework. Transition probabilities between states of disease severity are inferred indirectly from cross-sectional observations of prevalence of HPV16 and HPV16-related disease by age, cervical cancer incidence, and other published information. Previously, a discrete collection of plausible scenarios was identified, but with no further indication of which of these are more plausible. Instead, the authors derive a Bayesian posterior distribution, in which scenarios are implicitly weighted according to how well they are supported by the data. In particular, we emphasise the appropriate choice of prior distributions and checking and comparison of fitted models. PMID:23886677
Modeling Land-Use Decision Behavior with Bayesian Belief Networks
Directory of Open Access Journals (Sweden)
Inge Aalders
2008-06-01
Full Text Available The ability to incorporate and manage the different drivers of land-use change in a modeling process is one of the key challenges because they are complex and are both quantitative and qualitative in nature. This paper uses Bayesian belief networks (BBN to incorporate characteristics of land managers in the modeling process and to enhance our understanding of land-use change based on the limited and disparate sources of information. One of the two models based on spatial data represented land managers in the form of a quantitative variable, the area of individual holdings, whereas the other model included qualitative data from a survey of land managers. Random samples from the spatial data provided evidence of the relationship between the different variables, which I used to develop the BBN structure. The model was tested for four different posterior probability distributions, and results showed that the trained and learned models are better at predicting land use than the uniform and random models. The inference from the model demonstrated the constraints that biophysical characteristics impose on land managers; for older land managers without heirs, there is a higher probability of the land use being arable agriculture. The results show the benefits of incorporating a more complex notion of land managers in land-use models, and of using different empirical data sources in the modeling process. Future research should focus on incorporating more complex social processes into the modeling structure, as well as incorporating spatio-temporal dynamics in a BBN.
Berliner, M.
2017-12-01
Bayesian statistical decision theory offers a natural framework for decision-policy making in the presence of uncertainty. Key advantages of the approach include efficient incorporation of information and observations. However, in complicated settings it is very difficult, perhaps essentially impossible, to formalize the mathematical inputs needed in the approach. Nevertheless, using the approach as a template is useful for decision support; that is, organizing and communicating our analyses. Bayesian hierarchical modeling is valuable in quantifying and managing uncertainty such cases. I review some aspects of the idea emphasizing statistical model development and use in the context of sea-level rise.
Thyer, Mark; Kavetski, Dmitri; Evin, Guillaume; Kuczera, George; Renard, Ben; McInerney, David
2015-04-01
All scientific and statistical analysis, particularly in natural sciences, is based on approximations and assumptions. For example, the calibration of hydrological models using approaches such as Nash-Sutcliffe efficiency and/or simple least squares (SLS) objective functions may appear to be 'assumption-free'. However, this is a naïve point of view, as SLS assumes that the model residuals (residuals=observed-predictions) are independent, homoscedastic and Gaussian. If these assumptions are poor, parameter inference and model predictions will be correspondingly poor. An essential step in model development is therefore to verify the assumptions and approximations made in the modeling process. Diagnostics play a key role in verifying modeling assumptions. An important advantage of the formal Bayesian approach is that the modeler is required to make the assumptions explicit. Specialized diagnostics can then be developed and applied to test and verify their assumptions. This paper presents a suite of statistical and modeling diagnostics that can be used by environmental modelers to test their modeling calibration assumptions and diagnose model deficiencies. Three major types of diagnostics are presented: Residual Diagnostics Residual diagnostics are used to test whether the assumptions of the residual error model within the likelihood function are compatible with the data. This includes testing for statistical independence, homoscedasticity, unbiasedness, Gaussianity and any distributional assumptions. Parameter Uncertainty and MCMC Diagnostics An important part of Bayesian analysis is assess parameter uncertainty. Markov Chain Monte Carlo (MCMC) methods are a powerful numerical tool for estimating these uncertainties. Diagnostics based on posterior parameter distributions can be used to assess parameter identifiability, interactions and correlations. This provides a very useful tool for detecting and remedying model deficiencies. In addition, numerical diagnostics are
Bayesian meta-analysis models for microarray data: a comparative study
Directory of Open Access Journals (Sweden)
Song Joon J
2007-03-01
Full Text Available Abstract Background With the growing abundance of microarray data, statistical methods are increasingly needed to integrate results across studies. Two common approaches for meta-analysis of microarrays include either combining gene expression measures across studies or combining summaries such as p-values, probabilities or ranks. Here, we compare two Bayesian meta-analysis models that are analogous to these methods. Results Two Bayesian meta-analysis models for microarray data have recently been introduced. The first model combines standardized gene expression measures across studies into an overall mean, accounting for inter-study variability, while the second combines probabilities of differential expression without combining expression values. Both models produce the gene-specific posterior probability of differential expression, which is the basis for inference. Since the standardized expression integration model includes inter-study variability, it may improve accuracy of results versus the probability integration model. However, due to the small number of studies typical in microarray meta-analyses, the variability between studies is challenging to estimate. The probability integration model eliminates the need to model variability between studies, and thus its implementation is more straightforward. We found in simulations of two and five studies that combining probabilities outperformed combining standardized gene expression measures for three comparison values: the percent of true discovered genes in meta-analysis versus individual studies; the percent of true genes omitted in meta-analysis versus separate studies, and the number of true discovered genes for fixed levels of Bayesian false discovery. We identified similar results when pooling two independent studies of Bacillus subtilis. We assumed that each study was produced from the same microarray platform with only two conditions: a treatment and control, and that the data sets
Evaluating Flight Crew Performance by a Bayesian Network Model
Directory of Open Access Journals (Sweden)
Wei Chen
2018-03-01
Full Text Available Flight crew performance is of great significance in keeping flights safe and sound. When evaluating the crew performance, quantitative detailed behavior information may not be available. The present paper introduces the Bayesian Network to perform flight crew performance evaluation, which permits the utilization of multidisciplinary sources of objective and subjective information, despite sparse behavioral data. In this paper, the causal factors are selected based on the analysis of 484 aviation accidents caused by human factors. Then, a network termed Flight Crew Performance Model is constructed. The Delphi technique helps to gather subjective data as a supplement to objective data from accident reports. The conditional probabilities are elicited by the leaky noisy MAX model. Two ways of inference for the BN—probability prediction and probabilistic diagnosis are used and some interesting conclusions are drawn, which could provide data support to make interventions for human error management in aviation safety.
Joint Modeling of Multiple Crimes: A Bayesian Spatial Approach
Directory of Open Access Journals (Sweden)
Hongqiang Liu
2017-01-01
Full Text Available A multivariate Bayesian spatial modeling approach was used to jointly model the counts of two types of crime, i.e., burglary and non-motor vehicle theft, and explore the geographic pattern of crime risks and relevant risk factors. In contrast to the univariate model, which assumes independence across outcomes, the multivariate approach takes into account potential correlations between crimes. Six independent variables are included in the model as potential risk factors. In order to fully present this method, both the multivariate model and its univariate counterpart are examined. We fitted the two models to the data and assessed them using the deviance information criterion. A comparison of the results from the two models indicates that the multivariate model was superior to the univariate model. Our results show that population density and bar density are clearly associated with both burglary and non-motor vehicle theft risks and indicate a close relationship between these two types of crime. The posterior means and 2.5% percentile of type-specific crime risks estimated by the multivariate model were mapped to uncover the geographic patterns. The implications, limitations and future work of the study are discussed in the concluding section.
HDDM: Hierarchical Bayesian estimation of the Drift-Diffusion Model in Python
Directory of Open Access Journals (Sweden)
Thomas V Wiecki
2013-08-01
Full Text Available The diffusion model is a commonly used tool to infer latent psychological processes underlying decision making, and to link them to neural mechanisms based on reaction times. Although efficient open source software has been made available to quantitatively fit the model to data, current estimation methods require an abundance of reaction time measurements to recover meaningful parameters, and only provide point estimates of each parameter. In contrast, hierarchical Bayesian parameter estimation methods are useful for enhancing statistical power, allowing for simultaneous estimation of individual subject parameters and the group distribution that they are drawn from, while also providing measures of uncertainty in these parameters in the posterior distribution. Here, we present a novel Python-based toolbox called HDDM (hierarchical drift diffusion model, which allows fast and flexible estimation of the the drift-diffusion model and the related linear ballistic accumulator model. HDDM requires fewer data per subject / condition than non-hierarchical method, allows for full Bayesian data analysis, and can handle outliers in the data. Finally, HDDM supports the estimation of how trial-by-trial measurements (e.g. fMRI influence decision making parameters. This paper will first describe the theoretical background of drift-diffusion model and Bayesian inference. We then illustrate usage of the toolbox on a real-world data set from our lab. Finally, parameter recovery studies show that HDDM beats alternative fitting methods like the chi-quantile method as well as maximum likelihood estimation. The software and documentation can be downloaded at: http://ski.clps.brown.edu/hddm_docs
Modelling of population dynamics of red king crab using Bayesian approach
Directory of Open Access Journals (Sweden)
Bakanev Sergey ...
2012-10-01
Modeling population dynamics based on the Bayesian approach enables to successfully resolve the above issues. The integration of the data from various studies into a unified model based on Bayesian parameter estimation method provides a much more detailed description of the processes occurring in the population.
Bayesian leave-one-out cross-validation approximations for Gaussian latent variable models
DEFF Research Database (Denmark)
Vehtari, Aki; Mononen, Tommi; Tolvanen, Ville
2016-01-01
The future predictive performance of a Bayesian model can be estimated using Bayesian cross-validation. In this article, we consider Gaussian latent variable models where the integration over the latent values is approximated using the Laplace method or expectation propagation (EP). We study the ...
Bayesian network as a modelling tool for risk management in agriculture
DEFF Research Database (Denmark)
Rasmussen, Svend; Madsen, Anders L.; Lund, Mogens
. In this paper we use Bayesian networks as an integrated modelling approach for representing uncertainty and analysing risk management in agriculture. It is shown how historical farm account data may be efficiently used to estimate conditional probabilities, which are the core elements in Bayesian network models...
Some explorations into Bayesian modelling of risks due to pesticide intake from food
Voet, van der, H.; Paulo, M.J.
2004-01-01
This paper presents some common types of data and models in pesticide exposure assessment. The problems of traditional methods are discussed in connection with possibilities to address them in a Bayesian framework. We present simple Bayesian models for consumption of food and for residue monitoring data
Cooper, Richard J; Krueger, Tobias; Hiscock, Kevin M; Rawlins, Barry G
2014-11-01
Mixing models have become increasingly common tools for apportioning fluvial sediment load to various sediment sources across catchments using a wide variety of Bayesian and frequentist modeling approaches. In this study, we demonstrate how different model setups can impact upon resulting source apportionment estimates in a Bayesian framework via a one-factor-at-a-time (OFAT) sensitivity analysis. We formulate 13 versions of a mixing model, each with different error assumptions and model structural choices, and apply them to sediment geochemistry data from the River Blackwater, Norfolk, UK, to apportion suspended particulate matter (SPM) contributions from three sources (arable topsoils, road verges, and subsurface material) under base flow conditions between August 2012 and August 2013. Whilst all 13 models estimate subsurface sources to be the largest contributor of SPM (median ∼76%), comparison of apportionment estimates reveal varying degrees of sensitivity to changing priors, inclusion of covariance terms, incorporation of time-variant distributions, and methods of proportion characterization. We also demonstrate differences in apportionment results between a full and an empirical Bayesian setup, and between a Bayesian and a frequentist optimization approach. This OFAT sensitivity analysis reveals that mixing model structural choices and error assumptions can significantly impact upon sediment source apportionment results, with estimated median contributions in this study varying by up to 21% between model versions. Users of mixing models are therefore strongly advised to carefully consider and justify their choice of model structure prior to conducting sediment source apportionment investigations. An OFAT sensitivity analysis of sediment fingerprinting mixing models is conductedBayesian models display high sensitivity to error assumptions and structural choicesSource apportionment results differ between Bayesian and frequentist approaches.
Bayesian models of thermal and pluviometric time series in the Fucino plateau
Directory of Open Access Journals (Sweden)
Adriana Trabucco
2011-09-01
Full Text Available This work was developed within the Project Metodologie e sistemi integrati per la qualificazione di produzioni orticole del Fucino (Methodologies and integrated systems for the classification of horticultural products in the Fucino plateau, sponsored by the Italian Ministry of Education, University and Research, Strategic Projects, Law 448/97. Agro-system managing, especially if necessary to achieve high quality in speciality crops, requires knowledge of main features and intrinsic variability of climate. Statistical models may properly summarize the structure existing behind the observed variability, furthermore they may support the agronomic manager by providing the probability that meteorological events happen in a time window of interest. More than 30 years of daily values collected in four sites located on the Fucino plateau, Abruzzo region, Italy, were studied by fitting Bayesian generalized linear models to air temperature maximum /minimum and rainfall time series. Bayesian predictive distributions of climate variables supporting decision-making processes were calculated at different timescales, 5-days for temperatures and 10-days for rainfall, both to reduce computational efforts and to simplify statistical model assumptions. Technicians and field operators, even with limited statistical training, may exploit the model output by inspecting graphs and climatic profiles of the cultivated areas during decision-making processes. Realizations taken from predictive distributions may also be used as input for agro-ecological models (e.g. models of crop growth, water balance. Fitted models may be exploited to monitor climatic changes and to revise climatic profiles of interest areas, periodically updating the probability distributions of target climatic variables. For the sake of brevity, the description of results is limited to just one of the four sites, and results for all other sites are available as supplementary information.
Bayesian Correction for Misclassification in Multilevel Count Data Models
Directory of Open Access Journals (Sweden)
Tyler Nelson
2018-01-01
Full Text Available Covariate misclassification is well known to yield biased estimates in single level regression models. The impact on hierarchical count models has been less studied. A fully Bayesian approach to modeling both the misclassified covariate and the hierarchical response is proposed. Models with a single diagnostic test and with multiple diagnostic tests are considered. Simulation studies show the ability of the proposed model to appropriately account for the misclassification by reducing bias and improving performance of interval estimators. A real data example further demonstrated the consequences of ignoring the misclassification. Ignoring misclassification yielded a model that indicated there was a significant, positive impact on the number of children of females who observed spousal abuse between their parents. When the misclassification was accounted for, the relationship switched to negative, but not significant. Ignoring misclassification in standard linear and generalized linear models is well known to lead to biased results. We provide an approach to extend misclassification modeling to the important area of hierarchical generalized linear models.
Automated comparison of Bayesian reconstructions of experimental profiles with physical models
International Nuclear Information System (INIS)
Irishkin, Maxim
2014-01-01
In this work we developed an expert system that carries out in an integrated and fully automated way i) a reconstruction of plasma profiles from the measurements, using Bayesian analysis ii) a prediction of the reconstructed quantities, according to some models and iii) an intelligent comparison of the first two steps. This system includes systematic checking of the internal consistency of the reconstructed quantities, enables automated model validation and, if a well-validated model is used, can be applied to help detecting interesting new physics in an experiment. The work shows three applications of this quite general system. The expert system can successfully detect failures in the automated plasma reconstruction and provide (on successful reconstruction cases) statistics of agreement of the models with the experimental data, i.e. information on the model validity. (author) [fr
Bayesian network models for error detection in radiotherapy plans
Kalet, Alan M.; Gennari, John H.; Ford, Eric C.; Phillips, Mark H.
2015-04-01
The purpose of this study is to design and develop a probabilistic network for detecting errors in radiotherapy plans for use at the time of initial plan verification. Our group has initiated a multi-pronged approach to reduce these errors. We report on our development of Bayesian models of radiotherapy plans. Bayesian networks consist of joint probability distributions that define the probability of one event, given some set of other known information. Using the networks, we find the probability of obtaining certain radiotherapy parameters, given a set of initial clinical information. A low probability in a propagated network then corresponds to potential errors to be flagged for investigation. To build our networks we first interviewed medical physicists and other domain experts to identify the relevant radiotherapy concepts and their associated interdependencies and to construct a network topology. Next, to populate the network’s conditional probability tables, we used the Hugin Expert software to learn parameter distributions from a subset of de-identified data derived from a radiation oncology based clinical information database system. These data represent 4990 unique prescription cases over a 5 year period. Under test case scenarios with approximately 1.5% introduced error rates, network performance produced areas under the ROC curve of 0.88, 0.98, and 0.89 for the lung, brain and female breast cancer error detection networks, respectively. Comparison of the brain network to human experts performance (AUC of 0.90 ± 0.01) shows the Bayes network model performs better than domain experts under the same test conditions. Our results demonstrate the feasibility and effectiveness of comprehensive probabilistic models as part of decision support systems for improved detection of errors in initial radiotherapy plan verification procedures.
Factors contributing to academic achievement: a Bayesian structure equation modelling study
Payandeh Najafabadi, Amir T.; Omidi Najafabadi, Maryam; Farid-Rohani, Mohammad Reza
2013-06-01
In Iran, high school graduates enter university after taking a very difficult entrance exam called the Konkoor. Therefore, only the top-performing students are admitted by universities to continue their bachelor's education in statistics. Surprisingly, statistically, most of such students fall into the following categories: (1) do not succeed in their education despite their excellent performance on the Konkoor and in high school; (2) graduate with a grade point average (GPA) that is considerably lower than their high school GPA; (3) continue their master's education in majors other than statistics and (4) try to find jobs unrelated to statistics. This article employs the well-known and powerful statistical technique, the Bayesian structural equation modelling (SEM), to study the academic success of recent graduates who have studied statistics at Shahid Beheshti University in Iran. This research: (i) considered academic success as a latent variable, which was measured by GPA and other academic success (see below) of students in the target population; (ii) employed the Bayesian SEM, which works properly for small sample sizes and ordinal variables; (iii), which is taken from the literature, developed five main factors that affected academic success and (iv) considered several standard psychological tests and measured characteristics such as 'self-esteem' and 'anxiety'. We then study the impact of such factors on the academic success of the target population. Six factors that positively impact student academic success were identified in the following order of relative impact (from greatest to least): 'Teaching-Evaluation', 'Learner', 'Environment', 'Family', 'Curriculum' and 'Teaching Knowledge'. Particularly, influential variables within each factor have also been noted.
A hierarchical Bayesian model to incorporate uncertainty into methods for diversity partitioning.
Marion, Zachary H; Fordyce, James A; Fitzpatrick, Benjamin M
2018-04-01
Recently there have been major theoretical advances in the quantification and partitioning of diversity within and among communities, regions, and ecosystems. However, applying those advances to real data remains a challenge. Ecologists often end up describing their samples rather than estimating the diversity components of an underlying study system, and existing approaches do not easily provide statistical frameworks for testing ecological questions. Here we offer one avenue to do all of the above using a hierarchical Bayesian approach. We estimate posterior distributions of the underlying "true" relative abundances of each species within each unit sampled. These posterior estimates of relative abundance can then be used with existing formulae to estimate and partition diversity. The result is a posterior distribution of diversity metrics describing our knowledge (or beliefs) about the study system. This approach intuitively leads to statistical inferences addressing biologically motivated hypotheses via Bayesian model comparison. Using simulations, we demonstrate that our approach does as well or better at approximating the "true" diversity of a community relative to naïve or ad-hoc bias-corrected estimates. Moreover, model comparison correctly distinguishes between alternative hypotheses about the distribution of diversity within and among samples. Finally, we use an empirical ecological dataset to illustrate how the approach can be used to address questions about the makeup and diversities of assemblages at local and regional scales. © 2018 by the Ecological Society of America.
Large scale Bayesian nuclear data evaluation with consistent model defects
International Nuclear Information System (INIS)
Schnabel, G
2015-01-01
The aim of nuclear data evaluation is the reliable determination of cross sections and related quantities of the atomic nuclei. To this end, evaluation methods are applied which combine the information of experiments with the results of model calculations. The evaluated observables with their associated uncertainties and correlations are assembled into data sets, which are required for the development of novel nuclear facilities, such as fusion reactors for energy supply, and accelerator driven systems for nuclear waste incineration. The efficiency and safety of such future facilities is dependent on the quality of these data sets and thus also on the reliability of the applied evaluation methods. This work investigated the performance of the majority of available evaluation methods in two scenarios. The study indicated the importance of an essential component in these methods, which is the frequently ignored deficiency of nuclear models. Usually, nuclear models are based on approximations and thus their predictions may deviate from reliable experimental data. As demonstrated in this thesis, the neglect of this possibility in evaluation methods can lead to estimates of observables which are inconsistent with experimental data. Due to this finding, an extension of Bayesian evaluation methods is proposed to take into account the deficiency of the nuclear models. The deficiency is modeled as a random function in terms of a Gaussian process and combined with the model prediction. This novel formulation conserves sum rules and allows to explicitly estimate the magnitude of model deficiency. Both features are missing in available evaluation methods so far. Furthermore, two improvements of existing methods have been developed in the course of this thesis. The first improvement concerns methods relying on Monte Carlo sampling. A Metropolis-Hastings scheme with a specific proposal distribution is suggested, which proved to be more efficient in the studied scenarios than the
A Bayesian Attractor Model for Perceptual Decision Making
Bitzer, Sebastian; Bruineberg, Jelle; Kiebel, Stefan J.
2015-01-01
Even for simple perceptual decisions, the mechanisms that the brain employs are still under debate. Although current consensus states that the brain accumulates evidence extracted from noisy sensory information, open questions remain about how this simple model relates to other perceptual phenomena such as flexibility in decisions, decision-dependent modulation of sensory gain, or confidence about a decision. We propose a novel approach of how perceptual decisions are made by combining two influential formalisms into a new model. Specifically, we embed an attractor model of decision making into a probabilistic framework that models decision making as Bayesian inference. We show that the new model can explain decision making behaviour by fitting it to experimental data. In addition, the new model combines for the first time three important features: First, the model can update decisions in response to switches in the underlying stimulus. Second, the probabilistic formulation accounts for top-down effects that may explain recent experimental findings of decision-related gain modulation of sensory neurons. Finally, the model computes an explicit measure of confidence which we relate to recent experimental evidence for confidence computations in perceptual decision tasks. PMID:26267143
Technical Note: Approximate Bayesian parameterization of a process-based tropical forest model
Hartig, F.; Dislich, C.; Wiegand, T.; Huth, A.
2014-02-01
Inverse parameter estimation of process-based models is a long-standing problem in many scientific disciplines. A key question for inverse parameter estimation is how to define the metric that quantifies how well model predictions fit to the data. This metric can be expressed by general cost or objective functions, but statistical inversion methods require a particular metric, the probability of observing the data given the model parameters, known as the likelihood. For technical and computational reasons, likelihoods for process-based stochastic models are usually based on general assumptions about variability in the observed data, and not on the stochasticity generated by the model. Only in recent years have new methods become available that allow the generation of likelihoods directly from stochastic simulations. Previous applications of these approximate Bayesian methods have concentrated on relatively simple models. Here, we report on the application of a simulation-based likelihood approximation for FORMIND, a parameter-rich individual-based model of tropical forest dynamics. We show that approximate Bayesian inference, based on a parametric likelihood approximation placed in a conventional Markov chain Monte Carlo (MCMC) sampler, performs well in retrieving known parameter values from virtual inventory data generated by the forest model. We analyze the results of the parameter estimation, examine its sensitivity to the choice and aggregation of model outputs and observed data (summary statistics), and demonstrate the application of this method by fitting the FORMIND model to field data from an Ecuadorian tropical forest. Finally, we discuss how this approach differs from approximate Bayesian computation (ABC), another method commonly used to generate simulation-based likelihood approximations. Our results demonstrate that simulation-based inference, which offers considerable conceptual advantages over more traditional methods for inverse parameter estimation
Technical Note: Approximate Bayesian parameterization of a complex tropical forest model
Hartig, F.; Dislich, C.; Wiegand, T.; Huth, A.
2013-08-01
Inverse parameter estimation of process-based models is a long-standing problem in ecology and evolution. A key problem of inverse parameter estimation is to define a metric that quantifies how well model predictions fit to the data. Such a metric can be expressed by general cost or objective functions, but statistical inversion approaches are based on a particular metric, the probability of observing the data given the model, known as the likelihood. Deriving likelihoods for dynamic models requires making assumptions about the probability for observations to deviate from mean model predictions. For technical reasons, these assumptions are usually derived without explicit consideration of the processes in the simulation. Only in recent years have new methods become available that allow generating likelihoods directly from stochastic simulations. Previous applications of these approximate Bayesian methods have concentrated on relatively simple models. Here, we report on the application of a simulation-based likelihood approximation for FORMIND, a parameter-rich individual-based model of tropical forest dynamics. We show that approximate Bayesian inference, based on a parametric likelihood approximation placed in a conventional MCMC, performs well in retrieving known parameter values from virtual field data generated by the forest model. We analyze the results of the parameter estimation, examine the sensitivity towards the choice and aggregation of model outputs and observed data (summary statistics), and show results from using this method to fit the FORMIND model to field data from an Ecuadorian tropical forest. Finally, we discuss differences of this approach to Approximate Bayesian Computing (ABC), another commonly used method to generate simulation-based likelihood approximations. Our results demonstrate that simulation-based inference, which offers considerable conceptual advantages over more traditional methods for inverse parameter estimation, can
Bayesian spatial semi-parametric modeling of HIV variation in Kenya.
Directory of Open Access Journals (Sweden)
Oscar Ngesa
Full Text Available Spatial statistics has seen rapid application in many fields, especially epidemiology and public health. Many studies, nonetheless, make limited use of the geographical location information and also usually assume that the covariates, which are related to the response variable, have linear effects. We develop a Bayesian semi-parametric regression model for HIV prevalence data. Model estimation and inference is based on fully Bayesian approach via Markov Chain Monte Carlo (McMC. The model is applied to HIV prevalence data among men in Kenya, derived from the Kenya AIDS indicator survey, with n = 3,662. Past studies have concluded that HIV infection has a nonlinear association with age. In this study a smooth function based on penalized regression splines is used to estimate this nonlinear effect. Other covariates were assumed to have a linear effect. Spatial references to the counties were modeled as both structured and unstructured spatial effects. We observe that circumcision reduces the risk of HIV infection. The results also indicate that men in the urban areas were more likely to be infected by HIV as compared to their rural counterpart. Men with higher education had the lowest risk of HIV infection. A nonlinear relationship between HIV infection and age was established. Risk of HIV infection increases with age up to the age of 40 then declines with increase in age. Men who had STI in the last 12 months were more likely to be infected with HIV. Also men who had ever used a condom were found to have higher likelihood to be infected by HIV. A significant spatial variation of HIV infection in Kenya was also established. The study shows the practicality and flexibility of Bayesian semi-parametric regression model in analyzing epidemiological data.
A hierarchical bayesian model to quantify uncertainty of stream water temperature forecasts.
Directory of Open Access Journals (Sweden)
Guillaume Bal
Full Text Available Providing generic and cost effective modelling approaches to reconstruct and forecast freshwater temperature using predictors as air temperature and water discharge is a prerequisite to understanding ecological processes underlying the impact of water temperature and of global warming on continental aquatic ecosystems. Using air temperature as a simple linear predictor of water temperature can lead to significant bias in forecasts as it does not disentangle seasonality and long term trends in the signal. Here, we develop an alternative approach based on hierarchical Bayesian statistical time series modelling of water temperature, air temperature and water discharge using seasonal sinusoidal periodic signals and time varying means and amplitudes. Fitting and forecasting performances of this approach are compared with that of simple linear regression between water and air temperatures using i an emotive simulated example, ii application to three French coastal streams with contrasting bio-geographical conditions and sizes. The time series modelling approach better fit data and does not exhibit forecasting bias in long term trends contrary to the linear regression. This new model also allows for more accurate forecasts of water temperature than linear regression together with a fair assessment of the uncertainty around forecasting. Warming of water temperature forecast by our hierarchical Bayesian model was slower and more uncertain than that expected with the classical regression approach. These new forecasts are in a form that is readily usable in further ecological analyses and will allow weighting of outcomes from different scenarios to manage climate change impacts on freshwater wildlife.
Wei Wu; James Clark; James Vose
2010-01-01
Hierarchical Bayesian (HB) modeling allows for multiple sources of uncertainty by factoring complex relationships into conditional distributions that can be used to draw inference and make predictions. We applied an HB model to estimate the parameters and state variables of a parsimonious hydrological model â GR4J â by coherently assimilating the uncertainties from the...
iSEDfit: Bayesian spectral energy distribution modeling of galaxies
Moustakas, John
2017-08-01
iSEDfit uses Bayesian inference to extract the physical properties of galaxies from their observed broadband photometric spectral energy distribution (SED). In its default mode, the inputs to iSEDfit are the measured photometry (fluxes and corresponding inverse variances) and a measurement of the galaxy redshift. Alternatively, iSEDfit can be used to estimate photometric redshifts from the input photometry alone. After the priors have been specified, iSEDfit calculates the marginalized posterior probability distributions for the physical parameters of interest, including the stellar mass, star-formation rate, dust content, star formation history, and stellar metallicity. iSEDfit also optionally computes K-corrections and produces multiple "quality assurance" (QA) plots at each stage of the modeling procedure to aid in the interpretation of the prior parameter choices and subsequent fitting results. The software is distributed as part of the impro IDL suite.
Designing and testing inflationary models with Bayesian networks
International Nuclear Information System (INIS)
Price, Layne C.; Auckland Univ.; Peiris, Hiranya V.; Frazer, Jonathan; Univ. of the Basque Country, Bilbao; Basque Foundation for Science, Bilbao; Easther, Richard
2015-11-01
Even simple inflationary scenarios have many free parameters. Beyond the variables appearing in the inflationary action, these include dynamical initial conditions, the number of fields, and couplings to other sectors. These quantities are often ignored but cosmological observables can depend on the unknown parameters. We use Bayesian networks to account for a large set of inflationary parameters, deriving generative models for the primordial spectra that are conditioned on a hierarchical set of prior probabilities describing the initial conditions, reheating physics, and other free parameters. We use N f -quadratic inflation as an illustrative example, finding that the number of e-folds N * between horizon exit for the pivot scale and the end of inflation is typically the most important parameter, even when the number of fields, their masses and initial conditions are unknown, along with possible conditional dependencies between these parameters.
Bonfatti, V; Tiezzi, F; Miglior, F; Carnier, P
2017-09-01
The objective of this study was to compare the prediction accuracy of 92 infrared prediction equations obtained by different statistical approaches. The predicted traits included fatty acid composition (n = 1,040); detailed protein composition (n = 1,137); lactoferrin (n = 558); pH and coagulation properties (n = 1,296); curd yield and composition obtained by a micro-cheese making procedure (n = 1,177); and Ca, P, Mg, and K contents (n = 689). The statistical methods used to develop the prediction equations were partial least squares regression (PLSR), Bayesian ridge regression, Bayes A, Bayes B, Bayes C, and Bayesian least absolute shrinkage and selection operator. Model performances were assessed, for each trait and model, in training and validation sets over 10 replicates. In validation sets, Bayesian regression models performed significantly better than PLSR for the prediction of 33 out of 92 traits, especially fatty acids, whereas they yielded a significantly lower prediction accuracy than PLSR in the prediction of 8 traits: the percentage of C18:1n-7 trans-9 in fat; the content of unglycosylated κ-casein and its percentage in protein; the content of α-lactalbumin; the percentage of α S2 -casein in protein; and the contents of Ca, P, and Mg. Even though Bayesian methods produced a significant enhancement of model accuracy in many traits compared with PLSR, most variations in the coefficient of determination in validation sets were smaller than 1 percentage point. Over traits, the highest predictive ability was obtained by Bayes C even though most of the significant differences in accuracy between Bayesian regression models were negligible. Copyright © 2017 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Bayesian Estimation of MIRT Models with General and Specific Latent Traits in MATLAB
Directory of Open Access Journals (Sweden)
Yanyan Sheng
2010-10-01
Full Text Available Multidimensional item response models have been developed to incorporate a general trait and several specific trait dimensions. Depending on the structure of these latent traits, different models can be considered. This paper provides the requisite information and description of software that implement the Gibbs sampling procedures for three such models with a normal ogive form. The software developed is written in the MATLAB package IRTm2noHA. The package is flexible enough to allow a user the choice to simulate binary response data with a latent structure involving general and specific traits, specify prior distributions for model parameters, check convergence of the MCMC chain, and obtain Bayesian fit statistics. Illustrative examples are provided to demonstrate and validate the use of the software package.
Bayesian modeling of ChIP-chip data using latent variables
Directory of Open Access Journals (Sweden)
Tian Yanan
2009-10-01
Full Text Available Abstract Background The ChIP-chip technology has been used in a wide range of biomedical studies, such as identification of human transcription factor binding sites, investigation of DNA methylation, and investigation of histone modifications in animals and plants. Various methods have been proposed in the literature for analyzing the ChIP-chip data, such as the sliding window methods, the hidden Markov model-based methods, and Bayesian methods. Although, due to the integrated consideration of uncertainty of the models and model parameters, Bayesian methods can potentially work better than the other two classes of methods, the existing Bayesian methods do not perform satisfactorily. They usually require multiple replicates or some extra experimental information to parametrize the model, and long CPU time due to involving of MCMC simulations. Results In this paper, we propose a Bayesian latent model for the ChIP-chip data. The new model mainly differs from the existing Bayesian models, such as the joint deconvolution model, the hierarchical gamma mixture model, and the Bayesian hierarchical model, in two respects. Firstly, it works on the difference between the averaged treatment and control samples. This enables the use of a simple model for the data, which avoids the probe-specific effect and the sample (control/treatment effect. As a consequence, this enables an efficient MCMC simulation of the posterior distribution of the model, and also makes the model more robust to the outliers. Secondly, it models the neighboring dependence of probes by introducing a latent indicator vector. A truncated Poisson prior distribution is assumed for the latent indicator variable, with the rationale being justified at length. Conclusion The Bayesian latent method is successfully applied to real and ten simulated datasets, with comparisons with some of the existing Bayesian methods, hidden Markov model methods, and sliding window methods. The numerical results
Bayesian modeling of ChIP-chip data using latent variables.
Wu, Mingqi
2009-10-26
BACKGROUND: The ChIP-chip technology has been used in a wide range of biomedical studies, such as identification of human transcription factor binding sites, investigation of DNA methylation, and investigation of histone modifications in animals and plants. Various methods have been proposed in the literature for analyzing the ChIP-chip data, such as the sliding window methods, the hidden Markov model-based methods, and Bayesian methods. Although, due to the integrated consideration of uncertainty of the models and model parameters, Bayesian methods can potentially work better than the other two classes of methods, the existing Bayesian methods do not perform satisfactorily. They usually require multiple replicates or some extra experimental information to parametrize the model, and long CPU time due to involving of MCMC simulations. RESULTS: In this paper, we propose a Bayesian latent model for the ChIP-chip data. The new model mainly differs from the existing Bayesian models, such as the joint deconvolution model, the hierarchical gamma mixture model, and the Bayesian hierarchical model, in two respects. Firstly, it works on the difference between the averaged treatment and control samples. This enables the use of a simple model for the data, which avoids the probe-specific effect and the sample (control/treatment) effect. As a consequence, this enables an efficient MCMC simulation of the posterior distribution of the model, and also makes the model more robust to the outliers. Secondly, it models the neighboring dependence of probes by introducing a latent indicator vector. A truncated Poisson prior distribution is assumed for the latent indicator variable, with the rationale being justified at length. CONCLUSION: The Bayesian latent method is successfully applied to real and ten simulated datasets, with comparisons with some of the existing Bayesian methods, hidden Markov model methods, and sliding window methods. The numerical results indicate that the
A novel Bayesian hierarchical model for road safety hotspot prediction.
Fawcett, Lee; Thorpe, Neil; Matthews, Joseph; Kremer, Karsten
2017-02-01
In this paper, we propose a Bayesian hierarchical model for predicting accident counts in future years at sites within a pool of potential road safety hotspots. The aim is to inform road safety practitioners of the location of likely future hotspots to enable a proactive, rather than reactive, approach to road safety scheme implementation. A feature of our model is the ability to rank sites according to their potential to exceed, in some future time period, a threshold accident count which may be used as a criterion for scheme implementation. Our model specification enables the classical empirical Bayes formulation - commonly used in before-and-after studies, wherein accident counts from a single before period are used to estimate counterfactual counts in the after period - to be extended to incorporate counts from multiple time periods. This allows site-specific variations in historical accident counts (e.g. locally-observed trends) to offset estimates of safety generated by a global accident prediction model (APM), which itself is used to help account for the effects of global trend and regression-to-mean (RTM). The Bayesian posterior predictive distribution is exploited to formulate predictions and to properly quantify our uncertainty in these predictions. The main contributions of our model include (i) the ability to allow accident counts from multiple time-points to inform predictions, with counts in more recent years lending more weight to predictions than counts from time-points further in the past; (ii) where appropriate, the ability to offset global estimates of trend by variations in accident counts observed locally, at a site-specific level; and (iii) the ability to account for unknown/unobserved site-specific factors which may affect accident counts. We illustrate our model with an application to accident counts at 734 potential hotspots in the German city of Halle; we also propose some simple diagnostics to validate the predictive capability of our
Taming Many-Parameter BSM Models with Bayesian Neural Networks
Kuchera, M. P.; Karbo, A.; Prosper, H. B.; Sanchez, A.; Taylor, J. Z.
2017-09-01
The search for physics Beyond the Standard Model (BSM) is a major focus of large-scale high energy physics experiments. One method is to look for specific deviations from the Standard Model that are predicted by BSM models. In cases where the model has a large number of free parameters, standard search methods become intractable due to computation time. This talk presents results using Bayesian Neural Networks, a supervised machine learning method, to enable the study of higher-dimensional models. The popular phenomenological Minimal Supersymmetric Standard Model was studied as an example of the feasibility and usefulness of this method. Graphics Processing Units (GPUs) are used to expedite the calculations. Cross-section predictions for 13 TeV proton collisions will be presented. My participation in the Conference Experience for Undergraduates (CEU) in 2004-2006 exposed me to the national and global significance of cutting-edge research. At the 2005 CEU, I presented work from the previous summer's SULI internship at Lawrence Berkeley Laboratory, where I learned to program while working on the Majorana Project. That work inspired me to follow a similar research path, which led me to my current work on computational methods applied to BSM physics.
Bayesian Network Webserver: a comprehensive tool for biological network modeling.
Ziebarth, Jesse D; Bhattacharya, Anindya; Cui, Yan
2013-11-01
The Bayesian Network Webserver (BNW) is a platform for comprehensive network modeling of systems genetics and other biological datasets. It allows users to quickly and seamlessly upload a dataset, learn the structure of the network model that best explains the data and use the model to understand relationships between network variables. Many datasets, including those used to create genetic network models, contain both discrete (e.g. genotype) and continuous (e.g. gene expression traits) variables, and BNW allows for modeling hybrid datasets. Users of BNW can incorporate prior knowledge during structure learning through an easy-to-use structural constraint interface. After structure learning, users are immediately presented with an interactive network model, which can be used to make testable hypotheses about network relationships. BNW, including a downloadable structure learning package, is available at http://compbio.uthsc.edu/BNW. (The BNW interface for adding structural constraints uses HTML5 features that are not supported by current version of Internet Explorer. We suggest using other browsers (e.g. Google Chrome or Mozilla Firefox) when accessing BNW). ycui2@uthsc.edu. Supplementary data are available at Bioinformatics online.
Bayesian analysis of inflation: Parameter estimation for single field models
International Nuclear Information System (INIS)
Mortonson, Michael J.; Peiris, Hiranya V.; Easther, Richard
2011-01-01
Future astrophysical data sets promise to strengthen constraints on models of inflation, and extracting these constraints requires methods and tools commensurate with the quality of the data. In this paper we describe ModeCode, a new, publicly available code that computes the primordial scalar and tensor power spectra for single-field inflationary models. ModeCode solves the inflationary mode equations numerically, avoiding the slow roll approximation. It is interfaced with CAMB and CosmoMC to compute cosmic microwave background angular power spectra and perform likelihood analysis and parameter estimation. ModeCode is easily extendable to additional models of inflation, and future updates will include Bayesian model comparison. Errors from ModeCode contribute negligibly to the error budget for analyses of data from Planck or other next generation experiments. We constrain representative single-field models (φ n with n=2/3, 1, 2, and 4, natural inflation, and 'hilltop' inflation) using current data, and provide forecasts for Planck. From current data, we obtain weak but nontrivial limits on the post-inflationary physics, which is a significant source of uncertainty in the predictions of inflationary models, while we find that Planck will dramatically improve these constraints. In particular, Planck will link the inflationary dynamics with the post-inflationary growth of the horizon, and thus begin to probe the ''primordial dark ages'' between TeV and grand unified theory scale energies.
Enhancing Flood Prediction Reliability Using Bayesian Model Averaging
Liu, Z.; Merwade, V.
2017-12-01
Uncertainty analysis is an indispensable part of modeling the hydrology and hydrodynamics of non-idealized environmental systems. Compared to reliance on prediction from one model simulation, using on ensemble of predictions that consider uncertainty from different sources is more reliable. In this study, Bayesian model averaging (BMA) is applied to Black River watershed in Arkansas and Missouri by combining multi-model simulations to get reliable deterministic water stage and probabilistic inundation extent predictions. The simulation ensemble is generated from 81 LISFLOOD-FP subgrid model configurations that include uncertainty from channel shape, channel width, channel roughness and discharge. Model simulation outputs are trained with observed water stage data during one flood event, and BMA prediction ability is validated for another flood event. Results from this study indicate that BMA does not always outperform all members in the ensemble, but it provides relatively robust deterministic flood stage predictions across the basin. Station based BMA (BMA_S) water stage prediction has better performance than global based BMA (BMA_G) prediction which is superior to the ensemble mean prediction. Additionally, high-frequency flood inundation extent (probability greater than 60%) in BMA_G probabilistic map is more accurate than the probabilistic flood inundation extent based on equal weights.
Estimating Parameters in Physical Models through Bayesian Inversion: A Complete Example
Allmaras, Moritz
2013-02-07
All mathematical models of real-world phenomena contain parameters that need to be estimated from measurements, either for realistic predictions or simply to understand the characteristics of the model. Bayesian statistics provides a framework for parameter estimation in which uncertainties about models and measurements are translated into uncertainties in estimates of parameters. This paper provides a simple, step-by-step example-starting from a physical experiment and going through all of the mathematics-to explain the use of Bayesian techniques for estimating the coefficients of gravity and air friction in the equations describing a falling body. In the experiment we dropped an object from a known height and recorded the free fall using a video camera. The video recording was analyzed frame by frame to obtain the distance the body had fallen as a function of time, including measures of uncertainty in our data that we describe as probability densities. We explain the decisions behind the various choices of probability distributions and relate them to observed phenomena. Our measured data are then combined with a mathematical model of a falling body to obtain probability densities on the space of parameters we seek to estimate. We interpret these results and discuss sources of errors in our estimation procedure. © 2013 Society for Industrial and Applied Mathematics.
Bayesian model ensembling using meta-trained recurrent neural networks
Ambrogioni, L.; Berezutskaya, Y.; Gü ç lü , U.; Borne, E.W.P. van den; Gü ç lü tü rk, Y.; Gerven, M.A.J. van; Maris, E.G.G.
2017-01-01
In this paper we demonstrate that a recurrent neural network meta-trained on an ensemble of arbitrary classification tasks can be used as an approximation of the Bayes optimal classifier. This result is obtained by relying on the framework of e-free approximate Bayesian inference, where the Bayesian
A Bayesian Reformulation of the Extended Drift-Diffusion Model in Perceptual Decision Making
Fard, Pouyan R.; Park, Hame; Warkentin, Andrej; Kiebel, Stefan J.; Bitzer, Sebastian
2017-01-01
Perceptual decision making can be described as a process of accumulating evidence to a bound which has been formalized within drift-diffusion models (DDMs). Recently, an equivalent Bayesian model has been proposed. In contrast to standard DDMs, this Bayesian model directly links information in the stimulus to the decision process. Here, we extend this Bayesian model further and allow inter-trial variability of two parameters following the extended version of the DDM. We derive parameter distributions for the Bayesian model and show that they lead to predictions that are qualitatively equivalent to those made by the extended drift-diffusion model (eDDM). Further, we demonstrate the usefulness of the extended Bayesian model (eBM) for the analysis of concrete behavioral data. Specifically, using Bayesian model selection, we find evidence that including additional inter-trial parameter variability provides for a better model, when the model is constrained by trial-wise stimulus features. This result is remarkable because it was derived using just 200 trials per condition, which is typically thought to be insufficient for identifying variability parameters in DDMs. In sum, we present a Bayesian analysis, which provides for a novel and promising analysis of perceptual decision making experiments. PMID:28553219
A Bayesian Reformulation of the Extended Drift-Diffusion Model in Perceptual Decision Making
Directory of Open Access Journals (Sweden)
Pouyan R. Fard
2017-05-01
Full Text Available Perceptual decision making can be described as a process of accumulating evidence to a bound which has been formalized within drift-diffusion models (DDMs. Recently, an equivalent Bayesian model has been proposed. In contrast to standard DDMs, this Bayesian model directly links information in the stimulus to the decision process. Here, we extend this Bayesian model further and allow inter-trial variability of two parameters following the extended version of the DDM. We derive parameter distributions for the Bayesian model and show that they lead to predictions that are qualitatively equivalent to those made by the extended drift-diffusion model (eDDM. Further, we demonstrate the usefulness of the extended Bayesian model (eBM for the analysis of concrete behavioral data. Specifically, using Bayesian model selection, we find evidence that including additional inter-trial parameter variability provides for a better model, when the model is constrained by trial-wise stimulus features. This result is remarkable because it was derived using just 200 trials per condition, which is typically thought to be insufficient for identifying variability parameters in DDMs. In sum, we present a Bayesian analysis, which provides for a novel and promising analysis of perceptual decision making experiments.
Chen, Cong; Zhang, Guohui; Tarefder, Rafiqul; Ma, Jianming; Wei, Heng; Guan, Hongzhi
2015-07-01
Rear-end crash is one of the most common types of traffic crashes in the U.S. A good understanding of its characteristics and contributing factors is of practical importance. Previously, both multinomial Logit models and Bayesian network methods have been used in crash modeling and analysis, respectively, although each of them has its own application restrictions and limitations. In this study, a hybrid approach is developed to combine multinomial logit models and Bayesian network methods for comprehensively analyzing driver injury severities in rear-end crashes based on state-wide crash data collected in New Mexico from 2010 to 2011. A multinomial logit model is developed to investigate and identify significant contributing factors for rear-end crash driver injury severities classified into three categories: no injury, injury, and fatality. Then, the identified significant factors are utilized to establish a Bayesian network to explicitly formulate statistical associations between injury severity outcomes and explanatory attributes, including driver behavior, demographic features, vehicle factors, geometric and environmental characteristics, etc. The test results demonstrate that the proposed hybrid approach performs reasonably well. The Bayesian network reference analyses indicate that the factors including truck-involvement, inferior lighting conditions, windy weather conditions, the number of vehicles involved, etc. could significantly increase driver injury severities in rear-end crashes. The developed methodology and estimation results provide insights for developing effective countermeasures to reduce rear-end crash injury severities and improve traffic system safety performance. Copyright © 2015 Elsevier Ltd. All rights reserved.
DEFF Research Database (Denmark)
Iglesias, J. E.; Sabuncu, M. R.; Van Leemput, Koen
2012-01-01
Many successful segmentation algorithms are based on Bayesian models in which prior anatomical knowledge is combined with the available image information. However, these methods typically have many free parameters that are estimated to obtain point estimates only, whereas a faithful Bayesian anal...
A Bayesian Approach to Person Fit Analysis in Item Response Theory Models. Research Report.
Glas, Cees A. W.; Meijer, Rob R.
A Bayesian approach to the evaluation of person fit in item response theory (IRT) models is presented. In a posterior predictive check, the observed value on a discrepancy variable is positioned in its posterior distribution. In a Bayesian framework, a Markov Chain Monte Carlo procedure can be used to generate samples of the posterior distribution…
More Bayesian Transdimensional Inversion for Thermal History Modelling (Invited)
Gallagher, K.
2013-12-01
Since the publication of Dodson (1973) quantifying the relationship between geochronogical ages and closure temperatures, an ongoing concern in thermochronology is reconstruction of thermal histories consistent with the measured data. Extracting this thermal history information is best treated as an inverse problem, given the complex relationship between the observations and the thermal history. When solving the inverse problem (i.e. finding thermal acceptable thermal histories), stochastic sampling methods have often been used, as these are relatively global when searching the model space. However, the issue remains how best to estimate those parts of the thermal history unconstrained by independent information, i.e. what is required to fit the data ? To solve this general problem, we use a Bayesian transdimensional Markov Chain Monte Carlo method and this has been integrated into user-friendly software, QTQt (Quantitative Thermochronology with Qt), which runs on both Macintosh and PC. The Bayesian approach allows us to consider a wide range of possible thermal history as general prior information on time, temperature (and temperature offset for multiple samples in a vertical profile). We can also incorporate more focussed geological constraints in terms of more specific priors. In this framework, it is the data themselves (and their errors) that determine the complexity of the thermal history solutions. For example, more precise data will justify a more complex solution, while more noisy data will be happy with simpler solutions. We can express complexity in terms of the number of time-temperature points defining the total thermal history. Another useful feature of this method is that was can easily deal with imprecise parameter values (e.g. kinetics, data errors), by drawing samples from a user specified probability distribution, rather than using a single value. Finally, the method can be applied to either single samples, or multiple samples (from a borehole or
Rational Irrationality: Modeling Climate Change Belief Polarization Using Bayesian Networks.
Cook, John; Lewandowsky, Stephan
2016-01-01
Belief polarization is said to occur when two people respond to the same evidence by updating their beliefs in opposite directions. This response is considered to be "irrational" because it involves contrary updating, a form of belief updating that appears to violate normatively optimal responding, as for example dictated by Bayes' theorem. In light of much evidence that people are capable of normatively optimal behavior, belief polarization presents a puzzling exception. We show that Bayesian networks, or Bayes nets, can simulate rational belief updating. When fit to experimental data, Bayes nets can help identify the factors that contribute to polarization. We present a study into belief updating concerning the reality of climate change in response to information about the scientific consensus on anthropogenic global warming (AGW). The study used representative samples of Australian and U.S. Among Australians, consensus information partially neutralized the influence of worldview, with free-market supporters showing a greater increase in acceptance of human-caused global warming relative to free-market opponents. In contrast, while consensus information overall had a positive effect on perceived consensus among U.S. participants, there was a reduction in perceived consensus and acceptance of human-caused global warming for strong supporters of unregulated free markets. Fitting a Bayes net model to the data indicated that under a Bayesian framework, free-market support is a significant driver of beliefs about climate change and trust in climate scientists. Further, active distrust of climate scientists among a small number of U.S. conservatives drives contrary updating in response to consensus information among this particular group. Copyright © 2016 Cognitive Science Society, Inc.
Model-based Bayesian signal extraction algorithm for peripheral nerves
Eggers, Thomas E.; Dweiri, Yazan M.; McCallum, Grant A.; Durand, Dominique M.
2017-10-01
Objective. Multi-channel cuff electrodes have recently been investigated for extracting fascicular-level motor commands from mixed neural recordings. Such signals could provide volitional, intuitive control over a robotic prosthesis for amputee patients. Recent work has demonstrated success in extracting these signals in acute and chronic preparations using spatial filtering techniques. These extracted signals, however, had low signal-to-noise ratios and thus limited their utility to binary classification. In this work a new algorithm is proposed which combines previous source localization approaches to create a model based method which operates in real time. Approach. To validate this algorithm, a saline benchtop setup was created to allow the precise placement of artificial sources within a cuff and interference sources outside the cuff. The artificial source was taken from five seconds of chronic neural activity to replicate realistic recordings. The proposed algorithm, hybrid Bayesian signal extraction (HBSE), is then compared to previous algorithms, beamforming and a Bayesian spatial filtering method, on this test data. An example chronic neural recording is also analyzed with all three algorithms. Main results. The proposed algorithm improved the signal to noise and signal to interference ratio of extracted test signals two to three fold, as well as increased the correlation coefficient between the original and recovered signals by 10–20%. These improvements translated to the chronic recording example and increased the calculated bit rate between the recovered signals and the recorded motor activity. Significance. HBSE significantly outperforms previous algorithms in extracting realistic neural signals, even in the presence of external noise sources. These results demonstrate the feasibility of extracting dynamic motor signals from a multi-fascicled intact nerve trunk, which in turn could extract motor command signals from an amputee for the end goal of
A Bayesian Model of Category-Specific Emotional Brain Responses
Wager, Tor D.; Kang, Jian; Johnson, Timothy D.; Nichols, Thomas E.; Satpute, Ajay B.; Barrett, Lisa Feldman
2015-01-01
Understanding emotion is critical for a science of healthy and disordered brain function, but the neurophysiological basis of emotional experience is still poorly understood. We analyzed human brain activity patterns from 148 studies of emotion categories (2159 total participants) using a novel hierarchical Bayesian model. The model allowed us to classify which of five categories—fear, anger, disgust, sadness, or happiness—is engaged by a study with 66% accuracy (43-86% across categories). Analyses of the activity patterns encoded in the model revealed that each emotion category is associated with unique, prototypical patterns of activity across multiple brain systems including the cortex, thalamus, amygdala, and other structures. The results indicate that emotion categories are not contained within any one region or system, but are represented as configurations across multiple brain networks. The model provides a precise summary of the prototypical patterns for each emotion category, and demonstrates that a sufficient characterization of emotion categories relies on (a) differential patterns of involvement in neocortical systems that differ between humans and other species, and (b) distinctive patterns of cortical-subcortical interactions. Thus, these findings are incompatible with several contemporary theories of emotion, including those that emphasize emotion-dedicated brain systems and those that propose emotion is localized primarily in subcortical activity. They are consistent with componential and constructionist views, which propose that emotions are differentiated by a combination of perceptual, mnemonic, prospective, and motivational elements. Such brain-based models of emotion provide a foundation for new translational and clinical approaches. PMID:25853490
Statistical Analysis by Statistical Physics Model for the STOCK Markets
Wang, Tiansong; Wang, Jun; Fan, Bingli
A new stochastic stock price model of stock markets based on the contact process of the statistical physics systems is presented in this paper, where the contact model is a continuous time Markov process, one interpretation of this model is as a model for the spread of an infection. Through this model, the statistical properties of Shanghai Stock Exchange (SSE) and Shenzhen Stock Exchange (SZSE) are studied. In the present paper, the data of SSE Composite Index and the data of SZSE Component Index are analyzed, and the corresponding simulation is made by the computer computation. Further, we investigate the statistical properties, fat-tail phenomena, the power-law distributions, and the long memory of returns for these indices. The techniques of skewness-kurtosis test, Kolmogorov-Smirnov test, and R/S analysis are applied to study the fluctuation characters of the stock price returns.
Bayesian Genomic Prediction with Genotype × Environment Interaction Kernel Models
Cuevas, Jaime; Crossa, José; Montesinos-López, Osval A.; Burgueño, Juan; Pérez-Rodríguez, Paulino; de los Campos, Gustavo
2016-01-01
The phenomenon of genotype × environment (G × E) interaction in plant breeding decreases selection accuracy, thereby negatively affecting genetic gains. Several genomic prediction models incorporating G × E have been recently developed and used in genomic selection of plant breeding programs. Genomic prediction models for assessing multi-environment G × E interaction are extensions of a single-environment model, and have advantages and limitations. In this study, we propose two multi-environment Bayesian genomic models: the first model considers genetic effects (u) that can be assessed by the Kronecker product of variance–covariance matrices of genetic correlations between environments and genomic kernels through markers under two linear kernel methods, linear (genomic best linear unbiased predictors, GBLUP) and Gaussian (Gaussian kernel, GK). The other model has the same genetic component as the first model (u) plus an extra component, f, that captures random effects between environments that were not captured by the random effects u. We used five CIMMYT data sets (one maize and four wheat) that were previously used in different studies. Results show that models with G × E always have superior prediction ability than single-environment models, and the higher prediction ability of multi-environment models with u and f over the multi-environment model with only u occurred 85% of the time with GBLUP and 45% of the time with GK across the five data sets. The latter result indicated that including the random effect f is still beneficial for increasing prediction ability after adjusting by the random effect u. PMID:27793970
Bayesian Genomic Prediction with Genotype × Environment Interaction Kernel Models
Directory of Open Access Journals (Sweden)
Jaime Cuevas
2017-01-01
Full Text Available The phenomenon of genotype × environment (G × E interaction in plant breeding decreases selection accuracy, thereby negatively affecting genetic gains. Several genomic prediction models incorporating G × E have been recently developed and used in genomic selection of plant breeding programs. Genomic prediction models for assessing multi-environment G × E interaction are extensions of a single-environment model, and have advantages and limitations. In this study, we propose two multi-environment Bayesian genomic models: the first model considers genetic effects ( u that can be assessed by the Kronecker product of variance–covariance matrices of genetic correlations between environments and genomic kernels through markers under two linear kernel methods, linear (genomic best linear unbiased predictors, GBLUP and Gaussian (Gaussian kernel, GK. The other model has the same genetic component as the first model ( u plus an extra component, f, that captures random effects between environments that were not captured by the random effects u . We used five CIMMYT data sets (one maize and four wheat that were previously used in different studies. Results show that models with G × E always have superior prediction ability than single-environment models, and the higher prediction ability of multi-environment models with u and f over the multi-environment model with only u occurred 85% of the time with GBLUP and 45% of the time with GK across the five data sets. The latter result indicated that including the random effect f is still beneficial for increasing prediction ability after adjusting by the random effect u .
Bayesian Genomic Prediction with Genotype × Environment Interaction Kernel Models.
Cuevas, Jaime; Crossa, José; Montesinos-López, Osval A; Burgueño, Juan; Pérez-Rodríguez, Paulino; de Los Campos, Gustavo
2017-01-05
The phenomenon of genotype × environment (G × E) interaction in plant breeding decreases selection accuracy, thereby negatively affecting genetic gains. Several genomic prediction models incorporating G × E have been recently developed and used in genomic selection of plant breeding programs. Genomic prediction models for assessing multi-environment G × E interaction are extensions of a single-environment model, and have advantages and limitations. In this study, we propose two multi-environment Bayesian genomic models: the first model considers genetic effects [Formula: see text] that can be assessed by the Kronecker product of variance-covariance matrices of genetic correlations between environments and genomic kernels through markers under two linear kernel methods, linear (genomic best linear unbiased predictors, GBLUP) and Gaussian (Gaussian kernel, GK). The other model has the same genetic component as the first model [Formula: see text] plus an extra component, F: , that captures random effects between environments that were not captured by the random effects [Formula: see text] We used five CIMMYT data sets (one maize and four wheat) that were previously used in different studies. Results show that models with G × E always have superior prediction ability than single-environment models, and the higher prediction ability of multi-environment models with [Formula: see text] over the multi-environment model with only u occurred 85% of the time with GBLUP and 45% of the time with GK across the five data sets. The latter result indicated that including the random effect f is still beneficial for increasing prediction ability after adjusting by the random effect [Formula: see text]. Copyright © 2017 Cuevas et al.
Bazin, Eric; Dawson, Kevin J; Beaumont, Mark A
2010-06-01
We address the problem of finding evidence of natural selection from genetic data, accounting for the confounding effects of demographic history. In the absence of natural selection, gene genealogies should all be sampled from the same underlying distribution, often approximated by a coalescent model. Selection at a particular locus will lead to a modified genealogy, and this motivates a number of recent approaches for detecting the effects of natural selection in the genome as "outliers" under some models. The demographic history of a population affects the sampling distribution of genealogies, and therefore the observed genotypes and the classification of outliers. Since we cannot see genealogies directly, we have to infer them from the observed data under some model of mutation and demography. Thus the accuracy of an outlier-based approach depends to a greater or a lesser extent on the uncertainty about the demographic and mutational model. A natural modeling framework for this type of problem is provided by Bayesian hierarchical models, in which parameters, such as mutation rates and selection coefficients, are allowed to vary across loci. It has proved quite difficult computationally to implement fully probabilistic genealogical models with complex demographies, and this has motivated the development of approximations such as approximate Bayesian computation (ABC). In ABC the data are compressed into summary statistics, and computation of the likelihood function is replaced by simulation of data under the model. In a hierarchical setting one may be interested both in hyperparameters and parameters, and there may be very many of the latter--for example, in a genetic model, these may be parameters describing each of many loci or populations. This poses a problem for ABC in that one then requires summary statistics for each locus, which, if used naively, leads to a consequent difficulty in conditional density estimation. We develop a general method for applying
Nonlinear regression modeling of nutrient loads in streams: A Bayesian approach
Qian, S.S.; Reckhow, K.H.; Zhai, J.; McMahon, G.
2005-01-01
A Bayesian nonlinear regression modeling method is introduced and compared with the least squares method for modeling nutrient loads in stream networks. The objective of the study is to better model spatial correlation in river basin hydrology and land use for improving the model as a forecasting tool. The Bayesian modeling approach is introduced in three steps, each with a more complicated model and data error structure. The approach is illustrated using a data set from three large river basins in eastern North Carolina. Results indicate that the Bayesian model better accounts for model and data uncertainties than does the conventional least squares approach. Applications of the Bayesian models for ambient water quality standards compliance and TMDL assessment are discussed. Copyright 2005 by the American Geophysical Union.
In this paper, the Genetic Algorithms (GA) and Bayesian model averaging (BMA) were combined to simultaneously conduct calibration and uncertainty analysis for the Soil and Water Assessment Tool (SWAT). In this hybrid method, several SWAT models with different structures are first selected; next GA i...
Etienne, RS; Olff, H
Species abundances are undoubtedly the most widely available macroecological data, but can we use them to distinguish among several models of community structure? Here we present a Bayesian analysis of species-abundance data that yields a full joint probability distribution of each model's
Etienne, R.S.; Olff, H.
2005-01-01
Species abundances are undoubtedly the most widely available macroecological data, but can we use them to distinguish among several models of community structure? Here we present a Bayesian analysis of species-abundance data that yields a full joint probability distribution of each model's
Bayesian analysis for exponential random graph models using the adaptive exchange sampler
Jin, Ick Hoon
2013-01-01
Exponential random graph models have been widely used in social network analysis. However, these models are extremely difficult to handle from a statistical viewpoint, because of the existence of intractable normalizing constants. In this paper, we consider a fully Bayesian analysis for exponential random graph models using the adaptive exchange sampler, which solves the issue of intractable normalizing constants encountered in Markov chain Monte Carlo (MCMC) simulations. The adaptive exchange sampler can be viewed as a MCMC extension of the exchange algorithm, and it generates auxiliary networks via an importance sampling procedure from an auxiliary Markov chain running in parallel. The convergence of this algorithm is established under mild conditions. The adaptive exchange sampler is illustrated using a few social networks, including the Florentine business network, molecule synthetic network, and dolphins network. The results indicate that the adaptive exchange algorithm can produce more accurate estimates than approximate exchange algorithms, while maintaining the same computational efficiency.
Bayesian unknown change-point models to investigate immediacy in single case designs.
Natesan, Prathiba; Hedges, Larry V
2017-12-01
Although immediacy is one of the necessary criteria to show strong evidence of a causal relation in single case designs (SCDs), no inferential statistical tool is currently used to demonstrate it. We propose a Bayesian unknown change-point model to investigate and quantify immediacy in SCD analysis. Unlike visual analysis that considers only 3-5 observations in consecutive phases to investigate immediacy, this model considers all data points. Immediacy is indicated when the posterior distribution of the unknown change-point is narrow around the true value of the change-point. This model can accommodate delayed effects. Monte Carlo simulation for a 2-phase design shows that the posterior standard deviations of the change-points decrease with increase in standardized mean difference between phases and decrease in test length. This method is illustrated with real data. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Directory of Open Access Journals (Sweden)
Gerhard Moser
2015-04-01
Full Text Available Gene discovery, estimation of heritability captured by SNP arrays, inference on genetic architecture and prediction analyses of complex traits are usually performed using different statistical models and methods, leading to inefficiency and loss of power. Here we use a Bayesian mixture model that simultaneously allows variant discovery, estimation of genetic variance explained by all variants and prediction of unobserved phenotypes in new samples. We apply the method to simulated data of quantitative traits and Welcome Trust Case Control Consortium (WTCCC data on disease and show that it provides accurate estimates of SNP-based heritability, produces unbiased estimators of risk in new samples, and that it can estimate genetic architecture by partitioning variation across hundreds to thousands of SNPs. We estimated that, depending on the trait, 2,633 to 9,411 SNPs explain all of the SNP-based heritability in the WTCCC diseases. The majority of those SNPs (>96% had small effects, confirming a substantial polygenic component to common diseases. The proportion of the SNP-based variance explained by large effects (each SNP explaining 1% of the variance varied markedly between diseases, ranging from almost zero for bipolar disorder to 72% for type 1 diabetes. Prediction analyses demonstrate that for diseases with major loci, such as type 1 diabetes and rheumatoid arthritis, Bayesian methods outperform profile scoring or mixed model approaches.
STATISTICS OF MEASURING NEUTRON STAR RADII: ASSESSING A FREQUENTIST AND A BAYESIAN APPROACH
Energy Technology Data Exchange (ETDEWEB)
Özel, Feryal; Psaltis, Dimitrios [Departments of Astronomy and Physics, University of Arizona, 933 N. Cherry Avenue, Tucson, AZ 85721 (United States)
2015-09-10
Measuring neutron star radii with spectroscopic and timing techniques relies on the combination of multiple observables to break the degeneracies between the mass and radius introduced by general relativistic effects. Here, we explore a previously used frequentist and a newly proposed Bayesian framework to obtain the most likely value and the uncertainty in such a measurement. We find that for the expected range of masses and radii and for realistic measurement errors, the frequentist approach suffers from biases that are larger than the accuracy in the radius measurement required to distinguish between the different equations of state. In contrast, in the Bayesian framework, the inferred uncertainties are larger, but the most likely values do not suffer from such biases. We also investigate ways of quantifying the degree of consistency between different spectroscopic measurements from a single source. We show that a careful assessment of the systematic uncertainties in the measurements eliminates the need for introducing ad hoc biases, which lead to artificially large inferred radii.
Extraction of Airways with Probabilistic State-Space Models and Bayesian Smoothing
DEFF Research Database (Denmark)
Raghavendra, Selvan; Petersen, Jens; Pedersen, Jesper Johannes Holst
of elongated branches using probabilistic state-space models and Bayesian smoothing. Unlike most existing methods that proceed with sequential tracking of branches, we present an exploratory method, that is less sensitive to local anomalies in the data due to acquisition noise and/or interfering structures....... The evolution of individual branches is modelled using a process model and the observed data is incorporated into the update step of the Bayesian smoother using a measurement model that is based on a multi-scale blob detector. Bayesian smoothing is performed using the RTS (Rauch-Tung-Striebel) smoother, which...
A. Zellner (Arnold); L. Bauwens (Luc); H.K. van Dijk (Herman)
1988-01-01
textabstractBayesian procedures for specification analysis or diagnostic checking of modeling assumptions for structural equations of econometric models are developed and applied using Monte Carlo numerical methods. Checks on the validity of identifying restrictions, exogeneity assumptions and other
Directory of Open Access Journals (Sweden)
A. E. Sikorska
2012-04-01
Full Text Available Urbanization and the resulting land-use change strongly affect the water cycle and runoff-processes in watersheds. Unfortunately, small urban watersheds, which are most affected by urban sprawl, are mostly ungauged. This makes it intrinsically difficult to assess the consequences of urbanization. Most of all, it is unclear how to reliably assess the predictive uncertainty given the structural deficits of the applied models. In this study, we therefore investigate the uncertainty of flood predictions in ungauged urban basins from structurally uncertain rainfall-runoff models. To this end, we suggest a procedure to explicitly account for input uncertainty and model structure deficits using Bayesian statistics with a continuous-time autoregressive error model. In addition, we propose a concise procedure to derive prior parameter distributions from base data and successfully apply the methodology to an urban catchment in Warsaw, Poland. Based on our results, we are able to demonstrate that the autoregressive error model greatly helps to meet the statistical assumptions and to compute reliable prediction intervals. In our study, we found that predicted peak flows were up to 7 times higher than observations. This was reduced to 5 times with Bayesian updating, using only few discharge measurements. In addition, our analysis suggests that imprecise rainfall information and model structure deficits contribute mostly to the total prediction uncertainty. In the future, flood predictions in ungauged basins will become more important due to ongoing urbanization as well as anthropogenic and climatic changes. Thus, providing reliable measures of uncertainty is crucial to support decision making.
Directory of Open Access Journals (Sweden)
Robert William Rankin
2016-03-01
Full Text Available We present a Hierarchical Bayesian version of Pollock's Closed Robust Design for studying the survival, temporary-migration, and abundance of marked animals. Through simulations and analyses of a bottlenose dolphin photo-identification dataset, we compare several estimation frameworks, including Maximum Likelihood estimation (ML, model-averaging by AICc, as well as Bayesian and Hierarchical Bayesian (HB procedures. Our results demonstrate a number of advantages of the Bayesian framework over other popular methods. First, for simple fixed-effect models, we show the near-equivalence of Bayesian and ML point-estimates and confidence/credibility intervals. Second, we demonstrate how there is an inherent correlation among temporary-migration and survival parameter estimates in the PCRD, and while this can lead to serious convergence issues and singularities among MLEs, we show that the Bayesian estimates were more reliable. Third, we demonstrate that a Hierarchical Bayesian model with carefully thought-out hyperpriors, can lead to similar parameter estimates and conclusions as multi-model inference by AICc model-averaging. This latter point is especially interesting for mark-recapture practitioners, for whom model-uncertainty and multi-model inference have become a major preoccupation. Lastly, we extend the Hierarchical Bayesian PCRD to include full-capture histories (i.e., by modelling a recruitment process and individual-level heterogeneity in detection probabilities, which can have important consequences for the range of phenomena studied by the PCRD, as well as lead to large differences in abundance estimates. For example, we estimate 8%-24% more bottlenose dolphins in the western gulf of Shark Bay than previously estimated by ML and AICc-based model-averaging. Other important extensions are discussed. Our Bayesian PCRD models are written in the BUGS-like JAGS language for easy dissemination and customization by the community of capture
SU-E-T-51: Bayesian Network Models for Radiotherapy Error Detection
Energy Technology Data Exchange (ETDEWEB)
Kalet, A; Phillips, M; Gennari, J [UniversityWashington, Seattle, WA (United States)
2014-06-01
Purpose: To develop a probabilistic model of radiotherapy plans using Bayesian networks that will detect potential errors in radiation delivery. Methods: Semi-structured interviews with medical physicists and other domain experts were employed to generate a set of layered nodes and arcs forming a Bayesian Network (BN) which encapsulates relevant radiotherapy concepts and their associated interdependencies. Concepts in the final network were limited to those whose parameters are represented in the institutional database at a level significant enough to develop mathematical distributions. The concept-relation knowledge base was constructed using the Web Ontology Language (OWL) and translated into Hugin Expert Bayes Network files via the the RHugin package in the R statistical programming language. A subset of de-identified data derived from a Mosaiq relational database representing 1937 unique prescription cases was processed and pre-screened for errors and then used by the Hugin implementation of the Estimation-Maximization (EM) algorithm for machine learning all parameter distributions. Individual networks were generated for each of several commonly treated anatomic regions identified by ICD-9 neoplasm categories including lung, brain, lymphoma, and female breast. Results: The resulting Bayesian networks represent a large part of the probabilistic knowledge inherent in treatment planning. By populating the networks entirely with data captured from a clinical oncology information management system over the course of several years of normal practice, we were able to create accurate probability tables with no additional time spent by experts or clinicians. These probabilistic descriptions of the treatment planning allow one to check if a treatment plan is within the normal scope of practice, given some initial set of clinical evidence and thereby detect for potential outliers to be flagged for further investigation. Conclusion: The networks developed here support the
SU-E-T-51: Bayesian Network Models for Radiotherapy Error Detection
International Nuclear Information System (INIS)
Kalet, A; Phillips, M; Gennari, J
2014-01-01
Purpose: To develop a probabilistic model of radiotherapy plans using Bayesian networks that will detect potential errors in radiation delivery. Methods: Semi-structured interviews with medical physicists and other domain experts were employed to generate a set of layered nodes and arcs forming a Bayesian Network (BN) which encapsulates relevant radiotherapy concepts and their associated interdependencies. Concepts in the final network were limited to those whose parameters are represented in the institutional database at a level significant enough to develop mathematical distributions. The concept-relation knowledge base was constructed using the Web Ontology Language (OWL) and translated into Hugin Expert Bayes Network files via the the RHugin package in the R statistical programming language. A subset of de-identified data derived from a Mosaiq relational database representing 1937 unique prescription cases was processed and pre-screened for errors and then used by the Hugin implementation of the Estimation-Maximization (EM) algorithm for machine learning all parameter distributions. Individual networks were generated for each of several commonly treated anatomic regions identified by ICD-9 neoplasm categories including lung, brain, lymphoma, and female breast. Results: The resulting Bayesian networks represent a large part of the probabilistic knowledge inherent in treatment planning. By populating the networks entirely with data captured from a clinical oncology information management system over the course of several years of normal practice, we were able to create accurate probability tables with no additional time spent by experts or clinicians. These probabilistic descriptions of the treatment planning allow one to check if a treatment plan is within the normal scope of practice, given some initial set of clinical evidence and thereby detect for potential outliers to be flagged for further investigation. Conclusion: The networks developed here support the
Bayesian nonparametric clustering in phylogenetics: modeling antigenic evolution in influenza.
Cybis, Gabriela B; Sinsheimer, Janet S; Bedford, Trevor; Rambaut, Andrew; Lemey, Philippe; Suchard, Marc A
2018-01-30
Influenza is responsible for up to 500,000 deaths every year, and antigenic variability represents much of its epidemiological burden. To visualize antigenic differences across many viral strains, antigenic cartography methods use multidimensional scaling on binding assay data to map influenza antigenicity onto a low-dimensional space. Analysis of such assay data ideally leads to natural clustering of influenza strains of similar antigenicity that correlate with sequence evolution. To understand the dynamics of these antigenic groups, we present a framework that jointly models genetic and antigenic evolution by combining multidimensional scaling of binding assay data, Bayesian phylogenetic machinery and nonparametric clustering methods. We propose a phylogenetic Chinese restaurant process that extends the current process to incorporate the phylogenetic dependency structure between strains in the modeling of antigenic clusters. With this method, we are able to use the genetic information to better understand the evolution of antigenicity throughout epidemics, as shown in applications of this model to H1N1 influenza. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.
A Bayesian semiparametric Markov regression model for juvenile dermatomyositis.
De Iorio, Maria; Gallot, Natacha; Valcarcel, Beatriz; Wedderburn, Lucy
2018-02-20
Juvenile dermatomyositis (JDM) is a rare autoimmune disease that may lead to serious complications, even to death. We develop a 2-state Markov regression model in a Bayesian framework to characterise disease progression in JDM over time and gain a better understanding of the factors influencing disease risk. The transition probabilities between disease and remission state (and vice versa) are a function of time-homogeneous and time-varying covariates. These latter types of covariates are introduced in the model through a latent health state function, which describes patient-specific health over time and accounts for variability among patients. We assume a nonparametric prior based on the Dirichlet process to model the health state function and the baseline transition intensities between disease and remission state and vice versa. The Dirichlet process induces a clustering of the patients in homogeneous risk groups. To highlight clinical variables that most affect the transition probabilities, we perform variable selection using spike and slab prior distributions. Posterior inference is performed through Markov chain Monte Carlo methods. Data were made available from the UK JDM Cohort and Biomarker Study and Repository, hosted at the UCL Institute of Child Health. Copyright © 2018 John Wiley & Sons, Ltd.
Bayesian network model of crowd emotion and negative behavior
Ramli, Nurulhuda; Ghani, Noraida Abdul; Hatta, Zulkarnain Ahmad; Hashim, Intan Hashimah Mohd; Sulong, Jasni; Mahudin, Nor Diana Mohd; Rahman, Shukran Abd; Saad, Zarina Mat
2014-12-01
The effects of overcrowding have become a major concern for event organizers. One aspect of this concern has been the idea that overcrowding can enhance the occurrence of serious incidents during events. As one of the largest Muslim religious gathering attended by pilgrims from all over the world, Hajj has become extremely overcrowded with many incidents being reported. The purpose of this study is to analyze the nature of human emotion and negative behavior resulting from overcrowding during Hajj events from data gathered in Malaysian Hajj Experience Survey in 2013. The sample comprised of 147 Malaysian pilgrims (70 males and 77 females). Utilizing a probabilistic model called Bayesian network, this paper models the dependence structure between different emotions and negative behaviors of pilgrims in the crowd. The model included the following variables of emotion: negative, negative comfortable, positive, positive comfortable and positive spiritual and variables of negative behaviors; aggressive and hazardous acts. The study demonstrated that emotions of negative, negative comfortable, positive spiritual and positive emotion have a direct influence on aggressive behavior whereas emotion of negative comfortable, positive spiritual and positive have a direct influence on hazardous acts behavior. The sensitivity analysis showed that a low level of negative and negative comfortable emotions leads to a lower level of aggressive and hazardous behavior. Findings of the study can be further improved to identify the exact cause and risk factors of crowd-related incidents in preventing crowd disasters during the mass gathering events.
Flexible Bayesian Dynamic Modeling of Covariance and Correlation Matrices
Lan, Shiwei
2017-11-08
Modeling covariance (and correlation) matrices is a challenging problem due to the large dimensionality and positive-definiteness constraint. In this paper, we propose a novel Bayesian framework based on decomposing the covariance matrix into variance and correlation matrices. The highlight is that the correlations are represented as products of vectors on unit spheres. We propose a variety of distributions on spheres (e.g. the squared-Dirichlet distribution) to induce flexible prior distributions for covariance matrices that go beyond the commonly used inverse-Wishart prior. To handle the intractability of the resulting posterior, we introduce the adaptive $\\\\Delta$-Spherical Hamiltonian Monte Carlo. We also extend our structured framework to dynamic cases and introduce unit-vector Gaussian process priors for modeling the evolution of correlation among multiple time series. Using an example of Normal-Inverse-Wishart problem, a simulated periodic process, and an analysis of local field potential data (collected from the hippocampus of rats performing a complex sequence memory task), we demonstrated the validity and effectiveness of our proposed framework for (dynamic) modeling covariance and correlation matrices.
Bayesian models for astrophysical data using R, JAGS, Python, and Stan
Hilbe, Joseph M; Ishida, Emille E O
2017-01-01
This comprehensive guide to Bayesian methods in astronomy enables hands-on work by supplying complete R, JAGS, Python, and Stan code, to use directly or to adapt. It begins by examining the normal model from both frequentist and Bayesian perspectives and then progresses to a full range of Bayesian generalized linear and mixed or hierarchical models, as well as additional types of models such as ABC and INLA. The book provides code that is largely unavailable elsewhere and includes details on interpreting and evaluating Bayesian models. Initial discussions offer models in synthetic form so that readers can easily adapt them to their own data; later the models are applied to real astronomical data. The consistent focus is on hands-on modeling, analysis of data, and interpretations that address scientific questions. A must-have for astronomers, its concrete approach will also be attractive to researchers in the sciences more generally.
Bayesian Regression of Thermodynamic Models of Redox Active Materials
Energy Technology Data Exchange (ETDEWEB)
Johnston, Katherine [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
2017-09-01
Finding a suitable functional redox material is a critical challenge to achieving scalable, economically viable technologies for storing concentrated solar energy in the form of a defected oxide. Demonstrating e ectiveness for thermal storage or solar fuel is largely accomplished by using a thermodynamic model derived from experimental data. The purpose of this project is to test the accuracy of our regression model on representative data sets. Determining the accuracy of the model includes parameter tting the model to the data, comparing the model using di erent numbers of param- eters, and analyzing the entropy and enthalpy calculated from the model. Three data sets were considered in this project: two demonstrating materials for solar fuels by wa- ter splitting and the other of a material for thermal storage. Using Bayesian Inference and Markov Chain Monte Carlo (MCMC), parameter estimation was preformed on the three data sets. Good results were achieved, except some there was some deviations on the edges of the data input ranges. The evidence values were then calculated in a variety of ways and used to compare models with di erent number of parameters. It was believed that at least one of the parameters was unnecessary and comparing evidence values demonstrated that the parameter was need on one data set and not signi cantly helpful on another. The entropy was calculated by taking the derivative in one variable and integrating over another. and its uncertainty was also calculated by evaluating the entropy over multiple MCMC samples. Afterwards, all the parts were written up as a tutorial for the Uncertainty Quanti cation Toolkit (UQTk).
Tierz, Pablo; Odbert, Henry; Phillips, Jeremy; Woodhouse, Mark; Sandri, Laura; Selva, Jacopo; Marzocchi, Warner
2016-04-01
Quantification of volcanic hazards is a challenging task for modern volcanology. Assessing the large uncertainties involved in the hazard analysis requires the combination of volcanological data, physical and statistical models. This is a complex procedure even when taking into account only one type of volcanic hazard. However, volcanic systems are known to be multi-hazard environments where several hazardous phenomena (tephra fallout, Pyroclastic Density Currents -PDCs-, lahars, etc.) may occur whether simultaneous or sequentially. Bayesian Belief Networks (BBNs) are a flexible and powerful way of modelling uncertainty. They are statistical models that can merge information coming from data, physical models, other statistical models or expert knowledge into a unified probabilistic assessment. Therefore, they can be applied to model the interaction between different volcanic hazards in an efficient manner. In this work, we design and preliminarily parametrize a BBN with the aim of forecasting the occurrence and volume of rain-triggered lahars when considering: (1) input of pyroclastic material, in the form of tephra fallout and PDCs, over the catchments around the volcano; (2) remobilization of this material by antecedent lahar events. Input of fresh pyroclastic material can be modelled through a combination of physical models (e.g. advection-diffusion models for tephra fallout such as HAZMAP and shallow-layer continuum models for PDCs such as Titan2D) and uncertainty quantification techniques, while the remobilization efficiency can be constrained from datasets of lahar observations at different volcanoes. The applications of this kind of probabilistic multi-hazard approach can range from real-time forecasting of lahar activity to calibration of physical or statistical models (e.g. emulators) for long-term volcanic hazard assessment.
De la Fuente, José Manuel; Bengoetxea, Endika; Navarro, Felipe; Bobes, Julio; Alarcón, Renato Daniel
2011-04-30
There is agreement in that strengthening the sets of neurobiological data would reinforce the diagnostic objectivity of many psychiatric entities. This article attempts to use this approach in borderline personality disorder (BPD). Assuming that most of the biological findings in BPD reflect common underlying pathophysiological processes we hypothesized that most of the data involved in the findings would be statistically interconnected and interdependent, indicating biological consistency for this diagnosis. Prospectively obtained data on scalp and sleep electroencephalography (EEG), clinical neurologic soft signs, the dexamethasone suppression and thyrotropin-releasing hormone stimulation tests of 20 consecutive BPD patients were used to generate a Bayesian network model, an artificial intelligence paradigm that visually illustrates eventual associations (or inter-dependencies) between otherwise seemingly unrelated variables. The Bayesian network model identified relationships among most of the variables. EEG and TSH were the variables that influence most of the others, especially sleep parameters. Neurological soft signs were linked with EEG, TSH, and sleep parameters. The results suggest the possibility of using objective neurobiological variables to strengthen the validity of future diagnostic criteria and nosological characterization of BPD. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
A Joint Bayesian Model for Integrating Microarray and RNA Sequencing Transcriptomic Data.
Ma, Tianzhou; Liang, Faming; Oesterreich, Steffi; Tseng, George C
2017-07-01
As the sequencing cost continued to drop in the past decade, RNA sequencing (RNA-seq) has replaced microarray to become the standard high-throughput experimental tool to analyze transcriptomic profile. As more and more datasets are generated and accumulated in the public domain, meta-analysis to combine multiple transcriptomic studies to increase statistical power has received increasing popularity. In this article, we propose a Bayesian hierarchical model to jointly integrate microarray and RNA-seq studies. Since systematic fold change differences across RNA-seq and microarray for detecting differentially expressed genes have been previously reported, we replicated this finding in several real datasets and showed that incorporation of a normalization procedure to account for the bias improves the detection accuracy and power. We compared our method with the popular two-stage Fisher's method using simulations and two real applications in a histological subtype (invasive lobular carcinoma) of breast cancer comparing PR+ versus PR- and early-stage versus late-stage patients. The result showed improved detection power and more significant and interpretable pathways enriched in the detected biomarkers from the proposed Bayesian model.
High resolution forecasting for wind energy applications using Bayesian model averaging
Energy Technology Data Exchange (ETDEWEB)
Courtney, Jennifer F.; Lynch, Peter; Sweeney, Conor [Meteorology and Climate Centre, UCD, Dublin (Ireland)], e-mail: jennifer.courtney@ucdconnect.ie
2013-02-15
Two methods of post-processing the uncalibrated wind speed forecasts from the European Centre for Medium-Range Weather Forecasts (ECMWF) ensemble prediction system (EPS) are presented here. Both methods involve statistically post-processing the EPS or a downscaled version of it with Bayesian model averaging (BMA). The first method applies BMA directly to the EPS data. The second method involves clustering the EPS to eight representative members (RMs) and downscaling the data through two limited area models at two resolutions. Four weighted ensemble mean forecasts are produced and used as input to the BMA method. Both methods are tested against 13 meteorological stations around Ireland with 1 yr of forecast/observation data. Results show calibration and accuracy improvements using both methods, with the best results stemming from Method 2, which has comparatively low mean absolute error and continuous ranked probability scores.
High resolution forecasting for wind energy applications using Bayesian model averaging
Directory of Open Access Journals (Sweden)
Jennifer F. Courtney
2013-02-01
Full Text Available Two methods of post-processing the uncalibrated wind speed forecasts from the European Centre for Medium-Range Weather Forecasts (ECMWF ensemble prediction system (EPS are presented here. Both methods involve statistically post-processing the EPS or a downscaled version of it with Bayesian model averaging (BMA. The first method applies BMA directly to the EPS data. The second method involves clustering the EPS to eight representative members (RMs and downscaling the data through two limited area models at two resolutions. Four weighted ensemble mean forecasts are produced and used as input to the BMA method. Both methods are tested against 13 meteorological stations around Ireland with 1 yr of forecast/observation data. Results show calibration and accuracy improvements using both methods, with the best results stemming from Method 2, which has comparatively low mean absolute error and continuous ranked probability scores.
DEFF Research Database (Denmark)
Schur, Nadine; Hürlimann, Eveline; Stensgaard, Anna-Sofie
2013-01-01
surveys on different age-groups and to acquire separate estimates for individuals aged ≤20 years and entire communities. Prevalence estimates were combined with population statistics to obtain country-specific numbers of Schistosoma infections. We estimate that 122 million individuals in eastern Africa...... Africa. Bayesian geostatistical models based on climatic and other environmental data were used to account for potential spatial clustering in spatially structured exposures. Geostatistical variable selection was employed to reduce the set of covariates. Alignment factors were implemented to combine...... are currently infected with either S. mansoni, or S. haematobium, or both species concurrently. Country-specific population-adjusted prevalence estimates range between 12.9% (Uganda) and 34.5% (Mozambique) for S. mansoni and between 11.9% (Djibouti) and 40.9% (Mozambique) for S. haematobium. Our models revealed...
Macroscopic Models of Clique Tree Growth for Bayesian Networks
National Aeronautics and Space Administration — In clique tree clustering, inference consists of propagation in a clique tree compiled from a Bayesian network. In this paper, we develop an analytical approach to...
Multilevel temporal Bayesian networks can model longitudinal change in multimorbidity
Lappenschaar, M.; Hommersom, A.; Lucas, P.J.; Lagro, J.; Visscher, S.; Korevaar, J.C.; Schellevis, F.G.
2013-01-01
Objectives Although the course of single diseases can be studied using traditional epidemiologic techniques, these methods cannot capture the complex joint evolutionary course of multiple disorders. In this study, multilevel temporal Bayesian networks were adopted to study the course of
Fong, Ted Chun Tat; Ho, Rainbow Tin Hung
2013-12-01
The latent structure of the Hospital Anxiety and Depression Scale (HADS) has caused inconsistent results in the literature. The HADS is frequently analyzed via maximum likelihood confirmatory factor analysis (ML-CFA). However, the overly restrictive assumption of exact zero cross-loadings and residual correlations in ML-CFA can lead to poor model fits and distorted factor structures. This study applied Bayesian structural equation modeling (BSEM) to evaluate the latent structure of the HADS. Three a priori models, the two-factor, three-factor, and bifactor models, were investigated in a Chinese community sample (N = 312) and clinical sample (N = 198) using ML-CFA and BSEM. BSEM specified approximate zero cross-loadings and residual correlations through the use of zero-mean, small-variance informative priors. The model comparison was based on the Bayesian information criterion (BIC). Using ML-CFA, none of the three models provided an adequate fit for either sample. The BSEM two-factor model with approximate zero cross-loadings and residual correlations fitted both samples well with the lowest BIC of the three models and displayed a simple and parsimonious factor-loading pattern. The study demonstrated that the two-factor structure fitted the HADS well, suggesting its usefulness in assessing the symptoms of anxiety and depression in clinical practice. BSEM is a sophisticated and flexible statistical technique that better reflects substantive theories and locates the source of model misfit. Future use of BSEM is recommended to evaluate the latent structure of other psychological instruments.
Bayesian modeling and chronological precision for Polynesian settlement of Tonga.
Directory of Open Access Journals (Sweden)
David Burley
Full Text Available First settlement of Polynesia, and population expansion throughout the ancestral Polynesian homeland are foundation events for global history. A precise chronology is paramount to informed archaeological interpretation of these events and their consequences. Recently applied chronometric hygiene protocols excluding radiocarbon dates on wood charcoal without species identification all but eliminates this chronology as it has been built for the Kingdom of Tonga, the initial islands to be settled in Polynesia. In this paper we re-examine and redevelop this chronology through application of Bayesian models to the questioned suite of radiocarbon dates, but also incorporating short-lived wood charcoal dates from archived samples and high precision U/Th dates on coral artifacts. These models provide generation level precision allowing us to track population migration from first Lapita occupation on the island of Tongatapu through Tonga's central and northern island groups. They further illustrate an exceptionally short duration for the initial colonizing Lapita phase and a somewhat abrupt transition to ancestral Polynesian society as it is currently defined.
Bayesian mixture models for source separation in MEG
International Nuclear Information System (INIS)
Calvetti, Daniela; Homa, Laura; Somersalo, Erkki
2011-01-01
This paper discusses the problem of imaging electromagnetic brain activity from measurements of the induced magnetic field outside the head. This imaging modality, magnetoencephalography (MEG), is known to be severely ill posed, and in order to obtain useful estimates for the activity map, complementary information needs to be used to regularize the problem. In this paper, a particular emphasis is on finding non-superficial focal sources that induce a magnetic field that may be confused with noise due to external sources and with distributed brain noise. The data are assumed to come from a mixture of a focal source and a spatially distributed possibly virtual source; hence, to differentiate between those two components, the problem is solved within a Bayesian framework, with a mixture model prior encoding the information that different sources may be concurrently active. The mixture model prior combines one density that favors strongly focal sources and another that favors spatially distributed sources, interpreted as clutter in the source estimation. Furthermore, to address the challenge of localizing deep focal sources, a novel depth sounding algorithm is suggested, and it is shown with simulated data that the method is able to distinguish between a signal arising from a deep focal source and a clutter signal. (paper)
Bayesian methods for measures of agreement
Broemeling, Lyle D
2009-01-01
Using WinBUGS to implement Bayesian inferences of estimation and testing hypotheses, Bayesian Methods for Measures of Agreement presents useful methods for the design and analysis of agreement studies. It focuses on agreement among the various players in the diagnostic process.The author employs a Bayesian approach to provide statistical inferences based on various models of intra- and interrater agreement. He presents many examples that illustrate the Bayesian mode of reasoning and explains elements of a Bayesian application, including prior information, experimental information, the likelihood function, posterior distribution, and predictive distribution. The appendices provide the necessary theoretical foundation to understand Bayesian methods as well as introduce the fundamentals of programming and executing the WinBUGS software.Taking a Bayesian approach to inference, this hands-on book explores numerous measures of agreement, including the Kappa coefficient, the G coefficient, and intraclass correlation...
Nitrate source apportionment in a subtropical watershed using Bayesian model
International Nuclear Information System (INIS)
Yang, Liping; Han, Jiangpei; Xue, Jianlong; Zeng, Lingzao; Shi, Jiachun; Wu, Laosheng; Jiang, Yonghai
2013-01-01
Nitrate (NO 3 − ) pollution in aquatic system is a worldwide problem. The temporal distribution pattern and sources of nitrate are of great concern for water quality. The nitrogen (N) cycling processes in a subtropical watershed located in Changxing County, Zhejiang Province, China were greatly influenced by the temporal variations of precipitation and temperature during the study period (September 2011 to July 2012). The highest NO 3 − concentration in water was in May (wet season, mean ± SD = 17.45 ± 9.50 mg L −1 ) and the lowest concentration occurred in December (dry season, mean ± SD = 10.54 ± 6.28 mg L −1 ). Nevertheless, no water sample in the study area exceeds the WHO drinking water limit of 50 mg L −1 NO 3 − . Four sources of NO 3 − (atmospheric deposition, AD; soil N, SN; synthetic fertilizer, SF; manure and sewage, M and S) were identified using both hydrochemical characteristics [Cl − , NO 3 − , HCO 3 − , SO 4 2− , Ca 2+ , K + , Mg 2+ , Na + , dissolved oxygen (DO)] and dual isotope approach (δ 15 N–NO 3 − and δ 18 O–NO 3 − ). Both chemical and isotopic characteristics indicated that denitrification was not the main N cycling process in the study area. Using a Bayesian model (stable isotope analysis in R, SIAR), the contribution of each source was apportioned. Source apportionment results showed that source contributions differed significantly between the dry and wet season, AD and M and S contributed more in December than in May. In contrast, SN and SF contributed more NO 3 − to water in May than that in December. M and S and SF were the major contributors in December and May, respectively. Moreover, the shortcomings and uncertainties of SIAR were discussed to provide implications for future works. With the assessment of temporal variation and sources of NO 3 − , better agricultural management practices and sewage disposal programs can be implemented to sustain water quality in subtropical watersheds
Bayesian Multi-Energy Computed Tomography reconstruction approaches based on decomposition models
International Nuclear Information System (INIS)
Cai, Caifang
2013-01-01
Multi-Energy Computed Tomography (MECT) makes it possible to get multiple fractions of basis materials without segmentation. In medical application, one is the soft-tissue equivalent water fraction and the other is the hard-matter equivalent bone fraction. Practical MECT measurements are usually obtained with polychromatic X-ray beams. Existing reconstruction approaches based on linear forward models without counting the beam poly-chromaticity fail to estimate the correct decomposition fractions and result in Beam-Hardening Artifacts (BHA). The existing BHA correction approaches either need to refer to calibration measurements or suffer from the noise amplification caused by the negative-log pre-processing and the water and bone separation problem. To overcome these problems, statistical DECT reconstruction approaches based on non-linear forward models counting the beam poly-chromaticity show great potential for giving accurate fraction images.This work proposes a full-spectral Bayesian reconstruction approach which allows the reconstruction of high quality fraction images from ordinary polychromatic measurements. This approach is based on a Gaussian noise model with unknown variance assigned directly to the projections without taking negative-log. Referring to Bayesian inferences, the decomposition fractions and observation variance are estimated by using the joint Maximum A Posteriori (MAP) estimation method. Subject to an adaptive prior model assigned to the variance, the joint estimation problem is then simplified into a single estimation problem. It transforms the joint MAP estimation problem into a minimization problem with a non-quadratic cost function. To solve it, the use of a monotone Conjugate Gradient (CG) algorithm with suboptimal descent steps is proposed.The performances of the proposed approach are analyzed with both simulated and experimental data. The results show that the proposed Bayesian approach is robust to noise and materials. It is also
Bayesian hierarchical models for regional climate reconstructions of the last glacial maximum
Weitzel, Nils; Hense, Andreas; Ohlwein, Christian
2017-04-01
Spatio-temporal reconstructions of past climate are important for the understanding of the long term behavior of the climate system and the sensitivity to forcing changes. Unfortunately, they are subject to large uncertainties, have to deal with a complex proxy-climate structure, and a physically reasonable interpolation between the sparse proxy observations is difficult. Bayesian Hierarchical Models (BHMs) are a class of statistical models that is well suited for spatio-temporal reconstructions of past climate because they permit the inclusion of multiple sources of information (e.g. records from different proxy types, uncertain age information, output from climate simulations) and quantify uncertainties in a statistically rigorous way. BHMs in paleoclimatology typically consist of three stages which are modeled individually and are combined using Bayesian inference techniques. The data stage models the proxy-climate relation (often named transfer function), the process stage models the spatio-temporal distribution of the climate variables of interest, and the prior stage consists of prior distributions of the model parameters. For our BHMs, we translate well-known proxy-climate transfer functions for pollen to a Bayesian framework. In addition, we can include Gaussian distributed local climate information from preprocessed proxy records. The process stage combines physically reasonable spatial structures from prior distributions with proxy records which leads to a multivariate posterior probability distribution for the reconstructed climate variables. The prior distributions that constrain the possible spatial structure of the climate variables are calculated from climate simulation output. We present results from pseudoproxy tests as well as new regional reconstructions of temperatures for the last glacial maximum (LGM, ˜ 21,000 years BP). These reconstructions combine proxy data syntheses with information from climate simulations for the LGM that were
Linking statistical bias description to multiobjective model calibration
Reichert, P.; Schuwirth, N.
2012-09-01
In the absence of model deficiencies, simulation results at the correct parameter values lead to an unbiased description of observed data with remaining deviations due to observation errors only. However, this ideal cannot be reached in the practice of environmental modeling, because the required simplified representation of the complex reality by the model and errors in model input lead to errors that are reflected in biased model output. This leads to two related problems: First, ignoring bias of output in the statistical model description leads to bias in parameter estimates, model predictions and, in particular, in the quantification of their uncertainty. Second, as there is no objective choice of how much bias to accept in which output variable, it is not possible to design an "objective" model calibration procedure. The first of these problems has been addressed by introducing a statistical (Bayesian) description of bias, the second by suggesting the use of multiobjective calibration techniques that cannot easily be used for uncertainty analysis. We merge the ideas of these two approaches by using the prior of the statistical bias description to quantify the importance of multiple calibration objectives. This leads to probabilistic inference and prediction while still taking multiple calibration objectives into account. The ideas and technical details of the suggested approach are outlined and a didactical example as well as an application to environmental data are provided to demonstrate its practical feasibility and computational efficiency.
Biedermann, A; Taroni, F; Bozza, S; Mazzella, W D
2011-01-30
This paper presents and discusses the use of Bayesian procedures - introduced through the use of Bayesian networks in Part I of this series of papers - for 'learning' probabilities from data. The discussion will relate to a set of real data on characteristics of black toners commonly used in printing and copying devices. Particular attention is drawn to the incorporation of the proposed procedures as an integral part in probabilistic inference schemes (notably in the form of Bayesian networks) that are intended to address uncertainties related to particular propositions of interest (e.g., whether or not a sample originates from a particular source). The conceptual tenets of the proposed methodologies are presented along with aspects of their practical implementation using currently available Bayesian network software. Copyright Â© 2010 Elsevier Ireland Ltd. All rights reserved.
Bayesian inference for partially identified models exploring the limits of limited data
Gustafson, Paul
2015-01-01
Introduction Identification What Is against Us? What Is for Us? Some Simple Examples of Partially Identified ModelsThe Road Ahead The Structure of Inference in Partially Identified Models Bayesian Inference The Structure of Posterior Distributions in PIMs Computational Strategies Strength of Bayesian Updating, Revisited Posterior MomentsCredible Intervals Evaluating the Worth of Inference Partial Identification versus Model Misspecification The Siren Call of Identification Comp
International Nuclear Information System (INIS)
Elsheikh, Ahmed H.; Wheeler, Mary F.; Hoteit, Ibrahim
2014-01-01
A Hybrid Nested Sampling (HNS) algorithm is proposed for efficient Bayesian model calibration and prior model selection. The proposed algorithm combines, Nested Sampling (NS) algorithm, Hybrid Monte Carlo (HMC) sampling and gradient estimation using Stochastic Ensemble Method (SEM). NS is an efficient sampling algorithm that can be used for Bayesian calibration and estimating the Bayesian evidence for prior model selection. Nested sampling has the advantage of computational feasibility. Within the nested sampling algorithm, a constrained sampling step is performed. For this step, we utilize HMC to reduce the correlation between successive sampled states. HMC relies on the gradient of the logarithm of the posterior distribution, which we estimate using a stochastic ensemble method based on an ensemble of directional derivatives. SEM only requires forward model runs and the simulator is then used as a black box and no adjoint code is needed. The developed HNS algorithm is successfully applied for Bayesian calibration and prior model selection of several nonlinear subsurface flow problems
van Lier-Walqui, M.; Morrison, H.; Kumjian, M. R.; Prat, O. P.
2016-12-01
Microphysical parameterization schemes have reached an impressive level of sophistication: numerous prognostic hydrometeor categories, and either size-resolved (bin) particle size distributions, or multiple prognostic moments of the size distribution. Yet, uncertainty in model representation of microphysical processes and the effects of microphysics on numerical simulation of weather has not shown a improvement commensurate with the advanced sophistication of these schemes. We posit that this may be caused by unconstrained assumptions of these schemes, such as ad-hoc parameter value choices and structural uncertainties (e.g. choice of a particular form for the size distribution). We present work on development and observational constraint of a novel microphysical parameterization approach, the Bayesian Observationally-constrained Statistical-physical Scheme (BOSS), which seeks to address these sources of uncertainty. Our framework avoids unnecessary a priori assumptions, and instead relies on observations to provide probabilistic constraint of the scheme structure and sensitivities to environmental and microphysical conditions. We harness the rich microphysical information content of polarimetric radar observations to develop and constrain BOSS within a Bayesian inference framework using a Markov Chain Monte Carlo sampler (see Kumjian et al., this meeting for details on development of an associated polarimetric forward operator). Our work shows how knowledge of microphysical processes is provided by polarimetric radar observations of diverse weather conditions, and which processes remain highly uncertain, even after considering observations.
Reliability assessment using degradation models: bayesian and classical approaches
Directory of Open Access Journals (Sweden)
Marta Afonso Freitas
2010-04-01
Full Text Available Traditionally, reliability assessment of devices has been based on (accelerated life tests. However, for highly reliable products, little information about reliability is provided by life tests in which few or no failures are typically observed. Since most failures arise from a degradation mechanism at work for which there are characteristics that degrade over time, one alternative is monitor the device for a period of time and assess its reliability from the changes in performance (degradation observed during that period. The goal of this article is to illustrate how degradation data can be modeled and analyzed by using "classical" and Bayesian approaches. Four methods of data analysis based on classical inference are presented. Next we show how Bayesian methods can also be used to provide a natural approach to analyzing degradation data. The approaches are applied to a real data set regarding train wheels degradation.Tradicionalmente, o acesso à confiabilidade de dispositivos tem sido baseado em testes de vida (acelerados. Entretanto, para produtos altamente confiáveis, pouca informação a respeito de sua confiabilidade é fornecida por testes de vida no quais poucas ou nenhumas falhas são observadas. Uma vez que boa parte das falhas é induzida por mecanismos de degradação, uma alternativa é monitorar o dispositivo por um período de tempo e acessar sua confiabilidade através das mudanças em desempenho (degradação observadas durante aquele período. O objetivo deste artigo é ilustrar como dados de degradação podem ser modelados e analisados utilizando-se abordagens "clássicas" e Bayesiana. Quatro métodos de análise de dados baseados em inferência clássica são apresentados. A seguir, mostramos como os métodos Bayesianos podem também ser aplicados para proporcionar uma abordagem natural à análise de dados de degradação. As abordagens são aplicadas a um banco de dados real relacionado à degradação de rodas de trens.
Rosenheim, B. E.; Firesinger, D.; Roberts, M. L.; Burton, J. R.; Khan, N.; Moyer, R. P.
2016-12-01
Radiocarbon (14C) sediment core chronologies benefit from a high density of dates, even when precision of individual dates is sacrificed. This is demonstrated by a combined approach of rapid 14C analysis of CO2 gas generated from carbonates and organic material coupled with Bayesian statistical modeling. Analysis of 14C is facilitated by the gas ion source on the Continuous Flow Accelerator Mass Spectrometry (CFAMS) system at the Woods Hole Oceanographic Institution's National Ocean Sciences Accelerator Mass Spectrometry facility. This instrument is capable of producing a 14C determination of +/- 100 14C y precision every 4-5 minutes, with limited sample handling (dissolution of carbonates and/or combustion of organic carbon in evacuated containers). Rapid analysis allows over-preparation of samples to include replicates at each depth and/or comparison of different sample types at particular depths in a sediment or peat core. Analysis priority is given to depths that have the least chronologic precision as determined by Bayesian modeling of the chronology of calibrated ages. Use of such a statistical approach to determine the order in which samples are run ensures that the chronology constantly improves so long as material is available for the analysis of chronologic weak points. Ultimately, accuracy of the chronology is determined by the material that is actually being dated, and our combined approach allows testing of different constituents of the organic carbon pool and the carbonate minerals within a core. We will present preliminary results from a deep-sea sediment core abundant in deep-sea foraminifera as well as coastal wetland peat cores to demonstrate statistical improvements in sediment- and peat-core chronologies obtained by increasing the quantity and decreasing the quality of individual dates.
Yang, Ziheng; Zhu, Tianqi
2018-02-20
The Bayesian method is noted to produce spuriously high posterior probabilities for phylogenetic trees in analysis of large datasets, but the precise reasons for this overconfidence are unknown. In general, the performance of Bayesian selection of misspecified models is poorly understood, even though this is of great scientific interest since models are never true in real data analysis. Here we characterize the asymptotic behavior of Bayesian model selection and show that when the competing models are equally wrong, Bayesian model selection exhibits surprising and polarized behaviors in large datasets, supporting one model with full force while rejecting the others. If one model is slightly less wrong than the other, the less wrong model will eventually win when the amount of data increases, but the method may become overconfident before it becomes reliable. We suggest that this extreme behavior may be a major factor for the spuriously high posterior probabilities for evolutionary trees. The philosophical implications of our results to the application of Bayesian model selection to evaluate opposing scientific hypotheses are yet to be explored, as are the behaviors of non-Bayesian methods in similar situations.
Using literature and data to learn Bayesian networks as clinical models of ovarian tumors
DEFF Research Database (Denmark)
Antal, P.; Fannes, G.; Timmerman, D.
2004-01-01
Thanks to its increasing availability, electronic literature has become a potential source of information for the development of complex Bayesian networks (BN), when human expertise is missing or data is scarce or contains much noise. This opportunity raises the question of how to integrate...... information from free-text resources with statistical data in learning Bayesian networks. Firstly, we report on the collection of prior information resources in the ovarian cancer domain, which includes "kernel" annotations of the domain variables. We introduce methods based on the annotations and literature...... an expert reference and against data scores (the mutual information (MI) and a Bayesian score). Next, we transform the text-based dependency measures into informative text-based priors for Bayesian network structures. Finally, we report the benefit of such informative text-based priors on the performance...
Bayesian Safety Risk Modeling of Human-Flightdeck Automation Interaction
Ancel, Ersin; Shih, Ann T.
2015-01-01
Usage of automatic systems in airliners has increased fuel efficiency, added extra capabilities, enhanced safety and reliability, as well as provide improved passenger comfort since its introduction in the late 80's. However, original automation benefits, including reduced flight crew workload, human errors or training requirements, were not achieved as originally expected. Instead, automation introduced new failure modes, redistributed, and sometimes increased workload, brought in new cognitive and attention demands, and increased training requirements. Modern airliners have numerous flight modes, providing more flexibility (and inherently more complexity) to the flight crew. However, the price to pay for the increased flexibility is the need for increased mode awareness, as well as the need to supervise, understand, and predict automated system behavior. Also, over-reliance on automation is linked to manual flight skill degradation and complacency in commercial pilots. As a result, recent accidents involving human errors are often caused by the interactions between humans and the automated systems (e.g., the breakdown in man-machine coordination), deteriorated manual flying skills, and/or loss of situational awareness due to heavy dependence on automated systems. This paper describes the development of the increased complexity and reliance on automation baseline model, named FLAP for FLightdeck Automation Problems. The model development process starts with a comprehensive literature review followed by the construction of a framework comprised of high-level causal factors leading to an automation-related flight anomaly. The framework was then converted into a Bayesian Belief Network (BBN) using the Hugin Software v7.8. The effects of automation on flight crew are incorporated into the model, including flight skill degradation, increased cognitive demand and training requirements along with their interactions. Besides flight crew deficiencies, automation system
Directory of Open Access Journals (Sweden)
Wirichada Pan-ngum
Full Text Available Accuracy of rapid diagnostic tests for dengue infection has been repeatedly estimated by comparing those tests with reference assays. We hypothesized that those estimates might be inaccurate if the accuracy of the reference assays is not perfect. Here, we investigated this using statistical modeling.Data from a cohort study of 549 patients suspected of dengue infection presenting at Colombo North Teaching Hospital, Ragama, Sri Lanka, that described the application of our reference assay (a combination of Dengue IgM antibody capture ELISA and IgG antibody capture ELISA and of three rapid diagnostic tests (Panbio NS1 antigen, IgM antibody and IgG antibody rapid immunochromatographic cassette tests were re-evaluated using bayesian latent class models (LCMs. The estimated sensitivity and specificity of the reference assay were 62.0% and 99.6%, respectively. Prevalence of dengue infection (24.3%, and sensitivities and specificities of the Panbio NS1 (45.9% and 97.9%, IgM (54.5% and 95.5% and IgG (62.1% and 84.5% estimated by bayesian LCMs were significantly different from those estimated by assuming that the reference assay was perfect. Sensitivity, specificity, PPV and NPV for a combination of NS1, IgM and IgG cassette tests on admission samples were 87.0%, 82.8%, 62.0% and 95.2%, respectively.Our reference assay is an imperfect gold standard. In our setting, the combination of NS1, IgM and IgG rapid diagnostic tests could be used on admission to rule out dengue infection with a high level of accuracy (NPV 95.2%. Further evaluation of rapid diagnostic tests for dengue infection should include the use of appropriate statistical models.
A fully Bayesian latent variable model for integrative clustering analysis of multi-type omics data.
Mo, Qianxing; Shen, Ronglai; Guo, Cui; Vannucci, Marina; Chan, Keith S; Hilsenbeck, Susan G
2018-01-01
Identification of clinically relevant tumor subtypes and omics signatures is an important task in cancer translational research for precision medicine. Large-scale genomic profiling studies such as The Cancer Genome Atlas (TCGA) Research Network have generated vast amounts of genomic, transcriptomic, epigenomic, and proteomic data. While these studies have provided great resources for researchers to discover clinically relevant tumor subtypes and driver molecular alterations, there are few computationally efficient methods and tools for integrative clustering analysis of these multi-type omics data. Therefore, the aim of this article is to develop a fully Bayesian latent variable method (called iClusterBayes) that can jointly model omics data of continuous and discrete data types for identification of tumor subtypes and relevant omics features. Specifically, the proposed method uses a few latent variables to capture the inherent structure of multiple omics data sets to achieve joint dimension reduction. As a result, the tumor samples can be clustered in the latent variable space and relevant omics features that drive the sample clustering are identified through Bayesian variable selection. This method significantly improve on the existing integrative clustering method iClusterPlus in terms of statistical inference and computational speed. By analyzing TCGA and simulated data sets, we demonstrate the excellent performance of the proposed method in revealing clinically meaningful tumor subtypes and driver omics features. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
U.S. Environmental Protection Agency — The dataset is lake dissolved oxygen concentrations obtained form plots published by Gelda et al. (1996) and lake reaeration model simulated values using Bayesian...
DEFF Research Database (Denmark)
Quinonero, Joaquin; Girard, Agathe; Larsen, Jan
2003-01-01
The object of Bayesian modelling is predictive distribution, which, in a forecasting scenario, enables evaluation of forecasted values and their uncertainties. We focus on reliably estimating the predictive mean and variance of forecasted values using Bayesian kernel based models such as the Gaus......The object of Bayesian modelling is predictive distribution, which, in a forecasting scenario, enables evaluation of forecasted values and their uncertainties. We focus on reliably estimating the predictive mean and variance of forecasted values using Bayesian kernel based models...... such as the Gaussian process and the relevance vector machine. We derive novel analytic expressions for the predictive mean and variance for Gaussian kernel shapes under the assumption of a Gaussian input distribution in the static case, and of a recursive Gaussian predictive density in iterative forecasting...
International Nuclear Information System (INIS)
Duarte, Juliana P.; Leite, Victor C.; Melo, P.F. Frutuoso e
2013-01-01
Bayesian networks have become a very handy tool for solving problems in various application areas. This paper discusses the use of Bayesian networks to treat dependent events in reliability engineering typically modeled by Markovian models. Dependent events play an important role as, for example, when treating load-sharing systems, bridge systems, common-cause failures, and switching systems (those for which a standby component is activated after the main one fails by means of a switching mechanism). Repair plays an important role in all these cases (as, for example, the number of repairmen). All Bayesian network calculations are performed by means of the Netica™ software, of Norsys Software Corporation, and Fortran 90 to evaluate them over time. The discussion considers the development of time-dependent reliability figures of merit, which are easily obtained, through Markovian models, but not through Bayesian networks, because these latter need probability figures as input and not failure and repair rates. Bayesian networks produced results in very good agreement with those of Markov models and pivotal decomposition. Static and discrete time (DTBN) Bayesian networks were used in order to check their capabilities of modeling specific situations, like switching failures in cold-standby systems. The DTBN was more flexible to modeling systems where the time of occurrence of an event is important, for example, standby failure and repair. However, the static network model showed as good results as DTBN by a much more simplified approach. (author)
Gopalan, Giri; Hrafnkelsson, Birgir; Aðalgeirsdóttir, Guðfinna; Jarosch, Alexander H.; Pálsson, Finnur
2018-03-01
Bayesian hierarchical modeling can assist the study of glacial dynamics and ice flow properties. This approach will allow glaciologists to make fully probabilistic predictions for the thickness of a glacier at unobserved spatio-temporal coordinates, and it will also allow for the derivation of posterior probability distributions for key physical parameters such as ice viscosity and basal sliding. The goal of this paper is to develop a proof of concept for a Bayesian hierarchical model constructed, which uses exact analytical solutions for the shallow ice approximation (SIA) introduced by Bueler et al. (2005). A suite of test simulations utilizing these exact solutions suggests that this approach is able to adequately model numerical errors and produce useful physical parameter posterior distributions and predictions. A byproduct of the development of the Bayesian hierarchical model is the derivation of a novel finite difference method for solving the SIA partial differential equation (PDE). An additional novelty of this work is the correction of numerical errors induced through a numerical solution using a statistical model. This error correcting process models numerical errors that accumulate forward in time and spatial variation of numerical errors between the dome, interior, and margin of a glacier.
Leak localization in water distribution networks using model-based bayesian reasoning
Soldevila Coma, Adrià; Fernández Canti, Rosa M.; Blesa Izquierdo, Joaquim; Tornil Sin, Sebastián; Puig Cayuela, Vicenç
2016-01-01
This paper presents a new method for leak localization in Water Distribution Networks that uses a model-based approach combined with Bayesian reasoning. Probability density functions in model-based pressure residuals are calibrated off-line for all the possible leak scenarios by using a hydraulic simulator, being leak size uncertainty, demand uncertainty and sensor noise considered. A Bayesian reasoning is applied online to the available residuals to determine the location of leaks present in...
Predicting water main failures using Bayesian model averaging and survival modelling approach
International Nuclear Information System (INIS)
Kabir, Golam; Tesfamariam, Solomon; Sadiq, Rehan
2015-01-01
To develop an effective preventive or proactive repair and replacement action plan, water utilities often rely on water main failure prediction models. However, in predicting the failure of water mains, uncertainty is inherent regardless of the quality and quantity of data used in the model. To improve the understanding of water main failure, a Bayesian framework is developed for predicting the failure of water mains considering uncertainties. In this study, Bayesian model averaging method (BMA) is presented to identify the influential pipe-dependent and time-dependent covariates considering model uncertainties whereas Bayesian Weibull Proportional Hazard Model (BWPHM) is applied to develop the survival curves and to predict the failure rates of water mains. To accredit the proposed framework, it is implemented to predict the failure of cast iron (CI) and ductile iron (DI) pipes of the water distribution network of the City of Calgary, Alberta, Canada. Results indicate that the predicted 95% uncertainty bounds of the proposed BWPHMs capture effectively the observed breaks for both CI and DI water mains. Moreover, the performance of the proposed BWPHMs are better compare to the Cox-Proportional Hazard Model (Cox-PHM) for considering Weibull distribution for the baseline hazard function and model uncertainties. - Highlights: • Prioritize rehabilitation and replacements (R/R) strategies of water mains. • Consider the uncertainties for the failure prediction. • Improve the prediction capability of the water mains failure models. • Identify the influential and appropriate covariates for different models. • Determine the effects of the covariates on failure
Bayesian Model Averaging of Artificial Intelligence Models for Hydraulic Conductivity Estimation
Nadiri, A.; Chitsazan, N.; Tsai, F. T.; Asghari Moghaddam, A.
2012-12-01
This research presents a Bayesian artificial intelligence model averaging (BAIMA) method that incorporates multiple artificial intelligence (AI) models to estimate hydraulic conductivity and evaluate estimation uncertainties. Uncertainty in the AI model outputs stems from error in model input as well as non-uniqueness in selecting different AI methods. Using one single AI model tends to bias the estimation and underestimate uncertainty. BAIMA employs Bayesian model averaging (BMA) technique to address the issue of using one single AI model for estimation. BAIMA estimates hydraulic conductivity by averaging the outputs of AI models according to their model weights. In this study, the model weights were determined using the Bayesian information criterion (BIC) that follows the parsimony principle. BAIMA calculates the within-model variances to account for uncertainty propagation from input data to AI model output. Between-model variances are evaluated to account for uncertainty due to model non-uniqueness. We employed Takagi-Sugeno fuzzy logic (TS-FL), artificial neural network (ANN) and neurofuzzy (NF) to estimate hydraulic conductivity for the Tasuj plain aquifer, Iran. BAIMA combined three AI models and produced better fitting than individual models. While NF was expected to be the best AI model owing to its utilization of both TS-FL and ANN models, the NF model is nearly discarded by the parsimony principle. The TS-FL model and the ANN model showed equal importance although their hydraulic conductivity estimates were quite different. This resulted in significant between-model variances that are normally ignored by using one AI model.
Dondeynaz, C.; Lopez-Puga, J.; Carmona-Moreno, C.
2012-04-01
Improving Water and Sanitation Services (WSS), being a complex and interdisciplinary issue, passes through collaboration and coordination of different sectors (environment, health, economic activities, governance, and international cooperation). This inter-dependency has been recognised with the adoption of the "Integrated Water Resources Management" principles that push for the integration of these various dimensions involved in WSS delivery to ensure an efficient and sustainable management. The understanding of these interrelations appears as crucial for decision makers in the water sector in particular in developing countries where WSS still represent an important leverage for livelihood improvement. In this framework, the Joint Research Centre of the European Commission has developed a coherent database (WatSan4Dev database) containing 29 indicators from environmental, socio-economic, governance and financial aid flows data focusing on developing countries (Celine et al, 2011 under publication). The aim of this work is to model the WatSan4Dev dataset using probabilistic models to identify the key variables influencing or being influenced by the water supply and sanitation access levels. Bayesian Network Models are suitable to map the conditional dependencies between variables and also allows ordering variables by level of influence on the dependent variable. Separated models have been built for water supply and for sanitation because of different behaviour. The models are validated if complying with statistical criteria but either with scientific knowledge and literature. A two steps approach has been adopted to build the structure of the model; Bayesian network is first built for each thematic cluster of variables (e.g governance, agricultural pressure, or human development) keeping a detailed level for interpretation later one. A global model is then built based on significant indicators of each cluster being previously modelled. The structure of the
Bayesian Inference: with ecological applications
Link, William A.; Barker, Richard J.
2010-01-01
This text provides a mathematically rigorous yet accessible and engaging introduction to Bayesian inference with relevant examples that will be of interest to biologists working in the fields of ecology, wildlife management and environmental studies as well as students in advanced undergraduate statistics.. This text opens the door to Bayesian inference, taking advantage of modern computational efficiencies and easily accessible software to evaluate complex hierarchical models.
Bayesian network as a modelling tool for risk management in agriculture
DEFF Research Database (Denmark)
Rasmussen, Svend; Madsen, Anders Læsø; Lund, Mogens
. In this paper we use Bayesian networks as an integrated modelling approach for representing uncertainty and analysing risk management in agriculture. It is shown how historical farm account data may be efficiently used to estimate conditional probabilities, which are the core elements in Bayesian network models....... We further show how the Bayesian network model RiBay is used for stochastic simulation of farm income, and we demonstrate how RiBay can be used to simulate risk management at the farm level. It is concluded that the key strength of a Bayesian network is the transparency of assumptions......, and that it has the ability to link uncertainty from different external sources to budget figures and to quantify risk at the farm level....
Vernon, Ian; Liu, Junli; Goldstein, Michael; Rowe, James; Topping, Jen; Lindsey, Keith
2018-01-02
Many mathematical models have now been employed across every area of systems biology. These models increasingly involve large numbers of unknown parameters, have complex structure which can result in substantial evaluation time relative to the needs of the analysis, and need to be compared to observed data of various forms. The correct analysis of such models usually requires a global parameter search, over a high dimensional parameter space, that incorporates and respects the most important sources of uncertainty. This can be an extremely difficult task, but it is essential for any meaningful inference or prediction to be made about any biological system. It hence represents a fundamental challenge for the whole of systems biology. Bayesian statistical methodology for the uncertainty analysis of complex models is introduced, which is designed to address the high dimensional global parameter search problem. Bayesian emulators that mimic the systems biology model but which are extremely fast to evaluate are embeded within an iterative history match: an efficient method to search high dimensional spaces within a more formal statistical setting, while incorporating major sources of uncertainty. The approach is demonstrated via application to a model of hormonal crosstalk in Arabidopsis root development, which has 32 rate parameters, for which we identify the sets of rate parameter values that lead to acceptable matches between model output and observed trend data. The multiple insights into the model's structure that this analysis provides are discussed. The methodology is applied to a second related model, and the biological consequences of the resulting comparison, including the evaluation of gene functions, are described. Bayesian uncertainty analysis for complex models using both emulators and history matching is shown to be a powerful technique that can greatly aid the study of a large class of systems biology models. It both provides insight into model behaviour
A Bayesian approach for temporally scaling climate for modeling ecological systems.
Post van der Burg, Max; Anteau, Michael J; McCauley, Lisa A; Wiltermuth, Mark T
2016-05-01
With climate change becoming more of concern, many ecologists are including climate variables in their system and statistical models. The Standardized Precipitation Evapotranspiration Index (SPEI) is a drought index that has potential advantages in modeling ecological response variables, including a flexible computation of the index over different timescales. However, little development has been made in terms of the choice of timescale for SPEI. We developed a Bayesian modeling approach for estimating the timescale for SPEI and demonstrated its use in modeling wetland hydrologic dynamics in two different eras (i.e., historical [pre-1970] and contemporary [post-2003]). Our goal was to determine whether differences in climate between the two eras could explain changes in the amount of water in wetlands. Our results showed that wetland water surface areas tended to be larger in wetter conditions, but also changed less in response to climate fluctuations in the contemporary era. We also found that the average timescale parameter was greater in the historical period, compared with the contemporary period. We were not able to determine whether this shift in timescale was due to a change in the timing of wet-dry periods or whether it was due to changes in the way wetlands responded to climate. Our results suggest that perhaps some interaction between climate and hydrologic response may be at work, and further analysis is needed to determine which has a stronger influence. Despite this, we suggest that our modeling approach enabled us to estimate the relevant timescale for SPEI and make inferences from those estimates. Likewise, our approach provides a mechanism for using prior information with future data to assess whether these patterns may continue over time. We suggest that ecologists consider using temporally scalable climate indices in conjunction with Bayesian analysis for assessing the role of climate in ecological systems.
Wentworth, Mami Tonoe
techniques for model calibration. For Bayesian model calibration, we employ adaptive Metropolis algorithms to construct densities for input parameters in the heat model and the HIV model. To quantify the uncertainty in the parameters, we employ two MCMC algorithms: Delayed Rejection Adaptive Metropolis (DRAM) [33] and Differential Evolution Adaptive Metropolis (DREAM) [66, 68]. The densities obtained using these methods are compared to those obtained through the direct numerical evaluation of the Bayes' formula. We also combine uncertainties in input parameters and measurement errors to construct predictive estimates for a model response. A significant emphasis is on the development and illustration of techniques to verify the accuracy of sampling-based Metropolis algorithms. We verify the accuracy of DRAM and DREAM by comparing chains, densities and correlations obtained using DRAM, DREAM and the direct evaluation of Bayes formula. We also perform similar analysis for credible and prediction intervals for responses. Once the parameters are estimated, we employ energy statistics test [63, 64] to compare the densities obtained by different methods for the HIV model. The energy statistics are used to test the equality of distributions. We also consider parameter selection and verification techniques for models having one or more parameters that are noninfluential in the sense that they minimally impact model outputs. We illustrate these techniques for a dynamic HIV model but note that the parameter selection and verification framework is applicable to a wide range of biological and physical models. To accommodate the nonlinear input to output relations, which are typical for such models, we focus on global sensitivity analysis techniques, including those based on partial correlations, Sobol indices based on second-order model representations, and Morris indices, as well as a parameter selection technique based on standard errors. A significant objective is to provide
DEFF Research Database (Denmark)
Kuikka, Sakari; Haapasaari, Päivi Elisabet; Helle, Inari
2011-01-01
networks are flexible tools that can take into account the different research traditions and the various types of information sources. We present two types of cases. With the Baltic salmon stocks modeled with Bayesian techniques, the existing data sets are rich and the estimation of the parameters...... components, which favors the use of quantitative risk analysis. However, the traditions and quality criteria of these scientific fields are in many respects different. This creates both technical and human challenges to the modeling tasks....
Enhanced Bayesian modelling in BAPS software for learning genetic structures of populations
Directory of Open Access Journals (Sweden)
Sirén Jukka
2008-12-01
Full Text Available Abstract Background During the most recent decade many Bayesian statistical models and software for answering questions related to the genetic structure underlying population samples have appeared in the scientific literature. Most of these methods utilize molecular markers for the inferences, while some are also capable of handling DNA sequence data. In a number of earlier works, we have introduced an array of statistical methods for population genetic inference that are implemented in the software BAPS. However, the complexity of biological problems related to genetic structure analysis keeps increasing such that in many cases the current methods may provide either inappropriate or insufficient solutions. Results We discuss the necessity of enhancing the statistical approaches to face the challenges posed by the ever-increasing amounts of molecular data generated by scientists over a wide range of research areas and introduce an array of new statistical tools implemented in the most recent version of BAPS. With these methods it is possible, e.g., to fit genetic mixture models using user-specified numbers of clusters and to estimate levels of admixture under a genetic linkage model. Also, alleles representing a different ancestry compared to the average observed genomic positions can be tracked for the sampled individuals, and a priori specified hypotheses about genetic population structure can be directly compared using Bayes' theorem. In general, we have improved further the computational characteristics of the algorithms behind the methods implemented in BAPS facilitating the analyses of large and complex datasets. In particular, analysis of a single dataset can now be spread over multiple computers using a script interface to the software. Conclusion The Bayesian modelling methods introduced in this article represent an array of enhanced tools for learning the genetic structure of populations. Their implementations in the BAPS software are
Plant, N. G.; Thieler, E. R.; Gutierrez, B.; Lentz, E. E.; Zeigler, S. L.; Van Dongeren, A.; Fienen, M. N.
2016-12-01
We evaluate the strengths and weaknesses of Bayesian networks that have been used to address scientific and decision-support questions related to coastal geomorphology. We will provide an overview of coastal geomorphology research that has used Bayesian networks and describe what this approach can do and when it works (or fails to work). Over the past decade, Bayesian networks have been formulated to analyze the multi-variate structure and evolution of coastal morphology and associated human and ecological impacts. The approach relates observable system variables to each other by estimating discrete correlations. The resulting Bayesian-networks make predictions that propagate errors, conduct inference via Bayes rule, or both. In scientific applications, the model results are useful for hypothesis testing, using confidence estimates to gage the strength of tests while applications to coastal resource management are aimed at decision-support, where the probabilities of desired ecosystems outcomes are evaluated. The range of Bayesian-network applications to coastal morphology includes emulation of high-resolution wave transformation models to make oceanographic predictions, morphologic response to storms and/or sea-level rise, groundwater response to sea-level rise and morphologic variability, habitat suitability for endangered species, and assessment of monetary or human-life risk associated with storms. All of these examples are based on vast observational data sets, numerical model output, or both. We will discuss the progression of our experiments, which has included testing whether the Bayesian-network approach can be implemented and is appropriate for addressing basic and applied scientific problems and evaluating the hindcast and forecast skill of these implementations. We will present and discuss calibration/validation tests that are used to assess the robustness of Bayesian-network models and we will compare these results to tests of other models. This will
Development of a Bayesian Belief Network Runway Incursion and Excursion Model
Green, Lawrence L.
2014-01-01
In a previous work, a statistical analysis of runway incursion (RI) event data was conducted to ascertain the relevance of this data to the top ten Technical Challenges (TC) of the National Aeronautics and Space Administration (NASA) Aviation Safety Program (AvSP). The study revealed connections to several of the AvSP top ten TC and identified numerous primary causes and contributing factors of RI events. The statistical analysis served as the basis for developing a system-level Bayesian Belief Network (BBN) model for RI events, also previously reported. Through literature searches and data analysis, this RI event network has now been extended to also model runway excursion (RE) events. These RI and RE event networks have been further modified and vetted by a Subject Matter Expert (SME) panel. The combined system-level BBN model will allow NASA to generically model the causes of RI and RE events and to assess the effectiveness of technology products being developed under NASA funding. These products are intended to reduce the frequency of runway safety incidents/accidents, and to improve runway safety in general. The development and structure of the BBN for both RI and RE events are documented in this paper.
Bayesian estimation of regularization parameters for deformable surface models
Energy Technology Data Exchange (ETDEWEB)
Cunningham, G.S.; Lehovich, A.; Hanson, K.M.
1999-02-20
In this article the authors build on their past attempts to reconstruct a 3D, time-varying bolus of radiotracer from first-pass data obtained by the dynamic SPECT imager, FASTSPECT, built by the University of Arizona. The object imaged is a CardioWest total artificial heart. The bolus is entirely contained in one ventricle and its associated inlet and outlet tubes. The model for the radiotracer distribution at a given time is a closed surface parameterized by 482 vertices that are connected to make 960 triangles, with nonuniform intensity variations of radiotracer allowed inside the surface on a voxel-to-voxel basis. The total curvature of the surface is minimized through the use of a weighted prior in the Bayesian framework, as is the weighted norm of the gradient of the voxellated grid. MAP estimates for the vertices, interior intensity voxels and background count level are produced. The strength of the priors, or hyperparameters, are determined by maximizing the probability of the data given the hyperparameters, called the evidence. The evidence is calculated by first assuming that the posterior is approximately normal in the values of the vertices and voxels, and then by evaluating the integral of the multi-dimensional normal distribution. This integral (which requires evaluating the determinant of a covariance matrix) is computed by applying a recent algorithm from Bai et. al. that calculates the needed determinant efficiently. They demonstrate that the radiotracer is highly inhomogeneous in early time frames, as suspected in earlier reconstruction attempts that assumed a uniform intensity of radiotracer within the closed surface, and that the optimal choice of hyperparameters is substantially different for different time frames.
Bayesian latent class models with conditionally dependent diagnostic tests: a case study.
Menten, Joris; Boelaert, Marleen; Lesaffre, Emmanuel
2008-09-30
In the assessment of the accuracy of diagnostic tests for infectious diseases, the true disease status of the subjects is often unknown due to the lack of a gold standard test. Latent class models with two latent classes, representing diseased and non-diseased subjects, are often used to analyze this type of data. In its basic format, latent class analysis requires the observed outcomes to be statistically independent conditional on the disease status. In most diagnostic settings, this assumption is highly questionable. During the last decade, several methods have been proposed to estimate latent class models with conditional dependence between the test results. A class of flexible fixed and random effects models were described by Dendukuri and Joseph in a Bayesian framework. We illustrate these models using the analysis of a diagnostic study of three field tests and an imperfect reference test for the diagnosis of visceral leishmaniasis. We show that, as observed earlier by Albert and Dodd, different dependence models may result in similar fits to the data while resulting in different inferences. Given this problem, selection of appropriate latent class models should be based on substantive subject matter knowledge. If several clinically plausible models are supported by the data, a sensitivity analysis should be performed by describing the results obtained from different models and using different priors. Copyright (c) 2008 John Wiley & Sons, Ltd.
Statistical modelling of fish stocks
DEFF Research Database (Denmark)
Kvist, Trine
1999-01-01
for modelling the dynamics of a fish population is suggested. A new approach is introduced to analyse the sources of variation in age composition data, which is one of the most important sources of information in the cohort based models for estimation of stock abundancies and mortalities. The approach combines...... and it is argued that an approach utilising stochastic differential equations might be advantagous in fish stoch assessments....
Budi Astuti, Ani; Iriawan, Nur; Irhamah; Kuswanto, Heri; Sasiarini, Laksmi
2017-10-01
Bayesian statistics proposes an approach that is very flexible in the number of samples and distribution of data. Bayesian Mixture Model (BMM) is a Bayesian approach for multimodal models. Diabetes Mellitus (DM) is more commonly known in the Indonesian community as sweet pee. This disease is one type of chronic non-communicable diseases but it is very dangerous to humans because of the effects of other diseases complications caused. WHO reports in 2013 showed DM disease was ranked 6th in the world as the leading causes of human death. In Indonesia, DM disease continues to increase over time. These research would be studied patterns and would be built the BMM models of the DM data through simulation studies where the simulation data built on cases of blood sugar levels of DM patients in RSUD Saiful Anwar Malang. The results have been successfully demonstrated pattern of distribution of the DM data which has a normal mixture distribution. The BMM models have succeed to accommodate the real condition of the DM data based on the data driven concept.
Directory of Open Access Journals (Sweden)
Mihaela Simionescu
2014-12-01
Full Text Available There are many types of econometric models used in predicting the inflation rate, but in this study we used a Bayesian shrinkage combination approach. This methodology is used in order to improve the predictions accuracy by including information that is not captured by the econometric models. Therefore, experts’ forecasts are utilized as prior information, for Romania these predictions being provided by Institute for Economic Forecasting (Dobrescu macromodel, National Commission for Prognosis and European Commission. The empirical results for Romanian inflation show the superiority of a fixed effects model compared to other types of econometric models like VAR, Bayesian VAR, simultaneous equations model, dynamic model, log-linear model. The Bayesian combinations that used experts’ predictions as priors, when the shrinkage parameter tends to infinite, improved the accuracy of all forecasts based on individual models, outperforming also zero and equal weights predictions and naïve forecasts.
On the Practice of Bayesian Inference in Basic Economic Time Series Models using Gibbs Sampling
M.D. de Pooter (Michiel); R. Segers (René); H.K. van Dijk (Herman)
2006-01-01
textabstractSeveral lessons learned from a Bayesian analysis of basic economic time series models by means of the Gibbs sampling algorithm are presented. Models include the Cochrane-Orcutt model for serial correlation, the Koyck distributed lag model, the Unit Root model, the Instrumental Variables
Bayesian Analysis of Geostatistical Models With an Auxiliary Lattice
Park, Jincheol
2012-04-01
The Gaussian geostatistical model has been widely used for modeling spatial data. However, this model suffers from a severe difficulty in computation: it requires users to invert a large covariance matrix. This is infeasible when the number of observations is large. In this article, we propose an auxiliary lattice-based approach for tackling this difficulty. By introducing an auxiliary lattice to the space of observations and defining a Gaussian Markov random field on the auxiliary lattice, our model completely avoids the requirement of matrix inversion. It is remarkable that the computational complexity of our method is only O(n), where n is the number of observations. Hence, our method can be applied to very large datasets with reasonable computational (CPU) times. The numerical results indicate that our model can approximate Gaussian random fields very well in terms of predictions, even for those with long correlation lengths. For real data examples, our model can generally outperform conventional Gaussian random field models in both prediction errors and CPU times. Supplemental materials for the article are available online. © 2012 American Statistical Association, Institute of Mathematical Statistics, and Interface Foundation of North America.
Statistical lung model for microdosimetry
International Nuclear Information System (INIS)
Fisher, D.R.; Hadley, R.T.
1984-03-01
To calculate the microdosimetry of plutonium in the lung, a mathematical description is needed of lung tissue microstructure that defines source-site parameters. Beagle lungs were expanded using a glutaraldehyde fixative at 30 cm water pressure. Tissue specimens, five microns thick, were stained with hematoxylin and eosin then studied using an image analyzer. Measurements were made along horizontal lines through the magnified tissue image. The distribution of air space and tissue chord lengths and locations of epithelial cell nuclei were recorded from about 10,000 line scans. The distribution parameters constituted a model of lung microstructure for predicting the paths of random alpha particle tracks in the lung and the probability of traversing biologically sensitive sites. This lung model may be used in conjunction with established deposition and retention models for determining the microdosimetry in the pulmonary lung for a wide variety of inhaled radioactive materials
Bayesian spatial modeling of HIV mortality via zero-inflated Poisson models.
Musal, Muzaffer; Aktekin, Tevfik
2013-01-30
In this paper, we investigate the effects of poverty and inequality on the number of HIV-related deaths in 62 New York counties via Bayesian zero-inflated Poisson models that exhibit spatial dependence. We quantify inequality via the Theil index and poverty via the ratios of two Census 2000 variables, the number of people under the poverty line and the number of people for whom poverty status is determined, in each Zip Code Tabulation Area. The purpose of this study was to investigate the effects of inequality and poverty in addition to spatial dependence between neighboring regions on HIV mortality rate, which can lead to improved health resource allocation decisions. In modeling county-specific HIV counts, we propose Bayesian zero-inflated Poisson models whose rates are functions of both covariate and spatial/random effects. To show how the proposed models work, we used three different publicly available data sets: TIGER Shapefiles, Census 2000, and mortality index files. In addition, we introduce parameter estimation issues of Bayesian zero-inflated Poisson models and discuss MCMC method implications. Copyright © 2012 John Wiley & Sons, Ltd.
Bayesian Modeling of ChIP-chip Data Through a High-Order Ising Model
Mo, Qianxing
2010-01-29
ChIP-chip experiments are procedures that combine chromatin immunoprecipitation (ChIP) and DNA microarray (chip) technology to study a variety of biological problems, including protein-DNA interaction, histone modification, and DNA methylation. The most important feature of ChIP-chip data is that the intensity measurements of probes are spatially correlated because the DNA fragments are hybridized to neighboring probes in the experiments. We propose a simple, but powerful Bayesian hierarchical approach to ChIP-chip data through an Ising model with high-order interactions. The proposed method naturally takes into account the intrinsic spatial structure of the data and can be used to analyze data from multiple platforms with different genomic resolutions. The model parameters are estimated using the Gibbs sampler. The proposed method is illustrated using two publicly available data sets from Affymetrix and Agilent platforms, and compared with three alternative Bayesian methods, namely, Bayesian hierarchical model, hierarchical gamma mixture model, and Tilemap hidden Markov model. The numerical results indicate that the proposed method performs as well as the other three methods for the data from Affymetrix tiling arrays, but significantly outperforms the other three methods for the data from Agilent promoter arrays. In addition, we find that the proposed method has better operating characteristics in terms of sensitivities and false discovery rates under various scenarios. © 2010, The International Biometric Society.
Using literature and data to learn Bayesian networks as clinical models of ovarian tumors.
Antal, Peter; Fannes, Geert; Timmerman, Dirk; Moreau, Yves; De Moor, Bart
2004-03-01
Thanks to its increasing availability, electronic literature has become a potential source of information for the development of complex Bayesian networks (BN), when human expertise is missing or data is scarce or contains much noise. This opportunity raises the question of how to integrate information from free-text resources with statistical data in learning Bayesian networks. Firstly, we report on the collection of prior information resources in the ovarian cancer domain, which includes "kernel" annotations of the domain variables. We introduce methods based on the annotations and literature to derive informative pairwise dependency measures, which are derived from the statistical cooccurrence of the names of the variables, from the similarity of the "kernel" descriptions of the variables and from a combined method. We perform wide-scale evaluation of these text-based dependency scores against an expert reference and against data scores (the mutual information (MI) and a Bayesian score). Next, we transform the text-based dependency measures into informative text-based priors for Bayesian network structures. Finally, we report the benefit of such informative text-based priors on the performance of a Bayesian network for the classification of ovarian tumors from clinical data.
Gilkey, Kelly M.; Myers, Jerry G.; McRae, Michael P.; Griffin, Elise A.; Kallrui, Aditya S.
2012-01-01
The Exploration Medical Capability project is creating a catalog of risk assessments using the Integrated Medical Model (IMM). The IMM is a software-based system intended to assist mission planners in preparing for spaceflight missions by helping them to make informed decisions about medical preparations and supplies needed for combating and treating various medical events using Probabilistic Risk Assessment. The objective is to use statistical analyses to inform the IMM decision tool with estimated probabilities of medical events occurring during an exploration mission. Because data regarding astronaut health are limited, Bayesian statistical analysis is used. Bayesian inference combines prior knowledge, such as data from the general U.S. population, the U.S. Submarine Force, or the analog astronaut population located at the NASA Johnson Space Center, with observed data for the medical condition of interest. The posterior results reflect the best evidence for specific medical events occurring in flight. Bayes theorem provides a formal mechanism for combining available observed data with data from similar studies to support the quantification process. The IMM team performed Bayesian updates on the following medical events: angina, appendicitis, atrial fibrillation, atrial flutter, dental abscess, dental caries, dental periodontal disease, gallstone disease, herpes zoster, renal stones, seizure, and stroke.
Louzoun, Yoram; Alter, Idan; Gragert, Loren; Albrecht, Mark; Maiers, Martin
2018-05-01
Regardless of sampling depth, accurate genotype imputation is limited in regions of high polymorphism which often have a heavy-tailed haplotype frequency distribution. Many rare haplotypes are thus unobserved. Statistical methods to improve imputation by extending reference haplotype distributions using linkage disequilibrium patterns that relate allele and haplotype frequencies have not yet been explored. In the field of unrelated stem cell transplantation, imputation of highly polymorphic human leukocyte antigen (HLA) genes has an important application in identifying the best-matched stem cell donor when searching large registries totaling over 28,000,000 donors worldwide. Despite these large registry sizes, a significant proportion of searched patients present novel HLA haplotypes. Supporting this observation, HLA population genetic models have indicated that many extant HLA haplotypes remain unobserved. The absent haplotypes are a significant cause of error in haplotype matching. We have applied a Bayesian inference methodology for extending haplotype frequency distributions, using a model where new haplotypes are created by recombination of observed alleles. Applications of this joint probability model offer significant improvement in frequency distribution estimates over the best existing alternative methods, as we illustrate using five-locus HLA frequency data from the National Marrow Donor Program registry. Transplant matching algorithms and disease association studies involving phasing and imputation of rare variants may benefit from this statistical inference framework.
International Nuclear Information System (INIS)
Dongiovanni, Danilo Nicola; Iesmantas, Tomas
2016-01-01
Highlights: • RAMI (Reliability, Availability, Maintainability and Inspectability) assessment of secondary heat transfer loop for a DEMO nuclear fusion plant. • Definition of a fault tree for a nuclear steam turbine operated in pulsed mode. • Turbine failure rate models update by mean of a Bayesian network reflecting the fault tree analysis in the considered scenario. • Sensitivity analysis on system availability performance. - Abstract: Availability will play an important role in the Demonstration Power Plant (DEMO) success from an economic and safety perspective. Availability performance is commonly assessed by Reliability Availability Maintainability Inspectability (RAMI) analysis, strongly relying on the accurate definition of system components failure modes (FM) and failure rates (FR). Little component experience is available in fusion application, therefore requiring the adaptation of literature FR to fusion plant operating conditions, which may differ in several aspects. As a possible solution to this problem, a new methodology to extrapolate/estimate components failure rate under different operating conditions is presented. The DEMO Balance of Plant nuclear steam turbine component operated in pulse mode is considered as study case. The methodology moves from the definition of a fault tree taking into account failure modes possibly enhanced by pulsed operation. The fault tree is then translated into a Bayesian network. A statistical model for the turbine system failure rate in terms of subcomponents’ FR is hence obtained, allowing for sensitivity analyses on the structured mixture of literature and unknown FR data for which plausible value intervals are investigated to assess their impact on the whole turbine system FR. Finally, the impact of resulting turbine system FR on plant availability is assessed exploiting a Reliability Block Diagram (RBD) model for a typical secondary cooling system implementing a Rankine cycle. Mean inherent availability
Directory of Open Access Journals (Sweden)
Gao Shouguo
2011-08-01
Full Text Available Abstract Background Bayesian Network (BN is a powerful approach to reconstructing genetic regulatory networks from gene expression data. However, expression data by itself suffers from high noise and lack of power. Incorporating prior biological knowledge can improve the performance. As each type of prior knowledge on its own may be incomplete or limited by quality issues, integrating multiple sources of prior knowledge to utilize their consensus is desirable. Results We introduce a new method to incorporate the quantitative information from multiple sources of prior knowledge. It first uses the Naïve Bayesian classifier to assess the likelihood of functional linkage between gene pairs based on prior knowledge. In this study we included cocitation in PubMed and schematic similarity in Gene Ontology annotation. A candidate network edge reservoir is then created in which the copy number of each edge is proportional to the estimated likelihood of linkage between the two corresponding genes. In network simulation the Markov Chain Monte Carlo sampling algorithm is adopted, and samples from this reservoir at each iteration to generate new candidate networks. We evaluated the new algorithm using both simulated and real gene expression data including that from a yeast cell cycle and a mouse pancreas development/growth study. Incorporating prior knowledge led to a ~2 fold increase in the number of known transcription regulations recovered, without significant change in false positive rate. In contrast, without the prior knowledge BN modeling is not always better than a random selection, demonstrating the necessity in network modeling to supplement the gene expression data with additional information. Conclusion our new development provides a statistical means to utilize the quantitative information in prior biological knowledge in the BN modeling of gene expression data, which significantly improves the performance.
Cucchi, K.; Kawa, N.; Hesse, F.; Rubin, Y.
2017-12-01
In order to reduce uncertainty in the prediction of subsurface flow and transport processes, practitioners should use all data available. However, classic inverse modeling frameworks typically only make use of information contained in in-situ field measurements to provide estimates of hydrogeological parameters. Such hydrogeological information about an aquifer is difficult and costly to acquire. In this data-scarce context, the transfer of ex-situ information coming from previously investigated sites can be critical for improving predictions by better constraining the estimation procedure. Bayesian inverse modeling provides a coherent framework to represent such ex-situ information by virtue of the prior distribution and combine them with in-situ information from the target site. In this study, we present an innovative data-driven approach for defining such informative priors for hydrogeological parameters at the target site. Our approach consists in two steps, both relying on statistical and machine learning methods. The first step is data selection; it consists in selecting sites similar to the target site. We use clustering methods for selecting similar sites based on observable hydrogeological features. The second step is data assimilation; it consists in assimilating data from the selected similar sites into the informative prior. We use a Bayesian hierarchical model to account for inter-site variability and to allow for the assimilation of multiple types of site-specific data. We present the application and validation of the presented methods on an established database of hydrogeological parameters. Data and methods are implemented in the form of an open-source R-package and therefore facilitate easy use by other practitioners.
Energy Technology Data Exchange (ETDEWEB)
Dongiovanni, Danilo Nicola, E-mail: danilo.dongiovanni@enea.it [ENEA, Nuclear Fusion and Safety Technologies Department, via Enrico Fermi 45, Frascati 00040 (Italy); Iesmantas, Tomas [LEI, Breslaujos str. 3 Kaunas (Lithuania)
2016-11-01
Highlights: • RAMI (Reliability, Availability, Maintainability and Inspectability) assessment of secondary heat transfer loop for a DEMO nuclear fusion plant. • Definition of a fault tree for a nuclear steam turbine operated in pulsed mode. • Turbine failure rate models update by mean of a Bayesian network reflecting the fault tree analysis in the considered scenario. • Sensitivity analysis on system availability performance. - Abstract: Availability will play an important role in the Demonstration Power Plant (DEMO) success from an economic and safety perspective. Availability performance is commonly assessed by Reliability Availability Maintainability Inspectability (RAMI) analysis, strongly relying on the accurate definition of system components failure modes (FM) and failure rates (FR). Little component experience is available in fusion application, therefore requiring the adaptation of literature FR to fusion plant operating conditions, which may differ in several aspects. As a possible solution to this problem, a new methodology to extrapolate/estimate components failure rate under different operating conditions is presented. The DEMO Balance of Plant nuclear steam turbine component operated in pulse mode is considered as study case. The methodology moves from the definition of a fault tree taking into account failure modes possibly enhanced by pulsed operation. The fault tree is then translated into a Bayesian network. A statistical model for the turbine system failure rate in terms of subcomponents’ FR is hence obtained, allowing for sensitivity analyses on the structured mixture of literature and unknown FR data for which plausible value intervals are investigated to assess their impact on the whole turbine system FR. Finally, the impact of resulting turbine system FR on plant availability is assessed exploiting a Reliability Block Diagram (RBD) model for a typical secondary cooling system implementing a Rankine cycle. Mean inherent availability
Actuarial statistics with generalized linear mixed models
Antonio, K.; Beirlant, J.
2007-01-01
Over the last decade the use of generalized linear models (GLMs) in actuarial statistics has received a lot of attention, starting from the actuarial illustrations in the standard text by McCullagh and Nelder [McCullagh, P., Nelder, J.A., 1989. Generalized linear models. In: Monographs on Statistics
A Bayesian model for pooling gene expression studies that incorporates co-regulation information.
Directory of Open Access Journals (Sweden)
Erin M Conlon
Full Text Available Current Bayesian microarray models that pool multiple studies assume gene expression is independent of other genes. However, in prokaryotic organisms, genes are arranged in units that are co-regulated (called operons. Here, we introduce a new Bayesian model for pooling gene expression studies that incorporates operon information into the model. Our Bayesian model borrows information from other genes within the same operon to improve estimation of gene expression. The model produces the gene-specific posterior probability of differential expression, which is the basis for inference. We found in simulations and in biological studies that incorporating co-regulation information improves upon the independence model. We assume that each study contains two experimental conditions: a treatment and control. We note that there exist environmental conditions for which genes that are supposed to be transcribed together lose their operon structure, and that our model is best carried out for known operon structures.
Statistical Modeling of Bivariate Data.
1982-08-01
to one. Following Crain (1974), one may consider order m approximators m log f111(X) - k k (x) - c(e), asx ;b. (4.4.5) k,-r A m and attempt to find...literature. Consider the approximate model m log fn (x) = 7 ekk(x) + a G(x), aSx ;b, (44.8) " k=-Mn ’ where G(x) is a Gaussian process and n is a
International Nuclear Information System (INIS)
Chagas Moura, Márcio das; Azevedo, Rafael Valença; Droguett, Enrique López; Chaves, Leandro Rego; Lins, Isis Didier
2016-01-01
Occupational accidents pose several negative consequences to employees, employers, environment and people surrounding the locale where the accident takes place. Some types of accidents correspond to low frequency-high consequence (long sick leaves) events, and then classical statistical approaches are ineffective in these cases because the available dataset is generally sparse and contain censored recordings. In this context, we propose a Bayesian population variability method for the estimation of the distributions of the rates of accident and recovery. Given these distributions, a Markov-based model will be used to estimate the uncertainty over the expected number of accidents and the work time loss. Thus, the use of Bayesian analysis along with the Markov approach aims at investigating future trends regarding occupational accidents in a workplace as well as enabling a better management of the labor force and prevention efforts. One application example is presented in order to validate the proposed approach; this case uses available data gathered from a hydropower company in Brazil. - Highlights: • This paper proposes a Bayesian method to estimate rates of accident and recovery. • The model requires simple data likely to be available in the company database. • These results show the proposed model is not too sensitive to the prior estimates.
Directory of Open Access Journals (Sweden)
Xiaolin Shi
2016-01-01
Full Text Available This paper deals with the Bayesian inference on step-stress partially accelerated life tests using Type II progressive censored data in the presence of competing failure causes. Suppose that the occurrence time of the failure cause follows Pareto distribution under use stress levels. Based on the tampered failure rate model, the objective Bayesian estimates, Bayesian estimates, and E-Bayesian estimates of the unknown parameters and acceleration factor are obtained under the squared loss function. To evaluate the performance of the obtained estimates, the average relative errors (AREs and mean squared errors (MSEs are calculated. In addition, the comparisons of the three estimates of unknown parameters and acceleration factor for different sample sizes and different progressive censoring schemes are conducted through Monte Carlo simulations.
Statistical Models and Methods for Lifetime Data
Lawless, Jerald F
2011-01-01
Praise for the First Edition"An indispensable addition to any serious collection on lifetime data analysis and . . . a valuable contribution to the statistical literature. Highly recommended . . ."-Choice"This is an important book, which will appeal to statisticians working on survival analysis problems."-Biometrics"A thorough, unified treatment of statistical models and methods used in the analysis of lifetime data . . . this is a highly competent and agreeable statistical textbook."-Statistics in MedicineThe statistical analysis of lifetime or response time data is a key tool in engineering,
The Bayesian statistical decision theory applied to the optimization of generating set maintenance
International Nuclear Information System (INIS)
Procaccia, H.; Cordier, R.; Muller, S.
1994-11-01
The difficulty in RCM methodology is the allocation of a new periodicity of preventive maintenance on one equipment when a critical failure has been identified: until now this new allocation has been based on the engineer's judgment, and one must wait for a full cycle of feedback experience before to validate it. Statistical decision theory could be a more rational alternative for the optimization of preventive maintenance periodicity. This methodology has been applied to inspection and maintenance optimization of cylinders of diesel generator engines of 900 MW nuclear plants, and has shown that previous preventive maintenance periodicity can be extended. (authors). 8 refs., 5 figs
Shen, Yanna; Cooper, Gregory F
2012-09-01
This paper investigates Bayesian modeling of known and unknown causes of events in the context of disease-outbreak detection. We introduce a multivariate Bayesian approach that models multiple evidential features of every person in the population. This approach models and detects (1) known diseases (e.g., influenza and anthrax) by using informative prior probabilities and (2) unknown diseases (e.g., a new, highly contagious respiratory virus that has never been seen before) by using relatively non-informative prior probabilities. We report the results of simulation experiments which support that this modeling method can improve the detection of new disease outbreaks in a population. A contribution of this paper is that it introduces a multivariate Bayesian approach for jointly modeling both known and unknown causes of events. Such modeling has general applicability in domains where the space of known causes is incomplete. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
Fast and accurate Bayesian model criticism and conflict diagnostics using R-INLA
Ferkingstad, Egil
2017-10-16
Bayesian hierarchical models are increasingly popular for realistic modelling and analysis of complex data. This trend is accompanied by the need for flexible, general and computationally efficient methods for model criticism and conflict detection. Usually, a Bayesian hierarchical model incorporates a grouping of the individual data points, as, for example, with individuals in repeated measurement data. In such cases, the following question arises: Are any of the groups “outliers,” or in conflict with the remaining groups? Existing general approaches aiming to answer such questions tend to be extremely computationally demanding when model fitting is based on Markov chain Monte Carlo. We show how group-level model criticism and conflict detection can be carried out quickly and accurately through integrated nested Laplace approximations (INLA). The new method is implemented as a part of the open-source R-INLA package for Bayesian computing (http://r-inla.org).
Bayesian interpolation in a dynamic sinusoidal model with application to packet-loss concealment
DEFF Research Database (Denmark)
Nielsen, Jesper Kjær; Christensen, Mads Græsbøll; Cemgil, Ali Taylan
2010-01-01
In this paper, we consider Bayesian interpolation and parameter estimation in a dynamic sinusoidal model. This model is more ﬂexible than the static sinusoidal model since it enables the amplitudes and phases of the sinusoids to be time-varying. For the dynamic sinusoidal model, we derive...
Lesaffre, Emmanuel
2012-01-01
The growth of biostatistics has been phenomenal in recent years and has been marked by considerable technical innovation in both methodology and computational practicality. One area that has experienced significant growth is Bayesian methods. The growing use of Bayesian methodology has taken place partly due to an increasing number of practitioners valuing the Bayesian paradigm as matching that of scientific discovery. In addition, computational advances have allowed for more complex models to be fitted routinely to realistic data sets. Through examples, exercises and a combination of introd
Bayesian networks in reliability
Energy Technology Data Exchange (ETDEWEB)
Langseth, Helge [Department of Mathematical Sciences, Norwegian University of Science and Technology, N-7491 Trondheim (Norway)]. E-mail: helgel@math.ntnu.no; Portinale, Luigi [Department of Computer Science, University of Eastern Piedmont ' Amedeo Avogadro' , 15100 Alessandria (Italy)]. E-mail: portinal@di.unipmn.it
2007-01-15
Over the last decade, Bayesian networks (BNs) have become a popular tool for modelling many kinds of statistical problems. We have also seen a growing interest for using BNs in the reliability analysis community. In this paper we will discuss the properties of the modelling framework that make BNs particularly well suited for reliability applications, and point to ongoing research that is relevant for practitioners in reliability.
Iizumi, T.; Nishimori, M.; Yokozawa, M.; Kotera, A.; Khang, N. D.
2008-12-01
Long-term daily global solar radiation (GSR) data of the same quality in the 20th century has been needed as a baseline to assess the climate change impact on paddy rice production in Vietnamese Mekong Delta area (MKD: 104.5-107.5oE/8.2-11.2oN). However, though sunshine duration data is available, the accessibility of GSR data is quite poor in MKD. This study estimated the daily GSR in MKD for 30-yr (1978- 2007) by applying the statistical downscaling method (SDM). The estimates of GSR was obtained from four different sources: (1) the combined equations with the corrected reanalysis data of daily maximum/minimum temperatures, relative humidity, sea level pressure, and precipitable water; (2) the correction equation with the reanalysis data of downward shortwave radiation; (3) the empirical equation with the observed sunshine duration; and (4) the observation at one site for short term. Three reanalysis data, i.e., NCEP-R1, ERA-40, and JRA-25, were used. Also the observed meteorological data, which includes many missing data, were obtained from 11 stations of the Vietnamese Meteorological Agency for 28-yr and five stations of the Global Summary of the Day for 30-yr. The observed GSR data for 1-yr was obtained from our station. Considering the use of data with many missing data for analysis, the Bayesian inference was used for this study, which has the powerful capability to optimize multiple parameters in a non-linear and hierarchical model. The Bayesian inference provided the posterior distributions of 306 parameter values relating to the combined equations, the empirical equation, and the correction equation. The preliminary result shows that the amplitude of daily fluctuation of modeled GSR was underestimated by the empirical equation and the correction equation. The combination of SDM and Bayesian inference has a potential to estimate the long- term daily GSR of the same quality even though in the area where the observed data is quite limited.