A Bayesian ensemble of sensitivity measures for severe accident modeling
Energy Technology Data Exchange (ETDEWEB)
Hoseyni, Seyed Mohsen [Department of Basic Sciences, East Tehran Branch, Islamic Azad University, Tehran (Iran, Islamic Republic of); Di Maio, Francesco, E-mail: francesco.dimaio@polimi.it [Energy Department, Politecnico di Milano, Via La Masa 34, 20156 Milano (Italy); Vagnoli, Matteo [Energy Department, Politecnico di Milano, Via La Masa 34, 20156 Milano (Italy); Zio, Enrico [Energy Department, Politecnico di Milano, Via La Masa 34, 20156 Milano (Italy); Chair on System Science and Energetic Challenge, Fondation EDF – Electricite de France Ecole Centrale, Paris, and Supelec, Paris (France); Pourgol-Mohammad, Mohammad [Department of Mechanical Engineering, Sahand University of Technology, Tabriz (Iran, Islamic Republic of)
2015-12-15
Highlights: • We propose a sensitivity analysis (SA) method based on a Bayesian updating scheme. • The Bayesian updating schemes adjourns an ensemble of sensitivity measures. • Bootstrap replicates of a severe accident code output are fed to the Bayesian scheme. • The MELCOR code simulates the fission products release of LOFT LP-FP-2 experiment. • Results are compared with those of traditional SA methods. - Abstract: In this work, a sensitivity analysis framework is presented to identify the relevant input variables of a severe accident code, based on an incremental Bayesian ensemble updating method. The proposed methodology entails: (i) the propagation of the uncertainty in the input variables through the severe accident code; (ii) the collection of bootstrap replicates of the input and output of limited number of simulations for building a set of finite mixture models (FMMs) for approximating the probability density function (pdf) of the severe accident code output of the replicates; (iii) for each FMM, the calculation of an ensemble of sensitivity measures (i.e., input saliency, Hellinger distance and Kullback–Leibler divergence) and the updating when a new piece of evidence arrives, by a Bayesian scheme, based on the Bradley–Terry model for ranking the most relevant input model variables. An application is given with respect to a limited number of simulations of a MELCOR severe accident model describing the fission products release in the LP-FP-2 experiment of the loss of fluid test (LOFT) facility, which is a scaled-down facility of a pressurized water reactor (PWR).
Bayesian model ensembling using meta-trained recurrent neural networks
Ambrogioni, L.; Berezutskaya, Y.; Gü ç lü , U.; Borne, E.W.P. van den; Gü ç lü tü rk, Y.; Gerven, M.A.J. van; Maris, E.G.G.
2017-01-01
In this paper we demonstrate that a recurrent neural network meta-trained on an ensemble of arbitrary classification tasks can be used as an approximation of the Bayes optimal classifier. This result is obtained by relying on the framework of e-free approximate Bayesian inference, where the Bayesian
Ensemble bayesian model averaging using markov chain Monte Carlo sampling
Energy Technology Data Exchange (ETDEWEB)
Vrugt, Jasper A [Los Alamos National Laboratory; Diks, Cees G H [NON LANL; Clark, Martyn P [NON LANL
2008-01-01
Bayesian model averaging (BMA) has recently been proposed as a statistical method to calibrate forecast ensembles from numerical weather models. Successful implementation of BMA however, requires accurate estimates of the weights and variances of the individual competing models in the ensemble. In their seminal paper (Raftery etal. Mon Weather Rev 133: 1155-1174, 2(05)) has recommended the Expectation-Maximization (EM) algorithm for BMA model training, even though global convergence of this algorithm cannot be guaranteed. In this paper, we compare the performance of the EM algorithm and the recently developed Differential Evolution Adaptive Metropolis (DREAM) Markov Chain Monte Carlo (MCMC) algorithm for estimating the BMA weights and variances. Simulation experiments using 48-hour ensemble data of surface temperature and multi-model stream-flow forecasts show that both methods produce similar results, and that their performance is unaffected by the length of the training data set. However, MCMC simulation with DREAM is capable of efficiently handling a wide variety of BMA predictive distributions, and provides useful information about the uncertainty associated with the estimated BMA weights and variances.
Bayesian energy landscape tilting: towards concordant models of molecular ensembles.
Beauchamp, Kyle A; Pande, Vijay S; Das, Rhiju
2014-03-18
Predicting biological structure has remained challenging for systems such as disordered proteins that take on myriad conformations. Hybrid simulation/experiment strategies have been undermined by difficulties in evaluating errors from computational model inaccuracies and data uncertainties. Building on recent proposals from maximum entropy theory and nonequilibrium thermodynamics, we address these issues through a Bayesian energy landscape tilting (BELT) scheme for computing Bayesian hyperensembles over conformational ensembles. BELT uses Markov chain Monte Carlo to directly sample maximum-entropy conformational ensembles consistent with a set of input experimental observables. To test this framework, we apply BELT to model trialanine, starting from disagreeing simulations with the force fields ff96, ff99, ff99sbnmr-ildn, CHARMM27, and OPLS-AA. BELT incorporation of limited chemical shift and (3)J measurements gives convergent values of the peptide's α, β, and PPII conformational populations in all cases. As a test of predictive power, all five BELT hyperensembles recover set-aside measurements not used in the fitting and report accurate errors, even when starting from highly inaccurate simulations. BELT's principled framework thus enables practical predictions for complex biomolecular systems from discordant simulations and sparse data. Copyright © 2014 Biophysical Society. Published by Elsevier Inc. All rights reserved.
Climatic Models Ensemble-based Mid-21st Century Runoff Projections: A Bayesian Framework
Achieng, K. O.; Zhu, J.
2017-12-01
There are a number of North American Regional Climate Change Assessment Program (NARCCAP) climatic models that have been used to project surface runoff in the mid-21st century. Statistical model selection techniques are often used to select the model that best fits data. However, model selection techniques often lead to different conclusions. In this study, ten models are averaged in Bayesian paradigm to project runoff. Bayesian Model Averaging (BMA) is used to project and identify effect of model uncertainty on future runoff projections. Baseflow separation - a two-digital filter which is also called Eckhardt filter - is used to separate USGS streamflow (total runoff) into two components: baseflow and surface runoff. We use this surface runoff as the a priori runoff when conducting BMA of runoff simulated from the ten RCM models. The primary objective of this study is to evaluate how well RCM multi-model ensembles simulate surface runoff, in a Bayesian framework. Specifically, we investigate and discuss the following questions: How well do ten RCM models ensemble jointly simulate surface runoff by averaging over all the models using BMA, given a priori surface runoff? What are the effects of model uncertainty on surface runoff simulation?
A Bayesian posterior predictive framework for weighting ensemble regional climate models
Directory of Open Access Journals (Sweden)
Y. Fan
2017-06-01
Full Text Available We present a novel Bayesian statistical approach to computing model weights in climate change projection ensembles in order to create probabilistic projections. The weight of each climate model is obtained by weighting the current day observed data under the posterior distribution admitted under competing climate models. We use a linear model to describe the model output and observations. The approach accounts for uncertainty in model bias, trend and internal variability, including error in the observations used. Our framework is general, requires very little problem-specific input, and works well with default priors. We carry out cross-validation checks that confirm that the method produces the correct coverage.
Taroni, M.; Selva, J.
2017-12-01
In this work we show how we built an ensemble seismic hazard model for the magnitude distribution for the TSUMAPS-NEAM EU project (http://www.tsumaps-neam.eu/). The considered source area includes the whole NEAM region (North East Atlantic, Mediterranean and connected seas). We build our models by using the catalogs (EMEC and ISC), their completeness and the regionalization provided by the project. We developed four alternative implementations of a Bayesian model, considering tapered or truncated Gutenberg-Richter distributions, and fixed or variable b-value. The frequency size distribution is based on the Weichert formulation. This allows for simultaneously assessing all the frequency-size distribution parameters (a-value, b-value, and corner magnitude), using multiple completeness periods for the different magnitudes. With respect to previous studies, we introduce the tapered Pareto distribution (in addition to the classical truncated Pareto), and we build a novel approach to quantify the prior distribution. For each alternative implementation, we set the prior distributions using the global seismic data grouped according to the different types of tectonic setting, and assigned them to the related regions. The estimation is based on the complete (not declustered) local catalog in each region. Using the complete catalog also allows us to consider foreshocks and aftershocks in the seismic rate computation: the Poissonicity of the tsunami events (and similarly the exceedances of the PGA) will be insured by the Le Cam's theorem. This Bayesian approach provides robust estimations also in the zones where few events are available, but also leaves us the possibility to explore the uncertainty associated with the estimation of the magnitude distribution parameters (e.g. with the classical Metropolis-Hastings Monte Carlo method). Finally we merge all the models with their uncertainty to create the ensemble model that represents our knowledge of the seismicity in the
Energy Technology Data Exchange (ETDEWEB)
Vrugt, Jasper A [Los Alamos National Laboratory; Wohling, Thomas [NON LANL
2008-01-01
Most studies in vadose zone hydrology use a single conceptual model for predictive inference and analysis. Focusing on the outcome of a single model is prone to statistical bias and underestimation of uncertainty. In this study, we combine multi-objective optimization and Bayesian Model Averaging (BMA) to generate forecast ensembles of soil hydraulic models. To illustrate our method, we use observed tensiometric pressure head data at three different depths in a layered vadose zone of volcanic origin in New Zealand. A set of seven different soil hydraulic models is calibrated using a multi-objective formulation with three different objective functions that each measure the mismatch between observed and predicted soil water pressure head at one specific depth. The Pareto solution space corresponding to these three objectives is estimated with AMALGAM, and used to generate four different model ensembles. These ensembles are post-processed with BMA and used for predictive analysis and uncertainty estimation. Our most important conclusions for the vadose zone under consideration are: (1) the mean BMA forecast exhibits similar predictive capabilities as the best individual performing soil hydraulic model, (2) the size of the BMA uncertainty ranges increase with increasing depth and dryness in the soil profile, (3) the best performing ensemble corresponds to the compromise (or balanced) solution of the three-objective Pareto surface, and (4) the combined multi-objective optimization and BMA framework proposed in this paper is very useful to generate forecast ensembles of soil hydraulic models.
Ensemble Bayesian forecasting system Part I: Theory and algorithms
Herr, Henry D.; Krzysztofowicz, Roman
2015-05-01
The ensemble Bayesian forecasting system (EBFS), whose theory was published in 2001, is developed for the purpose of quantifying the total uncertainty about a discrete-time, continuous-state, non-stationary stochastic process such as a time series of stages, discharges, or volumes at a river gauge. The EBFS is built of three components: an input ensemble forecaster (IEF), which simulates the uncertainty associated with random inputs; a deterministic hydrologic model (of any complexity), which simulates physical processes within a river basin; and a hydrologic uncertainty processor (HUP), which simulates the hydrologic uncertainty (an aggregate of all uncertainties except input). It works as a Monte Carlo simulator: an ensemble of time series of inputs (e.g., precipitation amounts) generated by the IEF is transformed deterministically through a hydrologic model into an ensemble of time series of outputs, which is next transformed stochastically by the HUP into an ensemble of time series of predictands (e.g., river stages). Previous research indicated that in order to attain an acceptable sampling error, the ensemble size must be on the order of hundreds (for probabilistic river stage forecasts and probabilistic flood forecasts) or even thousands (for probabilistic stage transition forecasts). The computing time needed to run the hydrologic model this many times renders the straightforward simulations operationally infeasible. This motivates the development of the ensemble Bayesian forecasting system with randomization (EBFSR), which takes full advantage of the analytic meta-Gaussian HUP and generates multiple ensemble members after each run of the hydrologic model; this auxiliary randomization reduces the required size of the meteorological input ensemble and makes it operationally feasible to generate a Bayesian ensemble forecast of large size. Such a forecast quantifies the total uncertainty, is well calibrated against the prior (climatic) distribution of
2017-09-01
application of statistical inference. Even when human forecasters leverage their professional experience, which is often gained through long periods of... application throughout statistics and Bayesian data analysis. The multivariate form of 2( , ) (e.g., Figure 12) is similarly analytically...data (i.e., no systematic manipulations with analytical functions), it is common in the statistical literature to apply mathematical transformations
Qi, Wei; Liu, Junguo; Yang, Hong; Sweetapple, Chris
2018-03-01
Global precipitation products are very important datasets in flow simulations, especially in poorly gauged regions. Uncertainties resulting from precipitation products, hydrological models and their combinations vary with time and data magnitude, and undermine their application to flow simulations. However, previous studies have not quantified these uncertainties individually and explicitly. This study developed an ensemble-based dynamic Bayesian averaging approach (e-Bay) for deterministic discharge simulations using multiple global precipitation products and hydrological models. In this approach, the joint probability of precipitation products and hydrological models being correct is quantified based on uncertainties in maximum and mean estimation, posterior probability is quantified as functions of the magnitude and timing of discharges, and the law of total probability is implemented to calculate expected discharges. Six global fine-resolution precipitation products and two hydrological models of different complexities are included in an illustrative application. e-Bay can effectively quantify uncertainties and therefore generate better deterministic discharges than traditional approaches (weighted average methods with equal and varying weights and maximum likelihood approach). The mean Nash-Sutcliffe Efficiency values of e-Bay are up to 0.97 and 0.85 in training and validation periods respectively, which are at least 0.06 and 0.13 higher than traditional approaches. In addition, with increased training data, assessment criteria values of e-Bay show smaller fluctuations than traditional approaches and its performance becomes outstanding. The proposed e-Bay approach bridges the gap between global precipitation products and their pragmatic applications to discharge simulations, and is beneficial to water resources management in ungauged or poorly gauged regions across the world.
Bayesian ensemble refinement by replica simulations and reweighting
Hummer, Gerhard; Köfinger, Jürgen
2015-12-01
We describe different Bayesian ensemble refinement methods, examine their interrelation, and discuss their practical application. With ensemble refinement, the properties of dynamic and partially disordered (bio)molecular structures can be characterized by integrating a wide range of experimental data, including measurements of ensemble-averaged observables. We start from a Bayesian formulation in which the posterior is a functional that ranks different configuration space distributions. By maximizing this posterior, we derive an optimal Bayesian ensemble distribution. For discrete configurations, this optimal distribution is identical to that obtained by the maximum entropy "ensemble refinement of SAXS" (EROS) formulation. Bayesian replica ensemble refinement enhances the sampling of relevant configurations by imposing restraints on averages of observables in coupled replica molecular dynamics simulations. We show that the strength of the restraints should scale linearly with the number of replicas to ensure convergence to the optimal Bayesian result in the limit of infinitely many replicas. In the "Bayesian inference of ensembles" method, we combine the replica and EROS approaches to accelerate the convergence. An adaptive algorithm can be used to sample directly from the optimal ensemble, without replicas. We discuss the incorporation of single-molecule measurements and dynamic observables such as relaxation parameters. The theoretical analysis of different Bayesian ensemble refinement approaches provides a basis for practical applications and a starting point for further investigations.
Bayesian ensemble approach to error estimation of interatomic potentials
DEFF Research Database (Denmark)
Frederiksen, Søren Lund; Jacobsen, Karsten Wedel; Brown, K.S.
2004-01-01
Using a Bayesian approach a general method is developed to assess error bars on predictions made by models fitted to data. The error bars are estimated from fluctuations in ensembles of models sampling the model-parameter space with a probability density set by the minimum cost. The method...... is applied to the development of interatomic potentials for molybdenum using various potential forms and databases based on atomic forces. The calculated error bars on elastic constants, gamma-surface energies, structural energies, and dislocation properties are shown to provide realistic estimates...
Multi-Model Ensemble Wake Vortex Prediction
Koerner, Stephan; Holzaepfel, Frank; Ahmad, Nash'at N.
2015-01-01
Several multi-model ensemble methods are investigated for predicting wake vortex transport and decay. This study is a joint effort between National Aeronautics and Space Administration and Deutsches Zentrum fuer Luft- und Raumfahrt to develop a multi-model ensemble capability using their wake models. An overview of different multi-model ensemble methods and their feasibility for wake applications is presented. The methods include Reliability Ensemble Averaging, Bayesian Model Averaging, and Monte Carlo Simulations. The methodologies are evaluated using data from wake vortex field experiments.
DEFF Research Database (Denmark)
Jensen, Finn Verner; Nielsen, Thomas Dyhre
2016-01-01
Mathematically, a Bayesian graphical model is a compact representation of the joint probability distribution for a set of variables. The most frequently used type of Bayesian graphical models are Bayesian networks. The structural part of a Bayesian graphical model is a graph consisting of nodes...
Congdon, Peter
2014-01-01
This book provides an accessible approach to Bayesian computing and data analysis, with an emphasis on the interpretation of real data sets. Following in the tradition of the successful first edition, this book aims to make a wide range of statistical modeling applications accessible using tested code that can be readily adapted to the reader's own applications. The second edition has been thoroughly reworked and updated to take account of advances in the field. A new set of worked examples is included. The novel aspect of the first edition was the coverage of statistical modeling using WinBU
Particle Kalman Filtering: A Nonlinear Bayesian Framework for Ensemble Kalman Filters*
Hoteit, Ibrahim
2012-02-01
This paper investigates an approximation scheme of the optimal nonlinear Bayesian filter based on the Gaussian mixture representation of the state probability distribution function. The resulting filter is similar to the particle filter, but is different from it in that the standard weight-type correction in the particle filter is complemented by the Kalman-type correction with the associated covariance matrices in the Gaussian mixture. The authors show that this filter is an algorithm in between the Kalman filter and the particle filter, and therefore is referred to as the particle Kalman filter (PKF). In the PKF, the solution of a nonlinear filtering problem is expressed as the weighted average of an “ensemble of Kalman filters” operating in parallel. Running an ensemble of Kalman filters is, however, computationally prohibitive for realistic atmospheric and oceanic data assimilation problems. For this reason, the authors consider the construction of the PKF through an “ensemble” of ensemble Kalman filters (EnKFs) instead, and call the implementation the particle EnKF (PEnKF). It is shown that different types of the EnKFs can be considered as special cases of the PEnKF. Similar to the situation in the particle filter, the authors also introduce a resampling step to the PEnKF in order to reduce the risk of weights collapse and improve the performance of the filter. Numerical experiments with the strongly nonlinear Lorenz-96 model are presented and discussed.
Bayesian nonparametric hierarchical modeling.
Dunson, David B
2009-04-01
In biomedical research, hierarchical models are very widely used to accommodate dependence in multivariate and longitudinal data and for borrowing of information across data from different sources. A primary concern in hierarchical modeling is sensitivity to parametric assumptions, such as linearity and normality of the random effects. Parametric assumptions on latent variable distributions can be challenging to check and are typically unwarranted, given available prior knowledge. This article reviews some recent developments in Bayesian nonparametric methods motivated by complex, multivariate and functional data collected in biomedical studies. The author provides a brief review of flexible parametric approaches relying on finite mixtures and latent class modeling. Dirichlet process mixture models are motivated by the need to generalize these approaches to avoid assuming a fixed finite number of classes. Focusing on an epidemiology application, the author illustrates the practical utility and potential of nonparametric Bayes methods.
Advanced Atmospheric Ensemble Modeling Techniques
Energy Technology Data Exchange (ETDEWEB)
Buckley, R. [Savannah River Site (SRS), Aiken, SC (United States). Savannah River National Lab. (SRNL); Chiswell, S. [Savannah River Site (SRS), Aiken, SC (United States). Savannah River National Lab. (SRNL); Kurzeja, R. [Savannah River Site (SRS), Aiken, SC (United States). Savannah River National Lab. (SRNL); Maze, G. [Savannah River Site (SRS), Aiken, SC (United States). Savannah River National Lab. (SRNL); Viner, B. [Savannah River Site (SRS), Aiken, SC (United States). Savannah River National Lab. (SRNL); Werth, D. [Savannah River Site (SRS), Aiken, SC (United States). Savannah River National Lab. (SRNL)
2017-09-29
Ensemble modeling (EM), the creation of multiple atmospheric simulations for a given time period, has become an essential tool for characterizing uncertainties in model predictions. We explore two novel ensemble modeling techniques: (1) perturbation of model parameters (Adaptive Programming, AP), and (2) data assimilation (Ensemble Kalman Filter, EnKF). The current research is an extension to work from last year and examines transport on a small spatial scale (<100 km) in complex terrain, for more rigorous testing of the ensemble technique. Two different release cases were studied, a coastal release (SF6) and an inland release (Freon) which consisted of two release times. Observations of tracer concentration and meteorology are used to judge the ensemble results. In addition, adaptive grid techniques have been developed to reduce required computing resources for transport calculations. Using a 20- member ensemble, the standard approach generated downwind transport that was quantitatively good for both releases; however, the EnKF method produced additional improvement for the coastal release where the spatial and temporal differences due to interior valley heating lead to the inland movement of the plume. The AP technique showed improvements for both release cases, with more improvement shown in the inland release. This research demonstrated that transport accuracy can be improved when models are adapted to a particular location/time or when important local data is assimilated into the simulation and enhances SRNL’s capability in atmospheric transport modeling in support of its current customer base and local site missions, as well as our ability to attract new customers within the intelligence community.
Bayesian network ensemble as a multivariate strategy to predict radiation pneumonitis risk
International Nuclear Information System (INIS)
Lee, Sangkyu; Ybarra, Norma; Jeyaseelan, Krishinima; Seuntjens, Jan; El Naqa, Issam; Faria, Sergio; Kopek, Neil; Brisebois, Pascale; Bradley, Jeffrey D.; Robinson, Clifford
2015-01-01
Purpose: Prediction of radiation pneumonitis (RP) has been shown to be challenging due to the involvement of a variety of factors including dose–volume metrics and radiosensitivity biomarkers. Some of these factors are highly correlated and might affect prediction results when combined. Bayesian network (BN) provides a probabilistic framework to represent variable dependencies in a directed acyclic graph. The aim of this study is to integrate the BN framework and a systems’ biology approach to detect possible interactions among RP risk factors and exploit these relationships to enhance both the understanding and prediction of RP. Methods: The authors studied 54 nonsmall-cell lung cancer patients who received curative 3D-conformal radiotherapy. Nineteen RP events were observed (common toxicity criteria for adverse events grade 2 or higher). Serum concentration of the following four candidate biomarkers were measured at baseline and midtreatment: alpha-2-macroglobulin, angiotensin converting enzyme (ACE), transforming growth factor, interleukin-6. Dose-volumetric and clinical parameters were also included as covariates. Feature selection was performed using a Markov blanket approach based on the Koller–Sahami filter. The Markov chain Monte Carlo technique estimated the posterior distribution of BN graphs built from the observed data of the selected variables and causality constraints. RP probability was estimated using a limited number of high posterior graphs (ensemble) and was averaged for the final RP estimate using Bayes’ rule. A resampling method based on bootstrapping was applied to model training and validation in order to control under- and overfit pitfalls. Results: RP prediction power of the BN ensemble approach reached its optimum at a size of 200. The optimized performance of the BN model recorded an area under the receiver operating characteristic curve (AUC) of 0.83, which was significantly higher than multivariate logistic regression (0
Bayesian network ensemble as a multivariate strategy to predict radiation pneumonitis risk
Energy Technology Data Exchange (ETDEWEB)
Lee, Sangkyu, E-mail: sangkyu.lee@mail.mcgill.ca; Ybarra, Norma; Jeyaseelan, Krishinima; Seuntjens, Jan; El Naqa, Issam [Medical Physics Unit, McGill University, Montreal, Quebec H3G1A4 (Canada); Faria, Sergio; Kopek, Neil; Brisebois, Pascale [Department of Radiation Oncology, Montreal General Hospital, Montreal, H3G1A4 (Canada); Bradley, Jeffrey D.; Robinson, Clifford [Radiation Oncology, Washington University School of Medicine in St. Louis, St. Louis, Missouri 63110 (United States)
2015-05-15
Purpose: Prediction of radiation pneumonitis (RP) has been shown to be challenging due to the involvement of a variety of factors including dose–volume metrics and radiosensitivity biomarkers. Some of these factors are highly correlated and might affect prediction results when combined. Bayesian network (BN) provides a probabilistic framework to represent variable dependencies in a directed acyclic graph. The aim of this study is to integrate the BN framework and a systems’ biology approach to detect possible interactions among RP risk factors and exploit these relationships to enhance both the understanding and prediction of RP. Methods: The authors studied 54 nonsmall-cell lung cancer patients who received curative 3D-conformal radiotherapy. Nineteen RP events were observed (common toxicity criteria for adverse events grade 2 or higher). Serum concentration of the following four candidate biomarkers were measured at baseline and midtreatment: alpha-2-macroglobulin, angiotensin converting enzyme (ACE), transforming growth factor, interleukin-6. Dose-volumetric and clinical parameters were also included as covariates. Feature selection was performed using a Markov blanket approach based on the Koller–Sahami filter. The Markov chain Monte Carlo technique estimated the posterior distribution of BN graphs built from the observed data of the selected variables and causality constraints. RP probability was estimated using a limited number of high posterior graphs (ensemble) and was averaged for the final RP estimate using Bayes’ rule. A resampling method based on bootstrapping was applied to model training and validation in order to control under- and overfit pitfalls. Results: RP prediction power of the BN ensemble approach reached its optimum at a size of 200. The optimized performance of the BN model recorded an area under the receiver operating characteristic curve (AUC) of 0.83, which was significantly higher than multivariate logistic regression (0
Multicomponent ensemble models to forecast induced seismicity
Király-Proag, E.; Gischig, V.; Zechar, J. D.; Wiemer, S.
2018-01-01
In recent years, human-induced seismicity has become a more and more relevant topic due to its economic and social implications. Several models and approaches have been developed to explain underlying physical processes or forecast induced seismicity. They range from simple statistical models to coupled numerical models incorporating complex physics. We advocate the need for forecast testing as currently the best method for ascertaining if models are capable to reasonably accounting for key physical governing processes—or not. Moreover, operational forecast models are of great interest to help on-site decision-making in projects entailing induced earthquakes. We previously introduced a standardized framework following the guidelines of the Collaboratory for the Study of Earthquake Predictability, the Induced Seismicity Test Bench, to test, validate, and rank induced seismicity models. In this study, we describe how to construct multicomponent ensemble models based on Bayesian weightings that deliver more accurate forecasts than individual models in the case of Basel 2006 and Soultz-sous-Forêts 2004 enhanced geothermal stimulation projects. For this, we examine five calibrated variants of two significantly different model groups: (1) Shapiro and Smoothed Seismicity based on the seismogenic index, simple modified Omori-law-type seismicity decay, and temporally weighted smoothed seismicity; (2) Hydraulics and Seismicity based on numerically modelled pore pressure evolution that triggers seismicity using the Mohr-Coulomb failure criterion. We also demonstrate how the individual and ensemble models would perform as part of an operational Adaptive Traffic Light System. Investigating seismicity forecasts based on a range of potential injection scenarios, we use forecast periods of different durations to compute the occurrence probabilities of seismic events M ≥ 3. We show that in the case of the Basel 2006 geothermal stimulation the models forecast hazardous levels
Ait-El-Fquih, Boujemaa
2016-08-12
Ensemble Kalman filtering (EnKF) is an efficient approach to addressing uncertainties in subsurface ground-water models. The EnKF sequentially integrates field data into simulation models to obtain a better characterization of the model\\'s state and parameters. These are generally estimated following joint and dual filtering strategies, in which, at each assimilation cycle, a forecast step by the model is followed by an update step with incoming observations. The joint EnKF directly updates the augmented state-parameter vector, whereas the dual EnKF empirically employs two separate filters, first estimating the parameters and then estimating the state based on the updated parameters. To develop a Bayesian consistent dual approach and improve the state-parameter estimates and their consistency, we propose in this paper a one-step-ahead (OSA) smoothing formulation of the state-parameter Bayesian filtering problem from which we derive a new dual-type EnKF, the dual EnKF(OSA). Compared with the standard dual EnKF, it imposes a new update step to the state, which is shown to enhance the performance of the dual approach with almost no increase in the computational cost. Numerical experiments are conducted with a two-dimensional (2-D) synthetic groundwater aquifer model to investigate the performance and robustness of the proposed dual EnKFOSA, and to evaluate its results against those of the joint and dual EnKFs. The proposed scheme is able to successfully recover both the hydraulic head and the aquifer conductivity, providing further reliable estimates of their uncertainties. Furthermore, it is found to be more robust to different assimilation settings, such as the spatial and temporal distribution of the observations, and the level of noise in the data. Based on our experimental setups, it yields up to 25% more accurate state and parameter estimations than the joint and dual approaches.
Ait-El-Fquih, Boujemaa; El Gharamti, Mohamad; Hoteit, Ibrahim
2016-01-01
Ensemble Kalman filtering (EnKF) is an efficient approach to addressing uncertainties in subsurface ground-water models. The EnKF sequentially integrates field data into simulation models to obtain a better characterization of the model's state and parameters. These are generally estimated following joint and dual filtering strategies, in which, at each assimilation cycle, a forecast step by the model is followed by an update step with incoming observations. The joint EnKF directly updates the augmented state-parameter vector, whereas the dual EnKF empirically employs two separate filters, first estimating the parameters and then estimating the state based on the updated parameters. To develop a Bayesian consistent dual approach and improve the state-parameter estimates and their consistency, we propose in this paper a one-step-ahead (OSA) smoothing formulation of the state-parameter Bayesian filtering problem from which we derive a new dual-type EnKF, the dual EnKF(OSA). Compared with the standard dual EnKF, it imposes a new update step to the state, which is shown to enhance the performance of the dual approach with almost no increase in the computational cost. Numerical experiments are conducted with a two-dimensional (2-D) synthetic groundwater aquifer model to investigate the performance and robustness of the proposed dual EnKFOSA, and to evaluate its results against those of the joint and dual EnKFs. The proposed scheme is able to successfully recover both the hydraulic head and the aquifer conductivity, providing further reliable estimates of their uncertainties. Furthermore, it is found to be more robust to different assimilation settings, such as the spatial and temporal distribution of the observations, and the level of noise in the data. Based on our experimental setups, it yields up to 25% more accurate state and parameter estimations than the joint and dual approaches.
Bayesian analysis of CCDM models
Jesus, J. F.; Valentim, R.; Andrade-Oliveira, F.
2017-09-01
Creation of Cold Dark Matter (CCDM), in the context of Einstein Field Equations, produces a negative pressure term which can be used to explain the accelerated expansion of the Universe. In this work we tested six different spatially flat models for matter creation using statistical criteria, in light of SNe Ia data: Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC) and Bayesian Evidence (BE). These criteria allow to compare models considering goodness of fit and number of free parameters, penalizing excess of complexity. We find that JO model is slightly favoured over LJO/ΛCDM model, however, neither of these, nor Γ = 3αH0 model can be discarded from the current analysis. Three other scenarios are discarded either because poor fitting or because of the excess of free parameters. A method of increasing Bayesian evidence through reparameterization in order to reducing parameter degeneracy is also developed.
Bayesian analysis of CCDM models
Energy Technology Data Exchange (ETDEWEB)
Jesus, J.F. [Universidade Estadual Paulista (Unesp), Câmpus Experimental de Itapeva, Rua Geraldo Alckmin 519, Vila N. Sra. de Fátima, Itapeva, SP, 18409-010 Brazil (Brazil); Valentim, R. [Departamento de Física, Instituto de Ciências Ambientais, Químicas e Farmacêuticas—ICAQF, Universidade Federal de São Paulo (UNIFESP), Unidade José Alencar, Rua São Nicolau No. 210, Diadema, SP, 09913-030 Brazil (Brazil); Andrade-Oliveira, F., E-mail: jfjesus@itapeva.unesp.br, E-mail: valentim.rodolfo@unifesp.br, E-mail: felipe.oliveira@port.ac.uk [Institute of Cosmology and Gravitation—University of Portsmouth, Burnaby Road, Portsmouth, PO1 3FX United Kingdom (United Kingdom)
2017-09-01
Creation of Cold Dark Matter (CCDM), in the context of Einstein Field Equations, produces a negative pressure term which can be used to explain the accelerated expansion of the Universe. In this work we tested six different spatially flat models for matter creation using statistical criteria, in light of SNe Ia data: Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC) and Bayesian Evidence (BE). These criteria allow to compare models considering goodness of fit and number of free parameters, penalizing excess of complexity. We find that JO model is slightly favoured over LJO/ΛCDM model, however, neither of these, nor Γ = 3α H {sub 0} model can be discarded from the current analysis. Three other scenarios are discarded either because poor fitting or because of the excess of free parameters. A method of increasing Bayesian evidence through reparameterization in order to reducing parameter degeneracy is also developed.
Directory of Open Access Journals (Sweden)
Eils Roland
2006-06-01
Full Text Available Abstract Background The subcellular location of a protein is closely related to its function. It would be worthwhile to develop a method to predict the subcellular location for a given protein when only the amino acid sequence of the protein is known. Although many efforts have been made to predict subcellular location from sequence information only, there is the need for further research to improve the accuracy of prediction. Results A novel method called HensBC is introduced to predict protein subcellular location. HensBC is a recursive algorithm which constructs a hierarchical ensemble of classifiers. The classifiers used are Bayesian classifiers based on Markov chain models. We tested our method on six various datasets; among them are Gram-negative bacteria dataset, data for discriminating outer membrane proteins and apoptosis proteins dataset. We observed that our method can predict the subcellular location with high accuracy. Another advantage of the proposed method is that it can improve the accuracy of the prediction of some classes with few sequences in training and is therefore useful for datasets with imbalanced distribution of classes. Conclusion This study introduces an algorithm which uses only the primary sequence of a protein to predict its subcellular location. The proposed recursive scheme represents an interesting methodology for learning and combining classifiers. The method is computationally efficient and competitive with the previously reported approaches in terms of prediction accuracies as empirical results indicate. The code for the software is available upon request.
Bayesian modeling using WinBUGS
Ntzoufras, Ioannis
2009-01-01
A hands-on introduction to the principles of Bayesian modeling using WinBUGS Bayesian Modeling Using WinBUGS provides an easily accessible introduction to the use of WinBUGS programming techniques in a variety of Bayesian modeling settings. The author provides an accessible treatment of the topic, offering readers a smooth introduction to the principles of Bayesian modeling with detailed guidance on the practical implementation of key principles. The book begins with a basic introduction to Bayesian inference and the WinBUGS software and goes on to cover key topics, including: Markov Chain Monte Carlo algorithms in Bayesian inference Generalized linear models Bayesian hierarchical models Predictive distribution and model checking Bayesian model and variable evaluation Computational notes and screen captures illustrate the use of both WinBUGS as well as R software to apply the discussed techniques. Exercises at the end of each chapter allow readers to test their understanding of the presented concepts and all ...
International Nuclear Information System (INIS)
Elsheikh, Ahmed H.; Wheeler, Mary F.; Hoteit, Ibrahim
2014-01-01
A Hybrid Nested Sampling (HNS) algorithm is proposed for efficient Bayesian model calibration and prior model selection. The proposed algorithm combines, Nested Sampling (NS) algorithm, Hybrid Monte Carlo (HMC) sampling and gradient estimation using Stochastic Ensemble Method (SEM). NS is an efficient sampling algorithm that can be used for Bayesian calibration and estimating the Bayesian evidence for prior model selection. Nested sampling has the advantage of computational feasibility. Within the nested sampling algorithm, a constrained sampling step is performed. For this step, we utilize HMC to reduce the correlation between successive sampled states. HMC relies on the gradient of the logarithm of the posterior distribution, which we estimate using a stochastic ensemble method based on an ensemble of directional derivatives. SEM only requires forward model runs and the simulator is then used as a black box and no adjoint code is needed. The developed HNS algorithm is successfully applied for Bayesian calibration and prior model selection of several nonlinear subsurface flow problems
Energy Technology Data Exchange (ETDEWEB)
Elsheikh, Ahmed H., E-mail: aelsheikh@ices.utexas.edu [Institute for Computational Engineering and Sciences (ICES), University of Texas at Austin, TX (United States); Institute of Petroleum Engineering, Heriot-Watt University, Edinburgh EH14 4AS (United Kingdom); Wheeler, Mary F. [Institute for Computational Engineering and Sciences (ICES), University of Texas at Austin, TX (United States); Hoteit, Ibrahim [Department of Earth Sciences and Engineering, King Abdullah University of Science and Technology (KAUST), Thuwal (Saudi Arabia)
2014-02-01
A Hybrid Nested Sampling (HNS) algorithm is proposed for efficient Bayesian model calibration and prior model selection. The proposed algorithm combines, Nested Sampling (NS) algorithm, Hybrid Monte Carlo (HMC) sampling and gradient estimation using Stochastic Ensemble Method (SEM). NS is an efficient sampling algorithm that can be used for Bayesian calibration and estimating the Bayesian evidence for prior model selection. Nested sampling has the advantage of computational feasibility. Within the nested sampling algorithm, a constrained sampling step is performed. For this step, we utilize HMC to reduce the correlation between successive sampled states. HMC relies on the gradient of the logarithm of the posterior distribution, which we estimate using a stochastic ensemble method based on an ensemble of directional derivatives. SEM only requires forward model runs and the simulator is then used as a black box and no adjoint code is needed. The developed HNS algorithm is successfully applied for Bayesian calibration and prior model selection of several nonlinear subsurface flow problems.
Rings, J.; Vrugt, J.A.; Schoups, G.; Huisman, J.A.; Vereecken, H.
2012-01-01
Bayesian model averaging (BMA) is a standard method for combining predictive distributions from different models. In recent years, this method has enjoyed widespread application and use in many fields of study to improve the spread-skill relationship of forecast ensembles. The BMA predictive
Borsboom, D.; Haig, B.D.
2013-01-01
Unlike most other statistical frameworks, Bayesian statistical inference is wedded to a particular approach in the philosophy of science (see Howson & Urbach, 2006); this approach is called Bayesianism. Rather than being concerned with model fitting, this position in the philosophy of science
Bayesian models: A statistical primer for ecologists
Hobbs, N. Thompson; Hooten, Mevin B.
2015-01-01
Bayesian modeling has become an indispensable tool for ecological research because it is uniquely suited to deal with complexity in a statistically coherent way. This textbook provides a comprehensive and accessible introduction to the latest Bayesian methods—in language ecologists can understand. Unlike other books on the subject, this one emphasizes the principles behind the computations, giving ecologists a big-picture understanding of how to implement this powerful statistical approach.Bayesian Models is an essential primer for non-statisticians. It begins with a definition of probability and develops a step-by-step sequence of connected ideas, including basic distribution theory, network diagrams, hierarchical models, Markov chain Monte Carlo, and inference from single and multiple models. This unique book places less emphasis on computer coding, favoring instead a concise presentation of the mathematical statistics needed to understand how and why Bayesian analysis works. It also explains how to write out properly formulated hierarchical Bayesian models and use them in computing, research papers, and proposals.This primer enables ecologists to understand the statistical principles behind Bayesian modeling and apply them to research, teaching, policy, and management.Presents the mathematical and statistical foundations of Bayesian modeling in language accessible to non-statisticiansCovers basic distribution theory, network diagrams, hierarchical models, Markov chain Monte Carlo, and moreDeemphasizes computer coding in favor of basic principlesExplains how to write out properly factored statistical expressions representing Bayesian models
A Bayesian model for binary Markov chains
Directory of Open Access Journals (Sweden)
Belkheir Essebbar
2004-02-01
Full Text Available This note is concerned with Bayesian estimation of the transition probabilities of a binary Markov chain observed from heterogeneous individuals. The model is founded on the Jeffreys' prior which allows for transition probabilities to be correlated. The Bayesian estimator is approximated by means of Monte Carlo Markov chain (MCMC techniques. The performance of the Bayesian estimates is illustrated by analyzing a small simulated data set.
A Bayesian approach to model uncertainty
International Nuclear Information System (INIS)
Buslik, A.
1994-01-01
A Bayesian approach to model uncertainty is taken. For the case of a finite number of alternative models, the model uncertainty is equivalent to parameter uncertainty. A derivation based on Savage's partition problem is given
An ensemble model of QSAR tools for regulatory risk assessment.
Pradeep, Prachi; Povinelli, Richard J; White, Shannon; Merrill, Stephen J
2016-01-01
Quantitative structure activity relationships (QSARs) are theoretical models that relate a quantitative measure of chemical structure to a physical property or a biological effect. QSAR predictions can be used for chemical risk assessment for protection of human and environmental health, which makes them interesting to regulators, especially in the absence of experimental data. For compatibility with regulatory use, QSAR models should be transparent, reproducible and optimized to minimize the number of false negatives. In silico QSAR tools are gaining wide acceptance as a faster alternative to otherwise time-consuming clinical and animal testing methods. However, different QSAR tools often make conflicting predictions for a given chemical and may also vary in their predictive performance across different chemical datasets. In a regulatory context, conflicting predictions raise interpretation, validation and adequacy concerns. To address these concerns, ensemble learning techniques in the machine learning paradigm can be used to integrate predictions from multiple tools. By leveraging various underlying QSAR algorithms and training datasets, the resulting consensus prediction should yield better overall predictive ability. We present a novel ensemble QSAR model using Bayesian classification. The model allows for varying a cut-off parameter that allows for a selection in the desirable trade-off between model sensitivity and specificity. The predictive performance of the ensemble model is compared with four in silico tools (Toxtree, Lazar, OECD Toolbox, and Danish QSAR) to predict carcinogenicity for a dataset of air toxins (332 chemicals) and a subset of the gold carcinogenic potency database (480 chemicals). Leave-one-out cross validation results show that the ensemble model achieves the best trade-off between sensitivity and specificity (accuracy: 83.8 % and 80.4 %, and balanced accuracy: 80.6 % and 80.8 %) and highest inter-rater agreement [kappa ( κ ): 0
Nonparametric Bayesian Modeling of Complex Networks
DEFF Research Database (Denmark)
Schmidt, Mikkel Nørgaard; Mørup, Morten
2013-01-01
an infinite mixture model as running example, we go through the steps of deriving the model as an infinite limit of a finite parametric model, inferring the model parameters by Markov chain Monte Carlo, and checking the model?s fit and predictive performance. We explain how advanced nonparametric models......Modeling structure in complex networks using Bayesian nonparametrics makes it possible to specify flexible model structures and infer the adequate model complexity from the observed data. This article provides a gentle introduction to nonparametric Bayesian modeling of complex networks: Using...
Bayesian models a statistical primer for ecologists
Hobbs, N Thompson
2015-01-01
Bayesian modeling has become an indispensable tool for ecological research because it is uniquely suited to deal with complexity in a statistically coherent way. This textbook provides a comprehensive and accessible introduction to the latest Bayesian methods-in language ecologists can understand. Unlike other books on the subject, this one emphasizes the principles behind the computations, giving ecologists a big-picture understanding of how to implement this powerful statistical approach. Bayesian Models is an essential primer for non-statisticians. It begins with a definition of probabili
Bayesian Modeling of a Human MMORPG Player
Synnaeve, Gabriel; Bessière, Pierre
2011-03-01
This paper describes an application of Bayesian programming to the control of an autonomous avatar in a multiplayer role-playing game (the example is based on World of Warcraft). We model a particular task, which consists of choosing what to do and to select which target in a situation where allies and foes are present. We explain the model in Bayesian programming and show how we could learn the conditional probabilities from data gathered during human-played sessions.
Modeling polydispersive ensembles of diamond nanoparticles
International Nuclear Information System (INIS)
Barnard, Amanda S
2013-01-01
While significant progress has been made toward production of monodispersed samples of a variety of nanoparticles, in cases such as diamond nanoparticles (nanodiamonds) a significant degree of polydispersivity persists, so scaling-up of laboratory applications to industrial levels has its challenges. In many cases, however, monodispersivity is not essential for reliable application, provided that the inevitable uncertainties are just as predictable as the functional properties. As computational methods of materials design are becoming more widespread, there is a growing need for robust methods for modeling ensembles of nanoparticles, that capture the structural complexity characteristic of real specimens. In this paper we present a simple statistical approach to modeling of ensembles of nanoparticles, and apply it to nanodiamond, based on sets of individual simulations that have been carefully selected to describe specific structural sources that are responsible for scattering of fundamental properties, and that are typically difficult to eliminate experimentally. For the purposes of demonstration we show how scattering in the Fermi energy and the electronic band gap are related to different structural variations (sources), and how these results can be combined strategically to yield statistically significant predictions of the properties of an entire ensemble of nanodiamonds, rather than merely one individual ‘model’ particle or a non-representative sub-set. (paper)
Fully probabilistic design of hierarchical Bayesian models
Czech Academy of Sciences Publication Activity Database
Quinn, A.; Kárný, Miroslav; Guy, Tatiana Valentine
2016-01-01
Roč. 369, č. 1 (2016), s. 532-547 ISSN 0020-0255 R&D Projects: GA ČR GA13-13502S Institutional support: RVO:67985556 Keywords : Fully probabilistic design * Ideal distribution * Minimum cross-entropy principle * Bayesian conditioning * Kullback-Leibler divergence * Bayesian nonparametric modelling Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 4.832, year: 2016 http://library.utia.cas.cz/separaty/2016/AS/karny-0463052.pdf
Elsheikh, Ahmed H.
2014-02-01
A Hybrid Nested Sampling (HNS) algorithm is proposed for efficient Bayesian model calibration and prior model selection. The proposed algorithm combines, Nested Sampling (NS) algorithm, Hybrid Monte Carlo (HMC) sampling and gradient estimation using Stochastic Ensemble Method (SEM). NS is an efficient sampling algorithm that can be used for Bayesian calibration and estimating the Bayesian evidence for prior model selection. Nested sampling has the advantage of computational feasibility. Within the nested sampling algorithm, a constrained sampling step is performed. For this step, we utilize HMC to reduce the correlation between successive sampled states. HMC relies on the gradient of the logarithm of the posterior distribution, which we estimate using a stochastic ensemble method based on an ensemble of directional derivatives. SEM only requires forward model runs and the simulator is then used as a black box and no adjoint code is needed. The developed HNS algorithm is successfully applied for Bayesian calibration and prior model selection of several nonlinear subsurface flow problems. © 2013 Elsevier Inc.
Robust bayesian analysis of an autoregressive model with ...
African Journals Online (AJOL)
In this work, robust Bayesian analysis of the Bayesian estimation of an autoregressive model with exponential innovations is performed. Using a Bayesian robustness methodology, we show that, using a suitable generalized quadratic loss, we obtain optimal Bayesian estimators of the parameters corresponding to the ...
Empirical Bayesian inference and model uncertainty
International Nuclear Information System (INIS)
Poern, K.
1994-01-01
This paper presents a hierarchical or multistage empirical Bayesian approach for the estimation of uncertainty concerning the intensity of a homogeneous Poisson process. A class of contaminated gamma distributions is considered to describe the uncertainty concerning the intensity. These distributions in turn are defined through a set of secondary parameters, the knowledge of which is also described and updated via Bayes formula. This two-stage Bayesian approach is an example where the modeling uncertainty is treated in a comprehensive way. Each contaminated gamma distributions, represented by a point in the 3D space of secondary parameters, can be considered as a specific model of the uncertainty about the Poisson intensity. Then, by the empirical Bayesian method each individual model is assigned a posterior probability
Data assimilation in integrated hydrological modeling using ensemble Kalman filtering
DEFF Research Database (Denmark)
Rasmussen, Jørn; Madsen, H.; Jensen, Karsten Høgh
2015-01-01
Groundwater head and stream discharge is assimilated using the ensemble transform Kalman filter in an integrated hydrological model with the aim of studying the relationship between the filter performance and the ensemble size. In an attempt to reduce the required number of ensemble members...... and estimating parameters requires a much larger ensemble size than just assimilating groundwater head observations. However, the required ensemble size can be greatly reduced with the use of adaptive localization, which by far outperforms distance-based localization. The study is conducted using synthetic data...
Global Optimization Ensemble Model for Classification Methods
Anwar, Hina; Qamar, Usman; Muzaffar Qureshi, Abdul Wahab
2014-01-01
Supervised learning is the process of data mining for deducing rules from training datasets. A broad array of supervised learning algorithms exists, every one of them with its own advantages and drawbacks. There are some basic issues that affect the accuracy of classifier while solving a supervised learning problem, like bias-variance tradeoff, dimensionality of input space, and noise in the input data space. All these problems affect the accuracy of classifier and are the reason that there is no global optimal method for classification. There is not any generalized improvement method that can increase the accuracy of any classifier while addressing all the problems stated above. This paper proposes a global optimization ensemble model for classification methods (GMC) that can improve the overall accuracy for supervised learning problems. The experimental results on various public datasets showed that the proposed model improved the accuracy of the classification models from 1% to 30% depending upon the algorithm complexity. PMID:24883382
Global Optimization Ensemble Model for Classification Methods
Directory of Open Access Journals (Sweden)
Hina Anwar
2014-01-01
Full Text Available Supervised learning is the process of data mining for deducing rules from training datasets. A broad array of supervised learning algorithms exists, every one of them with its own advantages and drawbacks. There are some basic issues that affect the accuracy of classifier while solving a supervised learning problem, like bias-variance tradeoff, dimensionality of input space, and noise in the input data space. All these problems affect the accuracy of classifier and are the reason that there is no global optimal method for classification. There is not any generalized improvement method that can increase the accuracy of any classifier while addressing all the problems stated above. This paper proposes a global optimization ensemble model for classification methods (GMC that can improve the overall accuracy for supervised learning problems. The experimental results on various public datasets showed that the proposed model improved the accuracy of the classification models from 1% to 30% depending upon the algorithm complexity.
Design ensemble machine learning model for breast cancer diagnosis.
Hsieh, Sheau-Ling; Hsieh, Sung-Huai; Cheng, Po-Hsun; Chen, Chi-Huang; Hsu, Kai-Ping; Lee, I-Shun; Wang, Zhenyu; Lai, Feipei
2012-10-01
In this paper, we classify the breast cancer of medical diagnostic data. Information gain has been adapted for feature selections. Neural fuzzy (NF), k-nearest neighbor (KNN), quadratic classifier (QC), each single model scheme as well as their associated, ensemble ones have been developed for classifications. In addition, a combined ensemble model with these three schemes has been constructed for further validations. The experimental results indicate that the ensemble learning performs better than individual single ones. Moreover, the combined ensemble model illustrates the highest accuracy of classifications for the breast cancer among all models.
Bayesian uncertainty analyses of probabilistic risk models
International Nuclear Information System (INIS)
Pulkkinen, U.
1989-01-01
Applications of Bayesian principles to the uncertainty analyses are discussed in the paper. A short review of the most important uncertainties and their causes is provided. An application of the principle of maximum entropy to the determination of Bayesian prior distributions is described. An approach based on so called probabilistic structures is presented in order to develop a method of quantitative evaluation of modelling uncertainties. The method is applied to a small example case. Ideas for application areas for the proposed method are discussed
Hierarchical Bayesian Models of Subtask Learning
Anglim, Jeromy; Wynton, Sarah K. A.
2015-01-01
The current study used Bayesian hierarchical methods to challenge and extend previous work on subtask learning consistency. A general model of individual-level subtask learning was proposed focusing on power and exponential functions with constraints to test for inconsistency. To study subtask learning, we developed a novel computer-based booking…
Bayesian Modelling of Functional Whole Brain Connectivity
DEFF Research Database (Denmark)
Røge, Rasmus
the prevalent strategy of standardizing of fMRI time series and model data using directional statistics or we model the variability in the signal across the brain and across multiple subjects. In either case, we use Bayesian nonparametric modeling to automatically learn from the fMRI data the number......This thesis deals with parcellation of whole-brain functional magnetic resonance imaging (fMRI) using Bayesian inference with mixture models tailored to the fMRI data. In the three included papers and manuscripts, we analyze two different approaches to modeling fMRI signal; either we accept...... of funcional units, i.e. parcels. We benchmark the proposed mixture models against state of the art methods of brain parcellation, both probabilistic and non-probabilistic. The time series of each voxel are most often standardized using z-scoring which projects the time series data onto a hypersphere...
Wei, Jiangfeng; Dirmeyer, Paul A.; Yang, Zong-Liang; Chen, Haishan
2017-10-01
Through a series of model simulations with an atmospheric general circulation model coupled to three different land surface models, this study investigates the impacts of land model ensembles and coupled model ensemble on precipitation simulation. It is found that coupling an ensemble of land models to an atmospheric model has a very minor impact on the improvement of precipitation climatology and variability, but a simple ensemble average of the precipitation from three individually coupled land-atmosphere models produces better results, especially for precipitation variability. The generally weak impact of land processes on precipitation should be the main reason that the land model ensembles do not improve precipitation simulation. However, if there are big biases in the land surface model or land surface data set, correcting them could improve the simulated climate, especially for well-constrained regional climate simulations.
Modelling dependable systems using hybrid Bayesian networks
International Nuclear Information System (INIS)
Neil, Martin; Tailor, Manesh; Marquez, David; Fenton, Norman; Hearty, Peter
2008-01-01
A hybrid Bayesian network (BN) is one that incorporates both discrete and continuous nodes. In our extensive applications of BNs for system dependability assessment, the models are invariably hybrid and the need for efficient and accurate computation is paramount. We apply a new iterative algorithm that efficiently combines dynamic discretisation with robust propagation algorithms on junction tree structures to perform inference in hybrid BNs. We illustrate its use in the field of dependability with two example of reliability estimation. Firstly we estimate the reliability of a simple single system and next we implement a hierarchical Bayesian model. In the hierarchical model we compute the reliability of two unknown subsystems from data collected on historically similar subsystems and then input the result into a reliability block model to compute system level reliability. We conclude that dynamic discretisation can be used as an alternative to analytical or Monte Carlo methods with high precision and can be applied to a wide range of dependability problems
Bayesian disease mapping: hierarchical modeling in spatial epidemiology
National Research Council Canada - National Science Library
Lawson, Andrew
2013-01-01
.... Exploring these new developments, Bayesian Disease Mapping: Hierarchical Modeling in Spatial Epidemiology, Second Edition provides an up-to-date, cohesive account of the full range of Bayesian disease mapping methods and applications...
Constrained bayesian inference of project performance models
Sunmola, Funlade
2013-01-01
Project performance models play an important role in the management of project success. When used for monitoring projects, they can offer predictive ability such as indications of possible delivery problems. Approaches for monitoring project performance relies on available project information including restrictions imposed on the project, particularly the constraints of cost, quality, scope and time. We study in this paper a Bayesian inference methodology for project performance modelling in ...
Bayesian methodology for reliability model acceptance
International Nuclear Information System (INIS)
Zhang Ruoxue; Mahadevan, Sankaran
2003-01-01
This paper develops a methodology to assess the reliability computation model validity using the concept of Bayesian hypothesis testing, by comparing the model prediction and experimental observation, when there is only one computational model available to evaluate system behavior. Time-independent and time-dependent problems are investigated, with consideration of both cases: with and without statistical uncertainty in the model. The case of time-independent failure probability prediction with no statistical uncertainty is a straightforward application of Bayesian hypothesis testing. However, for the life prediction (time-dependent reliability) problem, a new methodology is developed in this paper to make the same Bayesian hypothesis testing concept applicable. With the existence of statistical uncertainty in the model, in addition to the application of a predictor estimator of the Bayes factor, the uncertainty in the Bayes factor is explicitly quantified through treating it as a random variable and calculating the probability that it exceeds a specified value. The developed method provides a rational criterion to decision-makers for the acceptance or rejection of the computational model
Bayesian network modelling of upper gastrointestinal bleeding
Aisha, Nazziwa; Shohaimi, Shamarina; Adam, Mohd Bakri
2013-09-01
Bayesian networks are graphical probabilistic models that represent causal and other relationships between domain variables. In the context of medical decision making, these models have been explored to help in medical diagnosis and prognosis. In this paper, we discuss the Bayesian network formalism in building medical support systems and we learn a tree augmented naive Bayes Network (TAN) from gastrointestinal bleeding data. The accuracy of the TAN in classifying the source of gastrointestinal bleeding into upper or lower source is obtained. The TAN achieves a high classification accuracy of 86% and an area under curve of 92%. A sensitivity analysis of the model shows relatively high levels of entropy reduction for color of the stool, history of gastrointestinal bleeding, consistency and the ratio of blood urea nitrogen to creatinine. The TAN facilitates the identification of the source of GIB and requires further validation.
Dynamic Metabolic Model Building Based on the Ensemble Modeling Approach
Energy Technology Data Exchange (ETDEWEB)
Liao, James C. [Univ. of California, Los Angeles, CA (United States)
2016-10-01
Ensemble modeling of kinetic systems addresses the challenges of kinetic model construction, with respect to parameter value selection, and still allows for the rich insights possible from kinetic models. This project aimed to show that constructing, implementing, and analyzing such models is a useful tool for the metabolic engineering toolkit, and that they can result in actionable insights from models. Key concepts are developed and deliverable publications and results are presented.
An educational model for ensemble streamflow simulation and uncertainty analysis
Directory of Open Access Journals (Sweden)
A. AghaKouchak
2013-02-01
Full Text Available This paper presents the hands-on modeling toolbox, HBV-Ensemble, designed as a complement to theoretical hydrology lectures, to teach hydrological processes and their uncertainties. The HBV-Ensemble can be used for in-class lab practices and homework assignments, and assessment of students' understanding of hydrological processes. Using this modeling toolbox, students can gain more insights into how hydrological processes (e.g., precipitation, snowmelt and snow accumulation, soil moisture, evapotranspiration and runoff generation are interconnected. The educational toolbox includes a MATLAB Graphical User Interface (GUI and an ensemble simulation scheme that can be used for teaching uncertainty analysis, parameter estimation, ensemble simulation and model sensitivity. HBV-Ensemble was administered in a class for both in-class instruction and a final project, and students submitted their feedback about the toolbox. The results indicate that this educational software had a positive impact on students understanding and knowledge of uncertainty in hydrological modeling.
Ensemble inequivalence: Landau theory and the ABC model
International Nuclear Information System (INIS)
Cohen, O; Mukamel, D
2012-01-01
It is well known that systems with long-range interactions may exhibit different phase diagrams when studied within two different ensembles. In many of the previously studied examples of ensemble inequivalence, the phase diagrams differ only when the transition in one of the ensembles is first order. By contrast, in a recent study of a generalized ABC model, the canonical and grand-canonical ensembles of the model were shown to differ even when they both exhibit a continuous transition. Here we show that the order of the transition where ensemble inequivalence may occur is related to the symmetry properties of the order parameter associated with the transition. This is done by analyzing the Landau expansion of a generic model with long-range interactions. The conclusions drawn from the generic analysis are demonstrated for the ABC model by explicit calculation of its Landau expansion. (paper)
Network structure exploration via Bayesian nonparametric models
International Nuclear Information System (INIS)
Chen, Y; Wang, X L; Xiang, X; Tang, B Z; Bu, J Z
2015-01-01
Complex networks provide a powerful mathematical representation of complex systems in nature and society. To understand complex networks, it is crucial to explore their internal structures, also called structural regularities. The task of network structure exploration is to determine how many groups there are in a complex network and how to group the nodes of the network. Most existing structure exploration methods need to specify either a group number or a certain type of structure when they are applied to a network. In the real world, however, the group number and also the certain type of structure that a network has are usually unknown in advance. To explore structural regularities in complex networks automatically, without any prior knowledge of the group number or the certain type of structure, we extend a probabilistic mixture model that can handle networks with any type of structure but needs to specify a group number using Bayesian nonparametric theory. We also propose a novel Bayesian nonparametric model, called the Bayesian nonparametric mixture (BNPM) model. Experiments conducted on a large number of networks with different structures show that the BNPM model is able to explore structural regularities in networks automatically with a stable, state-of-the-art performance. (paper)
Bayesian Recurrent Neural Network for Language Modeling.
Chien, Jen-Tzung; Ku, Yuan-Chu
2016-02-01
A language model (LM) is calculated as the probability of a word sequence that provides the solution to word prediction for a variety of information systems. A recurrent neural network (RNN) is powerful to learn the large-span dynamics of a word sequence in the continuous space. However, the training of the RNN-LM is an ill-posed problem because of too many parameters from a large dictionary size and a high-dimensional hidden layer. This paper presents a Bayesian approach to regularize the RNN-LM and apply it for continuous speech recognition. We aim to penalize the too complicated RNN-LM by compensating for the uncertainty of the estimated model parameters, which is represented by a Gaussian prior. The objective function in a Bayesian classification network is formed as the regularized cross-entropy error function. The regularized model is constructed not only by calculating the regularized parameters according to the maximum a posteriori criterion but also by estimating the Gaussian hyperparameter by maximizing the marginal likelihood. A rapid approximation to a Hessian matrix is developed to implement the Bayesian RNN-LM (BRNN-LM) by selecting a small set of salient outer-products. The proposed BRNN-LM achieves a sparser model than the RNN-LM. Experiments on different corpora show the robustness of system performance by applying the rapid BRNN-LM under different conditions.
Centralized Bayesian reliability modelling with sensor networks
Czech Academy of Sciences Publication Activity Database
Dedecius, Kamil; Sečkárová, Vladimíra
2013-01-01
Roč. 19, č. 5 (2013), s. 471-482 ISSN 1387-3954 R&D Projects: GA MŠk 7D12004 Grant - others:GA MŠk(CZ) SVV-265315 Keywords : Bayesian modelling * Sensor network * Reliability Subject RIV: BD - Theory of Information Impact factor: 0.984, year: 2013 http://library.utia.cas.cz/separaty/2013/AS/dedecius-0392551.pdf
Bayesian Inference of a Multivariate Regression Model
Directory of Open Access Journals (Sweden)
Marick S. Sinay
2014-01-01
Full Text Available We explore Bayesian inference of a multivariate linear regression model with use of a flexible prior for the covariance structure. The commonly adopted Bayesian setup involves the conjugate prior, multivariate normal distribution for the regression coefficients and inverse Wishart specification for the covariance matrix. Here we depart from this approach and propose a novel Bayesian estimator for the covariance. A multivariate normal prior for the unique elements of the matrix logarithm of the covariance matrix is considered. Such structure allows for a richer class of prior distributions for the covariance, with respect to strength of beliefs in prior location hyperparameters, as well as the added ability, to model potential correlation amongst the covariance structure. The posterior moments of all relevant parameters of interest are calculated based upon numerical results via a Markov chain Monte Carlo procedure. The Metropolis-Hastings-within-Gibbs algorithm is invoked to account for the construction of a proposal density that closely matches the shape of the target posterior distribution. As an application of the proposed technique, we investigate a multiple regression based upon the 1980 High School and Beyond Survey.
Modeling task-specific neuronal ensembles improves decoding of grasp
Smith, Ryan J.; Soares, Alcimar B.; Rouse, Adam G.; Schieber, Marc H.; Thakor, Nitish V.
2018-06-01
Objective. Dexterous movement involves the activation and coordination of networks of neuronal populations across multiple cortical regions. Attempts to model firing of individual neurons commonly treat the firing rate as directly modulating with motor behavior. However, motor behavior may additionally be associated with modulations in the activity and functional connectivity of neurons in a broader ensemble. Accounting for variations in neural ensemble connectivity may provide additional information about the behavior being performed. Approach. In this study, we examined neural ensemble activity in primary motor cortex (M1) and premotor cortex (PM) of two male rhesus monkeys during performance of a center-out reach, grasp and manipulate task. We constructed point process encoding models of neuronal firing that incorporated task-specific variations in the baseline firing rate as well as variations in functional connectivity with the neural ensemble. Models were evaluated both in terms of their encoding capabilities and their ability to properly classify the grasp being performed. Main results. Task-specific ensemble models correctly predicted the performed grasp with over 95% accuracy and were shown to outperform models of neuronal activity that assume only a variable baseline firing rate. Task-specific ensemble models exhibited superior decoding performance in 82% of units in both monkeys (p < 0.01). Inclusion of ensemble activity also broadly improved the ability of models to describe observed spiking. Encoding performance of task-specific ensemble models, measured by spike timing predictability, improved upon baseline models in 62% of units. Significance. These results suggest that additional discriminative information about motor behavior found in the variations in functional connectivity of neuronal ensembles located in motor-related cortical regions is relevant to decode complex tasks such as grasping objects, and may serve the basis for more
Bayesian Predictive Models for Rayleigh Wind Speed
DEFF Research Database (Denmark)
Shahirinia, Amir; Hajizadeh, Amin; Yu, David C
2017-01-01
predictive model of the wind speed aggregates the non-homogeneous distributions into a single continuous distribution. Therefore, the result is able to capture the variation among the probability distributions of the wind speeds at the turbines’ locations in a wind farm. More specifically, instead of using...... a wind speed distribution whose parameters are known or estimated, the parameters are considered as random whose variations are according to probability distributions. The Bayesian predictive model for a Rayleigh which only has a single model scale parameter has been proposed. Also closed-form posterior...... and predictive inferences under different reasonable choices of prior distribution in sensitivity analysis have been presented....
Bayesian Model Selection under Time Constraints
Hoege, M.; Nowak, W.; Illman, W. A.
2017-12-01
Bayesian model selection (BMS) provides a consistent framework for rating and comparing models in multi-model inference. In cases where models of vastly different complexity compete with each other, we also face vastly different computational runtimes of such models. For instance, time series of a quantity of interest can be simulated by an autoregressive process model that takes even less than a second for one run, or by a partial differential equations-based model with runtimes up to several hours or even days. The classical BMS is based on a quantity called Bayesian model evidence (BME). It determines the model weights in the selection process and resembles a trade-off between bias of a model and its complexity. However, in practice, the runtime of models is another weight relevant factor for model selection. Hence, we believe that it should be included, leading to an overall trade-off problem between bias, variance and computing effort. We approach this triple trade-off from the viewpoint of our ability to generate realizations of the models under a given computational budget. One way to obtain BME values is through sampling-based integration techniques. We argue with the fact that more expensive models can be sampled much less under time constraints than faster models (in straight proportion to their runtime). The computed evidence in favor of a more expensive model is statistically less significant than the evidence computed in favor of a faster model, since sampling-based strategies are always subject to statistical sampling error. We present a straightforward way to include this misbalance into the model weights that are the basis for model selection. Our approach follows directly from the idea of insufficient significance. It is based on a computationally cheap bootstrapping error estimate of model evidence and is easy to implement. The approach is illustrated in a small synthetic modeling study.
A Bayesian approach for parameter estimation and prediction using a computationally intensive model
International Nuclear Information System (INIS)
Higdon, Dave; McDonnell, Jordan D; Schunck, Nicolas; Sarich, Jason; Wild, Stefan M
2015-01-01
Bayesian methods have been successful in quantifying uncertainty in physics-based problems in parameter estimation and prediction. In these cases, physical measurements y are modeled as the best fit of a physics-based model η(θ), where θ denotes the uncertain, best input setting. Hence the statistical model is of the form y=η(θ)+ϵ, where ϵ accounts for measurement, and possibly other, error sources. When nonlinearity is present in η(⋅), the resulting posterior distribution for the unknown parameters in the Bayesian formulation is typically complex and nonstandard, requiring computationally demanding computational approaches such as Markov chain Monte Carlo (MCMC) to produce multivariate draws from the posterior. Although generally applicable, MCMC requires thousands (or even millions) of evaluations of the physics model η(⋅). This requirement is problematic if the model takes hours or days to evaluate. To overcome this computational bottleneck, we present an approach adapted from Bayesian model calibration. This approach combines output from an ensemble of computational model runs with physical measurements, within a statistical formulation, to carry out inference. A key component of this approach is a statistical response surface, or emulator, estimated from the ensemble of model runs. We demonstrate this approach with a case study in estimating parameters for a density functional theory model, using experimental mass/binding energy measurements from a collection of atomic nuclei. We also demonstrate how this approach produces uncertainties in predictions for recent mass measurements obtained at Argonne National Laboratory. (paper)
Selecting a climate model subset to optimise key ensemble properties
Directory of Open Access Journals (Sweden)
N. Herger
2018-02-01
Full Text Available End users studying impacts and risks caused by human-induced climate change are often presented with large multi-model ensembles of climate projections whose composition and size are arbitrarily determined. An efficient and versatile method that finds a subset which maintains certain key properties from the full ensemble is needed, but very little work has been done in this area. Therefore, users typically make their own somewhat subjective subset choices and commonly use the equally weighted model mean as a best estimate. However, different climate model simulations cannot necessarily be regarded as independent estimates due to the presence of duplicated code and shared development history. Here, we present an efficient and flexible tool that makes better use of the ensemble as a whole by finding a subset with improved mean performance compared to the multi-model mean while at the same time maintaining the spread and addressing the problem of model interdependence. Out-of-sample skill and reliability are demonstrated using model-as-truth experiments. This approach is illustrated with one set of optimisation criteria but we also highlight the flexibility of cost functions, depending on the focus of different users. The technique is useful for a range of applications that, for example, minimise present-day bias to obtain an accurate ensemble mean, reduce dependence in ensemble spread, maximise future spread, ensure good performance of individual models in an ensemble, reduce the ensemble size while maintaining important ensemble characteristics, or optimise several of these at the same time. As in any calibration exercise, the final ensemble is sensitive to the metric, observational product, and pre-processing steps used.
Selecting a climate model subset to optimise key ensemble properties
Herger, Nadja; Abramowitz, Gab; Knutti, Reto; Angélil, Oliver; Lehmann, Karsten; Sanderson, Benjamin M.
2018-02-01
End users studying impacts and risks caused by human-induced climate change are often presented with large multi-model ensembles of climate projections whose composition and size are arbitrarily determined. An efficient and versatile method that finds a subset which maintains certain key properties from the full ensemble is needed, but very little work has been done in this area. Therefore, users typically make their own somewhat subjective subset choices and commonly use the equally weighted model mean as a best estimate. However, different climate model simulations cannot necessarily be regarded as independent estimates due to the presence of duplicated code and shared development history. Here, we present an efficient and flexible tool that makes better use of the ensemble as a whole by finding a subset with improved mean performance compared to the multi-model mean while at the same time maintaining the spread and addressing the problem of model interdependence. Out-of-sample skill and reliability are demonstrated using model-as-truth experiments. This approach is illustrated with one set of optimisation criteria but we also highlight the flexibility of cost functions, depending on the focus of different users. The technique is useful for a range of applications that, for example, minimise present-day bias to obtain an accurate ensemble mean, reduce dependence in ensemble spread, maximise future spread, ensure good performance of individual models in an ensemble, reduce the ensemble size while maintaining important ensemble characteristics, or optimise several of these at the same time. As in any calibration exercise, the final ensemble is sensitive to the metric, observational product, and pre-processing steps used.
Three-model ensemble wind prediction in southern Italy
Torcasio, Rosa Claudia; Federico, Stefano; Calidonna, Claudia Roberta; Avolio, Elenio; Drofa, Oxana; Landi, Tony Christian; Malguzzi, Piero; Buzzi, Andrea; Bonasoni, Paolo
2016-03-01
Quality of wind prediction is of great importance since a good wind forecast allows the prediction of available wind power, improving the penetration of renewable energies into the energy market. Here, a 1-year (1 December 2012 to 30 November 2013) three-model ensemble (TME) experiment for wind prediction is considered. The models employed, run operationally at National Research Council - Institute of Atmospheric Sciences and Climate (CNR-ISAC), are RAMS (Regional Atmospheric Modelling System), BOLAM (BOlogna Limited Area Model), and MOLOCH (MOdello LOCale in H coordinates). The area considered for the study is southern Italy and the measurements used for the forecast verification are those of the GTS (Global Telecommunication System). Comparison with observations is made every 3 h up to 48 h of forecast lead time. Results show that the three-model ensemble outperforms the forecast of each individual model. The RMSE improvement compared to the best model is between 22 and 30 %, depending on the season. It is also shown that the three-model ensemble outperforms the IFS (Integrated Forecasting System) of the ECMWF (European Centre for Medium-Range Weather Forecast) for the surface wind forecasts. Notably, the three-model ensemble forecast performs better than each unbiased model, showing the added value of the ensemble technique. Finally, the sensitivity of the three-model ensemble RMSE to the length of the training period is analysed.
Broderick, Ciaran; Matthews, Tom; Wilby, Robert L.; Bastola, Satish; Murphy, Conor
2016-10-01
Understanding hydrological model predictive capabilities under contrasting climate conditions enables more robust decision making. Using Differential Split Sample Testing (DSST), we analyze the performance of six hydrological models for 37 Irish catchments under climate conditions unlike those used for model training. Additionally, we consider four ensemble averaging techniques when examining interperiod transferability. DSST is conducted using 2/3 year noncontinuous blocks of (i) the wettest/driest years on record based on precipitation totals and (ii) years with a more/less pronounced seasonal precipitation regime. Model transferability between contrasting regimes was found to vary depending on the testing scenario, catchment, and evaluation criteria considered. As expected, the ensemble average outperformed most individual ensemble members. However, averaging techniques differed considerably in the number of times they surpassed the best individual model member. Bayesian Model Averaging (BMA) and the Granger-Ramanathan Averaging (GRA) method were found to outperform the simple arithmetic mean (SAM) and Akaike Information Criteria Averaging (AICA). Here GRA performed better than the best individual model in 51%-86% of cases (according to the Nash-Sutcliffe criterion). When assessing model predictive skill under climate change conditions we recommend (i) setting up DSST to select the best available analogues of expected annual mean and seasonal climate conditions; (ii) applying multiple performance criteria; (iii) testing transferability using a diverse set of catchments; and (iv) using a multimodel ensemble in conjunction with an appropriate averaging technique. Given the computational efficiency and performance of GRA relative to BMA, the former is recommended as the preferred ensemble averaging technique for climate assessment.
Distributed Bayesian Networks for User Modeling
DEFF Research Database (Denmark)
Tedesco, Roberto; Dolog, Peter; Nejdl, Wolfgang
2006-01-01
The World Wide Web is a popular platform for providing eLearning applications to a wide spectrum of users. However – as users differ in their preferences, background, requirements, and goals – applications should provide personalization mechanisms. In the Web context, user models used by such ada......The World Wide Web is a popular platform for providing eLearning applications to a wide spectrum of users. However – as users differ in their preferences, background, requirements, and goals – applications should provide personalization mechanisms. In the Web context, user models used...... by such adaptive applications are often partial fragments of an overall user model. The fragments have then to be collected and merged into a global user profile. In this paper we investigate and present algorithms able to cope with distributed, fragmented user models – based on Bayesian Networks – in the context...... of Web-based eLearning platforms. The scenario we are tackling assumes learners who use several systems over time, which are able to create partial Bayesian Networks for user models based on the local system context. In particular, we focus on how to merge these partial user models. Our merge mechanism...
Bayesian Spatial Modelling with R-INLA
Directory of Open Access Journals (Sweden)
Finn Lindgren
2015-02-01
Full Text Available The principles behind the interface to continuous domain spatial models in the R- INLA software package for R are described. The integrated nested Laplace approximation (INLA approach proposed by Rue, Martino, and Chopin (2009 is a computationally effective alternative to MCMC for Bayesian inference. INLA is designed for latent Gaussian models, a very wide and flexible class of models ranging from (generalized linear mixed to spatial and spatio-temporal models. Combined with the stochastic partial differential equation approach (SPDE, Lindgren, Rue, and Lindstrm 2011, one can accommodate all kinds of geographically referenced data, including areal and geostatistical ones, as well as spatial point process data. The implementation interface covers stationary spatial mod- els, non-stationary spatial models, and also spatio-temporal models, and is applicable in epidemiology, ecology, environmental risk assessment, as well as general geostatistics.
Sparse Event Modeling with Hierarchical Bayesian Kernel Methods
2016-01-05
SECURITY CLASSIFICATION OF: The research objective of this proposal was to develop a predictive Bayesian kernel approach to model count data based on...several predictive variables. Such an approach, which we refer to as the Poisson Bayesian kernel model, is able to model the rate of occurrence of... kernel methods made use of: (i) the Bayesian property of improving predictive accuracy as data are dynamically obtained, and (ii) the kernel function
Gridded Calibration of Ensemble Wind Vector Forecasts Using Ensemble Model Output Statistics
Lazarus, S. M.; Holman, B. P.; Splitt, M. E.
2017-12-01
A computationally efficient method is developed that performs gridded post processing of ensemble wind vector forecasts. An expansive set of idealized WRF model simulations are generated to provide physically consistent high resolution winds over a coastal domain characterized by an intricate land / water mask. Ensemble model output statistics (EMOS) is used to calibrate the ensemble wind vector forecasts at observation locations. The local EMOS predictive parameters (mean and variance) are then spread throughout the grid utilizing flow-dependent statistical relationships extracted from the downscaled WRF winds. Using data withdrawal and 28 east central Florida stations, the method is applied to one year of 24 h wind forecasts from the Global Ensemble Forecast System (GEFS). Compared to the raw GEFS, the approach improves both the deterministic and probabilistic forecast skill. Analysis of multivariate rank histograms indicate the post processed forecasts are calibrated. Two downscaling case studies are presented, a quiescent easterly flow event and a frontal passage. Strengths and weaknesses of the approach are presented and discussed.
Energy Technology Data Exchange (ETDEWEB)
Soltanzadeh, I. [Tehran Univ. (Iran, Islamic Republic of). Inst. of Geophysics; Azadi, M.; Vakili, G.A. [Atmospheric Science and Meteorological Research Center (ASMERC), Teheran (Iran, Islamic Republic of)
2011-07-01
Using Bayesian Model Averaging (BMA), an attempt was made to obtain calibrated probabilistic numerical forecasts of 2-m temperature over Iran. The ensemble employs three limited area models (WRF, MM5 and HRM), with WRF used with five different configurations. Initial and boundary conditions for MM5 and WRF are obtained from the National Centers for Environmental Prediction (NCEP) Global Forecast System (GFS) and for HRM the initial and boundary conditions come from analysis of Global Model Europe (GME) of the German Weather Service. The resulting ensemble of seven members was run for a period of 6 months (from December 2008 to May 2009) over Iran. The 48-h raw ensemble outputs were calibrated using BMA technique for 120 days using a 40 days training sample of forecasts and relative verification data. The calibrated probabilistic forecasts were assessed using rank histogram and attribute diagrams. Results showed that application of BMA improved the reliability of the raw ensemble. Using the weighted ensemble mean forecast as a deterministic forecast it was found that the deterministic-style BMA forecasts performed usually better than the best member's deterministic forecast. (orig.)
Directory of Open Access Journals (Sweden)
I. Soltanzadeh
2011-07-01
Full Text Available Using Bayesian Model Averaging (BMA, an attempt was made to obtain calibrated probabilistic numerical forecasts of 2-m temperature over Iran. The ensemble employs three limited area models (WRF, MM5 and HRM, with WRF used with five different configurations. Initial and boundary conditions for MM5 and WRF are obtained from the National Centers for Environmental Prediction (NCEP Global Forecast System (GFS and for HRM the initial and boundary conditions come from analysis of Global Model Europe (GME of the German Weather Service. The resulting ensemble of seven members was run for a period of 6 months (from December 2008 to May 2009 over Iran. The 48-h raw ensemble outputs were calibrated using BMA technique for 120 days using a 40 days training sample of forecasts and relative verification data. The calibrated probabilistic forecasts were assessed using rank histogram and attribute diagrams. Results showed that application of BMA improved the reliability of the raw ensemble. Using the weighted ensemble mean forecast as a deterministic forecast it was found that the deterministic-style BMA forecasts performed usually better than the best member's deterministic forecast.
Adversarial life testing: A Bayesian negotiation model
International Nuclear Information System (INIS)
Rufo, M.J.; Martín, J.; Pérez, C.J.
2014-01-01
Life testing is a procedure intended for facilitating the process of making decisions in the context of industrial reliability. On the other hand, negotiation is a process of making joint decisions that has one of its main foundations in decision theory. A Bayesian sequential model of negotiation in the context of adversarial life testing is proposed. This model considers a general setting for which a manufacturer offers a product batch to a consumer. It is assumed that the reliability of the product is measured in terms of its lifetime. Furthermore, both the manufacturer and the consumer have to use their own information with respect to the quality of the product. Under these assumptions, two situations can be analyzed. For both of them, the main aim is to accept or reject the product batch based on the product reliability. This topic is related to a reliability demonstration problem. The procedure is applied to a class of distributions that belong to the exponential family. Thus, a unified framework addressing the main topics in the considered Bayesian model is presented. An illustrative example shows that the proposed technique can be easily applied in practice
Modeling Coordination Problems in a Music Ensemble
DEFF Research Database (Denmark)
Frimodt-Møller, Søren R.
2008-01-01
This paper considers in general terms, how musicians are able to coordinate through rational choices in a situation of (temporary) doubt in an ensemble performance. A fictitious example involving a 5-bar development in an unknown piece of music is analyzed in terms of epistemic logic, more...... to coordinate. Such coordination can be described in terms of Michael Bacharach's theory of variable frames as an aid to solve game theoretic coordination problems....
Jones, Matt; Love, Bradley C
2011-08-01
The prominence of Bayesian modeling of cognition has increased recently largely because of mathematical advances in specifying and deriving predictions from complex probabilistic models. Much of this research aims to demonstrate that cognitive behavior can be explained from rational principles alone, without recourse to psychological or neurological processes and representations. We note commonalities between this rational approach and other movements in psychology - namely, Behaviorism and evolutionary psychology - that set aside mechanistic explanations or make use of optimality assumptions. Through these comparisons, we identify a number of challenges that limit the rational program's potential contribution to psychological theory. Specifically, rational Bayesian models are significantly unconstrained, both because they are uninformed by a wide range of process-level data and because their assumptions about the environment are generally not grounded in empirical measurement. The psychological implications of most Bayesian models are also unclear. Bayesian inference itself is conceptually trivial, but strong assumptions are often embedded in the hypothesis sets and the approximation algorithms used to derive model predictions, without a clear delineation between psychological commitments and implementational details. Comparing multiple Bayesian models of the same task is rare, as is the realization that many Bayesian models recapitulate existing (mechanistic level) theories. Despite the expressive power of current Bayesian models, we argue they must be developed in conjunction with mechanistic considerations to offer substantive explanations of cognition. We lay out several means for such an integration, which take into account the representations on which Bayesian inference operates, as well as the algorithms and heuristics that carry it out. We argue this unification will better facilitate lasting contributions to psychological theory, avoiding the pitfalls
A multi-model ensemble approach to seabed mapping
Diesing, Markus; Stephens, David
2015-06-01
Seabed habitat mapping based on swath acoustic data and ground-truth samples is an emergent and active marine science discipline. Significant progress could be achieved by transferring techniques and approaches that have been successfully developed and employed in such fields as terrestrial land cover mapping. One such promising approach is the multiple classifier system, which aims at improving classification performance by combining the outputs of several classifiers. Here we present results of a multi-model ensemble applied to multibeam acoustic data covering more than 5000 km2 of seabed in the North Sea with the aim to derive accurate spatial predictions of seabed substrate. A suite of six machine learning classifiers (k-Nearest Neighbour, Support Vector Machine, Classification Tree, Random Forest, Neural Network and Naïve Bayes) was trained with ground-truth sample data classified into seabed substrate classes and their prediction accuracy was assessed with an independent set of samples. The three and five best performing models were combined to classifier ensembles. Both ensembles led to increased prediction accuracy as compared to the best performing single classifier. The improvements were however not statistically significant at the 5% level. Although the three-model ensemble did not perform significantly better than its individual component models, we noticed that the five-model ensemble did perform significantly better than three of the five component models. A classifier ensemble might therefore be an effective strategy to improve classification performance. Another advantage is the fact that the agreement in predicted substrate class between the individual models of the ensemble could be used as a measure of confidence. We propose a simple and spatially explicit measure of confidence that is based on model agreement and prediction accuracy.
Merging Digital Surface Models Implementing Bayesian Approaches
Sadeq, H.; Drummond, J.; Li, Z.
2016-06-01
In this research different DSMs from different sources have been merged. The merging is based on a probabilistic model using a Bayesian Approach. The implemented data have been sourced from very high resolution satellite imagery sensors (e.g. WorldView-1 and Pleiades). It is deemed preferable to use a Bayesian Approach when the data obtained from the sensors are limited and it is difficult to obtain many measurements or it would be very costly, thus the problem of the lack of data can be solved by introducing a priori estimations of data. To infer the prior data, it is assumed that the roofs of the buildings are specified as smooth, and for that purpose local entropy has been implemented. In addition to the a priori estimations, GNSS RTK measurements have been collected in the field which are used as check points to assess the quality of the DSMs and to validate the merging result. The model has been applied in the West-End of Glasgow containing different kinds of buildings, such as flat roofed and hipped roofed buildings. Both quantitative and qualitative methods have been employed to validate the merged DSM. The validation results have shown that the model was successfully able to improve the quality of the DSMs and improving some characteristics such as the roof surfaces, which consequently led to better representations. In addition to that, the developed model has been compared with the well established Maximum Likelihood model and showed similar quantitative statistical results and better qualitative results. Although the proposed model has been applied on DSMs that were derived from satellite imagery, it can be applied to any other sourced DSMs.
MERGING DIGITAL SURFACE MODELS IMPLEMENTING BAYESIAN APPROACHES
Directory of Open Access Journals (Sweden)
H. Sadeq
2016-06-01
Full Text Available In this research different DSMs from different sources have been merged. The merging is based on a probabilistic model using a Bayesian Approach. The implemented data have been sourced from very high resolution satellite imagery sensors (e.g. WorldView-1 and Pleiades. It is deemed preferable to use a Bayesian Approach when the data obtained from the sensors are limited and it is difficult to obtain many measurements or it would be very costly, thus the problem of the lack of data can be solved by introducing a priori estimations of data. To infer the prior data, it is assumed that the roofs of the buildings are specified as smooth, and for that purpose local entropy has been implemented. In addition to the a priori estimations, GNSS RTK measurements have been collected in the field which are used as check points to assess the quality of the DSMs and to validate the merging result. The model has been applied in the West-End of Glasgow containing different kinds of buildings, such as flat roofed and hipped roofed buildings. Both quantitative and qualitative methods have been employed to validate the merged DSM. The validation results have shown that the model was successfully able to improve the quality of the DSMs and improving some characteristics such as the roof surfaces, which consequently led to better representations. In addition to that, the developed model has been compared with the well established Maximum Likelihood model and showed similar quantitative statistical results and better qualitative results. Although the proposed model has been applied on DSMs that were derived from satellite imagery, it can be applied to any other sourced DSMs.
Effect on Prediction when Modeling Covariates in Bayesian Nonparametric Models.
Cruz-Marcelo, Alejandro; Rosner, Gary L; Müller, Peter; Stewart, Clinton F
2013-04-01
In biomedical research, it is often of interest to characterize biologic processes giving rise to observations and to make predictions of future observations. Bayesian nonparametric methods provide a means for carrying out Bayesian inference making as few assumptions about restrictive parametric models as possible. There are several proposals in the literature for extending Bayesian nonparametric models to include dependence on covariates. Limited attention, however, has been directed to the following two aspects. In this article, we examine the effect on fitting and predictive performance of incorporating covariates in a class of Bayesian nonparametric models by one of two primary ways: either in the weights or in the locations of a discrete random probability measure. We show that different strategies for incorporating continuous covariates in Bayesian nonparametric models can result in big differences when used for prediction, even though they lead to otherwise similar posterior inferences. When one needs the predictive density, as in optimal design, and this density is a mixture, it is better to make the weights depend on the covariates. We demonstrate these points via a simulated data example and in an application in which one wants to determine the optimal dose of an anticancer drug used in pediatric oncology.
HIGH-RESOLUTION ATMOSPHERIC ENSEMBLE MODELING AT SRNL
Energy Technology Data Exchange (ETDEWEB)
Buckley, R.; Werth, D.; Chiswell, S.; Etherton, B.
2011-05-10
The High-Resolution Mid-Atlantic Forecasting Ensemble (HME) is a federated effort to improve operational forecasts related to precipitation, convection and boundary layer evolution, and fire weather utilizing data and computing resources from a diverse group of cooperating institutions in order to create a mesoscale ensemble from independent members. Collaborating organizations involved in the project include universities, National Weather Service offices, and national laboratories, including the Savannah River National Laboratory (SRNL). The ensemble system is produced from an overlapping numerical weather prediction model domain and parameter subsets provided by each contributing member. The coordination, synthesis, and dissemination of the ensemble information are performed by the Renaissance Computing Institute (RENCI) at the University of North Carolina-Chapel Hill. This paper discusses background related to the HME effort, SRNL participation, and example results available from the RENCI website.
Learning Bayesian Dependence Model for Student Modelling
Directory of Open Access Journals (Sweden)
Adina COCU
2008-12-01
Full Text Available Learning a Bayesian network from a numeric set of data is a challenging task because of dual nature of learning process: initial need to learn network structure, and then to find out the distribution probability tables. In this paper, we propose a machine-learning algorithm based on hill climbing search combined with Tabu list. The aim of learning process is to discover the best network that represents dependences between nodes. Another issue in machine learning procedure is handling numeric attributes. In order to do that, we must perform an attribute discretization pre-processes. This discretization operation can influence the results of learning network structure. Therefore, we make a comparative study to find out the most suitable combination between discretization method and learning algorithm, for a specific data set.
Efficient Bayesian network modeling of systems
International Nuclear Information System (INIS)
Bensi, Michelle; Kiureghian, Armen Der; Straub, Daniel
2013-01-01
The Bayesian network (BN) is a convenient tool for probabilistic modeling of system performance, particularly when it is of interest to update the reliability of the system or its components in light of observed information. In this paper, BN structures for modeling the performance of systems that are defined in terms of their minimum link or cut sets are investigated. Standard BN structures that define the system node as a child of its constituent components or its minimum link/cut sets lead to converging structures, which are computationally disadvantageous and could severely hamper application of the BN to real systems. A systematic approach to defining an alternative formulation is developed that creates chain-like BN structures that are orders of magnitude more efficient, particularly in terms of computational memory demand. The formulation uses an integer optimization algorithm to identify the most efficient BN structure. Example applications demonstrate the proposed methodology and quantify the gained computational advantage
A new ensemble model for short term wind power prediction
DEFF Research Database (Denmark)
Madsen, Henrik; Albu, Razvan-Daniel; Felea, Ioan
2012-01-01
As the objective of this study, a non-linear ensemble system is used to develop a new model for predicting wind speed in short-term time scale. Short-term wind power prediction becomes an extremely important field of research for the energy sector. Regardless of the recent advancements in the re-search...... of prediction models, it was observed that different models have different capabilities and also no single model is suitable under all situations. The idea behind EPS (ensemble prediction systems) is to take advantage of the unique features of each subsystem to detain diverse patterns that exist in the dataset...
DEFF Research Database (Denmark)
Wellendorff, Jess; Lundgård, Keld Troen; Møgelhøj, Andreas
2012-01-01
A methodology for semiempirical density functional optimization, using regularization and cross-validation methods from machine learning, is developed. We demonstrate that such methods enable well-behaved exchange-correlation approximations in very flexible model spaces, thus avoiding the overfit......A methodology for semiempirical density functional optimization, using regularization and cross-validation methods from machine learning, is developed. We demonstrate that such methods enable well-behaved exchange-correlation approximations in very flexible model spaces, thus avoiding...... the energetics of intramolecular and intermolecular, bulk solid, and surface chemical bonding, and the developed optimization method explicitly handles making the compromise based on the directions in model space favored by different materials properties. The approach is applied to designing the Bayesian error...... sets validates the applicability of BEEF-vdW to studies in chemistry and condensed matter physics. Applications of the approximation and its Bayesian ensemble error estimate to two intricate surface science problems support this....
Directory of Open Access Journals (Sweden)
J. P. Werner
2015-03-01
Full Text Available Reconstructions of the late-Holocene climate rely heavily upon proxies that are assumed to be accurately dated by layer counting, such as measurements of tree rings, ice cores, and varved lake sediments. Considerable advances could be achieved if time-uncertain proxies were able to be included within these multiproxy reconstructions, and if time uncertainties were recognized and correctly modeled for proxies commonly treated as free of age model errors. Current approaches for accounting for time uncertainty are generally limited to repeating the reconstruction using each one of an ensemble of age models, thereby inflating the final estimated uncertainty – in effect, each possible age model is given equal weighting. Uncertainties can be reduced by exploiting the inferred space–time covariance structure of the climate to re-weight the possible age models. Here, we demonstrate how Bayesian hierarchical climate reconstruction models can be augmented to account for time-uncertain proxies. Critically, although a priori all age models are given equal probability of being correct, the probabilities associated with the age models are formally updated within the Bayesian framework, thereby reducing uncertainties. Numerical experiments show that updating the age model probabilities decreases uncertainty in the resulting reconstructions, as compared with the current de facto standard of sampling over all age models, provided there is sufficient information from other data sources in the spatial region of the time-uncertain proxy. This approach can readily be generalized to non-layer-counted proxies, such as those derived from marine sediments.
Model parameter updating using Bayesian networks
International Nuclear Information System (INIS)
Treml, C.A.; Ross, Timothy J.
2004-01-01
This paper outlines a model parameter updating technique for a new method of model validation using a modified model reference adaptive control (MRAC) framework with Bayesian Networks (BNs). The model parameter updating within this method is generic in the sense that the model/simulation to be validated is treated as a black box. It must have updateable parameters to which its outputs are sensitive, and those outputs must have metrics that can be compared to that of the model reference, i.e., experimental data. Furthermore, no assumptions are made about the statistics of the model parameter uncertainty, only upper and lower bounds need to be specified. This method is designed for situations where a model is not intended to predict a complete point-by-point time domain description of the item/system behavior; rather, there are specific points, features, or events of interest that need to be predicted. These specific points are compared to the model reference derived from actual experimental data. The logic for updating the model parameters to match the model reference is formed via a BN. The nodes of this BN consist of updateable model input parameters and the specific output values or features of interest. Each time the model is executed, the input/output pairs are used to adapt the conditional probabilities of the BN. Each iteration further refines the inferred model parameters to produce the desired model output. After parameter updating is complete and model inputs are inferred, reliabilities for the model output are supplied. Finally, this method is applied to a simulation of a resonance control cooling system for a prototype coupled cavity linac. The results are compared to experimental data.
Ensemble modeling for aromatic production in Escherichia coli.
Directory of Open Access Journals (Sweden)
Matthew L Rizk
2009-09-01
Full Text Available Ensemble Modeling (EM is a recently developed method for metabolic modeling, particularly for utilizing the effect of enzyme tuning data on the production of a specific compound to refine the model. This approach is used here to investigate the production of aromatic products in Escherichia coli. Instead of using dynamic metabolite data to fit a model, the EM approach uses phenotypic data (effects of enzyme overexpression or knockouts on the steady state production rate to screen possible models. These data are routinely generated during strain design. An ensemble of models is constructed that all reach the same steady state and are based on the same mechanistic framework at the elementary reaction level. The behavior of the models spans the kinetics allowable by thermodynamics. Then by using existing data from the literature for the overexpression of genes coding for transketolase (Tkt, transaldolase (Tal, and phosphoenolpyruvate synthase (Pps to screen the ensemble, we arrive at a set of models that properly describes the known enzyme overexpression phenotypes. This subset of models becomes more predictive as additional data are used to refine the models. The final ensemble of models demonstrates the characteristic of the cell that Tkt is the first rate controlling step, and correctly predicts that only after Tkt is overexpressed does an increase in Pps increase the production rate of aromatics. This work demonstrates that EM is able to capture the result of enzyme overexpression on aromatic producing bacteria by successfully utilizing routinely generated enzyme tuning data to guide model learning.
Lessons from Climate Modeling on the Design and Use of Ensembles for Crop Modeling
Wallach, Daniel; Mearns, Linda O.; Ruane, Alexander C.; Roetter, Reimund P.; Asseng, Senthold
2016-01-01
Working with ensembles of crop models is a recent but important development in crop modeling which promises to lead to better uncertainty estimates for model projections and predictions, better predictions using the ensemble mean or median, and closer collaboration within the modeling community. There are numerous open questions about the best way to create and analyze such ensembles. Much can be learned from the field of climate modeling, given its much longer experience with ensembles. We draw on that experience to identify questions and make propositions that should help make ensemble modeling with crop models more rigorous and informative. The propositions include defining criteria for acceptance of models in a crop MME, exploring criteria for evaluating the degree of relatedness of models in a MME, studying the effect of number of models in the ensemble, development of a statistical model of model sampling, creation of a repository for MME results, studies of possible differential weighting of models in an ensemble, creation of single model ensembles based on sampling from the uncertainty distribution of parameter values or inputs specifically oriented toward uncertainty estimation, the creation of super ensembles that sample more than one source of uncertainty, the analysis of super ensemble results to obtain information on total uncertainty and the separate contributions of different sources of uncertainty and finally further investigation of the use of the multi-model mean or median as a predictor.
Item selection via Bayesian IRT models.
Arima, Serena
2015-02-10
With reference to a questionnaire that aimed to assess the quality of life for dysarthric speakers, we investigate the usefulness of a model-based procedure for reducing the number of items. We propose a mixed cumulative logit model, which is known in the psychometrics literature as the graded response model: responses to different items are modelled as a function of individual latent traits and as a function of item characteristics, such as their difficulty and their discrimination power. We jointly model the discrimination and the difficulty parameters by using a k-component mixture of normal distributions. Mixture components correspond to disjoint groups of items. Items that belong to the same groups can be considered equivalent in terms of both difficulty and discrimination power. According to decision criteria, we select a subset of items such that the reduced questionnaire is able to provide the same information that the complete questionnaire provides. The model is estimated by using a Bayesian approach, and the choice of the number of mixture components is justified according to information criteria. We illustrate the proposed approach on the basis of data that are collected for 104 dysarthric patients by local health authorities in Lecce and in Milan. Copyright © 2014 John Wiley & Sons, Ltd.
Advances in Bayesian Modeling in Educational Research
Levy, Roy
2016-01-01
In this article, I provide a conceptually oriented overview of Bayesian approaches to statistical inference and contrast them with frequentist approaches that currently dominate conventional practice in educational research. The features and advantages of Bayesian approaches are illustrated with examples spanning several statistical modeling…
Improving wave forecasting by integrating ensemble modelling and machine learning
O'Donncha, F.; Zhang, Y.; James, S. C.
2017-12-01
Modern smart-grid networks use technologies to instantly relay information on supply and demand to support effective decision making. Integration of renewable-energy resources with these systems demands accurate forecasting of energy production (and demand) capacities. For wave-energy converters, this requires wave-condition forecasting to enable estimates of energy production. Current operational wave forecasting systems exhibit substantial errors with wave-height RMSEs of 40 to 60 cm being typical, which limits the reliability of energy-generation predictions thereby impeding integration with the distribution grid. In this study, we integrate physics-based models with statistical learning aggregation techniques that combine forecasts from multiple, independent models into a single "best-estimate" prediction of the true state. The Simulating Waves Nearshore physics-based model is used to compute wind- and currents-augmented waves in the Monterey Bay area. Ensembles are developed based on multiple simulations perturbing input data (wave characteristics supplied at the model boundaries and winds) to the model. A learning-aggregation technique uses past observations and past model forecasts to calculate a weight for each model. The aggregated forecasts are compared to observation data to quantify the performance of the model ensemble and aggregation techniques. The appropriately weighted ensemble model outperforms an individual ensemble member with regard to forecasting wave conditions.
Modelling of drug release from ensembles of aspirin microcapsules ...
African Journals Online (AJOL)
Purpose: In order to determine the drug release profile of an ensemble of aspirin crystals or microcapsules from its particle distribution a mathematical model that considered the individual release characteristics of the component single particles was developed. The model assumed that under sink conditions the release ...
Canonical Ensemble Model for Black Hole Horizon of Schwarzschild ...
Indian Academy of Sciences (India)
Abstract. In this paper, we use the canonical ensemble model to discuss the radiation of a Schwarzschild–de Sitter black hole on the black hole horizon. Using this model, we calculate the probability distribution from function of the emission shell. And the statistical meaning which compare with the distribution function is ...
Erfanian, A.; Fomenko, L.; Wang, G.
2016-12-01
Multi-model ensemble (MME) average is considered the most reliable for simulating both present-day and future climates. It has been a primary reference for making conclusions in major coordinated studies i.e. IPCC Assessment Reports and CORDEX. The biases of individual models cancel out each other in MME average, enabling the ensemble mean to outperform individual members in simulating the mean climate. This enhancement however comes with tremendous computational cost, which is especially inhibiting for regional climate modeling as model uncertainties can originate from both RCMs and the driving GCMs. Here we propose the Ensemble-based Reconstructed Forcings (ERF) approach to regional climate modeling that achieves a similar level of bias reduction at a fraction of cost compared with the conventional MME approach. The new method constructs a single set of initial and boundary conditions (IBCs) by averaging the IBCs of multiple GCMs, and drives the RCM with this ensemble average of IBCs to conduct a single run. Using a regional climate model (RegCM4.3.4-CLM4.5), we tested the method over West Africa for multiple combination of (up to six) GCMs. Our results indicate that the performance of the ERF method is comparable to that of the MME average in simulating the mean climate. The bias reduction seen in ERF simulations is achieved by using more realistic IBCs in solving the system of equations underlying the RCM physics and dynamics. This endows the new method with a theoretical advantage in addition to reducing computational cost. The ERF output is an unaltered solution of the RCM as opposed to a climate state that might not be physically plausible due to the averaging of multiple solutions with the conventional MME approach. The ERF approach should be considered for use in major international efforts such as CORDEX. Key words: Multi-model ensemble, ensemble analysis, ERF, regional climate modeling
Bayesian inference of chemical kinetic models from proposed reactions
Galagali, Nikhil; Marzouk, Youssef M.
2015-01-01
© 2014 Elsevier Ltd. Bayesian inference provides a natural framework for combining experimental data with prior knowledge to develop chemical kinetic models and quantify the associated uncertainties, not only in parameter values but also in model
Ensemble atmospheric dispersion modeling for emergency response consequence assessments
International Nuclear Information System (INIS)
Addis, R.P.; Buckley, R.L.
2003-01-01
Full text: Prognostic atmospheric dispersion models are used to generate consequence assessments, which assist decision-makers in the event of a release from a nuclear facility. Differences in the forecast wind fields generated by various meteorological agencies, differences in the transport and diffusion models themselves, as well as differences in the way these models treat the release source term, all may result in differences in the simulated plumes. This talk will address the U.S. participation in the European ENSEMBLE project, and present a perspective an how ensemble techniques may be used to enable atmospheric modelers to provide decision-makers with a more realistic understanding of how both the atmosphere and the models behave. Meteorological forecasts generated by numerical models from national and multinational meteorological agencies provide individual realizations of three-dimensional, time dependent atmospheric wind fields. These wind fields may be used to drive atmospheric dispersion (transport and diffusion) models, or they may be used to initiate other, finer resolution meteorological models, which in turn drive dispersion models. Many modeling agencies now utilize ensemble-modeling techniques to determine how sensitive the prognostic fields are to minor perturbations in the model parameters. However, the European Union programs RTMOD and ENSEMBLE are the first projects to utilize a WEB based ensemble approach to interpret the output from atmospheric dispersion models. The ensembles produced are different from those generated by meteorological forecasting centers in that they are ensembles of dispersion model outputs from many different atmospheric transport and diffusion models utilizing prognostic atmospheric fields from several different forecast centers. As such, they enable a decision-maker to consider the uncertainty in the plume transport and growth as a result of the differences in the forecast wind fields as well as the differences in the
Using synchronization in multi-model ensembles to improve prediction
Hiemstra, P.; Selten, F.
2012-04-01
In recent decades, many climate models have been developed to understand and predict the behavior of the Earth's climate system. Although these models are all based on the same basic physical principles, they still show different behavior. This is for example caused by the choice of how to parametrize sub-grid scale processes. One method to combine these imperfect models, is to run a multi-model ensemble. The models are given identical initial conditions and are integrated forward in time. A multi-model estimate can for example be a weighted mean of the ensemble members. We propose to go a step further, and try to obtain synchronization between the imperfect models by connecting the multi-model ensemble, and exchanging information. The combined multi-model ensemble is also known as a supermodel. The supermodel has learned from observations how to optimally exchange information between the ensemble members. In this study we focused on the density and formulation of the onnections within the supermodel. The main question was whether we could obtain syn-chronization between two climate models when connecting only a subset of their state spaces. Limiting the connected subspace has two advantages: 1) it limits the transfer of data (bytes) between the ensemble, which can be a limiting factor in large scale climate models, and 2) learning the optimal connection strategy from observations is easier. To answer the research question, we connected two identical quasi-geostrohic (QG) atmospheric models to each other, where the model have different initial conditions. The QG model is a qualitatively realistic simulation of the winter flow on the Northern hemisphere, has three layers and uses a spectral imple-mentation. We connected the models in the original spherical harmonical state space, and in linear combinations of these spherical harmonics, i.e. Empirical Orthogonal Functions (EOFs). We show that when connecting through spherical harmonics, we only need to connect 28% of
Constructive Epistemic Modeling: A Hierarchical Bayesian Model Averaging Method
Tsai, F. T. C.; Elshall, A. S.
2014-12-01
Constructive epistemic modeling is the idea that our understanding of a natural system through a scientific model is a mental construct that continually develops through learning about and from the model. Using the hierarchical Bayesian model averaging (HBMA) method [1], this study shows that segregating different uncertain model components through a BMA tree of posterior model probabilities, model prediction, within-model variance, between-model variance and total model variance serves as a learning tool [2]. First, the BMA tree of posterior model probabilities permits the comparative evaluation of the candidate propositions of each uncertain model component. Second, systemic model dissection is imperative for understanding the individual contribution of each uncertain model component to the model prediction and variance. Third, the hierarchical representation of the between-model variance facilitates the prioritization of the contribution of each uncertain model component to the overall model uncertainty. We illustrate these concepts using the groundwater modeling of a siliciclastic aquifer-fault system. The sources of uncertainty considered are from geological architecture, formation dip, boundary conditions and model parameters. The study shows that the HBMA analysis helps in advancing knowledge about the model rather than forcing the model to fit a particularly understanding or merely averaging several candidate models. [1] Tsai, F. T.-C., and A. S. Elshall (2013), Hierarchical Bayesian model averaging for hydrostratigraphic modeling: Uncertainty segregation and comparative evaluation. Water Resources Research, 49, 5520-5536, doi:10.1002/wrcr.20428. [2] Elshall, A.S., and F. T.-C. Tsai (2014). Constructive epistemic modeling of groundwater flow with geological architecture and boundary condition uncertainty under Bayesian paradigm, Journal of Hydrology, 517, 105-119, doi: 10.1016/j.jhydrol.2014.05.027.
Bootstrap prediction and Bayesian prediction under misspecified models
Fushiki, Tadayoshi
2005-01-01
We consider a statistical prediction problem under misspecified models. In a sense, Bayesian prediction is an optimal prediction method when an assumed model is true. Bootstrap prediction is obtained by applying Breiman's `bagging' method to a plug-in prediction. Bootstrap prediction can be considered to be an approximation to the Bayesian prediction under the assumption that the model is true. However, in applications, there are frequently deviations from the assumed model. In this paper, bo...
WE-E-BRE-05: Ensemble of Graphical Models for Predicting Radiation Pneumontis Risk
Energy Technology Data Exchange (ETDEWEB)
Lee, S; Ybarra, N; Jeyaseelan, K; El Naqa, I [McGill University, Montreal, Quebec (Canada); Faria, S; Kopek, N [Montreal General Hospital, Montreal, Quebec (Canada)
2014-06-15
Purpose: We propose a prior knowledge-based approach to construct an interaction graph of biological and dosimetric radiation pneumontis (RP) covariates for the purpose of developing a RP risk classifier. Methods: We recruited 59 NSCLC patients who received curative radiotherapy with minimum 6 month follow-up. 16 RP events was observed (CTCAE grade ≥2). Blood serum was collected from every patient before (pre-RT) and during RT (mid-RT). From each sample the concentration of the following five candidate biomarkers were taken as covariates: alpha-2-macroglobulin (α2M), angiotensin converting enzyme (ACE), transforming growth factor β (TGF-β), interleukin-6 (IL-6), and osteopontin (OPN). Dose-volumetric parameters were also included as covariates. The number of biological and dosimetric covariates was reduced by a variable selection scheme implemented by L1-regularized logistic regression (LASSO). Posterior probability distribution of interaction graphs between the selected variables was estimated from the data under the literature-based prior knowledge to weight more heavily the graphs that contain the expected associations. A graph ensemble was formed by averaging the most probable graphs weighted by their posterior, creating a Bayesian Network (BN)-based RP risk classifier. Results: The LASSO selected the following 7 RP covariates: (1) pre-RT concentration level of α2M, (2) α2M level mid- RT/pre-RT, (3) pre-RT IL6 level, (4) IL6 level mid-RT/pre-RT, (5) ACE mid-RT/pre-RT, (6) PTV volume, and (7) mean lung dose (MLD). The ensemble BN model achieved the maximum sensitivity/specificity of 81%/84% and outperformed univariate dosimetric predictors as shown by larger AUC values (0.78∼0.81) compared with MLD (0.61), V20 (0.65) and V30 (0.70). The ensembles obtained by incorporating the prior knowledge improved classification performance for the ensemble size 5∼50. Conclusion: We demonstrated a probabilistic ensemble method to detect robust associations between
Hierarchical Bayesian Modeling of Fluid-Induced Seismicity
Broccardo, M.; Mignan, A.; Wiemer, S.; Stojadinovic, B.; Giardini, D.
2017-11-01
In this study, we present a Bayesian hierarchical framework to model fluid-induced seismicity. The framework is based on a nonhomogeneous Poisson process with a fluid-induced seismicity rate proportional to the rate of injected fluid. The fluid-induced seismicity rate model depends upon a set of physically meaningful parameters and has been validated for six fluid-induced case studies. In line with the vision of hierarchical Bayesian modeling, the rate parameters are considered as random variables. We develop both the Bayesian inference and updating rules, which are used to develop a probabilistic forecasting model. We tested the Basel 2006 fluid-induced seismic case study to prove that the hierarchical Bayesian model offers a suitable framework to coherently encode both epistemic uncertainty and aleatory variability. Moreover, it provides a robust and consistent short-term seismic forecasting model suitable for online risk quantification and mitigation.
Bayesian Networks for Modeling Dredging Decisions
2011-10-01
years, that algorithms have been developed to solve these problems efficiently. Most modern Bayesian network software uses junction tree (a.k.a. join... software was used to develop the network . This is by no means an exhaustive list of Bayesian network applications, but it is representative of recent...characteristic node (SCN), state- defining node ( SDN ), effect node (EFN), or value node. The five types of nodes can be described as follows: ERDC/EL TR-11
Particle Kalman Filtering: A Nonlinear Bayesian Framework for Ensemble Kalman Filters*
Hoteit, Ibrahim; Luo, Xiaodong; Pham, Dinh-Tuan
2012-01-01
introduce a resampling step to the PEnKF in order to reduce the risk of weights collapse and improve the performance of the filter. Numerical experiments with the strongly nonlinear Lorenz-96 model are presented and discussed.
Influence of horizontal resolution and ensemble size on model performance
CSIR Research Space (South Africa)
Dalton, A
2014-10-01
Full Text Available Conference of South African Society for Atmospheric Sciences (SASAS), Potchefstroom, 1-2 October 2014 Influence of horizontal resolution and ensemble size on model performance Amaris Dalton*¹, Willem A. Landman ¹ʾ² ¹Departmen of Geography, Geo...
The egg model - a geological ensemble for reservoir simulation
Jansen, J.D.; Fonseca, R.M.; Kahrobaei, S.; Siraj, M.M.; Essen, van G.M.; Hof, Van den P.M.J.
2014-01-01
The ‘Egg Model’ is a synthetic reservoir model consisting of an ensemble of 101 relatively small three-dimensional realizations of a channelized oil reservoir produced under water flooding conditions with eight water injectors and four oil producers. It has been used in numerous publications to
Fast model updating coupling Bayesian inference and PGD model reduction
Rubio, Paul-Baptiste; Louf, François; Chamoin, Ludovic
2018-04-01
The paper focuses on a coupled Bayesian-Proper Generalized Decomposition (PGD) approach for the real-time identification and updating of numerical models. The purpose is to use the most general case of Bayesian inference theory in order to address inverse problems and to deal with different sources of uncertainties (measurement and model errors, stochastic parameters). In order to do so with a reasonable CPU cost, the idea is to replace the direct model called for Monte-Carlo sampling by a PGD reduced model, and in some cases directly compute the probability density functions from the obtained analytical formulation. This procedure is first applied to a welding control example with the updating of a deterministic parameter. In the second application, the identification of a stochastic parameter is studied through a glued assembly example.
Modelling machine ensembles with discrete event dynamical system theory
Hunter, Dan
1990-01-01
Discrete Event Dynamical System (DEDS) theory can be utilized as a control strategy for future complex machine ensembles that will be required for in-space construction. The control strategy involves orchestrating a set of interactive submachines to perform a set of tasks for a given set of constraints such as minimum time, minimum energy, or maximum machine utilization. Machine ensembles can be hierarchically modeled as a global model that combines the operations of the individual submachines. These submachines are represented in the global model as local models. Local models, from the perspective of DEDS theory , are described by the following: a set of system and transition states, an event alphabet that portrays actions that takes a submachine from one state to another, an initial system state, a partial function that maps the current state and event alphabet to the next state, and the time required for the event to occur. Each submachine in the machine ensemble is presented by a unique local model. The global model combines the local models such that the local models can operate in parallel under the additional logistic and physical constraints due to submachine interactions. The global model is constructed from the states, events, event functions, and timing requirements of the local models. Supervisory control can be implemented in the global model by various methods such as task scheduling (open-loop control) or implementing a feedback DEDS controller (closed-loop control).
Ensemble models of neutrophil trafficking in severe sepsis.
Directory of Open Access Journals (Sweden)
Sang Ok Song
Full Text Available A hallmark of severe sepsis is systemic inflammation which activates leukocytes and can result in their misdirection. This leads to both impaired migration to the locus of infection and increased infiltration into healthy tissues. In order to better understand the pathophysiologic mechanisms involved, we developed a coarse-grained phenomenological model of the acute inflammatory response in CLP (cecal ligation and puncture-induced sepsis in rats. This model incorporates distinct neutrophil kinetic responses to the inflammatory stimulus and the dynamic interactions between components of a compartmentalized inflammatory response. Ensembles of model parameter sets consistent with experimental observations were statistically generated using a Markov-Chain Monte Carlo sampling. Prediction uncertainty in the model states was quantified over the resulting ensemble parameter sets. Forward simulation of the parameter ensembles successfully captured experimental features and predicted that systemically activated circulating neutrophils display impaired migration to the tissue and neutrophil sequestration in the lung, consequently contributing to tissue damage and mortality. Principal component and multiple regression analyses of the parameter ensembles estimated from survivor and non-survivor cohorts provide insight into pathologic mechanisms dictating outcome in sepsis. Furthermore, the model was extended to incorporate hypothetical mechanisms by which immune modulation using extracorporeal blood purification results in improved outcome in septic rats. Simulations identified a sub-population (about 18% of the treated population that benefited from blood purification. Survivors displayed enhanced neutrophil migration to tissue and reduced sequestration of lung neutrophils, contributing to improved outcome. The model ensemble presented herein provides a platform for generating and testing hypotheses in silico, as well as motivating further experimental
When mechanism matters: Bayesian forecasting using models of ecological diffusion
Hefley, Trevor J.; Hooten, Mevin B.; Russell, Robin E.; Walsh, Daniel P.; Powell, James A.
2017-01-01
Ecological diffusion is a theory that can be used to understand and forecast spatio-temporal processes such as dispersal, invasion, and the spread of disease. Hierarchical Bayesian modelling provides a framework to make statistical inference and probabilistic forecasts, using mechanistic ecological models. To illustrate, we show how hierarchical Bayesian models of ecological diffusion can be implemented for large data sets that are distributed densely across space and time. The hierarchical Bayesian approach is used to understand and forecast the growth and geographic spread in the prevalence of chronic wasting disease in white-tailed deer (Odocoileus virginianus). We compare statistical inference and forecasts from our hierarchical Bayesian model to phenomenological regression-based methods that are commonly used to analyse spatial occurrence data. The mechanistic statistical model based on ecological diffusion led to important ecological insights, obviated a commonly ignored type of collinearity, and was the most accurate method for forecasting.
Bayesian hierarchical modelling of North Atlantic windiness
Vanem, E.; Breivik, O. N.
2013-03-01
Extreme weather conditions represent serious natural hazards to ship operations and may be the direct cause or contributing factor to maritime accidents. Such severe environmental conditions can be taken into account in ship design and operational windows can be defined that limits hazardous operations to less extreme conditions. Nevertheless, possible changes in the statistics of extreme weather conditions, possibly due to anthropogenic climate change, represent an additional hazard to ship operations that is less straightforward to account for in a consistent way. Obviously, there are large uncertainties as to how future climate change will affect the extreme weather conditions at sea and there is a need for stochastic models that can describe the variability in both space and time at various scales of the environmental conditions. Previously, Bayesian hierarchical space-time models have been developed to describe the variability and complex dependence structures of significant wave height in space and time. These models were found to perform reasonably well and provided some interesting results, in particular, pertaining to long-term trends in the wave climate. In this paper, a similar framework is applied to oceanic windiness and the spatial and temporal variability of the 10-m wind speed over an area in the North Atlantic ocean is investigated. When the results from the model for North Atlantic windiness is compared to the results for significant wave height over the same area, it is interesting to observe that whereas an increasing trend in significant wave height was identified, no statistically significant long-term trend was estimated in windiness. This may indicate that the increase in significant wave height is not due to an increase in locally generated wind waves, but rather to increased swell. This observation is also consistent with studies that have suggested a poleward shift of the main storm tracks.
Bayesian hierarchical modelling of North Atlantic windiness
Directory of Open Access Journals (Sweden)
E. Vanem
2013-03-01
Full Text Available Extreme weather conditions represent serious natural hazards to ship operations and may be the direct cause or contributing factor to maritime accidents. Such severe environmental conditions can be taken into account in ship design and operational windows can be defined that limits hazardous operations to less extreme conditions. Nevertheless, possible changes in the statistics of extreme weather conditions, possibly due to anthropogenic climate change, represent an additional hazard to ship operations that is less straightforward to account for in a consistent way. Obviously, there are large uncertainties as to how future climate change will affect the extreme weather conditions at sea and there is a need for stochastic models that can describe the variability in both space and time at various scales of the environmental conditions. Previously, Bayesian hierarchical space-time models have been developed to describe the variability and complex dependence structures of significant wave height in space and time. These models were found to perform reasonably well and provided some interesting results, in particular, pertaining to long-term trends in the wave climate. In this paper, a similar framework is applied to oceanic windiness and the spatial and temporal variability of the 10-m wind speed over an area in the North Atlantic ocean is investigated. When the results from the model for North Atlantic windiness is compared to the results for significant wave height over the same area, it is interesting to observe that whereas an increasing trend in significant wave height was identified, no statistically significant long-term trend was estimated in windiness. This may indicate that the increase in significant wave height is not due to an increase in locally generated wind waves, but rather to increased swell. This observation is also consistent with studies that have suggested a poleward shift of the main storm tracks.
Bayesian disease mapping: hierarchical modeling in spatial epidemiology
National Research Council Canada - National Science Library
Lawson, Andrew
2013-01-01
Since the publication of the first edition, many new Bayesian tools and methods have been developed for space-time data analysis, the predictive modeling of health outcomes, and other spatial biostatistical areas...
Flexible Bayesian Dynamic Modeling of Covariance and Correlation Matrices
Lan, Shiwei; Holbrook, Andrew; Fortin, Norbert J.; Ombao, Hernando; Shahbaba, Babak
2017-01-01
Modeling covariance (and correlation) matrices is a challenging problem due to the large dimensionality and positive-definiteness constraint. In this paper, we propose a novel Bayesian framework based on decomposing the covariance matrix
Dispersion Modeling Using Ensemble Forecasts Compared to ETEX Measurements.
Straume, Anne Grete; N'dri Koffi, Ernest; Nodop, Katrin
1998-11-01
Numerous numerical models are developed to predict long-range transport of hazardous air pollution in connection with accidental releases. When evaluating and improving such a model, it is important to detect uncertainties connected to the meteorological input data. A Lagrangian dispersion model, the Severe Nuclear Accident Program, is used here to investigate the effect of errors in the meteorological input data due to analysis error. An ensemble forecast, produced at the European Centre for Medium-Range Weather Forecasts, is then used as model input. The ensemble forecast members are generated by perturbing the initial meteorological fields of the weather forecast. The perturbations are calculated from singular vectors meant to represent possible forecast developments generated by instabilities in the atmospheric flow during the early part of the forecast. The instabilities are generated by errors in the analyzed fields. Puff predictions from the dispersion model, using ensemble forecast input, are compared, and a large spread in the predicted puff evolutions is found. This shows that the quality of the meteorological input data is important for the success of the dispersion model. In order to evaluate the dispersion model, the calculations are compared with measurements from the European Tracer Experiment. The model manages to predict the measured puff evolution concerning shape and time of arrival to a fairly high extent, up to 60 h after the start of the release. The modeled puff is still too narrow in the advection direction.
Bayesian network modeling of operator's state recognition process
International Nuclear Information System (INIS)
Hatakeyama, Naoki; Furuta, Kazuo
2000-01-01
Nowadays we are facing a difficult problem of establishing a good relation between humans and machines. To solve this problem, we suppose that machine system need to have a model of human behavior. In this study we model the state cognition process of a PWR plant operator as an example. We use a Bayesian network as an inference engine. We incorporate the knowledge hierarchy in the Bayesian network and confirm its validity using the example of PWR plant operator. (author)
Ensemble Kinetic Modeling of Metabolic Networks from Dynamic Metabolic Profiles
Directory of Open Access Journals (Sweden)
Gengjie Jia
2012-11-01
Full Text Available Kinetic modeling of metabolic pathways has important applications in metabolic engineering, but significant challenges still remain. The difficulties faced vary from finding best-fit parameters in a highly multidimensional search space to incomplete parameter identifiability. To meet some of these challenges, an ensemble modeling method is developed for characterizing a subset of kinetic parameters that give statistically equivalent goodness-of-fit to time series concentration data. The method is based on the incremental identification approach, where the parameter estimation is done in a step-wise manner. Numerical efficacy is achieved by reducing the dimensionality of parameter space and using efficient random parameter exploration algorithms. The shift toward using model ensembles, instead of the traditional “best-fit” models, is necessary to directly account for model uncertainty during the application of such models. The performance of the ensemble modeling approach has been demonstrated in the modeling of a generic branched pathway and the trehalose pathway in Saccharomyces cerevisiae using generalized mass action (GMA kinetics.
DEFF Research Database (Denmark)
Kuikka, Sakari; Haapasaari, Päivi Elisabet; Helle, Inari
2011-01-01
We review the experience obtained in using integrative Bayesian models in interdisciplinary analysis focusing on sustainable use of marine resources and environmental management tasks. We have applied Bayesian models to both fisheries and environmental risk analysis problems. Bayesian belief...... be time consuming and research projects can be difficult to manage due to unpredictable technical problems related to parameter estimation. Biology, sociology and environmental economics have their own scientific traditions. Bayesian models are becoming traditional tools in fisheries biology, where...
Combination of Bayesian Network and Overlay Model in User Modeling
Directory of Open Access Journals (Sweden)
Loc Nguyen
2009-12-01
Full Text Available The core of adaptive system is user model containing personal information such as knowledge, learning styles, goals… which is requisite for learning personalized process. There are many modeling approaches, for example: stereotype, overlay, plan recognition… but they don’t bring out the solid method for reasoning from user model. This paper introduces the statistical method that combines Bayesian network and overlay modeling so that it is able to infer user’s knowledge from evidences collected during user’s learning process.
Modeling Non-Gaussian Time Series with Nonparametric Bayesian Model.
Xu, Zhiguang; MacEachern, Steven; Xu, Xinyi
2015-02-01
We present a class of Bayesian copula models whose major components are the marginal (limiting) distribution of a stationary time series and the internal dynamics of the series. We argue that these are the two features with which an analyst is typically most familiar, and hence that these are natural components with which to work. For the marginal distribution, we use a nonparametric Bayesian prior distribution along with a cdf-inverse cdf transformation to obtain large support. For the internal dynamics, we rely on the traditionally successful techniques of normal-theory time series. Coupling the two components gives us a family of (Gaussian) copula transformed autoregressive models. The models provide coherent adjustments of time scales and are compatible with many extensions, including changes in volatility of the series. We describe basic properties of the models, show their ability to recover non-Gaussian marginal distributions, and use a GARCH modification of the basic model to analyze stock index return series. The models are found to provide better fit and improved short-range and long-range predictions than Gaussian competitors. The models are extensible to a large variety of fields, including continuous time models, spatial models, models for multiple series, models driven by external covariate streams, and non-stationary models.
Bayesian graphical models for genomewide association studies.
Verzilli, Claudio J; Stallard, Nigel; Whittaker, John C
2006-07-01
As the extent of human genetic variation becomes more fully characterized, the research community is faced with the challenging task of using this information to dissect the heritable components of complex traits. Genomewide association studies offer great promise in this respect, but their analysis poses formidable difficulties. In this article, we describe a computationally efficient approach to mining genotype-phenotype associations that scales to the size of the data sets currently being collected in such studies. We use discrete graphical models as a data-mining tool, searching for single- or multilocus patterns of association around a causative site. The approach is fully Bayesian, allowing us to incorporate prior knowledge on the spatial dependencies around each marker due to linkage disequilibrium, which reduces considerably the number of possible graphical structures. A Markov chain-Monte Carlo scheme is developed that yields samples from the posterior distribution of graphs conditional on the data from which probabilistic statements about the strength of any genotype-phenotype association can be made. Using data simulated under scenarios that vary in marker density, genotype relative risk of a causative allele, and mode of inheritance, we show that the proposed approach has better localization properties and leads to lower false-positive rates than do single-locus analyses. Finally, we present an application of our method to a quasi-synthetic data set in which data from the CYP2D6 region are embedded within simulated data on 100K single-nucleotide polymorphisms. Analysis is quick (<5 min), and we are able to localize the causative site to a very short interval.
A tutorial introduction to Bayesian models of cognitive development.
Perfors, Amy; Tenenbaum, Joshua B; Griffiths, Thomas L; Xu, Fei
2011-09-01
We present an introduction to Bayesian inference as it is used in probabilistic models of cognitive development. Our goal is to provide an intuitive and accessible guide to the what, the how, and the why of the Bayesian approach: what sorts of problems and data the framework is most relevant for, and how and why it may be useful for developmentalists. We emphasize a qualitative understanding of Bayesian inference, but also include information about additional resources for those interested in the cognitive science applications, mathematical foundations, or machine learning details in more depth. In addition, we discuss some important interpretation issues that often arise when evaluating Bayesian models in cognitive science. Copyright © 2010 Elsevier B.V. All rights reserved.
Modelling of JET diagnostics using Bayesian Graphical Models
Energy Technology Data Exchange (ETDEWEB)
Svensson, J. [IPP Greifswald, Greifswald (Germany); Ford, O. [Imperial College, London (United Kingdom); McDonald, D.; Hole, M.; Nessi, G. von; Meakins, A.; Brix, M.; Thomsen, H.; Werner, A.; Sirinelli, A.
2011-07-01
The mapping between physics parameters (such as densities, currents, flows, temperatures etc) defining the plasma 'state' under a given model and the raw observations of each plasma diagnostic will 1) depend on the particular physics model used, 2) is inherently probabilistic, from uncertainties on both observations and instrumental aspects of the mapping, such as calibrations, instrument functions etc. A flexible and principled way of modelling such interconnected probabilistic systems is through so called Bayesian graphical models. Being an amalgam between graph theory and probability theory, Bayesian graphical models can simulate the complex interconnections between physics models and diagnostic observations from multiple heterogeneous diagnostic systems, making it relatively easy to optimally combine the observations from multiple diagnostics for joint inference on parameters of the underlying physics model, which in itself can be represented as part of the graph. At JET about 10 diagnostic systems have to date been modelled in this way, and has lead to a number of new results, including: the reconstruction of the flux surface topology and q-profiles without any specific equilibrium assumption, using information from a number of different diagnostic systems; profile inversions taking into account the uncertainties in the flux surface positions and a substantial increase in accuracy of JET electron density and temperature profiles, including improved pedestal resolution, through the joint analysis of three diagnostic systems. It is believed that the Bayesian graph approach could potentially be utilised for very large sets of diagnostics, providing a generic data analysis framework for nuclear fusion experiments, that would be able to optimally utilize the information from multiple diagnostics simultaneously, and where the explicit graph representation of the connections to underlying physics models could be used for sophisticated model testing. This
Inventory model using bayesian dynamic linear model for demand forecasting
Directory of Open Access Journals (Sweden)
Marisol Valencia-Cárdenas
2014-12-01
Full Text Available An important factor of manufacturing process is the inventory management of terminated product. Constantly, industry is looking for better alternatives to establish an adequate plan of production and stored quantities, with optimal cost, getting quantities in a time horizon, which permits to define resources and logistics with anticipation, needed to distribute products on time. Total absence of historical data, required by many statistical models to forecast, demands the search for other kind of accurate techniques. This work presents an alternative that not only permits to forecast, in an adjusted way, but also, to provide optimal quantities to produce and store with an optimal cost, using Bayesian statistics. The proposal is illustrated with real data. Palabras clave: estadística bayesiana, optimización, modelo de inventarios, modelo lineal dinámico bayesiano. Keywords: Bayesian statistics, opti
Ensemble streamflow assimilation with the National Water Model.
Rafieeinasab, A.; McCreight, J. L.; Noh, S.; Seo, D. J.; Gochis, D.
2017-12-01
Through case studies of flooding across the US, we compare the performance of the National Water Model (NWM) data assimilation (DA) scheme to that of a newly implemented ensemble Kalman filter approach. The NOAA National Water Model (NWM) is an operational implementation of the community WRF-Hydro modeling system. As of August 2016, the NWM forecasts of distributed hydrologic states and fluxes (including soil moisture, snowpack, ET, and ponded water) over the contiguous United States have been publicly disseminated by the National Center for Environmental Prediction (NCEP) . It also provides streamflow forecasts at more than 2.7 million river reaches up to 30 days in advance. The NWM employs a nudging scheme to assimilate more than 6,000 USGS streamflow observations and provide initial conditions for its forecasts. A problem with nudging is how the forecasts relax quickly to open-loop bias in the forecast. This has been partially addressed by an experimental bias correction approach which was found to have issues with phase errors during flooding events. In this work, we present an ensemble streamflow data assimilation approach combining new channel-only capabilities of the NWM and HydroDART (a coupling of the offline WRF-Hydro model and NCAR's Data Assimilation Research Testbed; DART). Our approach focuses on the single model state of discharge and incorporates error distributions on channel-influxes (overland and groundwater) in the assimilation via an ensemble Kalman filter (EnKF). In order to avoid filter degeneracy associated with a limited number of ensemble at large scale, DART's covariance inflation (Anderson, 2009) and localization capabilities are implemented and evaluated. The current NWM data assimilation scheme is compared to preliminary results from the EnKF application for several flooding case studies across the US.
Canonical Ensemble Model for Black Hole Radiation Jingyi Zhang
Indian Academy of Sciences (India)
Canonical Ensemble Model for Black Hole Radiation. 575. For entropy, there is no corresponding thermodynamical quantity, without loss of generalization. Let us define an entropy operator. ˆS = −KB ln ˆρ. (11). Then, the mean value of entropy is. S ≡〈ˆS〉 = tr( ˆρ ˆS) = −KBtr( ˆρ ln ˆρ). (12). For ideal gases, let y = V , then the ...
A Bayesian alternative for multi-objective ecohydrological model specification
Tang, Yating; Marshall, Lucy; Sharma, Ashish; Ajami, Hoori
2018-01-01
Recent studies have identified the importance of vegetation processes in terrestrial hydrologic systems. Process-based ecohydrological models combine hydrological, physical, biochemical and ecological processes of the catchments, and as such are generally more complex and parametric than conceptual hydrological models. Thus, appropriate calibration objectives and model uncertainty analysis are essential for ecohydrological modeling. In recent years, Bayesian inference has become one of the most popular tools for quantifying the uncertainties in hydrological modeling with the development of Markov chain Monte Carlo (MCMC) techniques. The Bayesian approach offers an appealing alternative to traditional multi-objective hydrologic model calibrations by defining proper prior distributions that can be considered analogous to the ad-hoc weighting often prescribed in multi-objective calibration. Our study aims to develop appropriate prior distributions and likelihood functions that minimize the model uncertainties and bias within a Bayesian ecohydrological modeling framework based on a traditional Pareto-based model calibration technique. In our study, a Pareto-based multi-objective optimization and a formal Bayesian framework are implemented in a conceptual ecohydrological model that combines a hydrological model (HYMOD) and a modified Bucket Grassland Model (BGM). Simulations focused on one objective (streamflow/LAI) and multiple objectives (streamflow and LAI) with different emphasis defined via the prior distribution of the model error parameters. Results show more reliable outputs for both predicted streamflow and LAI using Bayesian multi-objective calibration with specified prior distributions for error parameters based on results from the Pareto front in the ecohydrological modeling. The methodology implemented here provides insight into the usefulness of multiobjective Bayesian calibration for ecohydrologic systems and the importance of appropriate prior
Bayesian Estimation of the Logistic Positive Exponent IRT Model
Bolfarine, Heleno; Bazan, Jorge Luis
2010-01-01
A Bayesian inference approach using Markov Chain Monte Carlo (MCMC) is developed for the logistic positive exponent (LPE) model proposed by Samejima and for a new skewed Logistic Item Response Theory (IRT) model, named Reflection LPE model. Both models lead to asymmetric item characteristic curves (ICC) and can be appropriate because a symmetric…
Stochastic resonance in models of neuronal ensembles
International Nuclear Information System (INIS)
Chialvo, D.R.; Longtin, A.; Mueller-Gerkin, J.
1997-01-01
Two recently suggested mechanisms for the neuronal encoding of sensory information involving the effect of stochastic resonance with aperiodic time-varying inputs are considered. It is shown, using theoretical arguments and numerical simulations, that the nonmonotonic behavior with increasing noise of the correlation measures used for the so-called aperiodic stochastic resonance (ASR) scenario does not rely on the cooperative effect typical of stochastic resonance in bistable and excitable systems. Rather, ASR with slowly varying signals is more properly interpreted as linearization by noise. Consequently, the broadening of the open-quotes resonance curveclose quotes in the multineuron stochastic resonance without tuning scenario can also be explained by this linearization. Computation of the input-output correlation as a function of both signal frequency and noise for the model system further reveals conditions where noise-induced firing with aperiodic inputs will benefit from stochastic resonance rather than linearization by noise. Thus, our study clarifies the tuning requirements for the optimal transduction of subthreshold aperiodic signals. It also shows that a single deterministic neuron can perform as well as a network when biased into a suprathreshold regime. Finally, we show that the inclusion of a refractory period in the spike-detection scheme produces a better correlation between instantaneous firing rate and input signal. copyright 1997 The American Physical Society
Using consensus bayesian network to model the reactive oxygen species regulatory pathway.
Directory of Open Access Journals (Sweden)
Liangdong Hu
Full Text Available Bayesian network is one of the most successful graph models for representing the reactive oxygen species regulatory pathway. With the increasing number of microarray measurements, it is possible to construct the bayesian network from microarray data directly. Although large numbers of bayesian network learning algorithms have been developed, when applying them to learn bayesian networks from microarray data, the accuracies are low due to that the databases they used to learn bayesian networks contain too few microarray data. In this paper, we propose a consensus bayesian network which is constructed by combining bayesian networks from relevant literatures and bayesian networks learned from microarray data. It would have a higher accuracy than the bayesian networks learned from one database. In the experiment, we validated the bayesian network combination algorithm on several classic machine learning databases and used the consensus bayesian network to model the Escherichia coli's ROS pathway.
Multi-wheat-model ensemble responses to interannual climatic variability
DEFF Research Database (Denmark)
Ruane, A C; Hudson, N I; Asseng, S
2016-01-01
We compare 27 wheat models' yield responses to interannual climate variability, analyzed at locations in Argentina, Australia, India, and The Netherlands as part of the Agricultural Model Intercomparison and Improvement Project (AgMIP) Wheat Pilot. Each model simulated 1981–2010 grain yield, and ......-term warming, suggesting that additional processes differentiate climate change impacts from observed climate variability analogs and motivating continuing analysis and model development efforts.......We compare 27 wheat models' yield responses to interannual climate variability, analyzed at locations in Argentina, Australia, India, and The Netherlands as part of the Agricultural Model Intercomparison and Improvement Project (AgMIP) Wheat Pilot. Each model simulated 1981–2010 grain yield, and we...... evaluate results against the interannual variability of growing season temperature, precipitation, and solar radiation. The amount of information used for calibration has only a minor effect on most models' climate response, and even small multi-model ensembles prove beneficial. Wheat model clusters reveal...
Maritime piracy situation modelling with dynamic Bayesian networks
CSIR Research Space (South Africa)
Dabrowski, James M
2015-05-01
Full Text Available A generative model for modelling maritime vessel behaviour is proposed. The model is a novel variant of the dynamic Bayesian network (DBN). The proposed DBN is in the form of a switching linear dynamic system (SLDS) that has been extended into a...
Bayesian inference model for fatigue life of laminated composites
DEFF Research Database (Denmark)
Dimitrov, Nikolay Krasimirov; Kiureghian, Armen Der; Berggreen, Christian
2016-01-01
A probabilistic model for estimating the fatigue life of laminated composite plates is developed. The model is based on lamina-level input data, making it possible to predict fatigue properties for a wide range of laminate configurations. Model parameters are estimated by Bayesian inference. The ...
Characterizing economic trends by Bayesian stochastic model specification search
DEFF Research Database (Denmark)
Grassi, Stefano; Proietti, Tommaso
We extend a recently proposed Bayesian model selection technique, known as stochastic model specification search, for characterising the nature of the trend in macroeconomic time series. In particular, we focus on autoregressive models with possibly time-varying intercept and slope and decide on ...
Bayesian Plackett-Luce Mixture Models for Partially Ranked Data.
Mollica, Cristina; Tardella, Luca
2017-06-01
The elicitation of an ordinal judgment on multiple alternatives is often required in many psychological and behavioral experiments to investigate preference/choice orientation of a specific population. The Plackett-Luce model is one of the most popular and frequently applied parametric distributions to analyze rankings of a finite set of items. The present work introduces a Bayesian finite mixture of Plackett-Luce models to account for unobserved sample heterogeneity of partially ranked data. We describe an efficient way to incorporate the latent group structure in the data augmentation approach and the derivation of existing maximum likelihood procedures as special instances of the proposed Bayesian method. Inference can be conducted with the combination of the Expectation-Maximization algorithm for maximum a posteriori estimation and the Gibbs sampling iterative procedure. We additionally investigate several Bayesian criteria for selecting the optimal mixture configuration and describe diagnostic tools for assessing the fitness of ranking distributions conditionally and unconditionally on the number of ranked items. The utility of the novel Bayesian parametric Plackett-Luce mixture for characterizing sample heterogeneity is illustrated with several applications to simulated and real preference ranked data. We compare our method with the frequentist approach and a Bayesian nonparametric mixture model both assuming the Plackett-Luce model as a mixture component. Our analysis on real datasets reveals the importance of an accurate diagnostic check for an appropriate in-depth understanding of the heterogenous nature of the partial ranking data.
Girsanov reweighting for path ensembles and Markov state models
Donati, L.; Hartmann, C.; Keller, B. G.
2017-06-01
The sensitivity of molecular dynamics on changes in the potential energy function plays an important role in understanding the dynamics and function of complex molecules. We present a method to obtain path ensemble averages of a perturbed dynamics from a set of paths generated by a reference dynamics. It is based on the concept of path probability measure and the Girsanov theorem, a result from stochastic analysis to estimate a change of measure of a path ensemble. Since Markov state models (MSMs) of the molecular dynamics can be formulated as a combined phase-space and path ensemble average, the method can be extended to reweight MSMs by combining it with a reweighting of the Boltzmann distribution. We demonstrate how to efficiently implement the Girsanov reweighting in a molecular dynamics simulation program by calculating parts of the reweighting factor "on the fly" during the simulation, and we benchmark the method on test systems ranging from a two-dimensional diffusion process and an artificial many-body system to alanine dipeptide and valine dipeptide in implicit and explicit water. The method can be used to study the sensitivity of molecular dynamics on external perturbations as well as to reweight trajectories generated by enhanced sampling schemes to the original dynamics.
A Simple Approach to Account for Climate Model Interdependence in Multi-Model Ensembles
Herger, N.; Abramowitz, G.; Angelil, O. M.; Knutti, R.; Sanderson, B.
2016-12-01
Multi-model ensembles are an indispensable tool for future climate projection and its uncertainty quantification. Ensembles containing multiple climate models generally have increased skill, consistency and reliability. Due to the lack of agreed-on alternatives, most scientists use the equally-weighted multi-model mean as they subscribe to model democracy ("one model, one vote").Different research groups are known to share sections of code, parameterizations in their model, literature, or even whole model components. Therefore, individual model runs do not represent truly independent estimates. Ignoring this dependence structure might lead to a false model consensus, wrong estimation of uncertainty and effective number of independent models.Here, we present a way to partially address this problem by selecting a subset of CMIP5 model runs so that its climatological mean minimizes the RMSE compared to a given observation product. Due to the cancelling out of errors, regional biases in the ensemble mean are reduced significantly.Using a model-as-truth experiment we demonstrate that those regional biases persist into the future and we are not fitting noise, thus providing improved observationally-constrained projections of the 21st century. The optimally selected ensemble shows significantly higher global mean surface temperature projections than the original ensemble, where all the model runs are considered. Moreover, the spread is decreased well beyond that expected from the decreased ensemble size.Several previous studies have recommended an ensemble selection approach based on performance ranking of the model runs. Here, we show that this approach can perform even worse than randomly selecting ensemble members and can thus be harmful. We suggest that accounting for interdependence in the ensemble selection process is a necessary step for robust projections for use in impact assessments, adaptation and mitigation of climate change.
Bayesian inference of chemical kinetic models from proposed reactions
Galagali, Nikhil
2015-02-01
© 2014 Elsevier Ltd. Bayesian inference provides a natural framework for combining experimental data with prior knowledge to develop chemical kinetic models and quantify the associated uncertainties, not only in parameter values but also in model structure. Most existing applications of Bayesian model selection methods to chemical kinetics have been limited to comparisons among a small set of models, however. The significant computational cost of evaluating posterior model probabilities renders traditional Bayesian methods infeasible when the model space becomes large. We present a new framework for tractable Bayesian model inference and uncertainty quantification using a large number of systematically generated model hypotheses. The approach involves imposing point-mass mixture priors over rate constants and exploring the resulting posterior distribution using an adaptive Markov chain Monte Carlo method. The posterior samples are used to identify plausible models, to quantify rate constant uncertainties, and to extract key diagnostic information about model structure-such as the reactions and operating pathways most strongly supported by the data. We provide numerical demonstrations of the proposed framework by inferring kinetic models for catalytic steam and dry reforming of methane using available experimental data.
Multi-Wheat-Model Ensemble Responses to Interannual Climate Variability
Ruane, Alex C.; Hudson, Nicholas I.; Asseng, Senthold; Camarrano, Davide; Ewert, Frank; Martre, Pierre; Boote, Kenneth J.; Thorburn, Peter J.; Aggarwal, Pramod K.; Angulo, Carlos
2016-01-01
We compare 27 wheat models' yield responses to interannual climate variability, analyzed at locations in Argentina, Australia, India, and The Netherlands as part of the Agricultural Model Intercomparison and Improvement Project (AgMIP) Wheat Pilot. Each model simulated 1981e2010 grain yield, and we evaluate results against the interannual variability of growing season temperature, precipitation, and solar radiation. The amount of information used for calibration has only a minor effect on most models' climate response, and even small multi-model ensembles prove beneficial. Wheat model clusters reveal common characteristics of yield response to climate; however models rarely share the same cluster at all four sites indicating substantial independence. Only a weak relationship (R2 0.24) was found between the models' sensitivities to interannual temperature variability and their response to long-termwarming, suggesting that additional processes differentiate climate change impacts from observed climate variability analogs and motivating continuing analysis and model development efforts.
Bayesian Subset Modeling for High-Dimensional Generalized Linear Models
Liang, Faming
2013-06-01
This article presents a new prior setting for high-dimensional generalized linear models, which leads to a Bayesian subset regression (BSR) with the maximum a posteriori model approximately equivalent to the minimum extended Bayesian information criterion model. The consistency of the resulting posterior is established under mild conditions. Further, a variable screening procedure is proposed based on the marginal inclusion probability, which shares the same properties of sure screening and consistency with the existing sure independence screening (SIS) and iterative sure independence screening (ISIS) procedures. However, since the proposed procedure makes use of joint information from all predictors, it generally outperforms SIS and ISIS in real applications. This article also makes extensive comparisons of BSR with the popular penalized likelihood methods, including Lasso, elastic net, SIS, and ISIS. The numerical results indicate that BSR can generally outperform the penalized likelihood methods. The models selected by BSR tend to be sparser and, more importantly, of higher prediction ability. In addition, the performance of the penalized likelihood methods tends to deteriorate as the number of predictors increases, while this is not significant for BSR. Supplementary materials for this article are available online. © 2013 American Statistical Association.
Research & development and growth: A Bayesian model averaging analysis
Czech Academy of Sciences Publication Activity Database
Horváth, Roman
2011-01-01
Roč. 28, č. 6 (2011), s. 2669-2673 ISSN 0264-9993. [Society for Non-linear Dynamics and Econometrics Annual Conferencen. Washington DC, 16.03.2011-18.03.2011] R&D Projects: GA ČR GA402/09/0965 Institutional research plan: CEZ:AV0Z10750506 Keywords : Research and development * Growth * Bayesian model averaging Subject RIV: AH - Economic s Impact factor: 0.701, year: 2011 http://library.utia.cas.cz/separaty/2011/E/horvath-research & development and growth a bayesian model averaging analysis.pdf
Bayesian inference with information content model check for Langevin equations
DEFF Research Database (Denmark)
Krog, Jens F. C.; Lomholt, Michael Andersen
2017-01-01
The Bayesian data analysis framework has been proven to be a systematic and effective method of parameter inference and model selection for stochastic processes. In this work we introduce an information content model check which may serve as a goodness-of-fit, like the chi-square procedure...
Accurate phenotyping: Reconciling approaches through Bayesian model averaging.
Directory of Open Access Journals (Sweden)
Carla Chia-Ming Chen
Full Text Available Genetic research into complex diseases is frequently hindered by a lack of clear biomarkers for phenotype ascertainment. Phenotypes for such diseases are often identified on the basis of clinically defined criteria; however such criteria may not be suitable for understanding the genetic composition of the diseases. Various statistical approaches have been proposed for phenotype definition; however our previous studies have shown that differences in phenotypes estimated using different approaches have substantial impact on subsequent analyses. Instead of obtaining results based upon a single model, we propose a new method, using Bayesian model averaging to overcome problems associated with phenotype definition. Although Bayesian model averaging has been used in other fields of research, this is the first study that uses Bayesian model averaging to reconcile phenotypes obtained using multiple models. We illustrate the new method by applying it to simulated genetic and phenotypic data for Kofendred personality disorder-an imaginary disease with several sub-types. Two separate statistical methods were used to identify clusters of individuals with distinct phenotypes: latent class analysis and grade of membership. Bayesian model averaging was then used to combine the two clusterings for the purpose of subsequent linkage analyses. We found that causative genetic loci for the disease produced higher LOD scores using model averaging than under either individual model separately. We attribute this improvement to consolidation of the cores of phenotype clusters identified using each individual method.
Involving stakeholders in building integrated fisheries models using Bayesian methods
DEFF Research Database (Denmark)
Haapasaari, Päivi Elisabet; Mäntyniemi, Samu; Kuikka, Sakari
2013-01-01
the potential of the study to contribute to the development of participatory modeling practices. It is concluded that the subjective perspective to knowledge, that is fundamental in Bayesian theory, suits participatory modeling better than a positivist paradigm that seeks the objective truth. The methodology...
Small Sample Properties of Bayesian Multivariate Autoregressive Time Series Models
Price, Larry R.
2012-01-01
The aim of this study was to compare the small sample (N = 1, 3, 5, 10, 15) performance of a Bayesian multivariate vector autoregressive (BVAR-SEM) time series model relative to frequentist power and parameter estimation bias. A multivariate autoregressive model was developed based on correlated autoregressive time series vectors of varying…
Bayesian Network Models in Cyber Security: A Systematic Review
Chockalingam, S.; Pieters, W.; Herdeiro Teixeira, A.M.; van Gelder, P.H.A.J.M.; Lipmaa, Helger; Mitrokotsa, Aikaterini; Matulevicius, Raimundas
2017-01-01
Bayesian Networks (BNs) are an increasingly popular modelling technique in cyber security especially due to their capability to overcome data limitations. This is also instantiated by the growth of BN models development in cyber security. However, a comprehensive comparison and analysis of these
A Bayesian Model of the Memory Colour Effect.
Witzel, Christoph; Olkkonen, Maria; Gegenfurtner, Karl R
2018-01-01
According to the memory colour effect, the colour of a colour-diagnostic object is not perceived independently of the object itself. Instead, it has been shown through an achromatic adjustment method that colour-diagnostic objects still appear slightly in their typical colour, even when they are colourimetrically grey. Bayesian models provide a promising approach to capture the effect of prior knowledge on colour perception and to link these effects to more general effects of cue integration. Here, we model memory colour effects using prior knowledge about typical colours as priors for the grey adjustments in a Bayesian model. This simple model does not involve any fitting of free parameters. The Bayesian model roughly captured the magnitude of the measured memory colour effect for photographs of objects. To some extent, the model predicted observed differences in memory colour effects across objects. The model could not account for the differences in memory colour effects across different levels of realism in the object images. The Bayesian model provides a particularly simple account of memory colour effects, capturing some of the multiple sources of variation of these effects.
Rainfall estimation with TFR model using Ensemble Kalman filter
Asyiqotur Rohmah, Nabila; Apriliani, Erna
2018-03-01
Rainfall fluctuation can affect condition of other environment, correlated with economic activity and public health. The increasing of global average temperature is influenced by the increasing of CO2 in the atmosphere, which caused climate change. Meanwhile, the forests as carbon sinks that help keep the carbon cycle and climate change mitigation. Climate change caused by rainfall intensity deviations can affect the economy of a region, and even countries. It encourages research on rainfall associated with an area of forest. In this study, the mathematics model that used is a model which describes the global temperatures, forest cover, and seasonal rainfall called the TFR (temperature, forest cover, and rainfall) model. The model will be discretized first, and then it will be estimated by the method of Ensemble Kalman Filter (EnKF). The result shows that the more ensembles used in estimation, the better the result is. Also, the accurateness of simulation result is influenced by measurement variable. If a variable is measurement data, the result of simulation is better.
Ensemble catchment hydrological modelling for climate change impact analysis
Vansteenkiste, Thomas; Ntegeka, Victor; Willems, Patrick
2014-05-01
It is vital to investigate how the hydrological model structure affects the climate change impact given that future changes not in the range for which the models were calibrated or validated are likely. Thus an ensemble modelling approach which involves a diversity of models with different structures such as spatial resolutions and process descriptions is crucial. The ensemble modelling approach was applied to a set of models: from the lumped conceptual models NAM, PDM and VHM, an intermediate detailed and distributed model WetSpa, to the highly detailed and fully distributed model MIKE-SHE. Explicit focus was given to the high and low flow extremes. All models were calibrated for sub flows and quick flows derived from rainfall and potential evapotranspiration (ETo) time series. In general, all models were able to produce reliable estimates of the flow regimes under the current climate for extreme peak and low flows. An intercomparison of the low and high flow changes under changed climatic conditions was made using climate scenarios tailored for extremes. Tailoring was important for two reasons. First, since the use of many scenarios was not feasible it was necessary to construct few scenarios that would reasonably represent the range of extreme impacts. Second, scenarios would be more informative as changes in high and low flows would be easily traced to changes of ETo and rainfall; the tailored scenarios are constructed using seasonal changes that are defined using different levels of magnitude (high, mean and low) for rainfall and ETo. After simulation of these climate scenarios in the five hydrological models, close agreement was found among the models. The different models predicted similar range of peak flow changes. For the low flows, however, the differences in the projected impact range by different hydrological models was larger, particularly for the drier scenarios. This suggests that the hydrological model structure is critical in low flow predictions
Modeling error distributions of growth curve models through Bayesian methods.
Zhang, Zhiyong
2016-06-01
Growth curve models are widely used in social and behavioral sciences. However, typical growth curve models often assume that the errors are normally distributed although non-normal data may be even more common than normal data. In order to avoid possible statistical inference problems in blindly assuming normality, a general Bayesian framework is proposed to flexibly model normal and non-normal data through the explicit specification of the error distributions. A simulation study shows when the distribution of the error is correctly specified, one can avoid the loss in the efficiency of standard error estimates. A real example on the analysis of mathematical ability growth data from the Early Childhood Longitudinal Study, Kindergarten Class of 1998-99 is used to show the application of the proposed methods. Instructions and code on how to conduct growth curve analysis with both normal and non-normal error distributions using the the MCMC procedure of SAS are provided.
Bayesian Network Models in Cyber Security: A Systematic Review
Chockalingam, S.; Pieters, W.; Herdeiro Teixeira, A.M.; van Gelder, P.H.A.J.M.; Lipmaa, Helger; Mitrokotsa, Aikaterini; Matulevicius, Raimundas
2017-01-01
Bayesian Networks (BNs) are an increasingly popular modelling technique in cyber security especially due to their capability to overcome data limitations. This is also instantiated by the growth of BN models development in cyber security. However, a comprehensive comparison and analysis of these models is missing. In this paper, we conduct a systematic review of the scientific literature and identify 17 standard BN models in cyber security. We analyse these models based on 9 different criteri...
Cluster-based analysis of multi-model climate ensembles
Hyde, Richard; Hossaini, Ryan; Leeson, Amber A.
2018-06-01
Clustering - the automated grouping of similar data - can provide powerful and unique insight into large and complex data sets, in a fast and computationally efficient manner. While clustering has been used in a variety of fields (from medical image processing to economics), its application within atmospheric science has been fairly limited to date, and the potential benefits of the application of advanced clustering techniques to climate data (both model output and observations) has yet to be fully realised. In this paper, we explore the specific application of clustering to a multi-model climate ensemble. We hypothesise that clustering techniques can provide (a) a flexible, data-driven method of testing model-observation agreement and (b) a mechanism with which to identify model development priorities. We focus our analysis on chemistry-climate model (CCM) output of tropospheric ozone - an important greenhouse gas - from the recent Atmospheric Chemistry and Climate Model Intercomparison Project (ACCMIP). Tropospheric column ozone from the ACCMIP ensemble was clustered using the Data Density based Clustering (DDC) algorithm. We find that a multi-model mean (MMM) calculated using members of the most-populous cluster identified at each location offers a reduction of up to ˜ 20 % in the global absolute mean bias between the MMM and an observed satellite-based tropospheric ozone climatology, with respect to a simple, all-model MMM. On a spatial basis, the bias is reduced at ˜ 62 % of all locations, with the largest bias reductions occurring in the Northern Hemisphere - where ozone concentrations are relatively large. However, the bias is unchanged at 9 % of all locations and increases at 29 %, particularly in the Southern Hemisphere. The latter demonstrates that although cluster-based subsampling acts to remove outlier model data, such data may in fact be closer to observed values in some locations. We further demonstrate that clustering can provide a viable and
Reliability of multi-model and structurally different single-model ensembles
Energy Technology Data Exchange (ETDEWEB)
Yokohata, Tokuta [National Institute for Environmental Studies, Center for Global Environmental Research, Tsukuba, Ibaraki (Japan); Annan, James D.; Hargreaves, Julia C. [Japan Agency for Marine-Earth Science and Technology, Research Institute for Global Change, Yokohama, Kanagawa (Japan); Collins, Matthew [University of Exeter, College of Engineering, Mathematics and Physical Sciences, Exeter (United Kingdom); Jackson, Charles S.; Tobis, Michael [The University of Texas at Austin, Institute of Geophysics, 10100 Burnet Rd., ROC-196, Mail Code R2200, Austin, TX (United States); Webb, Mark J. [Met Office Hadley Centre, Exeter (United Kingdom)
2012-08-15
The performance of several state-of-the-art climate model ensembles, including two multi-model ensembles (MMEs) and four structurally different (perturbed parameter) single model ensembles (SMEs), are investigated for the first time using the rank histogram approach. In this method, the reliability of a model ensemble is evaluated from the point of view of whether the observations can be regarded as being sampled from the ensemble. Our analysis reveals that, in the MMEs, the climate variables we investigated are broadly reliable on the global scale, with a tendency towards overdispersion. On the other hand, in the SMEs, the reliability differs depending on the ensemble and variable field considered. In general, the mean state and historical trend of surface air temperature, and mean state of precipitation are reliable in the SMEs. However, variables such as sea level pressure or top-of-atmosphere clear-sky shortwave radiation do not cover a sufficiently wide range in some. It is not possible to assess whether this is a fundamental feature of SMEs generated with particular model, or a consequence of the algorithm used to select and perturb the values of the parameters. As under-dispersion is a potentially more serious issue when using ensembles to make projections, we recommend the application of rank histograms to assess reliability when designing and running perturbed physics SMEs. (orig.)
Fernández, J.; Primo, C.; Cofiño, A. S.; Gutiérrez, J. M.; Rodríguez, M. A.
2009-08-01
In a recent paper, Gutiérrez et al. (Nonlinear Process Geophys 15(1):109-114, 2008) introduced a new characterization of spatiotemporal error growth—the so called mean-variance logarithmic (MVL) diagram—and applied it to study ensemble prediction systems (EPS); in particular, they analyzed single-model ensembles obtained by perturbing the initial conditions. In the present work, the MVL diagram is applied to multi-model ensembles analyzing also the effect of model formulation differences. To this aim, the MVL diagram is systematically applied to the multi-model ensemble produced in the EU-funded DEMETER project. It is shown that the shared building blocks (atmospheric and ocean components) impose similar dynamics among different models and, thus, contribute to poorly sampling the model formulation uncertainty. This dynamical similarity should be taken into account, at least as a pre-screening process, before applying any objective weighting method.
Modeling Dynamic Systems with Efficient Ensembles of Process-Based Models.
Directory of Open Access Journals (Sweden)
Nikola Simidjievski
Full Text Available Ensembles are a well established machine learning paradigm, leading to accurate and robust models, predominantly applied to predictive modeling tasks. Ensemble models comprise a finite set of diverse predictive models whose combined output is expected to yield an improved predictive performance as compared to an individual model. In this paper, we propose a new method for learning ensembles of process-based models of dynamic systems. The process-based modeling paradigm employs domain-specific knowledge to automatically learn models of dynamic systems from time-series observational data. Previous work has shown that ensembles based on sampling observational data (i.e., bagging and boosting, significantly improve predictive performance of process-based models. However, this improvement comes at the cost of a substantial increase of the computational time needed for learning. To address this problem, the paper proposes a method that aims at efficiently learning ensembles of process-based models, while maintaining their accurate long-term predictive performance. This is achieved by constructing ensembles with sampling domain-specific knowledge instead of sampling data. We apply the proposed method to and evaluate its performance on a set of problems of automated predictive modeling in three lake ecosystems using a library of process-based knowledge for modeling population dynamics. The experimental results identify the optimal design decisions regarding the learning algorithm. The results also show that the proposed ensembles yield significantly more accurate predictions of population dynamics as compared to individual process-based models. Finally, while their predictive performance is comparable to the one of ensembles obtained with the state-of-the-art methods of bagging and boosting, they are substantially more efficient.
Predicting artificailly drained areas by means of selective model ensemble
DEFF Research Database (Denmark)
Møller, Anders Bjørn; Beucher, Amélie; Iversen, Bo Vangsø
. The approaches employed include decision trees, discriminant analysis, regression models, neural networks and support vector machines amongst others. Several models are trained with each method, using variously the original soil covariates and principal components of the covariates. With a large ensemble...... out since the mid-19th century, and it has been estimated that half of the cultivated area is artificially drained (Olesen, 2009). A number of machine learning approaches can be used to predict artificially drained areas in geographic space. However, instead of choosing the most accurate model....... The study aims firstly to train a large number of models to predict the extent of artificially drained areas using various machine learning approaches. Secondly, the study will develop a method for selecting the models, which give a good prediction of artificially drained areas, when used in conjunction...
Cheung, Shao-Yong; Lee, Chieh-Han; Yu, Hwa-Lung
2017-04-01
Due to the limited hydrogeological observation data and high levels of uncertainty within, parameter estimation of the groundwater model has been an important issue. There are many methods of parameter estimation, for example, Kalman filter provides a real-time calibration of parameters through measurement of groundwater monitoring wells, related methods such as Extended Kalman Filter and Ensemble Kalman Filter are widely applied in groundwater research. However, Kalman Filter method is limited to linearity. This study propose a novel method, Bayesian Maximum Entropy Filtering, which provides a method that can considers the uncertainty of data in parameter estimation. With this two methods, we can estimate parameter by given hard data (certain) and soft data (uncertain) in the same time. In this study, we use Python and QGIS in groundwater model (MODFLOW) and development of Extended Kalman Filter and Bayesian Maximum Entropy Filtering in Python in parameter estimation. This method may provide a conventional filtering method and also consider the uncertainty of data. This study was conducted through numerical model experiment to explore, combine Bayesian maximum entropy filter and a hypothesis for the architecture of MODFLOW groundwater model numerical estimation. Through the virtual observation wells to simulate and observe the groundwater model periodically. The result showed that considering the uncertainty of data, the Bayesian maximum entropy filter will provide an ideal result of real-time parameters estimation.
Development of dynamic Bayesian models for web application test management
Azarnova, T. V.; Polukhin, P. V.; Bondarenko, Yu V.; Kashirina, I. L.
2018-03-01
The mathematical apparatus of dynamic Bayesian networks is an effective and technically proven tool that can be used to model complex stochastic dynamic processes. According to the results of the research, mathematical models and methods of dynamic Bayesian networks provide a high coverage of stochastic tasks associated with error testing in multiuser software products operated in a dynamically changing environment. Formalized representation of the discrete test process as a dynamic Bayesian model allows us to organize the logical connection between individual test assets for multiple time slices. This approach gives an opportunity to present testing as a discrete process with set structural components responsible for the generation of test assets. Dynamic Bayesian network-based models allow us to combine in one management area individual units and testing components with different functionalities and a direct influence on each other in the process of comprehensive testing of various groups of computer bugs. The application of the proposed models provides an opportunity to use a consistent approach to formalize test principles and procedures, methods used to treat situational error signs, and methods used to produce analytical conclusions based on test results.
Extreme winds over Europe in the ENSEMBLES regional climate models
Directory of Open Access Journals (Sweden)
S. D. Outten
2013-05-01
Full Text Available Extreme winds cause vast amounts of damage every year and represent a major concern for numerous industries including construction, afforestation, wind energy and many others. Under a changing climate, the intensity and frequency of extreme events are expected to change, and accurate projections of these changes will be invaluable to decision makers and society as a whole. This work examines four regional climate model downscalings over Europe following the SRES A1B scenario from the "ENSEMBLE-based Predictions of Climate Changes and their Impacts" project (ENSEMBLES. It investigates the projected changes in the 50 yr return wind speeds and the associated uncertainties. This is accomplished by employing the peaks-over-threshold method with the use of the generalised Pareto distribution. The models show that, for much of Europe, the 50 yr return wind is projected to change by less than 2 m s−1, while the uncertainties associated with the statistical estimates are larger than this. In keeping with previous works in this field, the largest source of uncertainty is found to be the inter-model spread, with some locations showing differences in the 50 yr return wind of over 20 m s−1 between two different downscalings.
Spatial and spatio-temporal bayesian models with R - INLA
Blangiardo, Marta
2015-01-01
Dedication iiiPreface ix1 Introduction 11.1 Why spatial and spatio-temporal statistics? 11.2 Why do we use Bayesian methods for modelling spatial and spatio-temporal structures? 21.3 Why INLA? 31.4 Datasets 32 Introduction to 212.1 The language 212.2 objects 222.3 Data and session management 342.4 Packages 352.5 Programming in 362.6 Basic statistical analysis with 393 Introduction to Bayesian Methods 533.1 Bayesian Philosophy 533.2 Basic Probability Elements 573.3 Bayes Theorem 623.4 Prior and Posterior Distributions 643.5 Working with the Posterior Distribution 663.6 Choosing the Prior Distr
Bayesian log-periodic model for financial crashes
DEFF Research Database (Denmark)
Rodríguez-Caballero, Carlos Vladimir; Knapik, Oskar
2014-01-01
This paper introduces a Bayesian approach in econophysics literature about financial bubbles in order to estimate the most probable time for a financial crash to occur. To this end, we propose using noninformative prior distributions to obtain posterior distributions. Since these distributions...... cannot be performed analytically, we develop a Markov Chain Monte Carlo algorithm to draw from posterior distributions. We consider three Bayesian models that involve normal and Student’s t-distributions in the disturbances and an AR(1)-GARCH(1,1) structure only within the first case. In the empirical...... part of the study, we analyze a well-known example of financial bubble – the S&P 500 1987 crash – to show the usefulness of the three methods under consideration and crashes of Merval-94, Bovespa-97, IPCMX-94, Hang Seng-97 using the simplest method. The novelty of this research is that the Bayesian...
Thermodynamic state ensemble models of cis-regulation.
Directory of Open Access Journals (Sweden)
Marc S Sherman
Full Text Available A major goal in computational biology is to develop models that accurately predict a gene's expression from its surrounding regulatory DNA. Here we present one class of such models, thermodynamic state ensemble models. We describe the biochemical derivation of the thermodynamic framework in simple terms, and lay out the mathematical components that comprise each model. These components include (1 the possible states of a promoter, where a state is defined as a particular arrangement of transcription factors bound to a DNA promoter, (2 the binding constants that describe the affinity of the protein-protein and protein-DNA interactions that occur in each state, and (3 whether each state is capable of transcribing. Using these components, we demonstrate how to compute a cis-regulatory function that encodes the probability of a promoter being active. Our intention is to provide enough detail so that readers with little background in thermodynamics can compose their own cis-regulatory functions. To facilitate this goal, we also describe a matrix form of the model that can be easily coded in any programming language. This formalism has great flexibility, which we show by illustrating how phenomena such as competition between transcription factors and cooperativity are readily incorporated into these models. Using this framework, we also demonstrate that Michaelis-like functions, another class of cis-regulatory models, are a subset of the thermodynamic framework with specific assumptions. By recasting Michaelis-like functions as thermodynamic functions, we emphasize the relationship between these models and delineate the specific circumstances representable by each approach. Application of thermodynamic state ensemble models is likely to be an important tool in unraveling the physical basis of combinatorial cis-regulation and in generating formalisms that accurately predict gene expression from DNA sequence.
Hierarchical Bayesian spatial models for multispecies conservation planning and monitoring.
Carroll, Carlos; Johnson, Devin S; Dunk, Jeffrey R; Zielinski, William J
2010-12-01
Biologists who develop and apply habitat models are often familiar with the statistical challenges posed by their data's spatial structure but are unsure of whether the use of complex spatial models will increase the utility of model results in planning. We compared the relative performance of nonspatial and hierarchical Bayesian spatial models for three vertebrate and invertebrate taxa of conservation concern (Church's sideband snails [Monadenia churchi], red tree voles [Arborimus longicaudus], and Pacific fishers [Martes pennanti pacifica]) that provide examples of a range of distributional extents and dispersal abilities. We used presence-absence data derived from regional monitoring programs to develop models with both landscape and site-level environmental covariates. We used Markov chain Monte Carlo algorithms and a conditional autoregressive or intrinsic conditional autoregressive model framework to fit spatial models. The fit of Bayesian spatial models was between 35 and 55% better than the fit of nonspatial analogue models. Bayesian spatial models outperformed analogous models developed with maximum entropy (Maxent) methods. Although the best spatial and nonspatial models included similar environmental variables, spatial models provided estimates of residual spatial effects that suggested how ecological processes might structure distribution patterns. Spatial models built from presence-absence data improved fit most for localized endemic species with ranges constrained by poorly known biogeographic factors and for widely distributed species suspected to be strongly affected by unmeasured environmental variables or population processes. By treating spatial effects as a variable of interest rather than a nuisance, hierarchical Bayesian spatial models, especially when they are based on a common broad-scale spatial lattice (here the national Forest Inventory and Analysis grid of 24 km(2) hexagons), can increase the relevance of habitat models to multispecies
Bayesian nonparametric estimation of hazard rate in monotone Aalen model
Czech Academy of Sciences Publication Activity Database
Timková, Jana
2014-01-01
Roč. 50, č. 6 (2014), s. 849-868 ISSN 0023-5954 Institutional support: RVO:67985556 Keywords : Aalen model * Bayesian estimation * MCMC Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.541, year: 2014 http://library.utia.cas.cz/separaty/2014/SI/timkova-0438210.pdf
Advanced REACH tool: A Bayesian model for occupational exposure assessment
McNally, K.; Warren, N.; Fransman, W.; Entink, R.K.; Schinkel, J.; Van Tongeren, M.; Cherrie, J.W.; Kromhout, H.; Schneider, T.; Tielemans, E.
2014-01-01
This paper describes a Bayesian model for the assessment of inhalation exposures in an occupational setting; the methodology underpins a freely available web-based application for exposure assessment, the Advanced REACH Tool (ART). The ART is a higher tier exposure tool that combines disparate
Testing adaptive toolbox models: a Bayesian hierarchical approach.
Scheibehenne, Benjamin; Rieskamp, Jörg; Wagenmakers, Eric-Jan
2013-01-01
Many theories of human cognition postulate that people are equipped with a repertoire of strategies to solve the tasks they face. This theoretical framework of a cognitive toolbox provides a plausible account of intra- and interindividual differences in human behavior. Unfortunately, it is often unclear how to rigorously test the toolbox framework. How can a toolbox model be quantitatively specified? How can the number of toolbox strategies be limited to prevent uncontrolled strategy sprawl? How can a toolbox model be formally tested against alternative theories? The authors show how these challenges can be met by using Bayesian inference techniques. By means of parameter recovery simulations and the analysis of empirical data across a variety of domains (i.e., judgment and decision making, children's cognitive development, function learning, and perceptual categorization), the authors illustrate how Bayesian inference techniques allow toolbox models to be quantitatively specified, strategy sprawl to be contained, and toolbox models to be rigorously tested against competing theories. The authors demonstrate that their approach applies at the individual level but can also be generalized to the group level with hierarchical Bayesian procedures. The suggested Bayesian inference techniques represent a theoretical and methodological advancement for toolbox theories of cognition and behavior.
DART: New Research Using Ensemble Data Assimilation in Geophysical Models
Hoar, T. J.; Raeder, K.
2015-12-01
The Data Assimilation Research Testbed (DART) is a community facilityfor ensemble data assimilation developed and supported by the NationalCenter for Atmospheric Research. DART provides a comprehensive suite of software, documentation, and tutorials that can be used for ensemble data assimilation research, operations, and education. Scientists and software engineers at NCAR are available to support DART users who want to use existing DART products or develop their own applications. Current DART users range from university professors teaching data assimilation, to individual graduate students working with simple models, through national laboratories doing operational prediction with large state-of-the-art models. DART runs efficiently on many computational platforms ranging from laptops through thousands of cores on the newest supercomputers.This poster focuses on several recent research activities using DART with geophysical models.Using CAM/DART to understand whether OCO-2 Total Precipitable Water observations can be useful in numerical weather prediction.Impacts of the synergistic use of Infra-red CO retrievals (MOPITT, IASI) in CAM-CHEM/DART assimilations.Assimilation and Analysis of Observations of Amazonian Biomass Burning Emissions by MOPITT (aerosol optical depth), MODIS (carbon monoxide) and MISR (plume height).Long term evaluation of the chemical response of MOPITT-CO assimilation in CAM-CHEM/DART OSSEs for satellite planning and emission inversion capabilities.Improved forward observation operators for land models that have multiple land use/land cover segments in a single grid cell,Simulating mesoscale convective systems (MCSs) using a variable resolution, unstructured grid in the Model for Prediction Across Scales (MPAS) and DART.The mesoscale WRF+DART system generated an ensemble of year-long, real-time initializations of a convection allowing model over the United States.Constraining WACCM with observations in the tropical band (30S-30N) using DART
Ensemble Prediction Model with Expert Selection for Electricity Price Forecasting
Directory of Open Access Journals (Sweden)
Bijay Neupane
2017-01-01
Full Text Available Forecasting of electricity prices is important in deregulated electricity markets for all of the stakeholders: energy wholesalers, traders, retailers and consumers. Electricity price forecasting is an inherently difficult problem due to its special characteristic of dynamicity and non-stationarity. In this paper, we present a robust price forecasting mechanism that shows resilience towards the aggregate demand response effect and provides highly accurate forecasted electricity prices to the stakeholders in a dynamic environment. We employ an ensemble prediction model in which a group of different algorithms participates in forecasting 1-h ahead the price for each hour of a day. We propose two different strategies, namely, the Fixed Weight Method (FWM and the Varying Weight Method (VWM, for selecting each hour’s expert algorithm from the set of participating algorithms. In addition, we utilize a carefully engineered set of features selected from a pool of features extracted from the past electricity price data, weather data and calendar data. The proposed ensemble model offers better results than the Autoregressive Integrated Moving Average (ARIMA method, the Pattern Sequence-based Forecasting (PSF method and our previous work using Artificial Neural Networks (ANN alone on the datasets for New York, Australian and Spanish electricity markets.
Bayesian estimation of parameters in a regional hydrological model
Directory of Open Access Journals (Sweden)
K. Engeland
2002-01-01
Full Text Available This study evaluates the applicability of the distributed, process-oriented Ecomag model for prediction of daily streamflow in ungauged basins. The Ecomag model is applied as a regional model to nine catchments in the NOPEX area, using Bayesian statistics to estimate the posterior distribution of the model parameters conditioned on the observed streamflow. The distribution is calculated by Markov Chain Monte Carlo (MCMC analysis. The Bayesian method requires formulation of a likelihood function for the parameters and three alternative formulations are used. The first is a subjectively chosen objective function that describes the goodness of fit between the simulated and observed streamflow, as defined in the GLUE framework. The second and third formulations are more statistically correct likelihood models that describe the simulation errors. The full statistical likelihood model describes the simulation errors as an AR(1 process, whereas the simple model excludes the auto-regressive part. The statistical parameters depend on the catchments and the hydrological processes and the statistical and the hydrological parameters are estimated simultaneously. The results show that the simple likelihood model gives the most robust parameter estimates. The simulation error may be explained to a large extent by the catchment characteristics and climatic conditions, so it is possible to transfer knowledge about them to ungauged catchments. The statistical models for the simulation errors indicate that structural errors in the model are more important than parameter uncertainties. Keywords: regional hydrological model, model uncertainty, Bayesian analysis, Markov Chain Monte Carlo analysis
Bayesian inference method for stochastic damage accumulation modeling
International Nuclear Information System (INIS)
Jiang, Xiaomo; Yuan, Yong; Liu, Xian
2013-01-01
Damage accumulation based reliability model plays an increasingly important role in successful realization of condition based maintenance for complicated engineering systems. This paper developed a Bayesian framework to establish stochastic damage accumulation model from historical inspection data, considering data uncertainty. Proportional hazards modeling technique is developed to model the nonlinear effect of multiple influencing factors on system reliability. Different from other hazard modeling techniques such as normal linear regression model, the approach does not require any distribution assumption for the hazard model, and can be applied for a wide variety of distribution models. A Bayesian network is created to represent the nonlinear proportional hazards models and to estimate model parameters by Bayesian inference with Markov Chain Monte Carlo simulation. Both qualitative and quantitative approaches are developed to assess the validity of the established damage accumulation model. Anderson–Darling goodness-of-fit test is employed to perform the normality test, and Box–Cox transformation approach is utilized to convert the non-normality data into normal distribution for hypothesis testing in quantitative model validation. The methodology is illustrated with the seepage data collected from real-world subway tunnels.
Comparing pharmacophore models derived from crystallography and NMR ensembles
Ghanakota, Phani; Carlson, Heather A.
2017-11-01
NMR and X-ray crystallography are the two most widely used methods for determining protein structures. Our previous study examining NMR versus X-Ray sources of protein conformations showed improved performance with NMR structures when used in our Multiple Protein Structures (MPS) method for receptor-based pharmacophores (Damm, Carlson, J Am Chem Soc 129:8225-8235, 2007). However, that work was based on a single test case, HIV-1 protease, because of the rich data available for that system. New data for more systems are available now, which calls for further examination of the effect of different sources of protein conformations. The MPS technique was applied to Growth factor receptor bound protein 2 (Grb2), Src SH2 homology domain (Src-SH2), FK506-binding protein 1A (FKBP12), and Peroxisome proliferator-activated receptor-γ (PPAR-γ). Pharmacophore models from both crystal and NMR ensembles were able to discriminate between high-affinity, low-affinity, and decoy molecules. As we found in our original study, NMR models showed optimal performance when all elements were used. The crystal models had more pharmacophore elements compared to their NMR counterparts. The crystal-based models exhibited optimum performance only when pharmacophore elements were dropped. This supports our assertion that the higher flexibility in NMR ensembles helps focus the models on the most essential interactions with the protein. Our studies suggest that the "extra" pharmacophore elements seen at the periphery in X-ray models arise as a result of decreased protein flexibility and make very little contribution to model performance.
A Bayesian Markov geostatistical model for estimation of hydrogeological properties
International Nuclear Information System (INIS)
Rosen, L.; Gustafson, G.
1996-01-01
A geostatistical methodology based on Markov-chain analysis and Bayesian statistics was developed for probability estimations of hydrogeological and geological properties in the siting process of a nuclear waste repository. The probability estimates have practical use in decision-making on issues such as siting, investigation programs, and construction design. The methodology is nonparametric which makes it possible to handle information that does not exhibit standard statistical distributions, as is often the case for classified information. Data do not need to meet the requirements on additivity and normality as with the geostatistical methods based on regionalized variable theory, e.g., kriging. The methodology also has a formal way for incorporating professional judgments through the use of Bayesian statistics, which allows for updating of prior estimates to posterior probabilities each time new information becomes available. A Bayesian Markov Geostatistical Model (BayMar) software was developed for implementation of the methodology in two and three dimensions. This paper gives (1) a theoretical description of the Bayesian Markov Geostatistical Model; (2) a short description of the BayMar software; and (3) an example of application of the model for estimating the suitability for repository establishment with respect to the three parameters of lithology, hydraulic conductivity, and rock quality designation index (RQD) at 400--500 meters below ground surface in an area around the Aespoe Hard Rock Laboratory in southeastern Sweden
Bayesian Dimensionality Assessment for the Multidimensional Nominal Response Model
Directory of Open Access Journals (Sweden)
Javier Revuelta
2017-06-01
Full Text Available This article introduces Bayesian estimation and evaluation procedures for the multidimensional nominal response model. The utility of this model is to perform a nominal factor analysis of items that consist of a finite number of unordered response categories. The key aspect of the model, in comparison with traditional factorial model, is that there is a slope for each response category on the latent dimensions, instead of having slopes associated to the items. The extended parameterization of the multidimensional nominal response model requires large samples for estimation. When sample size is of a moderate or small size, some of these parameters may be weakly empirically identifiable and the estimation algorithm may run into difficulties. We propose a Bayesian MCMC inferential algorithm to estimate the parameters and the number of dimensions underlying the multidimensional nominal response model. Two Bayesian approaches to model evaluation were compared: discrepancy statistics (DIC, WAICC, and LOO that provide an indication of the relative merit of different models, and the standardized generalized discrepancy measure that requires resampling data and is computationally more involved. A simulation study was conducted to compare these two approaches, and the results show that the standardized generalized discrepancy measure can be used to reliably estimate the dimensionality of the model whereas the discrepancy statistics are questionable. The paper also includes an example with real data in the context of learning styles, in which the model is used to conduct an exploratory factor analysis of nominal data.
Characterizing economic trends by Bayesian stochastic model specifi cation search
Grassi, Stefano; Proietti, Tommaso
2010-01-01
We apply a recently proposed Bayesian model selection technique, known as stochastic model specification search, for characterising the nature of the trend in macroeconomic time series. We illustrate that the methodology can be quite successfully applied to discriminate between stochastic and deterministic trends. In particular, we formulate autoregressive models with stochastic trends components and decide on whether a specific feature of the series, i.e. the underlying level and/or the rate...
Copula Based Factorization in Bayesian Multivariate Infinite Mixture Models
Martin Burda; Artem Prokhorov
2012-01-01
Bayesian nonparametric models based on infinite mixtures of density kernels have been recently gaining in popularity due to their flexibility and feasibility of implementation even in complicated modeling scenarios. In economics, they have been particularly useful in estimating nonparametric distributions of latent variables. However, these models have been rarely applied in more than one dimension. Indeed, the multivariate case suffers from the curse of dimensionality, with a rapidly increas...
Acute leukemia classification by ensemble particle swarm model selection.
Escalante, Hugo Jair; Montes-y-Gómez, Manuel; González, Jesús A; Gómez-Gil, Pilar; Altamirano, Leopoldo; Reyes, Carlos A; Reta, Carolina; Rosales, Alejandro
2012-07-01
Acute leukemia is a malignant disease that affects a large proportion of the world population. Different types and subtypes of acute leukemia require different treatments. In order to assign the correct treatment, a physician must identify the leukemia type or subtype. Advanced and precise methods are available for identifying leukemia types, but they are very expensive and not available in most hospitals in developing countries. Thus, alternative methods have been proposed. An option explored in this paper is based on the morphological properties of bone marrow images, where features are extracted from medical images and standard machine learning techniques are used to build leukemia type classifiers. This paper studies the use of ensemble particle swarm model selection (EPSMS), which is an automated tool for the selection of classification models, in the context of acute leukemia classification. EPSMS is the application of particle swarm optimization to the exploration of the search space of ensembles that can be formed by heterogeneous classification models in a machine learning toolbox. EPSMS does not require prior domain knowledge and it is able to select highly accurate classification models without user intervention. Furthermore, specific models can be used for different classification tasks. We report experimental results for acute leukemia classification with real data and show that EPSMS outperformed the best results obtained using manually designed classifiers with the same data. The highest performance using EPSMS was of 97.68% for two-type classification problems and of 94.21% for more than two types problems. To the best of our knowledge, these are the best results reported for this data set. Compared with previous studies, these improvements were consistent among different type/subtype classification tasks, different features extracted from images, and different feature extraction regions. The performance improvements were statistically significant
A note on the multi model super ensemble technique for reducing forecast errors
International Nuclear Information System (INIS)
Kantha, L.; Carniel, S.; Sclavo, M.
2008-01-01
The multi model super ensemble (S E) technique has been used with considerable success to improve meteorological forecasts and is now being applied to ocean models. Although the technique has been shown to produce deterministic forecasts that can be superior to the individual models in the ensemble or a simple multi model ensemble forecast, there is a clear need to understand its strengths and limitations. This paper is an attempt to do so in simple, easily understood contexts. The results demonstrate that the S E forecast is almost always better than the simple ensemble forecast, the degree of improvement depending on the properties of the models in the ensemble. However, the skill of the S E forecast with respect to the true forecast depends on a number of factors, principal among which is the skill of the models in the ensemble. As can be expected, if the ensemble consists of models with poor skill, the S E forecast will also be poor, although better than the ensemble forecast. On the other hand, the inclusion of even a single skillful model in the ensemble increases the forecast skill significantly.
Ensemble ecosystem modeling for predicting ecosystem response to predator reintroduction.
Baker, Christopher M; Gordon, Ascelin; Bode, Michael
2017-04-01
Introducing a new or extirpated species to an ecosystem is risky, and managers need quantitative methods that can predict the consequences for the recipient ecosystem. Proponents of keystone predator reintroductions commonly argue that the presence of the predator will restore ecosystem function, but this has not always been the case, and mathematical modeling has an important role to play in predicting how reintroductions will likely play out. We devised an ensemble modeling method that integrates species interaction networks and dynamic community simulations and used it to describe the range of plausible consequences of 2 keystone-predator reintroductions: wolves (Canis lupus) to Yellowstone National Park and dingoes (Canis dingo) to a national park in Australia. Although previous methods for predicting ecosystem responses to such interventions focused on predicting changes around a given equilibrium, we used Lotka-Volterra equations to predict changing abundances through time. We applied our method to interaction networks for wolves in Yellowstone National Park and for dingoes in Australia. Our model replicated the observed dynamics in Yellowstone National Park and produced a larger range of potential outcomes for the dingo network. However, we also found that changes in small vertebrates or invertebrates gave a good indication about the potential future state of the system. Our method allowed us to predict when the systems were far from equilibrium. Our results showed that the method can also be used to predict which species may increase or decrease following a reintroduction and can identify species that are important to monitor (i.e., species whose changes in abundance give extra insight into broad changes in the system). Ensemble ecosystem modeling can also be applied to assess the ecosystem-wide implications of other types of interventions including assisted migration, biocontrol, and invasive species eradication. © 2016 Society for Conservation Biology.
Bayesian models for comparative analysis integrating phylogenetic uncertainty
Directory of Open Access Journals (Sweden)
Villemereuil Pierre de
2012-06-01
Full Text Available Abstract Background Uncertainty in comparative analyses can come from at least two sources: a phylogenetic uncertainty in the tree topology or branch lengths, and b uncertainty due to intraspecific variation in trait values, either due to measurement error or natural individual variation. Most phylogenetic comparative methods do not account for such uncertainties. Not accounting for these sources of uncertainty leads to false perceptions of precision (confidence intervals will be too narrow and inflated significance in hypothesis testing (e.g. p-values will be too small. Although there is some application-specific software for fitting Bayesian models accounting for phylogenetic error, more general and flexible software is desirable. Methods We developed models to directly incorporate phylogenetic uncertainty into a range of analyses that biologists commonly perform, using a Bayesian framework and Markov Chain Monte Carlo analyses. Results We demonstrate applications in linear regression, quantification of phylogenetic signal, and measurement error models. Phylogenetic uncertainty was incorporated by applying a prior distribution for the phylogeny, where this distribution consisted of the posterior tree sets from Bayesian phylogenetic tree estimation programs. The models were analysed using simulated data sets, and applied to a real data set on plant traits, from rainforest plant species in Northern Australia. Analyses were performed using the free and open source software OpenBUGS and JAGS. Conclusions Incorporating phylogenetic uncertainty through an empirical prior distribution of trees leads to more precise estimation of regression model parameters than using a single consensus tree and enables a more realistic estimation of confidence intervals. In addition, models incorporating measurement errors and/or individual variation, in one or both variables, are easily formulated in the Bayesian framework. We show that BUGS is a useful, flexible
Bayesian models for comparative analysis integrating phylogenetic uncertainty
2012-01-01
Background Uncertainty in comparative analyses can come from at least two sources: a) phylogenetic uncertainty in the tree topology or branch lengths, and b) uncertainty due to intraspecific variation in trait values, either due to measurement error or natural individual variation. Most phylogenetic comparative methods do not account for such uncertainties. Not accounting for these sources of uncertainty leads to false perceptions of precision (confidence intervals will be too narrow) and inflated significance in hypothesis testing (e.g. p-values will be too small). Although there is some application-specific software for fitting Bayesian models accounting for phylogenetic error, more general and flexible software is desirable. Methods We developed models to directly incorporate phylogenetic uncertainty into a range of analyses that biologists commonly perform, using a Bayesian framework and Markov Chain Monte Carlo analyses. Results We demonstrate applications in linear regression, quantification of phylogenetic signal, and measurement error models. Phylogenetic uncertainty was incorporated by applying a prior distribution for the phylogeny, where this distribution consisted of the posterior tree sets from Bayesian phylogenetic tree estimation programs. The models were analysed using simulated data sets, and applied to a real data set on plant traits, from rainforest plant species in Northern Australia. Analyses were performed using the free and open source software OpenBUGS and JAGS. Conclusions Incorporating phylogenetic uncertainty through an empirical prior distribution of trees leads to more precise estimation of regression model parameters than using a single consensus tree and enables a more realistic estimation of confidence intervals. In addition, models incorporating measurement errors and/or individual variation, in one or both variables, are easily formulated in the Bayesian framework. We show that BUGS is a useful, flexible general purpose tool for
Bayesian non parametric modelling of Higgs pair production
Directory of Open Access Journals (Sweden)
Scarpa Bruno
2017-01-01
Full Text Available Statistical classification models are commonly used to separate a signal from a background. In this talk we face the problem of isolating the signal of Higgs pair production using the decay channel in which each boson decays into a pair of b-quarks. Typically in this context non parametric methods are used, such as Random Forests or different types of boosting tools. We remain in the same non-parametric framework, but we propose to face the problem following a Bayesian approach. A Dirichlet process is used as prior for the random effects in a logit model which is fitted by leveraging the Polya-Gamma data augmentation. Refinements of the model include the insertion in the simple model of P-splines to relate explanatory variables with the response and the use of Bayesian trees (BART to describe the atoms in the Dirichlet process.
CSIR Research Space (South Africa)
Johnson, S
2010-02-01
Full Text Available metapopulations was the focus of a Bayesian Network (BN) modelling workshop in South Africa. Using a new heuristics, Iterative Bayesian Network Development Cycle (IBNDC), described in this paper, several networks were formulated to distinguish between the unique...
Uncertainties in ozone concentrations predicted with a Lagrangian photochemical air quality model have been estimated using Bayesian Monte Carlo (BMC) analysis. Bayesian Monte Carlo analysis provides a means of combining subjective "prior" uncertainty estimates developed ...
Technical note: Bayesian calibration of dynamic ruminant nutrition models.
Reed, K F; Arhonditsis, G B; France, J; Kebreab, E
2016-08-01
Mechanistic models of ruminant digestion and metabolism have advanced our understanding of the processes underlying ruminant animal physiology. Deterministic modeling practices ignore the inherent variation within and among individual animals and thus have no way to assess how sources of error influence model outputs. We introduce Bayesian calibration of mathematical models to address the need for robust mechanistic modeling tools that can accommodate error analysis by remaining within the bounds of data-based parameter estimation. For the purpose of prediction, the Bayesian approach generates a posterior predictive distribution that represents the current estimate of the value of the response variable, taking into account both the uncertainty about the parameters and model residual variability. Predictions are expressed as probability distributions, thereby conveying significantly more information than point estimates in regard to uncertainty. Our study illustrates some of the technical advantages of Bayesian calibration and discusses the future perspectives in the context of animal nutrition modeling. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Projecting UK mortality using Bayesian generalised additive models
Hilton, Jason; Dodd, Erengul; Forster, Jonathan; Smith, Peter W.F.
2018-01-01
Forecasts of mortality provide vital information about future populations, with implications for pension and health-care policy as well as for decisions made by private companies about life insurance and annuity pricing. This paper presents a Bayesian approach to the forecasting of mortality that jointly estimates a Generalised Additive Model (GAM) for mortality for the majority of the age-range and a parametric model for older ages where the data are sparser. The GAM allows smooth components...
Bayesian modeling and prediction of solar particles flux
International Nuclear Information System (INIS)
Dedecius, Kamil; Kalova, Jana
2010-01-01
An autoregression model was developed based on the Bayesian approach. Considering the solar wind non-homogeneity, the idea was applied of combining the pure autoregressive properties of the model with expert knowledge based on a similar behaviour of the various phenomena related to the flux properties. Examples of such situations include the hardening of the X-ray spectrum, which is often followed by coronal mass ejection and a significant increase in the particles flux intensity
Merging information from multi-model flood projections in a hierarchical Bayesian framework
Le Vine, Nataliya
2016-04-01
Multi-model ensembles are becoming widely accepted for flood frequency change analysis. The use of multiple models results in large uncertainty around estimates of flood magnitudes, due to both uncertainty in model selection and natural variability of river flow. The challenge is therefore to extract the most meaningful signal from the multi-model predictions, accounting for both model quality and uncertainties in individual model estimates. The study demonstrates the potential of a recently proposed hierarchical Bayesian approach to combine information from multiple models. The approach facilitates explicit treatment of shared multi-model discrepancy as well as the probabilistic nature of the flood estimates, by treating the available models as a sample from a hypothetical complete (but unobserved) set of models. The advantages of the approach are: 1) to insure an adequate 'baseline' conditions with which to compare future changes; 2) to reduce flood estimate uncertainty; 3) to maximize use of statistical information in circumstances where multiple weak predictions individually lack power, but collectively provide meaningful information; 4) to adjust multi-model consistency criteria when model biases are large; and 5) to explicitly consider the influence of the (model performance) stationarity assumption. Moreover, the analysis indicates that reducing shared model discrepancy is the key to further reduction of uncertainty in the flood frequency analysis. The findings are of value regarding how conclusions about changing exposure to flooding are drawn, and to flood frequency change attribution studies.
Pauci ex tanto numero: reducing redundancy in multi-model ensembles
Solazzo, E.; Riccio, A.; Kioutsioukis, I.; Galmarini, S.
2013-02-01
We explicitly address the fundamental issue of member diversity in multi-model ensembles. To date no attempts in this direction are documented within the air quality (AQ) community, although the extensive use of ensembles in this field. Common biases and redundancy are the two issues directly deriving from lack of independence, undermining the significance of a multi-model ensemble, and are the subject of this study. Shared biases among models will determine a biased ensemble, making therefore essential the errors of the ensemble members to be independent so that bias can cancel out. Redundancy derives from having too large a portion of common variance among the members of the ensemble, producing overconfidence in the predictions and underestimation of the uncertainty. The two issues of common biases and redundancy are analysed in detail using the AQMEII ensemble of AQ model results for four air pollutants in two European regions. We show that models share large portions of bias and variance, extending well beyond those induced by common inputs. We make use of several techniques to further show that subsets of models can explain the same amount of variance as the full ensemble with the advantage of being poorly correlated. Selecting the members for generating skilful, non-redundant ensembles from such subsets proved, however, non-trivial. We propose and discuss various methods of member selection and rate the ensemble performance they produce. In most cases, the full ensemble is outscored by the reduced ones. We conclude that, although independence of outputs may not always guarantee enhancement of scores (but this depends upon the skill being investigated) we discourage selecting the members of the ensemble simply on the basis of scores, that is, independence and skills need to be considered disjointly.
Application of a predictive Bayesian model to environmental accounting.
Anex, R P; Englehardt, J D
2001-03-30
Environmental accounting techniques are intended to capture important environmental costs and benefits that are often overlooked in standard accounting practices. Environmental accounting methods themselves often ignore or inadequately represent large but highly uncertain environmental costs and costs conditioned by specific prior events. Use of a predictive Bayesian model is demonstrated for the assessment of such highly uncertain environmental and contingent costs. The predictive Bayesian approach presented generates probability distributions for the quantity of interest (rather than parameters thereof). A spreadsheet implementation of a previously proposed predictive Bayesian model, extended to represent contingent costs, is described and used to evaluate whether a firm should undertake an accelerated phase-out of its PCB containing transformers. Variability and uncertainty (due to lack of information) in transformer accident frequency and severity are assessed simultaneously using a combination of historical accident data, engineering model-based cost estimates, and subjective judgement. Model results are compared using several different risk measures. Use of the model for incorporation of environmental risk management into a company's overall risk management strategy is discussed.
International Nuclear Information System (INIS)
Lucas, Donald D.; Simpson, Matthew; Cameron-Smith, Philip; Baskett, Ronald L.
2017-01-01
Probability distribution functions (PDFs) of model inputs that affect the transport and dispersion of a trace gas released from a coastal California nuclear power plant are quantified using ensemble simulations, machine-learning algorithms, and Bayesian inversion. The PDFs are constrained by observations of tracer concentrations and account for uncertainty in meteorology, transport, diffusion, and emissions. Meteorological uncertainty is calculated using an ensemble of simulations of the Weather Research and Forecasting (WRF) model that samples five categories of model inputs (initialization time, boundary layer physics, land surface model, nudging options, and reanalysis data). The WRF output is used to drive tens of thousands of FLEXPART dispersion simulations that sample a uniform distribution of six emissions inputs. Machine-learning algorithms are trained on the ensemble data and used to quantify the sources of ensemble variability and to infer, via inverse modeling, the values of the 11 model inputs most consistent with tracer measurements. We find a substantial ensemble spread in tracer concentrations (factors of 10 to 10 3 ), most of which is due to changing emissions inputs (about 80 %), though the cumulative effects of meteorological variations are not negligible. The performance of the inverse method is verified using synthetic observations generated from arbitrarily selected simulations. When applied to measurements from a controlled tracer release experiment, the inverse method satisfactorily determines the location, start time, duration and amount. In a 2 km x 2 km area of possible locations, the actual location is determined to within 200 m. The start time is determined to within 5 min out of 2 h, and the duration to within 50 min out of 4 h. Over a range of release amounts of 10 to 1000 kg, the estimated amount exceeds the actual amount of 146 kg by only 32 kg. The inversion also estimates probabilities of different WRF configurations. To best match
Energy Technology Data Exchange (ETDEWEB)
Lucas, Donald D.; Simpson, Matthew; Cameron-Smith, Philip; Baskett, Ronald L. [Lawrence Livermore National Laboratory, Livermore, CA (United States)
2017-07-01
Probability distribution functions (PDFs) of model inputs that affect the transport and dispersion of a trace gas released from a coastal California nuclear power plant are quantified using ensemble simulations, machine-learning algorithms, and Bayesian inversion. The PDFs are constrained by observations of tracer concentrations and account for uncertainty in meteorology, transport, diffusion, and emissions. Meteorological uncertainty is calculated using an ensemble of simulations of the Weather Research and Forecasting (WRF) model that samples five categories of model inputs (initialization time, boundary layer physics, land surface model, nudging options, and reanalysis data). The WRF output is used to drive tens of thousands of FLEXPART dispersion simulations that sample a uniform distribution of six emissions inputs. Machine-learning algorithms are trained on the ensemble data and used to quantify the sources of ensemble variability and to infer, via inverse modeling, the values of the 11 model inputs most consistent with tracer measurements. We find a substantial ensemble spread in tracer concentrations (factors of 10 to 10{sup 3}), most of which is due to changing emissions inputs (about 80 %), though the cumulative effects of meteorological variations are not negligible. The performance of the inverse method is verified using synthetic observations generated from arbitrarily selected simulations. When applied to measurements from a controlled tracer release experiment, the inverse method satisfactorily determines the location, start time, duration and amount. In a 2 km x 2 km area of possible locations, the actual location is determined to within 200 m. The start time is determined to within 5 min out of 2 h, and the duration to within 50 min out of 4 h. Over a range of release amounts of 10 to 1000 kg, the estimated amount exceeds the actual amount of 146 kg by only 32 kg. The inversion also estimates probabilities of different WRF configurations. To best
Impact of censoring on learning Bayesian networks in survival modelling.
Stajduhar, Ivan; Dalbelo-Basić, Bojana; Bogunović, Nikola
2009-11-01
Bayesian networks are commonly used for presenting uncertainty and covariate interactions in an easily interpretable way. Because of their efficient inference and ability to represent causal relationships, they are an excellent choice for medical decision support systems in diagnosis, treatment, and prognosis. Although good procedures for learning Bayesian networks from data have been defined, their performance in learning from censored survival data has not been widely studied. In this paper, we explore how to use these procedures to learn about possible interactions between prognostic factors and their influence on the variate of interest. We study how censoring affects the probability of learning correct Bayesian network structures. Additionally, we analyse the potential usefulness of the learnt models for predicting the time-independent probability of an event of interest. We analysed the influence of censoring with a simulation on synthetic data sampled from randomly generated Bayesian networks. We used two well-known methods for learning Bayesian networks from data: a constraint-based method and a score-based method. We compared the performance of each method under different levels of censoring to those of the naive Bayes classifier and the proportional hazards model. We did additional experiments on several datasets from real-world medical domains. The machine-learning methods treated censored cases in the data as event-free. We report and compare results for several commonly used model evaluation metrics. On average, the proportional hazards method outperformed other methods in most censoring setups. As part of the simulation study, we also analysed structural similarities of the learnt networks. Heavy censoring, as opposed to no censoring, produces up to a 5% surplus and up to 10% missing total arcs. It also produces up to 50% missing arcs that should originally be connected to the variate of interest. Presented methods for learning Bayesian networks from
Bayesian data assimilation for stochastic multiscale models of transport in porous media.
Energy Technology Data Exchange (ETDEWEB)
Marzouk, Youssef M. (Massachusetts Institute of Technology, Cambridge, MA); van Bloemen Waanders, Bart Gustaaf (Sandia National Laboratories, Albuquerque NM); Parno, Matthew (Massachusetts Institute of Technology, Cambridge, MA); Ray, Jaideep; Lefantzi, Sophia; Salazar, Luke (Sandia National Laboratories, Albuquerque NM); McKenna, Sean Andrew (Sandia National Laboratories, Albuquerque NM); Klise, Katherine A. (Sandia National Laboratories, Albuquerque NM)
2011-10-01
We investigate Bayesian techniques that can be used to reconstruct field variables from partial observations. In particular, we target fields that exhibit spatial structures with a large spectrum of lengthscales. Contemporary methods typically describe the field on a grid and estimate structures which can be resolved by it. In contrast, we address the reconstruction of grid-resolved structures as well as estimation of statistical summaries of subgrid structures, which are smaller than the grid resolution. We perform this in two different ways (a) via a physical (phenomenological), parameterized subgrid model that summarizes the impact of the unresolved scales at the coarse level and (b) via multiscale finite elements, where specially designed prolongation and restriction operators establish the interscale link between the same problem defined on a coarse and fine mesh. The estimation problem is posed as a Bayesian inverse problem. Dimensionality reduction is performed by projecting the field to be inferred on a suitable orthogonal basis set, viz. the Karhunen-Loeve expansion of a multiGaussian. We first demonstrate our techniques on the reconstruction of a binary medium consisting of a matrix with embedded inclusions, which are too small to be grid-resolved. The reconstruction is performed using an adaptive Markov chain Monte Carlo method. We find that the posterior distributions of the inferred parameters are approximately Gaussian. We exploit this finding to reconstruct a permeability field with long, but narrow embedded fractures (which are too fine to be grid-resolved) using scalable ensemble Kalman filters; this also allows us to address larger grids. Ensemble Kalman filtering is then used to estimate the values of hydraulic conductivity and specific yield in a model of the High Plains Aquifer in Kansas. Strong conditioning of the spatial structure of the parameters and the non-linear aspects of the water table aquifer create difficulty for the ensemble Kalman
Ensemble of regional climate model projections for Ireland
Nolan, Paul; McGrath, Ray
2016-04-01
The method of Regional Climate Modelling (RCM) was employed to assess the impacts of a warming climate on the mid-21st-century climate of Ireland. The RCM simulations were run at high spatial resolution, up to 4 km, thus allowing a better evaluation of the local effects of climate change. Simulations were run for a reference period 1981-2000 and future period 2041-2060. Differences between the two periods provide a measure of climate change. To address the issue of uncertainty, a multi-model ensemble approach was employed. Specifically, the future climate of Ireland was simulated using three different RCMs, driven by four Global Climate Models (GCMs). To account for the uncertainty in future emissions, a number of SRES (B1, A1B, A2) and RCP (4.5, 8.5) emission scenarios were used to simulate the future climate. Through the ensemble approach, the uncertainty in the RCM projections can be partially quantified, thus providing a measure of confidence in the predictions. In addition, likelihood values can be assigned to the projections. The RCMs used in this work are the COnsortium for Small-scale MOdeling-Climate Limited-area Modelling (COSMO-CLM, versions 3 and 4) model and the Weather Research and Forecasting (WRF) model. The GCMs used are the Max Planck Institute's ECHAM5, the UK Met Office's HadGEM2-ES, the CGCM3.1 model from the Canadian Centre for Climate Modelling and the EC-Earth consortium GCM. The projections for mid-century indicate an increase of 1-1.6°C in mean annual temperatures, with the largest increases seen in the east of the country. Warming is enhanced for the extremes (i.e. hot or cold days), with the warmest 5% of daily maximum summer temperatures projected to increase by 0.7-2.6°C. The coldest 5% of night-time temperatures in winter are projected to rise by 1.1-3.1°C. Averaged over the whole country, the number of frost days is projected to decrease by over 50%. The projections indicate an average increase in the length of the growing season
Bayesian Inference of High-Dimensional Dynamical Ocean Models
Lin, J.; Lermusiaux, P. F. J.; Lolla, S. V. T.; Gupta, A.; Haley, P. J., Jr.
2015-12-01
This presentation addresses a holistic set of challenges in high-dimension ocean Bayesian nonlinear estimation: i) predict the probability distribution functions (pdfs) of large nonlinear dynamical systems using stochastic partial differential equations (PDEs); ii) assimilate data using Bayes' law with these pdfs; iii) predict the future data that optimally reduce uncertainties; and (iv) rank the known and learn the new model formulations themselves. Overall, we allow the joint inference of the state, equations, geometry, boundary conditions and initial conditions of dynamical models. Examples are provided for time-dependent fluid and ocean flows, including cavity, double-gyre and Strait flows with jets and eddies. The Bayesian model inference, based on limited observations, is illustrated first by the estimation of obstacle shapes and positions in fluid flows. Next, the Bayesian inference of biogeochemical reaction equations and of their states and parameters is presented, illustrating how PDE-based machine learning can rigorously guide the selection and discovery of complex ecosystem models. Finally, the inference of multiscale bottom gravity current dynamics is illustrated, motivated in part by classic overflows and dense water formation sites and their relevance to climate monitoring and dynamics. This is joint work with our MSEAS group at MIT.
Bayesian Age-Period-Cohort Modeling and Prediction - BAMP
Directory of Open Access Journals (Sweden)
Volker J. Schmid
2007-10-01
Full Text Available The software package BAMP provides a method of analyzing incidence or mortality data on the Lexis diagram, using a Bayesian version of an age-period-cohort model. A hierarchical model is assumed with a binomial model in the first-stage. As smoothing priors for the age, period and cohort parameters random walks of first and second order, with and without an additional unstructured component are available. Unstructured heterogeneity can also be included in the model. In order to evaluate the model fit, posterior deviance, DIC and predictive deviances are computed. By projecting the random walk prior into the future, future death rates can be predicted.
Introduction to Hierarchical Bayesian Modeling for Ecological Data
Parent, Eric
2012-01-01
Making statistical modeling and inference more accessible to ecologists and related scientists, Introduction to Hierarchical Bayesian Modeling for Ecological Data gives readers a flexible and effective framework to learn about complex ecological processes from various sources of data. It also helps readers get started on building their own statistical models. The text begins with simple models that progressively become more complex and realistic through explanatory covariates and intermediate hidden states variables. When fitting the models to data, the authors gradually present the concepts a
Bayesian Option Pricing using Mixed Normal Heteroskedasticity Models
DEFF Research Database (Denmark)
Rombouts, Jeroen; Stentoft, Lars
2014-01-01
Option pricing using mixed normal heteroscedasticity models is considered. It is explained how to perform inference and price options in a Bayesian framework. The approach allows to easily compute risk neutral predictive price densities which take into account parameter uncertainty....... In an application to the S&P 500 index, classical and Bayesian inference is performed on the mixture model using the available return data. Comparing the ML estimates and posterior moments small differences are found. When pricing a rich sample of options on the index, both methods yield similar pricing errors...... measured in dollar and implied standard deviation losses, and it turns out that the impact of parameter uncertainty is minor. Therefore, when it comes to option pricing where large amounts of data are available, the choice of the inference method is unimportant. The results are robust to different...
Operational modal analysis modeling, Bayesian inference, uncertainty laws
Au, Siu-Kui
2017-01-01
This book presents operational modal analysis (OMA), employing a coherent and comprehensive Bayesian framework for modal identification and covering stochastic modeling, theoretical formulations, computational algorithms, and practical applications. Mathematical similarities and philosophical differences between Bayesian and classical statistical approaches to system identification are discussed, allowing their mathematical tools to be shared and their results correctly interpreted. Many chapters can be used as lecture notes for the general topic they cover beyond the OMA context. After an introductory chapter (1), Chapters 2–7 present the general theory of stochastic modeling and analysis of ambient vibrations. Readers are first introduced to the spectral analysis of deterministic time series (2) and structural dynamics (3), which do not require the use of probability concepts. The concepts and techniques in these chapters are subsequently extended to a probabilistic context in Chapter 4 (on stochastic pro...
MACC regional multi-model ensemble simulations of birch pollen dispersion in Europe
Sofiev, M.; Berger, U.; Prank, M.; Vira, J.; Arteta, J.; Belmonte, J.; Bergmann, K.C.; Chéroux, F.; Elbern, H.; Friese, E.; Galan, C.; Gehrig, R.; Khvorostyanov, D.; Kranenburg, R.; Kumar, U.; Marécal, V.; Meleux, F.; Menut, L.; Pessi, A.M.; Robertson, L.; Ritenberga, O.; Rodinkova, V.; Saarto, A.; Segers, A.; Severova, E.; Sauliene, I.; Siljamo, P.; Steensen, B.M.; Teinemaa, E.; Thibaudon, M.; Peuch, V.H.
2015-01-01
This paper presents the first ensemble modelling experiment in relation to birch pollen in Europe. The seven-model European ensemble of MACC-ENS, tested in trial simulations over the flowering season of 2010, was run through the flowering season of 2013. The simulations have been compared with
Bayesian hierarchical model for large-scale covariance matrix estimation.
Zhu, Dongxiao; Hero, Alfred O
2007-12-01
Many bioinformatics problems implicitly depend on estimating large-scale covariance matrix. The traditional approaches tend to give rise to high variance and low accuracy due to "overfitting." We cast the large-scale covariance matrix estimation problem into the Bayesian hierarchical model framework, and introduce dependency between covariance parameters. We demonstrate the advantages of our approaches over the traditional approaches using simulations and OMICS data analysis.
Bayesian Modelling of fMRI Time Series
DEFF Research Database (Denmark)
Højen-Sørensen, Pedro; Hansen, Lars Kai; Rasmussen, Carl Edward
2000-01-01
We present a Hidden Markov Model (HMM) for inferring the hidden psychological state (or neural activity) during single trial fMRI activation experiments with blocked task paradigms. Inference is based on Bayesian methodology, using a combination of analytical and a variety of Markov Chain Monte...... Carlo (MCMC) sampling techniques. The advantage of this method is that detection of short time learning effects between repeated trials is possible since inference is based only on single trial experiments....
Nonparametric Bayesian models through probit stick-breaking processes.
Rodríguez, Abel; Dunson, David B
2011-03-01
We describe a novel class of Bayesian nonparametric priors based on stick-breaking constructions where the weights of the process are constructed as probit transformations of normal random variables. We show that these priors are extremely flexible, allowing us to generate a great variety of models while preserving computational simplicity. Particular emphasis is placed on the construction of rich temporal and spatial processes, which are applied to two problems in finance and ecology.
A Bayesian Model of the Memory Colour Effect
Witzel, Christoph; Olkkonen, Maria; Gegenfurtner, Karl R.
2018-01-01
According to the memory colour effect, the colour of a colour-diagnostic object is not perceived independently of the object itself. Instead, it has been shown through an achromatic adjustment method that colour-diagnostic objects still appear slightly in their typical colour, even when they are colourimetrically grey. Bayesian models provide a promising approach to capture the effect of prior knowledge on colour perception and to link these effects to more general effects of cue integration....
A Bayesian model of the memory colour effect.
Witzel, Christoph; Olkkonen, Maria; Gegenfurtner, Karl R.
2018-01-01
According to the memory colour effect, the colour of a colour-diagnostic object is not perceived independently of the object itself. Instead, it has been shown through an achromatic adjustment method that colour-diagnostic objects still appear slightly in their typical colour, even when they are colourimetrically grey. Bayesian models provide a promising approach to capture the effect of prior knowledge on colour perception and to link these effects to more general effects of cue integration....
Li, L.; Xu, C.-Y.; Engeland, K.
2012-04-01
With respect to model calibration, parameter estimation and analysis of uncertainty sources, different approaches have been used in hydrological models. Bayesian method is one of the most widely used methods for uncertainty assessment of hydrological models, which incorporates different sources of information into a single analysis through Bayesian theorem. However, none of these applications can well treat the uncertainty in extreme flows of hydrological models' simulations. This study proposes a Bayesian modularization method approach in uncertainty assessment of conceptual hydrological models by considering the extreme flows. It includes a comprehensive comparison and evaluation of uncertainty assessments by a new Bayesian modularization method approach and traditional Bayesian models using the Metropolis Hasting (MH) algorithm with the daily hydrological model WASMOD. Three likelihood functions are used in combination with traditional Bayesian: the AR (1) plus Normal and time period independent model (Model 1), the AR (1) plus Normal and time period dependent model (Model 2) and the AR (1) plus multi-normal model (Model 3). The results reveal that (1) the simulations derived from Bayesian modularization method are more accurate with the highest Nash-Sutcliffe efficiency value, and (2) the Bayesian modularization method performs best in uncertainty estimates of entire flows and in terms of the application and computational efficiency. The study thus introduces a new approach for reducing the extreme flow's effect on the discharge uncertainty assessment of hydrological models via Bayesian. Keywords: extreme flow, uncertainty assessment, Bayesian modularization, hydrological model, WASMOD
Advances in Applications of Hierarchical Bayesian Methods with Hydrological Models
Alexander, R. B.; Schwarz, G. E.; Boyer, E. W.
2017-12-01
Mechanistic and empirical watershed models are increasingly used to inform water resource decisions. Growing access to historical stream measurements and data from in-situ sensor technologies has increased the need for improved techniques for coupling models with hydrological measurements. Techniques that account for the intrinsic uncertainties of both models and measurements are especially needed. Hierarchical Bayesian methods provide an efficient modeling tool for quantifying model and prediction uncertainties, including those associated with measurements. Hierarchical methods can also be used to explore spatial and temporal variations in model parameters and uncertainties that are informed by hydrological measurements. We used hierarchical Bayesian methods to develop a hybrid (statistical-mechanistic) SPARROW (SPAtially Referenced Regression On Watershed attributes) model of long-term mean annual streamflow across diverse environmental and climatic drainages in 18 U.S. hydrological regions. Our application illustrates the use of a new generation of Bayesian methods that offer more advanced computational efficiencies than the prior generation. Evaluations of the effects of hierarchical (regional) variations in model coefficients and uncertainties on model accuracy indicates improved prediction accuracies (median of 10-50%) but primarily in humid eastern regions, where model uncertainties are one-third of those in arid western regions. Generally moderate regional variability is observed for most hierarchical coefficients. Accounting for measurement and structural uncertainties, using hierarchical state-space techniques, revealed the effects of spatially-heterogeneous, latent hydrological processes in the "localized" drainages between calibration sites; this improved model precision, with only minor changes in regional coefficients. Our study can inform advances in the use of hierarchical methods with hydrological models to improve their integration with stream
Predicting Power Outages Using Multi-Model Ensemble Forecasts
Cerrai, D.; Anagnostou, E. N.; Yang, J.; Astitha, M.
2017-12-01
Power outages affect every year millions of people in the United States, affecting the economy and conditioning the everyday life. An Outage Prediction Model (OPM) has been developed at the University of Connecticut for helping utilities to quickly restore outages and to limit their adverse consequences on the population. The OPM, operational since 2015, combines several non-parametric machine learning (ML) models that use historical weather storm simulations and high-resolution weather forecasts, satellite remote sensing data, and infrastructure and land cover data to predict the number and spatial distribution of power outages. A new methodology, developed for improving the outage model performances by combining weather- and soil-related variables using three different weather models (WRF 3.7, WRF 3.8 and RAMS/ICLAMS), will be presented in this study. First, we will present a performance evaluation of each model variable, by comparing historical weather analyses with station data or reanalysis over the entire storm data set. Hence, each variable of the new outage model version is extracted from the best performing weather model for that variable, and sensitivity tests are performed for investigating the most efficient variable combination for outage prediction purposes. Despite that the final variables combination is extracted from different weather models, this ensemble based on multi-weather forcing and multi-statistical model power outage prediction outperforms the currently operational OPM version that is based on a single weather forcing variable (WRF 3.7), because each model component is the closest to the actual atmospheric state.
A Bayesian Nonparametric Meta-Analysis Model
Karabatsos, George; Talbott, Elizabeth; Walker, Stephen G.
2015-01-01
In a meta-analysis, it is important to specify a model that adequately describes the effect-size distribution of the underlying population of studies. The conventional normal fixed-effect and normal random-effects models assume a normal effect-size population distribution, conditionally on parameters and covariates. For estimating the mean overall…
Pauci ex tanto numero: reduce redundancy in multi-model ensembles
Solazzo, E.; Riccio, A.; Kioutsioukis, I.; Galmarini, S.
2013-08-01
We explicitly address the fundamental issue of member diversity in multi-model ensembles. To date, no attempts in this direction have been documented within the air quality (AQ) community despite the extensive use of ensembles in this field. Common biases and redundancy are the two issues directly deriving from lack of independence, undermining the significance of a multi-model ensemble, and are the subject of this study. Shared, dependant biases among models do not cancel out but will instead determine a biased ensemble. Redundancy derives from having too large a portion of common variance among the members of the ensemble, producing overconfidence in the predictions and underestimation of the uncertainty. The two issues of common biases and redundancy are analysed in detail using the AQMEII ensemble of AQ model results for four air pollutants in two European regions. We show that models share large portions of bias and variance, extending well beyond those induced by common inputs. We make use of several techniques to further show that subsets of models can explain the same amount of variance as the full ensemble with the advantage of being poorly correlated. Selecting the members for generating skilful, non-redundant ensembles from such subsets proved, however, non-trivial. We propose and discuss various methods of member selection and rate the ensemble performance they produce. In most cases, the full ensemble is outscored by the reduced ones. We conclude that, although independence of outputs may not always guarantee enhancement of scores (but this depends upon the skill being investigated), we discourage selecting the members of the ensemble simply on the basis of scores; that is, independence and skills need to be considered disjointly.
AIC, BIC, Bayesian evidence against the interacting dark energy model
Energy Technology Data Exchange (ETDEWEB)
Szydlowski, Marek [Jagiellonian University, Astronomical Observatory, Krakow (Poland); Jagiellonian University, Mark Kac Complex Systems Research Centre, Krakow (Poland); Krawiec, Adam [Jagiellonian University, Institute of Economics, Finance and Management, Krakow (Poland); Jagiellonian University, Mark Kac Complex Systems Research Centre, Krakow (Poland); Kurek, Aleksandra [Jagiellonian University, Astronomical Observatory, Krakow (Poland); Kamionka, Michal [University of Wroclaw, Astronomical Institute, Wroclaw (Poland)
2015-01-01
Recent astronomical observations have indicated that the Universe is in a phase of accelerated expansion. While there are many cosmological models which try to explain this phenomenon, we focus on the interacting ΛCDM model where an interaction between the dark energy and dark matter sectors takes place. This model is compared to its simpler alternative - the ΛCDM model. To choose between these models the likelihood ratio test was applied as well as the model comparison methods (employing Occam's principle): the Akaike information criterion (AIC), the Bayesian information criterion (BIC) and the Bayesian evidence. Using the current astronomical data: type Ia supernova (Union2.1), h(z), baryon acoustic oscillation, the Alcock- Paczynski test, and the cosmic microwave background data, we evaluated both models. The analyses based on the AIC indicated that there is less support for the interacting ΛCDM model when compared to the ΛCDM model, while those based on the BIC indicated that there is strong evidence against it in favor of the ΛCDM model. Given the weak or almost non-existing support for the interacting ΛCDM model and bearing in mind Occam's razor we are inclined to reject this model. (orig.)
AIC, BIC, Bayesian evidence against the interacting dark energy model
Energy Technology Data Exchange (ETDEWEB)
Szydłowski, Marek, E-mail: marek.szydlowski@uj.edu.pl [Astronomical Observatory, Jagiellonian University, Orla 171, 30-244, Kraków (Poland); Mark Kac Complex Systems Research Centre, Jagiellonian University, Reymonta 4, 30-059, Kraków (Poland); Krawiec, Adam, E-mail: adam.krawiec@uj.edu.pl [Institute of Economics, Finance and Management, Jagiellonian University, Łojasiewicza 4, 30-348, Kraków (Poland); Mark Kac Complex Systems Research Centre, Jagiellonian University, Reymonta 4, 30-059, Kraków (Poland); Kurek, Aleksandra, E-mail: alex@oa.uj.edu.pl [Astronomical Observatory, Jagiellonian University, Orla 171, 30-244, Kraków (Poland); Kamionka, Michał, E-mail: kamionka@astro.uni.wroc.pl [Astronomical Institute, University of Wrocław, ul. Kopernika 11, 51-622, Wrocław (Poland)
2015-01-14
Recent astronomical observations have indicated that the Universe is in a phase of accelerated expansion. While there are many cosmological models which try to explain this phenomenon, we focus on the interacting ΛCDM model where an interaction between the dark energy and dark matter sectors takes place. This model is compared to its simpler alternative—the ΛCDM model. To choose between these models the likelihood ratio test was applied as well as the model comparison methods (employing Occam’s principle): the Akaike information criterion (AIC), the Bayesian information criterion (BIC) and the Bayesian evidence. Using the current astronomical data: type Ia supernova (Union2.1), h(z), baryon acoustic oscillation, the Alcock–Paczynski test, and the cosmic microwave background data, we evaluated both models. The analyses based on the AIC indicated that there is less support for the interacting ΛCDM model when compared to the ΛCDM model, while those based on the BIC indicated that there is strong evidence against it in favor of the ΛCDM model. Given the weak or almost non-existing support for the interacting ΛCDM model and bearing in mind Occam’s razor we are inclined to reject this model.
AIC, BIC, Bayesian evidence against the interacting dark energy model
International Nuclear Information System (INIS)
Szydlowski, Marek; Krawiec, Adam; Kurek, Aleksandra; Kamionka, Michal
2015-01-01
Recent astronomical observations have indicated that the Universe is in a phase of accelerated expansion. While there are many cosmological models which try to explain this phenomenon, we focus on the interacting ΛCDM model where an interaction between the dark energy and dark matter sectors takes place. This model is compared to its simpler alternative - the ΛCDM model. To choose between these models the likelihood ratio test was applied as well as the model comparison methods (employing Occam's principle): the Akaike information criterion (AIC), the Bayesian information criterion (BIC) and the Bayesian evidence. Using the current astronomical data: type Ia supernova (Union2.1), h(z), baryon acoustic oscillation, the Alcock- Paczynski test, and the cosmic microwave background data, we evaluated both models. The analyses based on the AIC indicated that there is less support for the interacting ΛCDM model when compared to the ΛCDM model, while those based on the BIC indicated that there is strong evidence against it in favor of the ΛCDM model. Given the weak or almost non-existing support for the interacting ΛCDM model and bearing in mind Occam's razor we are inclined to reject this model. (orig.)
Towards port sustainability through probabilistic models: Bayesian networks
Directory of Open Access Journals (Sweden)
B. Molina
2018-04-01
Full Text Available It is necessary that a manager of an infrastructure knows relations between variables. Using Bayesian networks, variables can be classified, predicted and diagnosed, being able to estimate posterior probability of the unknown ones based on known ones. The proposed methodology has generated a database with port variables, which have been classified as economic, social, environmental and institutional, as addressed in of smart ports studies made in all Spanish Port System. Network has been developed using an acyclic directed graph, which have let us know relationships in terms of parents and sons. In probabilistic terms, it can be concluded from the constructed network that the most decisive variables for port sustainability are those that are part of the institutional dimension. It has been concluded that Bayesian networks allow modeling uncertainty probabilistically even when the number of variables is high as it occurs in port planning and exploitation.
Bayesian geostatistical modeling of leishmaniasis incidence in Brazil.
Directory of Open Access Journals (Sweden)
Dimitrios-Alexios Karagiannis-Voules
Full Text Available BACKGROUND: Leishmaniasis is endemic in 98 countries with an estimated 350 million people at risk and approximately 2 million cases annually. Brazil is one of the most severely affected countries. METHODOLOGY: We applied Bayesian geostatistical negative binomial models to analyze reported incidence data of cutaneous and visceral leishmaniasis in Brazil covering a 10-year period (2001-2010. Particular emphasis was placed on spatial and temporal patterns. The models were fitted using integrated nested Laplace approximations to perform fast approximate Bayesian inference. Bayesian variable selection was employed to determine the most important climatic, environmental, and socioeconomic predictors of cutaneous and visceral leishmaniasis. PRINCIPAL FINDINGS: For both types of leishmaniasis, precipitation and socioeconomic proxies were identified as important risk factors. The predicted number of cases in 2010 were 30,189 (standard deviation [SD]: 7,676 for cutaneous leishmaniasis and 4,889 (SD: 288 for visceral leishmaniasis. Our risk maps predicted the highest numbers of infected people in the states of Minas Gerais and Pará for visceral and cutaneous leishmaniasis, respectively. CONCLUSIONS/SIGNIFICANCE: Our spatially explicit, high-resolution incidence maps identified priority areas where leishmaniasis control efforts should be targeted with the ultimate goal to reduce disease incidence.
Ensembles modeling approach to study Climate Change impacts on Wheat
Ahmed, Mukhtar; Claudio, Stöckle O.; Nelson, Roger; Higgins, Stewart
2017-04-01
Simulations of crop yield under climate variability are subject to uncertainties, and quantification of such uncertainties is essential for effective use of projected results in adaptation and mitigation strategies. In this study we evaluated the uncertainties related to crop-climate models using five crop growth simulation models (CropSyst, APSIM, DSSAT, STICS and EPIC) and 14 general circulation models (GCMs) for 2 representative concentration pathways (RCP) of atmospheric CO2 (4.5 and 8.5 W m-2) in the Pacific Northwest (PNW), USA. The aim was to assess how different process-based crop models could be used accurately for estimation of winter wheat growth, development and yield. Firstly, all models were calibrated for high rainfall, medium rainfall, low rainfall and irrigated sites in the PNW using 1979-2010 as the baseline period. Response variables were related to farm management and soil properties, and included crop phenology, leaf area index (LAI), biomass and grain yield of winter wheat. All five models were run from 2000 to 2100 using the 14 GCMs and 2 RCPs to evaluate the effect of future climate (rainfall, temperature and CO2) on winter wheat phenology, LAI, biomass, grain yield and harvest index. Simulated time to flowering and maturity was reduced in all models except EPIC with some level of uncertainty. All models generally predicted an increase in biomass and grain yield under elevated CO2 but this effect was more prominent under rainfed conditions than irrigation. However, there was uncertainty in the simulation of crop phenology, biomass and grain yield under 14 GCMs during three prediction periods (2030, 2050 and 2070). We concluded that to improve accuracy and consistency in simulating wheat growth dynamics and yield under a changing climate, a multimodel ensemble approach should be used.
Ensembling Variable Selectors by Stability Selection for the Cox Model
Directory of Open Access Journals (Sweden)
Qing-Yan Yin
2017-01-01
Full Text Available As a pivotal tool to build interpretive models, variable selection plays an increasingly important role in high-dimensional data analysis. In recent years, variable selection ensembles (VSEs have gained much interest due to their many advantages. Stability selection (Meinshausen and Bühlmann, 2010, a VSE technique based on subsampling in combination with a base algorithm like lasso, is an effective method to control false discovery rate (FDR and to improve selection accuracy in linear regression models. By adopting lasso as a base learner, we attempt to extend stability selection to handle variable selection problems in a Cox model. According to our experience, it is crucial to set the regularization region Λ in lasso and the parameter λmin properly so that stability selection can work well. To the best of our knowledge, however, there is no literature addressing this problem in an explicit way. Therefore, we first provide a detailed procedure to specify Λ and λmin. Then, some simulated and real-world data with various censoring rates are used to examine how well stability selection performs. It is also compared with several other variable selection approaches. Experimental results demonstrate that it achieves better or competitive performance in comparison with several other popular techniques.
Ensemble models on palaeoclimate to predict India's groundwater challenge
Directory of Open Access Journals (Sweden)
Partha Sarathi Datta
2013-09-01
Full Text Available In many parts of the world, freshwater crisis is largely due to increasing water consumption and pollution by rapidly growing population and aspirations for economic development, but, ascribed usually to the climate. However, limited understanding and knowledge gaps in the factors controlling climate and uncertainties in the climate models are unable to assess the probable impacts on water availability in tropical regions. In this context, review of ensemble models on δ18O and δD in rainfall and groundwater, 3H- and 14C- ages of groundwater and 14C- age of lakes sediments helped to reconstruct palaeoclimate and long-term recharge in the North-west India; and predict future groundwater challenge. The annual mean temperature trend indicates both warming/cooling in different parts of India in the past and during 1901–2010. Neither the GCMs (Global Climate Models nor the observational record indicates any significant change/increase in temperature and rainfall over the last century, and climate change during the last 1200 yrs BP. In much of the North-West region, deep groundwater renewal occurred from past humid climate, and shallow groundwater renewal from limited modern recharge over the past decades. To make water management to be more responsive to climate change, the gaps in the science of climate change need to be bridged.
Dynamic model based on Bayesian method for energy security assessment
International Nuclear Information System (INIS)
Augutis, Juozas; Krikštolaitis, Ričardas; Pečiulytė, Sigita; Žutautaitė, Inga
2015-01-01
Highlights: • Methodology for dynamic indicator model construction and forecasting of indicators. • Application of dynamic indicator model for energy system development scenarios. • Expert judgement involvement using Bayesian method. - Abstract: The methodology for the dynamic indicator model construction and forecasting of indicators for the assessment of energy security level is presented in this article. An indicator is a special index, which provides numerical values to important factors for the investigated area. In real life, models of different processes take into account various factors that are time-dependent and dependent on each other. Thus, it is advisable to construct a dynamic model in order to describe these dependences. The energy security indicators are used as factors in the dynamic model. Usually, the values of indicators are obtained from statistical data. The developed dynamic model enables to forecast indicators’ variation taking into account changes in system configuration. The energy system development is usually based on a new object construction. Since the parameters of changes of the new system are not exactly known, information about their influences on indicators could not be involved in the model by deterministic methods. Thus, dynamic indicators’ model based on historical data is adjusted by probabilistic model with the influence of new factors on indicators using the Bayesian method
Bayesian model discrimination for glucose-insulin homeostasis
DEFF Research Database (Denmark)
Andersen, Kim Emil; Brooks, Stephen P.; Højbjerre, Malene
In this paper we analyse a set of experimental data on a number of healthy and diabetic patients and discuss a variety of models for describing the physiological processes involved in glucose absorption and insulin secretion within the human body. We adopt a Bayesian approach which facilitates...... as parameter uncertainty. Markov chain Monte Carlo methods are used, combining Metropolis Hastings, reversible jump and simulated tempering updates to provide rapidly mixing chains so as to provide robust inference. We demonstrate the methodology for both healthy and type II diabetic populations concluding...... that whilst both populations are well modelled by a common insulin model, their glucose dynamics differ considerably....
Bayesian modeling to paired comparison data via the Pareto distribution
Directory of Open Access Journals (Sweden)
Nasir Abbas
2017-12-01
Full Text Available A probabilistic approach to build models for paired comparison experiments based on the comparison of two Pareto variables is considered. Analysis of the proposed model is carried out in classical as well as Bayesian frameworks. Informative and uninformative priors are employed to accommodate the prior information. Simulation study is conducted to assess the suitablily and performance of the model under theoretical conditions. Appropriateness of fit of the is also carried out. Entire inferential procedure is illustrated by comparing certain cricket teams using real dataset.
Bayesian inference and model comparison for metallic fatigue data
Babuška, Ivo
2016-02-23
In this work, we present a statistical treatment of stress-life (S-N) data drawn from a collection of records of fatigue experiments that were performed on 75S-T6 aluminum alloys. Our main objective is to predict the fatigue life of materials by providing a systematic approach to model calibration, model selection and model ranking with reference to S-N data. To this purpose, we consider fatigue-limit models and random fatigue-limit models that are specially designed to allow the treatment of the run-outs (right-censored data). We first fit the models to the data by maximum likelihood methods and estimate the quantiles of the life distribution of the alloy specimen. To assess the robustness of the estimation of the quantile functions, we obtain bootstrap confidence bands by stratified resampling with respect to the cycle ratio. We then compare and rank the models by classical measures of fit based on information criteria. We also consider a Bayesian approach that provides, under the prior distribution of the model parameters selected by the user, their simulation-based posterior distributions. We implement and apply Bayesian model comparison methods, such as Bayes factor ranking and predictive information criteria based on cross-validation techniques under various a priori scenarios.
Bayesian inference and model comparison for metallic fatigue data
Babuška, Ivo; Sawlan, Zaid A; Scavino, Marco; Szabó , Barna; Tempone, Raul
2016-01-01
In this work, we present a statistical treatment of stress-life (S-N) data drawn from a collection of records of fatigue experiments that were performed on 75S-T6 aluminum alloys. Our main objective is to predict the fatigue life of materials by providing a systematic approach to model calibration, model selection and model ranking with reference to S-N data. To this purpose, we consider fatigue-limit models and random fatigue-limit models that are specially designed to allow the treatment of the run-outs (right-censored data). We first fit the models to the data by maximum likelihood methods and estimate the quantiles of the life distribution of the alloy specimen. To assess the robustness of the estimation of the quantile functions, we obtain bootstrap confidence bands by stratified resampling with respect to the cycle ratio. We then compare and rank the models by classical measures of fit based on information criteria. We also consider a Bayesian approach that provides, under the prior distribution of the model parameters selected by the user, their simulation-based posterior distributions. We implement and apply Bayesian model comparison methods, such as Bayes factor ranking and predictive information criteria based on cross-validation techniques under various a priori scenarios.
Predicting coastal cliff erosion using a Bayesian probabilistic model
Hapke, Cheryl J.; Plant, Nathaniel G.
2010-01-01
Regional coastal cliff retreat is difficult to model due to the episodic nature of failures and the along-shore variability of retreat events. There is a growing demand, however, for predictive models that can be used to forecast areas vulnerable to coastal erosion hazards. Increasingly, probabilistic models are being employed that require data sets of high temporal density to define the joint probability density function that relates forcing variables (e.g. wave conditions) and initial conditions (e.g. cliff geometry) to erosion events. In this study we use a multi-parameter Bayesian network to investigate correlations between key variables that control and influence variations in cliff retreat processes. The network uses Bayesian statistical methods to estimate event probabilities using existing observations. Within this framework, we forecast the spatial distribution of cliff retreat along two stretches of cliffed coast in Southern California. The input parameters are the height and slope of the cliff, a descriptor of material strength based on the dominant cliff-forming lithology, and the long-term cliff erosion rate that represents prior behavior. The model is forced using predicted wave impact hours. Results demonstrate that the Bayesian approach is well-suited to the forward modeling of coastal cliff retreat, with the correct outcomes forecast in 70–90% of the modeled transects. The model also performs well in identifying specific locations of high cliff erosion, thus providing a foundation for hazard mapping. This approach can be employed to predict cliff erosion at time-scales ranging from storm events to the impacts of sea-level rise at the century-scale.
DPpackage: Bayesian Semi- and Nonparametric Modeling in R
Directory of Open Access Journals (Sweden)
Alejandro Jara
2011-04-01
Full Text Available Data analysis sometimes requires the relaxation of parametric assumptions in order to gain modeling flexibility and robustness against mis-specification of the probability model. In the Bayesian context, this is accomplished by placing a prior distribution on a function space, such as the space of all probability distributions or the space of all regression functions. Unfortunately, posterior distributions ranging over function spaces are highly complex and hence sampling methods play a key role. This paper provides an introduction to a simple, yet comprehensive, set of programs for the implementation of some Bayesian nonparametric and semiparametric models in R, DPpackage. Currently, DPpackage includes models for marginal and conditional density estimation, receiver operating characteristic curve analysis, interval-censored data, binary regression data, item response data, longitudinal and clustered data using generalized linear mixed models, and regression data using generalized additive models. The package also contains functions to compute pseudo-Bayes factors for model comparison and for eliciting the precision parameter of the Dirichlet process prior, and a general purpose Metropolis sampling algorithm. To maximize computational efficiency, the actual sampling for each model is carried out using compiled C, C++ or Fortran code.
A Bayesian approach for quantification of model uncertainty
International Nuclear Information System (INIS)
Park, Inseok; Amarchinta, Hemanth K.; Grandhi, Ramana V.
2010-01-01
In most engineering problems, more than one model can be created to represent an engineering system's behavior. Uncertainty is inevitably involved in selecting the best model from among the models that are possible. Uncertainty in model selection cannot be ignored, especially when the differences between the predictions of competing models are significant. In this research, a methodology is proposed to quantify model uncertainty using measured differences between experimental data and model outcomes under a Bayesian statistical framework. The adjustment factor approach is used to propagate model uncertainty into prediction of a system response. A nonlinear vibration system is used to demonstrate the processes for implementing the adjustment factor approach. Finally, the methodology is applied on the engineering benefits of a laser peening process, and a confidence band for residual stresses is established to indicate the reliability of model prediction.
Nonparametric Bayesian models for a spatial covariance.
Reich, Brian J; Fuentes, Montserrat
2012-01-01
A crucial step in the analysis of spatial data is to estimate the spatial correlation function that determines the relationship between a spatial process at two locations. The standard approach to selecting the appropriate correlation function is to use prior knowledge or exploratory analysis, such as a variogram analysis, to select the correct parametric correlation function. Rather that selecting a particular parametric correlation function, we treat the covariance function as an unknown function to be estimated from the data. We propose a flexible prior for the correlation function to provide robustness to the choice of correlation function. We specify the prior for the correlation function using spectral methods and the Dirichlet process prior, which is a common prior for an unknown distribution function. Our model does not require Gaussian data or spatial locations on a regular grid. The approach is demonstrated using a simulation study as well as an analysis of California air pollution data.
Wetterhall, F.; Cloke, H. L.; He, Y.; Freer, J.; Pappenberger, F.
2012-04-01
Evidence provided by modelled assessments of climate change impact on flooding is fundamental to water resource and flood risk decision making. Impact models usually rely on climate projections from Global and Regional Climate Models, and there is no doubt that these provide a useful assessment of future climate change. However, cascading ensembles of climate projections into impact models is not straightforward because of problems of coarse resolution in Global and Regional Climate Models (GCM/RCM) and the deficiencies in modelling high-intensity precipitation events. Thus decisions must be made on how to appropriately pre-process the meteorological variables from GCM/RCMs, such as selection of downscaling methods and application of Model Output Statistics (MOS). In this paper a grand ensemble of projections from several GCM/RCM are used to drive a hydrological model and analyse the resulting future flood projections for the Upper Severn, UK. The impact and implications of applying MOS techniques to precipitation as well as hydrological model parameter uncertainty is taken into account. The resultant grand ensemble of future river discharge projections from the RCM/GCM-hydrological model chain is evaluated against a response surface technique combined with a perturbed physics experiment creating a probabilisic ensemble climate model outputs. The ensemble distribution of results show that future risk of flooding in the Upper Severn increases compared to present conditions, however, the study highlights that the uncertainties are large and that strong assumptions were made in using Model Output Statistics to produce the estimates of future discharge. The importance of analysing on a seasonal basis rather than just annual is highlighted. The inability of the RCMs (and GCMs) to produce realistic precipitation patterns, even in present conditions, is a major caveat of local climate impact studies on flooding, and this should be a focus for future development.
Two Bayesian tests of the GLOMOsys Model.
Field, Sarahanne M; Wagenmakers, Eric-Jan; Newell, Ben R; Zeelenberg, René; van Ravenzwaaij, Don
2016-12-01
Priming is arguably one of the key phenomena in contemporary social psychology. Recent retractions and failed replication attempts have led to a division in the field between proponents and skeptics and have reinforced the importance of confirming certain priming effects through replication. In this study, we describe the results of 2 preregistered replication attempts of 1 experiment by Förster and Denzler (2012). In both experiments, participants first processed letters either globally or locally, then were tested using a typicality rating task. Bayes factor hypothesis tests were conducted for both experiments: Experiment 1 (N = 100) yielded an indecisive Bayes factor of 1.38, indicating that the in-lab data are 1.38 times more likely to have occurred under the null hypothesis than under the alternative. Experiment 2 (N = 908) yielded a Bayes factor of 10.84, indicating strong support for the null hypothesis that global priming does not affect participants' mean typicality ratings. The failure to replicate this priming effect challenges existing support for the GLOMO sys model. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Modeling Women's Menstrual Cycles using PICI Gates in Bayesian Network.
Zagorecki, Adam; Łupińska-Dubicka, Anna; Voortman, Mark; Druzdzel, Marek J
2016-03-01
A major difficulty in building Bayesian network (BN) models is the size of conditional probability tables, which grow exponentially in the number of parents. One way of dealing with this problem is through parametric conditional probability distributions that usually require only a number of parameters that is linear in the number of parents. In this paper, we introduce a new class of parametric models, the Probabilistic Independence of Causal Influences (PICI) models, that aim at lowering the number of parameters required to specify local probability distributions, but are still capable of efficiently modeling a variety of interactions. A subset of PICI models is decomposable and this leads to significantly faster inference as compared to models that cannot be decomposed. We present an application of the proposed method to learning dynamic BNs for modeling a woman's menstrual cycle. We show that PICI models are especially useful for parameter learning from small data sets and lead to higher parameter accuracy than when learning CPTs.
Simulation models are extensively used to predict agricultural productivity and greenhouse gas (GHG) emissions. However, the uncertainties of (reduced) model ensemble simulations have not been assessed systematically for variables affecting food security and climate change mitigation, within multisp...
Robust Bayesian Experimental Design for Conceptual Model Discrimination
Pham, H. V.; Tsai, F. T. C.
2015-12-01
A robust Bayesian optimal experimental design under uncertainty is presented to provide firm information for model discrimination, given the least number of pumping wells and observation wells. Firm information is the maximum information of a system can be guaranteed from an experimental design. The design is based on the Box-Hill expected entropy decrease (EED) before and after the experiment design and the Bayesian model averaging (BMA) framework. A max-min programming is introduced to choose the robust design that maximizes the minimal Box-Hill EED subject to that the highest expected posterior model probability satisfies a desired probability threshold. The EED is calculated by the Gauss-Hermite quadrature. The BMA method is used to predict future observations and to quantify future observation uncertainty arising from conceptual and parametric uncertainties in calculating EED. Monte Carlo approach is adopted to quantify the uncertainty in the posterior model probabilities. The optimal experimental design is tested by a synthetic 5-layer anisotropic confined aquifer. Nine conceptual groundwater models are constructed due to uncertain geological architecture and boundary condition. High-performance computing is used to enumerate all possible design solutions in order to identify the most plausible groundwater model. Results highlight the impacts of scedasticity in future observation data as well as uncertainty sources on potential pumping and observation locations.
Macroeconomic Forecasts in Models with Bayesian Averaging of Classical Estimates
Directory of Open Access Journals (Sweden)
Piotr Białowolski
2012-03-01
Full Text Available The aim of this paper is to construct a forecasting model oriented on predicting basic macroeconomic variables, namely: the GDP growth rate, the unemployment rate, and the consumer price inflation. In order to select the set of the best regressors, Bayesian Averaging of Classical Estimators (BACE is employed. The models are atheoretical (i.e. they do not reflect causal relationships postulated by the macroeconomic theory and the role of regressors is played by business and consumer tendency survey-based indicators. Additionally, survey-based indicators are included with a lag that enables to forecast the variables of interest (GDP, unemployment, and inflation for the four forthcoming quarters without the need to make any additional assumptions concerning the values of predictor variables in the forecast period. Bayesian Averaging of Classical Estimators is a method allowing for full and controlled overview of all econometric models which can be obtained out of a particular set of regressors. In this paper authors describe the method of generating a family of econometric models and the procedure for selection of a final forecasting model. Verification of the procedure is performed by means of out-of-sample forecasts of main economic variables for the quarters of 2011. The accuracy of the forecasts implies that there is still a need to search for new solutions in the atheoretical modelling.
Modeling operational risks of the nuclear industry with Bayesian networks
International Nuclear Information System (INIS)
Wieland, Patricia; Lustosa, Leonardo J.
2009-01-01
Basically, planning a new industrial plant requires information on the industrial management, regulations, site selection, definition of initial and planned capacity, and on the estimation of the potential demand. However, this is far from enough to assure the success of an industrial enterprise. Unexpected and extremely damaging events may occur that deviates from the original plan. The so-called operational risks are not only in the system, equipment, process or human (technical or managerial) failures. They are also in intentional events such as frauds and sabotage, or extreme events like terrorist attacks or radiological accidents and even on public reaction to perceived environmental or future generation impacts. For the nuclear industry, it is a challenge to identify and to assess the operational risks and their various sources. Early identification of operational risks can help in preparing contingency plans, to delay the decision to invest or to approve a project that can, at an extreme, affect the public perception of the nuclear energy. A major problem in modeling operational risk losses is the lack of internal data that are essential, for example, to apply the loss distribution approach. As an alternative, methods that consider qualitative and subjective information can be applied, for example, fuzzy logic, neural networks, system dynamic or Bayesian networks. An advantage of applying Bayesian networks to model operational risk is the possibility to include expert opinions and variables of interest, to structure the model via causal dependencies among these variables, and to specify subjective prior and conditional probabilities distributions at each step or network node. This paper suggests a classification of operational risks in industry and discusses the benefits and obstacles of the Bayesian networks approach to model those risks. (author)
Modeling operational risks of the nuclear industry with Bayesian networks
Energy Technology Data Exchange (ETDEWEB)
Wieland, Patricia [Pontificia Univ. Catolica do Rio de Janeiro (PUC-Rio), RJ (Brazil). Dept. de Engenharia Industrial; Comissao Nacional de Energia Nuclear (CNEN), Rio de Janeiro, RJ (Brazil)], e-mail: pwieland@cnen.gov.br; Lustosa, Leonardo J. [Pontificia Univ. Catolica do Rio de Janeiro (PUC-Rio), RJ (Brazil). Dept. de Engenharia Industrial], e-mail: ljl@puc-rio.br
2009-07-01
Basically, planning a new industrial plant requires information on the industrial management, regulations, site selection, definition of initial and planned capacity, and on the estimation of the potential demand. However, this is far from enough to assure the success of an industrial enterprise. Unexpected and extremely damaging events may occur that deviates from the original plan. The so-called operational risks are not only in the system, equipment, process or human (technical or managerial) failures. They are also in intentional events such as frauds and sabotage, or extreme events like terrorist attacks or radiological accidents and even on public reaction to perceived environmental or future generation impacts. For the nuclear industry, it is a challenge to identify and to assess the operational risks and their various sources. Early identification of operational risks can help in preparing contingency plans, to delay the decision to invest or to approve a project that can, at an extreme, affect the public perception of the nuclear energy. A major problem in modeling operational risk losses is the lack of internal data that are essential, for example, to apply the loss distribution approach. As an alternative, methods that consider qualitative and subjective information can be applied, for example, fuzzy logic, neural networks, system dynamic or Bayesian networks. An advantage of applying Bayesian networks to model operational risk is the possibility to include expert opinions and variables of interest, to structure the model via causal dependencies among these variables, and to specify subjective prior and conditional probabilities distributions at each step or network node. This paper suggests a classification of operational risks in industry and discusses the benefits and obstacles of the Bayesian networks approach to model those risks. (author)
Bayesian modeling of the mass and density of asteroids
Dotson, Jessie L.; Mathias, Donovan
2017-10-01
Mass and density are two of the fundamental properties of any object. In the case of near earth asteroids, knowledge about the mass of an asteroid is essential for estimating the risk due to (potential) impact and planning possible mitigation options. The density of an asteroid can illuminate the structure of the asteroid. A low density can be indicative of a rubble pile structure whereas a higher density can imply a monolith and/or higher metal content. The damage resulting from an impact of an asteroid with Earth depends on its interior structure in addition to its total mass, and as a result, density is a key parameter to understanding the risk of asteroid impact. Unfortunately, measuring the mass and density of asteroids is challenging and often results in measurements with large uncertainties. In the absence of mass / density measurements for a specific object, understanding the range and distribution of likely values can facilitate probabilistic assessments of structure and impact risk. Hierarchical Bayesian models have recently been developed to investigate the mass - radius relationship of exoplanets (Wolfgang, Rogers & Ford 2016) and to probabilistically forecast the mass of bodies large enough to establish hydrostatic equilibrium over a range of 9 orders of magnitude in mass (from planemos to main sequence stars; Chen & Kipping 2017). Here, we extend this approach to investigate the mass and densities of asteroids. Several candidate Bayesian models are presented, and their performance is assessed relative to a synthetic asteroid population. In addition, a preliminary Bayesian model for probablistically forecasting masses and densities of asteroids is presented. The forecasting model is conditioned on existing asteroid data and includes observational errors, hyper-parameter uncertainties and intrinsic scatter.
Bayesian analysis for uncertainty estimation of a canopy transpiration model
Samanta, S.; Mackay, D. S.; Clayton, M. K.; Kruger, E. L.; Ewers, B. E.
2007-04-01
A Bayesian approach was used to fit a conceptual transpiration model to half-hourly transpiration rates for a sugar maple (Acer saccharum) stand collected over a 5-month period and probabilistically estimate its parameter and prediction uncertainties. The model used the Penman-Monteith equation with the Jarvis model for canopy conductance. This deterministic model was extended by adding a normally distributed error term. This extension enabled using Markov chain Monte Carlo simulations to sample the posterior parameter distributions. The residuals revealed approximate conformance to the assumption of normally distributed errors. However, minor systematic structures in the residuals at fine timescales suggested model changes that would potentially improve the modeling of transpiration. Results also indicated considerable uncertainties in the parameter and transpiration estimates. This simple methodology of uncertainty analysis would facilitate the deductive step during the development cycle of deterministic conceptual models by accounting for these uncertainties while drawing inferences from data.
Kaplan, David; Lee, Chansoon
2018-01-01
This article provides a review of Bayesian model averaging as a means of optimizing the predictive performance of common statistical models applied to large-scale educational assessments. The Bayesian framework recognizes that in addition to parameter uncertainty, there is uncertainty in the choice of models themselves. A Bayesian approach to addressing the problem of model uncertainty is the method of Bayesian model averaging. Bayesian model averaging searches the space of possible models for a set of submodels that satisfy certain scientific principles and then averages the coefficients across these submodels weighted by each model's posterior model probability (PMP). Using the weighted coefficients for prediction has been shown to yield optimal predictive performance according to certain scoring rules. We demonstrate the utility of Bayesian model averaging for prediction in education research with three examples: Bayesian regression analysis, Bayesian logistic regression, and a recently developed approach for Bayesian structural equation modeling. In each case, the model-averaged estimates are shown to yield better prediction of the outcome of interest than any submodel based on predictive coverage and the log-score rule. Implications for the design of large-scale assessments when the goal is optimal prediction in a policy context are discussed.
Multi-model ensembles for assessment of flood losses and associated uncertainty
Figueiredo, Rui; Schröter, Kai; Weiss-Motz, Alexander; Martina, Mario L. V.; Kreibich, Heidi
2018-05-01
Flood loss modelling is a crucial part of risk assessments. However, it is subject to large uncertainty that is often neglected. Most models available in the literature are deterministic, providing only single point estimates of flood loss, and large disparities tend to exist among them. Adopting any one such model in a risk assessment context is likely to lead to inaccurate loss estimates and sub-optimal decision-making. In this paper, we propose the use of multi-model ensembles to address these issues. This approach, which has been applied successfully in other scientific fields, is based on the combination of different model outputs with the aim of improving the skill and usefulness of predictions. We first propose a model rating framework to support ensemble construction, based on a probability tree of model properties, which establishes relative degrees of belief between candidate models. Using 20 flood loss models in two test cases, we then construct numerous multi-model ensembles, based both on the rating framework and on a stochastic method, differing in terms of participating members, ensemble size and model weights. We evaluate the performance of ensemble means, as well as their probabilistic skill and reliability. Our results demonstrate that well-designed multi-model ensembles represent a pragmatic approach to consistently obtain more accurate flood loss estimates and reliable probability distributions of model uncertainty.
A study of finite mixture model: Bayesian approach on financial time series data
Phoong, Seuk-Yen; Ismail, Mohd Tahir
2014-07-01
Recently, statistician have emphasized on the fitting finite mixture model by using Bayesian method. Finite mixture model is a mixture of distributions in modeling a statistical distribution meanwhile Bayesian method is a statistical method that use to fit the mixture model. Bayesian method is being used widely because it has asymptotic properties which provide remarkable result. In addition, Bayesian method also shows consistency characteristic which means the parameter estimates are close to the predictive distributions. In the present paper, the number of components for mixture model is studied by using Bayesian Information Criterion. Identify the number of component is important because it may lead to an invalid result. Later, the Bayesian method is utilized to fit the k-component mixture model in order to explore the relationship between rubber price and stock market price for Malaysia, Thailand, Philippines and Indonesia. Lastly, the results showed that there is a negative effect among rubber price and stock market price for all selected countries.
Directory of Open Access Journals (Sweden)
Hashem Salarzadeh Jenatabadi
2016-11-01
Full Text Available There are many factors which could influence the sustainability of airlines. The main purpose of this study is to introduce a framework for a financial sustainability index and model it based on structural equation modeling (SEM with maximum likelihood and Bayesian predictors. The introduced framework includes economic performance, operational performance, cost performance, and financial performance. Based on both Bayesian SEM (Bayesian-SEM and Classical SEM (Classical-SEM, it was found that economic performance with both operational performance and cost performance are significantly related to the financial performance index. The four mathematical indices employed are root mean square error, coefficient of determination, mean absolute error, and mean absolute percentage error to compare the efficiency of Bayesian-SEM and Classical-SEM in predicting the airline financial performance. The outputs confirmed that the framework with Bayesian prediction delivered a good fit with the data, although the framework predicted with a Classical-SEM approach did not prepare a well-fitting model. The reasons for this discrepancy between Classical and Bayesian predictions, as well as the potential advantages and caveats with the application of Bayesian approach in airline sustainability studies, are debated.
From qualitative reasoning models to Bayesian-based learner modeling
Milošević, U.; Bredeweg, B.; de Kleer, J.; Forbus, K.D.
2010-01-01
Assessing the knowledge of a student is a fundamental part of intelligent learning environments. We present a Bayesian network based approach to dealing with uncertainty when estimating a learner’s state of knowledge in the context of Qualitative Reasoning (QR). A proposal for a global architecture
Development of a cyber security risk model using Bayesian networks
International Nuclear Information System (INIS)
Shin, Jinsoo; Son, Hanseong; Khalil ur, Rahman; Heo, Gyunyoung
2015-01-01
Cyber security is an emerging safety issue in the nuclear industry, especially in the instrumentation and control (I and C) field. To address the cyber security issue systematically, a model that can be used for cyber security evaluation is required. In this work, a cyber security risk model based on a Bayesian network is suggested for evaluating cyber security for nuclear facilities in an integrated manner. The suggested model enables the evaluation of both the procedural and technical aspects of cyber security, which are related to compliance with regulatory guides and system architectures, respectively. The activity-quality analysis model was developed to evaluate how well people and/or organizations comply with the regulatory guidance associated with cyber security. The architecture analysis model was created to evaluate vulnerabilities and mitigation measures with respect to their effect on cyber security. The two models are integrated into a single model, which is called the cyber security risk model, so that cyber security can be evaluated from procedural and technical viewpoints at the same time. The model was applied to evaluate the cyber security risk of the reactor protection system (RPS) of a research reactor and to demonstrate its usefulness and feasibility. - Highlights: • We developed the cyber security risk model can be find the weak point of cyber security integrated two cyber analysis models by using Bayesian Network. • One is the activity-quality model signifies how people and/or organization comply with the cyber security regulatory guide. • Other is the architecture model represents the probability of cyber-attack on RPS architecture. • The cyber security risk model can provide evidence that is able to determine the key element for cyber security for RPS of a research reactor
Quantum-Like Bayesian Networks for Modeling Decision Making
Directory of Open Access Journals (Sweden)
Catarina eMoreira
2016-01-01
Full Text Available In this work, we explore an alternative quantum structure to perform quantum probabilistic inferences to accommodate the paradoxical findings of the Sure Thing Principle. We propose a Quantum-Like Bayesian Network, which consists in replacing classical probabilities by quantum probability amplitudes. However, since this approach suffers from the problem of exponential growth of quantum parameters, we also propose a similarity heuristic that automatically fits quantum parameters through vector similarities. This makes the proposed model general and predictive in contrast to the current state of the art models, which cannot be generalized for more complex decision scenarios and that only provide an explanatory nature for the observed paradoxes. In the end, the model that we propose consists in a nonparametric method for estimating inference effects from a statistical point of view. It is a statistical model that is simpler than the previous quantum dynamic and quantum-like models proposed in the literature. We tested the proposed network with several empirical data from the literature, mainly from the Prisoner's Dilemma game and the Two Stage Gambling game. The results obtained show that the proposed quantum Bayesian Network is a general method that can accommodate violations of the laws of classical probability theory and make accurate predictions regarding human decision-making in these scenarios.
Prior Sensitivity Analysis in Default Bayesian Structural Equation Modeling.
van Erp, Sara; Mulder, Joris; Oberski, Daniel L
2017-11-27
Bayesian structural equation modeling (BSEM) has recently gained popularity because it enables researchers to fit complex models and solve some of the issues often encountered in classical maximum likelihood estimation, such as nonconvergence and inadmissible solutions. An important component of any Bayesian analysis is the prior distribution of the unknown model parameters. Often, researchers rely on default priors, which are constructed in an automatic fashion without requiring substantive prior information. However, the prior can have a serious influence on the estimation of the model parameters, which affects the mean squared error, bias, coverage rates, and quantiles of the estimates. In this article, we investigate the performance of three different default priors: noninformative improper priors, vague proper priors, and empirical Bayes priors-with the latter being novel in the BSEM literature. Based on a simulation study, we find that these three default BSEM methods may perform very differently, especially with small samples. A careful prior sensitivity analysis is therefore needed when performing a default BSEM analysis. For this purpose, we provide a practical step-by-step guide for practitioners to conducting a prior sensitivity analysis in default BSEM. Our recommendations are illustrated using a well-known case study from the structural equation modeling literature, and all code for conducting the prior sensitivity analysis is available in the online supplemental materials. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Bayesian semiparametric regression models to characterize molecular evolution
Directory of Open Access Journals (Sweden)
Datta Saheli
2012-10-01
Full Text Available Abstract Background Statistical models and methods that associate changes in the physicochemical properties of amino acids with natural selection at the molecular level typically do not take into account the correlations between such properties. We propose a Bayesian hierarchical regression model with a generalization of the Dirichlet process prior on the distribution of the regression coefficients that describes the relationship between the changes in amino acid distances and natural selection in protein-coding DNA sequence alignments. Results The Bayesian semiparametric approach is illustrated with simulated data and the abalone lysin sperm data. Our method identifies groups of properties which, for this particular dataset, have a similar effect on evolution. The model also provides nonparametric site-specific estimates for the strength of conservation of these properties. Conclusions The model described here is distinguished by its ability to handle a large number of amino acid properties simultaneously, while taking into account that such data can be correlated. The multi-level clustering ability of the model allows for appealing interpretations of the results in terms of properties that are roughly equivalent from the standpoint of molecular evolution.
Experimental validation of a Bayesian model of visual acuity.
LENUS (Irish Health Repository)
Dalimier, Eugénie
2009-01-01
Based on standard procedures used in optometry clinics, we compare measurements of visual acuity for 10 subjects (11 eyes tested) in the presence of natural ocular aberrations and different degrees of induced defocus, with the predictions given by a Bayesian model customized with aberrometric data of the eye. The absolute predictions of the model, without any adjustment, show good agreement with the experimental data, in terms of correlation and absolute error. The efficiency of the model is discussed in comparison with image quality metrics and other customized visual process models. An analysis of the importance and customization of each stage of the model is also given; it stresses the potential high predictive power from precise modeling of ocular and neural transfer functions.
Characterizing Drought Events from a Hydrological Model Ensemble
Smith, Katie; Parry, Simon; Prudhomme, Christel; Hannaford, Jamie; Tanguy, Maliko; Barker, Lucy; Svensson, Cecilia
2017-04-01
Hydrological droughts are a slow onset natural hazard that can affect large areas. Within the United Kingdom there have been eight major drought events over the last 50 years, with several events acting at the continental scale, and covering the entire nation. Many of these events have lasted several years and had significant impacts on agriculture, the environment and the economy. Generally in the UK, due to a northwest-southeast gradient in rainfall and relief, as well as varying underlying geology, droughts tend to be most severe in the southeast, which can threaten water supplies to the capital in London. With the impacts of climate change likely to increase the severity and duration of drought events worldwide, it is crucial that we gain an understanding of the characteristics of some of the longer and more extreme droughts of the 19th and 20th centuries, so we may utilize this information in planning for the future. Hydrological models are essential both for reconstructing such events that predate streamflow records, and for use in drought forecasting. However, whilst the uncertainties involved in modelling hydrological extremes on the flooding end of the flow regime have been studied in depth over the past few decades, the uncertainties in simulating droughts and low flow events have not yet received such rigorous academic attention. The "Cascade of Uncertainty" approach has been applied to explore uncertainty and coherence across simulations of notable drought events from the past 50 years using the airGR family of daily lumped catchment models. Parameter uncertainty has been addressed using a Latin Hypercube sampled experiment of 500,000 parameter sets per model (GR4J, GR5J and GR6J), over more than 200 catchments across the UK. The best performing model parameterisations, determined using a multi-objective function approach, have then been taken forward for use in the assessment of the impact of model parameters and model structure on drought event
Assessing global vegetation activity using spatio-temporal Bayesian modelling
Mulder, Vera L.; van Eck, Christel M.; Friedlingstein, Pierre; Regnier, Pierre A. G.
2016-04-01
This work demonstrates the potential of modelling vegetation activity using a hierarchical Bayesian spatio-temporal model. This approach allows modelling changes in vegetation and climate simultaneous in space and time. Changes of vegetation activity such as phenology are modelled as a dynamic process depending on climate variability in both space and time. Additionally, differences in observed vegetation status can be contributed to other abiotic ecosystem properties, e.g. soil and terrain properties. Although these properties do not change in time, they do change in space and may provide valuable information in addition to the climate dynamics. The spatio-temporal Bayesian models were calibrated at a regional scale because the local trends in space and time can be better captured by the model. The regional subsets were defined according to the SREX segmentation, as defined by the IPCC. Each region is considered being relatively homogeneous in terms of large-scale climate and biomes, still capturing small-scale (grid-cell level) variability. Modelling within these regions is hence expected to be less uncertain due to the absence of these large-scale patterns, compared to a global approach. This overall modelling approach allows the comparison of model behavior for the different regions and may provide insights on the main dynamic processes driving the interaction between vegetation and climate within different regions. The data employed in this study encompasses the global datasets for soil properties (SoilGrids), terrain properties (Global Relief Model based on SRTM DEM and ETOPO), monthly time series of satellite-derived vegetation indices (GIMMS NDVI3g) and climate variables (Princeton Meteorological Forcing Dataset). The findings proved the potential of a spatio-temporal Bayesian modelling approach for assessing vegetation dynamics, at a regional scale. The observed interrelationships of the employed data and the different spatial and temporal trends support
Theory-based Bayesian models of inductive learning and reasoning.
Tenenbaum, Joshua B; Griffiths, Thomas L; Kemp, Charles
2006-07-01
Inductive inference allows humans to make powerful generalizations from sparse data when learning about word meanings, unobserved properties, causal relationships, and many other aspects of the world. Traditional accounts of induction emphasize either the power of statistical learning, or the importance of strong constraints from structured domain knowledge, intuitive theories or schemas. We argue that both components are necessary to explain the nature, use and acquisition of human knowledge, and we introduce a theory-based Bayesian framework for modeling inductive learning and reasoning as statistical inferences over structured knowledge representations.
Factors affecting GEBV accuracy with single-step Bayesian models.
Zhou, Lei; Mrode, Raphael; Zhang, Shengli; Zhang, Qin; Li, Bugao; Liu, Jian-Feng
2018-01-01
A single-step approach to obtain genomic prediction was first proposed in 2009. Many studies have investigated the components of GEBV accuracy in genomic selection. However, it is still unclear how the population structure and the relationships between training and validation populations influence GEBV accuracy in terms of single-step analysis. Here, we explored the components of GEBV accuracy in single-step Bayesian analysis with a simulation study. Three scenarios with various numbers of QTL (5, 50, and 500) were simulated. Three models were implemented to analyze the simulated data: single-step genomic best linear unbiased prediction (GBLUP; SSGBLUP), single-step BayesA (SS-BayesA), and single-step BayesB (SS-BayesB). According to our results, GEBV accuracy was influenced by the relationships between the training and validation populations more significantly for ungenotyped animals than for genotyped animals. SS-BayesA/BayesB showed an obvious advantage over SSGBLUP with the scenarios of 5 and 50 QTL. SS-BayesB model obtained the lowest accuracy with the 500 QTL in the simulation. SS-BayesA model was the most efficient and robust considering all QTL scenarios. Generally, both the relationships between training and validation populations and LD between markers and QTL contributed to GEBV accuracy in the single-step analysis, and the advantages of single-step Bayesian models were more apparent when the trait is controlled by fewer QTL.
Bayesian Sensitivity Analysis of Statistical Models with Missing Data.
Zhu, Hongtu; Ibrahim, Joseph G; Tang, Niansheng
2014-04-01
Methods for handling missing data depend strongly on the mechanism that generated the missing values, such as missing completely at random (MCAR) or missing at random (MAR), as well as other distributional and modeling assumptions at various stages. It is well known that the resulting estimates and tests may be sensitive to these assumptions as well as to outlying observations. In this paper, we introduce various perturbations to modeling assumptions and individual observations, and then develop a formal sensitivity analysis to assess these perturbations in the Bayesian analysis of statistical models with missing data. We develop a geometric framework, called the Bayesian perturbation manifold, to characterize the intrinsic structure of these perturbations. We propose several intrinsic influence measures to perform sensitivity analysis and quantify the effect of various perturbations to statistical models. We use the proposed sensitivity analysis procedure to systematically investigate the tenability of the non-ignorable missing at random (NMAR) assumption. Simulation studies are conducted to evaluate our methods, and a dataset is analyzed to illustrate the use of our diagnostic measures.
Approximate Bayesian computation for forward modeling in cosmology
International Nuclear Information System (INIS)
Akeret, Joël; Refregier, Alexandre; Amara, Adam; Seehars, Sebastian; Hasner, Caspar
2015-01-01
Bayesian inference is often used in cosmology and astrophysics to derive constraints on model parameters from observations. This approach relies on the ability to compute the likelihood of the data given a choice of model parameters. In many practical situations, the likelihood function may however be unavailable or intractable due to non-gaussian errors, non-linear measurements processes, or complex data formats such as catalogs and maps. In these cases, the simulation of mock data sets can often be made through forward modeling. We discuss how Approximate Bayesian Computation (ABC) can be used in these cases to derive an approximation to the posterior constraints using simulated data sets. This technique relies on the sampling of the parameter set, a distance metric to quantify the difference between the observation and the simulations and summary statistics to compress the information in the data. We first review the principles of ABC and discuss its implementation using a Population Monte-Carlo (PMC) algorithm and the Mahalanobis distance metric. We test the performance of the implementation using a Gaussian toy model. We then apply the ABC technique to the practical case of the calibration of image simulations for wide field cosmological surveys. We find that the ABC analysis is able to provide reliable parameter constraints for this problem and is therefore a promising technique for other applications in cosmology and astrophysics. Our implementation of the ABC PMC method is made available via a public code release
On-line Bayesian model updating for structural health monitoring
Rocchetta, Roberto; Broggi, Matteo; Huchet, Quentin; Patelli, Edoardo
2018-03-01
Fatigue induced cracks is a dangerous failure mechanism which affects mechanical components subject to alternating load cycles. System health monitoring should be adopted to identify cracks which can jeopardise the structure. Real-time damage detection may fail in the identification of the cracks due to different sources of uncertainty which have been poorly assessed or even fully neglected. In this paper, a novel efficient and robust procedure is used for the detection of cracks locations and lengths in mechanical components. A Bayesian model updating framework is employed, which allows accounting for relevant sources of uncertainty. The idea underpinning the approach is to identify the most probable crack consistent with the experimental measurements. To tackle the computational cost of the Bayesian approach an emulator is adopted for replacing the computationally costly Finite Element model. To improve the overall robustness of the procedure, different numerical likelihoods, measurement noises and imprecision in the value of model parameters are analysed and their effects quantified. The accuracy of the stochastic updating and the efficiency of the numerical procedure are discussed. An experimental aluminium frame and on a numerical model of a typical car suspension arm are used to demonstrate the applicability of the approach.
Bayesian statistic methods and theri application in probabilistic simulation models
Directory of Open Access Journals (Sweden)
Sergio Iannazzo
2007-03-01
Full Text Available Bayesian statistic methods are facing a rapidly growing level of interest and acceptance in the field of health economics. The reasons of this success are probably to be found on the theoretical fundaments of the discipline that make these techniques more appealing to decision analysis. To this point should be added the modern IT progress that has developed different flexible and powerful statistical software framework. Among them probably one of the most noticeably is the BUGS language project and its standalone application for MS Windows WinBUGS. Scope of this paper is to introduce the subject and to show some interesting applications of WinBUGS in developing complex economical models based on Markov chains. The advantages of this approach reside on the elegance of the code produced and in its capability to easily develop probabilistic simulations. Moreover an example of the integration of bayesian inference models in a Markov model is shown. This last feature let the analyst conduce statistical analyses on the available sources of evidence and exploit them directly as inputs in the economic model.
Bayesian calibration of power plant models for accurate performance prediction
International Nuclear Information System (INIS)
Boksteen, Sowande Z.; Buijtenen, Jos P. van; Pecnik, Rene; Vecht, Dick van der
2014-01-01
Highlights: • Bayesian calibration is applied to power plant performance prediction. • Measurements from a plant in operation are used for model calibration. • A gas turbine performance model and steam cycle model are calibrated. • An integrated plant model is derived. • Part load efficiency is accurately predicted as a function of ambient conditions. - Abstract: Gas turbine combined cycles are expected to play an increasingly important role in the balancing of supply and demand in future energy markets. Thermodynamic modeling of these energy systems is frequently applied to assist in decision making processes related to the management of plant operation and maintenance. In most cases, model inputs, parameters and outputs are treated as deterministic quantities and plant operators make decisions with limited or no regard of uncertainties. As the steady integration of wind and solar energy into the energy market induces extra uncertainties, part load operation and reliability are becoming increasingly important. In the current study, methods are proposed to not only quantify various types of uncertainties in measurements and plant model parameters using measured data, but to also assess their effect on various aspects of performance prediction. The authors aim to account for model parameter and measurement uncertainty, and for systematic discrepancy of models with respect to reality. For this purpose, the Bayesian calibration framework of Kennedy and O’Hagan is used, which is especially suitable for high-dimensional industrial problems. The article derives a calibrated model of the plant efficiency as a function of ambient conditions and operational parameters, which is also accurate in part load. The article shows that complete statistical modeling of power plants not only enhances process models, but can also increases confidence in operational decisions
A comparison of resampling schemes for estimating model observer performance with small ensembles
Elshahaby, Fatma E. A.; Jha, Abhinav K.; Ghaly, Michael; Frey, Eric C.
2017-09-01
In objective assessment of image quality, an ensemble of images is used to compute the 1st and 2nd order statistics of the data. Often, only a finite number of images is available, leading to the issue of statistical variability in numerical observer performance. Resampling-based strategies can help overcome this issue. In this paper, we compared different combinations of resampling schemes (the leave-one-out (LOO) and the half-train/half-test (HT/HT)) and model observers (the conventional channelized Hotelling observer (CHO), channelized linear discriminant (CLD) and channelized quadratic discriminant). Observer performance was quantified by the area under the ROC curve (AUC). For a binary classification task and for each observer, the AUC value for an ensemble size of 2000 samples per class served as a gold standard for that observer. Results indicated that each observer yielded a different performance depending on the ensemble size and the resampling scheme. For a small ensemble size, the combination [CHO, HT/HT] had more accurate rankings than the combination [CHO, LOO]. Using the LOO scheme, the CLD and CHO had similar performance for large ensembles. However, the CLD outperformed the CHO and gave more accurate rankings for smaller ensembles. As the ensemble size decreased, the performance of the [CHO, LOO] combination seriously deteriorated as opposed to the [CLD, LOO] combination. Thus, it might be desirable to use the CLD with the LOO scheme when smaller ensemble size is available.
Bayesian signal processing classical, modern, and particle filtering methods
Candy, James V
2016-01-01
This book aims to give readers a unified Bayesian treatment starting from the basics (Baye's rule) to the more advanced (Monte Carlo sampling), evolving to the next-generation model-based techniques (sequential Monte Carlo sampling). This next edition incorporates a new chapter on "Sequential Bayesian Detection," a new section on "Ensemble Kalman Filters" as well as an expansion of Case Studies that detail Bayesian solutions for a variety of applications. These studies illustrate Bayesian approaches to real-world problems incorporating detailed particle filter designs, adaptive particle filters and sequential Bayesian detectors. In addition to these major developments a variety of sections are expanded to "fill-in-the gaps" of the first edition. Here metrics for particle filter (PF) designs with emphasis on classical "sanity testing" lead to ensemble techniques as a basic requirement for performance analysis. The expansion of information theory metrics and their application to PF designs is fully developed an...
Model Selection in Historical Research Using Approximate Bayesian Computation
Rubio-Campillo, Xavier
2016-01-01
Formal Models and History Computational models are increasingly being used to study historical dynamics. This new trend, which could be named Model-Based History, makes use of recently published datasets and innovative quantitative methods to improve our understanding of past societies based on their written sources. The extensive use of formal models allows historians to re-evaluate hypotheses formulated decades ago and still subject to debate due to the lack of an adequate quantitative framework. The initiative has the potential to transform the discipline if it solves the challenges posed by the study of historical dynamics. These difficulties are based on the complexities of modelling social interaction, and the methodological issues raised by the evaluation of formal models against data with low sample size, high variance and strong fragmentation. Case Study This work examines an alternate approach to this evaluation based on a Bayesian-inspired model selection method. The validity of the classical Lanchester’s laws of combat is examined against a dataset comprising over a thousand battles spanning 300 years. Four variations of the basic equations are discussed, including the three most common formulations (linear, squared, and logarithmic) and a new variant introducing fatigue. Approximate Bayesian Computation is then used to infer both parameter values and model selection via Bayes Factors. Impact Results indicate decisive evidence favouring the new fatigue model. The interpretation of both parameter estimations and model selection provides new insights into the factors guiding the evolution of warfare. At a methodological level, the case study shows how model selection methods can be used to guide historical research through the comparison between existing hypotheses and empirical evidence. PMID:26730953
An Active Lattice Model in a Bayesian Framework
DEFF Research Database (Denmark)
Carstensen, Jens Michael
1996-01-01
A Markov Random Field is used as a structural model of a deformable rectangular lattice. When used as a template prior in a Bayesian framework this model is powerful for making inferences about lattice structures in images. The model assigns maximum probability to the perfect regular lattice...... by penalizing deviations in alignment and lattice node distance. The Markov random field represents prior knowledge about the lattice structure, and through an observation model that incorporates the visual appearance of the nodes, we can simulate realizations from the posterior distribution. A maximum...... a posteriori (MAP) estimate, found by simulated annealing, is used as the reconstructed lattice. The model was developed as a central part of an algorithm for automatic analylsis of genetic experiments, positioned in a lattice structure by a robot. The algorithm has been successfully applied to many images...
Bayesian approach to errors-in-variables in regression models
Rozliman, Nur Aainaa; Ibrahim, Adriana Irawati Nur; Yunus, Rossita Mohammad
2017-05-01
In many applications and experiments, data sets are often contaminated with error or mismeasured covariates. When at least one of the covariates in a model is measured with error, Errors-in-Variables (EIV) model can be used. Measurement error, when not corrected, would cause misleading statistical inferences and analysis. Therefore, our goal is to examine the relationship of the outcome variable and the unobserved exposure variable given the observed mismeasured surrogate by applying the Bayesian formulation to the EIV model. We shall extend the flexible parametric method proposed by Hossain and Gustafson (2009) to another nonlinear regression model which is the Poisson regression model. We shall then illustrate the application of this approach via a simulation study using Markov chain Monte Carlo sampling methods.
Uncovering Transcriptional Regulatory Networks by Sparse Bayesian Factor Model
Directory of Open Access Journals (Sweden)
Qi Yuan(Alan
2010-01-01
Full Text Available Abstract The problem of uncovering transcriptional regulation by transcription factors (TFs based on microarray data is considered. A novel Bayesian sparse correlated rectified factor model (BSCRFM is proposed that models the unknown TF protein level activity, the correlated regulations between TFs, and the sparse nature of TF-regulated genes. The model admits prior knowledge from existing database regarding TF-regulated target genes based on a sparse prior and through a developed Gibbs sampling algorithm, a context-specific transcriptional regulatory network specific to the experimental condition of the microarray data can be obtained. The proposed model and the Gibbs sampling algorithm were evaluated on the simulated systems, and results demonstrated the validity and effectiveness of the proposed approach. The proposed model was then applied to the breast cancer microarray data of patients with Estrogen Receptor positive ( status and Estrogen Receptor negative ( status, respectively.
MODELING INFORMATION SYSTEM AVAILABILITY BY USING BAYESIAN BELIEF NETWORK APPROACH
Directory of Open Access Journals (Sweden)
Semir Ibrahimović
2016-03-01
Full Text Available Modern information systems are expected to be always-on by providing services to end-users, regardless of time and location. This is particularly important for organizations and industries where information systems support real-time operations and mission-critical applications that need to be available on 24 7 365 basis. Examples of such entities include process industries, telecommunications, healthcare, energy, banking, electronic commerce and a variety of cloud services. This article presents a modified Bayesian Belief Network model for predicting information system availability, introduced initially by Franke, U. and Johnson, P. (in article “Availability of enterprise IT systems – an expert based Bayesian model”. Software Quality Journal 20(2, 369-394, 2012 based on a thorough review of several dimensions of the information system availability, we proposed a modified set of determinants. The model is parameterized by using probability elicitation process with the participation of experts from the financial sector of Bosnia and Herzegovina. The model validation was performed using Monte Carlo simulation.
A Single-column Model Ensemble Approach Applied to the TWP-ICE Experiment
Davies, L.; Jakob, C.; Cheung, K.; DelGenio, A.; Hill, A.; Hume, T.; Keane, R. J.; Komori, T.; Larson, V. E.; Lin, Y.;
2013-01-01
Single-column models (SCM) are useful test beds for investigating the parameterization schemes of numerical weather prediction and climate models. The usefulness of SCM simulations are limited, however, by the accuracy of the best estimate large-scale observations prescribed. Errors estimating the observations will result in uncertainty in modeled simulations. One method to address the modeled uncertainty is to simulate an ensemble where the ensemble members span observational uncertainty. This study first derives an ensemble of large-scale data for the Tropical Warm Pool International Cloud Experiment (TWP-ICE) based on an estimate of a possible source of error in the best estimate product. These data are then used to carry out simulations with 11 SCM and two cloud-resolving models (CRM). Best estimate simulations are also performed. All models show that moisture-related variables are close to observations and there are limited differences between the best estimate and ensemble mean values. The models, however, show different sensitivities to changes in the forcing particularly when weakly forced. The ensemble simulations highlight important differences in the surface evaporation term of the moisture budget between the SCM and CRM. Differences are also apparent between the models in the ensemble mean vertical structure of cloud variables, while for each model, cloud properties are relatively insensitive to forcing. The ensemble is further used to investigate cloud variables and precipitation and identifies differences between CRM and SCM particularly for relationships involving ice. This study highlights the additional analysis that can be performed using ensemble simulations and hence enables a more complete model investigation compared to using the more traditional single best estimate simulation only.
Bayesian Model Comparison With the g-Prior
DEFF Research Database (Denmark)
Nielsen, Jesper Kjær; Christensen, Mads Græsbøll; Cemgil, Ali Taylan
2014-01-01
’s asymptotic MAP rule was an improvement, and in this paper we extend the work by Djuric in several ways. Speciﬁcally, we consider the elicitation of proper prior distributions, treat the case of real- and complex-valued data simultaneously in a Bayesian framework similar to that considered by Djuric......, and develop new model selection rules for a regression model containing both linear and non-linear parameters. Moreover, we use this framework to give a new interpretation of the popular information criteria and relate their performance to the signal-to-noise ratio of the data. By use of simulations, we also...... demonstrate that our proposed model comparison and selection rules outperform the traditional information criteria both in terms of detecting the true model and in terms of predicting unobserved data. The simulation code is available online....
Recursive Bayesian recurrent neural networks for time-series modeling.
Mirikitani, Derrick T; Nikolaev, Nikolay
2010-02-01
This paper develops a probabilistic approach to recursive second-order training of recurrent neural networks (RNNs) for improved time-series modeling. A general recursive Bayesian Levenberg-Marquardt algorithm is derived to sequentially update the weights and the covariance (Hessian) matrix. The main strengths of the approach are a principled handling of the regularization hyperparameters that leads to better generalization, and stable numerical performance. The framework involves the adaptation of a noise hyperparameter and local weight prior hyperparameters, which represent the noise in the data and the uncertainties in the model parameters. Experimental investigations using artificial and real-world data sets show that RNNs equipped with the proposed approach outperform standard real-time recurrent learning and extended Kalman training algorithms for recurrent networks, as well as other contemporary nonlinear neural models, on time-series modeling.
Bayesian Dose-Response Modeling in Sparse Data
Kim, Steven B.
This book discusses Bayesian dose-response modeling in small samples applied to two different settings. The first setting is early phase clinical trials, and the second setting is toxicology studies in cancer risk assessment. In early phase clinical trials, experimental units are humans who are actual patients. Prior to a clinical trial, opinions from multiple subject area experts are generally more informative than the opinion of a single expert, but we may face a dilemma when they have disagreeing prior opinions. In this regard, we consider compromising the disagreement and compare two different approaches for making a decision. In addition to combining multiple opinions, we also address balancing two levels of ethics in early phase clinical trials. The first level is individual-level ethics which reflects the perspective of trial participants. The second level is population-level ethics which reflects the perspective of future patients. We extensively compare two existing statistical methods which focus on each perspective and propose a new method which balances the two conflicting perspectives. In toxicology studies, experimental units are living animals. Here we focus on a potential non-monotonic dose-response relationship which is known as hormesis. Briefly, hormesis is a phenomenon which can be characterized by a beneficial effect at low doses and a harmful effect at high doses. In cancer risk assessments, the estimation of a parameter, which is known as a benchmark dose, can be highly sensitive to a class of assumptions, monotonicity or hormesis. In this regard, we propose a robust approach which considers both monotonicity and hormesis as a possibility. In addition, We discuss statistical hypothesis testing for hormesis and consider various experimental designs for detecting hormesis based on Bayesian decision theory. Past experiments have not been optimally designed for testing for hormesis, and some Bayesian optimal designs may not be optimal under a
Bayesian uncertainty analysis with applications to turbulence modeling
International Nuclear Information System (INIS)
Cheung, Sai Hung; Oliver, Todd A.; Prudencio, Ernesto E.; Prudhomme, Serge; Moser, Robert D.
2011-01-01
In this paper, we apply Bayesian uncertainty quantification techniques to the processes of calibrating complex mathematical models and predicting quantities of interest (QoI's) with such models. These techniques also enable the systematic comparison of competing model classes. The processes of calibration and comparison constitute the building blocks of a larger validation process, the goal of which is to accept or reject a given mathematical model for the prediction of a particular QoI for a particular scenario. In this work, we take the first step in this process by applying the methodology to the analysis of the Spalart-Allmaras turbulence model in the context of incompressible, boundary layer flows. Three competing model classes based on the Spalart-Allmaras model are formulated, calibrated against experimental data, and used to issue a prediction with quantified uncertainty. The model classes are compared in terms of their posterior probabilities and their prediction of QoI's. The model posterior probability represents the relative plausibility of a model class given the data. Thus, it incorporates the model's ability to fit experimental observations. Alternatively, comparing models using the predicted QoI connects the process to the needs of decision makers that use the results of the model. We show that by using both the model plausibility and predicted QoI, one has the opportunity to reject some model classes after calibration, before subjecting the remaining classes to additional validation challenges.
Optimal inference with suboptimal models: Addiction and active Bayesian inference
Schwartenbeck, Philipp; FitzGerald, Thomas H.B.; Mathys, Christoph; Dolan, Ray; Wurst, Friedrich; Kronbichler, Martin; Friston, Karl
2015-01-01
When casting behaviour as active (Bayesian) inference, optimal inference is defined with respect to an agent’s beliefs – based on its generative model of the world. This contrasts with normative accounts of choice behaviour, in which optimal actions are considered in relation to the true structure of the environment – as opposed to the agent’s beliefs about worldly states (or the task). This distinction shifts an understanding of suboptimal or pathological behaviour away from aberrant inference as such, to understanding the prior beliefs of a subject that cause them to behave less ‘optimally’ than our prior beliefs suggest they should behave. Put simply, suboptimal or pathological behaviour does not speak against understanding behaviour in terms of (Bayes optimal) inference, but rather calls for a more refined understanding of the subject’s generative model upon which their (optimal) Bayesian inference is based. Here, we discuss this fundamental distinction and its implications for understanding optimality, bounded rationality and pathological (choice) behaviour. We illustrate our argument using addictive choice behaviour in a recently described ‘limited offer’ task. Our simulations of pathological choices and addictive behaviour also generate some clear hypotheses, which we hope to pursue in ongoing empirical work. PMID:25561321
Sparse linear models: Variational approximate inference and Bayesian experimental design
International Nuclear Information System (INIS)
Seeger, Matthias W
2009-01-01
A wide range of problems such as signal reconstruction, denoising, source separation, feature selection, and graphical model search are addressed today by posterior maximization for linear models with sparsity-favouring prior distributions. The Bayesian posterior contains useful information far beyond its mode, which can be used to drive methods for sampling optimization (active learning), feature relevance ranking, or hyperparameter estimation, if only this representation of uncertainty can be approximated in a tractable manner. In this paper, we review recent results for variational sparse inference, and show that they share underlying computational primitives. We discuss how sampling optimization can be implemented as sequential Bayesian experimental design. While there has been tremendous recent activity to develop sparse estimation, little attendance has been given to sparse approximate inference. In this paper, we argue that many problems in practice, such as compressive sensing for real-world image reconstruction, are served much better by proper uncertainty approximations than by ever more aggressive sparse estimation algorithms. Moreover, since some variational inference methods have been given strong convex optimization characterizations recently, theoretical analysis may become possible, promising new insights into nonlinear experimental design.
Sparse linear models: Variational approximate inference and Bayesian experimental design
Energy Technology Data Exchange (ETDEWEB)
Seeger, Matthias W [Saarland University and Max Planck Institute for Informatics, Campus E1.4, 66123 Saarbruecken (Germany)
2009-12-01
A wide range of problems such as signal reconstruction, denoising, source separation, feature selection, and graphical model search are addressed today by posterior maximization for linear models with sparsity-favouring prior distributions. The Bayesian posterior contains useful information far beyond its mode, which can be used to drive methods for sampling optimization (active learning), feature relevance ranking, or hyperparameter estimation, if only this representation of uncertainty can be approximated in a tractable manner. In this paper, we review recent results for variational sparse inference, and show that they share underlying computational primitives. We discuss how sampling optimization can be implemented as sequential Bayesian experimental design. While there has been tremendous recent activity to develop sparse estimation, little attendance has been given to sparse approximate inference. In this paper, we argue that many problems in practice, such as compressive sensing for real-world image reconstruction, are served much better by proper uncertainty approximations than by ever more aggressive sparse estimation algorithms. Moreover, since some variational inference methods have been given strong convex optimization characterizations recently, theoretical analysis may become possible, promising new insights into nonlinear experimental design.
MATHEMATICAL RISK ANALYSIS: VIA NICHOLAS RISK MODEL AND BAYESIAN ANALYSIS
Directory of Open Access Journals (Sweden)
Anass BAYAGA
2010-07-01
Full Text Available The objective of this second part of a two-phased study was to explorethe predictive power of quantitative risk analysis (QRA method andprocess within Higher Education Institution (HEI. The method and process investigated the use impact analysis via Nicholas risk model and Bayesian analysis, with a sample of hundred (100 risk analysts in a historically black South African University in the greater Eastern Cape Province.The first findings supported and confirmed previous literature (KingIII report, 2009: Nicholas and Steyn, 2008: Stoney, 2007: COSA, 2004 that there was a direct relationship between risk factor, its likelihood and impact, certiris paribus. The second finding in relation to either controlling the likelihood or the impact of occurrence of risk (Nicholas risk model was that to have a brighter risk reward, it was important to control the likelihood ofoccurrence of risks as compared with its impact so to have a direct effect on entire University. On the Bayesian analysis, thus third finding, the impact of risk should be predicted along three aspects. These aspects included the human impact (decisions made, the property impact (students and infrastructural based and the business impact. Lastly, the study revealed that although in most business cases, where as business cycles considerably vary dependingon the industry and or the institution, this study revealed that, most impacts in HEI (University was within the period of one academic.The recommendation was that application of quantitative risk analysisshould be related to current legislative framework that affects HEI.
Bayesian Age-Period-Cohort Model of Lung Cancer Mortality
Directory of Open Access Journals (Sweden)
Bhikhari P. Tharu
2015-09-01
Full Text Available Background The objective of this study was to analyze the time trend for lung cancer mortality in the population of the USA by 5 years based on most recent available data namely to 2010. The knowledge of the mortality rates in the temporal trends is necessary to understand cancer burden.Methods Bayesian Age-Period-Cohort model was fitted using Poisson regression with histogram smoothing prior to decompose mortality rates based on age at death, period at death, and birth-cohort.Results Mortality rates from lung cancer increased more rapidly from age 52 years. It ended up to 325 deaths annually for 82 years on average. The mortality of younger cohorts was lower than older cohorts. The risk of lung cancer was lowered from period 1993 to recent periods.Conclusions The fitted Bayesian Age-Period-Cohort model with histogram smoothing prior is capable of explaining mortality rate of lung cancer. The reduction in carcinogens in cigarettes and increase in smoking cessation from around 1960 might led to decreasing trend of lung cancer mortality after calendar period 1993.
Bayesian modeling of recombination events in bacterial populations
Directory of Open Access Journals (Sweden)
Dowson Chris
2008-10-01
Full Text Available Abstract Background We consider the discovery of recombinant segments jointly with their origins within multilocus DNA sequences from bacteria representing heterogeneous populations of fairly closely related species. The currently available methods for recombination detection capable of probabilistic characterization of uncertainty have a limited applicability in practice as the number of strains in a data set increases. Results We introduce a Bayesian spatial structural model representing the continuum of origins over sites within the observed sequences, including a probabilistic characterization of uncertainty related to the origin of any particular site. To enable a statistically accurate and practically feasible approach to the analysis of large-scale data sets representing a single genus, we have developed a novel software tool (BRAT, Bayesian Recombination Tracker implementing the model and the corresponding learning algorithm, which is capable of identifying the posterior optimal structure and to estimate the marginal posterior probabilities of putative origins over the sites. Conclusion A multitude of challenging simulation scenarios and an analysis of real data from seven housekeeping genes of 120 strains of genus Burkholderia are used to illustrate the possibilities offered by our approach. The software is freely available for download at URL http://web.abo.fi/fak/mnf//mate/jc/software/brat.html.
With or without a conductor: Comparative analysis of leadership models in the musical ensemble
Directory of Open Access Journals (Sweden)
Kovačević Mia
2016-01-01
Full Text Available In search of innovative models of work organization and therefore the artistic process of one musical ensemble, in the last ten years musical ensembles have developed examples of non-traditional artistic-performing decisions and organizational practice. The paper is conceived as a research and analysis of the dominant models of leadership (i.e. organizing, conducting business applicable on the music ensembles and experiences of the musicians. The aim is to recognize and define leadership styles that encourage the increase of motivation and productivity of musicians within the musical ensemble. The paper will specifically investigate the relationship and differences between the two dominant models of leadership, leadership of conductor and collaborative leadership. At the same time, the paper describes and analyses an experiment that was conducted by the Ensemble Metamorphosis, which applied into their work two dominant models of leadership. In an effort to increase the motivation and productivity of musicians, Ensemble Metamorphosis also searched for a new management model of work organization and a new model of leadership. The aim of this paper was therefore to investigate the effects of leadership models that improve the artistic quality, motivation of the musicians, psychological climate and overall increase productivity of musical organization.
Multi-objective optimization for generating a weighted multi-model ensemble
Lee, H.
2017-12-01
Many studies have demonstrated that multi-model ensembles generally show better skill than each ensemble member. When generating weighted multi-model ensembles, the first step is measuring the performance of individual model simulations using observations. There is a consensus on the assignment of weighting factors based on a single evaluation metric. When considering only one evaluation metric, the weighting factor for each model is proportional to a performance score or inversely proportional to an error for the model. While this conventional approach can provide appropriate combinations of multiple models, the approach confronts a big challenge when there are multiple metrics under consideration. When considering multiple evaluation metrics, it is obvious that a simple averaging of multiple performance scores or model ranks does not address the trade-off problem between conflicting metrics. So far, there seems to be no best method to generate weighted multi-model ensembles based on multiple performance metrics. The current study applies the multi-objective optimization, a mathematical process that provides a set of optimal trade-off solutions based on a range of evaluation metrics, to combining multiple performance metrics for the global climate models and their dynamically downscaled regional climate simulations over North America and generating a weighted multi-model ensemble. NASA satellite data and the Regional Climate Model Evaluation System (RCMES) software toolkit are used for assessment of the climate simulations. Overall, the performance of each model differs markedly with strong seasonal dependence. Because of the considerable variability across the climate simulations, it is important to evaluate models systematically and make future projections by assigning optimized weighting factors to the models with relatively good performance. Our results indicate that the optimally weighted multi-model ensemble always shows better performance than an arithmetic
Li, Lu; Xu, Chong-Yu; Engeland, Kolbjørn
2013-04-01
SummaryWith respect to model calibration, parameter estimation and analysis of uncertainty sources, various regression and probabilistic approaches are used in hydrological modeling. A family of Bayesian methods, which incorporates different sources of information into a single analysis through Bayes' theorem, is widely used for uncertainty assessment. However, none of these approaches can well treat the impact of high flows in hydrological modeling. This study proposes a Bayesian modularization uncertainty assessment approach in which the highest streamflow observations are treated as suspect information that should not influence the inference of the main bulk of the model parameters. This study includes a comprehensive comparison and evaluation of uncertainty assessments by our new Bayesian modularization method and standard Bayesian methods using the Metropolis-Hastings (MH) algorithm with the daily hydrological model WASMOD. Three likelihood functions were used in combination with standard Bayesian method: the AR(1) plus Normal model independent of time (Model 1), the AR(1) plus Normal model dependent on time (Model 2) and the AR(1) plus Multi-normal model (Model 3). The results reveal that the Bayesian modularization method provides the most accurate streamflow estimates measured by the Nash-Sutcliffe efficiency and provide the best in uncertainty estimates for low, medium and entire flows compared to standard Bayesian methods. The study thus provides a new approach for reducing the impact of high flows on the discharge uncertainty assessment of hydrological models via Bayesian method.
Lu, Xiuyuan; Van Roy, Benjamin
2017-01-01
Thompson sampling has emerged as an effective heuristic for a broad range of online decision problems. In its basic form, the algorithm requires computing and sampling from a posterior distribution over models, which is tractable only for simple special cases. This paper develops ensemble sampling, which aims to approximate Thompson sampling while maintaining tractability even in the face of complex models such as neural networks. Ensemble sampling dramatically expands on the range of applica...
Noodles: a tool for visualization of numerical weather model ensemble uncertainty.
Sanyal, Jibonananda; Zhang, Song; Dyer, Jamie; Mercer, Andrew; Amburn, Philip; Moorhead, Robert J
2010-01-01
Numerical weather prediction ensembles are routinely used for operational weather forecasting. The members of these ensembles are individual simulations with either slightly perturbed initial conditions or different model parameterizations, or occasionally both. Multi-member ensemble output is usually large, multivariate, and challenging to interpret interactively. Forecast meteorologists are interested in understanding the uncertainties associated with numerical weather prediction; specifically variability between the ensemble members. Currently, visualization of ensemble members is mostly accomplished through spaghetti plots of a single mid-troposphere pressure surface height contour. In order to explore new uncertainty visualization methods, the Weather Research and Forecasting (WRF) model was used to create a 48-hour, 18 member parameterization ensemble of the 13 March 1993 "Superstorm". A tool was designed to interactively explore the ensemble uncertainty of three important weather variables: water-vapor mixing ratio, perturbation potential temperature, and perturbation pressure. Uncertainty was quantified using individual ensemble member standard deviation, inter-quartile range, and the width of the 95% confidence interval. Bootstrapping was employed to overcome the dependence on normality in the uncertainty metrics. A coordinated view of ribbon and glyph-based uncertainty visualization, spaghetti plots, iso-pressure colormaps, and data transect plots was provided to two meteorologists for expert evaluation. They found it useful in assessing uncertainty in the data, especially in finding outliers in the ensemble run and therefore avoiding the WRF parameterizations that lead to these outliers. Additionally, the meteorologists could identify spatial regions where the uncertainty was significantly high, allowing for identification of poorly simulated storm environments and physical interpretation of these model issues.
Application of Bayesian Model Selection for Metal Yield Models using ALEGRA and Dakota.
Energy Technology Data Exchange (ETDEWEB)
Portone, Teresa; Niederhaus, John Henry; Sanchez, Jason James; Swiler, Laura Painton
2018-02-01
This report introduces the concepts of Bayesian model selection, which provides a systematic means of calibrating and selecting an optimal model to represent a phenomenon. This has many potential applications, including for comparing constitutive models. The ideas described herein are applied to a model selection problem between different yield models for hardened steel under extreme loading conditions.
Aggregated Residential Load Modeling Using Dynamic Bayesian Networks
Energy Technology Data Exchange (ETDEWEB)
Vlachopoulou, Maria; Chin, George; Fuller, Jason C.; Lu, Shuai
2014-09-28
Abstract—It is already obvious that the future power grid will have to address higher demand for power and energy, and to incorporate renewable resources of different energy generation patterns. Demand response (DR) schemes could successfully be used to manage and balance power supply and demand under operating conditions of the future power grid. To achieve that, more advanced tools for DR management of operations and planning are necessary that can estimate the available capacity from DR resources. In this research, a Dynamic Bayesian Network (DBN) is derived, trained, and tested that can model aggregated load of Heating, Ventilation, and Air Conditioning (HVAC) systems. DBNs can provide flexible and powerful tools for both operations and planing, due to their unique analytical capabilities. The DBN model accuracy and flexibility of use is demonstrated by testing the model under different operational scenarios.
A Bayesian Analysis of Unobserved Component Models Using Ox
Directory of Open Access Journals (Sweden)
Charles S. Bos
2011-05-01
Full Text Available This article details a Bayesian analysis of the Nile river flow data, using a similar state space model as other articles in this volume. For this data set, Metropolis-Hastings and Gibbs sampling algorithms are implemented in the programming language Ox. These Markov chain Monte Carlo methods only provide output conditioned upon the full data set. For filtered output, conditioning only on past observations, the particle filter is introduced. The sampling methods are flexible, and this advantage is used to extend the model to incorporate a stochastic volatility process. The volatility changes both in the Nile data and also in daily S&P 500 return data are investigated. The posterior density of parameters and states is found to provide information on which elements of the model are easily identifiable, and which elements are estimated with less precision.
Fast Bayesian Inference in Dirichlet Process Mixture Models.
Wang, Lianming; Dunson, David B
2011-01-01
There has been increasing interest in applying Bayesian nonparametric methods in large samples and high dimensions. As Markov chain Monte Carlo (MCMC) algorithms are often infeasible, there is a pressing need for much faster algorithms. This article proposes a fast approach for inference in Dirichlet process mixture (DPM) models. Viewing the partitioning of subjects into clusters as a model selection problem, we propose a sequential greedy search algorithm for selecting the partition. Then, when conjugate priors are chosen, the resulting posterior conditionally on the selected partition is available in closed form. This approach allows testing of parametric models versus nonparametric alternatives based on Bayes factors. We evaluate the approach using simulation studies and compare it with four other fast nonparametric methods in the literature. We apply the proposed approach to three datasets including one from a large epidemiologic study. Matlab codes for the simulation and data analyses using the proposed approach are available online in the supplemental materials.
Model-based dispersive wave processing: A recursive Bayesian solution
International Nuclear Information System (INIS)
Candy, J.V.; Chambers, D.H.
1999-01-01
Wave propagation through dispersive media represents a significant problem in many acoustic applications, especially in ocean acoustics, seismology, and nondestructive evaluation. In this paper we propose a propagation model that can easily represent many classes of dispersive waves and proceed to develop the model-based solution to the wave processing problem. It is shown that the underlying wave system is nonlinear and time-variable requiring a recursive processor. Thus the general solution to the model-based dispersive wave enhancement problem is developed using a Bayesian maximum a posteriori (MAP) approach and shown to lead to the recursive, nonlinear extended Kalman filter (EKF) processor. The problem of internal wave estimation is cast within this framework. The specific processor is developed and applied to data synthesized by a sophisticated simulator demonstrating the feasibility of this approach. copyright 1999 Acoustical Society of America.
Advances in Bayesian Model Based Clustering Using Particle Learning
Energy Technology Data Exchange (ETDEWEB)
Merl, D M
2009-11-19
Recent work by Carvalho, Johannes, Lopes and Polson and Carvalho, Lopes, Polson and Taddy introduced a sequential Monte Carlo (SMC) alternative to traditional iterative Monte Carlo strategies (e.g. MCMC and EM) for Bayesian inference for a large class of dynamic models. The basis of SMC techniques involves representing the underlying inference problem as one of state space estimation, thus giving way to inference via particle filtering. The key insight of Carvalho et al was to construct the sequence of filtering distributions so as to make use of the posterior predictive distribution of the observable, a distribution usually only accessible in certain Bayesian settings. Access to this distribution allows a reversal of the usual propagate and resample steps characteristic of many SMC methods, thereby alleviating to a large extent many problems associated with particle degeneration. Furthermore, Carvalho et al point out that for many conjugate models the posterior distribution of the static variables can be parametrized in terms of [recursively defined] sufficient statistics of the previously observed data. For models where such sufficient statistics exist, particle learning as it is being called, is especially well suited for the analysis of streaming data do to the relative invariance of its algorithmic complexity with the number of data observations. Through a particle learning approach, a statistical model can be fit to data as the data is arriving, allowing at any instant during the observation process direct quantification of uncertainty surrounding underlying model parameters. Here we describe the use of a particle learning approach for fitting a standard Bayesian semiparametric mixture model as described in Carvalho, Lopes, Polson and Taddy. In Section 2 we briefly review the previously presented particle learning algorithm for the case of a Dirichlet process mixture of multivariate normals. In Section 3 we describe several novel extensions to the original
Road network safety evaluation using Bayesian hierarchical joint model.
Wang, Jie; Huang, Helai
2016-05-01
Safety and efficiency are commonly regarded as two significant performance indicators of transportation systems. In practice, road network planning has focused on road capacity and transport efficiency whereas the safety level of a road network has received little attention in the planning stage. This study develops a Bayesian hierarchical joint model for road network safety evaluation to help planners take traffic safety into account when planning a road network. The proposed model establishes relationships between road network risk and micro-level variables related to road entities and traffic volume, as well as socioeconomic, trip generation and network density variables at macro level which are generally used for long term transportation plans. In addition, network spatial correlation between intersections and their connected road segments is also considered in the model. A road network is elaborately selected in order to compare the proposed hierarchical joint model with a previous joint model and a negative binomial model. According to the results of the model comparison, the hierarchical joint model outperforms the joint model and negative binomial model in terms of the goodness-of-fit and predictive performance, which indicates the reasonableness of considering the hierarchical data structure in crash prediction and analysis. Moreover, both random effects at the TAZ level and the spatial correlation between intersections and their adjacent segments are found to be significant, supporting the employment of the hierarchical joint model as an alternative in road-network-level safety modeling as well. Copyright © 2016 Elsevier Ltd. All rights reserved.
Bayesian Geostatistical Modeling of Malaria Indicator Survey Data in Angola
Gosoniu, Laura; Veta, Andre Mia; Vounatsou, Penelope
2010-01-01
The 2006–2007 Angola Malaria Indicator Survey (AMIS) is the first nationally representative household survey in the country assessing coverage of the key malaria control interventions and measuring malaria-related burden among children under 5 years of age. In this paper, the Angolan MIS data were analyzed to produce the first smooth map of parasitaemia prevalence based on contemporary nationwide empirical data in the country. Bayesian geostatistical models were fitted to assess the effect of interventions after adjusting for environmental, climatic and socio-economic factors. Non-linear relationships between parasitaemia risk and environmental predictors were modeled by categorizing the covariates and by employing two non-parametric approaches, the B-splines and the P-splines. The results of the model validation showed that the categorical model was able to better capture the relationship between parasitaemia prevalence and the environmental factors. Model fit and prediction were handled within a Bayesian framework using Markov chain Monte Carlo (MCMC) simulations. Combining estimates of parasitaemia prevalence with the number of children under we obtained estimates of the number of infected children in the country. The population-adjusted prevalence ranges from in Namibe province to in Malanje province. The odds of parasitaemia in children living in a household with at least ITNs per person was by 41% lower (CI: 14%, 60%) than in those with fewer ITNs. The estimates of the number of parasitaemic children produced in this paper are important for planning and implementing malaria control interventions and for monitoring the impact of prevention and control activities. PMID:20351775
Modeling Land-Use Decision Behavior with Bayesian Belief Networks
Directory of Open Access Journals (Sweden)
Inge Aalders
2008-06-01
Full Text Available The ability to incorporate and manage the different drivers of land-use change in a modeling process is one of the key challenges because they are complex and are both quantitative and qualitative in nature. This paper uses Bayesian belief networks (BBN to incorporate characteristics of land managers in the modeling process and to enhance our understanding of land-use change based on the limited and disparate sources of information. One of the two models based on spatial data represented land managers in the form of a quantitative variable, the area of individual holdings, whereas the other model included qualitative data from a survey of land managers. Random samples from the spatial data provided evidence of the relationship between the different variables, which I used to develop the BBN structure. The model was tested for four different posterior probability distributions, and results showed that the trained and learned models are better at predicting land use than the uniform and random models. The inference from the model demonstrated the constraints that biophysical characteristics impose on land managers; for older land managers without heirs, there is a higher probability of the land use being arable agriculture. The results show the benefits of incorporating a more complex notion of land managers in land-use models, and of using different empirical data sources in the modeling process. Future research should focus on incorporating more complex social processes into the modeling structure, as well as incorporating spatio-temporal dynamics in a BBN.
Evaluating Flight Crew Performance by a Bayesian Network Model
Directory of Open Access Journals (Sweden)
Wei Chen
2018-03-01
Full Text Available Flight crew performance is of great significance in keeping flights safe and sound. When evaluating the crew performance, quantitative detailed behavior information may not be available. The present paper introduces the Bayesian Network to perform flight crew performance evaluation, which permits the utilization of multidisciplinary sources of objective and subjective information, despite sparse behavioral data. In this paper, the causal factors are selected based on the analysis of 484 aviation accidents caused by human factors. Then, a network termed Flight Crew Performance Model is constructed. The Delphi technique helps to gather subjective data as a supplement to objective data from accident reports. The conditional probabilities are elicited by the leaky noisy MAX model. Two ways of inference for the BN—probability prediction and probabilistic diagnosis are used and some interesting conclusions are drawn, which could provide data support to make interventions for human error management in aviation safety.
GPU Computing in Bayesian Inference of Realized Stochastic Volatility Model
International Nuclear Information System (INIS)
Takaishi, Tetsuya
2015-01-01
The realized stochastic volatility (RSV) model that utilizes the realized volatility as additional information has been proposed to infer volatility of financial time series. We consider the Bayesian inference of the RSV model by the Hybrid Monte Carlo (HMC) algorithm. The HMC algorithm can be parallelized and thus performed on the GPU for speedup. The GPU code is developed with CUDA Fortran. We compare the computational time in performing the HMC algorithm on GPU (GTX 760) and CPU (Intel i7-4770 3.4GHz) and find that the GPU can be up to 17 times faster than the CPU. We also code the program with OpenACC and find that appropriate coding can achieve the similar speedup with CUDA Fortran
El Gharamti, M.; Bethke, I.; Tjiputra, J.; Bertino, L.
2016-02-01
Given the recent strong international focus on developing new data assimilation systems for biological models, we present in this comparative study the application of newly developed state-parameters estimation tools to an ocean ecosystem model. It is quite known that the available physical models are still too simple compared to the complexity of the ocean biology. Furthermore, various biological parameters remain poorly unknown and hence wrong specifications of such parameters can lead to large model errors. Standard joint state-parameters augmentation technique using the ensemble Kalman filter (Stochastic EnKF) has been extensively tested in many geophysical applications. Some of these assimilation studies reported that jointly updating the state and the parameters might introduce significant inconsistency especially for strongly nonlinear models. This is usually the case for ecosystem models particularly during the period of the spring bloom. A better handling of the estimation problem is often carried out by separating the update of the state and the parameters using the so-called Dual EnKF. The dual filter is computationally more expensive than the Joint EnKF but is expected to perform more accurately. Using a similar separation strategy, we propose a new EnKF estimation algorithm in which we apply a one-step-ahead smoothing to the state. The new state-parameters estimation scheme is derived in a consistent Bayesian filtering framework and results in separate update steps for the state and the parameters. Unlike the classical filtering path, the new scheme starts with an update step and later a model propagation step is performed. We test the performance of the new smoothing-based schemes against the standard EnKF in a one-dimensional configuration of the Norwegian Earth System Model (NorESM) in the North Atlantic. We use nutrients profile (up to 2000 m deep) data and surface partial CO2 measurements from Mike weather station (66o N, 2o E) to estimate
Single-particle model of a strongly driven, dense, nanoscale quantum ensemble
DiLoreto, C. S.; Rangan, C.
2018-01-01
We study the effects of interatomic interactions on the quantum dynamics of a dense, nanoscale, atomic ensemble driven by a strong electromagnetic field. We use a self-consistent, mean-field technique based on the pseudospectral time-domain method and a full, three-directional basis to solve the coupled Maxwell-Liouville equations. We find that interatomic interactions generate a decoherence in the state of an ensemble on a much faster time scale than the excited-state lifetime of individual atoms. We present a single-particle model of the driven, dense ensemble by incorporating interactions into a dephasing rate. This single-particle model reproduces the essential physics of the full simulation and is an efficient way of rapidly estimating the collective dynamics of a dense ensemble.
Clustered iterative stochastic ensemble method for multi-modal calibration of subsurface flow models
Elsheikh, Ahmed H.
2013-05-01
A novel multi-modal parameter estimation algorithm is introduced. Parameter estimation is an ill-posed inverse problem that might admit many different solutions. This is attributed to the limited amount of measured data used to constrain the inverse problem. The proposed multi-modal model calibration algorithm uses an iterative stochastic ensemble method (ISEM) for parameter estimation. ISEM employs an ensemble of directional derivatives within a Gauss-Newton iteration for nonlinear parameter estimation. ISEM is augmented with a clustering step based on k-means algorithm to form sub-ensembles. These sub-ensembles are used to explore different parts of the search space. Clusters are updated at regular intervals of the algorithm to allow merging of close clusters approaching the same local minima. Numerical testing demonstrates the potential of the proposed algorithm in dealing with multi-modal nonlinear parameter estimation for subsurface flow models. © 2013 Elsevier B.V.
Quantum ensembles of quantum classifiers.
Schuld, Maria; Petruccione, Francesco
2018-02-09
Quantum machine learning witnesses an increasing amount of quantum algorithms for data-driven decision making, a problem with potential applications ranging from automated image recognition to medical diagnosis. Many of those algorithms are implementations of quantum classifiers, or models for the classification of data inputs with a quantum computer. Following the success of collective decision making with ensembles in classical machine learning, this paper introduces the concept of quantum ensembles of quantum classifiers. Creating the ensemble corresponds to a state preparation routine, after which the quantum classifiers are evaluated in parallel and their combined decision is accessed by a single-qubit measurement. This framework naturally allows for exponentially large ensembles in which - similar to Bayesian learning - the individual classifiers do not have to be trained. As an example, we analyse an exponentially large quantum ensemble in which each classifier is weighed according to its performance in classifying the training data, leading to new results for quantum as well as classical machine learning.
Modelling of population dynamics of red king crab using Bayesian approach
Directory of Open Access Journals (Sweden)
Bakanev Sergey ...
2012-10-01
Modeling population dynamics based on the Bayesian approach enables to successfully resolve the above issues. The integration of the data from various studies into a unified model based on Bayesian parameter estimation method provides a much more detailed description of the processes occurring in the population.
Inconsistency of Bayesian Inference for Misspecified Linear Models, and a Proposal for Repairing It
Grünwald, P.; van Ommen, T.
2017-01-01
We empirically show that Bayesian inference can be inconsistent under misspecification in simple linear regression problems, both in a model averaging/selection and in a Bayesian ridge regression setting. We use the standard linear model, which assumes homoskedasticity, whereas the data are
Dynamic Bayesian Network Modeling of Game Based Diagnostic Assessments. CRESST Report 837
Levy, Roy
2014-01-01
Digital games offer an appealing environment for assessing student proficiencies, including skills and misconceptions in a diagnostic setting. This paper proposes a dynamic Bayesian network modeling approach for observations of student performance from an educational video game. A Bayesian approach to model construction, calibration, and use in…
Inconsistency of Bayesian inference for misspecified linear models, and a proposal for repairing it
P.D. Grünwald (Peter); T. van Ommen (Thijs)
2017-01-01
textabstractWe empirically show that Bayesian inference can be inconsistent under misspecification in simple linear regression problems, both in a model averaging/selection and in a Bayesian ridge regression setting. We use the standard linear model, which assumes homoskedasticity, whereas the data
BDgraph: An R Package for Bayesian Structure Learning in Graphical Models
Mohammadi, A.; Wit, E.C.
2017-01-01
Graphical models provide powerful tools to uncover complicated patterns in multivariate data and are commonly used in Bayesian statistics and machine learning. In this paper, we introduce an R package BDgraph which performs Bayesian structure learning for general undirected graphical models with
Cooper, Richard J; Krueger, Tobias; Hiscock, Kevin M; Rawlins, Barry G
2014-11-01
Mixing models have become increasingly common tools for apportioning fluvial sediment load to various sediment sources across catchments using a wide variety of Bayesian and frequentist modeling approaches. In this study, we demonstrate how different model setups can impact upon resulting source apportionment estimates in a Bayesian framework via a one-factor-at-a-time (OFAT) sensitivity analysis. We formulate 13 versions of a mixing model, each with different error assumptions and model structural choices, and apply them to sediment geochemistry data from the River Blackwater, Norfolk, UK, to apportion suspended particulate matter (SPM) contributions from three sources (arable topsoils, road verges, and subsurface material) under base flow conditions between August 2012 and August 2013. Whilst all 13 models estimate subsurface sources to be the largest contributor of SPM (median ∼76%), comparison of apportionment estimates reveal varying degrees of sensitivity to changing priors, inclusion of covariance terms, incorporation of time-variant distributions, and methods of proportion characterization. We also demonstrate differences in apportionment results between a full and an empirical Bayesian setup, and between a Bayesian and a frequentist optimization approach. This OFAT sensitivity analysis reveals that mixing model structural choices and error assumptions can significantly impact upon sediment source apportionment results, with estimated median contributions in this study varying by up to 21% between model versions. Users of mixing models are therefore strongly advised to carefully consider and justify their choice of model structure prior to conducting sediment source apportionment investigations. An OFAT sensitivity analysis of sediment fingerprinting mixing models is conductedBayesian models display high sensitivity to error assumptions and structural choicesSource apportionment results differ between Bayesian and frequentist approaches.
A Bayesian Approach for Structural Learning with Hidden Markov Models
Directory of Open Access Journals (Sweden)
Cen Li
2002-01-01
Full Text Available Hidden Markov Models(HMM have proved to be a successful modeling paradigm for dynamic and spatial processes in many domains, such as speech recognition, genomics, and general sequence alignment. Typically, in these applications, the model structures are predefined by domain experts. Therefore, the HMM learning problem focuses on the learning of the parameter values of the model to fit the given data sequences. However, when one considers other domains, such as, economics and physiology, model structure capturing the system dynamic behavior is not available. In order to successfully apply the HMM methodology in these domains, it is important that a mechanism is available for automatically deriving the model structure from the data. This paper presents a HMM learning procedure that simultaneously learns the model structure and the maximum likelihood parameter values of a HMM from data. The HMM model structures are derived based on the Bayesian model selection methodology. In addition, we introduce a new initialization procedure for HMM parameter value estimation based on the K-means clustering method. Experimental results with artificially generated data show the effectiveness of the approach.
Bayesian network models for error detection in radiotherapy plans
International Nuclear Information System (INIS)
Kalet, Alan M; Ford, Eric C; Phillips, Mark H; Gennari, John H
2015-01-01
The purpose of this study is to design and develop a probabilistic network for detecting errors in radiotherapy plans for use at the time of initial plan verification. Our group has initiated a multi-pronged approach to reduce these errors. We report on our development of Bayesian models of radiotherapy plans. Bayesian networks consist of joint probability distributions that define the probability of one event, given some set of other known information. Using the networks, we find the probability of obtaining certain radiotherapy parameters, given a set of initial clinical information. A low probability in a propagated network then corresponds to potential errors to be flagged for investigation. To build our networks we first interviewed medical physicists and other domain experts to identify the relevant radiotherapy concepts and their associated interdependencies and to construct a network topology. Next, to populate the network’s conditional probability tables, we used the Hugin Expert software to learn parameter distributions from a subset of de-identified data derived from a radiation oncology based clinical information database system. These data represent 4990 unique prescription cases over a 5 year period. Under test case scenarios with approximately 1.5% introduced error rates, network performance produced areas under the ROC curve of 0.88, 0.98, and 0.89 for the lung, brain and female breast cancer error detection networks, respectively. Comparison of the brain network to human experts performance (AUC of 0.90 ± 0.01) shows the Bayes network model performs better than domain experts under the same test conditions. Our results demonstrate the feasibility and effectiveness of comprehensive probabilistic models as part of decision support systems for improved detection of errors in initial radiotherapy plan verification procedures. (paper)
Sparse Estimation Using Bayesian Hierarchical Prior Modeling for Real and Complex Linear Models
DEFF Research Database (Denmark)
Pedersen, Niels Lovmand; Manchón, Carles Navarro; Badiu, Mihai Alin
2015-01-01
In sparse Bayesian learning (SBL), Gaussian scale mixtures (GSMs) have been used to model sparsity-inducing priors that realize a class of concave penalty functions for the regression task in real-valued signal models. Motivated by the relative scarcity of formal tools for SBL in complex-valued m......In sparse Bayesian learning (SBL), Gaussian scale mixtures (GSMs) have been used to model sparsity-inducing priors that realize a class of concave penalty functions for the regression task in real-valued signal models. Motivated by the relative scarcity of formal tools for SBL in complex...... error, and robustness in low and medium signal-to-noise ratio regimes....
Large scale Bayesian nuclear data evaluation with consistent model defects
International Nuclear Information System (INIS)
Schnabel, G
2015-01-01
The aim of nuclear data evaluation is the reliable determination of cross sections and related quantities of the atomic nuclei. To this end, evaluation methods are applied which combine the information of experiments with the results of model calculations. The evaluated observables with their associated uncertainties and correlations are assembled into data sets, which are required for the development of novel nuclear facilities, such as fusion reactors for energy supply, and accelerator driven systems for nuclear waste incineration. The efficiency and safety of such future facilities is dependent on the quality of these data sets and thus also on the reliability of the applied evaluation methods. This work investigated the performance of the majority of available evaluation methods in two scenarios. The study indicated the importance of an essential component in these methods, which is the frequently ignored deficiency of nuclear models. Usually, nuclear models are based on approximations and thus their predictions may deviate from reliable experimental data. As demonstrated in this thesis, the neglect of this possibility in evaluation methods can lead to estimates of observables which are inconsistent with experimental data. Due to this finding, an extension of Bayesian evaluation methods is proposed to take into account the deficiency of the nuclear models. The deficiency is modeled as a random function in terms of a Gaussian process and combined with the model prediction. This novel formulation conserves sum rules and allows to explicitly estimate the magnitude of model deficiency. Both features are missing in available evaluation methods so far. Furthermore, two improvements of existing methods have been developed in the course of this thesis. The first improvement concerns methods relying on Monte Carlo sampling. A Metropolis-Hastings scheme with a specific proposal distribution is suggested, which proved to be more efficient in the studied scenarios than the
Bayesian inference and model comparison for metallic fatigue data
Babuska, Ivo
2016-01-06
In this work, we present a statistical treatment of stress-life (S-N) data drawn from a collection of records of fatigue experiments that were performed on 75S-T6 aluminum alloys. Our main objective is to predict the fatigue life of materials by providing a systematic approach to model calibration, model selection and model ranking with reference to S-N data. To this purpose, we consider fatigue-limit models and random fatigue-limit models that are specially designed to allow the treatment of the run-outs (right-censored data). We first fit the models to the data by maximum likelihood methods and estimate the quantiles of the life distribution of the alloy specimen. We then compare and rank the models by classical measures of fit based on information criteria. We also consider a Bayesian approach that provides, under the prior distribution of the model parameters selected by the user, their simulation-based posterior distributions.
One-Step Dynamic Classifier Ensemble Model for Customer Value Segmentation with Missing Values
Directory of Open Access Journals (Sweden)
Jin Xiao
2014-01-01
Full Text Available Scientific customer value segmentation (CVS is the base of efficient customer relationship management, and customer credit scoring, fraud detection, and churn prediction all belong to CVS. In real CVS, the customer data usually include lots of missing values, which may affect the performance of CVS model greatly. This study proposes a one-step dynamic classifier ensemble model for missing values (ODCEM model. On the one hand, ODCEM integrates the preprocess of missing values and the classification modeling into one step; on the other hand, it utilizes multiple classifiers ensemble technology in constructing the classification models. The empirical results in credit scoring dataset “German” from UCI and the real customer churn prediction dataset “China churn” show that the ODCEM outperforms four commonly used “two-step” models and the ensemble based model LMF and can provide better decision support for market managers.
A model ensemble for projecting multi‐decadal coastal cliff retreat during the 21st century
Limber, Patrick; Barnard, Patrick; Vitousek, Sean; Erikson, Li
2018-01-01
Sea cliff retreat rates are expected to accelerate with rising sea levels during the 21st century. Here we develop an approach for a multi‐model ensemble that efficiently projects time‐averaged sea cliff retreat over multi‐decadal time scales and large (>50 km) spatial scales. The ensemble consists of five simple 1‐D models adapted from the literature that relate sea cliff retreat to wave impacts, sea level rise (SLR), historical cliff behavior, and cross‐shore profile geometry. Ensemble predictions are based on Monte Carlo simulations of each individual model, which account for the uncertainty of model parameters. The consensus of the individual models also weights uncertainty, such that uncertainty is greater when predictions from different models do not agree. A calibrated, but unvalidated, ensemble was applied to the 475 km‐long coastline of Southern California (USA), with 4 SLR scenarios of 0.5, 0.93, 1.5, and 2 m by 2100. Results suggest that future retreat rates could increase relative to mean historical rates by more than two‐fold for the higher SLR scenarios, causing an average total land loss of 19 – 41 m by 2100. However, model uncertainty ranges from +/‐ 5 – 15 m, reflecting the inherent difficulties of projecting cliff retreat over multiple decades. To enhance ensemble performance, future work could include weighting each model by its skill in matching observations in different morphological settings
iSEDfit: Bayesian spectral energy distribution modeling of galaxies
Moustakas, John
2017-08-01
iSEDfit uses Bayesian inference to extract the physical properties of galaxies from their observed broadband photometric spectral energy distribution (SED). In its default mode, the inputs to iSEDfit are the measured photometry (fluxes and corresponding inverse variances) and a measurement of the galaxy redshift. Alternatively, iSEDfit can be used to estimate photometric redshifts from the input photometry alone. After the priors have been specified, iSEDfit calculates the marginalized posterior probability distributions for the physical parameters of interest, including the stellar mass, star-formation rate, dust content, star formation history, and stellar metallicity. iSEDfit also optionally computes K-corrections and produces multiple "quality assurance" (QA) plots at each stage of the modeling procedure to aid in the interpretation of the prior parameter choices and subsequent fitting results. The software is distributed as part of the impro IDL suite.
A Bayesian modelling framework for tornado occurrences in North America.
Cheng, Vincent Y S; Arhonditsis, George B; Sills, David M L; Gough, William A; Auld, Heather
2015-03-25
Tornadoes represent one of nature's most hazardous phenomena that have been responsible for significant destruction and devastating fatalities. Here we present a Bayesian modelling approach for elucidating the spatiotemporal patterns of tornado activity in North America. Our analysis shows a significant increase in the Canadian Prairies and the Northern Great Plains during the summer, indicating a clear transition of tornado activity from the United States to Canada. The linkage between monthly-averaged atmospheric variables and likelihood of tornado events is characterized by distinct seasonality; the convective available potential energy is the predominant factor in the summer; vertical wind shear appears to have a strong signature primarily in the winter and secondarily in the summer; and storm relative environmental helicity is most influential in the spring. The present probabilistic mapping can be used to draw inference on the likelihood of tornado occurrence in any location in North America within a selected time period of the year.
Designing and testing inflationary models with Bayesian networks
Energy Technology Data Exchange (ETDEWEB)
Price, Layne C. [Carnegie Mellon Univ., Pittsburgh, PA (United States). Dept. of Physics; Auckland Univ. (New Zealand). Dept. of Physics; Peiris, Hiranya V. [Univ. College London (United Kingdom). Dept. of Physics and Astronomy; Frazer, Jonathan [DESY Hamburg (Germany). Theory Group; Univ. of the Basque Country, Bilbao (Spain). Dept. of Theoretical Physics; Basque Foundation for Science, Bilbao (Spain). IKERBASQUE; Easther, Richard [Auckland Univ. (New Zealand). Dept. of Physics
2015-11-15
Even simple inflationary scenarios have many free parameters. Beyond the variables appearing in the inflationary action, these include dynamical initial conditions, the number of fields, and couplings to other sectors. These quantities are often ignored but cosmological observables can depend on the unknown parameters. We use Bayesian networks to account for a large set of inflationary parameters, deriving generative models for the primordial spectra that are conditioned on a hierarchical set of prior probabilities describing the initial conditions, reheating physics, and other free parameters. We use N{sub f}-quadratic inflation as an illustrative example, finding that the number of e-folds N{sub *} between horizon exit for the pivot scale and the end of inflation is typically the most important parameter, even when the number of fields, their masses and initial conditions are unknown, along with possible conditional dependencies between these parameters.
Designing and testing inflationary models with Bayesian networks
Energy Technology Data Exchange (ETDEWEB)
Price, Layne C. [McWilliams Center for Cosmology, Department of Physics, Carnegie Mellon University, Pittsburgh, PA 15213 (United States); Peiris, Hiranya V. [Department of Physics and Astronomy, University College London, London WC1E 6BT (United Kingdom); Frazer, Jonathan [Deutsches Elektronen-Synchrotron DESY, Theory Group, 22603 Hamburg (Germany); Easther, Richard, E-mail: laynep@andrew.cmu.edu, E-mail: h.peiris@ucl.ac.uk, E-mail: jonathan.frazer@desy.de, E-mail: r.easther@auckland.ac.nz [Department of Physics, University of Auckland, Private Bag 92019, Auckland (New Zealand)
2016-02-01
Even simple inflationary scenarios have many free parameters. Beyond the variables appearing in the inflationary action, these include dynamical initial conditions, the number of fields, and couplings to other sectors. These quantities are often ignored but cosmological observables can depend on the unknown parameters. We use Bayesian networks to account for a large set of inflationary parameters, deriving generative models for the primordial spectra that are conditioned on a hierarchical set of prior probabilities describing the initial conditions, reheating physics, and other free parameters. We use N{sub f}-quadratic inflation as an illustrative example, finding that the number of e-folds N{sub *} between horizon exit for the pivot scale and the end of inflation is typically the most important parameter, even when the number of fields, their masses and initial conditions are unknown, along with possible conditional dependencies between these parameters.
Designing and testing inflationary models with Bayesian networks
International Nuclear Information System (INIS)
Price, Layne C.; Auckland Univ.; Peiris, Hiranya V.; Frazer, Jonathan; Univ. of the Basque Country, Bilbao; Basque Foundation for Science, Bilbao; Easther, Richard
2015-11-01
Even simple inflationary scenarios have many free parameters. Beyond the variables appearing in the inflationary action, these include dynamical initial conditions, the number of fields, and couplings to other sectors. These quantities are often ignored but cosmological observables can depend on the unknown parameters. We use Bayesian networks to account for a large set of inflationary parameters, deriving generative models for the primordial spectra that are conditioned on a hierarchical set of prior probabilities describing the initial conditions, reheating physics, and other free parameters. We use N f -quadratic inflation as an illustrative example, finding that the number of e-folds N * between horizon exit for the pivot scale and the end of inflation is typically the most important parameter, even when the number of fields, their masses and initial conditions are unknown, along with possible conditional dependencies between these parameters.
Forecast Accuracy and Economic Gains from Bayesian Model Averaging using Time Varying Weights
L.F. Hoogerheide (Lennart); R.H. Kleijn (Richard); H.K. van Dijk (Herman); M.J.C.M. Verbeek (Marno)
2009-01-01
textabstractSeveral Bayesian model combination schemes, including some novel approaches that simultaneously allow for parameter uncertainty, model uncertainty and robust time varying model weights, are compared in terms of forecast accuracy and economic gains using financial and macroeconomic time
Evidence on Features of a DSGE Business Cycle Model from Bayesian Model Averaging
R.W. Strachan (Rodney); H.K. van Dijk (Herman)
2012-01-01
textabstractThe empirical support for features of a Dynamic Stochastic General Equilibrium model with two technology shocks is valuated using Bayesian model averaging over vector autoregressions. The model features include equilibria, restrictions on long-run responses, a structural break of unknown
Bayesian nonparametric meta-analysis using Polya tree mixture models.
Branscum, Adam J; Hanson, Timothy E
2008-09-01
Summary. A common goal in meta-analysis is estimation of a single effect measure using data from several studies that are each designed to address the same scientific inquiry. Because studies are typically conducted in geographically disperse locations, recent developments in the statistical analysis of meta-analytic data involve the use of random effects models that account for study-to-study variability attributable to differences in environments, demographics, genetics, and other sources that lead to heterogeneity in populations. Stemming from asymptotic theory, study-specific summary statistics are modeled according to normal distributions with means representing latent true effect measures. A parametric approach subsequently models these latent measures using a normal distribution, which is strictly a convenient modeling assumption absent of theoretical justification. To eliminate the influence of overly restrictive parametric models on inferences, we consider a broader class of random effects distributions. We develop a novel hierarchical Bayesian nonparametric Polya tree mixture (PTM) model. We present methodology for testing the PTM versus a normal random effects model. These methods provide researchers a straightforward approach for conducting a sensitivity analysis of the normality assumption for random effects. An application involving meta-analysis of epidemiologic studies designed to characterize the association between alcohol consumption and breast cancer is presented, which together with results from simulated data highlight the performance of PTMs in the presence of nonnormality of effect measures in the source population.
Bayesian flood forecasting methods: A review
Han, Shasha; Coulibaly, Paulin
2017-08-01
Over the past few decades, floods have been seen as one of the most common and largely distributed natural disasters in the world. If floods could be accurately forecasted in advance, then their negative impacts could be greatly minimized. It is widely recognized that quantification and reduction of uncertainty associated with the hydrologic forecast is of great importance for flood estimation and rational decision making. Bayesian forecasting system (BFS) offers an ideal theoretic framework for uncertainty quantification that can be developed for probabilistic flood forecasting via any deterministic hydrologic model. It provides suitable theoretical structure, empirically validated models and reasonable analytic-numerical computation method, and can be developed into various Bayesian forecasting approaches. This paper presents a comprehensive review on Bayesian forecasting approaches applied in flood forecasting from 1999 till now. The review starts with an overview of fundamentals of BFS and recent advances in BFS, followed with BFS application in river stage forecasting and real-time flood forecasting, then move to a critical analysis by evaluating advantages and limitations of Bayesian forecasting methods and other predictive uncertainty assessment approaches in flood forecasting, and finally discusses the future research direction in Bayesian flood forecasting. Results show that the Bayesian flood forecasting approach is an effective and advanced way for flood estimation, it considers all sources of uncertainties and produces a predictive distribution of the river stage, river discharge or runoff, thus gives more accurate and reliable flood forecasts. Some emerging Bayesian forecasting methods (e.g. ensemble Bayesian forecasting system, Bayesian multi-model combination) were shown to overcome limitations of single model or fixed model weight and effectively reduce predictive uncertainty. In recent years, various Bayesian flood forecasting approaches have been
Force Sensor Based Tool Condition Monitoring Using a Heterogeneous Ensemble Learning Model
Directory of Open Access Journals (Sweden)
Guofeng Wang
2014-11-01
Full Text Available Tool condition monitoring (TCM plays an important role in improving machining efficiency and guaranteeing workpiece quality. In order to realize reliable recognition of the tool condition, a robust classifier needs to be constructed to depict the relationship between tool wear states and sensory information. However, because of the complexity of the machining process and the uncertainty of the tool wear evolution, it is hard for a single classifier to fit all the collected samples without sacrificing generalization ability. In this paper, heterogeneous ensemble learning is proposed to realize tool condition monitoring in which the support vector machine (SVM, hidden Markov model (HMM and radius basis function (RBF are selected as base classifiers and a stacking ensemble strategy is further used to reflect the relationship between the outputs of these base classifiers and tool wear states. Based on the heterogeneous ensemble learning classifier, an online monitoring system is constructed in which the harmonic features are extracted from force signals and a minimal redundancy and maximal relevance (mRMR algorithm is utilized to select the most prominent features. To verify the effectiveness of the proposed method, a titanium alloy milling experiment was carried out and samples with different tool wear states were collected to build the proposed heterogeneous ensemble learning classifier. Moreover, the homogeneous ensemble learning model and majority voting strategy are also adopted to make a comparison. The analysis and comparison results show that the proposed heterogeneous ensemble learning classifier performs better in both classification accuracy and stability.
Biological ensemble modeling to evaluate potential futures of living marine resources
DEFF Research Database (Denmark)
Gårdmark, Anna; Lindegren, Martin; Neuenfeldt, Stefan
2013-01-01
) as an example. The core of the approach is to expose an ensemble of models with different ecological assumptions to climate forcing, using multiple realizations of each climate scenario. We simulated the long-term response of cod to future fishing and climate change in seven ecological models ranging from...... model assumptions from the statistical uncertainty of future climate, and (3) identified results common for the whole model ensemble. Species interactions greatly influenced the simulated response of cod to fishing and climate, as well as the degree to which the statistical uncertainty of climate...... in all models, intense fishing prevented recovery, and climate change further decreased the cod population. Our study demonstrates how the biological ensemble modeling approach makes it possible to evaluate the relative importance of different sources of uncertainty in future species responses, as well...
Assessment of Surface Air Temperature over China Using Multi-criterion Model Ensemble Framework
Li, J.; Zhu, Q.; Su, L.; He, X.; Zhang, X.
2017-12-01
The General Circulation Models (GCMs) are designed to simulate the present climate and project future trends. It has been noticed that the performances of GCMs are not always in agreement with each other over different regions. Model ensemble techniques have been developed to post-process the GCMs' outputs and improve their prediction reliabilities. To evaluate the performances of GCMs, root-mean-square error, correlation coefficient, and uncertainty are commonly used statistical measures. However, the simultaneous achievements of these satisfactory statistics cannot be guaranteed when using many model ensemble techniques. Meanwhile, uncertainties and future scenarios are critical for Water-Energy management and operation. In this study, a new multi-model ensemble framework was proposed. It uses a state-of-art evolutionary multi-objective optimization algorithm, termed Multi-Objective Complex Evolution Global Optimization with Principle Component Analysis and Crowding Distance (MOSPD), to derive optimal GCM ensembles and demonstrate the trade-offs among various solutions. Such trade-off information was further analyzed with a robust Pareto front with respect to different statistical measures. A case study was conducted to optimize the surface air temperature (SAT) ensemble solutions over seven geographical regions of China for the historical period (1900-2005) and future projection (2006-2100). The results showed that the ensemble solutions derived with MOSPD algorithm are superior over the simple model average and any single model output during the historical simulation period. For the future prediction, the proposed ensemble framework identified that the largest SAT change would occur in the South Central China under RCP 2.6 scenario, North Eastern China under RCP 4.5 scenario, and North Western China under RCP 8.5 scenario, while the smallest SAT change would occur in the Inner Mongolia under RCP 2.6 scenario, South Central China under RCP 4.5 scenario, and
Bayesian modeling of ChIP-chip data using latent variables.
Wu, Mingqi
2009-10-26
BACKGROUND: The ChIP-chip technology has been used in a wide range of biomedical studies, such as identification of human transcription factor binding sites, investigation of DNA methylation, and investigation of histone modifications in animals and plants. Various methods have been proposed in the literature for analyzing the ChIP-chip data, such as the sliding window methods, the hidden Markov model-based methods, and Bayesian methods. Although, due to the integrated consideration of uncertainty of the models and model parameters, Bayesian methods can potentially work better than the other two classes of methods, the existing Bayesian methods do not perform satisfactorily. They usually require multiple replicates or some extra experimental information to parametrize the model, and long CPU time due to involving of MCMC simulations. RESULTS: In this paper, we propose a Bayesian latent model for the ChIP-chip data. The new model mainly differs from the existing Bayesian models, such as the joint deconvolution model, the hierarchical gamma mixture model, and the Bayesian hierarchical model, in two respects. Firstly, it works on the difference between the averaged treatment and control samples. This enables the use of a simple model for the data, which avoids the probe-specific effect and the sample (control/treatment) effect. As a consequence, this enables an efficient MCMC simulation of the posterior distribution of the model, and also makes the model more robust to the outliers. Secondly, it models the neighboring dependence of probes by introducing a latent indicator vector. A truncated Poisson prior distribution is assumed for the latent indicator variable, with the rationale being justified at length. CONCLUSION: The Bayesian latent method is successfully applied to real and ten simulated datasets, with comparisons with some of the existing Bayesian methods, hidden Markov model methods, and sliding window methods. The numerical results indicate that the
A novel Bayesian hierarchical model for road safety hotspot prediction.
Fawcett, Lee; Thorpe, Neil; Matthews, Joseph; Kremer, Karsten
2017-02-01
In this paper, we propose a Bayesian hierarchical model for predicting accident counts in future years at sites within a pool of potential road safety hotspots. The aim is to inform road safety practitioners of the location of likely future hotspots to enable a proactive, rather than reactive, approach to road safety scheme implementation. A feature of our model is the ability to rank sites according to their potential to exceed, in some future time period, a threshold accident count which may be used as a criterion for scheme implementation. Our model specification enables the classical empirical Bayes formulation - commonly used in before-and-after studies, wherein accident counts from a single before period are used to estimate counterfactual counts in the after period - to be extended to incorporate counts from multiple time periods. This allows site-specific variations in historical accident counts (e.g. locally-observed trends) to offset estimates of safety generated by a global accident prediction model (APM), which itself is used to help account for the effects of global trend and regression-to-mean (RTM). The Bayesian posterior predictive distribution is exploited to formulate predictions and to properly quantify our uncertainty in these predictions. The main contributions of our model include (i) the ability to allow accident counts from multiple time-points to inform predictions, with counts in more recent years lending more weight to predictions than counts from time-points further in the past; (ii) where appropriate, the ability to offset global estimates of trend by variations in accident counts observed locally, at a site-specific level; and (iii) the ability to account for unknown/unobserved site-specific factors which may affect accident counts. We illustrate our model with an application to accident counts at 734 potential hotspots in the German city of Halle; we also propose some simple diagnostics to validate the predictive capability of our
Bayesian Network Webserver: a comprehensive tool for biological network modeling.
Ziebarth, Jesse D; Bhattacharya, Anindya; Cui, Yan
2013-11-01
The Bayesian Network Webserver (BNW) is a platform for comprehensive network modeling of systems genetics and other biological datasets. It allows users to quickly and seamlessly upload a dataset, learn the structure of the network model that best explains the data and use the model to understand relationships between network variables. Many datasets, including those used to create genetic network models, contain both discrete (e.g. genotype) and continuous (e.g. gene expression traits) variables, and BNW allows for modeling hybrid datasets. Users of BNW can incorporate prior knowledge during structure learning through an easy-to-use structural constraint interface. After structure learning, users are immediately presented with an interactive network model, which can be used to make testable hypotheses about network relationships. BNW, including a downloadable structure learning package, is available at http://compbio.uthsc.edu/BNW. (The BNW interface for adding structural constraints uses HTML5 features that are not supported by current version of Internet Explorer. We suggest using other browsers (e.g. Google Chrome or Mozilla Firefox) when accessing BNW). ycui2@uthsc.edu. Supplementary data are available at Bioinformatics online.
Model dependence and its effect on ensemble projections in CMIP5
Abramowitz, G.; Bishop, C.
2013-12-01
Conceptually, the notion of model dependence within climate model ensembles is relatively simple - modelling groups share a literature base, parametrisations, data sets and even model code - the potential for dependence in sampling different climate futures is clear. How though can this conceptual problem inform a practical solution that demonstrably improves the ensemble mean and ensemble variance as an estimate of system uncertainty? While some research has already focused on error correlation or error covariance as a candidate to improve ensemble mean estimates, a complete definition of independence must at least implicitly subscribe to an ensemble interpretation paradigm, such as the 'truth-plus-error', 'indistinguishable', or more recently 'replicate Earth' paradigm. Using a definition of model dependence based on error covariance within the replicate Earth paradigm, this presentation will show that accounting for dependence in surface air temperature gives cooler projections in CMIP5 - by as much as 20% globally in some RCPs - although results differ significantly for each RCP, especially regionally. The fact that the change afforded by accounting for dependence across different RCPs is different is not an inconsistent result. Different numbers of submissions to each RCP by different modelling groups mean that differences in projections from different RCPs are not entirely about RCP forcing conditions - they also reflect different sampling strategies.
DEFF Research Database (Denmark)
Iglesias, J. E.; Sabuncu, M. R.; Van Leemput, Koen
2012-01-01
Many successful segmentation algorithms are based on Bayesian models in which prior anatomical knowledge is combined with the available image information. However, these methods typically have many free parameters that are estimated to obtain point estimates only, whereas a faithful Bayesian anal...
A Bayesian Approach to Person Fit Analysis in Item Response Theory Models. Research Report.
Glas, Cees A. W.; Meijer, Rob R.
A Bayesian approach to the evaluation of person fit in item response theory (IRT) models is presented. In a posterior predictive check, the observed value on a discrepancy variable is positioned in its posterior distribution. In a Bayesian framework, a Markov Chain Monte Carlo procedure can be used to generate samples of the posterior distribution…
Walter, G.M.; Augustin, Th.; Kneib, Thomas; Tutz, Gerhard
2010-01-01
The paper is concerned with Bayesian analysis under prior-data conflict, i.e. the situation when observed data are rather unexpected under the prior (and the sample size is not large enough to eliminate the influence of the prior). Two approaches for Bayesian linear regression modeling based on
Rational Irrationality: Modeling Climate Change Belief Polarization Using Bayesian Networks.
Cook, John; Lewandowsky, Stephan
2016-01-01
Belief polarization is said to occur when two people respond to the same evidence by updating their beliefs in opposite directions. This response is considered to be "irrational" because it involves contrary updating, a form of belief updating that appears to violate normatively optimal responding, as for example dictated by Bayes' theorem. In light of much evidence that people are capable of normatively optimal behavior, belief polarization presents a puzzling exception. We show that Bayesian networks, or Bayes nets, can simulate rational belief updating. When fit to experimental data, Bayes nets can help identify the factors that contribute to polarization. We present a study into belief updating concerning the reality of climate change in response to information about the scientific consensus on anthropogenic global warming (AGW). The study used representative samples of Australian and U.S. Among Australians, consensus information partially neutralized the influence of worldview, with free-market supporters showing a greater increase in acceptance of human-caused global warming relative to free-market opponents. In contrast, while consensus information overall had a positive effect on perceived consensus among U.S. participants, there was a reduction in perceived consensus and acceptance of human-caused global warming for strong supporters of unregulated free markets. Fitting a Bayes net model to the data indicated that under a Bayesian framework, free-market support is a significant driver of beliefs about climate change and trust in climate scientists. Further, active distrust of climate scientists among a small number of U.S. conservatives drives contrary updating in response to consensus information among this particular group. Copyright © 2016 Cognitive Science Society, Inc.
Model-based Bayesian signal extraction algorithm for peripheral nerves
Eggers, Thomas E.; Dweiri, Yazan M.; McCallum, Grant A.; Durand, Dominique M.
2017-10-01
Objective. Multi-channel cuff electrodes have recently been investigated for extracting fascicular-level motor commands from mixed neural recordings. Such signals could provide volitional, intuitive control over a robotic prosthesis for amputee patients. Recent work has demonstrated success in extracting these signals in acute and chronic preparations using spatial filtering techniques. These extracted signals, however, had low signal-to-noise ratios and thus limited their utility to binary classification. In this work a new algorithm is proposed which combines previous source localization approaches to create a model based method which operates in real time. Approach. To validate this algorithm, a saline benchtop setup was created to allow the precise placement of artificial sources within a cuff and interference sources outside the cuff. The artificial source was taken from five seconds of chronic neural activity to replicate realistic recordings. The proposed algorithm, hybrid Bayesian signal extraction (HBSE), is then compared to previous algorithms, beamforming and a Bayesian spatial filtering method, on this test data. An example chronic neural recording is also analyzed with all three algorithms. Main results. The proposed algorithm improved the signal to noise and signal to interference ratio of extracted test signals two to three fold, as well as increased the correlation coefficient between the original and recovered signals by 10-20%. These improvements translated to the chronic recording example and increased the calculated bit rate between the recovered signals and the recorded motor activity. Significance. HBSE significantly outperforms previous algorithms in extracting realistic neural signals, even in the presence of external noise sources. These results demonstrate the feasibility of extracting dynamic motor signals from a multi-fascicled intact nerve trunk, which in turn could extract motor command signals from an amputee for the end goal of
Improving a Deep Learning based RGB-D Object Recognition Model by Ensemble Learning
DEFF Research Database (Denmark)
Aakerberg, Andreas; Nasrollahi, Kamal; Heder, Thomas
2018-01-01
Augmenting RGB images with depth information is a well-known method to significantly improve the recognition accuracy of object recognition models. Another method to im- prove the performance of visual recognition models is ensemble learning. However, this method has not been widely explored...... in combination with deep convolutional neural network based RGB-D object recognition models. Hence, in this paper, we form different ensembles of complementary deep convolutional neural network models, and show that this can be used to increase the recognition performance beyond existing limits. Experiments...
A Bayesian Model of Category-Specific Emotional Brain Responses
Wager, Tor D.; Kang, Jian; Johnson, Timothy D.; Nichols, Thomas E.; Satpute, Ajay B.; Barrett, Lisa Feldman
2015-01-01
Understanding emotion is critical for a science of healthy and disordered brain function, but the neurophysiological basis of emotional experience is still poorly understood. We analyzed human brain activity patterns from 148 studies of emotion categories (2159 total participants) using a novel hierarchical Bayesian model. The model allowed us to classify which of five categories—fear, anger, disgust, sadness, or happiness—is engaged by a study with 66% accuracy (43-86% across categories). Analyses of the activity patterns encoded in the model revealed that each emotion category is associated with unique, prototypical patterns of activity across multiple brain systems including the cortex, thalamus, amygdala, and other structures. The results indicate that emotion categories are not contained within any one region or system, but are represented as configurations across multiple brain networks. The model provides a precise summary of the prototypical patterns for each emotion category, and demonstrates that a sufficient characterization of emotion categories relies on (a) differential patterns of involvement in neocortical systems that differ between humans and other species, and (b) distinctive patterns of cortical-subcortical interactions. Thus, these findings are incompatible with several contemporary theories of emotion, including those that emphasize emotion-dedicated brain systems and those that propose emotion is localized primarily in subcortical activity. They are consistent with componential and constructionist views, which propose that emotions are differentiated by a combination of perceptual, mnemonic, prospective, and motivational elements. Such brain-based models of emotion provide a foundation for new translational and clinical approaches. PMID:25853490
Bayesian calibration of the Community Land Model using surrogates
Energy Technology Data Exchange (ETDEWEB)
Ray, Jaideep; Hou, Zhangshuan; Huang, Maoyi; Swiler, Laura Painton
2014-02-01
We present results from the Bayesian calibration of hydrological parameters of the Community Land Model (CLM), which is often used in climate simulations and Earth system models. A statistical inverse problem is formulated for three hydrological parameters, conditional on observations of latent heat surface fluxes over 48 months. Our calibration method uses polynomial and Gaussian process surrogates of the CLM, and solves the parameter estimation problem using a Markov chain Monte Carlo sampler. Posterior probability densities for the parameters are developed for two sites with different soil and vegetation covers. Our method also allows us to examine the structural error in CLM under two error models. We find that surrogate models can be created for CLM in most cases. The posterior distributions are more predictive than the default parameter values in CLM. Climatologically averaging the observations does not modify the parameters' distributions significantly. The structural error model reveals a correlation time-scale which can be used to identify the physical process that could be contributing to it. While the calibrated CLM has a higher predictive skill, the calibration is under-dispersive.
Bayesian Belief Networks Approach for Modeling Irrigation Behavior
Andriyas, S.; McKee, M.
2012-12-01
Canal operators need information to manage water deliveries to irrigators. Short-term irrigation demand forecasts can potentially valuable information for a canal operator who must manage an on-demand system. Such forecasts could be generated by using information about the decision-making processes of irrigators. Bayesian models of irrigation behavior can provide insight into the likely criteria which farmers use to make irrigation decisions. This paper develops a Bayesian belief network (BBN) to learn irrigation decision-making behavior of farmers and utilizes the resulting model to make forecasts of future irrigation decisions based on factor interaction and posterior probabilities. Models for studying irrigation behavior have been rarely explored in the past. The model discussed here was built from a combination of data about biotic, climatic, and edaphic conditions under which observed irrigation decisions were made. The paper includes a case study using data collected from the Canal B region of the Sevier River, near Delta, Utah. Alfalfa, barley and corn are the main crops of the location. The model has been tested with a portion of the data to affirm the model predictive capabilities. Irrigation rules were deduced in the process of learning and verified in the testing phase. It was found that most of the farmers used consistent rules throughout all years and across different types of crops. Soil moisture stress, which indicates the level of water available to the plant in the soil profile, was found to be one of the most significant likely driving forces for irrigation. Irrigations appeared to be triggered by a farmer's perception of soil stress, or by a perception of combined factors such as information about a neighbor irrigating or an apparent preference to irrigate on a weekend. Soil stress resulted in irrigation probabilities of 94.4% for alfalfa. With additional factors like weekend and irrigating when a neighbor irrigates, alfalfa irrigation
Bayesian Genomic Prediction with Genotype × Environment Interaction Kernel Models
Cuevas, Jaime; Crossa, José; Montesinos-López, Osval A.; Burgueño, Juan; Pérez-Rodríguez, Paulino; de los Campos, Gustavo
2016-01-01
The phenomenon of genotype × environment (G × E) interaction in plant breeding decreases selection accuracy, thereby negatively affecting genetic gains. Several genomic prediction models incorporating G × E have been recently developed and used in genomic selection of plant breeding programs. Genomic prediction models for assessing multi-environment G × E interaction are extensions of a single-environment model, and have advantages and limitations. In this study, we propose two multi-environment Bayesian genomic models: the first model considers genetic effects (u) that can be assessed by the Kronecker product of variance–covariance matrices of genetic correlations between environments and genomic kernels through markers under two linear kernel methods, linear (genomic best linear unbiased predictors, GBLUP) and Gaussian (Gaussian kernel, GK). The other model has the same genetic component as the first model (u) plus an extra component, f, that captures random effects between environments that were not captured by the random effects u. We used five CIMMYT data sets (one maize and four wheat) that were previously used in different studies. Results show that models with G × E always have superior prediction ability than single-environment models, and the higher prediction ability of multi-environment models with u and f over the multi-environment model with only u occurred 85% of the time with GBLUP and 45% of the time with GK across the five data sets. The latter result indicated that including the random effect f is still beneficial for increasing prediction ability after adjusting by the random effect u. PMID:27793970
Bayesian Genomic Prediction with Genotype × Environment Interaction Kernel Models
Directory of Open Access Journals (Sweden)
Jaime Cuevas
2017-01-01
Full Text Available The phenomenon of genotype × environment (G × E interaction in plant breeding decreases selection accuracy, thereby negatively affecting genetic gains. Several genomic prediction models incorporating G × E have been recently developed and used in genomic selection of plant breeding programs. Genomic prediction models for assessing multi-environment G × E interaction are extensions of a single-environment model, and have advantages and limitations. In this study, we propose two multi-environment Bayesian genomic models: the first model considers genetic effects ( u that can be assessed by the Kronecker product of variance–covariance matrices of genetic correlations between environments and genomic kernels through markers under two linear kernel methods, linear (genomic best linear unbiased predictors, GBLUP and Gaussian (Gaussian kernel, GK. The other model has the same genetic component as the first model ( u plus an extra component, f, that captures random effects between environments that were not captured by the random effects u . We used five CIMMYT data sets (one maize and four wheat that were previously used in different studies. Results show that models with G × E always have superior prediction ability than single-environment models, and the higher prediction ability of multi-environment models with u and f over the multi-environment model with only u occurred 85% of the time with GBLUP and 45% of the time with GK across the five data sets. The latter result indicated that including the random effect f is still beneficial for increasing prediction ability after adjusting by the random effect u .
Bayesian Genomic Prediction with Genotype × Environment Interaction Kernel Models.
Cuevas, Jaime; Crossa, José; Montesinos-López, Osval A; Burgueño, Juan; Pérez-Rodríguez, Paulino; de Los Campos, Gustavo
2017-01-05
The phenomenon of genotype × environment (G × E) interaction in plant breeding decreases selection accuracy, thereby negatively affecting genetic gains. Several genomic prediction models incorporating G × E have been recently developed and used in genomic selection of plant breeding programs. Genomic prediction models for assessing multi-environment G × E interaction are extensions of a single-environment model, and have advantages and limitations. In this study, we propose two multi-environment Bayesian genomic models: the first model considers genetic effects [Formula: see text] that can be assessed by the Kronecker product of variance-covariance matrices of genetic correlations between environments and genomic kernels through markers under two linear kernel methods, linear (genomic best linear unbiased predictors, GBLUP) and Gaussian (Gaussian kernel, GK). The other model has the same genetic component as the first model [Formula: see text] plus an extra component, F: , that captures random effects between environments that were not captured by the random effects [Formula: see text] We used five CIMMYT data sets (one maize and four wheat) that were previously used in different studies. Results show that models with G × E always have superior prediction ability than single-environment models, and the higher prediction ability of multi-environment models with [Formula: see text] over the multi-environment model with only u occurred 85% of the time with GBLUP and 45% of the time with GK across the five data sets. The latter result indicated that including the random effect f is still beneficial for increasing prediction ability after adjusting by the random effect [Formula: see text]. Copyright © 2017 Cuevas et al.
Analyzing the impact of changing size and composition of a crop model ensemble
Rodríguez, Alfredo
2017-04-01
The use of an ensemble of crop growth simulation models is a practice recently adopted in order to quantify aspects of uncertainties in model simulations. Yet, while the climate modelling community has extensively investigated the properties of model ensembles and their implications, this has hardly been investigated for crop model ensembles (Wallach et al., 2016). In their ensemble of 27 wheat models, Martre et al. (2015) found that the accuracy of the multi-model ensemble-average only increases up to an ensemble size of ca. 10, but does not improve when including more models in the analysis. However, even when this number of members is reached, questions about the impact of the addition or removal of a member to/from the ensemble arise. When selecting ensemble members, identifying members with poor performance or giving implausible results can make a large difference on the outcome. The objective of this study is to set up a methodology that defines indicators to show the effects of changing the ensemble composition and size on simulation results, when a selection procedure of ensemble members is applied. Ensemble mean or median, and variance are measures used to depict ensemble results among other indicators. We are utilizing simulations from an ensemble of wheat models that have been used to construct impact response surfaces (Pirttioja et al., 2015) (IRSs). These show the response of an impact variable (e.g., crop yield) to systematic changes in two explanatory variables (e.g., precipitation and temperature). Using these, we compare different sub-ensembles in terms of the mean, median and spread, and also by comparing IRSs. The methodology developed here allows comparing an ensemble before and after applying any procedure that changes the ensemble composition and size by measuring the impact of this decision on the ensemble central tendency measures. The methodology could also be further developed to compare the effect of changing ensemble composition and size
DEFF Research Database (Denmark)
Salarzadeh Jenatabadi, Hashem; Babashamsi, Peyman; Khajeheian, Datis
2016-01-01
There are many factors which could inﬂuence the sustainability of airlines. The main purpose of this study is to introduce a framework for a financial sustainability index and model it based on structural equation modeling (SEM) with maximum likelihood and Bayesian predictors. The introduced...
Bayesian inference for hybrid discrete-continuous stochastic kinetic models
International Nuclear Information System (INIS)
Sherlock, Chris; Golightly, Andrew; Gillespie, Colin S
2014-01-01
We consider the problem of efficiently performing simulation and inference for stochastic kinetic models. Whilst it is possible to work directly with the resulting Markov jump process (MJP), computational cost can be prohibitive for networks of realistic size and complexity. In this paper, we consider an inference scheme based on a novel hybrid simulator that classifies reactions as either ‘fast’ or ‘slow’ with fast reactions evolving as a continuous Markov process whilst the remaining slow reaction occurrences are modelled through a MJP with time-dependent hazards. A linear noise approximation (LNA) of fast reaction dynamics is employed and slow reaction events are captured by exploiting the ability to solve the stochastic differential equation driving the LNA. This simulation procedure is used as a proposal mechanism inside a particle MCMC scheme, thus allowing Bayesian inference for the model parameters. We apply the scheme to a simple application and compare the output with an existing hybrid approach and also a scheme for performing inference for the underlying discrete stochastic model. (paper)
Bayesian model calibration of ramp compression experiments on Z
Brown, Justin; Hund, Lauren
2017-06-01
Bayesian model calibration (BMC) is a statistical framework to estimate inputs for a computational model in the presence of multiple uncertainties, making it well suited to dynamic experiments which must be coupled with numerical simulations to interpret the results. Often, dynamic experiments are diagnosed using velocimetry and this output can be modeled using a hydrocode. Several calibration issues unique to this type of scenario including the functional nature of the output, uncertainty of nuisance parameters within the simulation, and model discrepancy identifiability are addressed, and a novel BMC process is proposed. As a proof of concept, we examine experiments conducted on Sandia National Laboratories' Z-machine which ramp compressed tantalum to peak stresses of 250 GPa. The proposed BMC framework is used to calibrate the cold curve of Ta (with uncertainty), and we conclude that the procedure results in simple, fast, and valid inferences. Sandia National Laboratories is a multi-mission laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.
Improving default risk prediction using Bayesian model uncertainty techniques.
Kazemi, Reza; Mosleh, Ali
2012-11-01
Credit risk is the potential exposure of a creditor to an obligor's failure or refusal to repay the debt in principal or interest. The potential of exposure is measured in terms of probability of default. Many models have been developed to estimate credit risk, with rating agencies dating back to the 19th century. They provide their assessment of probability of default and transition probabilities of various firms in their annual reports. Regulatory capital requirements for credit risk outlined by the Basel Committee on Banking Supervision have made it essential for banks and financial institutions to develop sophisticated models in an attempt to measure credit risk with higher accuracy. The Bayesian framework proposed in this article uses the techniques developed in physical sciences and engineering for dealing with model uncertainty and expert accuracy to obtain improved estimates of credit risk and associated uncertainties. The approach uses estimates from one or more rating agencies and incorporates their historical accuracy (past performance data) in estimating future default risk and transition probabilities. Several examples demonstrate that the proposed methodology can assess default probability with accuracy exceeding the estimations of all the individual models. Moreover, the methodology accounts for potentially significant departures from "nominal predictions" due to "upsetting events" such as the 2008 global banking crisis. © 2012 Society for Risk Analysis.
Bayesian Hierarchical Random Effects Models in Forensic Science
Directory of Open Access Journals (Sweden)
Colin G. G. Aitken
2018-04-01
Full Text Available Statistical modeling of the evaluation of evidence with the use of the likelihood ratio has a long history. It dates from the Dreyfus case at the end of the nineteenth century through the work at Bletchley Park in the Second World War to the present day. The development received a significant boost in 1977 with a seminal work by Dennis Lindley which introduced a Bayesian hierarchical random effects model for the evaluation of evidence with an example of refractive index measurements on fragments of glass. Many models have been developed since then. The methods have now been sufficiently well-developed and have become so widespread that it is timely to try and provide a software package to assist in their implementation. With that in mind, a project (SAILR: Software for the Analysis and Implementation of Likelihood Ratios was funded by the European Network of Forensic Science Institutes through their Monopoly programme to develop a software package for use by forensic scientists world-wide that would assist in the statistical analysis and implementation of the approach based on likelihood ratios. It is the purpose of this document to provide a short review of a small part of this history. The review also provides a background, or landscape, for the development of some of the models within the SAILR package and references to SAILR as made as appropriate.
Bayesian Hierarchical Random Effects Models in Forensic Science.
Aitken, Colin G G
2018-01-01
Statistical modeling of the evaluation of evidence with the use of the likelihood ratio has a long history. It dates from the Dreyfus case at the end of the nineteenth century through the work at Bletchley Park in the Second World War to the present day. The development received a significant boost in 1977 with a seminal work by Dennis Lindley which introduced a Bayesian hierarchical random effects model for the evaluation of evidence with an example of refractive index measurements on fragments of glass. Many models have been developed since then. The methods have now been sufficiently well-developed and have become so widespread that it is timely to try and provide a software package to assist in their implementation. With that in mind, a project (SAILR: Software for the Analysis and Implementation of Likelihood Ratios) was funded by the European Network of Forensic Science Institutes through their Monopoly programme to develop a software package for use by forensic scientists world-wide that would assist in the statistical analysis and implementation of the approach based on likelihood ratios. It is the purpose of this document to provide a short review of a small part of this history. The review also provides a background, or landscape, for the development of some of the models within the SAILR package and references to SAILR as made as appropriate.
Assessing fit in Bayesian models for spatial processes
Jun, M.
2014-09-16
© 2014 John Wiley & Sons, Ltd. Gaussian random fields are frequently used to model spatial and spatial-temporal data, particularly in geostatistical settings. As much of the attention of the statistics community has been focused on defining and estimating the mean and covariance functions of these processes, little effort has been devoted to developing goodness-of-fit tests to allow users to assess the models\\' adequacy. We describe a general goodness-of-fit test and related graphical diagnostics for assessing the fit of Bayesian Gaussian process models using pivotal discrepancy measures. Our method is applicable for both regularly and irregularly spaced observation locations on planar and spherical domains. The essential idea behind our method is to evaluate pivotal quantities defined for a realization of a Gaussian random field at parameter values drawn from the posterior distribution. Because the nominal distribution of the resulting pivotal discrepancy measures is known, it is possible to quantitatively assess model fit directly from the output of Markov chain Monte Carlo algorithms used to sample from the posterior distribution on the parameter space. We illustrate our method in a simulation study and in two applications.
Assessing fit in Bayesian models for spatial processes
Jun, M.; Katzfuss, M.; Hu, J.; Johnson, V. E.
2014-01-01
© 2014 John Wiley & Sons, Ltd. Gaussian random fields are frequently used to model spatial and spatial-temporal data, particularly in geostatistical settings. As much of the attention of the statistics community has been focused on defining and estimating the mean and covariance functions of these processes, little effort has been devoted to developing goodness-of-fit tests to allow users to assess the models' adequacy. We describe a general goodness-of-fit test and related graphical diagnostics for assessing the fit of Bayesian Gaussian process models using pivotal discrepancy measures. Our method is applicable for both regularly and irregularly spaced observation locations on planar and spherical domains. The essential idea behind our method is to evaluate pivotal quantities defined for a realization of a Gaussian random field at parameter values drawn from the posterior distribution. Because the nominal distribution of the resulting pivotal discrepancy measures is known, it is possible to quantitatively assess model fit directly from the output of Markov chain Monte Carlo algorithms used to sample from the posterior distribution on the parameter space. We illustrate our method in a simulation study and in two applications.
The role of model dynamics in ensemble Kalman filter performance for chaotic systems
Ng, G.-H.C.; McLaughlin, D.; Entekhabi, D.; Ahanin, A.
2011-01-01
The ensemble Kalman filter (EnKF) is susceptible to losing track of observations, or 'diverging', when applied to large chaotic systems such as atmospheric and ocean models. Past studies have demonstrated the adverse impact of sampling error during the filter's update step. We examine how system dynamics affect EnKF performance, and whether the absence of certain dynamic features in the ensemble may lead to divergence. The EnKF is applied to a simple chaotic model, and ensembles are checked against singular vectors of the tangent linear model, corresponding to short-term growth and Lyapunov vectors, corresponding to long-term growth. Results show that the ensemble strongly aligns itself with the subspace spanned by unstable Lyapunov vectors. Furthermore, the filter avoids divergence only if the full linearized long-term unstable subspace is spanned. However, short-term dynamics also become important as non-linearity in the system increases. Non-linear movement prevents errors in the long-term stable subspace from decaying indefinitely. If these errors then undergo linear intermittent growth, a small ensemble may fail to properly represent all important modes, causing filter divergence. A combination of long and short-term growth dynamics are thus critical to EnKF performance. These findings can help in developing practical robust filters based on model dynamics. ?? 2011 The Authors Tellus A ?? 2011 John Wiley & Sons A/S.
Estimating Parameters in Physical Models through Bayesian Inversion: A Complete Example
Allmaras, Moritz; Bangerth, Wolfgang; Linhart, Jean Marie; Polanco, Javier; Wang, Fang; Wang, Kainan; Webster, Jennifer; Zedler, Sarah
2013-01-01
All mathematical models of real-world phenomena contain parameters that need to be estimated from measurements, either for realistic predictions or simply to understand the characteristics of the model. Bayesian statistics provides a framework
Elsheikh, Ahmed H.; Wheeler, Mary Fanett; Hoteit, Ibrahim
2014-01-01
A Hybrid Nested Sampling (HNS) algorithm is proposed for efficient Bayesian model calibration and prior model selection. The proposed algorithm combines, Nested Sampling (NS) algorithm, Hybrid Monte Carlo (HMC) sampling and gradient estimation using
Directory of Open Access Journals (Sweden)
Morgan E. Smith
2017-03-01
Full Text Available Mathematical models of parasite transmission provide powerful tools for assessing the impacts of interventions. Owing to complexity and uncertainty, no single model may capture all features of transmission and elimination dynamics. Multi-model ensemble modelling offers a framework to help overcome biases of single models. We report on the development of a first multi-model ensemble of three lymphatic filariasis (LF models (EPIFIL, LYMFASIM, and TRANSFIL, and evaluate its predictive performance in comparison with that of the constituents using calibration and validation data from three case study sites, one each from the three major LF endemic regions: Africa, Southeast Asia and Papua New Guinea (PNG. We assessed the performance of the respective models for predicting the outcomes of annual MDA strategies for various baseline scenarios thought to exemplify the current endemic conditions in the three regions. The results show that the constructed multi-model ensemble outperformed the single models when evaluated across all sites. Single models that best fitted calibration data tended to do less well in simulating the out-of-sample, or validation, intervention data. Scenario modelling results demonstrate that the multi-model ensemble is able to compensate for variance between single models in order to produce more plausible predictions of intervention impacts. Our results highlight the value of an ensemble approach to modelling parasite control dynamics. However, its optimal use will require further methodological improvements as well as consideration of the organizational mechanisms required to ensure that modelling results and data are shared effectively between all stakeholders.
International Nuclear Information System (INIS)
Iskandar, Ismed; Gondokaryono, Yudi Satria
2016-01-01
In reliability theory, the most important problem is to determine the reliability of a complex system from the reliability of its components. The weakness of most reliability theories is that the systems are described and explained as simply functioning or failed. In many real situations, the failures may be from many causes depending upon the age and the environment of the system and its components. Another problem in reliability theory is one of estimating the parameters of the assumed failure models. The estimation may be based on data collected over censored or uncensored life tests. In many reliability problems, the failure data are simply quantitatively inadequate, especially in engineering design and maintenance system. The Bayesian analyses are more beneficial than the classical one in such cases. The Bayesian estimation analyses allow us to combine past knowledge or experience in the form of an apriori distribution with life test data to make inferences of the parameter of interest. In this paper, we have investigated the application of the Bayesian estimation analyses to competing risk systems. The cases are limited to the models with independent causes of failure by using the Weibull distribution as our model. A simulation is conducted for this distribution with the objectives of verifying the models and the estimators and investigating the performance of the estimators for varying sample size. The simulation data are analyzed by using Bayesian and the maximum likelihood analyses. The simulation results show that the change of the true of parameter relatively to another will change the value of standard deviation in an opposite direction. For a perfect information on the prior distribution, the estimation methods of the Bayesian analyses are better than those of the maximum likelihood. The sensitivity analyses show some amount of sensitivity over the shifts of the prior locations. They also show the robustness of the Bayesian analysis within the range
Flexible Bayesian Dynamic Modeling of Covariance and Correlation Matrices
Lan, Shiwei
2017-11-08
Modeling covariance (and correlation) matrices is a challenging problem due to the large dimensionality and positive-definiteness constraint. In this paper, we propose a novel Bayesian framework based on decomposing the covariance matrix into variance and correlation matrices. The highlight is that the correlations are represented as products of vectors on unit spheres. We propose a variety of distributions on spheres (e.g. the squared-Dirichlet distribution) to induce flexible prior distributions for covariance matrices that go beyond the commonly used inverse-Wishart prior. To handle the intractability of the resulting posterior, we introduce the adaptive $\\\\Delta$-Spherical Hamiltonian Monte Carlo. We also extend our structured framework to dynamic cases and introduce unit-vector Gaussian process priors for modeling the evolution of correlation among multiple time series. Using an example of Normal-Inverse-Wishart problem, a simulated periodic process, and an analysis of local field potential data (collected from the hippocampus of rats performing a complex sequence memory task), we demonstrated the validity and effectiveness of our proposed framework for (dynamic) modeling covariance and correlation matrices.
Context-dependent decision-making: a simple Bayesian model.
Lloyd, Kevin; Leslie, David S
2013-05-06
Many phenomena in animal learning can be explained by a context-learning process whereby an animal learns about different patterns of relationship between environmental variables. Differentiating between such environmental regimes or 'contexts' allows an animal to rapidly adapt its behaviour when context changes occur. The current work views animals as making sequential inferences about current context identity in a world assumed to be relatively stable but also capable of rapid switches to previously observed or entirely new contexts. We describe a novel decision-making model in which contexts are assumed to follow a Chinese restaurant process with inertia and full Bayesian inference is approximated by a sequential-sampling scheme in which only a single hypothesis about current context is maintained. Actions are selected via Thompson sampling, allowing uncertainty in parameters to drive exploration in a straightforward manner. The model is tested on simple two-alternative choice problems with switching reinforcement schedules and the results compared with rat behavioural data from a number of T-maze studies. The model successfully replicates a number of important behavioural effects: spontaneous recovery, the effect of partial reinforcement on extinction and reversal, the overtraining reversal effect, and serial reversal-learning effects.
A Bayesian approach to the modelling of α Cen A
Bazot, M.; Bourguignon, S.; Christensen-Dalsgaard, J.
2012-12-01
Determining the physical characteristics of a star is an inverse problem consisting of estimating the parameters of models for the stellar structure and evolution, and knowing certain observable quantities. We use a Bayesian approach to solve this problem for α Cen A, which allows us to incorporate prior information on the parameters to be estimated, in order to better constrain the problem. Our strategy is based on the use of a Markov chain Monte Carlo (MCMC) algorithm to estimate the posterior probability densities of the stellar parameters: mass, age, initial chemical composition, etc. We use the stellar evolutionary code ASTEC to model the star. To constrain this model both seismic and non-seismic observations were considered. Several different strategies were tested to fit these values, using either two free parameters or five free parameters in ASTEC. We are thus able to show evidence that MCMC methods become efficient with respect to more classical grid-based strategies when the number of parameters increases. The results of our MCMC algorithm allow us to derive estimates for the stellar parameters and robust uncertainties thanks to the statistical analysis of the posterior probability densities. We are also able to compute odds for the presence of a convective core in α Cen A. When using core-sensitive seismic observational constraints, these can rise above ˜40 per cent. The comparison of results to previous studies also indicates that these seismic constraints are of critical importance for our knowledge of the structure of this star.
Parameter Estimation of Structural Equation Modeling Using Bayesian Approach
Directory of Open Access Journals (Sweden)
Dewi Kurnia Sari
2016-05-01
Full Text Available Leadership is a process of influencing, directing or giving an example of employees in order to achieve the objectives of the organization and is a key element in the effectiveness of the organization. In addition to the style of leadership, the success of an organization or company in achieving its objectives can also be influenced by the commitment of the organization. Where organizational commitment is a commitment created by each individual for the betterment of the organization. The purpose of this research is to obtain a model of leadership style and organizational commitment to job satisfaction and employee performance, and determine the factors that influence job satisfaction and employee performance using SEM with Bayesian approach. This research was conducted at Statistics FNI employees in Malang, with 15 people. The result of this study showed that the measurement model, all significant indicators measure each latent variable. Meanwhile in the structural model, it was concluded there are a significant difference between the variables of Leadership Style and Organizational Commitment toward Job Satisfaction directly as well as a significant difference between Job Satisfaction on Employee Performance. As for the influence of Leadership Style and variable Organizational Commitment on Employee Performance directly declared insignificant.
Bayesian nonparametric clustering in phylogenetics: modeling antigenic evolution in influenza.
Cybis, Gabriela B; Sinsheimer, Janet S; Bedford, Trevor; Rambaut, Andrew; Lemey, Philippe; Suchard, Marc A
2018-01-30
Influenza is responsible for up to 500,000 deaths every year, and antigenic variability represents much of its epidemiological burden. To visualize antigenic differences across many viral strains, antigenic cartography methods use multidimensional scaling on binding assay data to map influenza antigenicity onto a low-dimensional space. Analysis of such assay data ideally leads to natural clustering of influenza strains of similar antigenicity that correlate with sequence evolution. To understand the dynamics of these antigenic groups, we present a framework that jointly models genetic and antigenic evolution by combining multidimensional scaling of binding assay data, Bayesian phylogenetic machinery and nonparametric clustering methods. We propose a phylogenetic Chinese restaurant process that extends the current process to incorporate the phylogenetic dependency structure between strains in the modeling of antigenic clusters. With this method, we are able to use the genetic information to better understand the evolution of antigenicity throughout epidemics, as shown in applications of this model to H1N1 influenza. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.
Bayesian averaging over Decision Tree models for trauma severity scoring.
Schetinin, V; Jakaite, L; Krzanowski, W
2018-01-01
Health care practitioners analyse possible risks of misleading decisions and need to estimate and quantify uncertainty in predictions. We have examined the "gold" standard of screening a patient's conditions for predicting survival probability, based on logistic regression modelling, which is used in trauma care for clinical purposes and quality audit. This methodology is based on theoretical assumptions about data and uncertainties. Models induced within such an approach have exposed a number of problems, providing unexplained fluctuation of predicted survival and low accuracy of estimating uncertainty intervals within which predictions are made. Bayesian method, which in theory is capable of providing accurate predictions and uncertainty estimates, has been adopted in our study using Decision Tree models. Our approach has been tested on a large set of patients registered in the US National Trauma Data Bank and has outperformed the standard method in terms of prediction accuracy, thereby providing practitioners with accurate estimates of the predictive posterior densities of interest that are required for making risk-aware decisions. Copyright © 2017 Elsevier B.V. All rights reserved.
Bayesian network model of crowd emotion and negative behavior
Ramli, Nurulhuda; Ghani, Noraida Abdul; Hatta, Zulkarnain Ahmad; Hashim, Intan Hashimah Mohd; Sulong, Jasni; Mahudin, Nor Diana Mohd; Rahman, Shukran Abd; Saad, Zarina Mat
2014-12-01
The effects of overcrowding have become a major concern for event organizers. One aspect of this concern has been the idea that overcrowding can enhance the occurrence of serious incidents during events. As one of the largest Muslim religious gathering attended by pilgrims from all over the world, Hajj has become extremely overcrowded with many incidents being reported. The purpose of this study is to analyze the nature of human emotion and negative behavior resulting from overcrowding during Hajj events from data gathered in Malaysian Hajj Experience Survey in 2013. The sample comprised of 147 Malaysian pilgrims (70 males and 77 females). Utilizing a probabilistic model called Bayesian network, this paper models the dependence structure between different emotions and negative behaviors of pilgrims in the crowd. The model included the following variables of emotion: negative, negative comfortable, positive, positive comfortable and positive spiritual and variables of negative behaviors; aggressive and hazardous acts. The study demonstrated that emotions of negative, negative comfortable, positive spiritual and positive emotion have a direct influence on aggressive behavior whereas emotion of negative comfortable, positive spiritual and positive have a direct influence on hazardous acts behavior. The sensitivity analysis showed that a low level of negative and negative comfortable emotions leads to a lower level of aggressive and hazardous behavior. Findings of the study can be further improved to identify the exact cause and risk factors of crowd-related incidents in preventing crowd disasters during the mass gathering events.
Bayesian models for astrophysical data using R, JAGS, Python, and Stan
Hilbe, Joseph M; Ishida, Emille E O
2017-01-01
This comprehensive guide to Bayesian methods in astronomy enables hands-on work by supplying complete R, JAGS, Python, and Stan code, to use directly or to adapt. It begins by examining the normal model from both frequentist and Bayesian perspectives and then progresses to a full range of Bayesian generalized linear and mixed or hierarchical models, as well as additional types of models such as ABC and INLA. The book provides code that is largely unavailable elsewhere and includes details on interpreting and evaluating Bayesian models. Initial discussions offer models in synthetic form so that readers can easily adapt them to their own data; later the models are applied to real astronomical data. The consistent focus is on hands-on modeling, analysis of data, and interpretations that address scientific questions. A must-have for astronomers, its concrete approach will also be attractive to researchers in the sciences more generally.
Walcott, Sam
2014-10-01
Molecular motors, by turning chemical energy into mechanical work, are responsible for active cellular processes. Often groups of these motors work together to perform their biological role. Motors in an ensemble are coupled and exhibit complex emergent behavior. Although large motor ensembles can be modeled with partial differential equations (PDEs) by assuming that molecules function independently of their neighbors, this assumption is violated when motors are coupled locally. It is therefore unclear how to describe the ensemble behavior of the locally coupled motors responsible for biological processes such as calcium-dependent skeletal muscle activation. Here we develop a theory to describe locally coupled motor ensembles and apply the theory to skeletal muscle activation. The central idea is that a muscle filament can be divided into two phases: an active and an inactive phase. Dynamic changes in the relative size of these phases are described by a set of linear ordinary differential equations (ODEs). As the dynamics of the active phase are described by PDEs, muscle activation is governed by a set of coupled ODEs and PDEs, building on previous PDE models. With comparison to Monte Carlo simulations, we demonstrate that the theory captures the behavior of locally coupled ensembles. The theory also plausibly describes and predicts muscle experiments from molecular to whole muscle scales, suggesting that a micro- to macroscale muscle model is within reach.
BayesCLUMPY: BAYESIAN INFERENCE WITH CLUMPY DUSTY TORUS MODELS
International Nuclear Information System (INIS)
Asensio Ramos, A.; Ramos Almeida, C.
2009-01-01
Our aim is to present a fast and general Bayesian inference framework based on the synergy between machine learning techniques and standard sampling methods and apply it to infer the physical properties of clumpy dusty torus using infrared photometric high spatial resolution observations of active galactic nuclei. We make use of the Metropolis-Hastings Markov Chain Monte Carlo algorithm for sampling the posterior distribution function. Such distribution results from combining all a priori knowledge about the parameters of the model and the information introduced by the observations. The main difficulty resides in the fact that the model used to explain the observations is computationally demanding and the sampling is very time consuming. For this reason, we apply a set of artificial neural networks that are used to approximate and interpolate a database of models. As a consequence, models not present in the original database can be computed ensuring continuity. We focus on the application of this solution scheme to the recently developed public database of clumpy dusty torus models. The machine learning scheme used in this paper allows us to generate any model from the database using only a factor of 10 -4 of the original size of the database and a factor of 10 -3 in computing time. The posterior distribution obtained for each model parameter allows us to investigate how the observations constrain the parameters and which ones remain partially or completely undetermined, providing statistically relevant confidence intervals. As an example, the application to the nuclear region of Centaurus A shows that the optical depth of the clouds, the total number of clouds, and the radial extent of the cloud distribution zone are well constrained using only six filters. The code is freely available from the authors.
Bayesian uncertainty quantification in linear models for diffusion MRI.
Sjölund, Jens; Eklund, Anders; Özarslan, Evren; Herberthson, Magnus; Bånkestad, Maria; Knutsson, Hans
2018-03-29
Diffusion MRI (dMRI) is a valuable tool in the assessment of tissue microstructure. By fitting a model to the dMRI signal it is possible to derive various quantitative features. Several of the most popular dMRI signal models are expansions in an appropriately chosen basis, where the coefficients are determined using some variation of least-squares. However, such approaches lack any notion of uncertainty, which could be valuable in e.g. group analyses. In this work, we use a probabilistic interpretation of linear least-squares methods to recast popular dMRI models as Bayesian ones. This makes it possible to quantify the uncertainty of any derived quantity. In particular, for quantities that are affine functions of the coefficients, the posterior distribution can be expressed in closed-form. We simulated measurements from single- and double-tensor models where the correct values of several quantities are known, to validate that the theoretically derived quantiles agree with those observed empirically. We included results from residual bootstrap for comparison and found good agreement. The validation employed several different models: Diffusion Tensor Imaging (DTI), Mean Apparent Propagator MRI (MAP-MRI) and Constrained Spherical Deconvolution (CSD). We also used in vivo data to visualize maps of quantitative features and corresponding uncertainties, and to show how our approach can be used in a group analysis to downweight subjects with high uncertainty. In summary, we convert successful linear models for dMRI signal estimation to probabilistic models, capable of accurate uncertainty quantification. Copyright © 2018 Elsevier Inc. All rights reserved.
Bayesian Regression of Thermodynamic Models of Redox Active Materials
Energy Technology Data Exchange (ETDEWEB)
Johnston, Katherine [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
2017-09-01
Finding a suitable functional redox material is a critical challenge to achieving scalable, economically viable technologies for storing concentrated solar energy in the form of a defected oxide. Demonstrating e ectiveness for thermal storage or solar fuel is largely accomplished by using a thermodynamic model derived from experimental data. The purpose of this project is to test the accuracy of our regression model on representative data sets. Determining the accuracy of the model includes parameter tting the model to the data, comparing the model using di erent numbers of param- eters, and analyzing the entropy and enthalpy calculated from the model. Three data sets were considered in this project: two demonstrating materials for solar fuels by wa- ter splitting and the other of a material for thermal storage. Using Bayesian Inference and Markov Chain Monte Carlo (MCMC), parameter estimation was preformed on the three data sets. Good results were achieved, except some there was some deviations on the edges of the data input ranges. The evidence values were then calculated in a variety of ways and used to compare models with di erent number of parameters. It was believed that at least one of the parameters was unnecessary and comparing evidence values demonstrated that the parameter was need on one data set and not signi cantly helpful on another. The entropy was calculated by taking the derivative in one variable and integrating over another. and its uncertainty was also calculated by evaluating the entropy over multiple MCMC samples. Afterwards, all the parts were written up as a tutorial for the Uncertainty Quanti cation Toolkit (UQTk).
Estimation of the uncertainty of a climate model using an ensemble simulation
Barth, A.; Mathiot, P.; Goosse, H.
2012-04-01
The atmospheric forcings play an important role in the study of the ocean and sea-ice dynamics of the Southern Ocean. Error in the atmospheric forcings will inevitably result in uncertain model results. The sensitivity of the model results to errors in the atmospheric forcings are studied with ensemble simulations using multivariate perturbations of the atmospheric forcing fields. The numerical ocean model used is the NEMO-LIM in a global configuration with an horizontal resolution of 2°. NCEP reanalyses are used to provide air temperature and wind data to force the ocean model over the last 50 years. A climatological mean is used to prescribe relative humidity, cloud cover and precipitation. In a first step, the model results is compared with OSTIA SST and OSI SAF sea ice concentration of the southern hemisphere. The seasonal behavior of the RMS difference and bias in SST and ice concentration is highlighted as well as the regions with relatively high RMS errors and biases such as the Antarctic Circumpolar Current and near the ice-edge. Ensemble simulations are performed to statistically characterize the model error due to uncertainties in the atmospheric forcings. Such information is a crucial element for future data assimilation experiments. Ensemble simulations are performed with perturbed air temperature and wind forcings. A Fourier decomposition of the NCEP wind vectors and air temperature for 2007 is used to generate ensemble perturbations. The perturbations are scaled such that the resulting ensemble spread matches approximately the RMS differences between the satellite SST and sea ice concentration. The ensemble spread and covariance are analyzed for the minimum and maximum sea ice extent. It is shown that errors in the atmospheric forcings can extend to several hundred meters in depth near the Antarctic Circumpolar Current.
DEFF Research Database (Denmark)
Maiorano, Andrea; Martre, Pierre; Asseng, Senthold
2017-01-01
of models needed in a MME. Herein, 15 wheat growth models of a larger MME were improved through re-parameterization and/or incorporating or modifying heat stress effects on phenology, leaf growth and senescence, biomass growth, and grain number and size using detailed field experimental data from the USDA...... ensemble percentile range) of grain yields simulated by the MME on average by 39% in the calibration data set and by 26% in the independent evaluation data set for crops grown in mean seasonal temperatures >24 °C. MME mean squared error in simulating grain yield decreased by 37%. A reduction in MME...
Macroscopic Models of Clique Tree Growth for Bayesian Networks
National Aeronautics and Space Administration — In clique tree clustering, inference consists of propagation in a clique tree compiled from a Bayesian network. In this paper, we develop an analytical approach to...
Development of uncertainty-based work injury model using Bayesian structural equation modelling.
Chatterjee, Snehamoy
2014-01-01
This paper proposed a Bayesian method-based structural equation model (SEM) of miners' work injury for an underground coal mine in India. The environmental and behavioural variables for work injury were identified and causal relationships were developed. For Bayesian modelling, prior distributions of SEM parameters are necessary to develop the model. In this paper, two approaches were adopted to obtain prior distribution for factor loading parameters and structural parameters of SEM. In the first approach, the prior distributions were considered as a fixed distribution function with specific parameter values, whereas, in the second approach, prior distributions of the parameters were generated from experts' opinions. The posterior distributions of these parameters were obtained by applying Bayesian rule. The Markov Chain Monte Carlo sampling in the form Gibbs sampling was applied for sampling from the posterior distribution. The results revealed that all coefficients of structural and measurement model parameters are statistically significant in experts' opinion-based priors, whereas, two coefficients are not statistically significant when fixed prior-based distributions are applied. The error statistics reveals that Bayesian structural model provides reasonably good fit of work injury with high coefficient of determination (0.91) and less mean squared error as compared to traditional SEM.
Nitrate source apportionment in a subtropical watershed using Bayesian model
Energy Technology Data Exchange (ETDEWEB)
Yang, Liping; Han, Jiangpei; Xue, Jianlong; Zeng, Lingzao [College of Environmental and Natural Resource Sciences, Zhejiang Provincial Key Laboratory of Subtropical Soil and Plant Nutrition, Zhejiang University, Hangzhou, 310058 (China); Shi, Jiachun, E-mail: jcshi@zju.edu.cn [College of Environmental and Natural Resource Sciences, Zhejiang Provincial Key Laboratory of Subtropical Soil and Plant Nutrition, Zhejiang University, Hangzhou, 310058 (China); Wu, Laosheng, E-mail: laowu@zju.edu.cn [College of Environmental and Natural Resource Sciences, Zhejiang Provincial Key Laboratory of Subtropical Soil and Plant Nutrition, Zhejiang University, Hangzhou, 310058 (China); Jiang, Yonghai [State Key Laboratory of Environmental Criteria and Risk Assessment, Chinese Research Academy of Environmental Sciences, Beijing, 100012 (China)
2013-10-01
Nitrate (NO{sub 3}{sup −}) pollution in aquatic system is a worldwide problem. The temporal distribution pattern and sources of nitrate are of great concern for water quality. The nitrogen (N) cycling processes in a subtropical watershed located in Changxing County, Zhejiang Province, China were greatly influenced by the temporal variations of precipitation and temperature during the study period (September 2011 to July 2012). The highest NO{sub 3}{sup −} concentration in water was in May (wet season, mean ± SD = 17.45 ± 9.50 mg L{sup −1}) and the lowest concentration occurred in December (dry season, mean ± SD = 10.54 ± 6.28 mg L{sup −1}). Nevertheless, no water sample in the study area exceeds the WHO drinking water limit of 50 mg L{sup −1} NO{sub 3}{sup −}. Four sources of NO{sub 3}{sup −} (atmospheric deposition, AD; soil N, SN; synthetic fertilizer, SF; manure and sewage, M and S) were identified using both hydrochemical characteristics [Cl{sup −}, NO{sub 3}{sup −}, HCO{sub 3}{sup −}, SO{sub 4}{sup 2−}, Ca{sup 2+}, K{sup +}, Mg{sup 2+}, Na{sup +}, dissolved oxygen (DO)] and dual isotope approach (δ{sup 15}N–NO{sub 3}{sup −} and δ{sup 18}O–NO{sub 3}{sup −}). Both chemical and isotopic characteristics indicated that denitrification was not the main N cycling process in the study area. Using a Bayesian model (stable isotope analysis in R, SIAR), the contribution of each source was apportioned. Source apportionment results showed that source contributions differed significantly between the dry and wet season, AD and M and S contributed more in December than in May. In contrast, SN and SF contributed more NO{sub 3}{sup −} to water in May than that in December. M and S and SF were the major contributors in December and May, respectively. Moreover, the shortcomings and uncertainties of SIAR were discussed to provide implications for future works. With the assessment of temporal variation and sources of NO{sub 3}{sup −}, better
Nitrate source apportionment in a subtropical watershed using Bayesian model
International Nuclear Information System (INIS)
Yang, Liping; Han, Jiangpei; Xue, Jianlong; Zeng, Lingzao; Shi, Jiachun; Wu, Laosheng; Jiang, Yonghai
2013-01-01
Nitrate (NO 3 − ) pollution in aquatic system is a worldwide problem. The temporal distribution pattern and sources of nitrate are of great concern for water quality. The nitrogen (N) cycling processes in a subtropical watershed located in Changxing County, Zhejiang Province, China were greatly influenced by the temporal variations of precipitation and temperature during the study period (September 2011 to July 2012). The highest NO 3 − concentration in water was in May (wet season, mean ± SD = 17.45 ± 9.50 mg L −1 ) and the lowest concentration occurred in December (dry season, mean ± SD = 10.54 ± 6.28 mg L −1 ). Nevertheless, no water sample in the study area exceeds the WHO drinking water limit of 50 mg L −1 NO 3 − . Four sources of NO 3 − (atmospheric deposition, AD; soil N, SN; synthetic fertilizer, SF; manure and sewage, M and S) were identified using both hydrochemical characteristics [Cl − , NO 3 − , HCO 3 − , SO 4 2− , Ca 2+ , K + , Mg 2+ , Na + , dissolved oxygen (DO)] and dual isotope approach (δ 15 N–NO 3 − and δ 18 O–NO 3 − ). Both chemical and isotopic characteristics indicated that denitrification was not the main N cycling process in the study area. Using a Bayesian model (stable isotope analysis in R, SIAR), the contribution of each source was apportioned. Source apportionment results showed that source contributions differed significantly between the dry and wet season, AD and M and S contributed more in December than in May. In contrast, SN and SF contributed more NO 3 − to water in May than that in December. M and S and SF were the major contributors in December and May, respectively. Moreover, the shortcomings and uncertainties of SIAR were discussed to provide implications for future works. With the assessment of temporal variation and sources of NO 3 − , better agricultural management practices and sewage disposal programs can be implemented to sustain water quality in subtropical watersheds
Bayesian mixture models for source separation in MEG
International Nuclear Information System (INIS)
Calvetti, Daniela; Homa, Laura; Somersalo, Erkki
2011-01-01
This paper discusses the problem of imaging electromagnetic brain activity from measurements of the induced magnetic field outside the head. This imaging modality, magnetoencephalography (MEG), is known to be severely ill posed, and in order to obtain useful estimates for the activity map, complementary information needs to be used to regularize the problem. In this paper, a particular emphasis is on finding non-superficial focal sources that induce a magnetic field that may be confused with noise due to external sources and with distributed brain noise. The data are assumed to come from a mixture of a focal source and a spatially distributed possibly virtual source; hence, to differentiate between those two components, the problem is solved within a Bayesian framework, with a mixture model prior encoding the information that different sources may be concurrently active. The mixture model prior combines one density that favors strongly focal sources and another that favors spatially distributed sources, interpreted as clutter in the source estimation. Furthermore, to address the challenge of localizing deep focal sources, a novel depth sounding algorithm is suggested, and it is shown with simulated data that the method is able to distinguish between a signal arising from a deep focal source and a clutter signal. (paper)
Bergee, Martin J.; Westfall, Claude R.
2005-01-01
This is the third study in a line of inquiry whose purpose has been to develop a theoretical model of selected extra musical variables' influence on solo and small-ensemble festival ratings. Authors of the second of these (Bergee & McWhirter, 2005) had used binomial logistic regression as the basis for their model-formulation strategy. Their…
A short-range multi-model ensemble weather prediction system for South Africa
CSIR Research Space (South Africa)
Landman, S
2010-09-01
Full Text Available prediction system (EPS) at the South African Weather Service (SAWS) are examined. The ensemble consists of different forecasts from the 12-km LAM of the UK Met Office Unified Model (UM) and the Conformal-Cubic Atmospheric Model (CCAM) covering the South...
Directory of Open Access Journals (Sweden)
Linus Zhang
2018-03-01
Full Text Available Given the substantial impacts that are expected due to climate change, it is crucial that accurate rainfall–runoff results are provided for various decision-making purposes. However, these modeling results often generate uncertainty or bias due to the imperfect character of individual models. In this paper, a genetic algorithm together with a Bayesian model averaging method are employed to provide a multi-model ensemble (MME and combined runoff prediction under climate change scenarios produced from eight rainfall–runoff models for the Yellow River Basin. The results show that the multi-model ensemble method, especially the genetic algorithm method, can produce more reliable predictions than the other considered rainfall–runoff models. These results show that it is possible to reduce the uncertainty and thus improve the accuracy for future projections using different models because an MME approach evens out the bias involved in the individual model. For the study area, the final combined predictions reveal that less runoff is expected under most climatic scenarios, which will threaten water security of the basin.
Bayesian image reconstruction in SPECT using higher order mechanical models as priors
International Nuclear Information System (INIS)
Lee, S.J.; Gindi, G.; Rangarajan, A.
1995-01-01
While the ML-EM (maximum-likelihood-expectation maximization) algorithm for reconstruction for emission tomography is unstable due to the ill-posed nature of the problem, Bayesian reconstruction methods overcome this instability by introducing prior information, often in the form of a spatial smoothness regularizer. More elaborate forms of smoothness constraints may be used to extend the role of the prior beyond that of a stabilizer in order to capture actual spatial information about the object. Previously proposed forms of such prior distributions were based on the assumption of a piecewise constant source distribution. Here, the authors propose an extension to a piecewise linear model--the weak plate--which is more expressive than the piecewise constant model. The weak plate prior not only preserves edges but also allows for piecewise ramplike regions in the reconstruction. Indeed, for the application in SPECT, such ramplike regions are observed in ground-truth source distributions in the form of primate autoradiographs of rCBF radionuclides. To incorporate the weak plate prior in a MAP approach, the authors model the prior as a Gibbs distribution and use a GEM formulation for the optimization. They compare quantitative performance of the ML-EM algorithm, a GEM algorithm with a prior favoring piecewise constant regions, and a GEM algorithm with the weak plate prior. Pointwise and regional bias and variance of ensemble image reconstructions are used as indications of image quality. The results show that the weak plate and membrane priors exhibit improved bias and variance relative to ML-EM techniques
Bayesian inference for partially identified models exploring the limits of limited data
Gustafson, Paul
2015-01-01
Introduction Identification What Is against Us? What Is for Us? Some Simple Examples of Partially Identified ModelsThe Road Ahead The Structure of Inference in Partially Identified Models Bayesian Inference The Structure of Posterior Distributions in PIMs Computational Strategies Strength of Bayesian Updating, Revisited Posterior MomentsCredible Intervals Evaluating the Worth of Inference Partial Identification versus Model Misspecification The Siren Call of Identification Comp
Post-processing of multi-model ensemble river discharge forecasts using censored EMOS
Hemri, Stephan; Lisniak, Dmytro; Klein, Bastian
2014-05-01
When forecasting water levels and river discharge, ensemble weather forecasts are used as meteorological input to hydrologic process models. As hydrologic models are imperfect and the input ensembles tend to be biased and underdispersed, the output ensemble forecasts for river runoff typically are biased and underdispersed, too. Thus, statistical post-processing is required in order to achieve calibrated and sharp predictions. Standard post-processing methods such as Ensemble Model Output Statistics (EMOS) that have their origins in meteorological forecasting are now increasingly being used in hydrologic applications. Here we consider two sub-catchments of River Rhine, for which the forecasting system of the Federal Institute of Hydrology (BfG) uses runoff data that are censored below predefined thresholds. To address this methodological challenge, we develop a censored EMOS method that is tailored to such data. The censored EMOS forecast distribution can be understood as a mixture of a point mass at the censoring threshold and a continuous part based on a truncated normal distribution. Parameter estimates of the censored EMOS model are obtained by minimizing the Continuous Ranked Probability Score (CRPS) over the training dataset. Model fitting on Box-Cox transformed data allows us to take account of the positive skewness of river discharge distributions. In order to achieve realistic forecast scenarios over an entire range of lead-times, there is a need for multivariate extensions. To this end, we smooth the marginal parameter estimates over lead-times. In order to obtain realistic scenarios of discharge evolution over time, the marginal distributions have to be linked with each other. To this end, the multivariate dependence structure can either be adopted from the raw ensemble like in Ensemble Copula Coupling (ECC), or be estimated from observations in a training period. The censored EMOS model has been applied to multi-model ensemble forecasts issued on a
Multi-criterion model ensemble of CMIP5 surface air temperature over China
Yang, Tiantian; Tao, Yumeng; Li, Jingjing; Zhu, Qian; Su, Lu; He, Xiaojia; Zhang, Xiaoming
2018-05-01
The global circulation models (GCMs) are useful tools for simulating climate change, projecting future temperature changes, and therefore, supporting the preparation of national climate adaptation plans. However, different GCMs are not always in agreement with each other over various regions. The reason is that GCMs' configurations, module characteristics, and dynamic forcings vary from one to another. Model ensemble techniques are extensively used to post-process the outputs from GCMs and improve the variability of model outputs. Root-mean-square error (RMSE), correlation coefficient (CC, or R) and uncertainty are commonly used statistics for evaluating the performances of GCMs. However, the simultaneous achievements of all satisfactory statistics cannot be guaranteed in using many model ensemble techniques. In this paper, we propose a multi-model ensemble framework, using a state-of-art evolutionary multi-objective optimization algorithm (termed MOSPD), to evaluate different characteristics of ensemble candidates and to provide comprehensive trade-off information for different model ensemble solutions. A case study of optimizing the surface air temperature (SAT) ensemble solutions over different geographical regions of China is carried out. The data covers from the period of 1900 to 2100, and the projections of SAT are analyzed with regard to three different statistical indices (i.e., RMSE, CC, and uncertainty). Among the derived ensemble solutions, the trade-off information is further analyzed with a robust Pareto front with respect to different statistics. The comparison results over historical period (1900-2005) show that the optimized solutions are superior over that obtained simple model average, as well as any single GCM output. The improvements of statistics are varying for different climatic regions over China. Future projection (2006-2100) with the proposed ensemble method identifies that the largest (smallest) temperature changes will happen in the
Multimodel Ensembles of Wheat Growth: Many Models are Better than One
Martre, Pierre; Wallach, Daniel; Asseng, Senthold; Ewert, Frank; Jones, James W.; Rotter, Reimund P.; Boote, Kenneth J.; Ruane, Alexander C.; Thorburn, Peter J.; Cammarano, Davide;
2015-01-01
Crop models of crop growth are increasingly used to quantify the impact of global changes due to climate or crop management. Therefore, accuracy of simulation results is a major concern. Studies with ensembles of crop model scan give valuable information about model accuracy and uncertainty, but such studies are difficult to organize and have only recently begun. We report on the largest ensemble study to date, of 27 wheat models tested in four contrasting locations for their accuracy in simulating multiple crop growth and yield variables. The relative error averaged over models was 2438 for the different end-of-season variables including grain yield (GY) and grain protein concentration (GPC). There was little relation between error of a model for GY or GPC and error for in-season variables. Thus, most models did not arrive at accurate simulations of GY and GPC by accurately simulating preceding growth dynamics. Ensemble simulations, taking either the mean (e-mean) or median (e-median) of simulated values, gave better estimates than any individual model when all variables were considered. Compared to individual models, e-median ranked first in simulating measured GY and third in GPC. The error of e-mean and e-median declined with an increasing number of ensemble members, with little decrease beyond 10 models. We conclude that multimodel ensembles can be used to create new estimators with improved accuracy and consistency in simulating growth dynamics. We argue that these results are applicable to other crop species, and hypothesize that they apply more generally to ecological system models.
Multimodel Ensembles of Wheat Growth: More Models are Better than One
Martre, Pierre; Wallach, Daniel; Asseng, Senthold; Ewert, Frank; Jones, James W.; Rotter, Reimund P.; Boote, Kenneth J.; Ruane, Alex C.; Thorburn, Peter J.; Cammarano, Davide;
2015-01-01
Crop models of crop growth are increasingly used to quantify the impact of global changes due to climate or crop management. Therefore, accuracy of simulation results is a major concern. Studies with ensembles of crop models can give valuable information about model accuracy and uncertainty, but such studies are difficult to organize and have only recently begun. We report on the largest ensemble study to date, of 27 wheat models tested in four contrasting locations for their accuracy in simulating multiple crop growth and yield variables. The relative error averaged over models was 24-38% for the different end-of-season variables including grain yield (GY) and grain protein concentration (GPC). There was little relation between error of a model for GY or GPC and error for in-season variables. Thus, most models did not arrive at accurate simulations of GY and GPC by accurately simulating preceding growth dynamics. Ensemble simulations, taking either the mean (e-mean) or median (e-median) of simulated values, gave better estimates than any individual model when all variables were considered. Compared to individual models, e-median ranked first in simulating measured GY and third in GPC. The error of e-mean and e-median declined with an increasing number of ensemble members, with little decrease beyond 10 models. We conclude that multimodel ensembles can be used to create new estimators with improved accuracy and consistency in simulating growth dynamics. We argue that these results are applicable to other crop species, and hypothesize that they apply more generally to ecological system models.
A user credit assessment model based on clustering ensemble for broadband network new media service supervision
Liu, Fang; Cao, San-xing; Lu, Rui
2012-04-01
This paper proposes a user credit assessment model based on clustering ensemble aiming to solve the problem that users illegally spread pirated and pornographic media contents within the user self-service oriented broadband network new media platforms. Its idea is to do the new media user credit assessment by establishing indices system based on user credit behaviors, and the illegal users could be found according to the credit assessment results, thus to curb the bad videos and audios transmitted on the network. The user credit assessment model based on clustering ensemble proposed by this paper which integrates the advantages that swarm intelligence clustering is suitable for user credit behavior analysis and K-means clustering could eliminate the scattered users existed in the result of swarm intelligence clustering, thus to realize all the users' credit classification automatically. The model's effective verification experiments are accomplished which are based on standard credit application dataset in UCI machine learning repository, and the statistical results of a comparative experiment with a single model of swarm intelligence clustering indicates this clustering ensemble model has a stronger creditworthiness distinguishing ability, especially in the aspect of predicting to find user clusters with the best credit and worst credit, which will facilitate the operators to take incentive measures or punitive measures accurately. Besides, compared with the experimental results of Logistic regression based model under the same conditions, this clustering ensemble model is robustness and has better prediction accuracy.
Drzewiecki, Wojciech
2016-12-01
In this work nine non-linear regression models were compared for sub-pixel impervious surface area mapping from Landsat images. The comparison was done in three study areas both for accuracy of imperviousness coverage evaluation in individual points in time and accuracy of imperviousness change assessment. The performance of individual machine learning algorithms (Cubist, Random Forest, stochastic gradient boosting of regression trees, k-nearest neighbors regression, random k-nearest neighbors regression, Multivariate Adaptive Regression Splines, averaged neural networks, and support vector machines with polynomial and radial kernels) was also compared with the performance of heterogeneous model ensembles constructed from the best models trained using particular techniques. The results proved that in case of sub-pixel evaluation the most accurate prediction of change may not necessarily be based on the most accurate individual assessments. When single methods are considered, based on obtained results Cubist algorithm may be advised for Landsat based mapping of imperviousness for single dates. However, Random Forest may be endorsed when the most reliable evaluation of imperviousness change is the primary goal. It gave lower accuracies for individual assessments, but better prediction of change due to more correlated errors of individual predictions. Heterogeneous model ensembles performed for individual time points assessments at least as well as the best individual models. In case of imperviousness change assessment the ensembles always outperformed single model approaches. It means that it is possible to improve the accuracy of sub-pixel imperviousness change assessment using ensembles of heterogeneous non-linear regression models.
Ehrhardt, Fiona; Soussana, Jean-François; Bellocchi, Gianni; Grace, Peter; McAuliffe, Russel; Recous, Sylvie; Sándor, Renáta; Smith, Pete; Snow, Val; de Antoni Migliorati, Massimiliano; Basso, Bruno; Bhatia, Arti; Brilli, Lorenzo; Doltra, Jordi; Dorich, Christopher D; Doro, Luca; Fitton, Nuala; Giacomini, Sandro J; Grant, Brian; Harrison, Matthew T; Jones, Stephanie K; Kirschbaum, Miko U F; Klumpp, Katja; Laville, Patricia; Léonard, Joël; Liebig, Mark; Lieffering, Mark; Martin, Raphaël; Massad, Raia S; Meier, Elizabeth; Merbold, Lutz; Moore, Andrew D; Myrgiotis, Vasileios; Newton, Paul; Pattey, Elizabeth; Rolinski, Susanne; Sharp, Joanna; Smith, Ward N; Wu, Lianhai; Zhang, Qing
2018-02-01
Simulation models are extensively used to predict agricultural productivity and greenhouse gas emissions. However, the uncertainties of (reduced) model ensemble simulations have not been assessed systematically for variables affecting food security and climate change mitigation, within multi-species agricultural contexts. We report an international model comparison and benchmarking exercise, showing the potential of multi-model ensembles to predict productivity and nitrous oxide (N 2 O) emissions for wheat, maize, rice and temperate grasslands. Using a multi-stage modelling protocol, from blind simulations (stage 1) to partial (stages 2-4) and full calibration (stage 5), 24 process-based biogeochemical models were assessed individually or as an ensemble against long-term experimental data from four temperate grassland and five arable crop rotation sites spanning four continents. Comparisons were performed by reference to the experimental uncertainties of observed yields and N 2 O emissions. Results showed that across sites and crop/grassland types, 23%-40% of the uncalibrated individual models were within two standard deviations (SD) of observed yields, while 42 (rice) to 96% (grasslands) of the models were within 1 SD of observed N 2 O emissions. At stage 1, ensembles formed by the three lowest prediction model errors predicted both yields and N 2 O emissions within experimental uncertainties for 44% and 33% of the crop and grassland growth cycles, respectively. Partial model calibration (stages 2-4) markedly reduced prediction errors of the full model ensemble E-median for crop grain yields (from 36% at stage 1 down to 4% on average) and grassland productivity (from 44% to 27%) and to a lesser and more variable extent for N 2 O emissions. Yield-scaled N 2 O emissions (N 2 O emissions divided by crop yields) were ranked accurately by three-model ensembles across crop species and field sites. The potential of using process-based model ensembles to predict jointly
A Bayesian, generalized frailty model for comet assays.
Ghebretinsae, Aklilu Habteab; Faes, Christel; Molenberghs, Geert; De Boeck, Marlies; Geys, Helena
2013-05-01
This paper proposes a flexible modeling approach for so-called comet assay data regularly encountered in preclinical research. While such data consist of non-Gaussian outcomes in a multilevel hierarchical structure, traditional analyses typically completely or partly ignore this hierarchical nature by summarizing measurements within a cluster. Non-Gaussian outcomes are often modeled using exponential family models. This is true not only for binary and count data, but also for, example, time-to-event outcomes. Two important reasons for extending this family are for (1) the possible occurrence of overdispersion, meaning that the variability in the data may not be adequately described by the models, which often exhibit a prescribed mean-variance link, and (2) the accommodation of a hierarchical structure in the data, owing to clustering in the data. The first issue is dealt with through so-called overdispersion models. Clustering is often accommodated through the inclusion of random subject-specific effects. Though not always, one conventionally assumes such random effects to be normally distributed. In the case of time-to-event data, one encounters, for example, the gamma frailty model (Duchateau and Janssen, 2007 ). While both of these issues may occur simultaneously, models combining both are uncommon. Molenberghs et al. ( 2010 ) proposed a broad class of generalized linear models accommodating overdispersion and clustering through two separate sets of random effects. Here, we use this method to model data from a comet assay with a three-level hierarchical structure. Although a conjugate gamma random effect is used for the overdispersion random effect, both gamma and normal random effects are considered for the hierarchical random effect. Apart from model formulation, we place emphasis on Bayesian estimation. Our proposed method has an upper hand over the traditional analysis in that it (1) uses the appropriate distribution stipulated in the literature; (2) deals
Reliability assessment using degradation models: bayesian and classical approaches
Directory of Open Access Journals (Sweden)
Marta Afonso Freitas
2010-04-01
Full Text Available Traditionally, reliability assessment of devices has been based on (accelerated life tests. However, for highly reliable products, little information about reliability is provided by life tests in which few or no failures are typically observed. Since most failures arise from a degradation mechanism at work for which there are characteristics that degrade over time, one alternative is monitor the device for a period of time and assess its reliability from the changes in performance (degradation observed during that period. The goal of this article is to illustrate how degradation data can be modeled and analyzed by using "classical" and Bayesian approaches. Four methods of data analysis based on classical inference are presented. Next we show how Bayesian methods can also be used to provide a natural approach to analyzing degradation data. The approaches are applied to a real data set regarding train wheels degradation.Tradicionalmente, o acesso à confiabilidade de dispositivos tem sido baseado em testes de vida (acelerados. Entretanto, para produtos altamente confiáveis, pouca informação a respeito de sua confiabilidade é fornecida por testes de vida no quais poucas ou nenhumas falhas são observadas. Uma vez que boa parte das falhas é induzida por mecanismos de degradação, uma alternativa é monitorar o dispositivo por um período de tempo e acessar sua confiabilidade através das mudanças em desempenho (degradação observadas durante aquele período. O objetivo deste artigo é ilustrar como dados de degradação podem ser modelados e analisados utilizando-se abordagens "clássicas" e Bayesiana. Quatro métodos de análise de dados baseados em inferência clássica são apresentados. A seguir, mostramos como os métodos Bayesianos podem também ser aplicados para proporcionar uma abordagem natural à análise de dados de degradação. As abordagens são aplicadas a um banco de dados real relacionado à degradação de rodas de trens.
Yang, Ziheng; Zhu, Tianqi
2018-02-20
The Bayesian method is noted to produce spuriously high posterior probabilities for phylogenetic trees in analysis of large datasets, but the precise reasons for this overconfidence are unknown. In general, the performance of Bayesian selection of misspecified models is poorly understood, even though this is of great scientific interest since models are never true in real data analysis. Here we characterize the asymptotic behavior of Bayesian model selection and show that when the competing models are equally wrong, Bayesian model selection exhibits surprising and polarized behaviors in large datasets, supporting one model with full force while rejecting the others. If one model is slightly less wrong than the other, the less wrong model will eventually win when the amount of data increases, but the method may become overconfident before it becomes reliable. We suggest that this extreme behavior may be a major factor for the spuriously high posterior probabilities for evolutionary trees. The philosophical implications of our results to the application of Bayesian model selection to evaluate opposing scientific hypotheses are yet to be explored, as are the behaviors of non-Bayesian methods in similar situations.
Tauber, Sean; Navarro, Daniel J; Perfors, Amy; Steyvers, Mark
2017-07-01
Recent debates in the psychological literature have raised questions about the assumptions that underpin Bayesian models of cognition and what inferences they license about human cognition. In this paper we revisit this topic, arguing that there are 2 qualitatively different ways in which a Bayesian model could be constructed. The most common approach uses a Bayesian model as a normative standard upon which to license a claim about optimality. In the alternative approach, a descriptive Bayesian model need not correspond to any claim that the underlying cognition is optimal or rational, and is used solely as a tool for instantiating a substantive psychological theory. We present 3 case studies in which these 2 perspectives lead to different computational models and license different conclusions about human cognition. We demonstrate how the descriptive Bayesian approach can be used to answer different sorts of questions than the optimal approach, especially when combined with principled tools for model evaluation and model selection. More generally we argue for the importance of making a clear distinction between the 2 perspectives. Considerable confusion results when descriptive models and optimal models are conflated, and if Bayesians are to avoid contributing to this confusion it is important to avoid making normative claims when none are intended. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Bayesian leave-one-out cross-validation approximations for Gaussian latent variable models
DEFF Research Database (Denmark)
Vehtari, Aki; Mononen, Tommi; Tolvanen, Ville
2016-01-01
The future predictive performance of a Bayesian model can be estimated using Bayesian cross-validation. In this article, we consider Gaussian latent variable models where the integration over the latent values is approximated using the Laplace method or expectation propagation (EP). We study...... the properties of several Bayesian leave-one-out (LOO) cross-validation approximations that in most cases can be computed with a small additional cost after forming the posterior approximation given the full data. Our main objective is to assess the accuracy of the approximative LOO cross-validation estimators...
Bayesian Safety Risk Modeling of Human-Flightdeck Automation Interaction
Ancel, Ersin; Shih, Ann T.
2015-01-01
Usage of automatic systems in airliners has increased fuel efficiency, added extra capabilities, enhanced safety and reliability, as well as provide improved passenger comfort since its introduction in the late 80's. However, original automation benefits, including reduced flight crew workload, human errors or training requirements, were not achieved as originally expected. Instead, automation introduced new failure modes, redistributed, and sometimes increased workload, brought in new cognitive and attention demands, and increased training requirements. Modern airliners have numerous flight modes, providing more flexibility (and inherently more complexity) to the flight crew. However, the price to pay for the increased flexibility is the need for increased mode awareness, as well as the need to supervise, understand, and predict automated system behavior. Also, over-reliance on automation is linked to manual flight skill degradation and complacency in commercial pilots. As a result, recent accidents involving human errors are often caused by the interactions between humans and the automated systems (e.g., the breakdown in man-machine coordination), deteriorated manual flying skills, and/or loss of situational awareness due to heavy dependence on automated systems. This paper describes the development of the increased complexity and reliance on automation baseline model, named FLAP for FLightdeck Automation Problems. The model development process starts with a comprehensive literature review followed by the construction of a framework comprised of high-level causal factors leading to an automation-related flight anomaly. The framework was then converted into a Bayesian Belief Network (BBN) using the Hugin Software v7.8. The effects of automation on flight crew are incorporated into the model, including flight skill degradation, increased cognitive demand and training requirements along with their interactions. Besides flight crew deficiencies, automation system
Bayesian joint modelling of benefit and risk in drug development.
Costa, Maria J; Drury, Thomas
2018-05-01
To gain regulatory approval, a new medicine must demonstrate that its benefits outweigh any potential risks, ie, that the benefit-risk balance is favourable towards the new medicine. For transparency and clarity of the decision, a structured and consistent approach to benefit-risk assessment that quantifies uncertainties and accounts for underlying dependencies is desirable. This paper proposes two approaches to benefit-risk evaluation, both based on the idea of joint modelling of mixed outcomes that are potentially dependent at the subject level. Using Bayesian inference, the two approaches offer interpretability and efficiency to enhance qualitative frameworks. Simulation studies show that accounting for correlation leads to a more accurate assessment of the strength of evidence to support benefit-risk profiles of interest. Several graphical approaches are proposed that can be used to communicate the benefit-risk balance to project teams. Finally, the two approaches are illustrated in a case study using real clinical trial data. Copyright © 2018 John Wiley & Sons, Ltd.
Bayesian inversion using a geologically realistic and discrete model space
Jaeggli, C.; Julien, S.; Renard, P.
2017-12-01
Since the early days of groundwater modeling, inverse methods play a crucial role. Many research and engineering groups aim to infer extensive knowledge of aquifer parameters from a sparse set of observations. Despite decades of dedicated research on this topic, there are still several major issues to be solved. In the hydrogeological framework, one is often confronted with underground structures that present very sharp contrasts of geophysical properties. In particular, subsoil structures such as karst conduits, channels, faults, or lenses, strongly influence groundwater flow and transport behavior of the underground. For this reason it can be essential to identify their location and shape very precisely. Unfortunately, when inverse methods are specially trained to consider such complex features, their computation effort often becomes unaffordably high. The following work is an attempt to solve this dilemma. We present a new method that is, in some sense, a compromise between the ergodicity of Markov chain Monte Carlo (McMC) methods and the efficient handling of data by the ensemble based Kalmann filters. The realistic and complex random fields are generated by a Multiple-Point Statistics (MPS) tool. Nonetheless, it is applicable with any conditional geostatistical simulation tool. Furthermore, the algorithm is independent of any parametrization what becomes most important when two parametric systems are equivalent (permeability and resistivity, speed and slowness, etc.). When compared to two existing McMC schemes, the computational effort was divided by a factor of 12.
Statistical modelling of railway track geometry degradation using Hierarchical Bayesian models
International Nuclear Information System (INIS)
Andrade, A.R.; Teixeira, P.F.
2015-01-01
Railway maintenance planners require a predictive model that can assess the railway track geometry degradation. The present paper uses a Hierarchical Bayesian model as a tool to model the main two quality indicators related to railway track geometry degradation: the standard deviation of longitudinal level defects and the standard deviation of horizontal alignment defects. Hierarchical Bayesian Models (HBM) are flexible statistical models that allow specifying different spatially correlated components between consecutive track sections, namely for the deterioration rates and the initial qualities parameters. HBM are developed for both quality indicators, conducting an extensive comparison between candidate models and a sensitivity analysis on prior distributions. HBM is applied to provide an overall assessment of the degradation of railway track geometry, for the main Portuguese railway line Lisbon–Oporto. - Highlights: • Rail track geometry degradation is analysed using Hierarchical Bayesian models. • A Gibbs sampling strategy is put forward to estimate the HBM. • Model comparison and sensitivity analysis find the most suitable model. • We applied the most suitable model to all the segments of the main Portuguese line. • Tackling spatial correlations using CAR structures lead to a better model fit
DEFF Research Database (Denmark)
Quinonero, Joaquin; Girard, Agathe; Larsen, Jan
2003-01-01
The object of Bayesian modelling is predictive distribution, which, in a forecasting scenario, enables evaluation of forecasted values and their uncertainties. We focus on reliably estimating the predictive mean and variance of forecasted values using Bayesian kernel based models such as the Gaus......The object of Bayesian modelling is predictive distribution, which, in a forecasting scenario, enables evaluation of forecasted values and their uncertainties. We focus on reliably estimating the predictive mean and variance of forecasted values using Bayesian kernel based models...... such as the Gaussian process and the relevance vector machine. We derive novel analytic expressions for the predictive mean and variance for Gaussian kernel shapes under the assumption of a Gaussian input distribution in the static case, and of a recursive Gaussian predictive density in iterative forecasting...
U.S. Environmental Protection Agency — The dataset is lake dissolved oxygen concentrations obtained form plots published by Gelda et al. (1996) and lake reaeration model simulated values using Bayesian...
Fenton, N.; Neil, M.; Lagnado, D.; Marsh, W.; Yet, B.; Constantinou, A.
2016-01-01
We show that existing Bayesian network (BN) modelling techniques cannot capture the correct intuitive reasoning in the important case when a set of mutually exclusive events need to be modelled as separate nodes instead of states of a single node. A previously proposed ‘solution’, which introduces a simple constraint node that enforces mutual exclusivity, fails to preserve the prior probabilities of the events, while other proposed solutions involve major changes to the original model. We pro...
Energy Technology Data Exchange (ETDEWEB)
Duarte, Juliana P.; Leite, Victor C.; Melo, P.F. Frutuoso e, E-mail: julianapduarte@poli.ufrj.br, E-mail: victor.coppo.leite@poli.ufrj.br, E-mail: frutuoso@nuclear.ufrj.br [Universidade Federal do Rio de Janeiro (UFRJ), Rio de Janeiro, RJ (Brazil)
2013-07-01
Bayesian networks have become a very handy tool for solving problems in various application areas. This paper discusses the use of Bayesian networks to treat dependent events in reliability engineering typically modeled by Markovian models. Dependent events play an important role as, for example, when treating load-sharing systems, bridge systems, common-cause failures, and switching systems (those for which a standby component is activated after the main one fails by means of a switching mechanism). Repair plays an important role in all these cases (as, for example, the number of repairmen). All Bayesian network calculations are performed by means of the Netica™ software, of Norsys Software Corporation, and Fortran 90 to evaluate them over time. The discussion considers the development of time-dependent reliability figures of merit, which are easily obtained, through Markovian models, but not through Bayesian networks, because these latter need probability figures as input and not failure and repair rates. Bayesian networks produced results in very good agreement with those of Markov models and pivotal decomposition. Static and discrete time (DTBN) Bayesian networks were used in order to check their capabilities of modeling specific situations, like switching failures in cold-standby systems. The DTBN was more flexible to modeling systems where the time of occurrence of an event is important, for example, standby failure and repair. However, the static network model showed as good results as DTBN by a much more simplified approach. (author)
International Nuclear Information System (INIS)
Duarte, Juliana P.; Leite, Victor C.; Melo, P.F. Frutuoso e
2013-01-01
Bayesian networks have become a very handy tool for solving problems in various application areas. This paper discusses the use of Bayesian networks to treat dependent events in reliability engineering typically modeled by Markovian models. Dependent events play an important role as, for example, when treating load-sharing systems, bridge systems, common-cause failures, and switching systems (those for which a standby component is activated after the main one fails by means of a switching mechanism). Repair plays an important role in all these cases (as, for example, the number of repairmen). All Bayesian network calculations are performed by means of the Netica™ software, of Norsys Software Corporation, and Fortran 90 to evaluate them over time. The discussion considers the development of time-dependent reliability figures of merit, which are easily obtained, through Markovian models, but not through Bayesian networks, because these latter need probability figures as input and not failure and repair rates. Bayesian networks produced results in very good agreement with those of Markov models and pivotal decomposition. Static and discrete time (DTBN) Bayesian networks were used in order to check their capabilities of modeling specific situations, like switching failures in cold-standby systems. The DTBN was more flexible to modeling systems where the time of occurrence of an event is important, for example, standby failure and repair. However, the static network model showed as good results as DTBN by a much more simplified approach. (author)
LEON-GONZALEZ, Roberto; VINAYAGATHASAN, Thanabalasingam
2013-01-01
This paper investigates the determinants of growth in the Asian developing economies. We use Bayesian model averaging (BMA) in the context of a dynamic panel data growth regression to overcome the uncertainty over the choice of control variables. In addition, we use a Bayesian algorithm to analyze a large number of competing models. Among the explanatory variables, we include a non-linear function of inflation that allows for threshold effects. We use an unbalanced panel data set of 27 Asian ...
Costa, Veber; Fernandes, Wilson
2017-11-01
Extreme flood estimation has been a key research topic in hydrological sciences. Reliable estimates of such events are necessary as structures for flood conveyance are continuously evolving in size and complexity and, as a result, their failure-associated hazards become more and more pronounced. Due to this fact, several estimation techniques intended to improve flood frequency analysis and reducing uncertainty in extreme quantile estimation have been addressed in the literature in the last decades. In this paper, we develop a Bayesian framework for the indirect estimation of extreme flood quantiles from rainfall-runoff models. In the proposed approach, an ensemble of long daily rainfall series is simulated with a stochastic generator, which models extreme rainfall amounts with an upper-bounded distribution function, namely, the 4-parameter lognormal model. The rationale behind the generation model is that physical limits for rainfall amounts, and consequently for floods, exist and, by imposing an appropriate upper bound for the probabilistic model, more plausible estimates can be obtained for those rainfall quantiles with very low exceedance probabilities. Daily rainfall time series are converted into streamflows by routing each realization of the synthetic ensemble through a conceptual hydrologic model, the Rio Grande rainfall-runoff model. Calibration of parameters is performed through a nonlinear regression model, by means of the specification of a statistical model for the residuals that is able to accommodate autocorrelation, heteroscedasticity and nonnormality. By combining the outlined steps in a Bayesian structure of analysis, one is able to properly summarize the resulting uncertainty and estimating more accurate credible intervals for a set of flood quantiles of interest. The method for extreme flood indirect estimation was applied to the American river catchment, at the Folsom dam, in the state of California, USA. Results show that most floods
Maiorano, Andrea; Martre, Pierre; Asseng, Senthold; Ewert, Frank; Mueller, Christoph; Roetter, Reimund P.; Ruane, Alex C.; Semenov, Mikhail A.; Wallach, Daniel; Wang, Enli
2016-01-01
To improve climate change impact estimates and to quantify their uncertainty, multi-model ensembles (MMEs) have been suggested. Model improvements can improve the accuracy of simulations and reduce the uncertainty of climate change impact assessments. Furthermore, they can reduce the number of models needed in a MME. Herein, 15 wheat growth models of a larger MME were improved through re-parameterization and/or incorporating or modifying heat stress effects on phenology, leaf growth and senescence, biomass growth, and grain number and size using detailed field experimental data from the USDA Hot Serial Cereal experiment (calibration data set). Simulation results from before and after model improvement were then evaluated with independent field experiments from a CIMMYT worldwide field trial network (evaluation data set). Model improvements decreased the variation (10th to 90th model ensemble percentile range) of grain yields simulated by the MME on average by 39% in the calibration data set and by 26% in the independent evaluation data set for crops grown in mean seasonal temperatures greater than 24 C. MME mean squared error in simulating grain yield decreased by 37%. A reduction in MME uncertainty range by 27% increased MME prediction skills by 47%. Results suggest that the mean level of variation observed in field experiments and used as a benchmark can be reached with half the number of models in the MME. Improving crop models is therefore important to increase the certainty of model-based impact assessments and allow more practical, i.e. smaller MMEs to be used effectively.
Predicting water main failures using Bayesian model averaging and survival modelling approach
International Nuclear Information System (INIS)
Kabir, Golam; Tesfamariam, Solomon; Sadiq, Rehan
2015-01-01
To develop an effective preventive or proactive repair and replacement action plan, water utilities often rely on water main failure prediction models. However, in predicting the failure of water mains, uncertainty is inherent regardless of the quality and quantity of data used in the model. To improve the understanding of water main failure, a Bayesian framework is developed for predicting the failure of water mains considering uncertainties. In this study, Bayesian model averaging method (BMA) is presented to identify the influential pipe-dependent and time-dependent covariates considering model uncertainties whereas Bayesian Weibull Proportional Hazard Model (BWPHM) is applied to develop the survival curves and to predict the failure rates of water mains. To accredit the proposed framework, it is implemented to predict the failure of cast iron (CI) and ductile iron (DI) pipes of the water distribution network of the City of Calgary, Alberta, Canada. Results indicate that the predicted 95% uncertainty bounds of the proposed BWPHMs capture effectively the observed breaks for both CI and DI water mains. Moreover, the performance of the proposed BWPHMs are better compare to the Cox-Proportional Hazard Model (Cox-PHM) for considering Weibull distribution for the baseline hazard function and model uncertainties. - Highlights: • Prioritize rehabilitation and replacements (R/R) strategies of water mains. • Consider the uncertainties for the failure prediction. • Improve the prediction capability of the water mains failure models. • Identify the influential and appropriate covariates for different models. • Determine the effects of the covariates on failure
Accounting for model error due to unresolved scales within ensemble Kalman filtering
Mitchell, Lewis; Carrassi, Alberto
2014-01-01
We propose a method to account for model error due to unresolved scales in the context of the ensemble transform Kalman filter (ETKF). The approach extends to this class of algorithms the deterministic model error formulation recently explored for variational schemes and extended Kalman filter. The model error statistic required in the analysis update is estimated using historical reanalysis increments and a suitable model error evolution law. Two different versions of the method are describe...
Saito, K.; Sueyoshi, T.; Marchenko, S.; Romanovsky, V.; Otto-Bliesner, B.; Walsh, J.; Bigelow, N.; Hendricks, A.; Yoshikawa, K.
2013-01-01
Here, global-scale frozen ground distribution from the Last Glacial Maximum (LGM) has been reconstructed using multi-model ensembles of global climate models, and then compared with evidence-based knowledge and earlier numerical results. Modeled soil temperatures, taken from Paleoclimate Modelling Intercomparison Project phase III (PMIP3) simulations, were used to diagnose the subsurface thermal regime and determine underlying frozen ground types for the present day (pre-industrial; 0 kya) an...
LGM permafrost distribution: how well can the latest PMIP multi-model ensembles reconstruct?
K. Saito; T. Sueyoshi; S. Marchenko; V. Romanovsky; B. Otto-Bliesner; J. Walsh; N. Bigelow; A. Hendricks; K. Yoshikawa
2013-01-01
Global-scale frozen ground distribution during the Last Glacial Maximum (LGM) was reconstructed using multi-model ensembles of global climate models, and then compared with evidence-based knowledge and earlier numerical results. Modeled soil temperatures, taken from Paleoclimate Modelling Intercomparison Project Phase III (PMIP3) simulations, were used to diagnose the subsurface thermal regime and determine underlying frozen ground types for the present-day (pre-industrial; 0 k) and the LGM (...
Lenartz, F.; Raick, C.; Soetaert, K.E.R.; Grégoire, M.
2007-01-01
The Ensemble Kalman filter (EnKF) has been applied to a 1-D complex ecosystem model coupled with a hydrodynamic model of the Ligurian Sea. In order to improve the performance of the EnKF, an ensemble subsampling strategy has been used to better represent the covariance matrices and a pre-analysis
Assessing uncertainties of water footprints using an ensemble of crop growth models on winter wheat
Czech Academy of Sciences Publication Activity Database
Kersebaum, K. C.; Kroes, J.; Gobin, A.; Takáč, J.; Hlavinka, Petr; Trnka, Miroslav; Ventrella, D.; Giglio, L.; Ferrise, R.; Moriondo, M.; Marta, A. D.; Luo, Q.; Eitzinger, Josef; Mirschel, W.; Weigel, H-J.; Manderscheid, R.; Hofmann, M.; Nejedlík, P.; Hösch, J.
2016-01-01
Roč. 8, č. 12 (2016), č. článku 571. ISSN 2073-4441 R&D Projects: GA MŠk(CZ) LO1415; GA MŠk(CZ) LD13030 Institutional support: RVO:67179843 Keywords : water footprint * uncertainty * model ensemble * wheat Subject RIV: DA - Hydrology ; Limnology Impact factor: 1.832, year: 2016
A deep learning-based multi-model ensemble method for cancer prediction.
Xiao, Yawen; Wu, Jun; Lin, Zongli; Zhao, Xiaodong
2018-01-01
Cancer is a complex worldwide health problem associated with high mortality. With the rapid development of the high-throughput sequencing technology and the application of various machine learning methods that have emerged in recent years, progress in cancer prediction has been increasingly made based on gene expression, providing insight into effective and accurate treatment decision making. Thus, developing machine learning methods, which can successfully distinguish cancer patients from healthy persons, is of great current interest. However, among the classification methods applied to cancer prediction so far, no one method outperforms all the others. In this paper, we demonstrate a new strategy, which applies deep learning to an ensemble approach that incorporates multiple different machine learning models. We supply informative gene data selected by differential gene expression analysis to five different classification models. Then, a deep learning method is employed to ensemble the outputs of the five classifiers. The proposed deep learning-based multi-model ensemble method was tested on three public RNA-seq data sets of three kinds of cancers, Lung Adenocarcinoma, Stomach Adenocarcinoma and Breast Invasive Carcinoma. The test results indicate that it increases the prediction accuracy of cancer for all the tested RNA-seq data sets as compared to using a single classifier or the majority voting algorithm. By taking full advantage of different classifiers, the proposed deep learning-based multi-model ensemble method is shown to be accurate and effective for cancer prediction. Copyright © 2017 Elsevier B.V. All rights reserved.
Castruccio, Stefano
2015-04-02
One of the main challenges when working with modern climate model ensembles is the increasingly larger size of the data produced, and the consequent difficulty in storing large amounts of spatio-temporally resolved information. Many compression algorithms can be used to mitigate this problem, but since they are designed to compress generic scientific data sets, they do not account for the nature of climate model output and they compress only individual simulations. In this work, we propose a different, statistics-based approach that explicitly accounts for the space-time dependence of the data for annual global three-dimensional temperature fields in an initial condition ensemble. The set of estimated parameters is small (compared to the data size) and can be regarded as a summary of the essential structure of the ensemble output; therefore, it can be used to instantaneously reproduce the temperature fields in an ensemble with a substantial saving in storage and time. The statistical model exploits the gridded geometry of the data and parallelization across processors. It is therefore computationally convenient and allows to fit a non-trivial model to a data set of one billion data points with a covariance matrix comprising of 10^18 entries.
Castruccio, Stefano; Genton, Marc G.
2015-01-01
One of the main challenges when working with modern climate model ensembles is the increasingly larger size of the data produced, and the consequent difficulty in storing large amounts of spatio-temporally resolved information. Many compression algorithms can be used to mitigate this problem, but since they are designed to compress generic scientific data sets, they do not account for the nature of climate model output and they compress only individual simulations. In this work, we propose a different, statistics-based approach that explicitly accounts for the space-time dependence of the data for annual global three-dimensional temperature fields in an initial condition ensemble. The set of estimated parameters is small (compared to the data size) and can be regarded as a summary of the essential structure of the ensemble output; therefore, it can be used to instantaneously reproduce the temperature fields in an ensemble with a substantial saving in storage and time. The statistical model exploits the gridded geometry of the data and parallelization across processors. It is therefore computationally convenient and allows to fit a non-trivial model to a data set of one billion data points with a covariance matrix comprising of 10^18 entries.
Experimental real-time multi-model ensemble (MME) prediction of ...
Indian Academy of Sciences (India)
calibration (training) has to be of good quality. Otherwise, it might degrade the MME results. Early works by ... ECMWF ensemble data (Evans et al 2000), and they showed the superiority of the multi-model system over the ..... eral idea of the quality of rainfall forecasts in terms of error statistics for monsoon for the member.
Bayesian Model Averaging of Artificial Intelligence Models for Hydraulic Conductivity Estimation
Nadiri, A.; Chitsazan, N.; Tsai, F. T.; Asghari Moghaddam, A.
2012-12-01
This research presents a Bayesian artificial intelligence model averaging (BAIMA) method that incorporates multiple artificial intelligence (AI) models to estimate hydraulic conductivity and evaluate estimation uncertainties. Uncertainty in the AI model outputs stems from error in model input as well as non-uniqueness in selecting different AI methods. Using one single AI model tends to bias the estimation and underestimate uncertainty. BAIMA employs Bayesian model averaging (BMA) technique to address the issue of using one single AI model for estimation. BAIMA estimates hydraulic conductivity by averaging the outputs of AI models according to their model weights. In this study, the model weights were determined using the Bayesian information criterion (BIC) that follows the parsimony principle. BAIMA calculates the within-model variances to account for uncertainty propagation from input data to AI model output. Between-model variances are evaluated to account for uncertainty due to model non-uniqueness. We employed Takagi-Sugeno fuzzy logic (TS-FL), artificial neural network (ANN) and neurofuzzy (NF) to estimate hydraulic conductivity for the Tasuj plain aquifer, Iran. BAIMA combined three AI models and produced better fitting than individual models. While NF was expected to be the best AI model owing to its utilization of both TS-FL and ANN models, the NF model is nearly discarded by the parsimony principle. The TS-FL model and the ANN model showed equal importance although their hydraulic conductivity estimates were quite different. This resulted in significant between-model variances that are normally ignored by using one AI model.
Bayesian network as a modelling tool for risk management in agriculture
DEFF Research Database (Denmark)
Rasmussen, Svend; Madsen, Anders L.; Lund, Mogens
. In this paper we use Bayesian networks as an integrated modelling approach for representing uncertainty and analysing risk management in agriculture. It is shown how historical farm account data may be efficiently used to estimate conditional probabilities, which are the core elements in Bayesian network models....... We further show how the Bayesian network model RiBay is used for stochastic simulation of farm income, and we demonstrate how RiBay can be used to simulate risk management at the farm level. It is concluded that the key strength of a Bayesian network is the transparency of assumptions......, and that it has the ability to link uncertainty from different external sources to budget figures and to quantify risk at the farm level....
Bayesian Proteoform Modeling Improves Protein Quantification of Global Proteomic Measurements
Energy Technology Data Exchange (ETDEWEB)
Webb-Robertson, Bobbie-Jo M.; Matzke, Melissa M.; Datta, Susmita; Payne, Samuel H.; Kang, Jiyun; Bramer, Lisa M.; Nicora, Carrie D.; Shukla, Anil K.; Metz, Thomas O.; Rodland, Karin D.; Smith, Richard D.; Tardiff, Mark F.; McDermott, Jason E.; Pounds, Joel G.; Waters, Katrina M.
2014-12-01
As the capability of mass spectrometry-based proteomics has matured, tens of thousands of peptides can be measured simultaneously, which has the benefit of offering a systems view of protein expression. However, a major challenge is that with an increase in throughput, protein quantification estimation from the native measured peptides has become a computational task. A limitation to existing computationally-driven protein quantification methods is that most ignore protein variation, such as alternate splicing of the RNA transcript and post-translational modifications or other possible proteoforms, which will affect a significant fraction of the proteome. The consequence of this assumption is that statistical inference at the protein level, and consequently downstream analyses, such as network and pathway modeling, have only limited power for biomarker discovery. Here, we describe a Bayesian model (BP-Quant) that uses statistically derived peptides signatures to identify peptides that are outside the dominant pattern, or the existence of multiple over-expressed patterns to improve relative protein abundance estimates. It is a research-driven approach that utilizes the objectives of the experiment, defined in the context of a standard statistical hypothesis, to identify a set of peptides exhibiting similar statistical behavior relating to a protein. This approach infers that changes in relative protein abundance can be used as a surrogate for changes in function, without necessarily taking into account the effect of differential post-translational modifications, processing, or splicing in altering protein function. We verify the approach using a dilution study from mouse plasma samples and demonstrate that BP-Quant achieves similar accuracy as the current state-of-the-art methods at proteoform identification with significantly better specificity. BP-Quant is available as a MatLab ® and R packages at https://github.com/PNNL-Comp-Mass-Spec/BP-Quant.
A Bayesian hierarchical model for demand curve analysis.
Ho, Yen-Yi; Nhu Vo, Tien; Chu, Haitao; Luo, Xianghua; Le, Chap T
2018-07-01
Drug self-administration experiments are a frequently used approach to assessing the abuse liability and reinforcing property of a compound. It has been used to assess the abuse liabilities of various substances such as psychomotor stimulants and hallucinogens, food, nicotine, and alcohol. The demand curve generated from a self-administration study describes how demand of a drug or non-drug reinforcer varies as a function of price. With the approval of the 2009 Family Smoking Prevention and Tobacco Control Act, demand curve analysis provides crucial evidence to inform the US Food and Drug Administration's policy on tobacco regulation, because it produces several important quantitative measurements to assess the reinforcing strength of nicotine. The conventional approach popularly used to analyze the demand curve data is individual-specific non-linear least square regression. The non-linear least square approach sets out to minimize the residual sum of squares for each subject in the dataset; however, this one-subject-at-a-time approach does not allow for the estimation of between- and within-subject variability in a unified model framework. In this paper, we review the existing approaches to analyze the demand curve data, non-linear least square regression, and the mixed effects regression and propose a new Bayesian hierarchical model. We conduct simulation analyses to compare the performance of these three approaches and illustrate the proposed approaches in a case study of nicotine self-administration in rats. We present simulation results and discuss the benefits of using the proposed approaches.
International Nuclear Information System (INIS)
Linares-Rodriguez, Alvaro; Ruiz-Arias, José Antonio; Pozo-Vazquez, David; Tovar-Pescador, Joaquin
2013-01-01
An optimized artificial neural network ensemble model is built to estimate daily global solar radiation over large areas. The model uses clear-sky estimates and satellite images as input variables. Unlike most studies using satellite imagery based on visible channels, our model also exploits all information within infrared channels of the Meteosat 9 satellite. A genetic algorithm is used to optimize selection of model inputs, for which twelve are selected – eleven 3-km Meteosat 9 channels and one clear-sky term. The model is validated in Andalusia (Spain) from January 2008 through December 2008. Measured data from 83 stations across the region are used, 65 for training and 18 independent ones for testing the model. At the latter stations, the ensemble model yields an overall root mean square error of 6.74% and correlation coefficient of 99%; the generated estimates are relatively accurate and errors spatially uniform. The model yields reliable results even on cloudy days, improving on current models based on satellite imagery. - Highlights: • Daily solar radiation data are generated using an artificial neural network ensemble. • Eleven Meteosat channels observations and a clear sky term are used as model inputs. • Model exploits all information within infrared Meteosat channels. • Measured data for a year from 83 ground stations are used. • The proposed approach has better performance than existing models on daily basis
Zhang, Li; Ai, Haixin; Chen, Wen; Yin, Zimo; Hu, Huan; Zhu, Junfeng; Zhao, Jian; Zhao, Qi; Liu, Hongsheng
2017-05-18
Carcinogenicity refers to a highly toxic end point of certain chemicals, and has become an important issue in the drug development process. In this study, three novel ensemble classification models, namely Ensemble SVM, Ensemble RF, and Ensemble XGBoost, were developed to predict carcinogenicity of chemicals using seven types of molecular fingerprints and three machine learning methods based on a dataset containing 1003 diverse compounds with rat carcinogenicity. Among these three models, Ensemble XGBoost is found to be the best, giving an average accuracy of 70.1 ± 2.9%, sensitivity of 67.0 ± 5.0%, and specificity of 73.1 ± 4.4% in five-fold cross-validation and an accuracy of 70.0%, sensitivity of 65.2%, and specificity of 76.5% in external validation. In comparison with some recent methods, the ensemble models outperform some machine learning-based approaches and yield equal accuracy and higher specificity but lower sensitivity than rule-based expert systems. It is also found that the ensemble models could be further improved if more data were available. As an application, the ensemble models are employed to discover potential carcinogens in the DrugBank database. The results indicate that the proposed models are helpful in predicting the carcinogenicity of chemicals. A web server called CarcinoPred-EL has been built for these models ( http://ccsipb.lnu.edu.cn/toxicity/CarcinoPred-EL/ ).
Ensemble Modeling for Robustness Analysis in engineering non-native metabolic pathways.
Lee, Yun; Lafontaine Rivera, Jimmy G; Liao, James C
2014-09-01
Metabolic pathways in cells must be sufficiently robust to tolerate fluctuations in expression levels and changes in environmental conditions. Perturbations in expression levels may lead to system failure due to the disappearance of a stable steady state. Increasing evidence has suggested that biological networks have evolved such that they are intrinsically robust in their network structure. In this article, we presented Ensemble Modeling for Robustness Analysis (EMRA), which combines a continuation method with the Ensemble Modeling approach, for investigating the robustness issue of non-native pathways. EMRA investigates a large ensemble of reference models with different parameters, and determines the effects of parameter drifting until a bifurcation point, beyond which a stable steady state disappears and system failure occurs. A pathway is considered to have high bifurcational robustness if the probability of system failure is low in the ensemble. To demonstrate the utility of EMRA, we investigate the bifurcational robustness of two synthetic central metabolic pathways that achieve carbon conservation: non-oxidative glycolysis and reverse glyoxylate cycle. With EMRA, we determined the probability of system failure of each design and demonstrated that alternative designs of these pathways indeed display varying degrees of bifurcational robustness. Furthermore, we demonstrated that target selection for flux improvement should consider the trade-offs between robustness and performance. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.
Visualizing projected Climate Changes - the CMIP5 Multi-Model Ensemble
Böttinger, Michael; Eyring, Veronika; Lauer, Axel; Meier-Fleischer, Karin
2017-04-01
Large ensembles add an additional dimension to climate model simulations. Internal variability of the climate system can be assessed for example by multiple climate model simulations with small variations in the initial conditions or by analyzing the spread in large ensembles made by multiple climate models under common protocols. This spread is often used as a measure of uncertainty in climate projections. In the context of the fifth phase of the WCRP's Coupled Model Intercomparison Project (CMIP5), more than 40 different coupled climate models were employed to carry out a coordinated set of experiments. Time series of the development of integral quantities such as the global mean temperature change for all models visualize the spread in the multi-model ensemble. A similar approach can be applied to 2D-visualizations of projected climate changes such as latitude-longitude maps showing the multi-model mean of the ensemble by adding a graphical representation of the uncertainty information. This has been demonstrated for example with static figures in chapter 12 of the last IPCC report (AR5) using different so-called stippling and hatching techniques. In this work, we focus on animated visualizations of multi-model ensemble climate projections carried out within CMIP5 as a way of communicating climate change results to the scientific community as well as to the public. We take a closer look at measures of robustness or uncertainty used in recent publications suitable for animated visualizations. Specifically, we use the ESMValTool [1] to process and prepare the CMIP5 multi-model data in combination with standard visualization tools such as NCL and the commercial 3D visualization software Avizo to create the animations. We compare different visualization techniques such as height fields or shading with transparency for creating animated visualization of ensemble mean changes in temperature and precipitation including corresponding robustness measures. [1] Eyring, V
A Bayesian Approach for Summarizing and Modeling Time-Series Exposure Data with Left Censoring.
Houseman, E Andres; Virji, M Abbas
2017-08-01
Direct reading instruments are valuable tools for measuring exposure as they provide real-time measurements for rapid decision making. However, their use is limited to general survey applications in part due to issues related to their performance. Moreover, statistical analysis of real-time data is complicated by autocorrelation among successive measurements, non-stationary time series, and the presence of left-censoring due to limit-of-detection (LOD). A Bayesian framework is proposed that accounts for non-stationary autocorrelation and LOD issues in exposure time-series data in order to model workplace factors that affect exposure and estimate summary statistics for tasks or other covariates of interest. A spline-based approach is used to model non-stationary autocorrelation with relatively few assumptions about autocorrelation structure. Left-censoring is addressed by integrating over the left tail of the distribution. The model is fit using Markov-Chain Monte Carlo within a Bayesian paradigm. The method can flexibly account for hierarchical relationships, random effects and fixed effects of covariates. The method is implemented using the rjags package in R, and is illustrated by applying it to real-time exposure data. Estimates for task means and covariates from the Bayesian model are compared to those from conventional frequentist models including linear regression, mixed-effects, and time-series models with different autocorrelation structures. Simulations studies are also conducted to evaluate method performance. Simulation studies with percent of measurements below the LOD ranging from 0 to 50% showed lowest root mean squared errors for task means and the least biased standard deviations from the Bayesian model compared to the frequentist models across all levels of LOD. In the application, task means from the Bayesian model were similar to means from the frequentist models, while the standard deviations were different. Parameter estimates for covariates
Bayesian estimation of regularization parameters for deformable surface models
International Nuclear Information System (INIS)
Cunningham, G.S.; Lehovich, A.; Hanson, K.M.
1999-01-01
In this article the authors build on their past attempts to reconstruct a 3D, time-varying bolus of radiotracer from first-pass data obtained by the dynamic SPECT imager, FASTSPECT, built by the University of Arizona. The object imaged is a CardioWest total artificial heart. The bolus is entirely contained in one ventricle and its associated inlet and outlet tubes. The model for the radiotracer distribution at a given time is a closed surface parameterized by 482 vertices that are connected to make 960 triangles, with nonuniform intensity variations of radiotracer allowed inside the surface on a voxel-to-voxel basis. The total curvature of the surface is minimized through the use of a weighted prior in the Bayesian framework, as is the weighted norm of the gradient of the voxellated grid. MAP estimates for the vertices, interior intensity voxels and background count level are produced. The strength of the priors, or hyperparameters, are determined by maximizing the probability of the data given the hyperparameters, called the evidence. The evidence is calculated by first assuming that the posterior is approximately normal in the values of the vertices and voxels, and then by evaluating the integral of the multi-dimensional normal distribution. This integral (which requires evaluating the determinant of a covariance matrix) is computed by applying a recent algorithm from Bai et. al. that calculates the needed determinant efficiently. They demonstrate that the radiotracer is highly inhomogeneous in early time frames, as suspected in earlier reconstruction attempts that assumed a uniform intensity of radiotracer within the closed surface, and that the optimal choice of hyperparameters is substantially different for different time frames
Gharamti, M. E.
2015-05-11
The ensemble Kalman filter (EnKF) is a popular method for state-parameters estimation of subsurface flow and transport models based on field measurements. The common filtering procedure is to directly update the state and parameters as one single vector, which is known as the Joint-EnKF. In this study, we follow the one-step-ahead smoothing formulation of the filtering problem, to derive a new joint-based EnKF which involves a smoothing step of the state between two successive analysis steps. The new state-parameters estimation scheme is derived in a consistent Bayesian filtering framework and results in separate update steps for the state and the parameters. This new algorithm bears strong resemblance with the Dual-EnKF, but unlike the latter which first propagates the state with the model then updates it with the new observation, the proposed scheme starts by an update step, followed by a model integration step. We exploit this new formulation of the joint filtering problem and propose an efficient model-integration-free iterative procedure on the update step of the parameters only for further improved performances. Numerical experiments are conducted with a two-dimensional synthetic subsurface transport model simulating the migration of a contaminant plume in a heterogenous aquifer domain. Contaminant concentration data are assimilated to estimate both the contaminant state and the hydraulic conductivity field. Assimilation runs are performed under imperfect modeling conditions and various observational scenarios. Simulation results suggest that the proposed scheme efficiently recovers both the contaminant state and the aquifer conductivity, providing more accurate estimates than the standard Joint and Dual EnKFs in all tested scenarios. Iterating on the update step of the new scheme further enhances the proposed filter’s behavior. In term of computational cost, the new Joint-EnKF is almost equivalent to that of the Dual-EnKF, but requires twice more model
Bayesian Subset Modeling for High-Dimensional Generalized Linear Models
Liang, Faming; Song, Qifan; Yu, Kai
2013-01-01
criterion model. The consistency of the resulting posterior is established under mild conditions. Further, a variable screening procedure is proposed based on the marginal inclusion probability, which shares the same properties of sure screening
Energy Technology Data Exchange (ETDEWEB)
Brown, Justin [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Hund, Lauren [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
2017-02-01
Dynamic compression experiments are being performed on complicated materials using increasingly complex drivers. The data produced in these experiments are beginning to reach a regime where traditional analysis techniques break down; requiring the solution of an inverse problem. A common measurement in dynamic experiments is an interface velocity as a function of time, and often this functional output can be simulated using a hydrodynamics code. Bayesian model calibration is a statistical framework to estimate inputs into a computational model in the presence of multiple uncertainties, making it well suited to measurements of this type. In this article, we apply Bayesian model calibration to high pressure (250 GPa) ramp compression measurements in tantalum. We address several issues speci c to this calibration including the functional nature of the output as well as parameter and model discrepancy identi ability. Speci cally, we propose scaling the likelihood function by an e ective sample size rather than modeling the autocorrelation function to accommodate the functional output and propose sensitivity analyses using the notion of `modularization' to assess the impact of experiment-speci c nuisance input parameters on estimates of material properties. We conclude that the proposed Bayesian model calibration procedure results in simple, fast, and valid inferences on the equation of state parameters for tantalum.
Garcia Galiano, S. G.; Olmos, P.; Giraldo Osorio, J. D.
2015-12-01
In the Mediterranean area, significant changes on temperature and precipitation are expected throughout the century. These trends could exacerbate the existing conditions in regions already vulnerable to climatic variability, reducing the water availability. Improving knowledge about plausible impacts of climate change on water cycle processes at basin scale, is an important step for building adaptive capacity to the impacts in this region, where severe water shortages are expected for the next decades. RCMs ensemble in combination with distributed hydrological models with few parameters, constitutes a valid and robust methodology to increase the reliability of climate and hydrological projections. For reaching this objective, a novel methodology for building Regional Climate Models (RCMs) ensembles of meteorological variables (rainfall an temperatures), was applied. RCMs ensembles are justified for increasing the reliability of climate and hydrological projections. The evaluation of RCMs goodness-of-fit to build the ensemble is based on empirical probability density functions (PDF) extracted from both RCMs dataset and a highly resolution gridded observational dataset, for the time period 1961-1990. The applied method is considering the seasonal and annual variability of the rainfall and temperatures. The RCMs ensembles constitute the input to a distributed hydrological model at basin scale, for assessing the runoff projections. The selected hydrological model is presenting few parameters in order to reduce the uncertainties involved. The study basin corresponds to a head basin of Segura River Basin, located in the South East of Spain. The impacts on runoff and its trend from observational dataset and climate projections, were assessed. Considering the control period 1961-1990, plausible significant decreases in runoff for the time period 2021-2050, were identified.
An iterative stochastic ensemble method for parameter estimation of subsurface flow models
International Nuclear Information System (INIS)
Elsheikh, Ahmed H.; Wheeler, Mary F.; Hoteit, Ibrahim
2013-01-01
Parameter estimation for subsurface flow models is an essential step for maximizing the value of numerical simulations for future prediction and the development of effective control strategies. We propose the iterative stochastic ensemble method (ISEM) as a general method for parameter estimation based on stochastic estimation of gradients using an ensemble of directional derivatives. ISEM eliminates the need for adjoint coding and deals with the numerical simulator as a blackbox. The proposed method employs directional derivatives within a Gauss–Newton iteration. The update equation in ISEM resembles the update step in ensemble Kalman filter, however the inverse of the output covariance matrix in ISEM is regularized using standard truncated singular value decomposition or Tikhonov regularization. We also investigate the performance of a set of shrinkage based covariance estimators within ISEM. The proposed method is successfully applied on several nonlinear parameter estimation problems for subsurface flow models. The efficiency of the proposed algorithm is demonstrated by the small size of utilized ensembles and in terms of error convergence rates
An iterative stochastic ensemble method for parameter estimation of subsurface flow models
Elsheikh, Ahmed H.
2013-06-01
Parameter estimation for subsurface flow models is an essential step for maximizing the value of numerical simulations for future prediction and the development of effective control strategies. We propose the iterative stochastic ensemble method (ISEM) as a general method for parameter estimation based on stochastic estimation of gradients using an ensemble of directional derivatives. ISEM eliminates the need for adjoint coding and deals with the numerical simulator as a blackbox. The proposed method employs directional derivatives within a Gauss-Newton iteration. The update equation in ISEM resembles the update step in ensemble Kalman filter, however the inverse of the output covariance matrix in ISEM is regularized using standard truncated singular value decomposition or Tikhonov regularization. We also investigate the performance of a set of shrinkage based covariance estimators within ISEM. The proposed method is successfully applied on several nonlinear parameter estimation problems for subsurface flow models. The efficiency of the proposed algorithm is demonstrated by the small size of utilized ensembles and in terms of error convergence rates. © 2013 Elsevier Inc.
Directory of Open Access Journals (Sweden)
Mihaela Simionescu
2014-12-01
Full Text Available There are many types of econometric models used in predicting the inflation rate, but in this study we used a Bayesian shrinkage combination approach. This methodology is used in order to improve the predictions accuracy by including information that is not captured by the econometric models. Therefore, experts’ forecasts are utilized as prior information, for Romania these predictions being provided by Institute for Economic Forecasting (Dobrescu macromodel, National Commission for Prognosis and European Commission. The empirical results for Romanian inflation show the superiority of a fixed effects model compared to other types of econometric models like VAR, Bayesian VAR, simultaneous equations model, dynamic model, log-linear model. The Bayesian combinations that used experts’ predictions as priors, when the shrinkage parameter tends to infinite, improved the accuracy of all forecasts based on individual models, outperforming also zero and equal weights predictions and naïve forecasts.
Energy Technology Data Exchange (ETDEWEB)
Deque, M.; Somot, S. [Meteo-France, Centre National de Recherches Meteorologiques, CNRS/GAME, Toulouse Cedex 01 (France); Sanchez-Gomez, E. [Cerfacs/CNRS, SUC URA1875, Toulouse Cedex 01 (France); Goodess, C.M. [University of East Anglia, Climatic Research Unit, Norwich (United Kingdom); Jacob, D. [Max Planck Institute for Meteorology, Hamburg (Germany); Lenderink, G. [KNMI, Postbus 201, De Bilt (Netherlands); Christensen, O.B. [Danish Meteorological Institute, Copenhagen Oe (Denmark)
2012-03-15
Various combinations of thirteen regional climate models (RCM) and six general circulation models (GCM) were used in FP6-ENSEMBLES. The response to the SRES-A1B greenhouse gas concentration scenario over Europe, calculated as the difference between the 2021-2050 and the 1961-1990 means can be viewed as an expected value about which various uncertainties exist. Uncertainties are measured here by variance explained for temperature and precipitation changes over eight European sub-areas. Three sources of uncertainty can be evaluated from the ENSEMBLES database. Sampling uncertainty is due to the fact that the model climate is estimated as an average over a finite number of years (30) despite a non-negligible interannual variability. Regional model uncertainty is due to the fact that the RCMs use different techniques to discretize the equations and to represent sub-grid effects. Global model uncertainty is due to the fact that the RCMs have been driven by different GCMs. Two methods are presented to fill the many empty cells of the ENSEMBLES RCM x GCM matrix. The first one is based on the same approach as in FP5-PRUDENCE. The second one uses the concept of weather regimes to attempt to separate the contribution of the GCM and the RCM. The variance of the climate response is analyzed with respect to the contribution of the GCM and the RCM. The two filling methods agree that the main contributor to the spread is the choice of the GCM, except for summer precipitation where the choice of the RCM dominates the uncertainty. Of course the implication of the GCM to the spread varies with the region, being maximum in the South-western part of Europe, whereas the continental parts are more sensitive to the choice of the RCM. The third cause of spread is systematically the interannual variability. The total uncertainty about temperature is not large enough to mask the 2021-2050 response which shows a similar pattern to the one obtained for 2071-2100 in PRUDENCE. The uncertainty
Bayesian Modeling of ChIP-chip Data Through a High-Order Ising Model
Mo, Qianxing
2010-01-29
ChIP-chip experiments are procedures that combine chromatin immunoprecipitation (ChIP) and DNA microarray (chip) technology to study a variety of biological problems, including protein-DNA interaction, histone modification, and DNA methylation. The most important feature of ChIP-chip data is that the intensity measurements of probes are spatially correlated because the DNA fragments are hybridized to neighboring probes in the experiments. We propose a simple, but powerful Bayesian hierarchical approach to ChIP-chip data through an Ising model with high-order interactions. The proposed method naturally takes into account the intrinsic spatial structure of the data and can be used to analyze data from multiple platforms with different genomic resolutions. The model parameters are estimated using the Gibbs sampler. The proposed method is illustrated using two publicly available data sets from Affymetrix and Agilent platforms, and compared with three alternative Bayesian methods, namely, Bayesian hierarchical model, hierarchical gamma mixture model, and Tilemap hidden Markov model. The numerical results indicate that the proposed method performs as well as the other three methods for the data from Affymetrix tiling arrays, but significantly outperforms the other three methods for the data from Agilent promoter arrays. In addition, we find that the proposed method has better operating characteristics in terms of sensitivities and false discovery rates under various scenarios. © 2010, The International Biometric Society.
Bayesian spatial modeling of HIV mortality via zero-inflated Poisson models.
Musal, Muzaffer; Aktekin, Tevfik
2013-01-30
In this paper, we investigate the effects of poverty and inequality on the number of HIV-related deaths in 62 New York counties via Bayesian zero-inflated Poisson models that exhibit spatial dependence. We quantify inequality via the Theil index and poverty via the ratios of two Census 2000 variables, the number of people under the poverty line and the number of people for whom poverty status is determined, in each Zip Code Tabulation Area. The purpose of this study was to investigate the effects of inequality and poverty in addition to spatial dependence between neighboring regions on HIV mortality rate, which can lead to improved health resource allocation decisions. In modeling county-specific HIV counts, we propose Bayesian zero-inflated Poisson models whose rates are functions of both covariate and spatial/random effects. To show how the proposed models work, we used three different publicly available data sets: TIGER Shapefiles, Census 2000, and mortality index files. In addition, we introduce parameter estimation issues of Bayesian zero-inflated Poisson models and discuss MCMC method implications. Copyright © 2012 John Wiley & Sons, Ltd.
Use of SAMC for Bayesian analysis of statistical models with intractable normalizing constants
Jin, Ick Hoon
2014-03-01
Statistical inference for the models with intractable normalizing constants has attracted much attention. During the past two decades, various approximation- or simulation-based methods have been proposed for the problem, such as the Monte Carlo maximum likelihood method and the auxiliary variable Markov chain Monte Carlo methods. The Bayesian stochastic approximation Monte Carlo algorithm specifically addresses this problem: It works by sampling from a sequence of approximate distributions with their average converging to the target posterior distribution, where the approximate distributions can be achieved using the stochastic approximation Monte Carlo algorithm. A strong law of large numbers is established for the Bayesian stochastic approximation Monte Carlo estimator under mild conditions. Compared to the Monte Carlo maximum likelihood method, the Bayesian stochastic approximation Monte Carlo algorithm is more robust to the initial guess of model parameters. Compared to the auxiliary variable MCMC methods, the Bayesian stochastic approximation Monte Carlo algorithm avoids the requirement for perfect samples, and thus can be applied to many models for which perfect sampling is not available or very expensive. The Bayesian stochastic approximation Monte Carlo algorithm also provides a general framework for approximate Bayesian analysis. © 2012 Elsevier B.V. All rights reserved.
Directory of Open Access Journals (Sweden)
Yunus Subagyo Swarinoto
2014-08-01
Full Text Available Manajemen air menjadi sangat penting khususnya di wilayah yang rentan terhadap ketersediaan air. Mengingat hujan di atas normal dapat mengakibatkan banjir, sedangkan hujan di bawah normal mengakibatkan kekeringan. Untuk itu prediksi unsur iklim hujan ini menjadi penting. Model sistem prediksi ensemble berbasis model sistem prediksi tunggal ANFIS, Wavelet-ANFIS, Wavelet ARIMA, dan ARIMA total hujan bulanan telah disimulasikan di wilayah Kabupaten Indramayu. Model sistem prediksi ensemble total hujan bulanan ini dibentuk dengan teknik pembobotan. Nilai pembobot didasarkan pada nilai koefisien korelasi Pearson (r yang diperoleh selama masa pelatihan dengan series data 1991-2000. Hasil pengolahan data 2001-2009 menunjukkan kisaran nilai r didapat 0,45-0,83 untuk ANFIS; 0,20-0,53 untuk Wavelet-ANFIS; 0,50-0,95 untuk Wavelet-ARIMA; 0,14-0,66 untuk ARIMA; dan 0,58-0,94 untuk Ensemble. Secara spasial, luaran model sistem prediksi ensemble total hujan bulanan di wilayah Kabupaten Indramayu menunjukkan hasil yang konsisten lebih baik daripada luaran model sistem prediksi tunggal pembentuknya. Water management is very important especially for region which is vulnarable to the water availability. Above normal rainfal condition causes flood, meanwhile below normal one triggers to the drought occurences. Coping with this situation, the rainfall prediction output is needed. The ensemble prediction system model (EPSM based on several single prediction system models (SPSMs such as ANFIS, Wavelet-ANFIS, Wavelet ARIMA, and ARIMA on monthly rainfall total, has been simulated within Indramayu district. The EPSM was developed and based on the weighting technique. This weighting is computed based on the value of Pearson correlation coefficient (r which has been gained during the training period of 1991-2000. Results of 2001-2009 model running show the value of r are 0,45-0,83 for ANFIS; 0,20-0,53 for Wavelet- ANFIS; 0,50-0,95 for Wavelet-ARIMA; 0,14-0,66 for
Directory of Open Access Journals (Sweden)
Muammar Sadrawi
2018-03-01
Full Text Available Equally partitioned data are essential for prediction. However, in some important cases, the data distribution is severely unbalanced. In this study, several algorithms are utilized to maximize the learning accuracy when dealing with a highly unbalanced dataset. A linguistic algorithm is applied to evaluate the input and output relationship, namely Fuzzy c-Means (FCM, which is applied as a clustering algorithm for the majority class to balance the minority class data from about 3 million cases. Each cluster is used to train several artificial neural network (ANN models. Different techniques are applied to generate an ensemble genetic fuzzy neuro model (EGFNM in order to select the models. The first ensemble technique, the intra-cluster EGFNM, works by evaluating the best combination from all the models generated by each cluster. Another ensemble technique is the inter-cluster model EGFNM, which is based on selecting the best model from each cluster. The accuracy of these techniques is evaluated using the receiver operating characteristic (ROC via its area under the curve (AUC. Results show that the AUC of the unbalanced data is 0.67974. The random cluster and best ANN single model have AUCs of 0.7177 and 0.72806, respectively. For the ensemble evaluations, the intra-cluster and the inter-cluster EGFNMs produce 0.7293 and 0.73038, respectively. In conclusion, this study achieved improved results by performing the EGFNM method compared with the unbalanced training. This study concludes that selecting several best models will produce a better result compared with all models combined.
Steinberg, P. D.; Brener, G.; Duffy, D.; Nearing, G. S.; Pelissier, C.
2017-12-01
Hyperparameterization, of statistical models, i.e. automated model scoring and selection, such as evolutionary algorithms, grid searches, and randomized searches, can improve forecast model skill by reducing errors associated with model parameterization, model structure, and statistical properties of training data. Ensemble Learning Models (Elm), and the related Earthio package, provide a flexible interface for automating the selection of parameters and model structure for machine learning models common in climate science and land cover classification, offering convenient tools for loading NetCDF, HDF, Grib, or GeoTiff files, decomposition methods like PCA and manifold learning, and parallel training and prediction with unsupervised and supervised classification, clustering, and regression estimators. Continuum Analytics is using Elm to experiment with statistical soil moisture forecasting based on meteorological forcing data from NASA's North American Land Data Assimilation System (NLDAS). There Elm is using the NSGA-2 multiobjective optimization algorithm for optimizing statistical preprocessing of forcing data to improve goodness-of-fit for statistical models (i.e. feature engineering). This presentation will discuss Elm and its components, including dask (distributed task scheduling), xarray (data structures for n-dimensional arrays), and scikit-learn (statistical preprocessing, clustering, classification, regression), and it will show how NSGA-2 is being used for automate selection of soil moisture forecast statistical models for North America.
Lesaffre, Emmanuel
2012-01-01
The growth of biostatistics has been phenomenal in recent years and has been marked by considerable technical innovation in both methodology and computational practicality. One area that has experienced significant growth is Bayesian methods. The growing use of Bayesian methodology has taken place partly due to an increasing number of practitioners valuing the Bayesian paradigm as matching that of scientific discovery. In addition, computational advances have allowed for more complex models to be fitted routinely to realistic data sets. Through examples, exercises and a combination of introd
Modeling of steam generator in nuclear power plant using neural network ensemble
International Nuclear Information System (INIS)
Lee, S. K.; Lee, E. C.; Jang, J. W.
2003-01-01
Neural network is now being used in modeling the steam generator is known to be difficult due to the reverse dynamics. However, Neural network is prone to the problem of overfitting. This paper investigates the use of neural network combining methods to model steam generator water level and compares with single neural network. The results show that neural network ensemble is effective tool which can offer improved generalization, lower dependence of the training set and reduced training time
Acharya Nachiketa Multi-model ensemble schemes for predicting ...
Indian Academy of Sciences (India)
during pre-monsoon season: A case study based on sat- ellite data and regional climate model. 269. Anand R ... Development of regional wheat VI-LAI models using. Resourcesat-1 .... Impact of additional surface observation network on short range .... 795. Sahu P. Threat of land subsidence in and around Kolkata City.
Hawkins, L. R.; Rupp, D. E.; Li, S.; Sarah, S.; McNeall, D. J.; Mote, P.; Betts, R. A.; Wallom, D.
2017-12-01
Changing regional patterns of surface temperature, precipitation, and humidity may cause ecosystem-scale changes in vegetation, altering the distribution of trees, shrubs, and grasses. A changing vegetation distribution, in turn, alters the albedo, latent heat flux, and carbon exchanged with the atmosphere with resulting feedbacks onto the regional climate. However, a wide range of earth-system processes that affect the carbon, energy, and hydrologic cycles occur at sub grid scales in climate models and must be parameterized. The appropriate parameter values in such parameterizations are often poorly constrained, leading to uncertainty in predictions of how the ecosystem will respond to changes in forcing. To better understand the sensitivity of regional climate to parameter selection and to improve regional climate and vegetation simulations, we used a large perturbed physics ensemble and a suite of statistical emulators. We dynamically downscaled a super-ensemble (multiple parameter sets and multiple initial conditions) of global climate simulations using a 25-km resolution regional climate model HadRM3p with the land-surface scheme MOSES2 and dynamic vegetation module TRIFFID. We simultaneously perturbed land surface parameters relating to the exchange of carbon, water, and energy between the land surface and atmosphere in a large super-ensemble of regional climate simulations over the western US. Statistical emulation was used as a computationally cost-effective tool to explore uncertainties in interactions. Regions of parameter space that did not satisfy observational constraints were eliminated and an ensemble of parameter sets that reduce regional biases and span a range of plausible interactions among earth system processes were selected. This study demonstrated that by combining super-ensemble simulations with statistical emulation, simulations of regional climate could be improved while simultaneously accounting for a range of plausible land
Evaluation of an ensemble of Arctic regional climate models
DEFF Research Database (Denmark)
Rinke, A.; Dethloff, K.; Cassano, J. J.
2006-01-01
Simulations of eight different regional climate models (RCMs) have been performed for the period September 1997-September 1998, which coincides with the Surface Heat Budget of the Arctic Ocean (SHEBA) project period. Each of the models employed approximately the same domain covering the western......, temperature, cloud cover, and long-/shortwave downward radiation between the individual model simulations are investigated. With this work, we quantify the scatter among the models and therefore the magnitude of disagreement and unreliability of current Arctic RCM simulations. Even with the relatively...... constrained experimental design we notice a considerable scatter among the different RCMs. We found the largest across-model scatter in the 2 m temperature over land, in the surface radiation fluxes, and in the cloud cover which implies a reduced confidence level for these variables....
Bassen, David M; Vilkhovoy, Michael; Minot, Mason; Butcher, Jonathan T; Varner, Jeffrey D
2017-01-25
Ensemble modeling is a promising approach for obtaining robust predictions and coarse grained population behavior in deterministic mathematical models. Ensemble approaches address model uncertainty by using parameter or model families instead of single best-fit parameters or fixed model structures. Parameter ensembles can be selected based upon simulation error, along with other criteria such as diversity or steady-state performance. Simulations using parameter ensembles can estimate confidence intervals on model variables, and robustly constrain model predictions, despite having many poorly constrained parameters. In this software note, we present a multiobjective based technique to estimate parameter or models ensembles, the Pareto Optimal Ensemble Technique in the Julia programming language (JuPOETs). JuPOETs integrates simulated annealing with Pareto optimality to estimate ensembles on or near the optimal tradeoff surface between competing training objectives. We demonstrate JuPOETs on a suite of multiobjective problems, including test functions with parameter bounds and system constraints as well as for the identification of a proof-of-concept biochemical model with four conflicting training objectives. JuPOETs identified optimal or near optimal solutions approximately six-fold faster than a corresponding implementation in Octave for the suite of test functions. For the proof-of-concept biochemical model, JuPOETs produced an ensemble of parameters that gave both the mean of the training data for conflicting data sets, while simultaneously estimating parameter sets that performed well on each of the individual objective functions. JuPOETs is a promising approach for the estimation of parameter and model ensembles using multiobjective optimization. JuPOETs can be adapted to solve many problem types, including mixed binary and continuous variable types, bilevel optimization problems and constrained problems without altering the base algorithm. JuPOETs is open
Parameterizing Bayesian network Representations of Social-Behavioral Models by Expert Elicitation
Energy Technology Data Exchange (ETDEWEB)
Walsh, Stephen J.; Dalton, Angela C.; Whitney, Paul D.; White, Amanda M.
2010-05-23
Bayesian networks provide a general framework with which to model many natural phenomena. The mathematical nature of Bayesian networks enables a plethora of model validation and calibration techniques: e.g parameter estimation, goodness of fit tests, and diagnostic checking of the model assumptions. However, they are not free of shortcomings. Parameter estimation from relevant extant data is a common approach to calibrating the model parameters. In practice it is not uncommon to find oneself lacking adequate data to reliably estimate all model parameters. In this paper we present the early development of a novel application of conjoint analysis as a method for eliciting and modeling expert opinions and using the results in a methodology for calibrating the parameters of a Bayesian network.
Fast and accurate Bayesian model criticism and conflict diagnostics using R-INLA
Ferkingstad, Egil
2017-10-16
Bayesian hierarchical models are increasingly popular for realistic modelling and analysis of complex data. This trend is accompanied by the need for flexible, general and computationally efficient methods for model criticism and conflict detection. Usually, a Bayesian hierarchical model incorporates a grouping of the individual data points, as, for example, with individuals in repeated measurement data. In such cases, the following question arises: Are any of the groups “outliers,” or in conflict with the remaining groups? Existing general approaches aiming to answer such questions tend to be extremely computationally demanding when model fitting is based on Markov chain Monte Carlo. We show how group-level model criticism and conflict detection can be carried out quickly and accurately through integrated nested Laplace approximations (INLA). The new method is implemented as a part of the open-source R-INLA package for Bayesian computing (http://r-inla.org).
Estimating Convection Parameters in the GFDL CM2.1 Model Using Ensemble Data Assimilation
Li, Shan; Zhang, Shaoqing; Liu, Zhengyu; Lu, Lv; Zhu, Jiang; Zhang, Xuefeng; Wu, Xinrong; Zhao, Ming; Vecchi, Gabriel A.; Zhang, Rong-Hua; Lin, Xiaopei
2018-04-01
Parametric uncertainty in convection parameterization is one major source of model errors that cause model climate drift. Convection parameter tuning has been widely studied in atmospheric models to help mitigate the problem. However, in a fully coupled general circulation model (CGCM), convection parameters which impact the ocean as well as the climate simulation may have different optimal values. This study explores the possibility of estimating convection parameters with an ensemble coupled data assimilation method in a CGCM. Impacts of the convection parameter estimation on climate analysis and forecast are analyzed. In a twin experiment framework, five convection parameters in the GFDL coupled model CM2.1 are estimated individually and simultaneously under both perfect and imperfect model regimes. Results show that the ensemble data assimilation method can help reduce the bias in convection parameters. With estimated convection parameters, the analyses and forecasts for both the atmosphere and the ocean are generally improved. It is also found that information in low latitudes is relatively more important for estimating convection parameters. This study further suggests that when important parameters in appropriate physical parameterizations are identified, incorporating their estimation into traditional ensemble data assimilation procedure could improve the final analysis and climate prediction.
An Ensemble Model for Co-Seismic Landslide Susceptibility Using GIS and Random Forest Method
Directory of Open Access Journals (Sweden)
Suchita Shrestha
2017-11-01
Full Text Available The Mw 7.8 Gorkha earthquake of 25 April 2015 triggered thousands of landslides in the central part of the Nepal Himalayas. The main goal of this study was to generate an ensemble-based map of co-seismic landslide susceptibility in Sindhupalchowk District using model comparison and combination strands. A total of 2194 co-seismic landslides were identified and were randomly split into 1536 (~70%, to train data for establishing the model, and the remaining 658 (~30% for the validation of the model. Frequency ratio, evidential belief function, and weight of evidence methods were applied and compared using 11 different causative factors (peak ground acceleration, epicenter proximity, fault proximity, geology, elevation, slope, plan curvature, internal relief, drainage proximity, stream power index, and topographic wetness index to prepare the landslide susceptibility map. An ensemble of random forest was then used to overcome the various prediction limitations of the individual models. The success rates and prediction capabilities were critically compared using the area under the curve (AUC of the receiver operating characteristic curve (ROC. By synthesizing the results of the various models into a single score, the ensemble model improved accuracy and provided considerably more realistic prediction capacities (91% than the frequency ratio (81.2%, evidential belief function (83.5% methods, and weight of evidence (80.1%.
Modeling GIA at the Gulf of Mexico and environs: a Bayesian approach
Caron, L.; Ivins, E. R.; Larour, E. Y.; Adhikari, S.
2017-12-01
The massive amount of new data that constrain global mass changes that are derived from space missions, such as JASON, ENVISat, ICEsat, GRACE time series coupled to GNSS determined vertical land motion (VLM), have revolutionized our understanding of near real-time changes in water storage, sea-level rise (SLR) and ice mass balance on decadal time scales. In order to better interpret these data sets, however, background secular signals need to be removed if a mass conserving reconstruction of ongoing changes in surface mass can be accurately determined with appropriate error statistics. Among the major contaminants of measurements is the signal due to the growth and collapse of the great ice sheets during the last glacial cycle, a phenomenon known as Glacial Isostatic Adjustment (GIA). Linear trends in VLM, gravity and tide-gauge measurements of local sea-level may be removed by using GIA models. The major caveat for GIA models is that no reliable error statistic comes with the correction. Consequently, the community struggles to establish a consensus about GIA model predictions. A formal calculation of the uncertainty in the prediction is logically an absolute corner stone for quantifying the degree of knowledge we have about this phenomenon. GIA uncertainty should be incorporated and propagated into the uncertainty estimates for any scientific results that employ geodetic measurements that also contain the GIA signature. We propose a new method based on model ensembles and Bayesian framework to provide statistical characterization of the present-day GIA signal. Through more than 30,000 forward models, our approach explores the range of possible solutions by varying jointly the Earth properties, such as the mantle rheology and structure, and the ice loading history. Our inversion is constrained by 459 GNSS stations (with trends accurate to less than 0.5 mm/yr) that cover non-tectonic North America, Europe and Antarctica, as well as 11451 paleo sea level records
Janardhanan, S.; Datta, B.
2011-12-01
Surrogate models are widely used to develop computationally efficient simulation-optimization models to solve complex groundwater management problems. Artificial intelligence based models are most often used for this purpose where they are trained using predictor-predictand data obtained from a numerical simulation model. Most often this is implemented with the assumption that the parameters and boundary conditions used in the numerical simulation model are perfectly known. However, in most practical situations these values are uncertain. Under these circumstances the application of such approximation surrogates becomes limited. In our study we develop a surrogate model based coupled simulation optimization methodology for determining optimal pumping strategies for coastal aquifers considering parameter uncertainty. An ensemble surrogate modeling approach is used along with multiple realization optimization. The methodology is used to solve a multi-objective coastal aquifer management problem considering two conflicting objectives. Hydraulic conductivity and the aquifer recharge are considered as uncertain values. Three dimensional coupled flow and transport simulation model FEMWATER is used to simulate the aquifer responses for a number of scenarios corresponding to Latin hypercube samples of pumping and uncertain parameters to generate input-output patterns for training the surrogate models. Non-parametric bootstrap sampling of this original data set is used to generate multiple data sets which belong to different regions in the multi-dimensional decision and parameter space. These data sets are used to train and test multiple surrogate models based on genetic programming. The ensemble of surrogate models is then linked to a multi-objective genetic algorithm to solve the pumping optimization problem. Two conflicting objectives, viz, maximizing total pumping from beneficial wells and minimizing the total pumping from barrier wells for hydraulic control of
Boos, Moritz; Seer, Caroline; Lange, Florian; Kopp, Bruno
2016-01-01
Cognitive determinants of probabilistic inference were examined using hierarchical Bayesian modeling techniques. A classic urn-ball paradigm served as experimental strategy, involving a factorial two (prior probabilities) by two (likelihoods) design. Five computational models of cognitive processes were compared with the observed behavior. Parameter-free Bayesian posterior probabilities and parameter-free base rate neglect provided inadequate models of probabilistic inference. The introduction of distorted subjective probabilities yielded more robust and generalizable results. A general class of (inverted) S-shaped probability weighting functions had been proposed; however, the possibility of large differences in probability distortions not only across experimental conditions, but also across individuals, seems critical for the model's success. It also seems advantageous to consider individual differences in parameters of probability weighting as being sampled from weakly informative prior distributions of individual parameter values. Thus, the results from hierarchical Bayesian modeling converge with previous results in revealing that probability weighting parameters show considerable task dependency and individual differences. Methodologically, this work exemplifies the usefulness of hierarchical Bayesian modeling techniques for cognitive psychology. Theoretically, human probabilistic inference might be best described as the application of individualized strategic policies for Bayesian belief revision.
A Tutorial Introduction to Bayesian Models of Cognitive Development
2011-01-01
typewriter with an infinite amount of paper. There is a space of documents that it is capable of producing, which includes things like The Tempest and does...not include, say, a Vermeer painting or a poem written in Russian. This typewriter represents a means of generating the hypothesis space for a Bayesian...learner: each possible document that can be typed on it is a hypothesis, the infinite set of documents producible by the typewriter is the latent
Ensemble modeling of E. coli in the Charles River, Boston, Massachusetts, USA.
Hellweger, F L
2007-01-01
A case study of ensemble modeling of Escherichia coli (E. coli) densities in surface waters in the context of public health risk prediction is presented. The output of two different models, mechanistic and empirical, are combined and compared to data. The mechanistic model is a high-resolution, time-variable, three-dimensional coupled hydrodynamic and water quality model. It generally reproduces the mechanisms of E. coli fate and transport in the river, including the presence and absence of a plume in the study area under similar input, but different hydrodynamic conditions caused by the operation of a downstream dam and wind. At the time series station, the model has a root mean square error (RMSE) of 370 CFU/100mL, a total error rate (with respect to the EPA-recommended single sample criteria value of 235 CFU/100mL) (TER) of 15% and negative error rate (NER) of 30%. The empirical model is based on multiple linear regression using the forcing functions of the mechanistic model as independent variables. It has better overall performance (at the time series station), due to a strong correlation of E. coli density with upstream inflow for this time period (RMSE =200 CFU/100mL, TER =13%, NER =1.6%). However, the model is mechanistically incorrect in that it predicts decreasing densities with increasing Combined Sewer Overflow (CSO) input. The two models are fundamentally different and their errors are uncorrelated (R(2) =0.02), which motivates their combination in an ensemble. Two combination approaches, a geometric mean ensemble (GME) and an "either exceeds" ensemble (EEE), are explored. The GME model outperforms the mechanistic and empirical models in terms of RMSE (190 CFU/100mL) and TER (11%), but has a higher NER (23%). The EEE has relatively high TER (16%), but low NER (0.8%) and may be the best method for a conservative prediction. The study demonstrates the potential utility of ensemble modeling for pathogen indicators, but significant further research is
Hosseiny, S. M. H.; Zarzar, C.; Gomez, M.; Siddique, R.; Smith, V.; Mejia, A.; Demir, I.
2016-12-01
The National Water Model (NWM) provides a platform for operationalize nationwide flood inundation forecasting and mapping. The ability to model flood inundation on a national scale will provide invaluable information to decision makers and local emergency officials. Often, forecast products use deterministic model output to provide a visual representation of a single inundation scenario, which is subject to uncertainty from various sources. While this provides a straightforward representation of the potential inundation, the inherent uncertainty associated with the model output should be considered to optimize this tool for decision making support. The goal of this study is to produce ensembles of future flood inundation conditions (i.e. extent, depth, and velocity) to spatially quantify and visually assess uncertainties associated with the predicted flood inundation maps. The setting for this study is located in a highly urbanized watershed along the Darby Creek in Pennsylvania. A forecasting framework coupling the NWM with multiple hydraulic models was developed to produce a suite ensembles of future flood inundation predictions. Time lagged ensembles from the NWM short range forecasts were used to account for uncertainty associated with the hydrologic forecasts. The forecasts from the NWM were input to iRIC and HEC-RAS two-dimensional software packages, from which water extent, depth, and flow velocity were output. Quantifying the agreement between output ensembles for each forecast grid provided the uncertainty metrics for predicted flood water inundation extent, depth, and flow velocity. For visualization, a series of flood maps that display flood extent, water depth, and flow velocity along with the underlying uncertainty associated with each of the forecasted variables were produced. The results from this study demonstrate the potential to incorporate and visualize model uncertainties in flood inundation maps in order to identify the high flood risk zones.
Wheeler, David C.; Hickson, DeMarc A.; Waller, Lance A.
2010-01-01
Many diagnostic tools and goodness-of-fit measures, such as the Akaike information criterion (AIC) and the Bayesian deviance information criterion (DIC), are available to evaluate the overall adequacy of linear regression models. In addition, visually assessing adequacy in models has become an essential part of any regression analysis. In this paper, we focus on a spatial consideration of the local DIC measure for model selection and goodness-of-fit evaluation. We use a partitioning of the DIC into the local DIC, leverage, and deviance residuals to assess local model fit and influence for both individual observations and groups of observations in a Bayesian framework. We use visualization of the local DIC and differences in local DIC between models to assist in model selection and to visualize the global and local impacts of adding covariates or model parameters. We demonstrate the utility of the local DIC in assessing model adequacy using HIV prevalence data from pregnant women in the Butare province of Rwanda during 1989-1993 using a range of linear model specifications, from global effects only to spatially varying coefficient models, and a set of covariates related to sexual behavior. Results of applying the diagnostic visualization approach include more refined model selection and greater understanding of the models as applied to the data. PMID:21243121
Kaewprag, Pacharmon; Newton, Cheryl; Vermillion, Brenda; Hyun, Sookyung; Huang, Kun; Machiraju, Raghu
2017-07-05
We develop predictive models enabling clinicians to better understand and explore patient clinical data along with risk factors for pressure ulcers in intensive care unit patients from electronic health record data. Identifying accurate risk factors of pressure ulcers is essential to determining appropriate prevention strategies; in this work we examine medication, diagnosis, and traditional Braden pressure ulcer assessment scale measurements as patient features. In order to predict pressure ulcer incidence and better understand the structure of related risk factors, we construct Bayesian networks from patient features. Bayesian network nodes (features) and edges (conditional dependencies) are simplified with statistical network techniques. Upon reviewing a network visualization of our model, our clinician collaborators were able to identify strong relationships between risk factors widely recognized as associated with pressure ulcers. We present a three-stage framework for predictive analysis of patient clinical data: 1) Developing electronic health record feature extraction functions with assistance of clinicians, 2) simplifying features, and 3) building Bayesian network predictive models. We evaluate all combinations of Bayesian network models from different search algorithms, scoring functions, prior structure initializations, and sets of features. From the EHRs of 7,717 ICU patients, we construct Bayesian network predictive models from 86 medication, diagnosis, and Braden scale features. Our model not only identifies known and suspected high PU risk factors, but also substantially increases sensitivity of the prediction - nearly three times higher comparing to logistical regression models - without sacrificing the overall accuracy. We visualize a representative model with which our clinician collaborators identify strong relationships between risk factors widely recognized as associated with pressure ulcers. Given the strong adverse effect of pressure ulcers
Testing students' e-learning via Facebook through Bayesian structural equation modeling.
Salarzadeh Jenatabadi, Hashem; Moghavvemi, Sedigheh; Wan Mohamed Radzi, Che Wan Jasimah Bt; Babashamsi, Parastoo; Arashi, Mohammad
2017-01-01
Learning is an intentional activity, with several factors affecting students' intention to use new learning technology. Researchers have investigated technology acceptance in different contexts by developing various theories/models and testing them by a number of means. Although most theories/models developed have been examined through regression or structural equation modeling, Bayesian analysis offers more accurate data analysis results. To address this gap, the unified theory of acceptance and technology use in the context of e-learning via Facebook are re-examined in this study using Bayesian analysis. The data (S1 Data) were collected from 170 students enrolled in a business statistics course at University of Malaya, Malaysia, and tested with the maximum likelihood and Bayesian approaches. The difference between the two methods' results indicates that performance expectancy and hedonic motivation are the strongest factors influencing the intention to use e-learning via Facebook. The Bayesian estimation model exhibited better data fit than the maximum likelihood estimator model. The results of the Bayesian and maximum likelihood estimator approaches are compared and the reasons for the result discrepancy are deliberated.
Testing students' e-learning via Facebook through Bayesian structural equation modeling.
Directory of Open Access Journals (Sweden)
Hashem Salarzadeh Jenatabadi
Full Text Available Learning is an intentional activity, with several factors affecting students' intention to use new learning technology. Researchers have investigated technology acceptance in different contexts by developing various theories/models and testing them by a number of means. Although most theories/models developed have been examined through regression or structural equation modeling, Bayesian analysis offers more accurate data analysis results. To address this gap, the unified theory of acceptance and technology use in the context of e-learning via Facebook are re-examined in this study using Bayesian analysis. The data (S1 Data were collected from 170 students enrolled in a business statistics course at University of Malaya, Malaysia, and tested with the maximum likelihood and Bayesian approaches. The difference between the two methods' results indicates that performance expectancy and hedonic motivation are the strongest factors influencing the intention to use e-learning via Facebook. The Bayesian estimation model exhibited better data fit than the maximum likelihood estimator model. The results of the Bayesian and maximum likelihood estimator approaches are compared and the reasons for the result discrepancy are deliberated.
Directory of Open Access Journals (Sweden)
Moritz eBoos
2016-05-01
Full Text Available Cognitive determinants of probabilistic inference were examined using hierarchical Bayesian modelling techniques. A classic urn-ball paradigm served as experimental strategy, involving a factorial two (prior probabilities by two (likelihoods design. Five computational models of cognitive processes were compared with the observed behaviour. Parameter-free Bayesian posterior probabilities and parameter-free base rate neglect provided inadequate models of probabilistic inference. The introduction of distorted subjective probabilities yielded more robust and generalizable results. A general class of (inverted S-shaped probability weighting functions had been proposed; however, the possibility of large differences in probability distortions not only across experimental conditions, but also across individuals, seems critical for the model’s success. It also seems advantageous to consider individual differences in parameters of probability weighting as being sampled from weakly informative prior distributions of individual parameter values. Thus, the results from hierarchical Bayesian modelling converge with previous results in revealing that probability weighting parameters show considerable task dependency and individual differences. Methodologically, this work exemplifies the usefulness of hierarchical Bayesian modelling techniques for cognitive psychology. Theoretically, human probabilistic inference might be best described as the application of individualized strategic policies for Bayesian belief revision.
Directory of Open Access Journals (Sweden)
H. Wan
2014-09-01
Full Text Available This paper explores the feasibility of an experimentation strategy for investigating sensitivities in fast components of atmospheric general circulation models. The basic idea is to replace the traditional serial-in-time long-term climate integrations by representative ensembles of shorter simulations. The key advantage of the proposed method lies in its efficiency: since fewer days of simulation are needed, the computational cost is less, and because individual realizations are independent and can be integrated simultaneously, the new dimension of parallelism can dramatically reduce the turnaround time in benchmark tests, sensitivities studies, and model tuning exercises. The strategy is not appropriate for exploring sensitivity of all model features, but it is very effective in many situations. Two examples are presented using the Community Atmosphere Model, version 5. In the first example, the method is used to characterize sensitivities of the simulated clouds to time-step length. Results show that 3-day ensembles of 20 to 50 members are sufficient to reproduce the main signals revealed by traditional 5-year simulations. A nudging technique is applied to an additional set of simulations to help understand the contribution of physics–dynamics interaction to the detected time-step sensitivity. In the second example, multiple empirical parameters related to cloud microphysics and aerosol life cycle are perturbed simultaneously in order to find out which parameters have the largest impact on the simulated global mean top-of-atmosphere radiation balance. It turns out that 12-member ensembles of 10-day simulations are able to reveal the same sensitivities as seen in 4-year simulations performed in a previous study. In both cases, the ensemble method reduces the total computational time by a factor of about 15, and the turnaround time by a factor of several hundred. The efficiency of the method makes it particularly useful for the development of
Lew, E. J.; Butenhoff, C. L.; Karmakar, S.; Rice, A. L.; Khalil, A. K.
2017-12-01
Methane is the second most important greenhouse gas after carbon dioxide. In efforts to control emissions, a careful examination of the methane budget and source strengths is required. To determine methane surface fluxes, Bayesian methods are often used to provide top-down constraints. Inverse modeling derives unknown fluxes using observed methane concentrations, a chemical transport model (CTM) and prior information. The Bayesian inversion reduces prior flux uncertainties by exploiting information content in the data. While the Bayesian formalism produces internal error estimates of source fluxes, systematic or external errors that arise from user choices in the inversion scheme are often much larger. Here we examine model sensitivity and uncertainty of our inversion under different observation data sets and CTM grid resolution. We compare posterior surface fluxes using the data product GLOBALVIEW-CH4 against the event-level molar mixing ratio data available from NOAA. GLOBALVIEW-CH4 is a collection of CH4 concentration estimates from 221 sites, collected by 12 laboratories, that have been interpolated and extracted to provide weekly records from 1984-2008. Differently, the event-level NOAA data records methane mixing ratios field measurements from 102 sites, containing sampling frequency irregularities and gaps in time. Furthermore, the sampling platform types used by the data sets may influence the posterior flux estimates, namely fixed surface, tower, ship and aircraft sites. To explore the sensitivity of the posterior surface fluxes to the observation network geometry, inversions composed of all sites, only aircraft, only ship, only tower and only fixed surface sites, are performed and compared. Also, we investigate the sensitivity of the error reduction associated with the resolution of the GEOS-Chem simulation (4°×5° vs 2°×2.5°) used to calculate the response matrix. Using a higher resolution grid decreased the model-data error at most sites, thereby
Schaller, N.; Sillmann, J.; Anstey, J.; Fischer, E. M.; Grams, C. M.; Russo, S.
2018-05-01
Better preparedness for summer heatwaves could mitigate their adverse effects on society. This can potentially be attained through an increased understanding of the relationship between heatwaves and one of their main dynamical drivers, atmospheric blocking. In the 1979–2015 period, we find that there is a significant correlation between summer heatwave magnitudes and the number of days influenced by atmospheric blocking in Northern Europe and Western Russia. Using three large global climate model ensembles, we find similar correlations, indicating that these three models are able to represent the relationship between extreme temperature and atmospheric blocking, despite having biases in their simulation of individual climate variables such as temperature or geopotential height. Our results emphasize the need to use large ensembles of different global climate models as single realizations do not always capture this relationship. The three large ensembles further suggest that the relationship between summer heatwaves and atmospheric blocking will not change in the future. This could be used to statistically model heatwaves with atmospheric blocking as a covariate and aid decision-makers in planning disaster risk reduction and adaptation to climate change.
From deep TLS validation to ensembles of atomic models built from elemental motions
International Nuclear Information System (INIS)
Urzhumtsev, Alexandre; Afonine, Pavel V.; Van Benschoten, Andrew H.; Fraser, James S.; Adams, Paul D.
2015-01-01
Procedures are described for extracting the vibration and libration parameters corresponding to a given set of TLS matrices and their simultaneous validation. Knowledge of these parameters allows the generation of structural ensembles corresponding to these matrices. The translation–libration–screw model first introduced by Cruickshank, Schomaker and Trueblood describes the concerted motions of atomic groups. Using TLS models can improve the agreement between calculated and experimental diffraction data. Because the T, L and S matrices describe a combination of atomic vibrations and librations, TLS models can also potentially shed light on molecular mechanisms involving correlated motions. However, this use of TLS models in mechanistic studies is hampered by the difficulties in translating the results of refinement into molecular movement or a structural ensemble. To convert the matrices into a constituent molecular movement, the matrix elements must satisfy several conditions. Refining the T, L and S matrix elements as independent parameters without taking these conditions into account may result in matrices that do not represent concerted molecular movements. Here, a mathematical framework and the computational tools to analyze TLS matrices, resulting in either explicit decomposition into descriptions of the underlying motions or a report of broken conditions, are described. The description of valid underlying motions can then be output as a structural ensemble. All methods are implemented as part of the PHENIX project
From deep TLS validation to ensembles of atomic models built from elemental motions
Energy Technology Data Exchange (ETDEWEB)
Urzhumtsev, Alexandre, E-mail: sacha@igbmc.fr [Centre for Integrative Biology, Institut de Génétique et de Biologie Moléculaire et Cellulaire, CNRS–INSERM–UdS, 1 Rue Laurent Fries, BP 10142, 67404 Illkirch (France); Université de Lorraine, BP 239, 54506 Vandoeuvre-les-Nancy (France); Afonine, Pavel V. [Lawrence Berkeley National Laboratory, Berkeley, California (United States); Van Benschoten, Andrew H.; Fraser, James S. [University of California, San Francisco, San Francisco, CA 94158 (United States); Adams, Paul D. [Lawrence Berkeley National Laboratory, Berkeley, California (United States); University of California Berkeley, Berkeley, CA 94720 (United States); Centre for Integrative Biology, Institut de Génétique et de Biologie Moléculaire et Cellulaire, CNRS–INSERM–UdS, 1 Rue Laurent Fries, BP 10142, 67404 Illkirch (France)
2015-07-28
Procedures are described for extracting the vibration and libration parameters corresponding to a given set of TLS matrices and their simultaneous validation. Knowledge of these parameters allows the generation of structural ensembles corresponding to these matrices. The translation–libration–screw model first introduced by Cruickshank, Schomaker and Trueblood describes the concerted motions of atomic groups. Using TLS models can improve the agreement between calculated and experimental diffraction data. Because the T, L and S matrices describe a combination of atomic vibrations and librations, TLS models can also potentially shed light on molecular mechanisms involving correlated motions. However, this use of TLS models in mechanistic studies is hampered by the difficulties in translating the results of refinement into molecular movement or a structural ensemble. To convert the matrices into a constituent molecular movement, the matrix elements must satisfy several conditions. Refining the T, L and S matrix elements as independent parameters without taking these conditions into account may result in matrices that do not represent concerted molecular movements. Here, a mathematical framework and the computational tools to analyze TLS matrices, resulting in either explicit decomposition into descriptions of the underlying motions or a report of broken conditions, are described. The description of valid underlying motions can then be output as a structural ensemble. All methods are implemented as part of the PHENIX project.
Assessment of managed aquifer recharge potential using ensembles of local models.
Smith, Anthony J; Pollock, Daniel W
2012-01-01
A simple quantitative approach for assessing the artificial recharge potential of large regions using spatial ensembles of local models is proposed. The method extends existing qualitative approaches and enables rapid assessments within a programmable environment. Spatial discretization of a water resource region into continuous local domains allows simple local models to be applied independently in each domain using lumped parameters. The ensemble results can be analyzed directly or combined with other quantitative and thematic information and visualized as regional suitability maps. A case study considers the hydraulic potential for surface infiltration across a large water resource region using a published analytic model for basin recharge. The model solution was implemented within a geographic information system and evaluated independently in >21,000 local domains using lumped parameters derived from existing regional datasets. Computer execution times to run the whole ensemble and process the results were in the order of a few minutes. Relevant aspects of the case study results and general conclusions concerning the utility and limitations of the method are discussed. © 2011, CSIRO. Ground Water © 2011, National Ground Water Association.
Kobayashi, Kenichiro; Otsuka, Shigenori; Apip; Saito, Kazuo
2016-08-01
This paper presents a study on short-term ensemble flood forecasting specifically for small dam catchments in Japan. Numerical ensemble simulations of rainfall from the Japan Meteorological Agency nonhydrostatic model (JMA-NHM) are used as the input data to a rainfall-runoff model for predicting river discharge into a dam. The ensemble weather simulations use a conventional 10 km and a high-resolution 2 km spatial resolutions. A distributed rainfall-runoff model is constructed for the Kasahori dam catchment (approx. 70 km2) and applied with the ensemble rainfalls. The results show that the hourly maximum and cumulative catchment-average rainfalls of the 2 km resolution JMA-NHM ensemble simulation are more appropriate than the 10 km resolution rainfalls. All the simulated inflows based on the 2 and 10 km rainfalls become larger than the flood discharge of 140 m3 s-1, a threshold value for flood control. The inflows with the 10 km resolution ensemble rainfall are all considerably smaller than the observations, while at least one simulated discharge out of 11 ensemble members with the 2 km resolution rainfalls reproduces the first peak of the inflow at the Kasahori dam with similar amplitude to observations, although there are spatiotemporal lags between simulation and observation. To take positional lags into account of the ensemble discharge simulation, the rainfall distribution in each ensemble member is shifted so that the catchment-averaged cumulative rainfall of the Kasahori dam maximizes. The runoff simulation with the position-shifted rainfalls shows much better results than the original ensemble discharge simulations.
A comprehensive probabilistic analysis model of oil pipelines network based on Bayesian network
Zhang, C.; Qin, T. X.; Jiang, B.; Huang, C.
2018-02-01
Oil pipelines network is one of the most important facilities of energy transportation. But oil pipelines network accident may result in serious disasters. Some analysis models for these accidents have been established mainly based on three methods, including event-tree, accident simulation and Bayesian network. Among these methods, Bayesian network is suitable for probabilistic analysis. But not all the important influencing factors are considered and the deployment rule of the factors has not been established. This paper proposed a probabilistic analysis model of oil pipelines network based on Bayesian network. Most of the important influencing factors, including the key environment condition and emergency response are considered in this model. Moreover, the paper also introduces a deployment rule for these factors. The model can be used in probabilistic analysis and sensitive analysis of oil pipelines network accident.
Bayesian Comparison of Alternative Graded Response Models for Performance Assessment Applications
Zhu, Xiaowen; Stone, Clement A.
2012-01-01
This study examined the relative effectiveness of Bayesian model comparison methods in selecting an appropriate graded response (GR) model for performance assessment applications. Three popular methods were considered: deviance information criterion (DIC), conditional predictive ordinate (CPO), and posterior predictive model checking (PPMC). Using…
Chen, Mingjie; Izady, Azizallah; Abdalla, Osman A.; Amerjeed, Mansoor
2018-02-01
Bayesian inference using Markov Chain Monte Carlo (MCMC) provides an explicit framework for stochastic calibration of hydrogeologic models accounting for uncertainties; however, the MCMC sampling entails a large number of model calls, and could easily become computationally unwieldy if the high-fidelity hydrogeologic model simulation is time consuming. This study proposes a surrogate-based Bayesian framework to address this notorious issue, and illustrates the methodology by inverse modeling a regional MODFLOW model. The high-fidelity groundwater model is approximated by a fast statistical model using Bagging Multivariate Adaptive Regression Spline (BMARS) algorithm, and hence the MCMC sampling can be efficiently performed. In this study, the MODFLOW model is developed to simulate the groundwater flow in an arid region of Oman consisting of mountain-coast aquifers, and used to run representative