Bayesian nonparametric hierarchical modeling.
Dunson, David B
2009-04-01
In biomedical research, hierarchical models are very widely used to accommodate dependence in multivariate and longitudinal data and for borrowing of information across data from different sources. A primary concern in hierarchical modeling is sensitivity to parametric assumptions, such as linearity and normality of the random effects. Parametric assumptions on latent variable distributions can be challenging to check and are typically unwarranted, given available prior knowledge. This article reviews some recent developments in Bayesian nonparametric methods motivated by complex, multivariate and functional data collected in biomedical studies. The author provides a brief review of flexible parametric approaches relying on finite mixtures and latent class modeling. Dirichlet process mixture models are motivated by the need to generalize these approaches to avoid assuming a fixed finite number of classes. Focusing on an epidemiology application, the author illustrates the practical utility and potential of nonparametric Bayes methods.
Nonparametric Bayesian Modeling of Complex Networks
DEFF Research Database (Denmark)
Schmidt, Mikkel Nørgaard; Mørup, Morten
2013-01-01
an infinite mixture model as running example, we go through the steps of deriving the model as an infinite limit of a finite parametric model, inferring the model parameters by Markov chain Monte Carlo, and checking the model?s fit and predictive performance. We explain how advanced nonparametric models......Modeling structure in complex networks using Bayesian nonparametrics makes it possible to specify flexible model structures and infer the adequate model complexity from the observed data. This article provides a gentle introduction to nonparametric Bayesian modeling of complex networks: Using...
Network structure exploration via Bayesian nonparametric models
International Nuclear Information System (INIS)
Chen, Y; Wang, X L; Xiang, X; Tang, B Z; Bu, J Z
2015-01-01
Complex networks provide a powerful mathematical representation of complex systems in nature and society. To understand complex networks, it is crucial to explore their internal structures, also called structural regularities. The task of network structure exploration is to determine how many groups there are in a complex network and how to group the nodes of the network. Most existing structure exploration methods need to specify either a group number or a certain type of structure when they are applied to a network. In the real world, however, the group number and also the certain type of structure that a network has are usually unknown in advance. To explore structural regularities in complex networks automatically, without any prior knowledge of the group number or the certain type of structure, we extend a probabilistic mixture model that can handle networks with any type of structure but needs to specify a group number using Bayesian nonparametric theory. We also propose a novel Bayesian nonparametric model, called the Bayesian nonparametric mixture (BNPM) model. Experiments conducted on a large number of networks with different structures show that the BNPM model is able to explore structural regularities in networks automatically with a stable, state-of-the-art performance. (paper)
Effect on Prediction when Modeling Covariates in Bayesian Nonparametric Models.
Cruz-Marcelo, Alejandro; Rosner, Gary L; Müller, Peter; Stewart, Clinton F
2013-04-01
In biomedical research, it is often of interest to characterize biologic processes giving rise to observations and to make predictions of future observations. Bayesian nonparametric methods provide a means for carrying out Bayesian inference making as few assumptions about restrictive parametric models as possible. There are several proposals in the literature for extending Bayesian nonparametric models to include dependence on covariates. Limited attention, however, has been directed to the following two aspects. In this article, we examine the effect on fitting and predictive performance of incorporating covariates in a class of Bayesian nonparametric models by one of two primary ways: either in the weights or in the locations of a discrete random probability measure. We show that different strategies for incorporating continuous covariates in Bayesian nonparametric models can result in big differences when used for prediction, even though they lead to otherwise similar posterior inferences. When one needs the predictive density, as in optimal design, and this density is a mixture, it is better to make the weights depend on the covariates. We demonstrate these points via a simulated data example and in an application in which one wants to determine the optimal dose of an anticancer drug used in pediatric oncology.
Nonparametric Bayesian models through probit stick-breaking processes.
Rodríguez, Abel; Dunson, David B
2011-03-01
We describe a novel class of Bayesian nonparametric priors based on stick-breaking constructions where the weights of the process are constructed as probit transformations of normal random variables. We show that these priors are extremely flexible, allowing us to generate a great variety of models while preserving computational simplicity. Particular emphasis is placed on the construction of rich temporal and spatial processes, which are applied to two problems in finance and ecology.
Modeling Non-Gaussian Time Series with Nonparametric Bayesian Model.
Xu, Zhiguang; MacEachern, Steven; Xu, Xinyi
2015-02-01
We present a class of Bayesian copula models whose major components are the marginal (limiting) distribution of a stationary time series and the internal dynamics of the series. We argue that these are the two features with which an analyst is typically most familiar, and hence that these are natural components with which to work. For the marginal distribution, we use a nonparametric Bayesian prior distribution along with a cdf-inverse cdf transformation to obtain large support. For the internal dynamics, we rely on the traditionally successful techniques of normal-theory time series. Coupling the two components gives us a family of (Gaussian) copula transformed autoregressive models. The models provide coherent adjustments of time scales and are compatible with many extensions, including changes in volatility of the series. We describe basic properties of the models, show their ability to recover non-Gaussian marginal distributions, and use a GARCH modification of the basic model to analyze stock index return series. The models are found to provide better fit and improved short-range and long-range predictions than Gaussian competitors. The models are extensible to a large variety of fields, including continuous time models, spatial models, models for multiple series, models driven by external covariate streams, and non-stationary models.
A Bayesian Nonparametric Meta-Analysis Model
Karabatsos, George; Talbott, Elizabeth; Walker, Stephen G.
2015-01-01
In a meta-analysis, it is important to specify a model that adequately describes the effect-size distribution of the underlying population of studies. The conventional normal fixed-effect and normal random-effects models assume a normal effect-size population distribution, conditionally on parameters and covariates. For estimating the mean overall…
Nonparametric Bayesian models for a spatial covariance.
Reich, Brian J; Fuentes, Montserrat
2012-01-01
A crucial step in the analysis of spatial data is to estimate the spatial correlation function that determines the relationship between a spatial process at two locations. The standard approach to selecting the appropriate correlation function is to use prior knowledge or exploratory analysis, such as a variogram analysis, to select the correct parametric correlation function. Rather that selecting a particular parametric correlation function, we treat the covariance function as an unknown function to be estimated from the data. We propose a flexible prior for the correlation function to provide robustness to the choice of correlation function. We specify the prior for the correlation function using spectral methods and the Dirichlet process prior, which is a common prior for an unknown distribution function. Our model does not require Gaussian data or spatial locations on a regular grid. The approach is demonstrated using a simulation study as well as an analysis of California air pollution data.
Bayesian nonparametric data analysis
Müller, Peter; Jara, Alejandro; Hanson, Tim
2015-01-01
This book reviews nonparametric Bayesian methods and models that have proven useful in the context of data analysis. Rather than providing an encyclopedic review of probability models, the book’s structure follows a data analysis perspective. As such, the chapters are organized by traditional data analysis problems. In selecting specific nonparametric models, simpler and more traditional models are favored over specialized ones. The discussed methods are illustrated with a wealth of examples, including applications ranging from stylized examples to case studies from recent literature. The book also includes an extensive discussion of computational methods and details on their implementation. R code for many examples is included in on-line software pages.
DPpackage: Bayesian Semi- and Nonparametric Modeling in R
Directory of Open Access Journals (Sweden)
Alejandro Jara
2011-04-01
Full Text Available Data analysis sometimes requires the relaxation of parametric assumptions in order to gain modeling flexibility and robustness against mis-specification of the probability model. In the Bayesian context, this is accomplished by placing a prior distribution on a function space, such as the space of all probability distributions or the space of all regression functions. Unfortunately, posterior distributions ranging over function spaces are highly complex and hence sampling methods play a key role. This paper provides an introduction to a simple, yet comprehensive, set of programs for the implementation of some Bayesian nonparametric and semiparametric models in R, DPpackage. Currently, DPpackage includes models for marginal and conditional density estimation, receiver operating characteristic curve analysis, interval-censored data, binary regression data, item response data, longitudinal and clustered data using generalized linear mixed models, and regression data using generalized additive models. The package also contains functions to compute pseudo-Bayes factors for model comparison and for eliciting the precision parameter of the Dirichlet process prior, and a general purpose Metropolis sampling algorithm. To maximize computational efficiency, the actual sampling for each model is carried out using compiled C, C++ or Fortran code.
Bayesian nonparametric meta-analysis using Polya tree mixture models.
Branscum, Adam J; Hanson, Timothy E
2008-09-01
Summary. A common goal in meta-analysis is estimation of a single effect measure using data from several studies that are each designed to address the same scientific inquiry. Because studies are typically conducted in geographically disperse locations, recent developments in the statistical analysis of meta-analytic data involve the use of random effects models that account for study-to-study variability attributable to differences in environments, demographics, genetics, and other sources that lead to heterogeneity in populations. Stemming from asymptotic theory, study-specific summary statistics are modeled according to normal distributions with means representing latent true effect measures. A parametric approach subsequently models these latent measures using a normal distribution, which is strictly a convenient modeling assumption absent of theoretical justification. To eliminate the influence of overly restrictive parametric models on inferences, we consider a broader class of random effects distributions. We develop a novel hierarchical Bayesian nonparametric Polya tree mixture (PTM) model. We present methodology for testing the PTM versus a normal random effects model. These methods provide researchers a straightforward approach for conducting a sensitivity analysis of the normality assumption for random effects. An application involving meta-analysis of epidemiologic studies designed to characterize the association between alcohol consumption and breast cancer is presented, which together with results from simulated data highlight the performance of PTMs in the presence of nonnormality of effect measures in the source population.
Bayesian nonparametric clustering in phylogenetics: modeling antigenic evolution in influenza.
Cybis, Gabriela B; Sinsheimer, Janet S; Bedford, Trevor; Rambaut, Andrew; Lemey, Philippe; Suchard, Marc A
2018-01-30
Influenza is responsible for up to 500,000 deaths every year, and antigenic variability represents much of its epidemiological burden. To visualize antigenic differences across many viral strains, antigenic cartography methods use multidimensional scaling on binding assay data to map influenza antigenicity onto a low-dimensional space. Analysis of such assay data ideally leads to natural clustering of influenza strains of similar antigenicity that correlate with sequence evolution. To understand the dynamics of these antigenic groups, we present a framework that jointly models genetic and antigenic evolution by combining multidimensional scaling of binding assay data, Bayesian phylogenetic machinery and nonparametric clustering methods. We propose a phylogenetic Chinese restaurant process that extends the current process to incorporate the phylogenetic dependency structure between strains in the modeling of antigenic clusters. With this method, we are able to use the genetic information to better understand the evolution of antigenicity throughout epidemics, as shown in applications of this model to H1N1 influenza. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.
Bayesian nonparametric estimation of hazard rate in monotone Aalen model
Czech Academy of Sciences Publication Activity Database
Timková, Jana
2014-01-01
Roč. 50, č. 6 (2014), s. 849-868 ISSN 0023-5954 Institutional support: RVO:67985556 Keywords : Aalen model * Bayesian estimation * MCMC Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.541, year: 2014 http://library.utia.cas.cz/separaty/2014/SI/timkova-0438210.pdf
Bayesian Nonparametric Model for Estimating Multistate Travel Time Distribution
Directory of Open Access Journals (Sweden)
Emmanuel Kidando
2017-01-01
Full Text Available Multistate models, that is, models with more than two distributions, are preferred over single-state probability models in modeling the distribution of travel time. Literature review indicated that the finite multistate modeling of travel time using lognormal distribution is superior to other probability functions. In this study, we extend the finite multistate lognormal model of estimating the travel time distribution to unbounded lognormal distribution. In particular, a nonparametric Dirichlet Process Mixture Model (DPMM with stick-breaking process representation was used. The strength of the DPMM is that it can choose the number of components dynamically as part of the algorithm during parameter estimation. To reduce computational complexity, the modeling process was limited to a maximum of six components. Then, the Markov Chain Monte Carlo (MCMC sampling technique was employed to estimate the parameters’ posterior distribution. Speed data from nine links of a freeway corridor, aggregated on a 5-minute basis, were used to calculate the corridor travel time. The results demonstrated that this model offers significant flexibility in modeling to account for complex mixture distributions of the travel time without specifying the number of components. The DPMM modeling further revealed that freeway travel time is characterized by multistate or single-state models depending on the inclusion of onset and offset of congestion periods.
A Bayesian Beta-Mixture Model for Nonparametric IRT (BBM-IRT)
Arenson, Ethan A.; Karabatsos, George
2017-01-01
Item response models typically assume that the item characteristic (step) curves follow a logistic or normal cumulative distribution function, which are strictly monotone functions of person test ability. Such assumptions can be overly-restrictive for real item response data. We propose a simple and more flexible Bayesian nonparametric IRT model…
A Bayesian approach to the analysis of quantal bioassay studies using nonparametric mixture models.
Fronczyk, Kassandra; Kottas, Athanasios
2014-03-01
We develop a Bayesian nonparametric mixture modeling framework for quantal bioassay settings. The approach is built upon modeling dose-dependent response distributions. We adopt a structured nonparametric prior mixture model, which induces a monotonicity restriction for the dose-response curve. Particular emphasis is placed on the key risk assessment goal of calibration for the dose level that corresponds to a specified response. The proposed methodology yields flexible inference for the dose-response relationship as well as for other inferential objectives, as illustrated with two data sets from the literature. © 2013, The International Biometric Society.
Scalable Bayesian nonparametric regression via a Plackett-Luce model for conditional ranks
Gray-Davies, Tristan; Holmes, Chris C.; Caron, François
2018-01-01
We present a novel Bayesian nonparametric regression model for covariates X and continuous response variable Y ∈ ℝ. The model is parametrized in terms of marginal distributions for Y and X and a regression function which tunes the stochastic ordering of the conditional distributions F (y|x). By adopting an approximate composite likelihood approach, we show that the resulting posterior inference can be decoupled for the separate components of the model. This procedure can scale to very large datasets and allows for the use of standard, existing, software from Bayesian nonparametric density estimation and Plackett-Luce ranking estimation to be applied. As an illustration, we show an application of our approach to a US Census dataset, with over 1,300,000 data points and more than 100 covariates. PMID:29623150
Bayesian Nonparametric Statistical Inference for Shock Models and Wear Processes.
1979-12-01
also note that the results in Section 2 do not depend on the support of F .) This shock model have been studied by Esary, Marshall and Proschan (1973...Barlow and Proschan (1975), among others. The analogy of the shock model in risk and acturial analysis has been given by BUhlmann (1970, Chapter 2... Mathematical Statistics, Vol. 4, pp. 894-906. Billingsley, P. (1968), CONVERGENCE OF PROBABILITY MEASURES, John Wiley, New York. BUhlmann, H. (1970
Bayesian Nonparametric Longitudinal Data Analysis.
Quintana, Fernando A; Johnson, Wesley O; Waetjen, Elaine; Gold, Ellen
2016-01-01
Practical Bayesian nonparametric methods have been developed across a wide variety of contexts. Here, we develop a novel statistical model that generalizes standard mixed models for longitudinal data that include flexible mean functions as well as combined compound symmetry (CS) and autoregressive (AR) covariance structures. AR structure is often specified through the use of a Gaussian process (GP) with covariance functions that allow longitudinal data to be more correlated if they are observed closer in time than if they are observed farther apart. We allow for AR structure by considering a broader class of models that incorporates a Dirichlet Process Mixture (DPM) over the covariance parameters of the GP. We are able to take advantage of modern Bayesian statistical methods in making full predictive inferences and about characteristics of longitudinal profiles and their differences across covariate combinations. We also take advantage of the generality of our model, which provides for estimation of a variety of covariance structures. We observe that models that fail to incorporate CS or AR structure can result in very poor estimation of a covariance or correlation matrix. In our illustration using hormone data observed on women through the menopausal transition, biology dictates the use of a generalized family of sigmoid functions as a model for time trends across subpopulation categories.
Non-parametric Bayesian graph models reveal community structure in resting state fMRI
DEFF Research Database (Denmark)
Andersen, Kasper Winther; Madsen, Kristoffer H.; Siebner, Hartwig Roman
2014-01-01
Modeling of resting state functional magnetic resonance imaging (rs-fMRI) data using network models is of increasing interest. It is often desirable to group nodes into clusters to interpret the communication patterns between nodes. In this study we consider three different nonparametric Bayesian...... models for node clustering in complex networks. In particular, we test their ability to predict unseen data and their ability to reproduce clustering across datasets. The three generative models considered are the Infinite Relational Model (IRM), Bayesian Community Detection (BCD), and the Infinite...... between clusters. BCD restricts the between-cluster link probabilities to be strictly lower than within-cluster link probabilities to conform to the community structure typically seen in social networks. IDM only models a single between-cluster link probability, which can be interpreted as a background...
Bayesian Non-Parametric Mixtures of GARCH(1,1 Models
Directory of Open Access Journals (Sweden)
John W. Lau
2012-01-01
Full Text Available Traditional GARCH models describe volatility levels that evolve smoothly over time, generated by a single GARCH regime. However, nonstationary time series data may exhibit abrupt changes in volatility, suggesting changes in the underlying GARCH regimes. Further, the number and times of regime changes are not always obvious. This article outlines a nonparametric mixture of GARCH models that is able to estimate the number and time of volatility regime changes by mixing over the Poisson-Kingman process. The process is a generalisation of the Dirichlet process typically used in nonparametric models for time-dependent data provides a richer clustering structure, and its application to time series data is novel. Inference is Bayesian, and a Markov chain Monte Carlo algorithm to explore the posterior distribution is described. The methodology is illustrated on the Standard and Poor's 500 financial index.
Nonparametric Bayesian inference in biostatistics
Müller, Peter
2015-01-01
As chapters in this book demonstrate, BNP has important uses in clinical sciences and inference for issues like unknown partitions in genomics. Nonparametric Bayesian approaches (BNP) play an ever expanding role in biostatistical inference from use in proteomics to clinical trials. Many research problems involve an abundance of data and require flexible and complex probability models beyond the traditional parametric approaches. As this book's expert contributors show, BNP approaches can be the answer. Survival Analysis, in particular survival regression, has traditionally used BNP, but BNP's potential is now very broad. This applies to important tasks like arrangement of patients into clinically meaningful subpopulations and segmenting the genome into functionally distinct regions. This book is designed to both review and introduce application areas for BNP. While existing books provide theoretical foundations, this book connects theory to practice through engaging examples and research questions. Chapters c...
Yau, Christopher; Holmes, Chris
2011-07-01
We propose a hierarchical Bayesian nonparametric mixture model for clustering when some of the covariates are assumed to be of varying relevance to the clustering problem. This can be thought of as an issue in variable selection for unsupervised learning. We demonstrate that by defining a hierarchical population based nonparametric prior on the cluster locations scaled by the inverse covariance matrices of the likelihood we arrive at a 'sparsity prior' representation which admits a conditionally conjugate prior. This allows us to perform full Gibbs sampling to obtain posterior distributions over parameters of interest including an explicit measure of each covariate's relevance and a distribution over the number of potential clusters present in the data. This also allows for individual cluster specific variable selection. We demonstrate improved inference on a number of canonical problems.
A BAYESIAN NONPARAMETRIC MIXTURE MODEL FOR SELECTING GENES AND GENE SUBNETWORKS.
Zhao, Yize; Kang, Jian; Yu, Tianwei
2014-06-01
It is very challenging to select informative features from tens of thousands of measured features in high-throughput data analysis. Recently, several parametric/regression models have been developed utilizing the gene network information to select genes or pathways strongly associated with a clinical/biological outcome. Alternatively, in this paper, we propose a nonparametric Bayesian model for gene selection incorporating network information. In addition to identifying genes that have a strong association with a clinical outcome, our model can select genes with particular expressional behavior, in which case the regression models are not directly applicable. We show that our proposed model is equivalent to an infinity mixture model for which we develop a posterior computation algorithm based on Markov chain Monte Carlo (MCMC) methods. We also propose two fast computing algorithms that approximate the posterior simulation with good accuracy but relatively low computational cost. We illustrate our methods on simulation studies and the analysis of Spellman yeast cell cycle microarray data.
A Bayesian Nonparametric Approach to Factor Analysis
DEFF Research Database (Denmark)
Piatek, Rémi; Papaspiliopoulos, Omiros
2018-01-01
This paper introduces a new approach for the inference of non-Gaussian factor models based on Bayesian nonparametric methods. It relaxes the usual normality assumption on the latent factors, widely used in practice, which is too restrictive in many settings. Our approach, on the contrary, does no...
Xu, Zhiqiang
2017-02-16
Attributed graph clustering, also known as community detection on attributed graphs, attracts much interests recently due to the ubiquity of attributed graphs in real life. Many existing algorithms have been proposed for this problem, which are either distance based or model based. However, model selection in attributed graph clustering has not been well addressed, that is, most existing algorithms assume the cluster number to be known a priori. In this paper, we propose two efficient approaches for attributed graph clustering with automatic model selection. The first approach is a popular Bayesian nonparametric method, while the second approach is an asymptotic method based on a recently proposed model selection criterion, factorized information criterion. Experimental results on both synthetic and real datasets demonstrate that our approaches for attributed graph clustering with automatic model selection significantly outperform the state-of-the-art algorithm.
Xu, Zhiqiang; Cheng, James; Xiao, Xiaokui; Fujimaki, Ryohei; Muraoka, Yusuke
2017-01-01
Attributed graph clustering, also known as community detection on attributed graphs, attracts much interests recently due to the ubiquity of attributed graphs in real life. Many existing algorithms have been proposed for this problem, which are either distance based or model based. However, model selection in attributed graph clustering has not been well addressed, that is, most existing algorithms assume the cluster number to be known a priori. In this paper, we propose two efficient approaches for attributed graph clustering with automatic model selection. The first approach is a popular Bayesian nonparametric method, while the second approach is an asymptotic method based on a recently proposed model selection criterion, factorized information criterion. Experimental results on both synthetic and real datasets demonstrate that our approaches for attributed graph clustering with automatic model selection significantly outperform the state-of-the-art algorithm.
Bayesian nonparametric generative models for causal inference with missing at random covariates.
Roy, Jason; Lum, Kirsten J; Zeldow, Bret; Dworkin, Jordan D; Re, Vincent Lo; Daniels, Michael J
2018-03-26
We propose a general Bayesian nonparametric (BNP) approach to causal inference in the point treatment setting. The joint distribution of the observed data (outcome, treatment, and confounders) is modeled using an enriched Dirichlet process. The combination of the observed data model and causal assumptions allows us to identify any type of causal effect-differences, ratios, or quantile effects, either marginally or for subpopulations of interest. The proposed BNP model is well-suited for causal inference problems, as it does not require parametric assumptions about the distribution of confounders and naturally leads to a computationally efficient Gibbs sampling algorithm. By flexibly modeling the joint distribution, we are also able to impute (via data augmentation) values for missing covariates within the algorithm under an assumption of ignorable missingness, obviating the need to create separate imputed data sets. This approach for imputing the missing covariates has the additional advantage of guaranteeing congeniality between the imputation model and the analysis model, and because we use a BNP approach, parametric models are avoided for imputation. The performance of the method is assessed using simulation studies. The method is applied to data from a cohort study of human immunodeficiency virus/hepatitis C virus co-infected patients. © 2018, The International Biometric Society.
Zhou, Haiming; Hanson, Timothy; Knapp, Roland
2015-12-01
The global emergence of Batrachochytrium dendrobatidis (Bd) has caused the extinction of hundreds of amphibian species worldwide. It has become increasingly important to be able to precisely predict time to Bd arrival in a population. The data analyzed herein present a unique challenge in terms of modeling because there is a strong spatial component to Bd arrival time and the traditional proportional hazards assumption is grossly violated. To address these concerns, we develop a novel marginal Bayesian nonparametric survival model for spatially correlated right-censored data. This class of models assumes that the logarithm of survival times marginally follow a mixture of normal densities with a linear-dependent Dirichlet process prior as the random mixing measure, and their joint distribution is induced by a Gaussian copula model with a spatial correlation structure. To invert high-dimensional spatial correlation matrices, we adopt a full-scale approximation that can capture both large- and small-scale spatial dependence. An efficient Markov chain Monte Carlo algorithm with delayed rejection is proposed for posterior computation, and an R package spBayesSurv is provided to fit the model. This approach is first evaluated through simulations, then applied to threatened frog populations in Sequoia-Kings Canyon National Park. © 2015, The International Biometric Society.
Robustifying Bayesian nonparametric mixtures for count data.
Canale, Antonio; Prünster, Igor
2017-03-01
Our motivating application stems from surveys of natural populations and is characterized by large spatial heterogeneity in the counts, which makes parametric approaches to modeling local animal abundance too restrictive. We adopt a Bayesian nonparametric approach based on mixture models and innovate with respect to popular Dirichlet process mixture of Poisson kernels by increasing the model flexibility at the level both of the kernel and the nonparametric mixing measure. This allows to derive accurate and robust estimates of the distribution of local animal abundance and of the corresponding clusters. The application and a simulation study for different scenarios yield also some general methodological implications. Adding flexibility solely at the level of the mixing measure does not improve inferences, since its impact is severely limited by the rigidity of the Poisson kernel with considerable consequences in terms of bias. However, once a kernel more flexible than the Poisson is chosen, inferences can be robustified by choosing a prior more general than the Dirichlet process. Therefore, to improve the performance of Bayesian nonparametric mixtures for count data one has to enrich the model simultaneously at both levels, the kernel and the mixing measure. © 2016, The International Biometric Society.
Karabatsos, George
2017-02-01
Most of applied statistics involves regression analysis of data. In practice, it is important to specify a regression model that has minimal assumptions which are not violated by data, to ensure that statistical inferences from the model are informative and not misleading. This paper presents a stand-alone and menu-driven software package, Bayesian Regression: Nonparametric and Parametric Models, constructed from MATLAB Compiler. Currently, this package gives the user a choice from 83 Bayesian models for data analysis. They include 47 Bayesian nonparametric (BNP) infinite-mixture regression models; 5 BNP infinite-mixture models for density estimation; and 31 normal random effects models (HLMs), including normal linear models. Each of the 78 regression models handles either a continuous, binary, or ordinal dependent variable, and can handle multi-level (grouped) data. All 83 Bayesian models can handle the analysis of weighted observations (e.g., for meta-analysis), and the analysis of left-censored, right-censored, and/or interval-censored data. Each BNP infinite-mixture model has a mixture distribution assigned one of various BNP prior distributions, including priors defined by either the Dirichlet process, Pitman-Yor process (including the normalized stable process), beta (two-parameter) process, normalized inverse-Gaussian process, geometric weights prior, dependent Dirichlet process, or the dependent infinite-probits prior. The software user can mouse-click to select a Bayesian model and perform data analysis via Markov chain Monte Carlo (MCMC) sampling. After the sampling completes, the software automatically opens text output that reports MCMC-based estimates of the model's posterior distribution and model predictive fit to the data. Additional text and/or graphical output can be generated by mouse-clicking other menu options. This includes output of MCMC convergence analyses, and estimates of the model's posterior predictive distribution, for selected
Bayesian Nonparametric Clustering for Positive Definite Matrices.
Cherian, Anoop; Morellas, Vassilios; Papanikolopoulos, Nikolaos
2016-05-01
Symmetric Positive Definite (SPD) matrices emerge as data descriptors in several applications of computer vision such as object tracking, texture recognition, and diffusion tensor imaging. Clustering these data matrices forms an integral part of these applications, for which soft-clustering algorithms (K-Means, expectation maximization, etc.) are generally used. As is well-known, these algorithms need the number of clusters to be specified, which is difficult when the dataset scales. To address this issue, we resort to the classical nonparametric Bayesian framework by modeling the data as a mixture model using the Dirichlet process (DP) prior. Since these matrices do not conform to the Euclidean geometry, rather belongs to a curved Riemannian manifold,existing DP models cannot be directly applied. Thus, in this paper, we propose a novel DP mixture model framework for SPD matrices. Using the log-determinant divergence as the underlying dissimilarity measure to compare these matrices, and further using the connection between this measure and the Wishart distribution, we derive a novel DPM model based on the Wishart-Inverse-Wishart conjugate pair. We apply this model to several applications in computer vision. Our experiments demonstrate that our model is scalable to the dataset size and at the same time achieves superior accuracy compared to several state-of-the-art parametric and nonparametric clustering algorithms.
Bayesian Bandwidth Selection for a Nonparametric Regression Model with Mixed Types of Regressors
Directory of Open Access Journals (Sweden)
Xibin Zhang
2016-04-01
Full Text Available This paper develops a sampling algorithm for bandwidth estimation in a nonparametric regression model with continuous and discrete regressors under an unknown error density. The error density is approximated by the kernel density estimator of the unobserved errors, while the regression function is estimated using the Nadaraya-Watson estimator admitting continuous and discrete regressors. We derive an approximate likelihood and posterior for bandwidth parameters, followed by a sampling algorithm. Simulation results show that the proposed approach typically leads to better accuracy of the resulting estimates than cross-validation, particularly for smaller sample sizes. This bandwidth estimation approach is applied to nonparametric regression model of the Australian All Ordinaries returns and the kernel density estimation of gross domestic product (GDP growth rates among the organisation for economic co-operation and development (OECD and non-OECD countries.
Bayesian nonparametric adaptive control using Gaussian processes.
Chowdhary, Girish; Kingravi, Hassan A; How, Jonathan P; Vela, Patricio A
2015-03-01
Most current model reference adaptive control (MRAC) methods rely on parametric adaptive elements, in which the number of parameters of the adaptive element are fixed a priori, often through expert judgment. An example of such an adaptive element is radial basis function networks (RBFNs), with RBF centers preallocated based on the expected operating domain. If the system operates outside of the expected operating domain, this adaptive element can become noneffective in capturing and canceling the uncertainty, thus rendering the adaptive controller only semiglobal in nature. This paper investigates a Gaussian process-based Bayesian MRAC architecture (GP-MRAC), which leverages the power and flexibility of GP Bayesian nonparametric models of uncertainty. The GP-MRAC does not require the centers to be preallocated, can inherently handle measurement noise, and enables MRAC to handle a broader set of uncertainties, including those that are defined as distributions over functions. We use stochastic stability arguments to show that GP-MRAC guarantees good closed-loop performance with no prior domain knowledge of the uncertainty. Online implementable GP inference methods are compared in numerical simulations against RBFN-MRAC with preallocated centers and are shown to provide better tracking and improved long-term learning.
Non-parametric Bayesian models of response function in dynamic image sequences
Czech Academy of Sciences Publication Activity Database
Tichý, Ondřej; Šmídl, Václav
2016-01-01
Roč. 151, č. 1 (2016), s. 90-100 ISSN 1077-3142 R&D Projects: GA ČR GA13-29225S Institutional support: RVO:67985556 Keywords : Response function * Blind source separation * Dynamic medical imaging * Probabilistic models * Bayesian methods Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 2.498, year: 2016 http://library.utia.cas.cz/separaty/2016/AS/tichy-0456983.pdf
Yau, C; Papaspiliopoulos, O; Roberts, G O; Holmes, C
2011-01-01
We consider the development of Bayesian Nonparametric methods for product partition models such as Hidden Markov Models and change point models. Our approach uses a Mixture of Dirichlet Process (MDP) model for the unknown sampling distribution (likelihood) for the observations arising in each state and a computationally efficient data augmentation scheme to aid inference. The method uses novel MCMC methodology which combines recent retrospective sampling methods with the use of slice sampler variables. The methodology is computationally efficient, both in terms of MCMC mixing properties, and robustness to the length of the time series being investigated. Moreover, the method is easy to implement requiring little or no user-interaction. We apply our methodology to the analysis of genomic copy number variation.
Bayesian nonparametric modeling for comparison of single-neuron firing intensities.
Kottas, Athanasios; Behseta, Sam
2010-03-01
We propose a fully inferential model-based approach to the problem of comparing the firing patterns of a neuron recorded under two distinct experimental conditions. The methodology is based on nonhomogeneous Poisson process models for the firing times of each condition with flexible nonparametric mixture prior models for the corresponding intensity functions. We demonstrate posterior inferences from a global analysis, which may be used to compare the two conditions over the entire experimental time window, as well as from a pointwise analysis at selected time points to detect local deviations of firing patterns from one condition to another. We apply our method on two neurons recorded from the primary motor cortex area of a monkey's brain while performing a sequence of reaching tasks.
Zhang, Linlin; Guindani, Michele; Versace, Francesco; Vannucci, Marina
2014-07-15
In this paper we present a novel wavelet-based Bayesian nonparametric regression model for the analysis of functional magnetic resonance imaging (fMRI) data. Our goal is to provide a joint analytical framework that allows to detect regions of the brain which exhibit neuronal activity in response to a stimulus and, simultaneously, infer the association, or clustering, of spatially remote voxels that exhibit fMRI time series with similar characteristics. We start by modeling the data with a hemodynamic response function (HRF) with a voxel-dependent shape parameter. We detect regions of the brain activated in response to a given stimulus by using mixture priors with a spike at zero on the coefficients of the regression model. We account for the complex spatial correlation structure of the brain by using a Markov random field (MRF) prior on the parameters guiding the selection of the activated voxels, therefore capturing correlation among nearby voxels. In order to infer association of the voxel time courses, we assume correlated errors, in particular long memory, and exploit the whitening properties of discrete wavelet transforms. Furthermore, we achieve clustering of the voxels by imposing a Dirichlet process (DP) prior on the parameters of the long memory process. For inference, we use Markov Chain Monte Carlo (MCMC) sampling techniques that combine Metropolis-Hastings schemes employed in Bayesian variable selection with sampling algorithms for nonparametric DP models. We explore the performance of the proposed model on simulated data, with both block- and event-related design, and on real fMRI data. Copyright © 2014 Elsevier Inc. All rights reserved.
A Bayesian nonparametric approach to causal inference on quantiles.
Xu, Dandan; Daniels, Michael J; Winterstein, Almut G
2018-02-25
We propose a Bayesian nonparametric approach (BNP) for causal inference on quantiles in the presence of many confounders. In particular, we define relevant causal quantities and specify BNP models to avoid bias from restrictive parametric assumptions. We first use Bayesian additive regression trees (BART) to model the propensity score and then construct the distribution of potential outcomes given the propensity score using a Dirichlet process mixture (DPM) of normals model. We thoroughly evaluate the operating characteristics of our approach and compare it to Bayesian and frequentist competitors. We use our approach to answer an important clinical question involving acute kidney injury using electronic health records. © 2018, The International Biometric Society.
Bayesian nonparametric system reliability using sets of priors
Walter, G.M.; Aslett, L.J.M.; Coolen, F.P.A.
2016-01-01
An imprecise Bayesian nonparametric approach to system reliability with multiple types of components is developed. This allows modelling partial or imperfect prior knowledge on component failure distributions in a flexible way through bounds on the functioning probability. Given component level test
Tang, Niansheng; Chow, Sy-Miin; Ibrahim, Joseph G; Zhu, Hongtu
2017-12-01
Many psychological concepts are unobserved and usually represented as latent factors apprehended through multiple observed indicators. When multiple-subject multivariate time series data are available, dynamic factor analysis models with random effects offer one way of modeling patterns of within- and between-person variations by combining factor analysis and time series analysis at the factor level. Using the Dirichlet process (DP) as a nonparametric prior for individual-specific time series parameters further allows the distributional forms of these parameters to deviate from commonly imposed (e.g., normal or other symmetric) functional forms, arising as a result of these parameters' restricted ranges. Given the complexity of such models, a thorough sensitivity analysis is critical but computationally prohibitive. We propose a Bayesian local influence method that allows for simultaneous sensitivity analysis of multiple modeling components within a single fitting of the model of choice. Five illustrations and an empirical example are provided to demonstrate the utility of the proposed approach in facilitating the detection of outlying cases and common sources of misspecification in dynamic factor analysis models, as well as identification of modeling components that are sensitive to changes in the DP prior specification.
Nonparametric Transfer Function Models
Liu, Jun M.; Chen, Rong; Yao, Qiwei
2009-01-01
In this paper a class of nonparametric transfer function models is proposed to model nonlinear relationships between ‘input’ and ‘output’ time series. The transfer function is smooth with unknown functional forms, and the noise is assumed to be a stationary autoregressive-moving average (ARMA) process. The nonparametric transfer function is estimated jointly with the ARMA parameters. By modeling the correlation in the noise, the transfer function can be estimated more efficiently. The parsimonious ARMA structure improves the estimation efficiency in finite samples. The asymptotic properties of the estimators are investigated. The finite-sample properties are illustrated through simulations and one empirical example. PMID:20628584
Nonparametric Bayesian inference for multidimensional compound Poisson processes
Gugushvili, S.; van der Meulen, F.; Spreij, P.
2015-01-01
Given a sample from a discretely observed multidimensional compound Poisson process, we study the problem of nonparametric estimation of its jump size density r0 and intensity λ0. We take a nonparametric Bayesian approach to the problem and determine posterior contraction rates in this context,
Comparing nonparametric Bayesian tree priors for clonal reconstruction of tumors.
Deshwar, Amit G; Vembu, Shankar; Morris, Quaid
2015-01-01
Statistical machine learning methods, especially nonparametric Bayesian methods, have become increasingly popular to infer clonal population structure of tumors. Here we describe the treeCRP, an extension of the Chinese restaurant process (CRP), a popular construction used in nonparametric mixture models, to infer the phylogeny and genotype of major subclonal lineages represented in the population of cancer cells. We also propose new split-merge updates tailored to the subclonal reconstruction problem that improve the mixing time of Markov chains. In comparisons with the tree-structured stick breaking prior used in PhyloSub, we demonstrate superior mixing and running time using the treeCRP with our new split-merge procedures. We also show that given the same number of samples, TSSB and treeCRP have similar ability to recover the subclonal structure of a tumor…
A Nonparametric Bayesian Approach For Emission Tomography Reconstruction
International Nuclear Information System (INIS)
Barat, Eric; Dautremer, Thomas
2007-01-01
We introduce a PET reconstruction algorithm following a nonparametric Bayesian (NPB) approach. In contrast with Expectation Maximization (EM), the proposed technique does not rely on any space discretization. Namely, the activity distribution--normalized emission intensity of the spatial poisson process--is considered as a spatial probability density and observations are the projections of random emissions whose distribution has to be estimated. This approach is nonparametric in the sense that the quantity of interest belongs to the set of probability measures on R k (for reconstruction in k-dimensions) and it is Bayesian in the sense that we define a prior directly on this spatial measure. In this context, we propose to model the nonparametric probability density as an infinite mixture of multivariate normal distributions. As a prior for this mixture we consider a Dirichlet Process Mixture (DPM) with a Normal-Inverse Wishart (NIW) model as base distribution of the Dirichlet Process. As in EM-family reconstruction, we use a data augmentation scheme where the set of hidden variables are the emission locations for each observed line of response in the continuous object space. Thanks to the data augmentation, we propose a Markov Chain Monte Carlo (MCMC) algorithm (Gibbs sampler) which is able to generate draws from the posterior distribution of the spatial intensity. A difference with EM is that one step of the Gibbs sampler corresponds to the generation of emission locations while only the expected number of emissions per pixel/voxel is used in EM. Another key difference is that the estimated spatial intensity is a continuous function such that there is no need to compute a projection matrix. Finally, draws from the intensity posterior distribution allow the estimation of posterior functionnals like the variance or confidence intervals. Results are presented for simulated data based on a 2D brain phantom and compared to Bayesian MAP-EM
Bayesian nonparametric dictionary learning for compressed sensing MRI.
Huang, Yue; Paisley, John; Lin, Qin; Ding, Xinghao; Fu, Xueyang; Zhang, Xiao-Ping
2014-12-01
We develop a Bayesian nonparametric model for reconstructing magnetic resonance images (MRIs) from highly undersampled k -space data. We perform dictionary learning as part of the image reconstruction process. To this end, we use the beta process as a nonparametric dictionary learning prior for representing an image patch as a sparse combination of dictionary elements. The size of the dictionary and patch-specific sparsity pattern are inferred from the data, in addition to other dictionary learning variables. Dictionary learning is performed directly on the compressed image, and so is tailored to the MRI being considered. In addition, we investigate a total variation penalty term in combination with the dictionary learning model, and show how the denoising property of dictionary learning removes dependence on regularization parameters in the noisy setting. We derive a stochastic optimization algorithm based on Markov chain Monte Carlo for the Bayesian model, and use the alternating direction method of multipliers for efficiently performing total variation minimization. We present empirical results on several MRI, which show that the proposed regularization framework can improve reconstruction accuracy over other methods.
Quantal Response: Nonparametric Modeling
2017-01-01
capture the behavior of observed phenomena. Higher-order polynomial and finite-dimensional spline basis models allow for more complicated responses as the...flexibility as these are nonparametric (not constrained to any particular functional form). These should be useful in identifying nonstandard behavior via... deviance ∆ = −2 log(Lreduced/Lfull) is defined in terms of the likelihood function L. For normal error, Lfull = 1, and based on Eq. A-2, we have log
Prior processes and their applications nonparametric Bayesian estimation
Phadia, Eswar G
2016-01-01
This book presents a systematic and comprehensive treatment of various prior processes that have been developed over the past four decades for dealing with Bayesian approach to solving selected nonparametric inference problems. This revised edition has been substantially expanded to reflect the current interest in this area. After an overview of different prior processes, it examines the now pre-eminent Dirichlet process and its variants including hierarchical processes, then addresses new processes such as dependent Dirichlet, local Dirichlet, time-varying and spatial processes, all of which exploit the countable mixture representation of the Dirichlet process. It subsequently discusses various neutral to right type processes, including gamma and extended gamma, beta and beta-Stacy processes, and then describes the Chinese Restaurant, Indian Buffet and infinite gamma-Poisson processes, which prove to be very useful in areas such as machine learning, information retrieval and featural modeling. Tailfree and P...
A Bayesian nonparametric estimation of distributions and quantiles
International Nuclear Information System (INIS)
Poern, K.
1988-11-01
The report describes a Bayesian, nonparametric method for the estimation of a distribution function and its quantiles. The method, presupposing random sampling, is nonparametric, so the user has to specify a prior distribution on a space of distributions (and not on a parameter space). In the current application, where the method is used to estimate the uncertainty of a parametric calculational model, the Dirichlet prior distribution is to a large extent determined by the first batch of Monte Carlo-realizations. In this case the results of the estimation technique is very similar to the conventional empirical distribution function. The resulting posterior distribution is also Dirichlet, and thus facilitates the determination of probability (confidence) intervals at any given point in the space of interest. Another advantage is that also the posterior distribution of a specified quantitle can be derived and utilized to determine a probability interval for that quantile. The method was devised for use in the PROPER code package for uncertainty and sensitivity analysis. (orig.)
Nonparametric combinatorial sequence models.
Wauthier, Fabian L; Jordan, Michael I; Jojic, Nebojsa
2011-11-01
This work considers biological sequences that exhibit combinatorial structures in their composition: groups of positions of the aligned sequences are "linked" and covary as one unit across sequences. If multiple such groups exist, complex interactions can emerge between them. Sequences of this kind arise frequently in biology but methodologies for analyzing them are still being developed. This article presents a nonparametric prior on sequences which allows combinatorial structures to emerge and which induces a posterior distribution over factorized sequence representations. We carry out experiments on three biological sequence families which indicate that combinatorial structures are indeed present and that combinatorial sequence models can more succinctly describe them than simpler mixture models. We conclude with an application to MHC binding prediction which highlights the utility of the posterior distribution over sequence representations induced by the prior. By integrating out the posterior, our method compares favorably to leading binding predictors.
Seismic Signal Compression Using Nonparametric Bayesian Dictionary Learning via Clustering
Directory of Open Access Journals (Sweden)
Xin Tian
2017-06-01
Full Text Available We introduce a seismic signal compression method based on nonparametric Bayesian dictionary learning method via clustering. The seismic data is compressed patch by patch, and the dictionary is learned online. Clustering is introduced for dictionary learning. A set of dictionaries could be generated, and each dictionary is used for one cluster’s sparse coding. In this way, the signals in one cluster could be well represented by their corresponding dictionaries. A nonparametric Bayesian dictionary learning method is used to learn the dictionaries, which naturally infers an appropriate dictionary size for each cluster. A uniform quantizer and an adaptive arithmetic coding algorithm are adopted to code the sparse coefficients. With comparisons to other state-of-the art approaches, the effectiveness of the proposed method could be validated in the experiments.
A nonparametric Bayesian approach for genetic evaluation in ...
African Journals Online (AJOL)
South African Journal of Animal Science ... the Bayesian and Classical models, a Bayesian procedure is provided which allows these random ... data from the Elsenburg Dormer sheep stud and data from a simulation experiment are utilized. >
Non-parametric Bayesian networks: Improving theory and reviewing applications
International Nuclear Information System (INIS)
Hanea, Anca; Morales Napoles, Oswaldo; Ababei, Dan
2015-01-01
Applications in various domains often lead to high dimensional dependence modelling. A Bayesian network (BN) is a probabilistic graphical model that provides an elegant way of expressing the joint distribution of a large number of interrelated variables. BNs have been successfully used to represent uncertain knowledge in a variety of fields. The majority of applications use discrete BNs, i.e. BNs whose nodes represent discrete variables. Integrating continuous variables in BNs is an area fraught with difficulty. Several methods that handle discrete-continuous BNs have been proposed in the literature. This paper concentrates only on one method called non-parametric BNs (NPBNs). NPBNs were introduced in 2004 and they have been or are currently being used in at least twelve professional applications. This paper provides a short introduction to NPBNs, a couple of theoretical advances, and an overview of applications. The aim of the paper is twofold: one is to present the latest improvements of the theory underlying NPBNs, and the other is to complement the existing overviews of BNs applications with the NPNBs applications. The latter opens the opportunity to discuss some difficulties that applications pose to the theoretical framework and in this way offers some NPBN modelling guidance to practitioners. - Highlights: • The paper gives an overview of the current NPBNs methodology. • We extend the NPBN methodology by relaxing the conditions of one of its fundamental theorems. • We propose improvements of the data mining algorithm for the NPBNs. • We review the professional applications of the NPBNs.
2016-05-31
Distribution Unlimited UU UU UU UU 31-05-2016 15-Apr-2014 14-Jan-2015 Final Report: Technical Topic 3.2.2.d Bayesian and Non- parametric Statistics...of Papers published in non peer-reviewed journals: Final Report: Technical Topic 3.2.2.d Bayesian and Non- parametric Statistics: Integration of Neural...Transfer N/A Number of graduating undergraduates who achieved a 3.5 GPA to 4.0 (4.0 max scale ): Number of graduating undergraduates funded by a DoD funded
Filippi, Sarah; Holmes, Chris C; Nieto-Barajas, Luis E
2016-11-16
In this article we propose novel Bayesian nonparametric methods using Dirichlet Process Mixture (DPM) models for detecting pairwise dependence between random variables while accounting for uncertainty in the form of the underlying distributions. A key criteria is that the procedures should scale to large data sets. In this regard we find that the formal calculation of the Bayes factor for a dependent-vs.-independent DPM joint probability measure is not feasible computationally. To address this we present Bayesian diagnostic measures for characterising evidence against a "null model" of pairwise independence. In simulation studies, as well as for a real data analysis, we show that our approach provides a useful tool for the exploratory nonparametric Bayesian analysis of large multivariate data sets.
Akhtar, Naveed; Mian, Ajmal
2017-10-03
We present a principled approach to learn a discriminative dictionary along a linear classifier for hyperspectral classification. Our approach places Gaussian Process priors over the dictionary to account for the relative smoothness of the natural spectra, whereas the classifier parameters are sampled from multivariate Gaussians. We employ two Beta-Bernoulli processes to jointly infer the dictionary and the classifier. These processes are coupled under the same sets of Bernoulli distributions. In our approach, these distributions signify the frequency of the dictionary atom usage in representing class-specific training spectra, which also makes the dictionary discriminative. Due to the coupling between the dictionary and the classifier, the popularity of the atoms for representing different classes gets encoded into the classifier. This helps in predicting the class labels of test spectra that are first represented over the dictionary by solving a simultaneous sparse optimization problem. The labels of the spectra are predicted by feeding the resulting representations to the classifier. Our approach exploits the nonparametric Bayesian framework to automatically infer the dictionary size--the key parameter in discriminative dictionary learning. Moreover, it also has the desirable property of adaptively learning the association between the dictionary atoms and the class labels by itself. We use Gibbs sampling to infer the posterior probability distributions over the dictionary and the classifier under the proposed model, for which, we derive analytical expressions. To establish the effectiveness of our approach, we test it on benchmark hyperspectral images. The classification performance is compared with the state-of-the-art dictionary learning-based classification methods.
A Bayesian nonparametric approach to reconstruction and prediction of random dynamical systems
Merkatas, Christos; Kaloudis, Konstantinos; Hatjispyros, Spyridon J.
2017-06-01
We propose a Bayesian nonparametric mixture model for the reconstruction and prediction from observed time series data, of discretized stochastic dynamical systems, based on Markov Chain Monte Carlo methods. Our results can be used by researchers in physical modeling interested in a fast and accurate estimation of low dimensional stochastic models when the size of the observed time series is small and the noise process (perhaps) is non-Gaussian. The inference procedure is demonstrated specifically in the case of polynomial maps of an arbitrary degree and when a Geometric Stick Breaking mixture process prior over the space of densities, is applied to the additive errors. Our method is parsimonious compared to Bayesian nonparametric techniques based on Dirichlet process mixtures, flexible and general. Simulations based on synthetic time series are presented.
A Bayesian nonparametric approach to reconstruction and prediction of random dynamical systems.
Merkatas, Christos; Kaloudis, Konstantinos; Hatjispyros, Spyridon J
2017-06-01
We propose a Bayesian nonparametric mixture model for the reconstruction and prediction from observed time series data, of discretized stochastic dynamical systems, based on Markov Chain Monte Carlo methods. Our results can be used by researchers in physical modeling interested in a fast and accurate estimation of low dimensional stochastic models when the size of the observed time series is small and the noise process (perhaps) is non-Gaussian. The inference procedure is demonstrated specifically in the case of polynomial maps of an arbitrary degree and when a Geometric Stick Breaking mixture process prior over the space of densities, is applied to the additive errors. Our method is parsimonious compared to Bayesian nonparametric techniques based on Dirichlet process mixtures, flexible and general. Simulations based on synthetic time series are presented.
Driving Style Analysis Using Primitive Driving Patterns With Bayesian Nonparametric Approaches
Wang, Wenshuo; Xi, Junqiang; Zhao, Ding
2017-01-01
Analysis and recognition of driving styles are profoundly important to intelligent transportation and vehicle calibration. This paper presents a novel driving style analysis framework using the primitive driving patterns learned from naturalistic driving data. In order to achieve this, first, a Bayesian nonparametric learning method based on a hidden semi-Markov model (HSMM) is introduced to extract primitive driving patterns from time series driving data without prior knowledge of the number...
Nonparametric Bayesian density estimation on manifolds with applications to planar shapes.
Bhattacharya, Abhishek; Dunson, David B
2010-12-01
Statistical analysis on landmark-based shape spaces has diverse applications in morphometrics, medical diagnostics, machine vision and other areas. These shape spaces are non-Euclidean quotient manifolds. To conduct nonparametric inferences, one may define notions of centre and spread on this manifold and work with their estimates. However, it is useful to consider full likelihood-based methods, which allow nonparametric estimation of the probability density. This article proposes a broad class of mixture models constructed using suitable kernels on a general compact metric space and then on the planar shape space in particular. Following a Bayesian approach with a nonparametric prior on the mixing distribution, conditions are obtained under which the Kullback-Leibler property holds, implying large support and weak posterior consistency. Gibbs sampling methods are developed for posterior computation, and the methods are applied to problems in density estimation and classification with shape-based predictors. Simulation studies show improved estimation performance relative to existing approaches.
Riihimäki, Jaakko; Sund, Reijo; Vehtari, Aki
2010-06-01
Effective utilisation of limited resources is a challenge for health care providers. Accurate and relevant information extracted from the length of stay distributions is useful for management purposes. Patient care episodes can be reconstructed from the comprehensive health registers, and in this paper we develop a Bayesian approach to analyse the length of care episode after a fractured hip. We model the large scale data with a flexible nonparametric multilayer perceptron network and with a parametric Weibull mixture model. To assess the performances of the models, we estimate expected utilities using predictive density as a utility measure. Since the model parameters cannot be directly compared, we focus on observables, and estimate the relevances of patient explanatory variables in predicting the length of stay. To demonstrate how the use of the nonparametric flexible model is advantageous for this complex health care data, we also study joint effects of variables in predictions, and visualise nonlinearities and interactions found in the data.
Bornkamp, Björn; Ickstadt, Katja
2009-03-01
In this article, we consider monotone nonparametric regression in a Bayesian framework. The monotone function is modeled as a mixture of shifted and scaled parametric probability distribution functions, and a general random probability measure is assumed as the prior for the mixing distribution. We investigate the choice of the underlying parametric distribution function and find that the two-sided power distribution function is well suited both from a computational and mathematical point of view. The model is motivated by traditional nonlinear models for dose-response analysis, and provides possibilities to elicitate informative prior distributions on different aspects of the curve. The method is compared with other recent approaches to monotone nonparametric regression in a simulation study and is illustrated on a data set from dose-response analysis.
Nonparametric Mixture of Regression Models.
Huang, Mian; Li, Runze; Wang, Shaoli
2013-07-01
Motivated by an analysis of US house price index data, we propose nonparametric finite mixture of regression models. We study the identifiability issue of the proposed models, and develop an estimation procedure by employing kernel regression. We further systematically study the sampling properties of the proposed estimators, and establish their asymptotic normality. A modified EM algorithm is proposed to carry out the estimation procedure. We show that our algorithm preserves the ascent property of the EM algorithm in an asymptotic sense. Monte Carlo simulations are conducted to examine the finite sample performance of the proposed estimation procedure. An empirical analysis of the US house price index data is illustrated for the proposed methodology.
Nonparametric correlation models for portfolio allocation
DEFF Research Database (Denmark)
Aslanidis, Nektarios; Casas, Isabel
2013-01-01
This article proposes time-varying nonparametric and semiparametric estimators of the conditional cross-correlation matrix in the context of portfolio allocation. Simulations results show that the nonparametric and semiparametric models are best in DGPs with substantial variability or structural ...... currencies. Results show the nonparametric model generally dominates the others when evaluating in-sample. However, the semiparametric model is best for out-of-sample analysis....
Predicting Market Impact Costs Using Nonparametric Machine Learning Models.
Directory of Open Access Journals (Sweden)
Saerom Park
Full Text Available Market impact cost is the most significant portion of implicit transaction costs that can reduce the overall transaction cost, although it cannot be measured directly. In this paper, we employed the state-of-the-art nonparametric machine learning models: neural networks, Bayesian neural network, Gaussian process, and support vector regression, to predict market impact cost accurately and to provide the predictive model that is versatile in the number of variables. We collected a large amount of real single transaction data of US stock market from Bloomberg Terminal and generated three independent input variables. As a result, most nonparametric machine learning models outperformed a-state-of-the-art benchmark parametric model such as I-star model in four error measures. Although these models encounter certain difficulties in separating the permanent and temporary cost directly, nonparametric machine learning models can be good alternatives in reducing transaction costs by considerably improving in prediction performance.
Predicting Market Impact Costs Using Nonparametric Machine Learning Models.
Park, Saerom; Lee, Jaewook; Son, Youngdoo
2016-01-01
Market impact cost is the most significant portion of implicit transaction costs that can reduce the overall transaction cost, although it cannot be measured directly. In this paper, we employed the state-of-the-art nonparametric machine learning models: neural networks, Bayesian neural network, Gaussian process, and support vector regression, to predict market impact cost accurately and to provide the predictive model that is versatile in the number of variables. We collected a large amount of real single transaction data of US stock market from Bloomberg Terminal and generated three independent input variables. As a result, most nonparametric machine learning models outperformed a-state-of-the-art benchmark parametric model such as I-star model in four error measures. Although these models encounter certain difficulties in separating the permanent and temporary cost directly, nonparametric machine learning models can be good alternatives in reducing transaction costs by considerably improving in prediction performance.
Frasso, Gianluca; Lambert, Philippe
2016-10-01
SummaryThe 2014 Ebola outbreak in Sierra Leone is analyzed using a susceptible-exposed-infectious-removed (SEIR) epidemic compartmental model. The discrete time-stochastic model for the epidemic evolution is coupled to a set of ordinary differential equations describing the dynamics of the expected proportions of subjects in each epidemic state. The unknown parameters are estimated in a Bayesian framework by combining data on the number of new (laboratory confirmed) Ebola cases reported by the Ministry of Health and prior distributions for the transition rates elicited using information collected by the WHO during the follow-up of specific Ebola cases. The time-varying disease transmission rate is modeled in a flexible way using penalized B-splines. Our framework represents a valuable stochastic tool for the study of an epidemic dynamic even when only irregularly observed and possibly aggregated data are available. Simulations and the analysis of the 2014 Sierra Leone Ebola data highlight the merits of the proposed methodology. In particular, the flexible modeling of the disease transmission rate makes the estimation of the effective reproduction number robust to the misspecification of the initial epidemic states and to underreporting of the infectious cases. © The Author 2016. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Adaptive nonparametric Bayesian inference using location-scale mixture priors
Jonge, de R.; Zanten, van J.H.
2010-01-01
We study location-scale mixture priors for nonparametric statistical problems, including multivariate regression, density estimation and classification. We show that a rate-adaptive procedure can be obtained if the prior is properly constructed. In particular, we show that adaptation is achieved if
A nonparametric Bayesian approach for genetic evaluation in ...
African Journals Online (AJOL)
Unknown
Finally, one can report the whole of the posterior probability distributions of the parameters in ... the Markov Chain Monte Carlo Methods, and more specific Gibbs Sampling, these ...... Bayesian Methods in Animal Breeding Theory. J. Anim. Sci.
DEFF Research Database (Denmark)
Jensen, Finn Verner; Nielsen, Thomas Dyhre
2016-01-01
Mathematically, a Bayesian graphical model is a compact representation of the joint probability distribution for a set of variables. The most frequently used type of Bayesian graphical models are Bayesian networks. The structural part of a Bayesian graphical model is a graph consisting of nodes...
Ryu, Duchwan
2010-09-28
We consider nonparametric regression analysis in a generalized linear model (GLM) framework for data with covariates that are the subject-specific random effects of longitudinal measurements. The usual assumption that the effects of the longitudinal covariate processes are linear in the GLM may be unrealistic and if this happens it can cast doubt on the inference of observed covariate effects. Allowing the regression functions to be unknown, we propose to apply Bayesian nonparametric methods including cubic smoothing splines or P-splines for the possible nonlinearity and use an additive model in this complex setting. To improve computational efficiency, we propose the use of data-augmentation schemes. The approach allows flexible covariance structures for the random effects and within-subject measurement errors of the longitudinal processes. The posterior model space is explored through a Markov chain Monte Carlo (MCMC) sampler. The proposed methods are illustrated and compared to other approaches, the "naive" approach and the regression calibration, via simulations and by an application that investigates the relationship between obesity in adulthood and childhood growth curves. © 2010, The International Biometric Society.
DEFF Research Database (Denmark)
Ramirez, José Rangel; Sørensen, John Dalsgaard
2011-01-01
This work illustrates the updating and incorporation of information in the assessment of fatigue reliability for offshore wind turbine. The new information, coming from external and condition monitoring can be used to direct updating of the stochastic variables through a non-parametric Bayesian u...
A non-parametric Bayesian approach to decompounding from high frequency data
Gugushvili, Shota; van der Meulen, F.H.; Spreij, Peter
2016-01-01
Given a sample from a discretely observed compound Poisson process, we consider non-parametric estimation of the density f0 of its jump sizes, as well as of its intensity λ0. We take a Bayesian approach to the problem and specify the prior on f0 as the Dirichlet location mixture of normal densities.
Nonparametric Bayesian predictive distributions for future order statistics
Richard A. Johnson; James W. Evans; David W. Green
1999-01-01
We derive the predictive distribution for a specified order statistic, determined from a future random sample, under a Dirichlet process prior. Two variants of the approach are treated and some limiting cases studied. A practical application to monitoring the strength of lumber is discussed including choices of prior expectation and comparisons made to a Bayesian...
Park, Taeyoung; Jeong, Jong-Hyeon; Lee, Jae Won
2012-08-15
There is often an interest in estimating a residual life function as a summary measure of survival data. For ease in presentation of the potential therapeutic effect of a new drug, investigators may summarize survival data in terms of the remaining life years of patients. Under heavy right censoring, however, some reasonably high quantiles (e.g., median) of a residual lifetime distribution cannot be always estimated via a popular nonparametric approach on the basis of the Kaplan-Meier estimator. To overcome the difficulties in dealing with heavily censored survival data, this paper develops a Bayesian nonparametric approach that takes advantage of a fully model-based but highly flexible probabilistic framework. We use a Dirichlet process mixture of Weibull distributions to avoid strong parametric assumptions on the unknown failure time distribution, making it possible to estimate any quantile residual life function under heavy censoring. Posterior computation through Markov chain Monte Carlo is straightforward and efficient because of conjugacy properties and partial collapse. We illustrate the proposed methods by using both simulated data and heavily censored survival data from a recent breast cancer clinical trial conducted by the National Surgical Adjuvant Breast and Bowel Project. Copyright © 2012 John Wiley & Sons, Ltd.
MAP estimators and their consistency in Bayesian nonparametric inverse problems
Dashti, M.
2013-09-01
We consider the inverse problem of estimating an unknown function u from noisy measurements y of a known, possibly nonlinear, map applied to u. We adopt a Bayesian approach to the problem and work in a setting where the prior measure is specified as a Gaussian random field μ0. We work under a natural set of conditions on the likelihood which implies the existence of a well-posed posterior measure, μy. Under these conditions, we show that the maximum a posteriori (MAP) estimator is well defined as the minimizer of an Onsager-Machlup functional defined on the Cameron-Martin space of the prior; thus, we link a problem in probability with a problem in the calculus of variations. We then consider the case where the observational noise vanishes and establish a form of Bayesian posterior consistency for the MAP estimator. We also prove a similar result for the case where the observation of can be repeated as many times as desired with independent identically distributed noise. The theory is illustrated with examples from an inverse problem for the Navier-Stokes equation, motivated by problems arising in weather forecasting, and from the theory of conditioned diffusions, motivated by problems arising in molecular dynamics. © 2013 IOP Publishing Ltd.
MAP estimators and their consistency in Bayesian nonparametric inverse problems
International Nuclear Information System (INIS)
Dashti, M; Law, K J H; Stuart, A M; Voss, J
2013-01-01
We consider the inverse problem of estimating an unknown function u from noisy measurements y of a known, possibly nonlinear, map G applied to u. We adopt a Bayesian approach to the problem and work in a setting where the prior measure is specified as a Gaussian random field μ 0 . We work under a natural set of conditions on the likelihood which implies the existence of a well-posed posterior measure, μ y . Under these conditions, we show that the maximum a posteriori (MAP) estimator is well defined as the minimizer of an Onsager–Machlup functional defined on the Cameron–Martin space of the prior; thus, we link a problem in probability with a problem in the calculus of variations. We then consider the case where the observational noise vanishes and establish a form of Bayesian posterior consistency for the MAP estimator. We also prove a similar result for the case where the observation of G(u) can be repeated as many times as desired with independent identically distributed noise. The theory is illustrated with examples from an inverse problem for the Navier–Stokes equation, motivated by problems arising in weather forecasting, and from the theory of conditioned diffusions, motivated by problems arising in molecular dynamics. (paper)
Nonparametric estimation in models for unobservable heterogeneity
Hohmann, Daniel
2014-01-01
Nonparametric models which allow for data with unobservable heterogeneity are studied. The first publication introduces new estimators and their asymptotic properties for conditional mixture models. The second publication considers estimation of a function from noisy observations of its Radon transform in a Gaussian white noise model.
Bayesian Nonparametric Mixture Estimation for Time-Indexed Functional Data in R
Directory of Open Access Journals (Sweden)
Terrance D. Savitsky
2016-08-01
Full Text Available We present growfunctions for R that offers Bayesian nonparametric estimation models for analysis of dependent, noisy time series data indexed by a collection of domains. This data structure arises from combining periodically published government survey statistics, such as are reported in the Current Population Study (CPS. The CPS publishes monthly, by-state estimates of employment levels, where each state expresses a noisy time series. Published state-level estimates from the CPS are composed from household survey responses in a model-free manner and express high levels of volatility due to insufficient sample sizes. Existing software solutions borrow information over a modeled time-based dependence to extract a de-noised time series for each domain. These solutions, however, ignore the dependence among the domains that may be additionally leveraged to improve estimation efficiency. The growfunctions package offers two fully nonparametric mixture models that simultaneously estimate both a time and domain-indexed dependence structure for a collection of time series: (1 A Gaussian process (GP construction, which is parameterized through the covariance matrix, estimates a latent function for each domain. The covariance parameters of the latent functions are indexed by domain under a Dirichlet process prior that permits estimation of the dependence among functions across the domains: (2 An intrinsic Gaussian Markov random field prior construction provides an alternative to the GP that expresses different computation and estimation properties. In addition to performing denoised estimation of latent functions from published domain estimates, growfunctions allows estimation of collections of functions for observation units (e.g., households, rather than aggregated domains, by accounting for an informative sampling design under which the probabilities for inclusion of observation units are related to the response variable. growfunctions includes plot
Fully probabilistic design of hierarchical Bayesian models
Czech Academy of Sciences Publication Activity Database
Quinn, A.; Kárný, Miroslav; Guy, Tatiana Valentine
2016-01-01
Roč. 369, č. 1 (2016), s. 532-547 ISSN 0020-0255 R&D Projects: GA ČR GA13-13502S Institutional support: RVO:67985556 Keywords : Fully probabilistic design * Ideal distribution * Minimum cross-entropy principle * Bayesian conditioning * Kullback-Leibler divergence * Bayesian nonparametric modelling Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 4.832, year: 2016 http://library.utia.cas.cz/separaty/2016/AS/karny-0463052.pdf
Single molecule force spectroscopy at high data acquisition: A Bayesian nonparametric analysis
Sgouralis, Ioannis; Whitmore, Miles; Lapidus, Lisa; Comstock, Matthew J.; Pressé, Steve
2018-03-01
Bayesian nonparametrics (BNPs) are poised to have a deep impact in the analysis of single molecule data as they provide posterior probabilities over entire models consistent with the supplied data, not just model parameters of one preferred model. Thus they provide an elegant and rigorous solution to the difficult problem encountered when selecting an appropriate candidate model. Nevertheless, BNPs' flexibility to learn models and their associated parameters from experimental data is a double-edged sword. Most importantly, BNPs are prone to increasing the complexity of the estimated models due to artifactual features present in time traces. Thus, because of experimental challenges unique to single molecule methods, naive application of available BNP tools is not possible. Here we consider traces with time correlations and, as a specific example, we deal with force spectroscopy traces collected at high acquisition rates. While high acquisition rates are required in order to capture dwells in short-lived molecular states, in this setup, a slow response of the optical trap instrumentation (i.e., trapped beads, ambient fluid, and tethering handles) distorts the molecular signals introducing time correlations into the data that may be misinterpreted as true states by naive BNPs. Our adaptation of BNP tools explicitly takes into consideration these response dynamics, in addition to drift and noise, and makes unsupervised time series analysis of correlated single molecule force spectroscopy measurements possible, even at acquisition rates similar to or below the trap's response times.
Nonparametric Bayesian inference for mean residual life functions in survival analysis.
Poynor, Valerie; Kottas, Athanasios
2018-01-19
Modeling and inference for survival analysis problems typically revolves around different functions related to the survival distribution. Here, we focus on the mean residual life (MRL) function, which provides the expected remaining lifetime given that a subject has survived (i.e. is event-free) up to a particular time. This function is of direct interest in reliability, medical, and actuarial fields. In addition to its practical interpretation, the MRL function characterizes the survival distribution. We develop general Bayesian nonparametric inference for MRL functions built from a Dirichlet process mixture model for the associated survival distribution. The resulting model for the MRL function admits a representation as a mixture of the kernel MRL functions with time-dependent mixture weights. This model structure allows for a wide range of shapes for the MRL function. Particular emphasis is placed on the selection of the mixture kernel, taken to be a gamma distribution, to obtain desirable properties for the MRL function arising from the mixture model. The inference method is illustrated with a data set of two experimental groups and a data set involving right censoring. The supplementary material available at Biostatistics online provides further results on empirical performance of the model, using simulated data examples. © The Author 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Graziani, Rebecca; Guindani, Michele; Thall, Peter F.
2015-01-01
Summary The effect of a targeted agent on a cancer patient's clinical outcome putatively is mediated through the agent's effect on one or more early biological events. This is motivated by pre-clinical experiments with cells or animals that identify such events, represented by binary or quantitative biomarkers. When evaluating targeted agents in humans, central questions are whether the distribution of a targeted biomarker changes following treatment, the nature and magnitude of this change, and whether it is associated with clinical outcome. Major difficulties in estimating these effects are that a biomarker's distribution may be complex, vary substantially between patients, and have complicated relationships with clinical outcomes. We present a probabilistically coherent framework for modeling and estimation in this setting, including a hierarchical Bayesian nonparametric mixture model for biomarkers that we use to define a functional profile of pre-versus-post treatment biomarker distribution change. The functional is similar to the receiver operating characteristic used in diagnostic testing. The hierarchical model yields clusters of individual patient biomarker profile functionals, and we use the profile as a covariate in a regression model for clinical outcome. The methodology is illustrated by analysis of a dataset from a clinical trial in prostate cancer using imatinib to target platelet-derived growth factor, with the clinical aim to improve progression-free survival time. PMID:25319212
Rodríguez, Abel; Wang, Ziwei; Kottas, Athanasios
2017-01-01
We develop a Bayesian nonparametric model to assess the effect of systematic risks on multiple financial markets, and apply it to understand the behavior of the S&P500 sector indexes between January 1, 2000 and December 31, 2011. More than prediction, our main goal is to understand the evolution of systematic and idiosyncratic risks in the U.S. economy over this particular time period, leading to novel sector-specific risk indexes. To accomplish this goal, we model the appearance of extreme l...
Ponciano, José Miguel
2017-11-22
Using a nonparametric Bayesian approach Palacios and Minin (2013) dramatically improved the accuracy, precision of Bayesian inference of population size trajectories from gene genealogies. These authors proposed an extension of a Gaussian Process (GP) nonparametric inferential method for the intensity function of non-homogeneous Poisson processes. They found that not only the statistical properties of the estimators were improved with their method, but also, that key aspects of the demographic histories were recovered. The authors' work represents the first Bayesian nonparametric solution to this inferential problem because they specify a convenient prior belief without a particular functional form on the population trajectory. Their approach works so well and provides such a profound understanding of the biological process, that the question arises as to how truly "biology-free" their approach really is. Using well-known concepts of stochastic population dynamics, here I demonstrate that in fact, Palacios and Minin's GP model can be cast as a parametric population growth model with density dependence and environmental stochasticity. Making this link between population genetics and stochastic population dynamics modeling provides novel insights into eliciting biologically meaningful priors for the trajectory of the effective population size. The results presented here also bring novel understanding of GP as models for the evolution of a trait. Thus, the ecological principles foundation of Palacios and Minin (2013)'s prior adds to the conceptual and scientific value of these authors' inferential approach. I conclude this note by listing a series of insights brought about by this connection with Ecology. Copyright © 2017 The Author. Published by Elsevier Inc. All rights reserved.
Parametric and Non-Parametric System Modelling
DEFF Research Database (Denmark)
Nielsen, Henrik Aalborg
1999-01-01
the focus is on combinations of parametric and non-parametric methods of regression. This combination can be in terms of additive models where e.g. one or more non-parametric term is added to a linear regression model. It can also be in terms of conditional parametric models where the coefficients...... considered. It is shown that adaptive estimation in conditional parametric models can be performed by combining the well known methods of local polynomial regression and recursive least squares with exponential forgetting. The approach used for estimation in conditional parametric models also highlights how...... networks is included. In this paper, neural networks are used for predicting the electricity production of a wind farm. The results are compared with results obtained using an adaptively estimated ARX-model. Finally, two papers on stochastic differential equations are included. In the first paper, among...
Directory of Open Access Journals (Sweden)
Antonio Canale
2017-06-01
Full Text Available msBP is an R package that implements a new method to perform Bayesian multiscale nonparametric inference introduced by Canale and Dunson (2016. The method, based on mixtures of multiscale beta dictionary densities, overcomes the drawbacks of Pólya trees and inherits many of the advantages of Dirichlet process mixture models. The key idea is that an infinitely-deep binary tree is introduced, with a beta dictionary density assigned to each node of the tree. Using a multiscale stick-breaking characterization, stochastically decreasing weights are assigned to each node. The result is an infinite mixture model. The package msBP implements a series of basic functions to deal with this family of priors such as random densities and numbers generation, creation and manipulation of binary tree objects, and generic functions to plot and print the results. In addition, it implements the Gibbs samplers for posterior computation to perform multiscale density estimation and multiscale testing of group differences described in Canale and Dunson (2016.
Non-parametric Bayesian inference for inhomogeneous Markov point processes
DEFF Research Database (Denmark)
Berthelsen, Kasper Klitgaard; Møller, Jesper; Johansen, Per Michael
is a shot noise process, and the interaction function for a pair of points depends only on the distance between the two points and is a piecewise linear function modelled by a marked Poisson process. Simulation of the resulting posterior using a Metropolis-Hastings algorithm in the "conventional" way...
A framework for Bayesian nonparametric inference for causal effects of mediation.
Kim, Chanmin; Daniels, Michael J; Marcus, Bess H; Roy, Jason A
2017-06-01
We propose a Bayesian non-parametric (BNP) framework for estimating causal effects of mediation, the natural direct, and indirect, effects. The strategy is to do this in two parts. Part 1 is a flexible model (using BNP) for the observed data distribution. Part 2 is a set of uncheckable assumptions with sensitivity parameters that in conjunction with Part 1 allows identification and estimation of the causal parameters and allows for uncertainty about these assumptions via priors on the sensitivity parameters. For Part 1, we specify a Dirichlet process mixture of multivariate normals as a prior on the joint distribution of the outcome, mediator, and covariates. This approach allows us to obtain a (simple) closed form of each marginal distribution. For Part 2, we consider two sets of assumptions: (a) the standard sequential ignorability (Imai et al., 2010) and (b) weakened set of the conditional independence type assumptions introduced in Daniels et al. (2012) and propose sensitivity analyses for both. We use this approach to assess mediation in a physical activity promotion trial. © 2016, The International Biometric Society.
Online Nonparametric Bayesian Activity Mining and Analysis From Surveillance Video.
Bastani, Vahid; Marcenaro, Lucio; Regazzoni, Carlo S
2016-05-01
A method for online incremental mining of activity patterns from the surveillance video stream is presented in this paper. The framework consists of a learning block in which Dirichlet process mixture model is employed for the incremental clustering of trajectories. Stochastic trajectory pattern models are formed using the Gaussian process regression of the corresponding flow functions. Moreover, a sequential Monte Carlo method based on Rao-Blackwellized particle filter is proposed for tracking and online classification as well as the detection of abnormality during the observation of an object. Experimental results on real surveillance video data are provided to show the performance of the proposed algorithm in different tasks of trajectory clustering, classification, and abnormality detection.
Nonparametric Bayes Modeling of Multivariate Categorical Data.
Dunson, David B; Xing, Chuanhua
2012-01-01
Modeling of multivariate unordered categorical (nominal) data is a challenging problem, particularly in high dimensions and cases in which one wishes to avoid strong assumptions about the dependence structure. Commonly used approaches rely on the incorporation of latent Gaussian random variables or parametric latent class models. The goal of this article is to develop a nonparametric Bayes approach, which defines a prior with full support on the space of distributions for multiple unordered categorical variables. This support condition ensures that we are not restricting the dependence structure a priori. We show this can be accomplished through a Dirichlet process mixture of product multinomial distributions, which is also a convenient form for posterior computation. Methods for nonparametric testing of violations of independence are proposed, and the methods are applied to model positional dependence within transcription factor binding motifs.
A non-parametric hierarchical model to discover behavior dynamics from tracks
Kooij, J.F.P.; Englebienne, G.; Gavrila, D.M.
2012-01-01
We present a novel non-parametric Bayesian model to jointly discover the dynamics of low-level actions and high-level behaviors of tracked people in open environments. Our model represents behaviors as Markov chains of actions which capture high-level temporal dynamics. Actions may be shared by
Nonparametric Mixture Models for Supervised Image Parcellation.
Sabuncu, Mert R; Yeo, B T Thomas; Van Leemput, Koen; Fischl, Bruce; Golland, Polina
2009-09-01
We present a nonparametric, probabilistic mixture model for the supervised parcellation of images. The proposed model yields segmentation algorithms conceptually similar to the recently developed label fusion methods, which register a new image with each training image separately. Segmentation is achieved via the fusion of transferred manual labels. We show that in our framework various settings of a model parameter yield algorithms that use image intensity information differently in determining the weight of a training subject during fusion. One particular setting computes a single, global weight per training subject, whereas another setting uses locally varying weights when fusing the training data. The proposed nonparametric parcellation approach capitalizes on recently developed fast and robust pairwise image alignment tools. The use of multiple registrations allows the algorithm to be robust to occasional registration failures. We report experiments on 39 volumetric brain MRI scans with expert manual labels for the white matter, cerebral cortex, ventricles and subcortical structures. The results demonstrate that the proposed nonparametric segmentation framework yields significantly better segmentation than state-of-the-art algorithms.
Genomic breeding value estimation using nonparametric additive regression models
Directory of Open Access Journals (Sweden)
Solberg Trygve
2009-01-01
Full Text Available Abstract Genomic selection refers to the use of genomewide dense markers for breeding value estimation and subsequently for selection. The main challenge of genomic breeding value estimation is the estimation of many effects from a limited number of observations. Bayesian methods have been proposed to successfully cope with these challenges. As an alternative class of models, non- and semiparametric models were recently introduced. The present study investigated the ability of nonparametric additive regression models to predict genomic breeding values. The genotypes were modelled for each marker or pair of flanking markers (i.e. the predictors separately. The nonparametric functions for the predictors were estimated simultaneously using additive model theory, applying a binomial kernel. The optimal degree of smoothing was determined by bootstrapping. A mutation-drift-balance simulation was carried out. The breeding values of the last generation (genotyped was predicted using data from the next last generation (genotyped and phenotyped. The results show moderate to high accuracies of the predicted breeding values. A determination of predictor specific degree of smoothing increased the accuracy.
Copula Based Factorization in Bayesian Multivariate Infinite Mixture Models
Martin Burda; Artem Prokhorov
2012-01-01
Bayesian nonparametric models based on infinite mixtures of density kernels have been recently gaining in popularity due to their flexibility and feasibility of implementation even in complicated modeling scenarios. In economics, they have been particularly useful in estimating nonparametric distributions of latent variables. However, these models have been rarely applied in more than one dimension. Indeed, the multivariate case suffers from the curse of dimensionality, with a rapidly increas...
Energy Technology Data Exchange (ETDEWEB)
Takamizawa, Hisashi, E-mail: takamizawa.hisashi@jaea.go.jp; Itoh, Hiroto, E-mail: ito.hiroto@jaea.go.jp; Nishiyama, Yutaka, E-mail: nishiyama.yutaka93@jaea.go.jp
2016-10-15
In order to understand neutron irradiation embrittlement in high fluence regions, statistical analysis using the Bayesian nonparametric (BNP) method was performed for the Japanese surveillance and material test reactor irradiation database. The BNP method is essentially expressed as an infinite summation of normal distributions, with input data being subdivided into clusters with identical statistical parameters, such as mean and standard deviation, for each cluster to estimate shifts in ductile-to-brittle transition temperature (DBTT). The clusters typically depend on chemical compositions, irradiation conditions, and the irradiation embrittlement. Specific variables contributing to the irradiation embrittlement include the content of Cu, Ni, P, Si, and Mn in the pressure vessel steels, neutron flux, neutron fluence, and irradiation temperatures. It was found that the measured shifts of DBTT correlated well with the calculated ones. Data associated with the same materials were subdivided into the same clusters even if neutron fluences were increased.
Palacios, Julia A; Minin, Vladimir N
2013-03-01
Changes in population size influence genetic diversity of the population and, as a result, leave a signature of these changes in individual genomes in the population. We are interested in the inverse problem of reconstructing past population dynamics from genomic data. We start with a standard framework based on the coalescent, a stochastic process that generates genealogies connecting randomly sampled individuals from the population of interest. These genealogies serve as a glue between the population demographic history and genomic sequences. It turns out that only the times of genealogical lineage coalescences contain information about population size dynamics. Viewing these coalescent times as a point process, estimating population size trajectories is equivalent to estimating a conditional intensity of this point process. Therefore, our inverse problem is similar to estimating an inhomogeneous Poisson process intensity function. We demonstrate how recent advances in Gaussian process-based nonparametric inference for Poisson processes can be extended to Bayesian nonparametric estimation of population size dynamics under the coalescent. We compare our Gaussian process (GP) approach to one of the state-of-the-art Gaussian Markov random field (GMRF) methods for estimating population trajectories. Using simulated data, we demonstrate that our method has better accuracy and precision. Next, we analyze two genealogies reconstructed from real sequences of hepatitis C and human Influenza A viruses. In both cases, we recover more believed aspects of the viral demographic histories than the GMRF approach. We also find that our GP method produces more reasonable uncertainty estimates than the GMRF method. Copyright © 2013, The International Biometric Society.
A Structural Labor Supply Model with Nonparametric Preferences
van Soest, A.H.O.; Das, J.W.M.; Gong, X.
2000-01-01
Nonparametric techniques are usually seen as a statistic device for data description and exploration, and not as a tool for estimating models with a richer economic structure, which are often required for policy analysis.This paper presents an example where nonparametric flexibility can be attained
Bayesian Modelling of Functional Whole Brain Connectivity
DEFF Research Database (Denmark)
Røge, Rasmus
the prevalent strategy of standardizing of fMRI time series and model data using directional statistics or we model the variability in the signal across the brain and across multiple subjects. In either case, we use Bayesian nonparametric modeling to automatically learn from the fMRI data the number......This thesis deals with parcellation of whole-brain functional magnetic resonance imaging (fMRI) using Bayesian inference with mixture models tailored to the fMRI data. In the three included papers and manuscripts, we analyze two different approaches to modeling fMRI signal; either we accept...... of funcional units, i.e. parcels. We benchmark the proposed mixture models against state of the art methods of brain parcellation, both probabilistic and non-probabilistic. The time series of each voxel are most often standardized using z-scoring which projects the time series data onto a hypersphere...
A nonparametric mixture model for cure rate estimation.
Peng, Y; Dear, K B
2000-03-01
Nonparametric methods have attracted less attention than their parametric counterparts for cure rate analysis. In this paper, we study a general nonparametric mixture model. The proportional hazards assumption is employed in modeling the effect of covariates on the failure time of patients who are not cured. The EM algorithm, the marginal likelihood approach, and multiple imputations are employed to estimate parameters of interest in the model. This model extends models and improves estimation methods proposed by other researchers. It also extends Cox's proportional hazards regression model by allowing a proportion of event-free patients and investigating covariate effects on that proportion. The model and its estimation method are investigated by simulations. An application to breast cancer data, including comparisons with previous analyses using a parametric model and an existing nonparametric model by other researchers, confirms the conclusions from the parametric model but not those from the existing nonparametric model.
Directory of Open Access Journals (Sweden)
Urbi Garay
2016-03-01
Full Text Available We define a dynamic and self-adjusting mixture of Gaussian Graphical Models to cluster financial returns, and provide a new method for extraction of nonparametric estimates of dynamic alphas (excess return and betas (to a choice set of explanatory factors in a multivariate setting. This approach, as well as the outputs, has a dynamic, nonstationary and nonparametric form, which circumvents the problem of model risk and parametric assumptions that the Kalman filter and other widely used approaches rely on. The by-product of clusters, used for shrinkage and information borrowing, can be of use to determine relationships around specific events. This approach exhibits a smaller Root Mean Squared Error than traditionally used benchmarks in financial settings, which we illustrate through simulation. As an illustration, we use hedge fund index data, and find that our estimated alphas are, on average, 0.13% per month higher (1.6% per year than alphas estimated through Ordinary Least Squares. The approach exhibits fast adaptation to abrupt changes in the parameters, as seen in our estimated alphas and betas, which exhibit high volatility, especially in periods which can be identified as times of stressful market events, a reflection of the dynamic positioning of hedge fund portfolio managers.
Congdon, Peter
2014-01-01
This book provides an accessible approach to Bayesian computing and data analysis, with an emphasis on the interpretation of real data sets. Following in the tradition of the successful first edition, this book aims to make a wide range of statistical modeling applications accessible using tested code that can be readily adapted to the reader's own applications. The second edition has been thoroughly reworked and updated to take account of advances in the field. A new set of worked examples is included. The novel aspect of the first edition was the coverage of statistical modeling using WinBU
Promotion time cure rate model with nonparametric form of covariate effects.
Chen, Tianlei; Du, Pang
2018-05-10
Survival data with a cured portion are commonly seen in clinical trials. Motivated from a biological interpretation of cancer metastasis, promotion time cure model is a popular alternative to the mixture cure rate model for analyzing such data. The existing promotion cure models all assume a restrictive parametric form of covariate effects, which can be incorrectly specified especially at the exploratory stage. In this paper, we propose a nonparametric approach to modeling the covariate effects under the framework of promotion time cure model. The covariate effect function is estimated by smoothing splines via the optimization of a penalized profile likelihood. Point-wise interval estimates are also derived from the Bayesian interpretation of the penalized profile likelihood. Asymptotic convergence rates are established for the proposed estimates. Simulations show excellent performance of the proposed nonparametric method, which is then applied to a melanoma study. Copyright © 2018 John Wiley & Sons, Ltd.
Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne
2012-12-01
In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models.
Nonparametric modeling of dynamic functional connectivity in fmri data
DEFF Research Database (Denmark)
Nielsen, Søren Føns Vind; Madsen, Kristoffer H.; Røge, Rasmus
2015-01-01
dynamic changes. The existing approaches modeling dynamic connectivity have primarily been based on time-windowing the data and k-means clustering. We propose a nonparametric generative model for dynamic FC in fMRI that does not rely on specifying window lengths and number of dynamic states. Rooted...
The nonparametric bootstrap for the current status model
Groeneboom, P.; Hendrickx, K.
2017-01-01
It has been proved that direct bootstrapping of the nonparametric maximum likelihood estimator (MLE) of the distribution function in the current status model leads to inconsistent confidence intervals. We show that bootstrapping of functionals of the MLE can however be used to produce valid
Fast Bayesian Inference in Dirichlet Process Mixture Models.
Wang, Lianming; Dunson, David B
2011-01-01
There has been increasing interest in applying Bayesian nonparametric methods in large samples and high dimensions. As Markov chain Monte Carlo (MCMC) algorithms are often infeasible, there is a pressing need for much faster algorithms. This article proposes a fast approach for inference in Dirichlet process mixture (DPM) models. Viewing the partitioning of subjects into clusters as a model selection problem, we propose a sequential greedy search algorithm for selecting the partition. Then, when conjugate priors are chosen, the resulting posterior conditionally on the selected partition is available in closed form. This approach allows testing of parametric models versus nonparametric alternatives based on Bayes factors. We evaluate the approach using simulation studies and compare it with four other fast nonparametric methods in the literature. We apply the proposed approach to three datasets including one from a large epidemiologic study. Matlab codes for the simulation and data analyses using the proposed approach are available online in the supplemental materials.
Shahbaba, Babak; Johnson, Wesley O
2013-05-30
High-throughput scientific studies involving no clear a priori hypothesis are common. For example, a large-scale genomic study of a disease may examine thousands of genes without hypothesizing that any specific gene is responsible for the disease. In these studies, the objective is to explore a large number of possible factors (e.g., genes) in order to identify a small number that will be considered in follow-up studies that tend to be more thorough and on smaller scales. A simple, hierarchical, linear regression model with random coefficients is assumed for case-control data that correspond to each gene. The specific model used will be seen to be related to a standard Bayesian variable selection model. Relatively large regression coefficients correspond to potential differences in responses for cases versus controls and thus to genes that might 'matter'. For large-scale studies, and using a Dirichlet process mixture model for the regression coefficients, we are able to find clusters of regression effects of genes with increasing potential effect or 'relevance', in relation to the outcome of interest. One cluster will always correspond to genes whose coefficients are in a neighborhood that is relatively close to zero and will be deemed least relevant. Other clusters will correspond to increasing magnitudes of the random/latent regression coefficients. Using simulated data, we demonstrate that our approach could be quite effective in finding relevant genes compared with several alternative methods. We apply our model to two large-scale studies. The first study involves transcriptome analysis of infection by human cytomegalovirus. The second study's objective is to identify differentially expressed genes between two types of leukemia. Copyright © 2012 John Wiley & Sons, Ltd.
Non-Parametric Model Drift Detection
2016-07-01
framework on two tasks in NLP domain, topic modeling, and machine translation. Our main findings are summarized as follows: • We can measure important...thank,us,me,hope,today Group num: 4, TC(X;Y_j): 0.407 4:republic,palestinian,israel, arab ,israeli,democratic,congo,mr,president,occupied Group num: 5...support,change,lessons,partnerships,l earned Group num: 35, TC(X;Y_j): 0.094 35:russian,federation,spoke,you,french,spanish, arabic ,your,chinese,sir
Bayesian methods for data analysis
Carlin, Bradley P.
2009-01-01
Approaches for statistical inference Introduction Motivating Vignettes Defining the Approaches The Bayes-Frequentist Controversy Some Basic Bayesian Models The Bayes approach Introduction Prior Distributions Bayesian Inference Hierarchical Modeling Model Assessment Nonparametric Methods Bayesian computation Introduction Asymptotic Methods Noniterative Monte Carlo Methods Markov Chain Monte Carlo Methods Model criticism and selection Bayesian Modeling Bayesian Robustness Model Assessment Bayes Factors via Marginal Density Estimation Bayes Factors
Wei, Jiawei; Carroll, Raymond J.; Maity, Arnab
2011-01-01
We consider the problem of testing for a constant nonparametric effect in a general semi-parametric regression model when there is the potential for interaction between the parametrically and nonparametrically modeled variables. The work
Bayesian analysis of CCDM models
Jesus, J. F.; Valentim, R.; Andrade-Oliveira, F.
2017-09-01
Creation of Cold Dark Matter (CCDM), in the context of Einstein Field Equations, produces a negative pressure term which can be used to explain the accelerated expansion of the Universe. In this work we tested six different spatially flat models for matter creation using statistical criteria, in light of SNe Ia data: Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC) and Bayesian Evidence (BE). These criteria allow to compare models considering goodness of fit and number of free parameters, penalizing excess of complexity. We find that JO model is slightly favoured over LJO/ΛCDM model, however, neither of these, nor Γ = 3αH0 model can be discarded from the current analysis. Three other scenarios are discarded either because poor fitting or because of the excess of free parameters. A method of increasing Bayesian evidence through reparameterization in order to reducing parameter degeneracy is also developed.
Bayesian analysis of CCDM models
Energy Technology Data Exchange (ETDEWEB)
Jesus, J.F. [Universidade Estadual Paulista (Unesp), Câmpus Experimental de Itapeva, Rua Geraldo Alckmin 519, Vila N. Sra. de Fátima, Itapeva, SP, 18409-010 Brazil (Brazil); Valentim, R. [Departamento de Física, Instituto de Ciências Ambientais, Químicas e Farmacêuticas—ICAQF, Universidade Federal de São Paulo (UNIFESP), Unidade José Alencar, Rua São Nicolau No. 210, Diadema, SP, 09913-030 Brazil (Brazil); Andrade-Oliveira, F., E-mail: jfjesus@itapeva.unesp.br, E-mail: valentim.rodolfo@unifesp.br, E-mail: felipe.oliveira@port.ac.uk [Institute of Cosmology and Gravitation—University of Portsmouth, Burnaby Road, Portsmouth, PO1 3FX United Kingdom (United Kingdom)
2017-09-01
Creation of Cold Dark Matter (CCDM), in the context of Einstein Field Equations, produces a negative pressure term which can be used to explain the accelerated expansion of the Universe. In this work we tested six different spatially flat models for matter creation using statistical criteria, in light of SNe Ia data: Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC) and Bayesian Evidence (BE). These criteria allow to compare models considering goodness of fit and number of free parameters, penalizing excess of complexity. We find that JO model is slightly favoured over LJO/ΛCDM model, however, neither of these, nor Γ = 3α H {sub 0} model can be discarded from the current analysis. Three other scenarios are discarded either because poor fitting or because of the excess of free parameters. A method of increasing Bayesian evidence through reparameterization in order to reducing parameter degeneracy is also developed.
Estimation of Stochastic Volatility Models by Nonparametric Filtering
DEFF Research Database (Denmark)
Kanaya, Shin; Kristensen, Dennis
2016-01-01
/estimated volatility process replacing the latent process. Our estimation strategy is applicable to both parametric and nonparametric stochastic volatility models, and can handle both jumps and market microstructure noise. The resulting estimators of the stochastic volatility model will carry additional biases...... and variances due to the first-step estimation, but under regularity conditions we show that these vanish asymptotically and our estimators inherit the asymptotic properties of the infeasible estimators based on observations of the volatility process. A simulation study examines the finite-sample properties...
Bayesian modeling using WinBUGS
Ntzoufras, Ioannis
2009-01-01
A hands-on introduction to the principles of Bayesian modeling using WinBUGS Bayesian Modeling Using WinBUGS provides an easily accessible introduction to the use of WinBUGS programming techniques in a variety of Bayesian modeling settings. The author provides an accessible treatment of the topic, offering readers a smooth introduction to the principles of Bayesian modeling with detailed guidance on the practical implementation of key principles. The book begins with a basic introduction to Bayesian inference and the WinBUGS software and goes on to cover key topics, including: Markov Chain Monte Carlo algorithms in Bayesian inference Generalized linear models Bayesian hierarchical models Predictive distribution and model checking Bayesian model and variable evaluation Computational notes and screen captures illustrate the use of both WinBUGS as well as R software to apply the discussed techniques. Exercises at the end of each chapter allow readers to test their understanding of the presented concepts and all ...
Guhaniyogi, Rajarshi
2017-11-10
With increasingly abundant spatial data in the form of case counts or rates combined over areal regions (eg, ZIP codes, census tracts, or counties), interest turns to formal identification of difference "boundaries," or barriers on the map, in addition to the estimated statistical map itself. "Boundary" refers to a border that describes vastly disparate outcomes in the adjacent areal units, perhaps caused by latent risk factors. This article focuses on developing a model-based statistical tool, equipped to identify difference boundaries in maps with a small number of areal units, also referred to as small-scale maps. This article proposes a novel and robust nonparametric boundary detection rule based on nonparametric Dirichlet processes, later referred to as Dirichlet process wombling (DPW) rule, by employing Dirichlet process-based mixture models for small-scale maps. Unlike the recently proposed nonparametric boundary detection rules based on false discovery rates, the DPW rule is free of ad hoc parameters, computationally simple, and readily implementable in freely available software for public health practitioners such as JAGS and OpenBUGS and yet provides statistically interpretable boundary detection in small-scale wombling. We offer a detailed simulation study and an application of our proposed approach to a urinary bladder cancer incidence rates dataset between 1990 and 2012 in the 8 counties in Connecticut. Copyright © 2017 John Wiley & Sons, Ltd.
Bayesian Plackett-Luce Mixture Models for Partially Ranked Data.
Mollica, Cristina; Tardella, Luca
2017-06-01
The elicitation of an ordinal judgment on multiple alternatives is often required in many psychological and behavioral experiments to investigate preference/choice orientation of a specific population. The Plackett-Luce model is one of the most popular and frequently applied parametric distributions to analyze rankings of a finite set of items. The present work introduces a Bayesian finite mixture of Plackett-Luce models to account for unobserved sample heterogeneity of partially ranked data. We describe an efficient way to incorporate the latent group structure in the data augmentation approach and the derivation of existing maximum likelihood procedures as special instances of the proposed Bayesian method. Inference can be conducted with the combination of the Expectation-Maximization algorithm for maximum a posteriori estimation and the Gibbs sampling iterative procedure. We additionally investigate several Bayesian criteria for selecting the optimal mixture configuration and describe diagnostic tools for assessing the fitness of ranking distributions conditionally and unconditionally on the number of ranked items. The utility of the novel Bayesian parametric Plackett-Luce mixture for characterizing sample heterogeneity is illustrated with several applications to simulated and real preference ranked data. We compare our method with the frequentist approach and a Bayesian nonparametric mixture model both assuming the Plackett-Luce model as a mixture component. Our analysis on real datasets reveals the importance of an accurate diagnostic check for an appropriate in-depth understanding of the heterogenous nature of the partial ranking data.
A Bayesian Nonparametric Causal Model for Regression Discontinuity Designs
Karabatsos, George; Walker, Stephen G.
2013-01-01
The regression discontinuity (RD) design (Thistlewaite & Campbell, 1960; Cook, 2008) provides a framework to identify and estimate causal effects from a non-randomized design. Each subject of a RD design is assigned to the treatment (versus assignment to a non-treatment) whenever her/his observed value of the assignment variable equals or…
Bayesian non parametric modelling of Higgs pair production
Directory of Open Access Journals (Sweden)
Scarpa Bruno
2017-01-01
Full Text Available Statistical classification models are commonly used to separate a signal from a background. In this talk we face the problem of isolating the signal of Higgs pair production using the decay channel in which each boson decays into a pair of b-quarks. Typically in this context non parametric methods are used, such as Random Forests or different types of boosting tools. We remain in the same non-parametric framework, but we propose to face the problem following a Bayesian approach. A Dirichlet process is used as prior for the random effects in a logit model which is fitted by leveraging the Polya-Gamma data augmentation. Refinements of the model include the insertion in the simple model of P-splines to relate explanatory variables with the response and the use of Bayesian trees (BART to describe the atoms in the Dirichlet process.
A ¤nonparametric dynamic additive regression model for longitudinal data
DEFF Research Database (Denmark)
Martinussen, T.; Scheike, T. H.
2000-01-01
dynamic linear models, estimating equations, least squares, longitudinal data, nonparametric methods, partly conditional mean models, time-varying-coefficient models......dynamic linear models, estimating equations, least squares, longitudinal data, nonparametric methods, partly conditional mean models, time-varying-coefficient models...
Generative Temporal Modelling of Neuroimaging - Decomposition and Nonparametric Testing
DEFF Research Database (Denmark)
Hald, Ditte Høvenhoff
The goal of this thesis is to explore two improvements for functional magnetic resonance imaging (fMRI) analysis; namely our proposed decomposition method and an extension to the non-parametric testing framework. Analysis of fMRI allows researchers to investigate the functional processes...... of the brain, and provides insight into neuronal coupling during mental processes or tasks. The decomposition method is a Gaussian process-based independent components analysis (GPICA), which incorporates a temporal dependency in the sources. A hierarchical model specification is used, featuring both...... instantaneous and convolutive mixing, and the inferred temporal patterns. Spatial maps are seen to capture smooth and localized stimuli-related components, and often identifiable noise components. The implementation is freely available as a GUI/SPM plugin, and we recommend using GPICA as an additional tool when...
Nonparametric Estimation of Distributions in Random Effects Models
Hart, Jeffrey D.
2011-01-01
We propose using minimum distance to obtain nonparametric estimates of the distributions of components in random effects models. A main setting considered is equivalent to having a large number of small datasets whose locations, and perhaps scales, vary randomly, but which otherwise have a common distribution. Interest focuses on estimating the distribution that is common to all datasets, knowledge of which is crucial in multiple testing problems where a location/scale invariant test is applied to every small dataset. A detailed algorithm for computing minimum distance estimates is proposed, and the usefulness of our methodology is illustrated by a simulation study and an analysis of microarray data. Supplemental materials for the article, including R-code and a dataset, are available online. © 2011 American Statistical Association.
Simple nonparametric checks for model data fit in CAT
Meijer, R.R.
2005-01-01
In this paper, the usefulness of several nonparametric checks is discussed in a computerized adaptive testing (CAT) context. Although there is no tradition of nonparametric scalability in CAT, it can be argued that scalability checks can be useful to investigate, for example, the quality of item
Borsboom, D.; Haig, B.D.
2013-01-01
Unlike most other statistical frameworks, Bayesian statistical inference is wedded to a particular approach in the philosophy of science (see Howson & Urbach, 2006); this approach is called Bayesianism. Rather than being concerned with model fitting, this position in the philosophy of science
A Bayesian Markov geostatistical model for estimation of hydrogeological properties
International Nuclear Information System (INIS)
Rosen, L.; Gustafson, G.
1996-01-01
A geostatistical methodology based on Markov-chain analysis and Bayesian statistics was developed for probability estimations of hydrogeological and geological properties in the siting process of a nuclear waste repository. The probability estimates have practical use in decision-making on issues such as siting, investigation programs, and construction design. The methodology is nonparametric which makes it possible to handle information that does not exhibit standard statistical distributions, as is often the case for classified information. Data do not need to meet the requirements on additivity and normality as with the geostatistical methods based on regionalized variable theory, e.g., kriging. The methodology also has a formal way for incorporating professional judgments through the use of Bayesian statistics, which allows for updating of prior estimates to posterior probabilities each time new information becomes available. A Bayesian Markov Geostatistical Model (BayMar) software was developed for implementation of the methodology in two and three dimensions. This paper gives (1) a theoretical description of the Bayesian Markov Geostatistical Model; (2) a short description of the BayMar software; and (3) an example of application of the model for estimating the suitability for repository establishment with respect to the three parameters of lithology, hydraulic conductivity, and rock quality designation index (RQD) at 400--500 meters below ground surface in an area around the Aespoe Hard Rock Laboratory in southeastern Sweden
A nonparametric Bayesian approach for clustering bisulfate-based DNA methylation profiles.
Zhang, Lin; Meng, Jia; Liu, Hui; Huang, Yufei
2012-01-01
DNA methylation occurs in the context of a CpG dinucleotide. It is an important epigenetic modification, which can be inherited through cell division. The two major types of methylation include hypomethylation and hypermethylation. Unique methylation patterns have been shown to exist in diseases including various types of cancer. DNA methylation analysis promises to become a powerful tool in cancer diagnosis, treatment and prognostication. Large-scale methylation arrays are now available for studying methylation genome-wide. The Illumina methylation platform simultaneously measures cytosine methylation at more than 1500 CpG sites associated with over 800 cancer-related genes. Cluster analysis is often used to identify DNA methylation subgroups for prognosis and diagnosis. However, due to the unique non-Gaussian characteristics, traditional clustering methods may not be appropriate for DNA and methylation data, and the determination of optimal cluster number is still problematic. A Dirichlet process beta mixture model (DPBMM) is proposed that models the DNA methylation expressions as an infinite number of beta mixture distribution. The model allows automatic learning of the relevant parameters such as the cluster mixing proportion, the parameters of beta distribution for each cluster, and especially the number of potential clusters. Since the model is high dimensional and analytically intractable, we proposed a Gibbs sampling "no-gaps" solution for computing the posterior distributions, hence the estimates of the parameters. The proposed algorithm was tested on simulated data as well as methylation data from 55 Glioblastoma multiform (GBM) brain tissue samples. To reduce the computational burden due to the high data dimensionality, a dimension reduction method is adopted. The two GBM clusters yielded by DPBMM are based on data of different number of loci (P-value < 0.1), while hierarchical clustering cannot yield statistically significant clusters.
A local non-parametric model for trade sign inference
Blazejewski, Adam; Coggins, Richard
2005-03-01
We investigate a regularity in market order submission strategies for 12 stocks with large market capitalization on the Australian Stock Exchange. The regularity is evidenced by a predictable relationship between the trade sign (trade initiator), size of the trade, and the contents of the limit order book before the trade. We demonstrate this predictability by developing an empirical inference model to classify trades into buyer-initiated and seller-initiated. The model employs a local non-parametric method, k-nearest neighbor, which in the past was used successfully for chaotic time series prediction. The k-nearest neighbor with three predictor variables achieves an average out-of-sample classification accuracy of 71.40%, compared to 63.32% for the linear logistic regression with seven predictor variables. The result suggests that a non-linear approach may produce a more parsimonious trade sign inference model with a higher out-of-sample classification accuracy. Furthermore, for most of our stocks the observed regularity in market order submissions seems to have a memory of at least 30 trading days.
Economic decision making and the application of nonparametric prediction models
Attanasi, E.D.; Coburn, T.C.; Freeman, P.A.
2008-01-01
Sustained increases in energy prices have focused attention on gas resources in low-permeability shale or in coals that were previously considered economically marginal. Daily well deliverability is often relatively small, although the estimates of the total volumes of recoverable resources in these settings are often large. Planning and development decisions for extraction of such resources must be areawide because profitable extraction requires optimization of scale economies to minimize costs and reduce risk. For an individual firm, the decision to enter such plays depends on reconnaissance-level estimates of regional recoverable resources and on cost estimates to develop untested areas. This paper shows how simple nonparametric local regression models, used to predict technically recoverable resources at untested sites, can be combined with economic models to compute regional-scale cost functions. The context of the worked example is the Devonian Antrim-shale gas play in the Michigan basin. One finding relates to selection of the resource prediction model to be used with economic models. Models chosen because they can best predict aggregate volume over larger areas (many hundreds of sites) smooth out granularity in the distribution of predicted volumes at individual sites. This loss of detail affects the representation of economic cost functions and may affect economic decisions. Second, because some analysts consider unconventional resources to be ubiquitous, the selection and order of specific drilling sites may, in practice, be determined arbitrarily by extraneous factors. The analysis shows a 15-20% gain in gas volume when these simple models are applied to order drilling prospects strategically rather than to choose drilling locations randomly. Copyright ?? 2008 Society of Petroleum Engineers.
Zhao, Zhibiao
2011-06-01
We address the nonparametric model validation problem for hidden Markov models with partially observable variables and hidden states. We achieve this goal by constructing a nonparametric simultaneous confidence envelope for transition density function of the observable variables and checking whether the parametric density estimate is contained within such an envelope. Our specification test procedure is motivated by a functional connection between the transition density of the observable variables and the Markov transition kernel of the hidden states. Our approach is applicable for continuous time diffusion models, stochastic volatility models, nonlinear time series models, and models with market microstructure noise.
Bayesian models: A statistical primer for ecologists
Hobbs, N. Thompson; Hooten, Mevin B.
2015-01-01
Bayesian modeling has become an indispensable tool for ecological research because it is uniquely suited to deal with complexity in a statistically coherent way. This textbook provides a comprehensive and accessible introduction to the latest Bayesian methods—in language ecologists can understand. Unlike other books on the subject, this one emphasizes the principles behind the computations, giving ecologists a big-picture understanding of how to implement this powerful statistical approach.Bayesian Models is an essential primer for non-statisticians. It begins with a definition of probability and develops a step-by-step sequence of connected ideas, including basic distribution theory, network diagrams, hierarchical models, Markov chain Monte Carlo, and inference from single and multiple models. This unique book places less emphasis on computer coding, favoring instead a concise presentation of the mathematical statistics needed to understand how and why Bayesian analysis works. It also explains how to write out properly formulated hierarchical Bayesian models and use them in computing, research papers, and proposals.This primer enables ecologists to understand the statistical principles behind Bayesian modeling and apply them to research, teaching, policy, and management.Presents the mathematical and statistical foundations of Bayesian modeling in language accessible to non-statisticiansCovers basic distribution theory, network diagrams, hierarchical models, Markov chain Monte Carlo, and moreDeemphasizes computer coding in favor of basic principlesExplains how to write out properly factored statistical expressions representing Bayesian models
A Bayesian model for binary Markov chains
Directory of Open Access Journals (Sweden)
Belkheir Essebbar
2004-02-01
Full Text Available This note is concerned with Bayesian estimation of the transition probabilities of a binary Markov chain observed from heterogeneous individuals. The model is founded on the Jeffreys' prior which allows for transition probabilities to be correlated. The Bayesian estimator is approximated by means of Monte Carlo Markov chain (MCMC techniques. The performance of the Bayesian estimates is illustrated by analyzing a small simulated data set.
Nonparametric Tree-Based Predictive Modeling of Storm Outages on an Electric Distribution Network.
He, Jichao; Wanik, David W; Hartman, Brian M; Anagnostou, Emmanouil N; Astitha, Marina; Frediani, Maria E B
2017-03-01
This article compares two nonparametric tree-based models, quantile regression forests (QRF) and Bayesian additive regression trees (BART), for predicting storm outages on an electric distribution network in Connecticut, USA. We evaluated point estimates and prediction intervals of outage predictions for both models using high-resolution weather, infrastructure, and land use data for 89 storm events (including hurricanes, blizzards, and thunderstorms). We found that spatially BART predicted more accurate point estimates than QRF. However, QRF produced better prediction intervals for high spatial resolutions (2-km grid cells and towns), while BART predictions aggregated to coarser resolutions (divisions and service territory) more effectively. We also found that the predictive accuracy was dependent on the season (e.g., tree-leaf condition, storm characteristics), and that the predictions were most accurate for winter storms. Given the merits of each individual model, we suggest that BART and QRF be implemented together to show the complete picture of a storm's potential impact on the electric distribution network, which would allow for a utility to make better decisions about allocating prestorm resources. © 2016 Society for Risk Analysis.
A Bayesian approach to model uncertainty
International Nuclear Information System (INIS)
Buslik, A.
1994-01-01
A Bayesian approach to model uncertainty is taken. For the case of a finite number of alternative models, the model uncertainty is equivalent to parameter uncertainty. A derivation based on Savage's partition problem is given
Parametric vs. Nonparametric Regression Modelling within Clinical Decision Support
Czech Academy of Sciences Publication Activity Database
Kalina, Jan; Zvárová, Jana
2017-01-01
Roč. 5, č. 1 (2017), s. 21-27 ISSN 1805-8698 R&D Projects: GA ČR GA17-01251S Institutional support: RVO:67985807 Keywords : decision support systems * decision rules * statistical analysis * nonparametric regression Subject RIV: IN - Informatics, Computer Science OBOR OECD: Statistics and probability
An adaptive distance measure for use with nonparametric models
International Nuclear Information System (INIS)
Garvey, D. R.; Hines, J. W.
2006-01-01
Distance measures perform a critical task in nonparametric, locally weighted regression. Locally weighted regression (LWR) models are a form of 'lazy learning' which construct a local model 'on the fly' by comparing a query vector to historical, exemplar vectors according to a three step process. First, the distance of the query vector to each of the exemplar vectors is calculated. Next, these distances are passed to a kernel function, which converts the distances to similarities or weights. Finally, the model output or response is calculated by performing locally weighted polynomial regression. To date, traditional distance measures, such as the Euclidean, weighted Euclidean, and L1-norm have been used as the first step in the prediction process. Since these measures do not take into consideration sensor failures and drift, they are inherently ill-suited for application to 'real world' systems. This paper describes one such LWR model, namely auto associative kernel regression (AAKR), and describes a new, Adaptive Euclidean distance measure that can be used to dynamically compensate for faulty sensor inputs. In this new distance measure, the query observations that lie outside of the training range (i.e. outside the minimum and maximum input exemplars) are dropped from the distance calculation. This allows for the distance calculation to be robust to sensor drifts and failures, in addition to providing a method for managing inputs that exceed the training range. In this paper, AAKR models using the standard and Adaptive Euclidean distance are developed and compared for the pressure system of an operating nuclear power plant. It is shown that using the standard Euclidean distance for data with failed inputs, significant errors in the AAKR predictions can result. By using the Adaptive Euclidean distance it is shown that high fidelity predictions are possible, in spite of the input failure. In fact, it is shown that with the Adaptive Euclidean distance prediction
Tokuda, Tomoki; Yoshimoto, Junichiro; Shimizu, Yu; Okada, Go; Takamura, Masahiro; Okamoto, Yasumasa; Yamawaki, Shigeto; Doya, Kenji
2017-01-01
We propose a novel method for multiple clustering, which is useful for analysis of high-dimensional data containing heterogeneous types of features. Our method is based on nonparametric Bayesian mixture models in which features are automatically partitioned (into views) for each clustering solution. This feature partition works as feature selection for a particular clustering solution, which screens out irrelevant features. To make our method applicable to high-dimensional data, a co-clustering structure is newly introduced for each view. Further, the outstanding novelty of our method is that we simultaneously model different distribution families, such as Gaussian, Poisson, and multinomial distributions in each cluster block, which widens areas of application to real data. We apply the proposed method to synthetic and real data, and show that our method outperforms other multiple clustering methods both in recovering true cluster structures and in computation time. Finally, we apply our method to a depression dataset with no true cluster structure available, from which useful inferences are drawn about possible clustering structures of the data.
Directory of Open Access Journals (Sweden)
Tomoki Tokuda
Full Text Available We propose a novel method for multiple clustering, which is useful for analysis of high-dimensional data containing heterogeneous types of features. Our method is based on nonparametric Bayesian mixture models in which features are automatically partitioned (into views for each clustering solution. This feature partition works as feature selection for a particular clustering solution, which screens out irrelevant features. To make our method applicable to high-dimensional data, a co-clustering structure is newly introduced for each view. Further, the outstanding novelty of our method is that we simultaneously model different distribution families, such as Gaussian, Poisson, and multinomial distributions in each cluster block, which widens areas of application to real data. We apply the proposed method to synthetic and real data, and show that our method outperforms other multiple clustering methods both in recovering true cluster structures and in computation time. Finally, we apply our method to a depression dataset with no true cluster structure available, from which useful inferences are drawn about possible clustering structures of the data.
Yoshimoto, Junichiro; Shimizu, Yu; Okada, Go; Takamura, Masahiro; Okamoto, Yasumasa; Yamawaki, Shigeto; Doya, Kenji
2017-01-01
We propose a novel method for multiple clustering, which is useful for analysis of high-dimensional data containing heterogeneous types of features. Our method is based on nonparametric Bayesian mixture models in which features are automatically partitioned (into views) for each clustering solution. This feature partition works as feature selection for a particular clustering solution, which screens out irrelevant features. To make our method applicable to high-dimensional data, a co-clustering structure is newly introduced for each view. Further, the outstanding novelty of our method is that we simultaneously model different distribution families, such as Gaussian, Poisson, and multinomial distributions in each cluster block, which widens areas of application to real data. We apply the proposed method to synthetic and real data, and show that our method outperforms other multiple clustering methods both in recovering true cluster structures and in computation time. Finally, we apply our method to a depression dataset with no true cluster structure available, from which useful inferences are drawn about possible clustering structures of the data. PMID:29049392
Nonparametric Identification and Estimation of Finite Mixture Models of Dynamic Discrete Choices
Hiroyuki Kasahara; Katsumi Shimotsu
2006-01-01
In dynamic discrete choice analysis, controlling for unobserved heterogeneity is an important issue, and finite mixture models provide flexible ways to account for unobserved heterogeneity. This paper studies nonparametric identifiability of type probabilities and type-specific component distributions in finite mixture models of dynamic discrete choices. We derive sufficient conditions for nonparametric identification for various finite mixture models of dynamic discrete choices used in appli...
Bayesian models a statistical primer for ecologists
Hobbs, N Thompson
2015-01-01
Bayesian modeling has become an indispensable tool for ecological research because it is uniquely suited to deal with complexity in a statistically coherent way. This textbook provides a comprehensive and accessible introduction to the latest Bayesian methods-in language ecologists can understand. Unlike other books on the subject, this one emphasizes the principles behind the computations, giving ecologists a big-picture understanding of how to implement this powerful statistical approach. Bayesian Models is an essential primer for non-statisticians. It begins with a definition of probabili
Bayesian Modeling of a Human MMORPG Player
Synnaeve, Gabriel; Bessière, Pierre
2011-03-01
This paper describes an application of Bayesian programming to the control of an autonomous avatar in a multiplayer role-playing game (the example is based on World of Warcraft). We model a particular task, which consists of choosing what to do and to select which target in a situation where allies and foes are present. We explain the model in Bayesian programming and show how we could learn the conditional probabilities from data gathered during human-played sessions.
Quantum-Like Bayesian Networks for Modeling Decision Making
Directory of Open Access Journals (Sweden)
Catarina eMoreira
2016-01-01
Full Text Available In this work, we explore an alternative quantum structure to perform quantum probabilistic inferences to accommodate the paradoxical findings of the Sure Thing Principle. We propose a Quantum-Like Bayesian Network, which consists in replacing classical probabilities by quantum probability amplitudes. However, since this approach suffers from the problem of exponential growth of quantum parameters, we also propose a similarity heuristic that automatically fits quantum parameters through vector similarities. This makes the proposed model general and predictive in contrast to the current state of the art models, which cannot be generalized for more complex decision scenarios and that only provide an explanatory nature for the observed paradoxes. In the end, the model that we propose consists in a nonparametric method for estimating inference effects from a statistical point of view. It is a statistical model that is simpler than the previous quantum dynamic and quantum-like models proposed in the literature. We tested the proposed network with several empirical data from the literature, mainly from the Prisoner's Dilemma game and the Two Stage Gambling game. The results obtained show that the proposed quantum Bayesian Network is a general method that can accommodate violations of the laws of classical probability theory and make accurate predictions regarding human decision-making in these scenarios.
Bayesian semiparametric regression models to characterize molecular evolution
Directory of Open Access Journals (Sweden)
Datta Saheli
2012-10-01
Full Text Available Abstract Background Statistical models and methods that associate changes in the physicochemical properties of amino acids with natural selection at the molecular level typically do not take into account the correlations between such properties. We propose a Bayesian hierarchical regression model with a generalization of the Dirichlet process prior on the distribution of the regression coefficients that describes the relationship between the changes in amino acid distances and natural selection in protein-coding DNA sequence alignments. Results The Bayesian semiparametric approach is illustrated with simulated data and the abalone lysin sperm data. Our method identifies groups of properties which, for this particular dataset, have a similar effect on evolution. The model also provides nonparametric site-specific estimates for the strength of conservation of these properties. Conclusions The model described here is distinguished by its ability to handle a large number of amino acid properties simultaneously, while taking into account that such data can be correlated. The multi-level clustering ability of the model allows for appealing interpretations of the results in terms of properties that are roughly equivalent from the standpoint of molecular evolution.
Directory of Open Access Journals (Sweden)
Rabia Ece OMAY
2013-06-01
Full Text Available In this study, relationship between gross domestic product (GDP per capita and sulfur dioxide (SO2 and particulate matter (PM10 per capita is modeled for Turkey. Nonparametric fixed effect panel data analysis is used for the modeling. The panel data covers 12 territories, in first level of Nomenclature of Territorial Units for Statistics (NUTS, for period of 1990-2001. Modeling of the relationship between GDP and SO2 and PM10 for Turkey, the non-parametric models have given good results.
Bioprocess iterative batch-to-batch optimization based on hybrid parametric/nonparametric models.
Teixeira, Ana P; Clemente, João J; Cunha, António E; Carrondo, Manuel J T; Oliveira, Rui
2006-01-01
This paper presents a novel method for iterative batch-to-batch dynamic optimization of bioprocesses. The relationship between process performance and control inputs is established by means of hybrid grey-box models combining parametric and nonparametric structures. The bioreactor dynamics are defined by material balance equations, whereas the cell population subsystem is represented by an adjustable mixture of nonparametric and parametric models. Thus optimizations are possible without detailed mechanistic knowledge concerning the biological system. A clustering technique is used to supervise the reliability of the nonparametric subsystem during the optimization. Whenever the nonparametric outputs are unreliable, the objective function is penalized. The technique was evaluated with three simulation case studies. The overall results suggest that the convergence to the optimal process performance may be achieved after a small number of batches. The model unreliability risk constraint along with sampling scheduling are crucial to minimize the experimental effort required to attain a given process performance. In general terms, it may be concluded that the proposed method broadens the application of the hybrid parametric/nonparametric modeling technique to "newer" processes with higher potential for optimization.
Robust bayesian analysis of an autoregressive model with ...
African Journals Online (AJOL)
In this work, robust Bayesian analysis of the Bayesian estimation of an autoregressive model with exponential innovations is performed. Using a Bayesian robustness methodology, we show that, using a suitable generalized quadratic loss, we obtain optimal Bayesian estimators of the parameters corresponding to the ...
Nonparametric Integrated Agrometeorological Drought Monitoring: Model Development and Application
Zhang, Qiang; Li, Qin; Singh, Vijay P.; Shi, Peijun; Huang, Qingzhong; Sun, Peng
2018-01-01
Drought is a major natural hazard that has massive impacts on the society. How to monitor drought is critical for its mitigation and early warning. This study proposed a modified version of the multivariate standardized drought index (MSDI) based on precipitation, evapotranspiration, and soil moisture, i.e., modified multivariate standardized drought index (MMSDI). This study also used nonparametric joint probability distribution analysis. Comparisons were done between standardized precipitation evapotranspiration index (SPEI), standardized soil moisture index (SSMI), MSDI, and MMSDI, and real-world observed drought regimes. Results indicated that MMSDI detected droughts that SPEI and/or SSMI failed to do. Also, MMSDI detected almost all droughts that were identified by SPEI and SSMI. Further, droughts detected by MMSDI were similar to real-world observed droughts in terms of drought intensity and drought-affected area. When compared to MMSDI, MSDI has the potential to overestimate drought intensity and drought-affected area across China, which should be attributed to exclusion of the evapotranspiration components from estimation of drought intensity. Therefore, MMSDI is proposed for drought monitoring that can detect agrometeorological droughts. Results of this study provide a framework for integrated drought monitoring in other regions of the world and can help to develop drought mitigation.
Empirical Bayesian inference and model uncertainty
International Nuclear Information System (INIS)
Poern, K.
1994-01-01
This paper presents a hierarchical or multistage empirical Bayesian approach for the estimation of uncertainty concerning the intensity of a homogeneous Poisson process. A class of contaminated gamma distributions is considered to describe the uncertainty concerning the intensity. These distributions in turn are defined through a set of secondary parameters, the knowledge of which is also described and updated via Bayes formula. This two-stage Bayesian approach is an example where the modeling uncertainty is treated in a comprehensive way. Each contaminated gamma distributions, represented by a point in the 3D space of secondary parameters, can be considered as a specific model of the uncertainty about the Poisson intensity. Then, by the empirical Bayesian method each individual model is assigned a posterior probability
Feng, Jinchao; Lansford, Joshua; Mironenko, Alexander; Pourkargar, Davood Babaei; Vlachos, Dionisios G.; Katsoulakis, Markos A.
2018-03-01
We propose non-parametric methods for both local and global sensitivity analysis of chemical reaction models with correlated parameter dependencies. The developed mathematical and statistical tools are applied to a benchmark Langmuir competitive adsorption model on a close packed platinum surface, whose parameters, estimated from quantum-scale computations, are correlated and are limited in size (small data). The proposed mathematical methodology employs gradient-based methods to compute sensitivity indices. We observe that ranking influential parameters depends critically on whether or not correlations between parameters are taken into account. The impact of uncertainty in the correlation and the necessity of the proposed non-parametric perspective are demonstrated.
Directory of Open Access Journals (Sweden)
Jinchao Feng
2018-03-01
Full Text Available We propose non-parametric methods for both local and global sensitivity analysis of chemical reaction models with correlated parameter dependencies. The developed mathematical and statistical tools are applied to a benchmark Langmuir competitive adsorption model on a close packed platinum surface, whose parameters, estimated from quantum-scale computations, are correlated and are limited in size (small data. The proposed mathematical methodology employs gradient-based methods to compute sensitivity indices. We observe that ranking influential parameters depends critically on whether or not correlations between parameters are taken into account. The impact of uncertainty in the correlation and the necessity of the proposed non-parametric perspective are demonstrated.
Wei, Jiawei
2011-07-01
We consider the problem of testing for a constant nonparametric effect in a general semi-parametric regression model when there is the potential for interaction between the parametrically and nonparametrically modeled variables. The work was originally motivated by a unique testing problem in genetic epidemiology (Chatterjee, et al., 2006) that involved a typical generalized linear model but with an additional term reminiscent of the Tukey one-degree-of-freedom formulation, and their interest was in testing for main effects of the genetic variables, while gaining statistical power by allowing for a possible interaction between genes and the environment. Later work (Maity, et al., 2009) involved the possibility of modeling the environmental variable nonparametrically, but they focused on whether there was a parametric main effect for the genetic variables. In this paper, we consider the complementary problem, where the interest is in testing for the main effect of the nonparametrically modeled environmental variable. We derive a generalized likelihood ratio test for this hypothesis, show how to implement it, and provide evidence that our method can improve statistical power when compared to standard partially linear models with main effects only. We use the method for the primary purpose of analyzing data from a case-control study of colorectal adenoma.
Bayesian uncertainty analyses of probabilistic risk models
International Nuclear Information System (INIS)
Pulkkinen, U.
1989-01-01
Applications of Bayesian principles to the uncertainty analyses are discussed in the paper. A short review of the most important uncertainties and their causes is provided. An application of the principle of maximum entropy to the determination of Bayesian prior distributions is described. An approach based on so called probabilistic structures is presented in order to develop a method of quantitative evaluation of modelling uncertainties. The method is applied to a small example case. Ideas for application areas for the proposed method are discussed
Hierarchical Bayesian Models of Subtask Learning
Anglim, Jeromy; Wynton, Sarah K. A.
2015-01-01
The current study used Bayesian hierarchical methods to challenge and extend previous work on subtask learning consistency. A general model of individual-level subtask learning was proposed focusing on power and exponential functions with constraints to test for inconsistency. To study subtask learning, we developed a novel computer-based booking…
The Support Reduction Algorithm for Computing Non-Parametric Function Estimates in Mixture Models
GROENEBOOM, PIET; JONGBLOED, GEURT; WELLNER, JON A.
2008-01-01
In this paper, we study an algorithm (which we call the support reduction algorithm) that can be used to compute non-parametric M-estimators in mixture models. The algorithm is compared with natural competitors in the context of convex regression and the ‘Aspect problem’ in quantum physics.
A comparative study of non-parametric models for identification of ...
African Journals Online (AJOL)
However, the frequency response method using random binary signals was good for unpredicted white noise characteristics and considered the best method for non-parametric system identifica-tion. The autoregressive external input (ARX) model was very useful for system identification, but on applicati-on, few input ...
Bayesian Geostatistical Modeling of Malaria Indicator Survey Data in Angola
Gosoniu, Laura; Veta, Andre Mia; Vounatsou, Penelope
2010-01-01
The 2006–2007 Angola Malaria Indicator Survey (AMIS) is the first nationally representative household survey in the country assessing coverage of the key malaria control interventions and measuring malaria-related burden among children under 5 years of age. In this paper, the Angolan MIS data were analyzed to produce the first smooth map of parasitaemia prevalence based on contemporary nationwide empirical data in the country. Bayesian geostatistical models were fitted to assess the effect of interventions after adjusting for environmental, climatic and socio-economic factors. Non-linear relationships between parasitaemia risk and environmental predictors were modeled by categorizing the covariates and by employing two non-parametric approaches, the B-splines and the P-splines. The results of the model validation showed that the categorical model was able to better capture the relationship between parasitaemia prevalence and the environmental factors. Model fit and prediction were handled within a Bayesian framework using Markov chain Monte Carlo (MCMC) simulations. Combining estimates of parasitaemia prevalence with the number of children under we obtained estimates of the number of infected children in the country. The population-adjusted prevalence ranges from in Namibe province to in Malanje province. The odds of parasitaemia in children living in a household with at least ITNs per person was by 41% lower (CI: 14%, 60%) than in those with fewer ITNs. The estimates of the number of parasitaemic children produced in this paper are important for planning and implementing malaria control interventions and for monitoring the impact of prevention and control activities. PMID:20351775
Faraway, Julian J
2005-01-01
Linear models are central to the practice of statistics and form the foundation of a vast range of statistical methodologies. Julian J. Faraway''s critically acclaimed Linear Models with R examined regression and analysis of variance, demonstrated the different methods available, and showed in which situations each one applies. Following in those footsteps, Extending the Linear Model with R surveys the techniques that grow from the regression model, presenting three extensions to that framework: generalized linear models (GLMs), mixed effect models, and nonparametric regression models. The author''s treatment is thoroughly modern and covers topics that include GLM diagnostics, generalized linear mixed models, trees, and even the use of neural networks in statistics. To demonstrate the interplay of theory and practice, throughout the book the author weaves the use of the R software environment to analyze the data of real examples, providing all of the R commands necessary to reproduce the analyses. All of the ...
Semiparametric Bayesian analysis of accelerated failure time models with cluster structures.
Li, Zhaonan; Xu, Xinyi; Shen, Junshan
2017-11-10
In this paper, we develop a Bayesian semiparametric accelerated failure time model for survival data with cluster structures. Our model allows distributional heterogeneity across clusters and accommodates their relationships through a density ratio approach. Moreover, a nonparametric mixture of Dirichlet processes prior is placed on the baseline distribution to yield full distributional flexibility. We illustrate through simulations that our model can greatly improve estimation accuracy by effectively pooling information from multiple clusters, while taking into account the heterogeneity in their random error distributions. We also demonstrate the implementation of our method using analysis of Mayo Clinic Trial in Primary Biliary Cirrhosis. Copyright © 2017 John Wiley & Sons, Ltd.
Modelling dependable systems using hybrid Bayesian networks
International Nuclear Information System (INIS)
Neil, Martin; Tailor, Manesh; Marquez, David; Fenton, Norman; Hearty, Peter
2008-01-01
A hybrid Bayesian network (BN) is one that incorporates both discrete and continuous nodes. In our extensive applications of BNs for system dependability assessment, the models are invariably hybrid and the need for efficient and accurate computation is paramount. We apply a new iterative algorithm that efficiently combines dynamic discretisation with robust propagation algorithms on junction tree structures to perform inference in hybrid BNs. We illustrate its use in the field of dependability with two example of reliability estimation. Firstly we estimate the reliability of a simple single system and next we implement a hierarchical Bayesian model. In the hierarchical model we compute the reliability of two unknown subsystems from data collected on historically similar subsystems and then input the result into a reliability block model to compute system level reliability. We conclude that dynamic discretisation can be used as an alternative to analytical or Monte Carlo methods with high precision and can be applied to a wide range of dependability problems
Energy Technology Data Exchange (ETDEWEB)
Rohée, E. [CEA, LIST, Laboratoire Capteurs et Architectures Electroniques, F-91191 Gif-sur-Yvette (France); Coulon, R., E-mail: romain.coulon@cea.fr [CEA, LIST, Laboratoire Capteurs et Architectures Electroniques, F-91191 Gif-sur-Yvette (France); Carrel, F. [CEA, LIST, Laboratoire Capteurs et Architectures Electroniques, F-91191 Gif-sur-Yvette (France); Dautremer, T.; Barat, E.; Montagu, T. [CEA, LIST, Laboratoire de Modélisation et Simulation des Systèmes, F-91191 Gif-sur-Yvette (France); Normand, S. [CEA, DAM, Le Ponant, DPN/STXN, F-75015 Paris (France); Jammes, C. [CEA, DEN, Cadarache, DER/SPEx/LDCI, F-13108 Saint-Paul-lez-Durance (France)
2016-11-11
Radionuclide identification and quantification are a serious concern for many applications as for in situ monitoring at nuclear facilities, laboratory analysis, special nuclear materials detection, environmental monitoring, and waste measurements. High resolution gamma-ray spectrometry based on high purity germanium diode detectors is the best solution available for isotopic identification. Over the last decades, methods have been developed to improve gamma spectra analysis. However, some difficulties remain in the analysis when full energy peaks are folded together with high ratio between their amplitudes, and when the Compton background is much larger compared to the signal of a single peak. In this context, this study deals with the comparison between a conventional analysis based on “iterative peak fitting deconvolution” method and a “nonparametric Bayesian deconvolution” approach developed by the CEA LIST and implemented into the SINBAD code. The iterative peak fit deconvolution is used in this study as a reference method largely validated by industrial standards to unfold complex spectra from HPGe detectors. Complex cases of spectra are studied from IAEA benchmark protocol tests and with measured spectra. The SINBAD code shows promising deconvolution capabilities compared to the conventional method without any expert parameter fine tuning.
Bayesian disease mapping: hierarchical modeling in spatial epidemiology
National Research Council Canada - National Science Library
Lawson, Andrew
2013-01-01
.... Exploring these new developments, Bayesian Disease Mapping: Hierarchical Modeling in Spatial Epidemiology, Second Edition provides an up-to-date, cohesive account of the full range of Bayesian disease mapping methods and applications...
Constrained bayesian inference of project performance models
Sunmola, Funlade
2013-01-01
Project performance models play an important role in the management of project success. When used for monitoring projects, they can offer predictive ability such as indications of possible delivery problems. Approaches for monitoring project performance relies on available project information including restrictions imposed on the project, particularly the constraints of cost, quality, scope and time. We study in this paper a Bayesian inference methodology for project performance modelling in ...
Directory of Open Access Journals (Sweden)
Ibsen Chivatá Cárdenas
2008-05-01
Full Text Available This article presents a rainfall model constructed by applying non-parametric modelling and imprecise probabilities; these tools were used because there was not enough homogeneous information in the study area. The area’s hydro-logical information regarding rainfall was scarce and existing hydrological time series were not uniform. A distributed extended rainfall model was constructed from so-called probability boxes (p-boxes, multinomial probability distribu-tion and confidence intervals (a friendly algorithm was constructed for non-parametric modelling by combining the last two tools. This model confirmed the high level of uncertainty involved in local rainfall modelling. Uncertainty en-compassed the whole range (domain of probability values thereby showing the severe limitations on information, leading to the conclusion that a detailed estimation of probability would lead to significant error. Nevertheless, rele-vant information was extracted; it was estimated that maximum daily rainfall threshold (70 mm would be surpassed at least once every three years and the magnitude of uncertainty affecting hydrological parameter estimation. This paper’s conclusions may be of interest to non-parametric modellers and decisions-makers as such modelling and imprecise probability represents an alternative for hydrological variable assessment and maybe an obligatory proce-dure in the future. Its potential lies in treating scarce information and represents a robust modelling strategy for non-seasonal stochastic modelling conditions
Bayesian methodology for reliability model acceptance
International Nuclear Information System (INIS)
Zhang Ruoxue; Mahadevan, Sankaran
2003-01-01
This paper develops a methodology to assess the reliability computation model validity using the concept of Bayesian hypothesis testing, by comparing the model prediction and experimental observation, when there is only one computational model available to evaluate system behavior. Time-independent and time-dependent problems are investigated, with consideration of both cases: with and without statistical uncertainty in the model. The case of time-independent failure probability prediction with no statistical uncertainty is a straightforward application of Bayesian hypothesis testing. However, for the life prediction (time-dependent reliability) problem, a new methodology is developed in this paper to make the same Bayesian hypothesis testing concept applicable. With the existence of statistical uncertainty in the model, in addition to the application of a predictor estimator of the Bayes factor, the uncertainty in the Bayes factor is explicitly quantified through treating it as a random variable and calculating the probability that it exceeds a specified value. The developed method provides a rational criterion to decision-makers for the acceptance or rejection of the computational model
Nonparametric NAR-ARCH Modelling of Stock Prices by the Kernel Methodology
Directory of Open Access Journals (Sweden)
Mohamed Chikhi
2018-02-01
Full Text Available This paper analyses cyclical behaviour of Orange stock price listed in French stock exchange over 01/03/2000 to 02/02/2017 by testing the nonlinearities through a class of conditional heteroscedastic nonparametric models. The linearity and Gaussianity assumptions are rejected for Orange Stock returns and informational shocks have transitory effects on returns and volatility. The forecasting results show that Orange stock prices are short-term predictable and nonparametric NAR-ARCH model has better performance over parametric MA-APARCH model for short horizons. Plus, the estimates of this model are also better comparing to the predictions of the random walk model. This finding provides evidence for weak form of inefficiency in Paris stock market with limited rationality, thus it emerges arbitrage opportunities.
Bayesian network modelling of upper gastrointestinal bleeding
Aisha, Nazziwa; Shohaimi, Shamarina; Adam, Mohd Bakri
2013-09-01
Bayesian networks are graphical probabilistic models that represent causal and other relationships between domain variables. In the context of medical decision making, these models have been explored to help in medical diagnosis and prognosis. In this paper, we discuss the Bayesian network formalism in building medical support systems and we learn a tree augmented naive Bayes Network (TAN) from gastrointestinal bleeding data. The accuracy of the TAN in classifying the source of gastrointestinal bleeding into upper or lower source is obtained. The TAN achieves a high classification accuracy of 86% and an area under curve of 92%. A sensitivity analysis of the model shows relatively high levels of entropy reduction for color of the stool, history of gastrointestinal bleeding, consistency and the ratio of blood urea nitrogen to creatinine. The TAN facilitates the identification of the source of GIB and requires further validation.
Nonparametric volatility density estimation for discrete time models
Es, van Bert; Spreij, P.J.C.; Zanten, van J.H.
2005-01-01
We consider discrete time models for asset prices with a stationary volatility process. We aim at estimating the multivariate density of this process at a set of consecutive time instants. A Fourier-type deconvolution kernel density estimator based on the logarithm of the squared process is proposed
Bayesian Recurrent Neural Network for Language Modeling.
Chien, Jen-Tzung; Ku, Yuan-Chu
2016-02-01
A language model (LM) is calculated as the probability of a word sequence that provides the solution to word prediction for a variety of information systems. A recurrent neural network (RNN) is powerful to learn the large-span dynamics of a word sequence in the continuous space. However, the training of the RNN-LM is an ill-posed problem because of too many parameters from a large dictionary size and a high-dimensional hidden layer. This paper presents a Bayesian approach to regularize the RNN-LM and apply it for continuous speech recognition. We aim to penalize the too complicated RNN-LM by compensating for the uncertainty of the estimated model parameters, which is represented by a Gaussian prior. The objective function in a Bayesian classification network is formed as the regularized cross-entropy error function. The regularized model is constructed not only by calculating the regularized parameters according to the maximum a posteriori criterion but also by estimating the Gaussian hyperparameter by maximizing the marginal likelihood. A rapid approximation to a Hessian matrix is developed to implement the Bayesian RNN-LM (BRNN-LM) by selecting a small set of salient outer-products. The proposed BRNN-LM achieves a sparser model than the RNN-LM. Experiments on different corpora show the robustness of system performance by applying the rapid BRNN-LM under different conditions.
Nonparametric Estimation of Regression Parameters in Measurement Error Models
Czech Academy of Sciences Publication Activity Database
Ehsanes Saleh, A.K.M.D.; Picek, J.; Kalina, Jan
2009-01-01
Roč. 67, č. 2 (2009), s. 177-200 ISSN 0026-1424 Grant - others:GA AV ČR(CZ) IAA101120801; GA MŠk(CZ) LC06024 Institutional research plan: CEZ:AV0Z10300504 Keywords : asymptotic relative efficiency(ARE) * asymptotic theory * emaculate mode * Me model * R-estimation * Reliabilty ratio(RR) Subject RIV: BB - Applied Statistics, Operational Research
On concurvity in nonlinear and nonparametric regression models
Directory of Open Access Journals (Sweden)
Sonia Amodio
2014-12-01
Full Text Available When data are affected by multicollinearity in the linear regression framework, then concurvity will be present in fitting a generalized additive model (GAM. The term concurvity describes nonlinear dependencies among the predictor variables. As collinearity results in inflated variance of the estimated regression coefficients in the linear regression model, the result of the presence of concurvity leads to instability of the estimated coefficients in GAMs. Even if the backfitting algorithm will always converge to a solution, in case of concurvity the final solution of the backfitting procedure in fitting a GAM is influenced by the starting functions. While exact concurvity is highly unlikely, approximate concurvity, the analogue of multicollinearity, is of practical concern as it can lead to upwardly biased estimates of the parameters and to underestimation of their standard errors, increasing the risk of committing type I error. We compare the existing approaches to detect concurvity, pointing out their advantages and drawbacks, using simulated and real data sets. As a result, this paper will provide a general criterion to detect concurvity in nonlinear and non parametric regression models.
Centralized Bayesian reliability modelling with sensor networks
Czech Academy of Sciences Publication Activity Database
Dedecius, Kamil; Sečkárová, Vladimíra
2013-01-01
Roč. 19, č. 5 (2013), s. 471-482 ISSN 1387-3954 R&D Projects: GA MŠk 7D12004 Grant - others:GA MŠk(CZ) SVV-265315 Keywords : Bayesian modelling * Sensor network * Reliability Subject RIV: BD - Theory of Information Impact factor: 0.984, year: 2013 http://library.utia.cas.cz/separaty/2013/AS/dedecius-0392551.pdf
Bayesian Inference of a Multivariate Regression Model
Directory of Open Access Journals (Sweden)
Marick S. Sinay
2014-01-01
Full Text Available We explore Bayesian inference of a multivariate linear regression model with use of a flexible prior for the covariance structure. The commonly adopted Bayesian setup involves the conjugate prior, multivariate normal distribution for the regression coefficients and inverse Wishart specification for the covariance matrix. Here we depart from this approach and propose a novel Bayesian estimator for the covariance. A multivariate normal prior for the unique elements of the matrix logarithm of the covariance matrix is considered. Such structure allows for a richer class of prior distributions for the covariance, with respect to strength of beliefs in prior location hyperparameters, as well as the added ability, to model potential correlation amongst the covariance structure. The posterior moments of all relevant parameters of interest are calculated based upon numerical results via a Markov chain Monte Carlo procedure. The Metropolis-Hastings-within-Gibbs algorithm is invoked to account for the construction of a proposal density that closely matches the shape of the target posterior distribution. As an application of the proposed technique, we investigate a multiple regression based upon the 1980 High School and Beyond Survey.
Bayesian Predictive Models for Rayleigh Wind Speed
DEFF Research Database (Denmark)
Shahirinia, Amir; Hajizadeh, Amin; Yu, David C
2017-01-01
predictive model of the wind speed aggregates the non-homogeneous distributions into a single continuous distribution. Therefore, the result is able to capture the variation among the probability distributions of the wind speeds at the turbines’ locations in a wind farm. More specifically, instead of using...... a wind speed distribution whose parameters are known or estimated, the parameters are considered as random whose variations are according to probability distributions. The Bayesian predictive model for a Rayleigh which only has a single model scale parameter has been proposed. Also closed-form posterior...... and predictive inferences under different reasonable choices of prior distribution in sensitivity analysis have been presented....
Rogers, Jeffrey N.; Parrish, Christopher E.; Ward, Larry G.; Burdick, David M.
2018-03-01
Salt marsh vegetation tends to increase vertical uncertainty in light detection and ranging (lidar) derived elevation data, often causing the data to become ineffective for analysis of topographic features governing tidal inundation or vegetation zonation. Previous attempts at improving lidar data collected in salt marsh environments range from simply computing and subtracting the global elevation bias to more complex methods such as computing vegetation-specific, constant correction factors. The vegetation specific corrections can be used along with an existing habitat map to apply separate corrections to different areas within a study site. It is hypothesized here that correcting salt marsh lidar data by applying location-specific, point-by-point corrections, which are computed from lidar waveform-derived features, tidal-datum based elevation, distance from shoreline and other lidar digital elevation model based variables, using nonparametric regression will produce better results. The methods were developed and tested using full-waveform lidar and ground truth for three marshes in Cape Cod, Massachusetts, U.S.A. Five different model algorithms for nonparametric regression were evaluated, with TreeNet's stochastic gradient boosting algorithm consistently producing better regression and classification results. Additionally, models were constructed to predict the vegetative zone (high marsh and low marsh). The predictive modeling methods used in this study estimated ground elevation with a mean bias of 0.00 m and a standard deviation of 0.07 m (0.07 m root mean square error). These methods appear very promising for correction of salt marsh lidar data and, importantly, do not require an existing habitat map, biomass measurements, or image based remote sensing data such as multi/hyperspectral imagery.
Bayesian Model Selection under Time Constraints
Hoege, M.; Nowak, W.; Illman, W. A.
2017-12-01
Bayesian model selection (BMS) provides a consistent framework for rating and comparing models in multi-model inference. In cases where models of vastly different complexity compete with each other, we also face vastly different computational runtimes of such models. For instance, time series of a quantity of interest can be simulated by an autoregressive process model that takes even less than a second for one run, or by a partial differential equations-based model with runtimes up to several hours or even days. The classical BMS is based on a quantity called Bayesian model evidence (BME). It determines the model weights in the selection process and resembles a trade-off between bias of a model and its complexity. However, in practice, the runtime of models is another weight relevant factor for model selection. Hence, we believe that it should be included, leading to an overall trade-off problem between bias, variance and computing effort. We approach this triple trade-off from the viewpoint of our ability to generate realizations of the models under a given computational budget. One way to obtain BME values is through sampling-based integration techniques. We argue with the fact that more expensive models can be sampled much less under time constraints than faster models (in straight proportion to their runtime). The computed evidence in favor of a more expensive model is statistically less significant than the evidence computed in favor of a faster model, since sampling-based strategies are always subject to statistical sampling error. We present a straightforward way to include this misbalance into the model weights that are the basis for model selection. Our approach follows directly from the idea of insufficient significance. It is based on a computationally cheap bootstrapping error estimate of model evidence and is easy to implement. The approach is illustrated in a small synthetic modeling study.
Distributed Bayesian Networks for User Modeling
DEFF Research Database (Denmark)
Tedesco, Roberto; Dolog, Peter; Nejdl, Wolfgang
2006-01-01
The World Wide Web is a popular platform for providing eLearning applications to a wide spectrum of users. However – as users differ in their preferences, background, requirements, and goals – applications should provide personalization mechanisms. In the Web context, user models used by such ada......The World Wide Web is a popular platform for providing eLearning applications to a wide spectrum of users. However – as users differ in their preferences, background, requirements, and goals – applications should provide personalization mechanisms. In the Web context, user models used...... by such adaptive applications are often partial fragments of an overall user model. The fragments have then to be collected and merged into a global user profile. In this paper we investigate and present algorithms able to cope with distributed, fragmented user models – based on Bayesian Networks – in the context...... of Web-based eLearning platforms. The scenario we are tackling assumes learners who use several systems over time, which are able to create partial Bayesian Networks for user models based on the local system context. In particular, we focus on how to merge these partial user models. Our merge mechanism...
Bayesian Spatial Modelling with R-INLA
Directory of Open Access Journals (Sweden)
Finn Lindgren
2015-02-01
Full Text Available The principles behind the interface to continuous domain spatial models in the R- INLA software package for R are described. The integrated nested Laplace approximation (INLA approach proposed by Rue, Martino, and Chopin (2009 is a computationally effective alternative to MCMC for Bayesian inference. INLA is designed for latent Gaussian models, a very wide and flexible class of models ranging from (generalized linear mixed to spatial and spatio-temporal models. Combined with the stochastic partial differential equation approach (SPDE, Lindgren, Rue, and Lindstrm 2011, one can accommodate all kinds of geographically referenced data, including areal and geostatistical ones, as well as spatial point process data. The implementation interface covers stationary spatial mod- els, non-stationary spatial models, and also spatio-temporal models, and is applicable in epidemiology, ecology, environmental risk assessment, as well as general geostatistics.
Sparse Event Modeling with Hierarchical Bayesian Kernel Methods
2016-01-05
SECURITY CLASSIFICATION OF: The research objective of this proposal was to develop a predictive Bayesian kernel approach to model count data based on...several predictive variables. Such an approach, which we refer to as the Poisson Bayesian kernel model, is able to model the rate of occurrence of... kernel methods made use of: (i) the Bayesian property of improving predictive accuracy as data are dynamically obtained, and (ii) the kernel function
bspmma: An R Package for Bayesian Semiparametric Models for Meta-Analysis
Directory of Open Access Journals (Sweden)
Deborah Burr
2012-07-01
Full Text Available We introduce an R package, bspmma, which implements a Dirichlet-based random effects model specific to meta-analysis. In meta-analysis, when combining effect estimates from several heterogeneous studies, it is common to use a random-effects model. The usual frequentist or Bayesian models specify a normal distribution for the true effects. However, in many situations, the effect distribution is not normal, e.g., it can have thick tails, be skewed, or be multi-modal. A Bayesian nonparametric model based on mixtures of Dirichlet process priors has been proposed in the literature, for the purpose of accommodating the non-normality. We review this model and then describe a competitor, a semiparametric version which has the feature that it allows for a well-defined centrality parameter convenient for determining whether the overall effect is significant. This second Bayesian model is based on a different version of the Dirichlet process prior, and we call it the "conditional Dirichlet model". The package contains functions to carry out analyses based on either the ordinary or the conditional Dirichlet model, functions for calculating certain Bayes factors that provide a check on the appropriateness of the conditional Dirichlet model, and functions that enable an empirical Bayes selection of the precision parameter of the Dirichlet process. We illustrate the use of the package on two examples, and give an interpretation of the results in these two different scenarios.
Adversarial life testing: A Bayesian negotiation model
International Nuclear Information System (INIS)
Rufo, M.J.; Martín, J.; Pérez, C.J.
2014-01-01
Life testing is a procedure intended for facilitating the process of making decisions in the context of industrial reliability. On the other hand, negotiation is a process of making joint decisions that has one of its main foundations in decision theory. A Bayesian sequential model of negotiation in the context of adversarial life testing is proposed. This model considers a general setting for which a manufacturer offers a product batch to a consumer. It is assumed that the reliability of the product is measured in terms of its lifetime. Furthermore, both the manufacturer and the consumer have to use their own information with respect to the quality of the product. Under these assumptions, two situations can be analyzed. For both of them, the main aim is to accept or reject the product batch based on the product reliability. This topic is related to a reliability demonstration problem. The procedure is applied to a class of distributions that belong to the exponential family. Thus, a unified framework addressing the main topics in the considered Bayesian model is presented. An illustrative example shows that the proposed technique can be easily applied in practice
International Nuclear Information System (INIS)
Coolen, F.P.A.
1997-01-01
This paper is intended to make researchers in reliability theory aware of a recently introduced Bayesian model with imprecise prior distributions for statistical inference on failure data, that can also be considered as a robust Bayesian model. The model consists of a multinomial distribution with Dirichlet priors, making the approach basically nonparametric. New results for the model are presented, related to right-censored observations, where estimation based on this model is closely related to the product-limit estimator, which is an important statistical method to deal with reliability or survival data including right-censored observations. As for the product-limit estimator, the model considered in this paper aims at not using any information other than that provided by observed data, but our model fits into the robust Bayesian context which has the advantage that all inferences can be based on probabilities or expectations, or bounds for probabilities or expectations. The model uses a finite partition of the time-axis, and as such it is also related to life-tables
Jones, Matt; Love, Bradley C
2011-08-01
The prominence of Bayesian modeling of cognition has increased recently largely because of mathematical advances in specifying and deriving predictions from complex probabilistic models. Much of this research aims to demonstrate that cognitive behavior can be explained from rational principles alone, without recourse to psychological or neurological processes and representations. We note commonalities between this rational approach and other movements in psychology - namely, Behaviorism and evolutionary psychology - that set aside mechanistic explanations or make use of optimality assumptions. Through these comparisons, we identify a number of challenges that limit the rational program's potential contribution to psychological theory. Specifically, rational Bayesian models are significantly unconstrained, both because they are uninformed by a wide range of process-level data and because their assumptions about the environment are generally not grounded in empirical measurement. The psychological implications of most Bayesian models are also unclear. Bayesian inference itself is conceptually trivial, but strong assumptions are often embedded in the hypothesis sets and the approximation algorithms used to derive model predictions, without a clear delineation between psychological commitments and implementational details. Comparing multiple Bayesian models of the same task is rare, as is the realization that many Bayesian models recapitulate existing (mechanistic level) theories. Despite the expressive power of current Bayesian models, we argue they must be developed in conjunction with mechanistic considerations to offer substantive explanations of cognition. We lay out several means for such an integration, which take into account the representations on which Bayesian inference operates, as well as the algorithms and heuristics that carry it out. We argue this unification will better facilitate lasting contributions to psychological theory, avoiding the pitfalls
Merging Digital Surface Models Implementing Bayesian Approaches
Sadeq, H.; Drummond, J.; Li, Z.
2016-06-01
In this research different DSMs from different sources have been merged. The merging is based on a probabilistic model using a Bayesian Approach. The implemented data have been sourced from very high resolution satellite imagery sensors (e.g. WorldView-1 and Pleiades). It is deemed preferable to use a Bayesian Approach when the data obtained from the sensors are limited and it is difficult to obtain many measurements or it would be very costly, thus the problem of the lack of data can be solved by introducing a priori estimations of data. To infer the prior data, it is assumed that the roofs of the buildings are specified as smooth, and for that purpose local entropy has been implemented. In addition to the a priori estimations, GNSS RTK measurements have been collected in the field which are used as check points to assess the quality of the DSMs and to validate the merging result. The model has been applied in the West-End of Glasgow containing different kinds of buildings, such as flat roofed and hipped roofed buildings. Both quantitative and qualitative methods have been employed to validate the merged DSM. The validation results have shown that the model was successfully able to improve the quality of the DSMs and improving some characteristics such as the roof surfaces, which consequently led to better representations. In addition to that, the developed model has been compared with the well established Maximum Likelihood model and showed similar quantitative statistical results and better qualitative results. Although the proposed model has been applied on DSMs that were derived from satellite imagery, it can be applied to any other sourced DSMs.
MERGING DIGITAL SURFACE MODELS IMPLEMENTING BAYESIAN APPROACHES
Directory of Open Access Journals (Sweden)
H. Sadeq
2016-06-01
Full Text Available In this research different DSMs from different sources have been merged. The merging is based on a probabilistic model using a Bayesian Approach. The implemented data have been sourced from very high resolution satellite imagery sensors (e.g. WorldView-1 and Pleiades. It is deemed preferable to use a Bayesian Approach when the data obtained from the sensors are limited and it is difficult to obtain many measurements or it would be very costly, thus the problem of the lack of data can be solved by introducing a priori estimations of data. To infer the prior data, it is assumed that the roofs of the buildings are specified as smooth, and for that purpose local entropy has been implemented. In addition to the a priori estimations, GNSS RTK measurements have been collected in the field which are used as check points to assess the quality of the DSMs and to validate the merging result. The model has been applied in the West-End of Glasgow containing different kinds of buildings, such as flat roofed and hipped roofed buildings. Both quantitative and qualitative methods have been employed to validate the merged DSM. The validation results have shown that the model was successfully able to improve the quality of the DSMs and improving some characteristics such as the roof surfaces, which consequently led to better representations. In addition to that, the developed model has been compared with the well established Maximum Likelihood model and showed similar quantitative statistical results and better qualitative results. Although the proposed model has been applied on DSMs that were derived from satellite imagery, it can be applied to any other sourced DSMs.
Comparison of Parametric and Nonparametric Methods for Analyzing the Bias of a Numerical Model
Directory of Open Access Journals (Sweden)
Isaac Mugume
2016-01-01
Full Text Available Numerical models are presently applied in many fields for simulation and prediction, operation, or research. The output from these models normally has both systematic and random errors. The study compared January 2015 temperature data for Uganda as simulated using the Weather Research and Forecast model with actual observed station temperature data to analyze the bias using parametric (the root mean square error (RMSE, the mean absolute error (MAE, mean error (ME, skewness, and the bias easy estimate (BES and nonparametric (the sign test, STM methods. The RMSE normally overestimates the error compared to MAE. The RMSE and MAE are not sensitive to direction of bias. The ME gives both direction and magnitude of bias but can be distorted by extreme values while the BES is insensitive to extreme values. The STM is robust for giving the direction of bias; it is not sensitive to extreme values but it does not give the magnitude of bias. The graphical tools (such as time series and cumulative curves show the performance of the model with time. It is recommended to integrate parametric and nonparametric methods along with graphical methods for a comprehensive analysis of bias of a numerical model.
Learning Bayesian Dependence Model for Student Modelling
Directory of Open Access Journals (Sweden)
Adina COCU
2008-12-01
Full Text Available Learning a Bayesian network from a numeric set of data is a challenging task because of dual nature of learning process: initial need to learn network structure, and then to find out the distribution probability tables. In this paper, we propose a machine-learning algorithm based on hill climbing search combined with Tabu list. The aim of learning process is to discover the best network that represents dependences between nodes. Another issue in machine learning procedure is handling numeric attributes. In order to do that, we must perform an attribute discretization pre-processes. This discretization operation can influence the results of learning network structure. Therefore, we make a comparative study to find out the most suitable combination between discretization method and learning algorithm, for a specific data set.
Parametric, nonparametric and parametric modelling of a chaotic circuit time series
Timmer, J.; Rust, H.; Horbelt, W.; Voss, H. U.
2000-09-01
The determination of a differential equation underlying a measured time series is a frequently arising task in nonlinear time series analysis. In the validation of a proposed model one often faces the dilemma that it is hard to decide whether possible discrepancies between the time series and model output are caused by an inappropriate model or by bad estimates of parameters in a correct type of model, or both. We propose a combination of parametric modelling based on Bock's multiple shooting algorithm and nonparametric modelling based on optimal transformations as a strategy to test proposed models and if rejected suggest and test new ones. We exemplify this strategy on an experimental time series from a chaotic circuit where we obtain an extremely accurate reconstruction of the observed attractor.
Galindo-Garre, Francisca; Hidalgo, María Dolores; Guilera, Georgina; Pino, Oscar; Rojo, J Emilio; Gómez-Benito, Juana
2015-03-01
The World Health Organization Disability Assessment Schedule II (WHO-DAS II) is a multidimensional instrument developed for measuring disability. It comprises six domains (getting around, self-care, getting along with others, life activities and participation in society). The main purpose of this paper is the evaluation of the psychometric properties for each domain of the WHO-DAS II with parametric and non-parametric Item Response Theory (IRT) models. A secondary objective is to assess whether the WHO-DAS II items within each domain form a hierarchy of invariantly ordered severity indicators of disability. A sample of 352 patients with a schizophrenia spectrum disorder is used in this study. The 36 items WHO-DAS II was administered during the consultation. Partial Credit and Mokken scale models are used to study the psychometric properties of the questionnaire. The psychometric properties of the WHO-DAS II scale are satisfactory for all the domains. However, we identify a few items that do not discriminate satisfactorily between different levels of disability and cannot be invariantly ordered in the scale. In conclusion the WHO-DAS II can be used to assess overall disability in patients with schizophrenia, but some domains are too general to assess functionality in these patients because they contain items that are not applicable to this pathology. Copyright © 2014 John Wiley & Sons, Ltd.
Gianola, Daniel; Wu, Xiao-Lin; Manfredi, Eduardo; Simianer, Henner
2010-10-01
A Bayesian nonparametric form of regression based on Dirichlet process priors is adapted to the analysis of quantitative traits possibly affected by cryptic forms of gene action, and to the context of SNP-assisted genomic selection, where the main objective is to predict a genomic signal on phenotype. The procedure clusters unknown genotypes into groups with distinct genetic values, but in a setting in which the number of clusters is unknown a priori, so that standard methods for finite mixture analysis do not work. The central assumption is that genetic effects follow an unknown distribution with some "baseline" family, which is a normal process in the cases considered here. A Bayesian analysis based on the Gibbs sampler produces estimates of the number of clusters, posterior means of genetic effects, a measure of credibility in the baseline distribution, as well as estimates of parameters of the latter. The procedure is illustrated with a simulation representing two populations. In the first one, there are 3 unknown QTL, with additive, dominance and epistatic effects; in the second, there are 10 QTL with additive, dominance and additive × additive epistatic effects. In the two populations, baseline parameters are inferred correctly. The Dirichlet process model infers the number of unique genetic values correctly in the first population, but it produces an understatement in the second one; here, the true number of clusters is over 900, and the model gives a posterior mean estimate of about 140, probably because more replication of genotypes is needed for correct inference. The impact on inferences of the prior distribution of a key parameter (M), and of the extent of replication, was examined via an analysis of mean body weight in 192 paternal half-sib families of broiler chickens, where each sire was genotyped for nearly 7,000 SNPs. In this small sample, it was found that inference about the number of clusters was affected by the prior distribution of M. For a
Efficient Bayesian network modeling of systems
International Nuclear Information System (INIS)
Bensi, Michelle; Kiureghian, Armen Der; Straub, Daniel
2013-01-01
The Bayesian network (BN) is a convenient tool for probabilistic modeling of system performance, particularly when it is of interest to update the reliability of the system or its components in light of observed information. In this paper, BN structures for modeling the performance of systems that are defined in terms of their minimum link or cut sets are investigated. Standard BN structures that define the system node as a child of its constituent components or its minimum link/cut sets lead to converging structures, which are computationally disadvantageous and could severely hamper application of the BN to real systems. A systematic approach to defining an alternative formulation is developed that creates chain-like BN structures that are orders of magnitude more efficient, particularly in terms of computational memory demand. The formulation uses an integer optimization algorithm to identify the most efficient BN structure. Example applications demonstrate the proposed methodology and quantify the gained computational advantage
Model parameter updating using Bayesian networks
International Nuclear Information System (INIS)
Treml, C.A.; Ross, Timothy J.
2004-01-01
This paper outlines a model parameter updating technique for a new method of model validation using a modified model reference adaptive control (MRAC) framework with Bayesian Networks (BNs). The model parameter updating within this method is generic in the sense that the model/simulation to be validated is treated as a black box. It must have updateable parameters to which its outputs are sensitive, and those outputs must have metrics that can be compared to that of the model reference, i.e., experimental data. Furthermore, no assumptions are made about the statistics of the model parameter uncertainty, only upper and lower bounds need to be specified. This method is designed for situations where a model is not intended to predict a complete point-by-point time domain description of the item/system behavior; rather, there are specific points, features, or events of interest that need to be predicted. These specific points are compared to the model reference derived from actual experimental data. The logic for updating the model parameters to match the model reference is formed via a BN. The nodes of this BN consist of updateable model input parameters and the specific output values or features of interest. Each time the model is executed, the input/output pairs are used to adapt the conditional probabilities of the BN. Each iteration further refines the inferred model parameters to produce the desired model output. After parameter updating is complete and model inputs are inferred, reliabilities for the model output are supplied. Finally, this method is applied to a simulation of a resonance control cooling system for a prototype coupled cavity linac. The results are compared to experimental data.
A non-parametric consistency test of the ΛCDM model with Planck CMB data
Energy Technology Data Exchange (ETDEWEB)
Aghamousa, Amir; Shafieloo, Arman [Korea Astronomy and Space Science Institute, Daejeon 305-348 (Korea, Republic of); Hamann, Jan, E-mail: amir@aghamousa.com, E-mail: jan.hamann@unsw.edu.au, E-mail: shafieloo@kasi.re.kr [School of Physics, The University of New South Wales, Sydney NSW 2052 (Australia)
2017-09-01
Non-parametric reconstruction methods, such as Gaussian process (GP) regression, provide a model-independent way of estimating an underlying function and its uncertainty from noisy data. We demonstrate how GP-reconstruction can be used as a consistency test between a given data set and a specific model by looking for structures in the residuals of the data with respect to the model's best-fit. Applying this formalism to the Planck temperature and polarisation power spectrum measurements, we test their global consistency with the predictions of the base ΛCDM model. Our results do not show any serious inconsistencies, lending further support to the interpretation of the base ΛCDM model as cosmology's gold standard.
Kwak, Sehyun; Svensson, J; Brix, M; Ghim, Y-C
2016-02-01
A Bayesian model of the emission spectrum of the JET lithium beam has been developed to infer the intensity of the Li I (2p-2s) line radiation and associated uncertainties. The detected spectrum for each channel of the lithium beam emission spectroscopy system is here modelled by a single Li line modified by an instrumental function, Bremsstrahlung background, instrumental offset, and interference filter curve. Both the instrumental function and the interference filter curve are modelled with non-parametric Gaussian processes. All free parameters of the model, the intensities of the Li line, Bremsstrahlung background, and instrumental offset, are inferred using Bayesian probability theory with a Gaussian likelihood for photon statistics and electronic background noise. The prior distributions of the free parameters are chosen as Gaussians. Given these assumptions, the intensity of the Li line and corresponding uncertainties are analytically available using a Bayesian linear inversion technique. The proposed approach makes it possible to extract the intensity of Li line without doing a separate background subtraction through modulation of the Li beam.
A semi-nonparametric mixture model for selecting functionally consistent proteins.
Yu, Lianbo; Doerge, Rw
2010-09-28
High-throughput technologies have led to a new era of proteomics. Although protein microarray experiments are becoming more common place there are a variety of experimental and statistical issues that have yet to be addressed, and that will carry over to new high-throughput technologies unless they are investigated. One of the largest of these challenges is the selection of functionally consistent proteins. We present a novel semi-nonparametric mixture model for classifying proteins as consistent or inconsistent while controlling the false discovery rate and the false non-discovery rate. The performance of the proposed approach is compared to current methods via simulation under a variety of experimental conditions. We provide a statistical method for selecting functionally consistent proteins in the context of protein microarray experiments, but the proposed semi-nonparametric mixture model method can certainly be generalized to solve other mixture data problems. The main advantage of this approach is that it provides the posterior probability of consistency for each protein.
Bayesian Analysis for Penalized Spline Regression Using WinBUGS
Directory of Open Access Journals (Sweden)
Ciprian M. Crainiceanu
2005-09-01
Full Text Available Penalized splines can be viewed as BLUPs in a mixed model framework, which allows the use of mixed model software for smoothing. Thus, software originally developed for Bayesian analysis of mixed models can be used for penalized spline regression. Bayesian inference for nonparametric models enjoys the flexibility of nonparametric models and the exact inference provided by the Bayesian inferential machinery. This paper provides a simple, yet comprehensive, set of programs for the implementation of nonparametric Bayesian analysis in WinBUGS. Good mixing properties of the MCMC chains are obtained by using low-rank thin-plate splines, while simulation times per iteration are reduced employing WinBUGS specific computational tricks.
Using nonparametrics to specify a model to measure the value of travel time
DEFF Research Database (Denmark)
Fosgerau, Mogens
2007-01-01
Using a range of nonparametric methods, the paper examines the specification of a model to evaluate the willingness-to-pay (WTP) for travel time changes from binomial choice data from a simple time-cost trading experiment. The analysis favours a model with random WTP as the only source...... of randomness over a model with fixed WTP which is linear in time and cost and has an additive random error term. Results further indicate that the distribution of log WTP can be described as a sum of a linear index fixing the location of the log WTP distribution and an independent random variable representing...... unobserved heterogeneity. This formulation is useful for parametric modelling. The index indicates that the WTP varies systematically with income and other individual characteristics. The WTP varies also with the time difference presented in the experiment which is in contradiction of standard utility theory....
Item selection via Bayesian IRT models.
Arima, Serena
2015-02-10
With reference to a questionnaire that aimed to assess the quality of life for dysarthric speakers, we investigate the usefulness of a model-based procedure for reducing the number of items. We propose a mixed cumulative logit model, which is known in the psychometrics literature as the graded response model: responses to different items are modelled as a function of individual latent traits and as a function of item characteristics, such as their difficulty and their discrimination power. We jointly model the discrimination and the difficulty parameters by using a k-component mixture of normal distributions. Mixture components correspond to disjoint groups of items. Items that belong to the same groups can be considered equivalent in terms of both difficulty and discrimination power. According to decision criteria, we select a subset of items such that the reduced questionnaire is able to provide the same information that the complete questionnaire provides. The model is estimated by using a Bayesian approach, and the choice of the number of mixture components is justified according to information criteria. We illustrate the proposed approach on the basis of data that are collected for 104 dysarthric patients by local health authorities in Lecce and in Milan. Copyright © 2014 John Wiley & Sons, Ltd.
Advances in Bayesian Modeling in Educational Research
Levy, Roy
2016-01-01
In this article, I provide a conceptually oriented overview of Bayesian approaches to statistical inference and contrast them with frequentist approaches that currently dominate conventional practice in educational research. The features and advantages of Bayesian approaches are illustrated with examples spanning several statistical modeling…
Maydeu-Olivares, Albert
2005-01-01
Chernyshenko, Stark, Chan, Drasgow, and Williams (2001) investigated the fit of Samejima's logistic graded model and Levine's non-parametric MFS model to the scales of two personality questionnaires and found that the graded model did not fit well. We attribute the poor fit of the graded model to small amounts of multidimensionality present in…
Nonparametric test of consistency between cosmological models and multiband CMB measurements
Energy Technology Data Exchange (ETDEWEB)
Aghamousa, Amir [Asia Pacific Center for Theoretical Physics, Pohang, Gyeongbuk 790-784 (Korea, Republic of); Shafieloo, Arman, E-mail: amir@apctp.org, E-mail: shafieloo@kasi.re.kr [Korea Astronomy and Space Science Institute, Daejeon 305-348 (Korea, Republic of)
2015-06-01
We present a novel approach to test the consistency of the cosmological models with multiband CMB data using a nonparametric approach. In our analysis we calibrate the REACT (Risk Estimation and Adaptation after Coordinate Transformation) confidence levels associated with distances in function space (confidence distances) based on the Monte Carlo simulations in order to test the consistency of an assumed cosmological model with observation. To show the applicability of our algorithm, we confront Planck 2013 temperature data with concordance model of cosmology considering two different Planck spectra combination. In order to have an accurate quantitative statistical measure to compare between the data and the theoretical expectations, we calibrate REACT confidence distances and perform a bias control using many realizations of the data. Our results in this work using Planck 2013 temperature data put the best fit ΛCDM model at 95% (∼ 2σ) confidence distance from the center of the nonparametric confidence set while repeating the analysis excluding the Planck 217 × 217 GHz spectrum data, the best fit ΛCDM model shifts to 70% (∼ 1σ) confidence distance. The most prominent features in the data deviating from the best fit ΛCDM model seems to be at low multipoles 18 < ℓ < 26 at greater than 2σ, ℓ ∼ 750 at ∼1 to 2σ and ℓ ∼ 1800 at greater than 2σ level. Excluding the 217×217 GHz spectrum the feature at ℓ ∼ 1800 becomes substantially less significance at ∼1 to 2σ confidence level. Results of our analysis based on the new approach we propose in this work are in agreement with other analysis done using alternative methods.
Bayesian inference of chemical kinetic models from proposed reactions
Galagali, Nikhil; Marzouk, Youssef M.
2015-01-01
© 2014 Elsevier Ltd. Bayesian inference provides a natural framework for combining experimental data with prior knowledge to develop chemical kinetic models and quantify the associated uncertainties, not only in parameter values but also in model
Constructive Epistemic Modeling: A Hierarchical Bayesian Model Averaging Method
Tsai, F. T. C.; Elshall, A. S.
2014-12-01
Constructive epistemic modeling is the idea that our understanding of a natural system through a scientific model is a mental construct that continually develops through learning about and from the model. Using the hierarchical Bayesian model averaging (HBMA) method [1], this study shows that segregating different uncertain model components through a BMA tree of posterior model probabilities, model prediction, within-model variance, between-model variance and total model variance serves as a learning tool [2]. First, the BMA tree of posterior model probabilities permits the comparative evaluation of the candidate propositions of each uncertain model component. Second, systemic model dissection is imperative for understanding the individual contribution of each uncertain model component to the model prediction and variance. Third, the hierarchical representation of the between-model variance facilitates the prioritization of the contribution of each uncertain model component to the overall model uncertainty. We illustrate these concepts using the groundwater modeling of a siliciclastic aquifer-fault system. The sources of uncertainty considered are from geological architecture, formation dip, boundary conditions and model parameters. The study shows that the HBMA analysis helps in advancing knowledge about the model rather than forcing the model to fit a particularly understanding or merely averaging several candidate models. [1] Tsai, F. T.-C., and A. S. Elshall (2013), Hierarchical Bayesian model averaging for hydrostratigraphic modeling: Uncertainty segregation and comparative evaluation. Water Resources Research, 49, 5520-5536, doi:10.1002/wrcr.20428. [2] Elshall, A.S., and F. T.-C. Tsai (2014). Constructive epistemic modeling of groundwater flow with geological architecture and boundary condition uncertainty under Bayesian paradigm, Journal of Hydrology, 517, 105-119, doi: 10.1016/j.jhydrol.2014.05.027.
Bootstrap prediction and Bayesian prediction under misspecified models
Fushiki, Tadayoshi
2005-01-01
We consider a statistical prediction problem under misspecified models. In a sense, Bayesian prediction is an optimal prediction method when an assumed model is true. Bootstrap prediction is obtained by applying Breiman's `bagging' method to a plug-in prediction. Bootstrap prediction can be considered to be an approximation to the Bayesian prediction under the assumption that the model is true. However, in applications, there are frequently deviations from the assumed model. In this paper, bo...
A Bayesian Approach to Functional Mixed Effect Modeling for Longitudinal Data with Binomial Outcomes
Kliethermes, Stephanie; Oleson, Jacob
2014-01-01
Longitudinal growth patterns are routinely seen in medical studies where individual and population growth is followed over a period of time. Many current methods for modeling growth presuppose a parametric relationship between the outcome and time (e.g., linear, quadratic); however, these relationships may not accurately capture growth over time. Functional mixed effects (FME) models provide flexibility in handling longitudinal data with nonparametric temporal trends. Although FME methods are well-developed for continuous, normally distributed outcome measures, nonparametric methods for handling categorical outcomes are limited. We consider the situation with binomially distributed longitudinal outcomes. Although percent correct data can be modeled assuming normality, estimates outside the parameter space are possible and thus estimated curves can be unrealistic. We propose a binomial FME model using Bayesian methodology to account for growth curves with binomial (percentage) outcomes. The usefulness of our methods is demonstrated using a longitudinal study of speech perception outcomes from cochlear implant users where we successfully model both the population and individual growth trajectories. Simulation studies also advocate the usefulness of the binomial model particularly when outcomes occur near the boundary of the probability parameter space and in situations with a small number of trials. PMID:24723495
Kliethermes, Stephanie; Oleson, Jacob
2014-08-15
Longitudinal growth patterns are routinely seen in medical studies where individual growth and population growth are followed up over a period of time. Many current methods for modeling growth presuppose a parametric relationship between the outcome and time (e.g., linear and quadratic); however, these relationships may not accurately capture growth over time. Functional mixed-effects (FME) models provide flexibility in handling longitudinal data with nonparametric temporal trends. Although FME methods are well developed for continuous, normally distributed outcome measures, nonparametric methods for handling categorical outcomes are limited. We consider the situation with binomially distributed longitudinal outcomes. Although percent correct data can be modeled assuming normality, estimates outside the parameter space are possible, and thus, estimated curves can be unrealistic. We propose a binomial FME model using Bayesian methodology to account for growth curves with binomial (percentage) outcomes. The usefulness of our methods is demonstrated using a longitudinal study of speech perception outcomes from cochlear implant users where we successfully model both the population and individual growth trajectories. Simulation studies also advocate the usefulness of the binomial model particularly when outcomes occur near the boundary of the probability parameter space and in situations with a small number of trials. Copyright © 2014 John Wiley & Sons, Ltd.
Hierarchical Bayesian Modeling of Fluid-Induced Seismicity
Broccardo, M.; Mignan, A.; Wiemer, S.; Stojadinovic, B.; Giardini, D.
2017-11-01
In this study, we present a Bayesian hierarchical framework to model fluid-induced seismicity. The framework is based on a nonhomogeneous Poisson process with a fluid-induced seismicity rate proportional to the rate of injected fluid. The fluid-induced seismicity rate model depends upon a set of physically meaningful parameters and has been validated for six fluid-induced case studies. In line with the vision of hierarchical Bayesian modeling, the rate parameters are considered as random variables. We develop both the Bayesian inference and updating rules, which are used to develop a probabilistic forecasting model. We tested the Basel 2006 fluid-induced seismic case study to prove that the hierarchical Bayesian model offers a suitable framework to coherently encode both epistemic uncertainty and aleatory variability. Moreover, it provides a robust and consistent short-term seismic forecasting model suitable for online risk quantification and mitigation.
Bayesian Networks for Modeling Dredging Decisions
2011-10-01
years, that algorithms have been developed to solve these problems efficiently. Most modern Bayesian network software uses junction tree (a.k.a. join... software was used to develop the network . This is by no means an exhaustive list of Bayesian network applications, but it is representative of recent...characteristic node (SCN), state- defining node ( SDN ), effect node (EFN), or value node. The five types of nodes can be described as follows: ERDC/EL TR-11
Directory of Open Access Journals (Sweden)
Yin Wang
2014-01-01
Full Text Available We present a nonparametric shape constrained algorithm for segmentation of coronary arteries in computed tomography images within the framework of active contours. An adaptive scale selection scheme, based on the global histogram information of the image data, is employed to determine the appropriate window size for each point on the active contour, which improves the performance of the active contour model in the low contrast local image regions. The possible leakage, which cannot be identified by using intensity features alone, is reduced through the application of the proposed shape constraint, where the shape of circular sampled intensity profile is used to evaluate the likelihood of current segmentation being considered vascular structures. Experiments on both synthetic and clinical datasets have demonstrated the efficiency and robustness of the proposed method. The results on clinical datasets have shown that the proposed approach is capable of extracting more detailed coronary vessels with subvoxel accuracy.
A Nonparametric Operational Risk Modeling Approach Based on Cornish-Fisher Expansion
Directory of Open Access Journals (Sweden)
Xiaoqian Zhu
2014-01-01
Full Text Available It is generally accepted that the choice of severity distribution in loss distribution approach has a significant effect on the operational risk capital estimation. However, the usually used parametric approaches with predefined distribution assumption might be not able to fit the severity distribution accurately. The objective of this paper is to propose a nonparametric operational risk modeling approach based on Cornish-Fisher expansion. In this approach, the samples of severity are generated by Cornish-Fisher expansion and then used in the Monte Carlo simulation to sketch the annual operational loss distribution. In the experiment, the proposed approach is employed to calculate the operational risk capital charge for the overall Chinese banking. The experiment dataset is the most comprehensive operational risk dataset in China as far as we know. The results show that the proposed approach is able to use the information of high order moments and might be more effective and stable than the usually used parametric approach.
Genomic outlier profile analysis: mixture models, null hypotheses, and nonparametric estimation.
Ghosh, Debashis; Chinnaiyan, Arul M
2009-01-01
In most analyses of large-scale genomic data sets, differential expression analysis is typically assessed by testing for differences in the mean of the distributions between 2 groups. A recent finding by Tomlins and others (2005) is of a different type of pattern of differential expression in which a fraction of samples in one group have overexpression relative to samples in the other group. In this work, we describe a general mixture model framework for the assessment of this type of expression, called outlier profile analysis. We start by considering the single-gene situation and establishing results on identifiability. We propose 2 nonparametric estimation procedures that have natural links to familiar multiple testing procedures. We then develop multivariate extensions of this methodology to handle genome-wide measurements. The proposed methodologies are compared using simulation studies as well as data from a prostate cancer gene expression study.
Fast model updating coupling Bayesian inference and PGD model reduction
Rubio, Paul-Baptiste; Louf, François; Chamoin, Ludovic
2018-04-01
The paper focuses on a coupled Bayesian-Proper Generalized Decomposition (PGD) approach for the real-time identification and updating of numerical models. The purpose is to use the most general case of Bayesian inference theory in order to address inverse problems and to deal with different sources of uncertainties (measurement and model errors, stochastic parameters). In order to do so with a reasonable CPU cost, the idea is to replace the direct model called for Monte-Carlo sampling by a PGD reduced model, and in some cases directly compute the probability density functions from the obtained analytical formulation. This procedure is first applied to a welding control example with the updating of a deterministic parameter. In the second application, the identification of a stochastic parameter is studied through a glued assembly example.
When mechanism matters: Bayesian forecasting using models of ecological diffusion
Hefley, Trevor J.; Hooten, Mevin B.; Russell, Robin E.; Walsh, Daniel P.; Powell, James A.
2017-01-01
Ecological diffusion is a theory that can be used to understand and forecast spatio-temporal processes such as dispersal, invasion, and the spread of disease. Hierarchical Bayesian modelling provides a framework to make statistical inference and probabilistic forecasts, using mechanistic ecological models. To illustrate, we show how hierarchical Bayesian models of ecological diffusion can be implemented for large data sets that are distributed densely across space and time. The hierarchical Bayesian approach is used to understand and forecast the growth and geographic spread in the prevalence of chronic wasting disease in white-tailed deer (Odocoileus virginianus). We compare statistical inference and forecasts from our hierarchical Bayesian model to phenomenological regression-based methods that are commonly used to analyse spatial occurrence data. The mechanistic statistical model based on ecological diffusion led to important ecological insights, obviated a commonly ignored type of collinearity, and was the most accurate method for forecasting.
Bayesian hierarchical modelling of North Atlantic windiness
Vanem, E.; Breivik, O. N.
2013-03-01
Extreme weather conditions represent serious natural hazards to ship operations and may be the direct cause or contributing factor to maritime accidents. Such severe environmental conditions can be taken into account in ship design and operational windows can be defined that limits hazardous operations to less extreme conditions. Nevertheless, possible changes in the statistics of extreme weather conditions, possibly due to anthropogenic climate change, represent an additional hazard to ship operations that is less straightforward to account for in a consistent way. Obviously, there are large uncertainties as to how future climate change will affect the extreme weather conditions at sea and there is a need for stochastic models that can describe the variability in both space and time at various scales of the environmental conditions. Previously, Bayesian hierarchical space-time models have been developed to describe the variability and complex dependence structures of significant wave height in space and time. These models were found to perform reasonably well and provided some interesting results, in particular, pertaining to long-term trends in the wave climate. In this paper, a similar framework is applied to oceanic windiness and the spatial and temporal variability of the 10-m wind speed over an area in the North Atlantic ocean is investigated. When the results from the model for North Atlantic windiness is compared to the results for significant wave height over the same area, it is interesting to observe that whereas an increasing trend in significant wave height was identified, no statistically significant long-term trend was estimated in windiness. This may indicate that the increase in significant wave height is not due to an increase in locally generated wind waves, but rather to increased swell. This observation is also consistent with studies that have suggested a poleward shift of the main storm tracks.
Bayesian hierarchical modelling of North Atlantic windiness
Directory of Open Access Journals (Sweden)
E. Vanem
2013-03-01
Full Text Available Extreme weather conditions represent serious natural hazards to ship operations and may be the direct cause or contributing factor to maritime accidents. Such severe environmental conditions can be taken into account in ship design and operational windows can be defined that limits hazardous operations to less extreme conditions. Nevertheless, possible changes in the statistics of extreme weather conditions, possibly due to anthropogenic climate change, represent an additional hazard to ship operations that is less straightforward to account for in a consistent way. Obviously, there are large uncertainties as to how future climate change will affect the extreme weather conditions at sea and there is a need for stochastic models that can describe the variability in both space and time at various scales of the environmental conditions. Previously, Bayesian hierarchical space-time models have been developed to describe the variability and complex dependence structures of significant wave height in space and time. These models were found to perform reasonably well and provided some interesting results, in particular, pertaining to long-term trends in the wave climate. In this paper, a similar framework is applied to oceanic windiness and the spatial and temporal variability of the 10-m wind speed over an area in the North Atlantic ocean is investigated. When the results from the model for North Atlantic windiness is compared to the results for significant wave height over the same area, it is interesting to observe that whereas an increasing trend in significant wave height was identified, no statistically significant long-term trend was estimated in windiness. This may indicate that the increase in significant wave height is not due to an increase in locally generated wind waves, but rather to increased swell. This observation is also consistent with studies that have suggested a poleward shift of the main storm tracks.
Bayesian non- and semi-parametric methods and applications
Rossi, Peter
2014-01-01
This book reviews and develops Bayesian non-parametric and semi-parametric methods for applications in microeconometrics and quantitative marketing. Most econometric models used in microeconomics and marketing applications involve arbitrary distributional assumptions. As more data becomes available, a natural desire to provide methods that relax these assumptions arises. Peter Rossi advocates a Bayesian approach in which specific distributional assumptions are replaced with more flexible distributions based on mixtures of normals. The Bayesian approach can use either a large but fixed number
Bayesian disease mapping: hierarchical modeling in spatial epidemiology
National Research Council Canada - National Science Library
Lawson, Andrew
2013-01-01
Since the publication of the first edition, many new Bayesian tools and methods have been developed for space-time data analysis, the predictive modeling of health outcomes, and other spatial biostatistical areas...
Flexible Bayesian Dynamic Modeling of Covariance and Correlation Matrices
Lan, Shiwei; Holbrook, Andrew; Fortin, Norbert J.; Ombao, Hernando; Shahbaba, Babak
2017-01-01
Modeling covariance (and correlation) matrices is a challenging problem due to the large dimensionality and positive-definiteness constraint. In this paper, we propose a novel Bayesian framework based on decomposing the covariance matrix
Bayesian network modeling of operator's state recognition process
International Nuclear Information System (INIS)
Hatakeyama, Naoki; Furuta, Kazuo
2000-01-01
Nowadays we are facing a difficult problem of establishing a good relation between humans and machines. To solve this problem, we suppose that machine system need to have a model of human behavior. In this study we model the state cognition process of a PWR plant operator as an example. We use a Bayesian network as an inference engine. We incorporate the knowledge hierarchy in the Bayesian network and confirm its validity using the example of PWR plant operator. (author)
DEFF Research Database (Denmark)
Kuikka, Sakari; Haapasaari, Päivi Elisabet; Helle, Inari
2011-01-01
We review the experience obtained in using integrative Bayesian models in interdisciplinary analysis focusing on sustainable use of marine resources and environmental management tasks. We have applied Bayesian models to both fisheries and environmental risk analysis problems. Bayesian belief...... be time consuming and research projects can be difficult to manage due to unpredictable technical problems related to parameter estimation. Biology, sociology and environmental economics have their own scientific traditions. Bayesian models are becoming traditional tools in fisheries biology, where...
Variable Selection for Nonparametric Gaussian Process Priors: Models and Computational Strategies.
Savitsky, Terrance; Vannucci, Marina; Sha, Naijun
2011-02-01
This paper presents a unified treatment of Gaussian process models that extends to data from the exponential dispersion family and to survival data. Our specific interest is in the analysis of data sets with predictors that have an a priori unknown form of possibly nonlinear associations to the response. The modeling approach we describe incorporates Gaussian processes in a generalized linear model framework to obtain a class of nonparametric regression models where the covariance matrix depends on the predictors. We consider, in particular, continuous, categorical and count responses. We also look into models that account for survival outcomes. We explore alternative covariance formulations for the Gaussian process prior and demonstrate the flexibility of the construction. Next, we focus on the important problem of selecting variables from the set of possible predictors and describe a general framework that employs mixture priors. We compare alternative MCMC strategies for posterior inference and achieve a computationally efficient and practical approach. We demonstrate performances on simulated and benchmark data sets.
Le Bihan, Nicolas; Margerin, Ludovic
2009-07-01
In this paper, we present a nonparametric method to estimate the heterogeneity of a random medium from the angular distribution of intensity of waves transmitted through a slab of random material. Our approach is based on the modeling of forward multiple scattering using compound Poisson processes on compact Lie groups. The estimation technique is validated through numerical simulations based on radiative transfer theory.
Jiang, GJ
1998-01-01
This paper develops a nonparametric model of interest rate term structure dynamics based an a spot rate process that permits only positive interest rates and a market price of interest rate risk that precludes arbitrage opportunities. Both the spot rate process and the market price of interest rate
Combination of Bayesian Network and Overlay Model in User Modeling
Directory of Open Access Journals (Sweden)
Loc Nguyen
2009-12-01
Full Text Available The core of adaptive system is user model containing personal information such as knowledge, learning styles, goals… which is requisite for learning personalized process. There are many modeling approaches, for example: stereotype, overlay, plan recognition… but they don’t bring out the solid method for reasoning from user model. This paper introduces the statistical method that combines Bayesian network and overlay modeling so that it is able to infer user’s knowledge from evidences collected during user’s learning process.
Bayesian graphical models for genomewide association studies.
Verzilli, Claudio J; Stallard, Nigel; Whittaker, John C
2006-07-01
As the extent of human genetic variation becomes more fully characterized, the research community is faced with the challenging task of using this information to dissect the heritable components of complex traits. Genomewide association studies offer great promise in this respect, but their analysis poses formidable difficulties. In this article, we describe a computationally efficient approach to mining genotype-phenotype associations that scales to the size of the data sets currently being collected in such studies. We use discrete graphical models as a data-mining tool, searching for single- or multilocus patterns of association around a causative site. The approach is fully Bayesian, allowing us to incorporate prior knowledge on the spatial dependencies around each marker due to linkage disequilibrium, which reduces considerably the number of possible graphical structures. A Markov chain-Monte Carlo scheme is developed that yields samples from the posterior distribution of graphs conditional on the data from which probabilistic statements about the strength of any genotype-phenotype association can be made. Using data simulated under scenarios that vary in marker density, genotype relative risk of a causative allele, and mode of inheritance, we show that the proposed approach has better localization properties and leads to lower false-positive rates than do single-locus analyses. Finally, we present an application of our method to a quasi-synthetic data set in which data from the CYP2D6 region are embedded within simulated data on 100K single-nucleotide polymorphisms. Analysis is quick (<5 min), and we are able to localize the causative site to a very short interval.
A tutorial introduction to Bayesian models of cognitive development.
Perfors, Amy; Tenenbaum, Joshua B; Griffiths, Thomas L; Xu, Fei
2011-09-01
We present an introduction to Bayesian inference as it is used in probabilistic models of cognitive development. Our goal is to provide an intuitive and accessible guide to the what, the how, and the why of the Bayesian approach: what sorts of problems and data the framework is most relevant for, and how and why it may be useful for developmentalists. We emphasize a qualitative understanding of Bayesian inference, but also include information about additional resources for those interested in the cognitive science applications, mathematical foundations, or machine learning details in more depth. In addition, we discuss some important interpretation issues that often arise when evaluating Bayesian models in cognitive science. Copyright © 2010 Elsevier B.V. All rights reserved.
Modelling of JET diagnostics using Bayesian Graphical Models
Energy Technology Data Exchange (ETDEWEB)
Svensson, J. [IPP Greifswald, Greifswald (Germany); Ford, O. [Imperial College, London (United Kingdom); McDonald, D.; Hole, M.; Nessi, G. von; Meakins, A.; Brix, M.; Thomsen, H.; Werner, A.; Sirinelli, A.
2011-07-01
The mapping between physics parameters (such as densities, currents, flows, temperatures etc) defining the plasma 'state' under a given model and the raw observations of each plasma diagnostic will 1) depend on the particular physics model used, 2) is inherently probabilistic, from uncertainties on both observations and instrumental aspects of the mapping, such as calibrations, instrument functions etc. A flexible and principled way of modelling such interconnected probabilistic systems is through so called Bayesian graphical models. Being an amalgam between graph theory and probability theory, Bayesian graphical models can simulate the complex interconnections between physics models and diagnostic observations from multiple heterogeneous diagnostic systems, making it relatively easy to optimally combine the observations from multiple diagnostics for joint inference on parameters of the underlying physics model, which in itself can be represented as part of the graph. At JET about 10 diagnostic systems have to date been modelled in this way, and has lead to a number of new results, including: the reconstruction of the flux surface topology and q-profiles without any specific equilibrium assumption, using information from a number of different diagnostic systems; profile inversions taking into account the uncertainties in the flux surface positions and a substantial increase in accuracy of JET electron density and temperature profiles, including improved pedestal resolution, through the joint analysis of three diagnostic systems. It is believed that the Bayesian graph approach could potentially be utilised for very large sets of diagnostics, providing a generic data analysis framework for nuclear fusion experiments, that would be able to optimally utilize the information from multiple diagnostics simultaneously, and where the explicit graph representation of the connections to underlying physics models could be used for sophisticated model testing. This
Inventory model using bayesian dynamic linear model for demand forecasting
Directory of Open Access Journals (Sweden)
Marisol Valencia-Cárdenas
2014-12-01
Full Text Available An important factor of manufacturing process is the inventory management of terminated product. Constantly, industry is looking for better alternatives to establish an adequate plan of production and stored quantities, with optimal cost, getting quantities in a time horizon, which permits to define resources and logistics with anticipation, needed to distribute products on time. Total absence of historical data, required by many statistical models to forecast, demands the search for other kind of accurate techniques. This work presents an alternative that not only permits to forecast, in an adjusted way, but also, to provide optimal quantities to produce and store with an optimal cost, using Bayesian statistics. The proposal is illustrated with real data. Palabras clave: estadística bayesiana, optimización, modelo de inventarios, modelo lineal dinámico bayesiano. Keywords: Bayesian statistics, opti
Two-component mixture cure rate model with spline estimated nonparametric components.
Wang, Lu; Du, Pang; Liang, Hua
2012-09-01
In some survival analysis of medical studies, there are often long-term survivors who can be considered as permanently cured. The goals in these studies are to estimate the noncured probability of the whole population and the hazard rate of the susceptible subpopulation. When covariates are present as often happens in practice, to understand covariate effects on the noncured probability and hazard rate is of equal importance. The existing methods are limited to parametric and semiparametric models. We propose a two-component mixture cure rate model with nonparametric forms for both the cure probability and the hazard rate function. Identifiability of the model is guaranteed by an additive assumption that allows no time-covariate interactions in the logarithm of hazard rate. Estimation is carried out by an expectation-maximization algorithm on maximizing a penalized likelihood. For inferential purpose, we apply the Louis formula to obtain point-wise confidence intervals for noncured probability and hazard rate. Asymptotic convergence rates of our function estimates are established. We then evaluate the proposed method by extensive simulations. We analyze the survival data from a melanoma study and find interesting patterns for this study. © 2011, The International Biometric Society.
Lu, Tao
2016-01-01
The gene regulation network (GRN) evaluates the interactions between genes and look for models to describe the gene expression behavior. These models have many applications; for instance, by characterizing the gene expression mechanisms that cause certain disorders, it would be possible to target those genes to block the progress of the disease. Many biological processes are driven by nonlinear dynamic GRN. In this article, we propose a nonparametric differential equation (ODE) to model the nonlinear dynamic GRN. Specially, we address following questions simultaneously: (i) extract information from noisy time course gene expression data; (ii) model the nonlinear ODE through a nonparametric smoothing function; (iii) identify the important regulatory gene(s) through a group smoothly clipped absolute deviation (SCAD) approach; (iv) test the robustness of the model against possible shortening of experimental duration. We illustrate the usefulness of the model and associated statistical methods through a simulation and a real application examples.
A Bayesian alternative for multi-objective ecohydrological model specification
Tang, Yating; Marshall, Lucy; Sharma, Ashish; Ajami, Hoori
2018-01-01
Recent studies have identified the importance of vegetation processes in terrestrial hydrologic systems. Process-based ecohydrological models combine hydrological, physical, biochemical and ecological processes of the catchments, and as such are generally more complex and parametric than conceptual hydrological models. Thus, appropriate calibration objectives and model uncertainty analysis are essential for ecohydrological modeling. In recent years, Bayesian inference has become one of the most popular tools for quantifying the uncertainties in hydrological modeling with the development of Markov chain Monte Carlo (MCMC) techniques. The Bayesian approach offers an appealing alternative to traditional multi-objective hydrologic model calibrations by defining proper prior distributions that can be considered analogous to the ad-hoc weighting often prescribed in multi-objective calibration. Our study aims to develop appropriate prior distributions and likelihood functions that minimize the model uncertainties and bias within a Bayesian ecohydrological modeling framework based on a traditional Pareto-based model calibration technique. In our study, a Pareto-based multi-objective optimization and a formal Bayesian framework are implemented in a conceptual ecohydrological model that combines a hydrological model (HYMOD) and a modified Bucket Grassland Model (BGM). Simulations focused on one objective (streamflow/LAI) and multiple objectives (streamflow and LAI) with different emphasis defined via the prior distribution of the model error parameters. Results show more reliable outputs for both predicted streamflow and LAI using Bayesian multi-objective calibration with specified prior distributions for error parameters based on results from the Pareto front in the ecohydrological modeling. The methodology implemented here provides insight into the usefulness of multiobjective Bayesian calibration for ecohydrologic systems and the importance of appropriate prior
Bayesian Estimation of the Logistic Positive Exponent IRT Model
Bolfarine, Heleno; Bazan, Jorge Luis
2010-01-01
A Bayesian inference approach using Markov Chain Monte Carlo (MCMC) is developed for the logistic positive exponent (LPE) model proposed by Samejima and for a new skewed Logistic Item Response Theory (IRT) model, named Reflection LPE model. Both models lead to asymmetric item characteristic curves (ICC) and can be appropriate because a symmetric…
Using consensus bayesian network to model the reactive oxygen species regulatory pathway.
Directory of Open Access Journals (Sweden)
Liangdong Hu
Full Text Available Bayesian network is one of the most successful graph models for representing the reactive oxygen species regulatory pathway. With the increasing number of microarray measurements, it is possible to construct the bayesian network from microarray data directly. Although large numbers of bayesian network learning algorithms have been developed, when applying them to learn bayesian networks from microarray data, the accuracies are low due to that the databases they used to learn bayesian networks contain too few microarray data. In this paper, we propose a consensus bayesian network which is constructed by combining bayesian networks from relevant literatures and bayesian networks learned from microarray data. It would have a higher accuracy than the bayesian networks learned from one database. In the experiment, we validated the bayesian network combination algorithm on several classic machine learning databases and used the consensus bayesian network to model the Escherichia coli's ROS pathway.
Yap, John Stephen; Fan, Jianqing; Wu, Rongling
2009-12-01
Estimation of the covariance structure of longitudinal processes is a fundamental prerequisite for the practical deployment of functional mapping designed to study the genetic regulation and network of quantitative variation in dynamic complex traits. We present a nonparametric approach for estimating the covariance structure of a quantitative trait measured repeatedly at a series of time points. Specifically, we adopt Huang et al.'s (2006, Biometrika 93, 85-98) approach of invoking the modified Cholesky decomposition and converting the problem into modeling a sequence of regressions of responses. A regularized covariance estimator is obtained using a normal penalized likelihood with an L(2) penalty. This approach, embedded within a mixture likelihood framework, leads to enhanced accuracy, precision, and flexibility of functional mapping while preserving its biological relevance. Simulation studies are performed to reveal the statistical properties and advantages of the proposed method. A real example from a mouse genome project is analyzed to illustrate the utilization of the methodology. The new method will provide a useful tool for genome-wide scanning for the existence and distribution of quantitative trait loci underlying a dynamic trait important to agriculture, biology, and health sciences.
Lee, Soojeong; Rajan, Sreeraman; Jeon, Gwanggil; Chang, Joon-Hyuk; Dajani, Hilmi R; Groza, Voicu Z
2017-06-01
Blood pressure (BP) is one of the most important vital indicators and plays a key role in determining the cardiovascular activity of patients. This paper proposes a hybrid approach consisting of nonparametric bootstrap (NPB) and machine learning techniques to obtain the characteristic ratios (CR) used in the blood pressure estimation algorithm to improve the accuracy of systolic blood pressure (SBP) and diastolic blood pressure (DBP) estimates and obtain confidence intervals (CI). The NPB technique is used to circumvent the requirement for large sample set for obtaining the CI. A mixture of Gaussian densities is assumed for the CRs and Gaussian mixture model (GMM) is chosen to estimate the SBP and DBP ratios. The K-means clustering technique is used to obtain the mixture order of the Gaussian densities. The proposed approach achieves grade "A" under British Society of Hypertension testing protocol and is superior to the conventional approach based on maximum amplitude algorithm (MAA) that uses fixed CR ratios. The proposed approach also yields a lower mean error (ME) and the standard deviation of the error (SDE) in the estimates when compared to the conventional MAA method. In addition, CIs obtained through the proposed hybrid approach are also narrower with a lower SDE. The proposed approach combining the NPB technique with the GMM provides a methodology to derive individualized characteristic ratio. The results exhibit that the proposed approach enhances the accuracy of SBP and DBP estimation and provides narrower confidence intervals for the estimates. Copyright © 2015 Elsevier Ltd. All rights reserved.
Sueiro, Manuel J.; Abad, Francisco J.
2011-01-01
The distance between nonparametric and parametric item characteristic curves has been proposed as an index of goodness of fit in item response theory in the form of a root integrated squared error index. This article proposes to use the posterior distribution of the latent trait as the nonparametric model and compares the performance of an index…
An Evaluation of Parametric and Nonparametric Models of Fish Population Response.
Energy Technology Data Exchange (ETDEWEB)
Haas, Timothy C.; Peterson, James T.; Lee, Danny C.
1999-11-01
Predicting the distribution or status of animal populations at large scales often requires the use of broad-scale information describing landforms, climate, vegetation, etc. These data, however, often consist of mixtures of continuous and categorical covariates and nonmultiplicative interactions among covariates, complicating statistical analyses. Using data from the interior Columbia River Basin, USA, we compared four methods for predicting the distribution of seven salmonid taxa using landscape information. Subwatersheds (mean size, 7800 ha) were characterized using a set of 12 covariates describing physiography, vegetation, and current land-use. The techniques included generalized logit modeling, classification trees, a nearest neighbor technique, and a modular neural network. We evaluated model performance using out-of-sample prediction accuracy via leave-one-out cross-validation and introduce a computer-intensive Monte Carlo hypothesis testing approach for examining the statistical significance of landscape covariates with the non-parametric methods. We found the modular neural network and the nearest-neighbor techniques to be the most accurate, but were difficult to summarize in ways that provided ecological insight. The modular neural network also required the most extensive computer resources for model fitting and hypothesis testing. The generalized logit models were readily interpretable, but were the least accurate, possibly due to nonlinear relationships and nonmultiplicative interactions among covariates. Substantial overlap among the statistically significant (P<0.05) covariates for each method suggested that each is capable of detecting similar relationships between responses and covariates. Consequently, we believe that employing one or more methods may provide greater biological insight without sacrificing prediction accuracy.
Maritime piracy situation modelling with dynamic Bayesian networks
CSIR Research Space (South Africa)
Dabrowski, James M
2015-05-01
Full Text Available A generative model for modelling maritime vessel behaviour is proposed. The model is a novel variant of the dynamic Bayesian network (DBN). The proposed DBN is in the form of a switching linear dynamic system (SLDS) that has been extended into a...
Bayesian inference model for fatigue life of laminated composites
DEFF Research Database (Denmark)
Dimitrov, Nikolay Krasimirov; Kiureghian, Armen Der; Berggreen, Christian
2016-01-01
A probabilistic model for estimating the fatigue life of laminated composite plates is developed. The model is based on lamina-level input data, making it possible to predict fatigue properties for a wide range of laminate configurations. Model parameters are estimated by Bayesian inference. The ...
Characterizing economic trends by Bayesian stochastic model specification search
DEFF Research Database (Denmark)
Grassi, Stefano; Proietti, Tommaso
We extend a recently proposed Bayesian model selection technique, known as stochastic model specification search, for characterising the nature of the trend in macroeconomic time series. In particular, we focus on autoregressive models with possibly time-varying intercept and slope and decide on ...
Bayesian inference of chemical kinetic models from proposed reactions
Galagali, Nikhil
2015-02-01
© 2014 Elsevier Ltd. Bayesian inference provides a natural framework for combining experimental data with prior knowledge to develop chemical kinetic models and quantify the associated uncertainties, not only in parameter values but also in model structure. Most existing applications of Bayesian model selection methods to chemical kinetics have been limited to comparisons among a small set of models, however. The significant computational cost of evaluating posterior model probabilities renders traditional Bayesian methods infeasible when the model space becomes large. We present a new framework for tractable Bayesian model inference and uncertainty quantification using a large number of systematically generated model hypotheses. The approach involves imposing point-mass mixture priors over rate constants and exploring the resulting posterior distribution using an adaptive Markov chain Monte Carlo method. The posterior samples are used to identify plausible models, to quantify rate constant uncertainties, and to extract key diagnostic information about model structure-such as the reactions and operating pathways most strongly supported by the data. We provide numerical demonstrations of the proposed framework by inferring kinetic models for catalytic steam and dry reforming of methane using available experimental data.
Monitoring coastal marshes biomass with CASI: a comparison of parametric and non-parametric models
Mo, Y.; Kearney, M.
2017-12-01
Coastal marshes are important carbon sinks that face multiple natural and anthropogenic stresses. Optical remote sensing is a powerful tool for closely monitoring the biomass of coastal marshes. However, application of hyperspectral sensors on assessing the biomass of diverse coastal marsh ecosystems is limited. This study samples spectral and biophysical data from coastal freshwater, intermediate, brackish, and saline marshes in Louisiana, and develops parametric and non-parametric models for using the Compact Airborne Spectrographic Imager (CASI) to retrieve the marshes' biomass. Linear models and random forest models are developed from simulated CASI data (48 bands, 380-1050 nm, bandwidth 14 nm). Linear models are also developed using narrowband vegetation indices computed from all possible band combinations from the blue, red, and near infrared wavelengths. It is found that the linear models derived from the optimal narrowband vegetation indices provide strong predictions for the marshes' Leaf Area Index (LAI; R2 > 0.74 for ARVI), but not for their Aboveground Green Biomass (AGB; R2 > 0.25). The linear models derived from the simulated CASI data strongly predict the marshes' LAI (R2 = 0.93) and AGB (R2 = 0.71) and have 27 and 30 bands/variables in the final models through stepwise regression, respectively. The random forest models derived from the simulated CASI data also strongly predict the marshes' LAI and AGB (R2 = 0.91 and 0.84, respectively), where the most important variables for predicting LAI are near infrared bands at 784 and 756 nm and for predicting ABG are red bands at 684 and 670 nm. In sum, the random forest model is preferable for assessing coastal marsh biomass using CASI data as it offers high R2 for both LAI and AGB. The superior performance of the random forest model is likely to due to that it fully utilizes the full-spectrum data and makes no assumption of the approximate normality of the sampling population. This study offers solutions
Bayesian Subset Modeling for High-Dimensional Generalized Linear Models
Liang, Faming
2013-06-01
This article presents a new prior setting for high-dimensional generalized linear models, which leads to a Bayesian subset regression (BSR) with the maximum a posteriori model approximately equivalent to the minimum extended Bayesian information criterion model. The consistency of the resulting posterior is established under mild conditions. Further, a variable screening procedure is proposed based on the marginal inclusion probability, which shares the same properties of sure screening and consistency with the existing sure independence screening (SIS) and iterative sure independence screening (ISIS) procedures. However, since the proposed procedure makes use of joint information from all predictors, it generally outperforms SIS and ISIS in real applications. This article also makes extensive comparisons of BSR with the popular penalized likelihood methods, including Lasso, elastic net, SIS, and ISIS. The numerical results indicate that BSR can generally outperform the penalized likelihood methods. The models selected by BSR tend to be sparser and, more importantly, of higher prediction ability. In addition, the performance of the penalized likelihood methods tends to deteriorate as the number of predictors increases, while this is not significant for BSR. Supplementary materials for this article are available online. © 2013 American Statistical Association.
Nonparametric tests for censored data
Bagdonavicus, Vilijandas; Nikulin, Mikhail
2013-01-01
This book concerns testing hypotheses in non-parametric models. Generalizations of many non-parametric tests to the case of censored and truncated data are considered. Most of the test results are proved and real applications are illustrated using examples. Theories and exercises are provided. The incorrect use of many tests applying most statistical software is highlighted and discussed.
Research & development and growth: A Bayesian model averaging analysis
Czech Academy of Sciences Publication Activity Database
Horváth, Roman
2011-01-01
Roč. 28, č. 6 (2011), s. 2669-2673 ISSN 0264-9993. [Society for Non-linear Dynamics and Econometrics Annual Conferencen. Washington DC, 16.03.2011-18.03.2011] R&D Projects: GA ČR GA402/09/0965 Institutional research plan: CEZ:AV0Z10750506 Keywords : Research and development * Growth * Bayesian model averaging Subject RIV: AH - Economic s Impact factor: 0.701, year: 2011 http://library.utia.cas.cz/separaty/2011/E/horvath-research & development and growth a bayesian model averaging analysis.pdf
Bayesian inference with information content model check for Langevin equations
DEFF Research Database (Denmark)
Krog, Jens F. C.; Lomholt, Michael Andersen
2017-01-01
The Bayesian data analysis framework has been proven to be a systematic and effective method of parameter inference and model selection for stochastic processes. In this work we introduce an information content model check which may serve as a goodness-of-fit, like the chi-square procedure...
Accurate phenotyping: Reconciling approaches through Bayesian model averaging.
Directory of Open Access Journals (Sweden)
Carla Chia-Ming Chen
Full Text Available Genetic research into complex diseases is frequently hindered by a lack of clear biomarkers for phenotype ascertainment. Phenotypes for such diseases are often identified on the basis of clinically defined criteria; however such criteria may not be suitable for understanding the genetic composition of the diseases. Various statistical approaches have been proposed for phenotype definition; however our previous studies have shown that differences in phenotypes estimated using different approaches have substantial impact on subsequent analyses. Instead of obtaining results based upon a single model, we propose a new method, using Bayesian model averaging to overcome problems associated with phenotype definition. Although Bayesian model averaging has been used in other fields of research, this is the first study that uses Bayesian model averaging to reconcile phenotypes obtained using multiple models. We illustrate the new method by applying it to simulated genetic and phenotypic data for Kofendred personality disorder-an imaginary disease with several sub-types. Two separate statistical methods were used to identify clusters of individuals with distinct phenotypes: latent class analysis and grade of membership. Bayesian model averaging was then used to combine the two clusterings for the purpose of subsequent linkage analyses. We found that causative genetic loci for the disease produced higher LOD scores using model averaging than under either individual model separately. We attribute this improvement to consolidation of the cores of phenotype clusters identified using each individual method.
Involving stakeholders in building integrated fisheries models using Bayesian methods
DEFF Research Database (Denmark)
Haapasaari, Päivi Elisabet; Mäntyniemi, Samu; Kuikka, Sakari
2013-01-01
the potential of the study to contribute to the development of participatory modeling practices. It is concluded that the subjective perspective to knowledge, that is fundamental in Bayesian theory, suits participatory modeling better than a positivist paradigm that seeks the objective truth. The methodology...
Small Sample Properties of Bayesian Multivariate Autoregressive Time Series Models
Price, Larry R.
2012-01-01
The aim of this study was to compare the small sample (N = 1, 3, 5, 10, 15) performance of a Bayesian multivariate vector autoregressive (BVAR-SEM) time series model relative to frequentist power and parameter estimation bias. A multivariate autoregressive model was developed based on correlated autoregressive time series vectors of varying…
Bayesian Network Models in Cyber Security: A Systematic Review
Chockalingam, S.; Pieters, W.; Herdeiro Teixeira, A.M.; van Gelder, P.H.A.J.M.; Lipmaa, Helger; Mitrokotsa, Aikaterini; Matulevicius, Raimundas
2017-01-01
Bayesian Networks (BNs) are an increasingly popular modelling technique in cyber security especially due to their capability to overcome data limitations. This is also instantiated by the growth of BN models development in cyber security. However, a comprehensive comparison and analysis of these
A Bayesian Model of the Memory Colour Effect.
Witzel, Christoph; Olkkonen, Maria; Gegenfurtner, Karl R
2018-01-01
According to the memory colour effect, the colour of a colour-diagnostic object is not perceived independently of the object itself. Instead, it has been shown through an achromatic adjustment method that colour-diagnostic objects still appear slightly in their typical colour, even when they are colourimetrically grey. Bayesian models provide a promising approach to capture the effect of prior knowledge on colour perception and to link these effects to more general effects of cue integration. Here, we model memory colour effects using prior knowledge about typical colours as priors for the grey adjustments in a Bayesian model. This simple model does not involve any fitting of free parameters. The Bayesian model roughly captured the magnitude of the measured memory colour effect for photographs of objects. To some extent, the model predicted observed differences in memory colour effects across objects. The model could not account for the differences in memory colour effects across different levels of realism in the object images. The Bayesian model provides a particularly simple account of memory colour effects, capturing some of the multiple sources of variation of these effects.
Nonparametric estimation in an "illness-death" model when all transition times are interval censored
DEFF Research Database (Denmark)
Frydman, Halina; Gerds, Thomas; Grøn, Randi
2013-01-01
We develop nonparametric maximum likelihood estimation for the parameters of an irreversible Markov chain on states {0,1,2} from the observations with interval censored times of 0 → 1, 0 → 2 and 1 → 2 transitions. The distinguishing aspect of the data is that, in addition to all transition times ...
Böhning, Dankmar; Karasek, Sarah; Terschüren, Claudia; Annuß, Rolf; Fehr, Rainer
2013-03-09
Life expectancy is of increasing prime interest for a variety of reasons. In many countries, life expectancy is growing linearly, without any indication of reaching a limit. The state of North Rhine-Westphalia (NRW) in Germany with its 54 districts is considered here where the above mentioned growth in life expectancy is occurring as well. However, there is also empirical evidence that life expectancy is not growing linearly at the same level for different regions. To explore this situation further a likelihood-based cluster analysis is suggested and performed. The modelling uses a nonparametric mixture approach for the latent random effect. Maximum likelihood estimates are determined by means of the EM algorithm and the number of components in the mixture model are found on the basis of the Bayesian Information Criterion. Regions are classified into the mixture components (clusters) using the maximum posterior allocation rule. For the data analyzed here, 7 components are found with a spatial concentration of lower life expectancy levels in a centre of NRW, formerly an enormous conglomerate of heavy industry, still the most densely populated area with Gelsenkirchen having the lowest level of life expectancy growth for both genders. The paper offers some explanations for this fact including demographic and socio-economic sources. This case study shows that life expectancy growth is widely linear, but it might occur on different levels.
Modeling error distributions of growth curve models through Bayesian methods.
Zhang, Zhiyong
2016-06-01
Growth curve models are widely used in social and behavioral sciences. However, typical growth curve models often assume that the errors are normally distributed although non-normal data may be even more common than normal data. In order to avoid possible statistical inference problems in blindly assuming normality, a general Bayesian framework is proposed to flexibly model normal and non-normal data through the explicit specification of the error distributions. A simulation study shows when the distribution of the error is correctly specified, one can avoid the loss in the efficiency of standard error estimates. A real example on the analysis of mathematical ability growth data from the Early Childhood Longitudinal Study, Kindergarten Class of 1998-99 is used to show the application of the proposed methods. Instructions and code on how to conduct growth curve analysis with both normal and non-normal error distributions using the the MCMC procedure of SAS are provided.
Bayesian Network Models in Cyber Security: A Systematic Review
Chockalingam, S.; Pieters, W.; Herdeiro Teixeira, A.M.; van Gelder, P.H.A.J.M.; Lipmaa, Helger; Mitrokotsa, Aikaterini; Matulevicius, Raimundas
2017-01-01
Bayesian Networks (BNs) are an increasingly popular modelling technique in cyber security especially due to their capability to overcome data limitations. This is also instantiated by the growth of BN models development in cyber security. However, a comprehensive comparison and analysis of these models is missing. In this paper, we conduct a systematic review of the scientific literature and identify 17 standard BN models in cyber security. We analyse these models based on 9 different criteri...
Essays on parametric and nonparametric modeling and estimation with applications to energy economics
Gao, Weiyu
My dissertation research is composed of two parts: a theoretical part on semiparametric efficient estimation and an applied part in energy economics under different dynamic settings. The essays are related in terms of their applications as well as the way in which models are constructed and estimated. In the first essay, efficient estimation of the partially linear model is studied. We work out the efficient score functions and efficiency bounds under four stochastic restrictions---independence, conditional symmetry, conditional zero mean, and partially conditional zero mean. A feasible efficient estimation method for the linear part of the model is developed based on the efficient score. A battery of specification test that allows for choosing between the alternative assumptions is provided. A Monte Carlo simulation is also conducted. The second essay presents a dynamic optimization model for a stylized oilfield resembling the largest developed light oil field in Saudi Arabia, Ghawar. We use data from different sources to estimate the oil production cost function and the revenue function. We pay particular attention to the dynamic aspect of the oil production by employing petroleum-engineering software to simulate the interaction between control variables and reservoir state variables. Optimal solutions are studied under different scenarios to account for the possible changes in the exogenous variables and the uncertainty about the forecasts. The third essay examines the effect of oil price volatility on the level of innovation displayed by the U.S. economy. A measure of innovation is calculated by decomposing an output-based Malmquist index. We also construct a nonparametric measure for oil price volatility. Technical change and oil price volatility are then placed in a VAR system with oil price and a variable indicative of monetary policy. The system is estimated and analyzed for significant relationships. We find that oil price volatility displays a significant
Development of dynamic Bayesian models for web application test management
Azarnova, T. V.; Polukhin, P. V.; Bondarenko, Yu V.; Kashirina, I. L.
2018-03-01
The mathematical apparatus of dynamic Bayesian networks is an effective and technically proven tool that can be used to model complex stochastic dynamic processes. According to the results of the research, mathematical models and methods of dynamic Bayesian networks provide a high coverage of stochastic tasks associated with error testing in multiuser software products operated in a dynamically changing environment. Formalized representation of the discrete test process as a dynamic Bayesian model allows us to organize the logical connection between individual test assets for multiple time slices. This approach gives an opportunity to present testing as a discrete process with set structural components responsible for the generation of test assets. Dynamic Bayesian network-based models allow us to combine in one management area individual units and testing components with different functionalities and a direct influence on each other in the process of comprehensive testing of various groups of computer bugs. The application of the proposed models provides an opportunity to use a consistent approach to formalize test principles and procedures, methods used to treat situational error signs, and methods used to produce analytical conclusions based on test results.
Spatial and spatio-temporal bayesian models with R - INLA
Blangiardo, Marta
2015-01-01
Dedication iiiPreface ix1 Introduction 11.1 Why spatial and spatio-temporal statistics? 11.2 Why do we use Bayesian methods for modelling spatial and spatio-temporal structures? 21.3 Why INLA? 31.4 Datasets 32 Introduction to 212.1 The language 212.2 objects 222.3 Data and session management 342.4 Packages 352.5 Programming in 362.6 Basic statistical analysis with 393 Introduction to Bayesian Methods 533.1 Bayesian Philosophy 533.2 Basic Probability Elements 573.3 Bayes Theorem 623.4 Prior and Posterior Distributions 643.5 Working with the Posterior Distribution 663.6 Choosing the Prior Distr
Bayesian log-periodic model for financial crashes
DEFF Research Database (Denmark)
Rodríguez-Caballero, Carlos Vladimir; Knapik, Oskar
2014-01-01
This paper introduces a Bayesian approach in econophysics literature about financial bubbles in order to estimate the most probable time for a financial crash to occur. To this end, we propose using noninformative prior distributions to obtain posterior distributions. Since these distributions...... cannot be performed analytically, we develop a Markov Chain Monte Carlo algorithm to draw from posterior distributions. We consider three Bayesian models that involve normal and Student’s t-distributions in the disturbances and an AR(1)-GARCH(1,1) structure only within the first case. In the empirical...... part of the study, we analyze a well-known example of financial bubble – the S&P 500 1987 crash – to show the usefulness of the three methods under consideration and crashes of Merval-94, Bovespa-97, IPCMX-94, Hang Seng-97 using the simplest method. The novelty of this research is that the Bayesian...
DEFF Research Database (Denmark)
Mørup, Morten; Schmidt, Mikkel N
2012-01-01
Many networks of scientific interest naturally decompose into clusters or communities with comparatively fewer external than internal links; however, current Bayesian models of network communities do not exert this intuitive notion of communities. We formulate a nonparametric Bayesian model...... for community detection consistent with an intuitive definition of communities and present a Markov chain Monte Carlo procedure for inferring the community structure. A Matlab toolbox with the proposed inference procedure is available for download. On synthetic and real networks, our model detects communities...... consistent with ground truth, and on real networks, it outperforms existing approaches in predicting missing links. This suggests that community structure is an important structural property of networks that should be explicitly modeled....
Occam factors and model independent Bayesian learning of continuous distributions
International Nuclear Information System (INIS)
Nemenman, Ilya; Bialek, William
2002-01-01
Learning of a smooth but nonparametric probability density can be regularized using methods of quantum field theory. We implement a field theoretic prior numerically, test its efficacy, and show that the data and the phase space factors arising from the integration over the model space determine the free parameter of the theory ('smoothness scale') self-consistently. This persists even for distributions that are atypical in the prior and is a step towards a model independent theory for learning continuous distributions. Finally, we point out that a wrong parametrization of a model family may sometimes be advantageous for small data sets
Hierarchical Bayesian spatial models for multispecies conservation planning and monitoring.
Carroll, Carlos; Johnson, Devin S; Dunk, Jeffrey R; Zielinski, William J
2010-12-01
Biologists who develop and apply habitat models are often familiar with the statistical challenges posed by their data's spatial structure but are unsure of whether the use of complex spatial models will increase the utility of model results in planning. We compared the relative performance of nonspatial and hierarchical Bayesian spatial models for three vertebrate and invertebrate taxa of conservation concern (Church's sideband snails [Monadenia churchi], red tree voles [Arborimus longicaudus], and Pacific fishers [Martes pennanti pacifica]) that provide examples of a range of distributional extents and dispersal abilities. We used presence-absence data derived from regional monitoring programs to develop models with both landscape and site-level environmental covariates. We used Markov chain Monte Carlo algorithms and a conditional autoregressive or intrinsic conditional autoregressive model framework to fit spatial models. The fit of Bayesian spatial models was between 35 and 55% better than the fit of nonspatial analogue models. Bayesian spatial models outperformed analogous models developed with maximum entropy (Maxent) methods. Although the best spatial and nonspatial models included similar environmental variables, spatial models provided estimates of residual spatial effects that suggested how ecological processes might structure distribution patterns. Spatial models built from presence-absence data improved fit most for localized endemic species with ranges constrained by poorly known biogeographic factors and for widely distributed species suspected to be strongly affected by unmeasured environmental variables or population processes. By treating spatial effects as a variable of interest rather than a nuisance, hierarchical Bayesian spatial models, especially when they are based on a common broad-scale spatial lattice (here the national Forest Inventory and Analysis grid of 24 km(2) hexagons), can increase the relevance of habitat models to multispecies
Advanced REACH tool: A Bayesian model for occupational exposure assessment
McNally, K.; Warren, N.; Fransman, W.; Entink, R.K.; Schinkel, J.; Van Tongeren, M.; Cherrie, J.W.; Kromhout, H.; Schneider, T.; Tielemans, E.
2014-01-01
This paper describes a Bayesian model for the assessment of inhalation exposures in an occupational setting; the methodology underpins a freely available web-based application for exposure assessment, the Advanced REACH Tool (ART). The ART is a higher tier exposure tool that combines disparate
Testing adaptive toolbox models: a Bayesian hierarchical approach.
Scheibehenne, Benjamin; Rieskamp, Jörg; Wagenmakers, Eric-Jan
2013-01-01
Many theories of human cognition postulate that people are equipped with a repertoire of strategies to solve the tasks they face. This theoretical framework of a cognitive toolbox provides a plausible account of intra- and interindividual differences in human behavior. Unfortunately, it is often unclear how to rigorously test the toolbox framework. How can a toolbox model be quantitatively specified? How can the number of toolbox strategies be limited to prevent uncontrolled strategy sprawl? How can a toolbox model be formally tested against alternative theories? The authors show how these challenges can be met by using Bayesian inference techniques. By means of parameter recovery simulations and the analysis of empirical data across a variety of domains (i.e., judgment and decision making, children's cognitive development, function learning, and perceptual categorization), the authors illustrate how Bayesian inference techniques allow toolbox models to be quantitatively specified, strategy sprawl to be contained, and toolbox models to be rigorously tested against competing theories. The authors demonstrate that their approach applies at the individual level but can also be generalized to the group level with hierarchical Bayesian procedures. The suggested Bayesian inference techniques represent a theoretical and methodological advancement for toolbox theories of cognition and behavior.
2nd Bayesian Young Statisticians Meeting
Bitto, Angela; Kastner, Gregor; Posekany, Alexandra
2015-01-01
The Second Bayesian Young Statisticians Meeting (BAYSM 2014) and the research presented here facilitate connections among researchers using Bayesian Statistics by providing a forum for the development and exchange of ideas. WU Vienna University of Business and Economics hosted BAYSM 2014 from September 18th to 19th. The guidance of renowned plenary lecturers and senior discussants is a critical part of the meeting and this volume, which follows publication of contributions from BAYSM 2013. The meeting's scientific program reflected the variety of fields in which Bayesian methods are currently employed or could be introduced in the future. Three brilliant keynote lectures by Chris Holmes (University of Oxford), Christian Robert (Université Paris-Dauphine), and Mike West (Duke University), were complemented by 24 plenary talks covering the major topics Dynamic Models, Applications, Bayesian Nonparametrics, Biostatistics, Bayesian Methods in Economics, and Models and Methods, as well as a lively poster session ...
Bayesian estimation of parameters in a regional hydrological model
Directory of Open Access Journals (Sweden)
K. Engeland
2002-01-01
Full Text Available This study evaluates the applicability of the distributed, process-oriented Ecomag model for prediction of daily streamflow in ungauged basins. The Ecomag model is applied as a regional model to nine catchments in the NOPEX area, using Bayesian statistics to estimate the posterior distribution of the model parameters conditioned on the observed streamflow. The distribution is calculated by Markov Chain Monte Carlo (MCMC analysis. The Bayesian method requires formulation of a likelihood function for the parameters and three alternative formulations are used. The first is a subjectively chosen objective function that describes the goodness of fit between the simulated and observed streamflow, as defined in the GLUE framework. The second and third formulations are more statistically correct likelihood models that describe the simulation errors. The full statistical likelihood model describes the simulation errors as an AR(1 process, whereas the simple model excludes the auto-regressive part. The statistical parameters depend on the catchments and the hydrological processes and the statistical and the hydrological parameters are estimated simultaneously. The results show that the simple likelihood model gives the most robust parameter estimates. The simulation error may be explained to a large extent by the catchment characteristics and climatic conditions, so it is possible to transfer knowledge about them to ungauged catchments. The statistical models for the simulation errors indicate that structural errors in the model are more important than parameter uncertainties. Keywords: regional hydrological model, model uncertainty, Bayesian analysis, Markov Chain Monte Carlo analysis
Bayesian inference method for stochastic damage accumulation modeling
International Nuclear Information System (INIS)
Jiang, Xiaomo; Yuan, Yong; Liu, Xian
2013-01-01
Damage accumulation based reliability model plays an increasingly important role in successful realization of condition based maintenance for complicated engineering systems. This paper developed a Bayesian framework to establish stochastic damage accumulation model from historical inspection data, considering data uncertainty. Proportional hazards modeling technique is developed to model the nonlinear effect of multiple influencing factors on system reliability. Different from other hazard modeling techniques such as normal linear regression model, the approach does not require any distribution assumption for the hazard model, and can be applied for a wide variety of distribution models. A Bayesian network is created to represent the nonlinear proportional hazards models and to estimate model parameters by Bayesian inference with Markov Chain Monte Carlo simulation. Both qualitative and quantitative approaches are developed to assess the validity of the established damage accumulation model. Anderson–Darling goodness-of-fit test is employed to perform the normality test, and Box–Cox transformation approach is utilized to convert the non-normality data into normal distribution for hypothesis testing in quantitative model validation. The methodology is illustrated with the seepage data collected from real-world subway tunnels.
Bayesian Dimensionality Assessment for the Multidimensional Nominal Response Model
Directory of Open Access Journals (Sweden)
Javier Revuelta
2017-06-01
Full Text Available This article introduces Bayesian estimation and evaluation procedures for the multidimensional nominal response model. The utility of this model is to perform a nominal factor analysis of items that consist of a finite number of unordered response categories. The key aspect of the model, in comparison with traditional factorial model, is that there is a slope for each response category on the latent dimensions, instead of having slopes associated to the items. The extended parameterization of the multidimensional nominal response model requires large samples for estimation. When sample size is of a moderate or small size, some of these parameters may be weakly empirically identifiable and the estimation algorithm may run into difficulties. We propose a Bayesian MCMC inferential algorithm to estimate the parameters and the number of dimensions underlying the multidimensional nominal response model. Two Bayesian approaches to model evaluation were compared: discrepancy statistics (DIC, WAICC, and LOO that provide an indication of the relative merit of different models, and the standardized generalized discrepancy measure that requires resampling data and is computationally more involved. A simulation study was conducted to compare these two approaches, and the results show that the standardized generalized discrepancy measure can be used to reliably estimate the dimensionality of the model whereas the discrepancy statistics are questionable. The paper also includes an example with real data in the context of learning styles, in which the model is used to conduct an exploratory factor analysis of nominal data.
Characterizing economic trends by Bayesian stochastic model specifi cation search
Grassi, Stefano; Proietti, Tommaso
2010-01-01
We apply a recently proposed Bayesian model selection technique, known as stochastic model specification search, for characterising the nature of the trend in macroeconomic time series. We illustrate that the methodology can be quite successfully applied to discriminate between stochastic and deterministic trends. In particular, we formulate autoregressive models with stochastic trends components and decide on whether a specific feature of the series, i.e. the underlying level and/or the rate...
A Bayesian ensemble of sensitivity measures for severe accident modeling
Energy Technology Data Exchange (ETDEWEB)
Hoseyni, Seyed Mohsen [Department of Basic Sciences, East Tehran Branch, Islamic Azad University, Tehran (Iran, Islamic Republic of); Di Maio, Francesco, E-mail: francesco.dimaio@polimi.it [Energy Department, Politecnico di Milano, Via La Masa 34, 20156 Milano (Italy); Vagnoli, Matteo [Energy Department, Politecnico di Milano, Via La Masa 34, 20156 Milano (Italy); Zio, Enrico [Energy Department, Politecnico di Milano, Via La Masa 34, 20156 Milano (Italy); Chair on System Science and Energetic Challenge, Fondation EDF – Electricite de France Ecole Centrale, Paris, and Supelec, Paris (France); Pourgol-Mohammad, Mohammad [Department of Mechanical Engineering, Sahand University of Technology, Tabriz (Iran, Islamic Republic of)
2015-12-15
Highlights: • We propose a sensitivity analysis (SA) method based on a Bayesian updating scheme. • The Bayesian updating schemes adjourns an ensemble of sensitivity measures. • Bootstrap replicates of a severe accident code output are fed to the Bayesian scheme. • The MELCOR code simulates the fission products release of LOFT LP-FP-2 experiment. • Results are compared with those of traditional SA methods. - Abstract: In this work, a sensitivity analysis framework is presented to identify the relevant input variables of a severe accident code, based on an incremental Bayesian ensemble updating method. The proposed methodology entails: (i) the propagation of the uncertainty in the input variables through the severe accident code; (ii) the collection of bootstrap replicates of the input and output of limited number of simulations for building a set of finite mixture models (FMMs) for approximating the probability density function (pdf) of the severe accident code output of the replicates; (iii) for each FMM, the calculation of an ensemble of sensitivity measures (i.e., input saliency, Hellinger distance and Kullback–Leibler divergence) and the updating when a new piece of evidence arrives, by a Bayesian scheme, based on the Bradley–Terry model for ranking the most relevant input model variables. An application is given with respect to a limited number of simulations of a MELCOR severe accident model describing the fission products release in the LP-FP-2 experiment of the loss of fluid test (LOFT) facility, which is a scaled-down facility of a pressurized water reactor (PWR).
Zhang, Hanze; Huang, Yangxin; Wang, Wei; Chen, Henian; Langland-Orban, Barbara
2017-01-01
In longitudinal AIDS studies, it is of interest to investigate the relationship between HIV viral load and CD4 cell counts, as well as the complicated time effect. Most of common models to analyze such complex longitudinal data are based on mean-regression, which fails to provide efficient estimates due to outliers and/or heavy tails. Quantile regression-based partially linear mixed-effects models, a special case of semiparametric models enjoying benefits of both parametric and nonparametric models, have the flexibility to monitor the viral dynamics nonparametrically and detect the varying CD4 effects parametrically at different quantiles of viral load. Meanwhile, it is critical to consider various data features of repeated measurements, including left-censoring due to a limit of detection, covariate measurement error, and asymmetric distribution. In this research, we first establish a Bayesian joint models that accounts for all these data features simultaneously in the framework of quantile regression-based partially linear mixed-effects models. The proposed models are applied to analyze the Multicenter AIDS Cohort Study (MACS) data. Simulation studies are also conducted to assess the performance of the proposed methods under different scenarios.
Harlander, Niklas; Rosenkranz, Tobias; Hohmann, Volker
2012-08-01
Single channel noise reduction has been well investigated and seems to have reached its limits in terms of speech intelligibility improvement, however, the quality of such schemes can still be advanced. This study tests to what extent novel model-based processing schemes might improve performance in particular for non-stationary noise conditions. Two prototype model-based algorithms, a speech-model-based, and a auditory-model-based algorithm were compared to a state-of-the-art non-parametric minimum statistics algorithm. A speech intelligibility test, preference rating, and listening effort scaling were performed. Additionally, three objective quality measures for the signal, background, and overall distortions were applied. For a better comparison of all algorithms, particular attention was given to the usage of the similar Wiener-based gain rule. The perceptual investigation was performed with fourteen hearing-impaired subjects. The results revealed that the non-parametric algorithm and the auditory model-based algorithm did not affect speech intelligibility, whereas the speech-model-based algorithm slightly decreased intelligibility. In terms of subjective quality, both model-based algorithms perform better than the unprocessed condition and the reference in particular for highly non-stationary noise environments. Data support the hypothesis that model-based algorithms are promising for improving performance in non-stationary noise conditions.
Energy Technology Data Exchange (ETDEWEB)
Hampf, Benjamin
2011-08-15
In this paper we present a new approach to evaluate the environmental efficiency of decision making units. We propose a model that describes a two-stage process consisting of a production and an end-of-pipe abatement stage with the environmental efficiency being determined by the efficiency of both stages. Taking the dependencies between the two stages into account, we show how nonparametric methods can be used to measure environmental efficiency and to decompose it into production and abatement efficiency. For an empirical illustration we apply our model to an analysis of U.S. power plants.
Bayesian models for comparative analysis integrating phylogenetic uncertainty
Directory of Open Access Journals (Sweden)
Villemereuil Pierre de
2012-06-01
Full Text Available Abstract Background Uncertainty in comparative analyses can come from at least two sources: a phylogenetic uncertainty in the tree topology or branch lengths, and b uncertainty due to intraspecific variation in trait values, either due to measurement error or natural individual variation. Most phylogenetic comparative methods do not account for such uncertainties. Not accounting for these sources of uncertainty leads to false perceptions of precision (confidence intervals will be too narrow and inflated significance in hypothesis testing (e.g. p-values will be too small. Although there is some application-specific software for fitting Bayesian models accounting for phylogenetic error, more general and flexible software is desirable. Methods We developed models to directly incorporate phylogenetic uncertainty into a range of analyses that biologists commonly perform, using a Bayesian framework and Markov Chain Monte Carlo analyses. Results We demonstrate applications in linear regression, quantification of phylogenetic signal, and measurement error models. Phylogenetic uncertainty was incorporated by applying a prior distribution for the phylogeny, where this distribution consisted of the posterior tree sets from Bayesian phylogenetic tree estimation programs. The models were analysed using simulated data sets, and applied to a real data set on plant traits, from rainforest plant species in Northern Australia. Analyses were performed using the free and open source software OpenBUGS and JAGS. Conclusions Incorporating phylogenetic uncertainty through an empirical prior distribution of trees leads to more precise estimation of regression model parameters than using a single consensus tree and enables a more realistic estimation of confidence intervals. In addition, models incorporating measurement errors and/or individual variation, in one or both variables, are easily formulated in the Bayesian framework. We show that BUGS is a useful, flexible
Bayesian models for comparative analysis integrating phylogenetic uncertainty
2012-01-01
Background Uncertainty in comparative analyses can come from at least two sources: a) phylogenetic uncertainty in the tree topology or branch lengths, and b) uncertainty due to intraspecific variation in trait values, either due to measurement error or natural individual variation. Most phylogenetic comparative methods do not account for such uncertainties. Not accounting for these sources of uncertainty leads to false perceptions of precision (confidence intervals will be too narrow) and inflated significance in hypothesis testing (e.g. p-values will be too small). Although there is some application-specific software for fitting Bayesian models accounting for phylogenetic error, more general and flexible software is desirable. Methods We developed models to directly incorporate phylogenetic uncertainty into a range of analyses that biologists commonly perform, using a Bayesian framework and Markov Chain Monte Carlo analyses. Results We demonstrate applications in linear regression, quantification of phylogenetic signal, and measurement error models. Phylogenetic uncertainty was incorporated by applying a prior distribution for the phylogeny, where this distribution consisted of the posterior tree sets from Bayesian phylogenetic tree estimation programs. The models were analysed using simulated data sets, and applied to a real data set on plant traits, from rainforest plant species in Northern Australia. Analyses were performed using the free and open source software OpenBUGS and JAGS. Conclusions Incorporating phylogenetic uncertainty through an empirical prior distribution of trees leads to more precise estimation of regression model parameters than using a single consensus tree and enables a more realistic estimation of confidence intervals. In addition, models incorporating measurement errors and/or individual variation, in one or both variables, are easily formulated in the Bayesian framework. We show that BUGS is a useful, flexible general purpose tool for
CSIR Research Space (South Africa)
Johnson, S
2010-02-01
Full Text Available metapopulations was the focus of a Bayesian Network (BN) modelling workshop in South Africa. Using a new heuristics, Iterative Bayesian Network Development Cycle (IBNDC), described in this paper, several networks were formulated to distinguish between the unique...
Uncertainties in ozone concentrations predicted with a Lagrangian photochemical air quality model have been estimated using Bayesian Monte Carlo (BMC) analysis. Bayesian Monte Carlo analysis provides a means of combining subjective "prior" uncertainty estimates developed ...
Technical note: Bayesian calibration of dynamic ruminant nutrition models.
Reed, K F; Arhonditsis, G B; France, J; Kebreab, E
2016-08-01
Mechanistic models of ruminant digestion and metabolism have advanced our understanding of the processes underlying ruminant animal physiology. Deterministic modeling practices ignore the inherent variation within and among individual animals and thus have no way to assess how sources of error influence model outputs. We introduce Bayesian calibration of mathematical models to address the need for robust mechanistic modeling tools that can accommodate error analysis by remaining within the bounds of data-based parameter estimation. For the purpose of prediction, the Bayesian approach generates a posterior predictive distribution that represents the current estimate of the value of the response variable, taking into account both the uncertainty about the parameters and model residual variability. Predictions are expressed as probability distributions, thereby conveying significantly more information than point estimates in regard to uncertainty. Our study illustrates some of the technical advantages of Bayesian calibration and discusses the future perspectives in the context of animal nutrition modeling. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
DEFF Research Database (Denmark)
A methodology is presented that combines modelling based on first principles and data based modelling into a modelling cycle that facilitates fast decision-making based on statistical methods. A strong feature of this methodology is that given a first principles model along with process data......, the corresponding modelling cycle model of the given system for a given purpose. A computer-aided tool, which integrates the elements of the modelling cycle, is also presented, and an example is given of modelling a fed-batch bioreactor....
Projecting UK mortality using Bayesian generalised additive models
Hilton, Jason; Dodd, Erengul; Forster, Jonathan; Smith, Peter W.F.
2018-01-01
Forecasts of mortality provide vital information about future populations, with implications for pension and health-care policy as well as for decisions made by private companies about life insurance and annuity pricing. This paper presents a Bayesian approach to the forecasting of mortality that jointly estimates a Generalised Additive Model (GAM) for mortality for the majority of the age-range and a parametric model for older ages where the data are sparser. The GAM allows smooth components...
Bayesian modeling and prediction of solar particles flux
International Nuclear Information System (INIS)
Dedecius, Kamil; Kalova, Jana
2010-01-01
An autoregression model was developed based on the Bayesian approach. Considering the solar wind non-homogeneity, the idea was applied of combining the pure autoregressive properties of the model with expert knowledge based on a similar behaviour of the various phenomena related to the flux properties. Examples of such situations include the hardening of the X-ray spectrum, which is often followed by coronal mass ejection and a significant increase in the particles flux intensity
International Conference on Robust Rank-Based and Nonparametric Methods
McKean, Joseph
2016-01-01
The contributors to this volume include many of the distinguished researchers in this area. Many of these scholars have collaborated with Joseph McKean to develop underlying theory for these methods, obtain small sample corrections, and develop efficient algorithms for their computation. The papers cover the scope of the area, including robust nonparametric rank-based procedures through Bayesian and big data rank-based analyses. Areas of application include biostatistics and spatial areas. Over the last 30 years, robust rank-based and nonparametric methods have developed considerably. These procedures generalize traditional Wilcoxon-type methods for one- and two-sample location problems. Research into these procedures has culminated in complete analyses for many of the models used in practice including linear, generalized linear, mixed, and nonlinear models. Settings are both multivariate and univariate. With the development of R packages in these areas, computation of these procedures is easily shared with r...
Dagne, Getachew A; Huang, Yangxin
2013-09-30
Common problems to many longitudinal HIV/AIDS, cancer, vaccine, and environmental exposure studies are the presence of a lower limit of quantification of an outcome with skewness and time-varying covariates with measurement errors. There has been relatively little work published simultaneously dealing with these features of longitudinal data. In particular, left-censored data falling below a limit of detection may sometimes have a proportion larger than expected under a usually assumed log-normal distribution. In such cases, alternative models, which can account for a high proportion of censored data, should be considered. In this article, we present an extension of the Tobit model that incorporates a mixture of true undetectable observations and those values from a skew-normal distribution for an outcome with possible left censoring and skewness, and covariates with substantial measurement error. To quantify the covariate process, we offer a flexible nonparametric mixed-effects model within the Tobit framework. A Bayesian modeling approach is used to assess the simultaneous impact of left censoring, skewness, and measurement error in covariates on inference. The proposed methods are illustrated using real data from an AIDS clinical study. . Copyright © 2013 John Wiley & Sons, Ltd.
Application of a predictive Bayesian model to environmental accounting.
Anex, R P; Englehardt, J D
2001-03-30
Environmental accounting techniques are intended to capture important environmental costs and benefits that are often overlooked in standard accounting practices. Environmental accounting methods themselves often ignore or inadequately represent large but highly uncertain environmental costs and costs conditioned by specific prior events. Use of a predictive Bayesian model is demonstrated for the assessment of such highly uncertain environmental and contingent costs. The predictive Bayesian approach presented generates probability distributions for the quantity of interest (rather than parameters thereof). A spreadsheet implementation of a previously proposed predictive Bayesian model, extended to represent contingent costs, is described and used to evaluate whether a firm should undertake an accelerated phase-out of its PCB containing transformers. Variability and uncertainty (due to lack of information) in transformer accident frequency and severity are assessed simultaneously using a combination of historical accident data, engineering model-based cost estimates, and subjective judgement. Model results are compared using several different risk measures. Use of the model for incorporation of environmental risk management into a company's overall risk management strategy is discussed.
Impact of censoring on learning Bayesian networks in survival modelling.
Stajduhar, Ivan; Dalbelo-Basić, Bojana; Bogunović, Nikola
2009-11-01
Bayesian networks are commonly used for presenting uncertainty and covariate interactions in an easily interpretable way. Because of their efficient inference and ability to represent causal relationships, they are an excellent choice for medical decision support systems in diagnosis, treatment, and prognosis. Although good procedures for learning Bayesian networks from data have been defined, their performance in learning from censored survival data has not been widely studied. In this paper, we explore how to use these procedures to learn about possible interactions between prognostic factors and their influence on the variate of interest. We study how censoring affects the probability of learning correct Bayesian network structures. Additionally, we analyse the potential usefulness of the learnt models for predicting the time-independent probability of an event of interest. We analysed the influence of censoring with a simulation on synthetic data sampled from randomly generated Bayesian networks. We used two well-known methods for learning Bayesian networks from data: a constraint-based method and a score-based method. We compared the performance of each method under different levels of censoring to those of the naive Bayes classifier and the proportional hazards model. We did additional experiments on several datasets from real-world medical domains. The machine-learning methods treated censored cases in the data as event-free. We report and compare results for several commonly used model evaluation metrics. On average, the proportional hazards method outperformed other methods in most censoring setups. As part of the simulation study, we also analysed structural similarities of the learnt networks. Heavy censoring, as opposed to no censoring, produces up to a 5% surplus and up to 10% missing total arcs. It also produces up to 50% missing arcs that should originally be connected to the variate of interest. Presented methods for learning Bayesian networks from
Bayesian Modeling of Temporal Coherence in Videos for Entity Discovery and Summarization.
Mitra, Adway; Biswas, Soma; Bhattacharyya, Chiranjib
2017-03-01
A video is understood by users in terms of entities present in it. Entity Discovery is the task of building appearance model for each entity (e.g., a person), and finding all its occurrences in the video. We represent a video as a sequence of tracklets, each spanning 10-20 frames, and associated with one entity. We pose Entity Discovery as tracklet clustering, and approach it by leveraging Temporal Coherence (TC): the property that temporally neighboring tracklets are likely to be associated with the same entity. Our major contributions are the first Bayesian nonparametric models for TC at tracklet-level. We extend Chinese Restaurant Process (CRP) to TC-CRP, and further to Temporally Coherent Chinese Restaurant Franchise (TC-CRF) to jointly model entities and temporal segments using mixture components and sparse distributions. For discovering persons in TV serial videos without meta-data like scripts, these methods show considerable improvement over state-of-the-art approaches to tracklet clustering in terms of clustering accuracy, cluster purity and entity coverage. The proposed methods can perform online tracklet clustering on streaming videos unlike existing approaches, and can automatically reject false tracklets. Finally we discuss entity-driven video summarization- where temporal segments of the video are selected based on the discovered entities, to create a semantically meaningful summary.
Bayesian Inference of High-Dimensional Dynamical Ocean Models
Lin, J.; Lermusiaux, P. F. J.; Lolla, S. V. T.; Gupta, A.; Haley, P. J., Jr.
2015-12-01
This presentation addresses a holistic set of challenges in high-dimension ocean Bayesian nonlinear estimation: i) predict the probability distribution functions (pdfs) of large nonlinear dynamical systems using stochastic partial differential equations (PDEs); ii) assimilate data using Bayes' law with these pdfs; iii) predict the future data that optimally reduce uncertainties; and (iv) rank the known and learn the new model formulations themselves. Overall, we allow the joint inference of the state, equations, geometry, boundary conditions and initial conditions of dynamical models. Examples are provided for time-dependent fluid and ocean flows, including cavity, double-gyre and Strait flows with jets and eddies. The Bayesian model inference, based on limited observations, is illustrated first by the estimation of obstacle shapes and positions in fluid flows. Next, the Bayesian inference of biogeochemical reaction equations and of their states and parameters is presented, illustrating how PDE-based machine learning can rigorously guide the selection and discovery of complex ecosystem models. Finally, the inference of multiscale bottom gravity current dynamics is illustrated, motivated in part by classic overflows and dense water formation sites and their relevance to climate monitoring and dynamics. This is joint work with our MSEAS group at MIT.
Bayesian Age-Period-Cohort Modeling and Prediction - BAMP
Directory of Open Access Journals (Sweden)
Volker J. Schmid
2007-10-01
Full Text Available The software package BAMP provides a method of analyzing incidence or mortality data on the Lexis diagram, using a Bayesian version of an age-period-cohort model. A hierarchical model is assumed with a binomial model in the first-stage. As smoothing priors for the age, period and cohort parameters random walks of first and second order, with and without an additional unstructured component are available. Unstructured heterogeneity can also be included in the model. In order to evaluate the model fit, posterior deviance, DIC and predictive deviances are computed. By projecting the random walk prior into the future, future death rates can be predicted.
Introduction to Hierarchical Bayesian Modeling for Ecological Data
Parent, Eric
2012-01-01
Making statistical modeling and inference more accessible to ecologists and related scientists, Introduction to Hierarchical Bayesian Modeling for Ecological Data gives readers a flexible and effective framework to learn about complex ecological processes from various sources of data. It also helps readers get started on building their own statistical models. The text begins with simple models that progressively become more complex and realistic through explanatory covariates and intermediate hidden states variables. When fitting the models to data, the authors gradually present the concepts a
Bayesian Option Pricing using Mixed Normal Heteroskedasticity Models
DEFF Research Database (Denmark)
Rombouts, Jeroen; Stentoft, Lars
2014-01-01
Option pricing using mixed normal heteroscedasticity models is considered. It is explained how to perform inference and price options in a Bayesian framework. The approach allows to easily compute risk neutral predictive price densities which take into account parameter uncertainty....... In an application to the S&P 500 index, classical and Bayesian inference is performed on the mixture model using the available return data. Comparing the ML estimates and posterior moments small differences are found. When pricing a rich sample of options on the index, both methods yield similar pricing errors...... measured in dollar and implied standard deviation losses, and it turns out that the impact of parameter uncertainty is minor. Therefore, when it comes to option pricing where large amounts of data are available, the choice of the inference method is unimportant. The results are robust to different...
Operational modal analysis modeling, Bayesian inference, uncertainty laws
Au, Siu-Kui
2017-01-01
This book presents operational modal analysis (OMA), employing a coherent and comprehensive Bayesian framework for modal identification and covering stochastic modeling, theoretical formulations, computational algorithms, and practical applications. Mathematical similarities and philosophical differences between Bayesian and classical statistical approaches to system identification are discussed, allowing their mathematical tools to be shared and their results correctly interpreted. Many chapters can be used as lecture notes for the general topic they cover beyond the OMA context. After an introductory chapter (1), Chapters 2–7 present the general theory of stochastic modeling and analysis of ambient vibrations. Readers are first introduced to the spectral analysis of deterministic time series (2) and structural dynamics (3), which do not require the use of probability concepts. The concepts and techniques in these chapters are subsequently extended to a probabilistic context in Chapter 4 (on stochastic pro...
Energy Technology Data Exchange (ETDEWEB)
Marques, T.C.; Cruz Junior, G.; Vinhal, C. [Universidade Federal de Goias (UFG), Goiania, GO (Brazil). Escola de Engenharia Eletrica e de Computacao], Emails: thyago@eeec.ufg.br, gcruz@eeec.ufg.br, vinhal@eeec.ufg.br
2009-07-01
The goal of this paper is to present a methodology to carry out the seasonal stream flow forecasting using database of average monthly inflows of one Brazilian hydroelectric plant located at Grande, Tocantins, Paranaiba, Sao Francisco and Iguacu river's. The model is based on the Adaptive Network Based Fuzzy Inference System (ANFIS), the non-parametric model. The performance of this model was compared with a periodic autoregressive model, the parametric model. The results show that the forecasting errors of the non-parametric model considered are significantly lower than the parametric model. (author)
Bayesian hierarchical model for large-scale covariance matrix estimation.
Zhu, Dongxiao; Hero, Alfred O
2007-12-01
Many bioinformatics problems implicitly depend on estimating large-scale covariance matrix. The traditional approaches tend to give rise to high variance and low accuracy due to "overfitting." We cast the large-scale covariance matrix estimation problem into the Bayesian hierarchical model framework, and introduce dependency between covariance parameters. We demonstrate the advantages of our approaches over the traditional approaches using simulations and OMICS data analysis.
Bayesian Modelling of fMRI Time Series
DEFF Research Database (Denmark)
Højen-Sørensen, Pedro; Hansen, Lars Kai; Rasmussen, Carl Edward
2000-01-01
We present a Hidden Markov Model (HMM) for inferring the hidden psychological state (or neural activity) during single trial fMRI activation experiments with blocked task paradigms. Inference is based on Bayesian methodology, using a combination of analytical and a variety of Markov Chain Monte...... Carlo (MCMC) sampling techniques. The advantage of this method is that detection of short time learning effects between repeated trials is possible since inference is based only on single trial experiments....
A Bayesian Model of the Memory Colour Effect
Witzel, Christoph; Olkkonen, Maria; Gegenfurtner, Karl R.
2018-01-01
According to the memory colour effect, the colour of a colour-diagnostic object is not perceived independently of the object itself. Instead, it has been shown through an achromatic adjustment method that colour-diagnostic objects still appear slightly in their typical colour, even when they are colourimetrically grey. Bayesian models provide a promising approach to capture the effect of prior knowledge on colour perception and to link these effects to more general effects of cue integration....
A Bayesian model of the memory colour effect.
Witzel, Christoph; Olkkonen, Maria; Gegenfurtner, Karl R.
2018-01-01
According to the memory colour effect, the colour of a colour-diagnostic object is not perceived independently of the object itself. Instead, it has been shown through an achromatic adjustment method that colour-diagnostic objects still appear slightly in their typical colour, even when they are colourimetrically grey. Bayesian models provide a promising approach to capture the effect of prior knowledge on colour perception and to link these effects to more general effects of cue integration....
Li, L.; Xu, C.-Y.; Engeland, K.
2012-04-01
With respect to model calibration, parameter estimation and analysis of uncertainty sources, different approaches have been used in hydrological models. Bayesian method is one of the most widely used methods for uncertainty assessment of hydrological models, which incorporates different sources of information into a single analysis through Bayesian theorem. However, none of these applications can well treat the uncertainty in extreme flows of hydrological models' simulations. This study proposes a Bayesian modularization method approach in uncertainty assessment of conceptual hydrological models by considering the extreme flows. It includes a comprehensive comparison and evaluation of uncertainty assessments by a new Bayesian modularization method approach and traditional Bayesian models using the Metropolis Hasting (MH) algorithm with the daily hydrological model WASMOD. Three likelihood functions are used in combination with traditional Bayesian: the AR (1) plus Normal and time period independent model (Model 1), the AR (1) plus Normal and time period dependent model (Model 2) and the AR (1) plus multi-normal model (Model 3). The results reveal that (1) the simulations derived from Bayesian modularization method are more accurate with the highest Nash-Sutcliffe efficiency value, and (2) the Bayesian modularization method performs best in uncertainty estimates of entire flows and in terms of the application and computational efficiency. The study thus introduces a new approach for reducing the extreme flow's effect on the discharge uncertainty assessment of hydrological models via Bayesian. Keywords: extreme flow, uncertainty assessment, Bayesian modularization, hydrological model, WASMOD
Advances in Applications of Hierarchical Bayesian Methods with Hydrological Models
Alexander, R. B.; Schwarz, G. E.; Boyer, E. W.
2017-12-01
Mechanistic and empirical watershed models are increasingly used to inform water resource decisions. Growing access to historical stream measurements and data from in-situ sensor technologies has increased the need for improved techniques for coupling models with hydrological measurements. Techniques that account for the intrinsic uncertainties of both models and measurements are especially needed. Hierarchical Bayesian methods provide an efficient modeling tool for quantifying model and prediction uncertainties, including those associated with measurements. Hierarchical methods can also be used to explore spatial and temporal variations in model parameters and uncertainties that are informed by hydrological measurements. We used hierarchical Bayesian methods to develop a hybrid (statistical-mechanistic) SPARROW (SPAtially Referenced Regression On Watershed attributes) model of long-term mean annual streamflow across diverse environmental and climatic drainages in 18 U.S. hydrological regions. Our application illustrates the use of a new generation of Bayesian methods that offer more advanced computational efficiencies than the prior generation. Evaluations of the effects of hierarchical (regional) variations in model coefficients and uncertainties on model accuracy indicates improved prediction accuracies (median of 10-50%) but primarily in humid eastern regions, where model uncertainties are one-third of those in arid western regions. Generally moderate regional variability is observed for most hierarchical coefficients. Accounting for measurement and structural uncertainties, using hierarchical state-space techniques, revealed the effects of spatially-heterogeneous, latent hydrological processes in the "localized" drainages between calibration sites; this improved model precision, with only minor changes in regional coefficients. Our study can inform advances in the use of hierarchical methods with hydrological models to improve their integration with stream
Salameh , Farah; Picot , Antoine; Chabert , Marie; Maussion , Pascal
2017-01-01
International audience; This paper describes an original statistical approach for the lifespan modeling of electric machine insulation materials. The presented models aim to study the effect of three main stress factors (voltage, frequency and temperature) and their interactions on the insulation lifespan. The proposed methodology is applied to two different insulation materials tested in partial discharge regime. Accelerated ageing tests are organized according to experimental optimization m...
A Bayesian Hierarchical Modeling Approach to Predicting Flow in Ungauged Basins
Gronewold, A.; Alameddine, I.; Anderson, R. M.
2009-12-01
Recent innovative approaches to identifying and applying regression-based relationships between land use patterns (such as increasing impervious surface area and decreasing vegetative cover) and rainfall-runoff model parameters represent novel and promising improvements to predicting flow from ungauged basins. In particular, these approaches allow for predicting flows under uncertain and potentially variable future conditions due to rapid land cover changes, variable climate conditions, and other factors. Despite the broad range of literature on estimating rainfall-runoff model parameters, however, the absence of a robust set of modeling tools for identifying and quantifying uncertainties in (and correlation between) rainfall-runoff model parameters represents a significant gap in current hydrological modeling research. Here, we build upon a series of recent publications promoting novel Bayesian and probabilistic modeling strategies for quantifying rainfall-runoff model parameter estimation uncertainty. Our approach applies alternative measures of rainfall-runoff model parameter joint likelihood (including Nash-Sutcliffe efficiency, among others) to simulate samples from the joint parameter posterior probability density function. We then use these correlated samples as response variables in a Bayesian hierarchical model with land use coverage data as predictor variables in order to develop a robust land use-based tool for forecasting flow in ungauged basins while accounting for, and explicitly acknowledging, parameter estimation uncertainty. We apply this modeling strategy to low-relief coastal watersheds of Eastern North Carolina, an area representative of coastal resource waters throughout the world because of its sensitive embayments and because of the abundant (but currently threatened) natural resources it hosts. Consequently, this area is the subject of several ongoing studies and large-scale planning initiatives, including those conducted through the United
AIC, BIC, Bayesian evidence against the interacting dark energy model
Energy Technology Data Exchange (ETDEWEB)
Szydlowski, Marek [Jagiellonian University, Astronomical Observatory, Krakow (Poland); Jagiellonian University, Mark Kac Complex Systems Research Centre, Krakow (Poland); Krawiec, Adam [Jagiellonian University, Institute of Economics, Finance and Management, Krakow (Poland); Jagiellonian University, Mark Kac Complex Systems Research Centre, Krakow (Poland); Kurek, Aleksandra [Jagiellonian University, Astronomical Observatory, Krakow (Poland); Kamionka, Michal [University of Wroclaw, Astronomical Institute, Wroclaw (Poland)
2015-01-01
Recent astronomical observations have indicated that the Universe is in a phase of accelerated expansion. While there are many cosmological models which try to explain this phenomenon, we focus on the interacting ΛCDM model where an interaction between the dark energy and dark matter sectors takes place. This model is compared to its simpler alternative - the ΛCDM model. To choose between these models the likelihood ratio test was applied as well as the model comparison methods (employing Occam's principle): the Akaike information criterion (AIC), the Bayesian information criterion (BIC) and the Bayesian evidence. Using the current astronomical data: type Ia supernova (Union2.1), h(z), baryon acoustic oscillation, the Alcock- Paczynski test, and the cosmic microwave background data, we evaluated both models. The analyses based on the AIC indicated that there is less support for the interacting ΛCDM model when compared to the ΛCDM model, while those based on the BIC indicated that there is strong evidence against it in favor of the ΛCDM model. Given the weak or almost non-existing support for the interacting ΛCDM model and bearing in mind Occam's razor we are inclined to reject this model. (orig.)
AIC, BIC, Bayesian evidence against the interacting dark energy model
Energy Technology Data Exchange (ETDEWEB)
Szydłowski, Marek, E-mail: marek.szydlowski@uj.edu.pl [Astronomical Observatory, Jagiellonian University, Orla 171, 30-244, Kraków (Poland); Mark Kac Complex Systems Research Centre, Jagiellonian University, Reymonta 4, 30-059, Kraków (Poland); Krawiec, Adam, E-mail: adam.krawiec@uj.edu.pl [Institute of Economics, Finance and Management, Jagiellonian University, Łojasiewicza 4, 30-348, Kraków (Poland); Mark Kac Complex Systems Research Centre, Jagiellonian University, Reymonta 4, 30-059, Kraków (Poland); Kurek, Aleksandra, E-mail: alex@oa.uj.edu.pl [Astronomical Observatory, Jagiellonian University, Orla 171, 30-244, Kraków (Poland); Kamionka, Michał, E-mail: kamionka@astro.uni.wroc.pl [Astronomical Institute, University of Wrocław, ul. Kopernika 11, 51-622, Wrocław (Poland)
2015-01-14
Recent astronomical observations have indicated that the Universe is in a phase of accelerated expansion. While there are many cosmological models which try to explain this phenomenon, we focus on the interacting ΛCDM model where an interaction between the dark energy and dark matter sectors takes place. This model is compared to its simpler alternative—the ΛCDM model. To choose between these models the likelihood ratio test was applied as well as the model comparison methods (employing Occam’s principle): the Akaike information criterion (AIC), the Bayesian information criterion (BIC) and the Bayesian evidence. Using the current astronomical data: type Ia supernova (Union2.1), h(z), baryon acoustic oscillation, the Alcock–Paczynski test, and the cosmic microwave background data, we evaluated both models. The analyses based on the AIC indicated that there is less support for the interacting ΛCDM model when compared to the ΛCDM model, while those based on the BIC indicated that there is strong evidence against it in favor of the ΛCDM model. Given the weak or almost non-existing support for the interacting ΛCDM model and bearing in mind Occam’s razor we are inclined to reject this model.
AIC, BIC, Bayesian evidence against the interacting dark energy model
International Nuclear Information System (INIS)
Szydlowski, Marek; Krawiec, Adam; Kurek, Aleksandra; Kamionka, Michal
2015-01-01
Recent astronomical observations have indicated that the Universe is in a phase of accelerated expansion. While there are many cosmological models which try to explain this phenomenon, we focus on the interacting ΛCDM model where an interaction between the dark energy and dark matter sectors takes place. This model is compared to its simpler alternative - the ΛCDM model. To choose between these models the likelihood ratio test was applied as well as the model comparison methods (employing Occam's principle): the Akaike information criterion (AIC), the Bayesian information criterion (BIC) and the Bayesian evidence. Using the current astronomical data: type Ia supernova (Union2.1), h(z), baryon acoustic oscillation, the Alcock- Paczynski test, and the cosmic microwave background data, we evaluated both models. The analyses based on the AIC indicated that there is less support for the interacting ΛCDM model when compared to the ΛCDM model, while those based on the BIC indicated that there is strong evidence against it in favor of the ΛCDM model. Given the weak or almost non-existing support for the interacting ΛCDM model and bearing in mind Occam's razor we are inclined to reject this model. (orig.)
Bayesian Modeling of MPSS Data: Gene Expression Analysis of Bovine Salmonella Infection
Dhavala, Soma S.
2010-09-01
Massively Parallel Signature Sequencing (MPSS) is a high-throughput, counting-based technology available for gene expression profiling. It produces output that is similar to Serial Analysis of Gene Expression and is ideal for building complex relational databases for gene expression. Our goal is to compare the in vivo global gene expression profiles of tissues infected with different strains of Salmonella obtained using the MPSS technology. In this article, we develop an exact ANOVA type model for this count data using a zero-inflatedPoisson distribution, different from existing methods that assume continuous densities. We adopt two Bayesian hierarchical models-one parametric and the other semiparametric with a Dirichlet process prior that has the ability to "borrow strength" across related signatures, where a signature is a specific arrangement of the nucleotides, usually 16-21 base pairs long. We utilize the discreteness of Dirichlet process prior to cluster signatures that exhibit similar differential expression profiles. Tests for differential expression are carried out using nonparametric approaches, while controlling the false discovery rate. We identify several differentially expressed genes that have important biological significance and conclude with a summary of the biological discoveries. This article has supplementary materials online. © 2010 American Statistical Association.
Bayesian Modeling of MPSS Data: Gene Expression Analysis of Bovine Salmonella Infection
Dhavala, Soma S.; Datta, Sujay; Mallick, Bani K.; Carroll, Raymond J.; Khare, Sangeeta; Lawhon, Sara D.; Adams, L. Garry
2010-01-01
Massively Parallel Signature Sequencing (MPSS) is a high-throughput, counting-based technology available for gene expression profiling. It produces output that is similar to Serial Analysis of Gene Expression and is ideal for building complex relational databases for gene expression. Our goal is to compare the in vivo global gene expression profiles of tissues infected with different strains of Salmonella obtained using the MPSS technology. In this article, we develop an exact ANOVA type model for this count data using a zero-inflatedPoisson distribution, different from existing methods that assume continuous densities. We adopt two Bayesian hierarchical models-one parametric and the other semiparametric with a Dirichlet process prior that has the ability to "borrow strength" across related signatures, where a signature is a specific arrangement of the nucleotides, usually 16-21 base pairs long. We utilize the discreteness of Dirichlet process prior to cluster signatures that exhibit similar differential expression profiles. Tests for differential expression are carried out using nonparametric approaches, while controlling the false discovery rate. We identify several differentially expressed genes that have important biological significance and conclude with a summary of the biological discoveries. This article has supplementary materials online. © 2010 American Statistical Association.
Towards port sustainability through probabilistic models: Bayesian networks
Directory of Open Access Journals (Sweden)
B. Molina
2018-04-01
Full Text Available It is necessary that a manager of an infrastructure knows relations between variables. Using Bayesian networks, variables can be classified, predicted and diagnosed, being able to estimate posterior probability of the unknown ones based on known ones. The proposed methodology has generated a database with port variables, which have been classified as economic, social, environmental and institutional, as addressed in of smart ports studies made in all Spanish Port System. Network has been developed using an acyclic directed graph, which have let us know relationships in terms of parents and sons. In probabilistic terms, it can be concluded from the constructed network that the most decisive variables for port sustainability are those that are part of the institutional dimension. It has been concluded that Bayesian networks allow modeling uncertainty probabilistically even when the number of variables is high as it occurs in port planning and exploitation.
Bayesian geostatistical modeling of leishmaniasis incidence in Brazil.
Directory of Open Access Journals (Sweden)
Dimitrios-Alexios Karagiannis-Voules
Full Text Available BACKGROUND: Leishmaniasis is endemic in 98 countries with an estimated 350 million people at risk and approximately 2 million cases annually. Brazil is one of the most severely affected countries. METHODOLOGY: We applied Bayesian geostatistical negative binomial models to analyze reported incidence data of cutaneous and visceral leishmaniasis in Brazil covering a 10-year period (2001-2010. Particular emphasis was placed on spatial and temporal patterns. The models were fitted using integrated nested Laplace approximations to perform fast approximate Bayesian inference. Bayesian variable selection was employed to determine the most important climatic, environmental, and socioeconomic predictors of cutaneous and visceral leishmaniasis. PRINCIPAL FINDINGS: For both types of leishmaniasis, precipitation and socioeconomic proxies were identified as important risk factors. The predicted number of cases in 2010 were 30,189 (standard deviation [SD]: 7,676 for cutaneous leishmaniasis and 4,889 (SD: 288 for visceral leishmaniasis. Our risk maps predicted the highest numbers of infected people in the states of Minas Gerais and Pará for visceral and cutaneous leishmaniasis, respectively. CONCLUSIONS/SIGNIFICANCE: Our spatially explicit, high-resolution incidence maps identified priority areas where leishmaniasis control efforts should be targeted with the ultimate goal to reduce disease incidence.
International Nuclear Information System (INIS)
Frepoli, Cesare; Oriani, Luca
2006-01-01
In recent years, non-parametric or order statistics methods have been widely used to assess the impact of the uncertainties within Best-Estimate LOCA evaluation models. The bounding of the uncertainties is achieved with a direct Monte Carlo sampling of the uncertainty attributes, with the minimum trial number selected to 'stabilize' the estimation of the critical output values (peak cladding temperature (PCT), local maximum oxidation (LMO), and core-wide oxidation (CWO A non-parametric order statistics uncertainty analysis was recently implemented within the Westinghouse Realistic Large Break LOCA evaluation model, also referred to as 'Automated Statistical Treatment of Uncertainty Method' (ASTRUM). The implementation or interpretation of order statistics in safety analysis is not fully consistent within the industry. This has led to an extensive public debate among regulators and researchers which can be found in the open literature. The USNRC-approved Westinghouse method follows a rigorous implementation of the order statistics theory, which leads to the execution of 124 simulations within a Large Break LOCA analysis. This is a solid approach which guarantees that a bounding value (at 95% probability) of the 95 th percentile for each of the three 10 CFR 50.46 ECCS design acceptance criteria (PCT, LMO and CWO) is obtained. The objective of this paper is to provide additional insights on the ASTRUM statistical approach, with a more in-depth analysis of pros and cons of the order statistics and of the Westinghouse approach in the implementation of this statistical methodology. (authors)
Bayesian dynamic mediation analysis.
Huang, Jing; Yuan, Ying
2017-12-01
Most existing methods for mediation analysis assume that mediation is a stationary, time-invariant process, which overlooks the inherently dynamic nature of many human psychological processes and behavioral activities. In this article, we consider mediation as a dynamic process that continuously changes over time. We propose Bayesian multilevel time-varying coefficient models to describe and estimate such dynamic mediation effects. By taking the nonparametric penalized spline approach, the proposed method is flexible and able to accommodate any shape of the relationship between time and mediation effects. Simulation studies show that the proposed method works well and faithfully reflects the true nature of the mediation process. By modeling mediation effect nonparametrically as a continuous function of time, our method provides a valuable tool to help researchers obtain a more complete understanding of the dynamic nature of the mediation process underlying psychological and behavioral phenomena. We also briefly discuss an alternative approach of using dynamic autoregressive mediation model to estimate the dynamic mediation effect. The computer code is provided to implement the proposed Bayesian dynamic mediation analysis. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Dynamic model based on Bayesian method for energy security assessment
International Nuclear Information System (INIS)
Augutis, Juozas; Krikštolaitis, Ričardas; Pečiulytė, Sigita; Žutautaitė, Inga
2015-01-01
Highlights: • Methodology for dynamic indicator model construction and forecasting of indicators. • Application of dynamic indicator model for energy system development scenarios. • Expert judgement involvement using Bayesian method. - Abstract: The methodology for the dynamic indicator model construction and forecasting of indicators for the assessment of energy security level is presented in this article. An indicator is a special index, which provides numerical values to important factors for the investigated area. In real life, models of different processes take into account various factors that are time-dependent and dependent on each other. Thus, it is advisable to construct a dynamic model in order to describe these dependences. The energy security indicators are used as factors in the dynamic model. Usually, the values of indicators are obtained from statistical data. The developed dynamic model enables to forecast indicators’ variation taking into account changes in system configuration. The energy system development is usually based on a new object construction. Since the parameters of changes of the new system are not exactly known, information about their influences on indicators could not be involved in the model by deterministic methods. Thus, dynamic indicators’ model based on historical data is adjusted by probabilistic model with the influence of new factors on indicators using the Bayesian method
Rate-optimal Bayesian intensity smoothing for inhomogeneous Poisson processes
Belitser, E.N.; Serra, P.; van Zanten, H.
2015-01-01
We apply nonparametric Bayesian methods to study the problem of estimating the intensity function of an inhomogeneous Poisson process. To motivate our results we start by analyzing count data coming from a call center which we model as a Poisson process. This analysis is carried out using a certain
Bayesian model discrimination for glucose-insulin homeostasis
DEFF Research Database (Denmark)
Andersen, Kim Emil; Brooks, Stephen P.; Højbjerre, Malene
In this paper we analyse a set of experimental data on a number of healthy and diabetic patients and discuss a variety of models for describing the physiological processes involved in glucose absorption and insulin secretion within the human body. We adopt a Bayesian approach which facilitates...... as parameter uncertainty. Markov chain Monte Carlo methods are used, combining Metropolis Hastings, reversible jump and simulated tempering updates to provide rapidly mixing chains so as to provide robust inference. We demonstrate the methodology for both healthy and type II diabetic populations concluding...... that whilst both populations are well modelled by a common insulin model, their glucose dynamics differ considerably....
Bayesian modeling to paired comparison data via the Pareto distribution
Directory of Open Access Journals (Sweden)
Nasir Abbas
2017-12-01
Full Text Available A probabilistic approach to build models for paired comparison experiments based on the comparison of two Pareto variables is considered. Analysis of the proposed model is carried out in classical as well as Bayesian frameworks. Informative and uninformative priors are employed to accommodate the prior information. Simulation study is conducted to assess the suitablily and performance of the model under theoretical conditions. Appropriateness of fit of the is also carried out. Entire inferential procedure is illustrated by comparing certain cricket teams using real dataset.
Bayesian inference and model comparison for metallic fatigue data
Babuška, Ivo
2016-02-23
In this work, we present a statistical treatment of stress-life (S-N) data drawn from a collection of records of fatigue experiments that were performed on 75S-T6 aluminum alloys. Our main objective is to predict the fatigue life of materials by providing a systematic approach to model calibration, model selection and model ranking with reference to S-N data. To this purpose, we consider fatigue-limit models and random fatigue-limit models that are specially designed to allow the treatment of the run-outs (right-censored data). We first fit the models to the data by maximum likelihood methods and estimate the quantiles of the life distribution of the alloy specimen. To assess the robustness of the estimation of the quantile functions, we obtain bootstrap confidence bands by stratified resampling with respect to the cycle ratio. We then compare and rank the models by classical measures of fit based on information criteria. We also consider a Bayesian approach that provides, under the prior distribution of the model parameters selected by the user, their simulation-based posterior distributions. We implement and apply Bayesian model comparison methods, such as Bayes factor ranking and predictive information criteria based on cross-validation techniques under various a priori scenarios.
Bayesian inference and model comparison for metallic fatigue data
Babuška, Ivo; Sawlan, Zaid A; Scavino, Marco; Szabó , Barna; Tempone, Raul
2016-01-01
In this work, we present a statistical treatment of stress-life (S-N) data drawn from a collection of records of fatigue experiments that were performed on 75S-T6 aluminum alloys. Our main objective is to predict the fatigue life of materials by providing a systematic approach to model calibration, model selection and model ranking with reference to S-N data. To this purpose, we consider fatigue-limit models and random fatigue-limit models that are specially designed to allow the treatment of the run-outs (right-censored data). We first fit the models to the data by maximum likelihood methods and estimate the quantiles of the life distribution of the alloy specimen. To assess the robustness of the estimation of the quantile functions, we obtain bootstrap confidence bands by stratified resampling with respect to the cycle ratio. We then compare and rank the models by classical measures of fit based on information criteria. We also consider a Bayesian approach that provides, under the prior distribution of the model parameters selected by the user, their simulation-based posterior distributions. We implement and apply Bayesian model comparison methods, such as Bayes factor ranking and predictive information criteria based on cross-validation techniques under various a priori scenarios.
Predicting coastal cliff erosion using a Bayesian probabilistic model
Hapke, Cheryl J.; Plant, Nathaniel G.
2010-01-01
Regional coastal cliff retreat is difficult to model due to the episodic nature of failures and the along-shore variability of retreat events. There is a growing demand, however, for predictive models that can be used to forecast areas vulnerable to coastal erosion hazards. Increasingly, probabilistic models are being employed that require data sets of high temporal density to define the joint probability density function that relates forcing variables (e.g. wave conditions) and initial conditions (e.g. cliff geometry) to erosion events. In this study we use a multi-parameter Bayesian network to investigate correlations between key variables that control and influence variations in cliff retreat processes. The network uses Bayesian statistical methods to estimate event probabilities using existing observations. Within this framework, we forecast the spatial distribution of cliff retreat along two stretches of cliffed coast in Southern California. The input parameters are the height and slope of the cliff, a descriptor of material strength based on the dominant cliff-forming lithology, and the long-term cliff erosion rate that represents prior behavior. The model is forced using predicted wave impact hours. Results demonstrate that the Bayesian approach is well-suited to the forward modeling of coastal cliff retreat, with the correct outcomes forecast in 70–90% of the modeled transects. The model also performs well in identifying specific locations of high cliff erosion, thus providing a foundation for hazard mapping. This approach can be employed to predict cliff erosion at time-scales ranging from storm events to the impacts of sea-level rise at the century-scale.
Bayesian Kernel Mixtures for Counts.
Canale, Antonio; Dunson, David B
2011-12-01
Although Bayesian nonparametric mixture models for continuous data are well developed, there is a limited literature on related approaches for count data. A common strategy is to use a mixture of Poissons, which unfortunately is quite restrictive in not accounting for distributions having variance less than the mean. Other approaches include mixing multinomials, which requires finite support, and using a Dirichlet process prior with a Poisson base measure, which does not allow smooth deviations from the Poisson. As a broad class of alternative models, we propose to use nonparametric mixtures of rounded continuous kernels. An efficient Gibbs sampler is developed for posterior computation, and a simulation study is performed to assess performance. Focusing on the rounded Gaussian case, we generalize the modeling framework to account for multivariate count data, joint modeling with continuous and categorical variables, and other complications. The methods are illustrated through applications to a developmental toxicity study and marketing data. This article has supplementary material online.
A Bayesian approach for quantification of model uncertainty
International Nuclear Information System (INIS)
Park, Inseok; Amarchinta, Hemanth K.; Grandhi, Ramana V.
2010-01-01
In most engineering problems, more than one model can be created to represent an engineering system's behavior. Uncertainty is inevitably involved in selecting the best model from among the models that are possible. Uncertainty in model selection cannot be ignored, especially when the differences between the predictions of competing models are significant. In this research, a methodology is proposed to quantify model uncertainty using measured differences between experimental data and model outcomes under a Bayesian statistical framework. The adjustment factor approach is used to propagate model uncertainty into prediction of a system response. A nonlinear vibration system is used to demonstrate the processes for implementing the adjustment factor approach. Finally, the methodology is applied on the engineering benefits of a laser peening process, and a confidence band for residual stresses is established to indicate the reliability of model prediction.
Energy Technology Data Exchange (ETDEWEB)
Xu, Bin [School of Statistics, Jiangxi University of Finance and Economics, Nanchang, Jiangxi 330013 (China); Research Center of Applied Statistics, Jiangxi University of Finance and Economics, Nanchang, Jiangxi 330013 (China); Lin, Boqiang, E-mail: bqlin@xmu.edu.cn [Collaborative Innovation Center for Energy Economics and Energy Policy, China Institute for Studies in Energy Policy, Xiamen University, Xiamen, Fujian 361005 (China)
2017-03-15
China is currently the world's largest carbon dioxide (CO{sub 2}) emitter. Moreover, total energy consumption and CO{sub 2} emissions in China will continue to increase due to the rapid growth of industrialization and urbanization. Therefore, vigorously developing the high–tech industry becomes an inevitable choice to reduce CO{sub 2} emissions at the moment or in the future. However, ignoring the existing nonlinear links between economic variables, most scholars use traditional linear models to explore the impact of the high–tech industry on CO{sub 2} emissions from an aggregate perspective. Few studies have focused on nonlinear relationships and regional differences in China. Based on panel data of 1998–2014, this study uses the nonparametric additive regression model to explore the nonlinear effect of the high–tech industry from a regional perspective. The estimated results show that the residual sum of squares (SSR) of the nonparametric additive regression model in the eastern, central and western regions are 0.693, 0.054 and 0.085 respectively, which are much less those that of the traditional linear regression model (3.158, 4.227 and 7.196). This verifies that the nonparametric additive regression model has a better fitting effect. Specifically, the high–tech industry produces an inverted “U–shaped” nonlinear impact on CO{sub 2} emissions in the eastern region, but a positive “U–shaped” nonlinear effect in the central and western regions. Therefore, the nonlinear impact of the high–tech industry on CO{sub 2} emissions in the three regions should be given adequate attention in developing effective abatement policies. - Highlights: • The nonlinear effect of the high–tech industry on CO{sub 2} emissions was investigated. • The high–tech industry yields an inverted “U–shaped” effect in the eastern region. • The high–tech industry has a positive “U–shaped” nonlinear effect in other regions. • The linear impact
Two Bayesian tests of the GLOMOsys Model.
Field, Sarahanne M; Wagenmakers, Eric-Jan; Newell, Ben R; Zeelenberg, René; van Ravenzwaaij, Don
2016-12-01
Priming is arguably one of the key phenomena in contemporary social psychology. Recent retractions and failed replication attempts have led to a division in the field between proponents and skeptics and have reinforced the importance of confirming certain priming effects through replication. In this study, we describe the results of 2 preregistered replication attempts of 1 experiment by Förster and Denzler (2012). In both experiments, participants first processed letters either globally or locally, then were tested using a typicality rating task. Bayes factor hypothesis tests were conducted for both experiments: Experiment 1 (N = 100) yielded an indecisive Bayes factor of 1.38, indicating that the in-lab data are 1.38 times more likely to have occurred under the null hypothesis than under the alternative. Experiment 2 (N = 908) yielded a Bayes factor of 10.84, indicating strong support for the null hypothesis that global priming does not affect participants' mean typicality ratings. The failure to replicate this priming effect challenges existing support for the GLOMO sys model. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Modeling Women's Menstrual Cycles using PICI Gates in Bayesian Network.
Zagorecki, Adam; Łupińska-Dubicka, Anna; Voortman, Mark; Druzdzel, Marek J
2016-03-01
A major difficulty in building Bayesian network (BN) models is the size of conditional probability tables, which grow exponentially in the number of parents. One way of dealing with this problem is through parametric conditional probability distributions that usually require only a number of parameters that is linear in the number of parents. In this paper, we introduce a new class of parametric models, the Probabilistic Independence of Causal Influences (PICI) models, that aim at lowering the number of parameters required to specify local probability distributions, but are still capable of efficiently modeling a variety of interactions. A subset of PICI models is decomposable and this leads to significantly faster inference as compared to models that cannot be decomposed. We present an application of the proposed method to learning dynamic BNs for modeling a woman's menstrual cycle. We show that PICI models are especially useful for parameter learning from small data sets and lead to higher parameter accuracy than when learning CPTs.
Robust Bayesian Experimental Design for Conceptual Model Discrimination
Pham, H. V.; Tsai, F. T. C.
2015-12-01
A robust Bayesian optimal experimental design under uncertainty is presented to provide firm information for model discrimination, given the least number of pumping wells and observation wells. Firm information is the maximum information of a system can be guaranteed from an experimental design. The design is based on the Box-Hill expected entropy decrease (EED) before and after the experiment design and the Bayesian model averaging (BMA) framework. A max-min programming is introduced to choose the robust design that maximizes the minimal Box-Hill EED subject to that the highest expected posterior model probability satisfies a desired probability threshold. The EED is calculated by the Gauss-Hermite quadrature. The BMA method is used to predict future observations and to quantify future observation uncertainty arising from conceptual and parametric uncertainties in calculating EED. Monte Carlo approach is adopted to quantify the uncertainty in the posterior model probabilities. The optimal experimental design is tested by a synthetic 5-layer anisotropic confined aquifer. Nine conceptual groundwater models are constructed due to uncertain geological architecture and boundary condition. High-performance computing is used to enumerate all possible design solutions in order to identify the most plausible groundwater model. Results highlight the impacts of scedasticity in future observation data as well as uncertainty sources on potential pumping and observation locations.
Macroeconomic Forecasts in Models with Bayesian Averaging of Classical Estimates
Directory of Open Access Journals (Sweden)
Piotr Białowolski
2012-03-01
Full Text Available The aim of this paper is to construct a forecasting model oriented on predicting basic macroeconomic variables, namely: the GDP growth rate, the unemployment rate, and the consumer price inflation. In order to select the set of the best regressors, Bayesian Averaging of Classical Estimators (BACE is employed. The models are atheoretical (i.e. they do not reflect causal relationships postulated by the macroeconomic theory and the role of regressors is played by business and consumer tendency survey-based indicators. Additionally, survey-based indicators are included with a lag that enables to forecast the variables of interest (GDP, unemployment, and inflation for the four forthcoming quarters without the need to make any additional assumptions concerning the values of predictor variables in the forecast period. Bayesian Averaging of Classical Estimators is a method allowing for full and controlled overview of all econometric models which can be obtained out of a particular set of regressors. In this paper authors describe the method of generating a family of econometric models and the procedure for selection of a final forecasting model. Verification of the procedure is performed by means of out-of-sample forecasts of main economic variables for the quarters of 2011. The accuracy of the forecasts implies that there is still a need to search for new solutions in the atheoretical modelling.
Non-parametric early seizure detection in an animal model of temporal lobe epilepsy
Talathi, Sachin S.; Hwang, Dong-Uk; Spano, Mark L.; Simonotto, Jennifer; Furman, Michael D.; Myers, Stephen M.; Winters, Jason T.; Ditto, William L.; Carney, Paul R.
2008-03-01
The performance of five non-parametric, univariate seizure detection schemes (embedding delay, Hurst scale, wavelet scale, nonlinear autocorrelation and variance energy) were evaluated as a function of the sampling rate of EEG recordings, the electrode types used for EEG acquisition, and the spatial location of the EEG electrodes in order to determine the applicability of the measures in real-time closed-loop seizure intervention. The criteria chosen for evaluating the performance were high statistical robustness (as determined through the sensitivity and the specificity of a given measure in detecting a seizure) and the lag in seizure detection with respect to the seizure onset time (as determined by visual inspection of the EEG signal by a trained epileptologist). An optimality index was designed to evaluate the overall performance of each measure. For the EEG data recorded with microwire electrode array at a sampling rate of 12 kHz, the wavelet scale measure exhibited better overall performance in terms of its ability to detect a seizure with high optimality index value and high statistics in terms of sensitivity and specificity.
Modeling operational risks of the nuclear industry with Bayesian networks
International Nuclear Information System (INIS)
Wieland, Patricia; Lustosa, Leonardo J.
2009-01-01
Basically, planning a new industrial plant requires information on the industrial management, regulations, site selection, definition of initial and planned capacity, and on the estimation of the potential demand. However, this is far from enough to assure the success of an industrial enterprise. Unexpected and extremely damaging events may occur that deviates from the original plan. The so-called operational risks are not only in the system, equipment, process or human (technical or managerial) failures. They are also in intentional events such as frauds and sabotage, or extreme events like terrorist attacks or radiological accidents and even on public reaction to perceived environmental or future generation impacts. For the nuclear industry, it is a challenge to identify and to assess the operational risks and their various sources. Early identification of operational risks can help in preparing contingency plans, to delay the decision to invest or to approve a project that can, at an extreme, affect the public perception of the nuclear energy. A major problem in modeling operational risk losses is the lack of internal data that are essential, for example, to apply the loss distribution approach. As an alternative, methods that consider qualitative and subjective information can be applied, for example, fuzzy logic, neural networks, system dynamic or Bayesian networks. An advantage of applying Bayesian networks to model operational risk is the possibility to include expert opinions and variables of interest, to structure the model via causal dependencies among these variables, and to specify subjective prior and conditional probabilities distributions at each step or network node. This paper suggests a classification of operational risks in industry and discusses the benefits and obstacles of the Bayesian networks approach to model those risks. (author)
Modeling operational risks of the nuclear industry with Bayesian networks
Energy Technology Data Exchange (ETDEWEB)
Wieland, Patricia [Pontificia Univ. Catolica do Rio de Janeiro (PUC-Rio), RJ (Brazil). Dept. de Engenharia Industrial; Comissao Nacional de Energia Nuclear (CNEN), Rio de Janeiro, RJ (Brazil)], e-mail: pwieland@cnen.gov.br; Lustosa, Leonardo J. [Pontificia Univ. Catolica do Rio de Janeiro (PUC-Rio), RJ (Brazil). Dept. de Engenharia Industrial], e-mail: ljl@puc-rio.br
2009-07-01
Basically, planning a new industrial plant requires information on the industrial management, regulations, site selection, definition of initial and planned capacity, and on the estimation of the potential demand. However, this is far from enough to assure the success of an industrial enterprise. Unexpected and extremely damaging events may occur that deviates from the original plan. The so-called operational risks are not only in the system, equipment, process or human (technical or managerial) failures. They are also in intentional events such as frauds and sabotage, or extreme events like terrorist attacks or radiological accidents and even on public reaction to perceived environmental or future generation impacts. For the nuclear industry, it is a challenge to identify and to assess the operational risks and their various sources. Early identification of operational risks can help in preparing contingency plans, to delay the decision to invest or to approve a project that can, at an extreme, affect the public perception of the nuclear energy. A major problem in modeling operational risk losses is the lack of internal data that are essential, for example, to apply the loss distribution approach. As an alternative, methods that consider qualitative and subjective information can be applied, for example, fuzzy logic, neural networks, system dynamic or Bayesian networks. An advantage of applying Bayesian networks to model operational risk is the possibility to include expert opinions and variables of interest, to structure the model via causal dependencies among these variables, and to specify subjective prior and conditional probabilities distributions at each step or network node. This paper suggests a classification of operational risks in industry and discusses the benefits and obstacles of the Bayesian networks approach to model those risks. (author)
Bayesian modeling of the mass and density of asteroids
Dotson, Jessie L.; Mathias, Donovan
2017-10-01
Mass and density are two of the fundamental properties of any object. In the case of near earth asteroids, knowledge about the mass of an asteroid is essential for estimating the risk due to (potential) impact and planning possible mitigation options. The density of an asteroid can illuminate the structure of the asteroid. A low density can be indicative of a rubble pile structure whereas a higher density can imply a monolith and/or higher metal content. The damage resulting from an impact of an asteroid with Earth depends on its interior structure in addition to its total mass, and as a result, density is a key parameter to understanding the risk of asteroid impact. Unfortunately, measuring the mass and density of asteroids is challenging and often results in measurements with large uncertainties. In the absence of mass / density measurements for a specific object, understanding the range and distribution of likely values can facilitate probabilistic assessments of structure and impact risk. Hierarchical Bayesian models have recently been developed to investigate the mass - radius relationship of exoplanets (Wolfgang, Rogers & Ford 2016) and to probabilistically forecast the mass of bodies large enough to establish hydrostatic equilibrium over a range of 9 orders of magnitude in mass (from planemos to main sequence stars; Chen & Kipping 2017). Here, we extend this approach to investigate the mass and densities of asteroids. Several candidate Bayesian models are presented, and their performance is assessed relative to a synthetic asteroid population. In addition, a preliminary Bayesian model for probablistically forecasting masses and densities of asteroids is presented. The forecasting model is conditioned on existing asteroid data and includes observational errors, hyper-parameter uncertainties and intrinsic scatter.
Bayesian analysis for uncertainty estimation of a canopy transpiration model
Samanta, S.; Mackay, D. S.; Clayton, M. K.; Kruger, E. L.; Ewers, B. E.
2007-04-01
A Bayesian approach was used to fit a conceptual transpiration model to half-hourly transpiration rates for a sugar maple (Acer saccharum) stand collected over a 5-month period and probabilistically estimate its parameter and prediction uncertainties. The model used the Penman-Monteith equation with the Jarvis model for canopy conductance. This deterministic model was extended by adding a normally distributed error term. This extension enabled using Markov chain Monte Carlo simulations to sample the posterior parameter distributions. The residuals revealed approximate conformance to the assumption of normally distributed errors. However, minor systematic structures in the residuals at fine timescales suggested model changes that would potentially improve the modeling of transpiration. Results also indicated considerable uncertainties in the parameter and transpiration estimates. This simple methodology of uncertainty analysis would facilitate the deductive step during the development cycle of deterministic conceptual models by accounting for these uncertainties while drawing inferences from data.
Kaplan, David; Lee, Chansoon
2018-01-01
This article provides a review of Bayesian model averaging as a means of optimizing the predictive performance of common statistical models applied to large-scale educational assessments. The Bayesian framework recognizes that in addition to parameter uncertainty, there is uncertainty in the choice of models themselves. A Bayesian approach to addressing the problem of model uncertainty is the method of Bayesian model averaging. Bayesian model averaging searches the space of possible models for a set of submodels that satisfy certain scientific principles and then averages the coefficients across these submodels weighted by each model's posterior model probability (PMP). Using the weighted coefficients for prediction has been shown to yield optimal predictive performance according to certain scoring rules. We demonstrate the utility of Bayesian model averaging for prediction in education research with three examples: Bayesian regression analysis, Bayesian logistic regression, and a recently developed approach for Bayesian structural equation modeling. In each case, the model-averaged estimates are shown to yield better prediction of the outcome of interest than any submodel based on predictive coverage and the log-score rule. Implications for the design of large-scale assessments when the goal is optimal prediction in a policy context are discussed.
A study of finite mixture model: Bayesian approach on financial time series data
Phoong, Seuk-Yen; Ismail, Mohd Tahir
2014-07-01
Recently, statistician have emphasized on the fitting finite mixture model by using Bayesian method. Finite mixture model is a mixture of distributions in modeling a statistical distribution meanwhile Bayesian method is a statistical method that use to fit the mixture model. Bayesian method is being used widely because it has asymptotic properties which provide remarkable result. In addition, Bayesian method also shows consistency characteristic which means the parameter estimates are close to the predictive distributions. In the present paper, the number of components for mixture model is studied by using Bayesian Information Criterion. Identify the number of component is important because it may lead to an invalid result. Later, the Bayesian method is utilized to fit the k-component mixture model in order to explore the relationship between rubber price and stock market price for Malaysia, Thailand, Philippines and Indonesia. Lastly, the results showed that there is a negative effect among rubber price and stock market price for all selected countries.
Directory of Open Access Journals (Sweden)
Hashem Salarzadeh Jenatabadi
2016-11-01
Full Text Available There are many factors which could influence the sustainability of airlines. The main purpose of this study is to introduce a framework for a financial sustainability index and model it based on structural equation modeling (SEM with maximum likelihood and Bayesian predictors. The introduced framework includes economic performance, operational performance, cost performance, and financial performance. Based on both Bayesian SEM (Bayesian-SEM and Classical SEM (Classical-SEM, it was found that economic performance with both operational performance and cost performance are significantly related to the financial performance index. The four mathematical indices employed are root mean square error, coefficient of determination, mean absolute error, and mean absolute percentage error to compare the efficiency of Bayesian-SEM and Classical-SEM in predicting the airline financial performance. The outputs confirmed that the framework with Bayesian prediction delivered a good fit with the data, although the framework predicted with a Classical-SEM approach did not prepare a well-fitting model. The reasons for this discrepancy between Classical and Bayesian predictions, as well as the potential advantages and caveats with the application of Bayesian approach in airline sustainability studies, are debated.
From qualitative reasoning models to Bayesian-based learner modeling
Milošević, U.; Bredeweg, B.; de Kleer, J.; Forbus, K.D.
2010-01-01
Assessing the knowledge of a student is a fundamental part of intelligent learning environments. We present a Bayesian network based approach to dealing with uncertainty when estimating a learner’s state of knowledge in the context of Qualitative Reasoning (QR). A proposal for a global architecture
Development of a cyber security risk model using Bayesian networks
International Nuclear Information System (INIS)
Shin, Jinsoo; Son, Hanseong; Khalil ur, Rahman; Heo, Gyunyoung
2015-01-01
Cyber security is an emerging safety issue in the nuclear industry, especially in the instrumentation and control (I and C) field. To address the cyber security issue systematically, a model that can be used for cyber security evaluation is required. In this work, a cyber security risk model based on a Bayesian network is suggested for evaluating cyber security for nuclear facilities in an integrated manner. The suggested model enables the evaluation of both the procedural and technical aspects of cyber security, which are related to compliance with regulatory guides and system architectures, respectively. The activity-quality analysis model was developed to evaluate how well people and/or organizations comply with the regulatory guidance associated with cyber security. The architecture analysis model was created to evaluate vulnerabilities and mitigation measures with respect to their effect on cyber security. The two models are integrated into a single model, which is called the cyber security risk model, so that cyber security can be evaluated from procedural and technical viewpoints at the same time. The model was applied to evaluate the cyber security risk of the reactor protection system (RPS) of a research reactor and to demonstrate its usefulness and feasibility. - Highlights: • We developed the cyber security risk model can be find the weak point of cyber security integrated two cyber analysis models by using Bayesian Network. • One is the activity-quality model signifies how people and/or organization comply with the cyber security regulatory guide. • Other is the architecture model represents the probability of cyber-attack on RPS architecture. • The cyber security risk model can provide evidence that is able to determine the key element for cyber security for RPS of a research reactor
Prior Sensitivity Analysis in Default Bayesian Structural Equation Modeling.
van Erp, Sara; Mulder, Joris; Oberski, Daniel L
2017-11-27
Bayesian structural equation modeling (BSEM) has recently gained popularity because it enables researchers to fit complex models and solve some of the issues often encountered in classical maximum likelihood estimation, such as nonconvergence and inadmissible solutions. An important component of any Bayesian analysis is the prior distribution of the unknown model parameters. Often, researchers rely on default priors, which are constructed in an automatic fashion without requiring substantive prior information. However, the prior can have a serious influence on the estimation of the model parameters, which affects the mean squared error, bias, coverage rates, and quantiles of the estimates. In this article, we investigate the performance of three different default priors: noninformative improper priors, vague proper priors, and empirical Bayes priors-with the latter being novel in the BSEM literature. Based on a simulation study, we find that these three default BSEM methods may perform very differently, especially with small samples. A careful prior sensitivity analysis is therefore needed when performing a default BSEM analysis. For this purpose, we provide a practical step-by-step guide for practitioners to conducting a prior sensitivity analysis in default BSEM. Our recommendations are illustrated using a well-known case study from the structural equation modeling literature, and all code for conducting the prior sensitivity analysis is available in the online supplemental materials. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Experimental validation of a Bayesian model of visual acuity.
LENUS (Irish Health Repository)
Dalimier, Eugénie
2009-01-01
Based on standard procedures used in optometry clinics, we compare measurements of visual acuity for 10 subjects (11 eyes tested) in the presence of natural ocular aberrations and different degrees of induced defocus, with the predictions given by a Bayesian model customized with aberrometric data of the eye. The absolute predictions of the model, without any adjustment, show good agreement with the experimental data, in terms of correlation and absolute error. The efficiency of the model is discussed in comparison with image quality metrics and other customized visual process models. An analysis of the importance and customization of each stage of the model is also given; it stresses the potential high predictive power from precise modeling of ocular and neural transfer functions.
Assessing global vegetation activity using spatio-temporal Bayesian modelling
Mulder, Vera L.; van Eck, Christel M.; Friedlingstein, Pierre; Regnier, Pierre A. G.
2016-04-01
This work demonstrates the potential of modelling vegetation activity using a hierarchical Bayesian spatio-temporal model. This approach allows modelling changes in vegetation and climate simultaneous in space and time. Changes of vegetation activity such as phenology are modelled as a dynamic process depending on climate variability in both space and time. Additionally, differences in observed vegetation status can be contributed to other abiotic ecosystem properties, e.g. soil and terrain properties. Although these properties do not change in time, they do change in space and may provide valuable information in addition to the climate dynamics. The spatio-temporal Bayesian models were calibrated at a regional scale because the local trends in space and time can be better captured by the model. The regional subsets were defined according to the SREX segmentation, as defined by the IPCC. Each region is considered being relatively homogeneous in terms of large-scale climate and biomes, still capturing small-scale (grid-cell level) variability. Modelling within these regions is hence expected to be less uncertain due to the absence of these large-scale patterns, compared to a global approach. This overall modelling approach allows the comparison of model behavior for the different regions and may provide insights on the main dynamic processes driving the interaction between vegetation and climate within different regions. The data employed in this study encompasses the global datasets for soil properties (SoilGrids), terrain properties (Global Relief Model based on SRTM DEM and ETOPO), monthly time series of satellite-derived vegetation indices (GIMMS NDVI3g) and climate variables (Princeton Meteorological Forcing Dataset). The findings proved the potential of a spatio-temporal Bayesian modelling approach for assessing vegetation dynamics, at a regional scale. The observed interrelationships of the employed data and the different spatial and temporal trends support
Theory-based Bayesian models of inductive learning and reasoning.
Tenenbaum, Joshua B; Griffiths, Thomas L; Kemp, Charles
2006-07-01
Inductive inference allows humans to make powerful generalizations from sparse data when learning about word meanings, unobserved properties, causal relationships, and many other aspects of the world. Traditional accounts of induction emphasize either the power of statistical learning, or the importance of strong constraints from structured domain knowledge, intuitive theories or schemas. We argue that both components are necessary to explain the nature, use and acquisition of human knowledge, and we introduce a theory-based Bayesian framework for modeling inductive learning and reasoning as statistical inferences over structured knowledge representations.
Factors affecting GEBV accuracy with single-step Bayesian models.
Zhou, Lei; Mrode, Raphael; Zhang, Shengli; Zhang, Qin; Li, Bugao; Liu, Jian-Feng
2018-01-01
A single-step approach to obtain genomic prediction was first proposed in 2009. Many studies have investigated the components of GEBV accuracy in genomic selection. However, it is still unclear how the population structure and the relationships between training and validation populations influence GEBV accuracy in terms of single-step analysis. Here, we explored the components of GEBV accuracy in single-step Bayesian analysis with a simulation study. Three scenarios with various numbers of QTL (5, 50, and 500) were simulated. Three models were implemented to analyze the simulated data: single-step genomic best linear unbiased prediction (GBLUP; SSGBLUP), single-step BayesA (SS-BayesA), and single-step BayesB (SS-BayesB). According to our results, GEBV accuracy was influenced by the relationships between the training and validation populations more significantly for ungenotyped animals than for genotyped animals. SS-BayesA/BayesB showed an obvious advantage over SSGBLUP with the scenarios of 5 and 50 QTL. SS-BayesB model obtained the lowest accuracy with the 500 QTL in the simulation. SS-BayesA model was the most efficient and robust considering all QTL scenarios. Generally, both the relationships between training and validation populations and LD between markers and QTL contributed to GEBV accuracy in the single-step analysis, and the advantages of single-step Bayesian models were more apparent when the trait is controlled by fewer QTL.
Bayesian Sensitivity Analysis of Statistical Models with Missing Data.
Zhu, Hongtu; Ibrahim, Joseph G; Tang, Niansheng
2014-04-01
Methods for handling missing data depend strongly on the mechanism that generated the missing values, such as missing completely at random (MCAR) or missing at random (MAR), as well as other distributional and modeling assumptions at various stages. It is well known that the resulting estimates and tests may be sensitive to these assumptions as well as to outlying observations. In this paper, we introduce various perturbations to modeling assumptions and individual observations, and then develop a formal sensitivity analysis to assess these perturbations in the Bayesian analysis of statistical models with missing data. We develop a geometric framework, called the Bayesian perturbation manifold, to characterize the intrinsic structure of these perturbations. We propose several intrinsic influence measures to perform sensitivity analysis and quantify the effect of various perturbations to statistical models. We use the proposed sensitivity analysis procedure to systematically investigate the tenability of the non-ignorable missing at random (NMAR) assumption. Simulation studies are conducted to evaluate our methods, and a dataset is analyzed to illustrate the use of our diagnostic measures.
Approximate Bayesian computation for forward modeling in cosmology
International Nuclear Information System (INIS)
Akeret, Joël; Refregier, Alexandre; Amara, Adam; Seehars, Sebastian; Hasner, Caspar
2015-01-01
Bayesian inference is often used in cosmology and astrophysics to derive constraints on model parameters from observations. This approach relies on the ability to compute the likelihood of the data given a choice of model parameters. In many practical situations, the likelihood function may however be unavailable or intractable due to non-gaussian errors, non-linear measurements processes, or complex data formats such as catalogs and maps. In these cases, the simulation of mock data sets can often be made through forward modeling. We discuss how Approximate Bayesian Computation (ABC) can be used in these cases to derive an approximation to the posterior constraints using simulated data sets. This technique relies on the sampling of the parameter set, a distance metric to quantify the difference between the observation and the simulations and summary statistics to compress the information in the data. We first review the principles of ABC and discuss its implementation using a Population Monte-Carlo (PMC) algorithm and the Mahalanobis distance metric. We test the performance of the implementation using a Gaussian toy model. We then apply the ABC technique to the practical case of the calibration of image simulations for wide field cosmological surveys. We find that the ABC analysis is able to provide reliable parameter constraints for this problem and is therefore a promising technique for other applications in cosmology and astrophysics. Our implementation of the ABC PMC method is made available via a public code release
On-line Bayesian model updating for structural health monitoring
Rocchetta, Roberto; Broggi, Matteo; Huchet, Quentin; Patelli, Edoardo
2018-03-01
Fatigue induced cracks is a dangerous failure mechanism which affects mechanical components subject to alternating load cycles. System health monitoring should be adopted to identify cracks which can jeopardise the structure. Real-time damage detection may fail in the identification of the cracks due to different sources of uncertainty which have been poorly assessed or even fully neglected. In this paper, a novel efficient and robust procedure is used for the detection of cracks locations and lengths in mechanical components. A Bayesian model updating framework is employed, which allows accounting for relevant sources of uncertainty. The idea underpinning the approach is to identify the most probable crack consistent with the experimental measurements. To tackle the computational cost of the Bayesian approach an emulator is adopted for replacing the computationally costly Finite Element model. To improve the overall robustness of the procedure, different numerical likelihoods, measurement noises and imprecision in the value of model parameters are analysed and their effects quantified. The accuracy of the stochastic updating and the efficiency of the numerical procedure are discussed. An experimental aluminium frame and on a numerical model of a typical car suspension arm are used to demonstrate the applicability of the approach.
Bayesian statistic methods and theri application in probabilistic simulation models
Directory of Open Access Journals (Sweden)
Sergio Iannazzo
2007-03-01
Full Text Available Bayesian statistic methods are facing a rapidly growing level of interest and acceptance in the field of health economics. The reasons of this success are probably to be found on the theoretical fundaments of the discipline that make these techniques more appealing to decision analysis. To this point should be added the modern IT progress that has developed different flexible and powerful statistical software framework. Among them probably one of the most noticeably is the BUGS language project and its standalone application for MS Windows WinBUGS. Scope of this paper is to introduce the subject and to show some interesting applications of WinBUGS in developing complex economical models based on Markov chains. The advantages of this approach reside on the elegance of the code produced and in its capability to easily develop probabilistic simulations. Moreover an example of the integration of bayesian inference models in a Markov model is shown. This last feature let the analyst conduce statistical analyses on the available sources of evidence and exploit them directly as inputs in the economic model.
Bayesian calibration of power plant models for accurate performance prediction
International Nuclear Information System (INIS)
Boksteen, Sowande Z.; Buijtenen, Jos P. van; Pecnik, Rene; Vecht, Dick van der
2014-01-01
Highlights: • Bayesian calibration is applied to power plant performance prediction. • Measurements from a plant in operation are used for model calibration. • A gas turbine performance model and steam cycle model are calibrated. • An integrated plant model is derived. • Part load efficiency is accurately predicted as a function of ambient conditions. - Abstract: Gas turbine combined cycles are expected to play an increasingly important role in the balancing of supply and demand in future energy markets. Thermodynamic modeling of these energy systems is frequently applied to assist in decision making processes related to the management of plant operation and maintenance. In most cases, model inputs, parameters and outputs are treated as deterministic quantities and plant operators make decisions with limited or no regard of uncertainties. As the steady integration of wind and solar energy into the energy market induces extra uncertainties, part load operation and reliability are becoming increasingly important. In the current study, methods are proposed to not only quantify various types of uncertainties in measurements and plant model parameters using measured data, but to also assess their effect on various aspects of performance prediction. The authors aim to account for model parameter and measurement uncertainty, and for systematic discrepancy of models with respect to reality. For this purpose, the Bayesian calibration framework of Kennedy and O’Hagan is used, which is especially suitable for high-dimensional industrial problems. The article derives a calibrated model of the plant efficiency as a function of ambient conditions and operational parameters, which is also accurate in part load. The article shows that complete statistical modeling of power plants not only enhances process models, but can also increases confidence in operational decisions
Nonparametric inference of network structure and dynamics
Peixoto, Tiago P.
The network structure of complex systems determine their function and serve as evidence for the evolutionary mechanisms that lie behind them. Despite considerable effort in recent years, it remains an open challenge to formulate general descriptions of the large-scale structure of network systems, and how to reliably extract such information from data. Although many approaches have been proposed, few methods attempt to gauge the statistical significance of the uncovered structures, and hence the majority cannot reliably separate actual structure from stochastic fluctuations. Due to the sheer size and high-dimensionality of many networks, this represents a major limitation that prevents meaningful interpretations of the results obtained with such nonstatistical methods. In this talk, I will show how these issues can be tackled in a principled and efficient fashion by formulating appropriate generative models of network structure that can have their parameters inferred from data. By employing a Bayesian description of such models, the inference can be performed in a nonparametric fashion, that does not require any a priori knowledge or ad hoc assumptions about the data. I will show how this approach can be used to perform model comparison, and how hierarchical models yield the most appropriate trade-off between model complexity and quality of fit based on the statistical evidence present in the data. I will also show how this general approach can be elegantly extended to networks with edge attributes, that are embedded in latent spaces, and that change in time. The latter is obtained via a fully dynamic generative network model, based on arbitrary-order Markov chains, that can also be inferred in a nonparametric fashion. Throughout the talk I will illustrate the application of the methods with many empirical networks such as the internet at the autonomous systems level, the global airport network, the network of actors and films, social networks, citations among
Model Selection in Historical Research Using Approximate Bayesian Computation
Rubio-Campillo, Xavier
2016-01-01
Formal Models and History Computational models are increasingly being used to study historical dynamics. This new trend, which could be named Model-Based History, makes use of recently published datasets and innovative quantitative methods to improve our understanding of past societies based on their written sources. The extensive use of formal models allows historians to re-evaluate hypotheses formulated decades ago and still subject to debate due to the lack of an adequate quantitative framework. The initiative has the potential to transform the discipline if it solves the challenges posed by the study of historical dynamics. These difficulties are based on the complexities of modelling social interaction, and the methodological issues raised by the evaluation of formal models against data with low sample size, high variance and strong fragmentation. Case Study This work examines an alternate approach to this evaluation based on a Bayesian-inspired model selection method. The validity of the classical Lanchester’s laws of combat is examined against a dataset comprising over a thousand battles spanning 300 years. Four variations of the basic equations are discussed, including the three most common formulations (linear, squared, and logarithmic) and a new variant introducing fatigue. Approximate Bayesian Computation is then used to infer both parameter values and model selection via Bayes Factors. Impact Results indicate decisive evidence favouring the new fatigue model. The interpretation of both parameter estimations and model selection provides new insights into the factors guiding the evolution of warfare. At a methodological level, the case study shows how model selection methods can be used to guide historical research through the comparison between existing hypotheses and empirical evidence. PMID:26730953
An Active Lattice Model in a Bayesian Framework
DEFF Research Database (Denmark)
Carstensen, Jens Michael
1996-01-01
A Markov Random Field is used as a structural model of a deformable rectangular lattice. When used as a template prior in a Bayesian framework this model is powerful for making inferences about lattice structures in images. The model assigns maximum probability to the perfect regular lattice...... by penalizing deviations in alignment and lattice node distance. The Markov random field represents prior knowledge about the lattice structure, and through an observation model that incorporates the visual appearance of the nodes, we can simulate realizations from the posterior distribution. A maximum...... a posteriori (MAP) estimate, found by simulated annealing, is used as the reconstructed lattice. The model was developed as a central part of an algorithm for automatic analylsis of genetic experiments, positioned in a lattice structure by a robot. The algorithm has been successfully applied to many images...
Bayesian approach to errors-in-variables in regression models
Rozliman, Nur Aainaa; Ibrahim, Adriana Irawati Nur; Yunus, Rossita Mohammad
2017-05-01
In many applications and experiments, data sets are often contaminated with error or mismeasured covariates. When at least one of the covariates in a model is measured with error, Errors-in-Variables (EIV) model can be used. Measurement error, when not corrected, would cause misleading statistical inferences and analysis. Therefore, our goal is to examine the relationship of the outcome variable and the unobserved exposure variable given the observed mismeasured surrogate by applying the Bayesian formulation to the EIV model. We shall extend the flexible parametric method proposed by Hossain and Gustafson (2009) to another nonlinear regression model which is the Poisson regression model. We shall then illustrate the application of this approach via a simulation study using Markov chain Monte Carlo sampling methods.
Uncovering Transcriptional Regulatory Networks by Sparse Bayesian Factor Model
Directory of Open Access Journals (Sweden)
Qi Yuan(Alan
2010-01-01
Full Text Available Abstract The problem of uncovering transcriptional regulation by transcription factors (TFs based on microarray data is considered. A novel Bayesian sparse correlated rectified factor model (BSCRFM is proposed that models the unknown TF protein level activity, the correlated regulations between TFs, and the sparse nature of TF-regulated genes. The model admits prior knowledge from existing database regarding TF-regulated target genes based on a sparse prior and through a developed Gibbs sampling algorithm, a context-specific transcriptional regulatory network specific to the experimental condition of the microarray data can be obtained. The proposed model and the Gibbs sampling algorithm were evaluated on the simulated systems, and results demonstrated the validity and effectiveness of the proposed approach. The proposed model was then applied to the breast cancer microarray data of patients with Estrogen Receptor positive ( status and Estrogen Receptor negative ( status, respectively.
MODELING INFORMATION SYSTEM AVAILABILITY BY USING BAYESIAN BELIEF NETWORK APPROACH
Directory of Open Access Journals (Sweden)
Semir Ibrahimović
2016-03-01
Full Text Available Modern information systems are expected to be always-on by providing services to end-users, regardless of time and location. This is particularly important for organizations and industries where information systems support real-time operations and mission-critical applications that need to be available on 24 7 365 basis. Examples of such entities include process industries, telecommunications, healthcare, energy, banking, electronic commerce and a variety of cloud services. This article presents a modified Bayesian Belief Network model for predicting information system availability, introduced initially by Franke, U. and Johnson, P. (in article “Availability of enterprise IT systems – an expert based Bayesian model”. Software Quality Journal 20(2, 369-394, 2012 based on a thorough review of several dimensions of the information system availability, we proposed a modified set of determinants. The model is parameterized by using probability elicitation process with the participation of experts from the financial sector of Bosnia and Herzegovina. The model validation was performed using Monte Carlo simulation.
Hunink, Maria; Bult, J.R.; De Vries, J; Weinstein, MC
1998-01-01
Purpose. To illustrate the use of a nonparametric bootstrap method in the evaluation of uncertainty in decision models analyzing cost-effectiveness. Methods. The authors reevaluated a previously published cost-effectiveness analysis that used a Markov model comparing initial percutaneous
10th Conference on Bayesian Nonparametrics
2016-05-08
RETURN YOUR FORM TO THE ABOVE ADDRESS. North Carolina State University 2701 Sullivan Drive Admin Srvcs III, Box 7514 Raleigh, NC 27695 -7514 ABSTRACT...the conference. The findings from the conference is widely disseminated. The conference web site displays slides of the talks presented in the...being published by the Electronic Journal of Statistics consisting of about 20 papers read at the conference. The conference web site displays
Bayesian Model Comparison With the g-Prior
DEFF Research Database (Denmark)
Nielsen, Jesper Kjær; Christensen, Mads Græsbøll; Cemgil, Ali Taylan
2014-01-01
’s asymptotic MAP rule was an improvement, and in this paper we extend the work by Djuric in several ways. Speciﬁcally, we consider the elicitation of proper prior distributions, treat the case of real- and complex-valued data simultaneously in a Bayesian framework similar to that considered by Djuric......, and develop new model selection rules for a regression model containing both linear and non-linear parameters. Moreover, we use this framework to give a new interpretation of the popular information criteria and relate their performance to the signal-to-noise ratio of the data. By use of simulations, we also...... demonstrate that our proposed model comparison and selection rules outperform the traditional information criteria both in terms of detecting the true model and in terms of predicting unobserved data. The simulation code is available online....
Recursive Bayesian recurrent neural networks for time-series modeling.
Mirikitani, Derrick T; Nikolaev, Nikolay
2010-02-01
This paper develops a probabilistic approach to recursive second-order training of recurrent neural networks (RNNs) for improved time-series modeling. A general recursive Bayesian Levenberg-Marquardt algorithm is derived to sequentially update the weights and the covariance (Hessian) matrix. The main strengths of the approach are a principled handling of the regularization hyperparameters that leads to better generalization, and stable numerical performance. The framework involves the adaptation of a noise hyperparameter and local weight prior hyperparameters, which represent the noise in the data and the uncertainties in the model parameters. Experimental investigations using artificial and real-world data sets show that RNNs equipped with the proposed approach outperform standard real-time recurrent learning and extended Kalman training algorithms for recurrent networks, as well as other contemporary nonlinear neural models, on time-series modeling.
Bayesian Dose-Response Modeling in Sparse Data
Kim, Steven B.
This book discusses Bayesian dose-response modeling in small samples applied to two different settings. The first setting is early phase clinical trials, and the second setting is toxicology studies in cancer risk assessment. In early phase clinical trials, experimental units are humans who are actual patients. Prior to a clinical trial, opinions from multiple subject area experts are generally more informative than the opinion of a single expert, but we may face a dilemma when they have disagreeing prior opinions. In this regard, we consider compromising the disagreement and compare two different approaches for making a decision. In addition to combining multiple opinions, we also address balancing two levels of ethics in early phase clinical trials. The first level is individual-level ethics which reflects the perspective of trial participants. The second level is population-level ethics which reflects the perspective of future patients. We extensively compare two existing statistical methods which focus on each perspective and propose a new method which balances the two conflicting perspectives. In toxicology studies, experimental units are living animals. Here we focus on a potential non-monotonic dose-response relationship which is known as hormesis. Briefly, hormesis is a phenomenon which can be characterized by a beneficial effect at low doses and a harmful effect at high doses. In cancer risk assessments, the estimation of a parameter, which is known as a benchmark dose, can be highly sensitive to a class of assumptions, monotonicity or hormesis. In this regard, we propose a robust approach which considers both monotonicity and hormesis as a possibility. In addition, We discuss statistical hypothesis testing for hormesis and consider various experimental designs for detecting hormesis based on Bayesian decision theory. Past experiments have not been optimally designed for testing for hormesis, and some Bayesian optimal designs may not be optimal under a
Bayesian uncertainty analysis with applications to turbulence modeling
International Nuclear Information System (INIS)
Cheung, Sai Hung; Oliver, Todd A.; Prudencio, Ernesto E.; Prudhomme, Serge; Moser, Robert D.
2011-01-01
In this paper, we apply Bayesian uncertainty quantification techniques to the processes of calibrating complex mathematical models and predicting quantities of interest (QoI's) with such models. These techniques also enable the systematic comparison of competing model classes. The processes of calibration and comparison constitute the building blocks of a larger validation process, the goal of which is to accept or reject a given mathematical model for the prediction of a particular QoI for a particular scenario. In this work, we take the first step in this process by applying the methodology to the analysis of the Spalart-Allmaras turbulence model in the context of incompressible, boundary layer flows. Three competing model classes based on the Spalart-Allmaras model are formulated, calibrated against experimental data, and used to issue a prediction with quantified uncertainty. The model classes are compared in terms of their posterior probabilities and their prediction of QoI's. The model posterior probability represents the relative plausibility of a model class given the data. Thus, it incorporates the model's ability to fit experimental observations. Alternatively, comparing models using the predicted QoI connects the process to the needs of decision makers that use the results of the model. We show that by using both the model plausibility and predicted QoI, one has the opportunity to reject some model classes after calibration, before subjecting the remaining classes to additional validation challenges.
Optimal inference with suboptimal models: Addiction and active Bayesian inference
Schwartenbeck, Philipp; FitzGerald, Thomas H.B.; Mathys, Christoph; Dolan, Ray; Wurst, Friedrich; Kronbichler, Martin; Friston, Karl
2015-01-01
When casting behaviour as active (Bayesian) inference, optimal inference is defined with respect to an agent’s beliefs – based on its generative model of the world. This contrasts with normative accounts of choice behaviour, in which optimal actions are considered in relation to the true structure of the environment – as opposed to the agent’s beliefs about worldly states (or the task). This distinction shifts an understanding of suboptimal or pathological behaviour away from aberrant inference as such, to understanding the prior beliefs of a subject that cause them to behave less ‘optimally’ than our prior beliefs suggest they should behave. Put simply, suboptimal or pathological behaviour does not speak against understanding behaviour in terms of (Bayes optimal) inference, but rather calls for a more refined understanding of the subject’s generative model upon which their (optimal) Bayesian inference is based. Here, we discuss this fundamental distinction and its implications for understanding optimality, bounded rationality and pathological (choice) behaviour. We illustrate our argument using addictive choice behaviour in a recently described ‘limited offer’ task. Our simulations of pathological choices and addictive behaviour also generate some clear hypotheses, which we hope to pursue in ongoing empirical work. PMID:25561321
Bayesian energy landscape tilting: towards concordant models of molecular ensembles.
Beauchamp, Kyle A; Pande, Vijay S; Das, Rhiju
2014-03-18
Predicting biological structure has remained challenging for systems such as disordered proteins that take on myriad conformations. Hybrid simulation/experiment strategies have been undermined by difficulties in evaluating errors from computational model inaccuracies and data uncertainties. Building on recent proposals from maximum entropy theory and nonequilibrium thermodynamics, we address these issues through a Bayesian energy landscape tilting (BELT) scheme for computing Bayesian hyperensembles over conformational ensembles. BELT uses Markov chain Monte Carlo to directly sample maximum-entropy conformational ensembles consistent with a set of input experimental observables. To test this framework, we apply BELT to model trialanine, starting from disagreeing simulations with the force fields ff96, ff99, ff99sbnmr-ildn, CHARMM27, and OPLS-AA. BELT incorporation of limited chemical shift and (3)J measurements gives convergent values of the peptide's α, β, and PPII conformational populations in all cases. As a test of predictive power, all five BELT hyperensembles recover set-aside measurements not used in the fitting and report accurate errors, even when starting from highly inaccurate simulations. BELT's principled framework thus enables practical predictions for complex biomolecular systems from discordant simulations and sparse data. Copyright © 2014 Biophysical Society. Published by Elsevier Inc. All rights reserved.
Sparse linear models: Variational approximate inference and Bayesian experimental design
International Nuclear Information System (INIS)
Seeger, Matthias W
2009-01-01
A wide range of problems such as signal reconstruction, denoising, source separation, feature selection, and graphical model search are addressed today by posterior maximization for linear models with sparsity-favouring prior distributions. The Bayesian posterior contains useful information far beyond its mode, which can be used to drive methods for sampling optimization (active learning), feature relevance ranking, or hyperparameter estimation, if only this representation of uncertainty can be approximated in a tractable manner. In this paper, we review recent results for variational sparse inference, and show that they share underlying computational primitives. We discuss how sampling optimization can be implemented as sequential Bayesian experimental design. While there has been tremendous recent activity to develop sparse estimation, little attendance has been given to sparse approximate inference. In this paper, we argue that many problems in practice, such as compressive sensing for real-world image reconstruction, are served much better by proper uncertainty approximations than by ever more aggressive sparse estimation algorithms. Moreover, since some variational inference methods have been given strong convex optimization characterizations recently, theoretical analysis may become possible, promising new insights into nonlinear experimental design.
Sparse linear models: Variational approximate inference and Bayesian experimental design
Energy Technology Data Exchange (ETDEWEB)
Seeger, Matthias W [Saarland University and Max Planck Institute for Informatics, Campus E1.4, 66123 Saarbruecken (Germany)
2009-12-01
A wide range of problems such as signal reconstruction, denoising, source separation, feature selection, and graphical model search are addressed today by posterior maximization for linear models with sparsity-favouring prior distributions. The Bayesian posterior contains useful information far beyond its mode, which can be used to drive methods for sampling optimization (active learning), feature relevance ranking, or hyperparameter estimation, if only this representation of uncertainty can be approximated in a tractable manner. In this paper, we review recent results for variational sparse inference, and show that they share underlying computational primitives. We discuss how sampling optimization can be implemented as sequential Bayesian experimental design. While there has been tremendous recent activity to develop sparse estimation, little attendance has been given to sparse approximate inference. In this paper, we argue that many problems in practice, such as compressive sensing for real-world image reconstruction, are served much better by proper uncertainty approximations than by ever more aggressive sparse estimation algorithms. Moreover, since some variational inference methods have been given strong convex optimization characterizations recently, theoretical analysis may become possible, promising new insights into nonlinear experimental design.
MATHEMATICAL RISK ANALYSIS: VIA NICHOLAS RISK MODEL AND BAYESIAN ANALYSIS
Directory of Open Access Journals (Sweden)
Anass BAYAGA
2010-07-01
Full Text Available The objective of this second part of a two-phased study was to explorethe predictive power of quantitative risk analysis (QRA method andprocess within Higher Education Institution (HEI. The method and process investigated the use impact analysis via Nicholas risk model and Bayesian analysis, with a sample of hundred (100 risk analysts in a historically black South African University in the greater Eastern Cape Province.The first findings supported and confirmed previous literature (KingIII report, 2009: Nicholas and Steyn, 2008: Stoney, 2007: COSA, 2004 that there was a direct relationship between risk factor, its likelihood and impact, certiris paribus. The second finding in relation to either controlling the likelihood or the impact of occurrence of risk (Nicholas risk model was that to have a brighter risk reward, it was important to control the likelihood ofoccurrence of risks as compared with its impact so to have a direct effect on entire University. On the Bayesian analysis, thus third finding, the impact of risk should be predicted along three aspects. These aspects included the human impact (decisions made, the property impact (students and infrastructural based and the business impact. Lastly, the study revealed that although in most business cases, where as business cycles considerably vary dependingon the industry and or the institution, this study revealed that, most impacts in HEI (University was within the period of one academic.The recommendation was that application of quantitative risk analysisshould be related to current legislative framework that affects HEI.
Bayesian Age-Period-Cohort Model of Lung Cancer Mortality
Directory of Open Access Journals (Sweden)
Bhikhari P. Tharu
2015-09-01
Full Text Available Background The objective of this study was to analyze the time trend for lung cancer mortality in the population of the USA by 5 years based on most recent available data namely to 2010. The knowledge of the mortality rates in the temporal trends is necessary to understand cancer burden.Methods Bayesian Age-Period-Cohort model was fitted using Poisson regression with histogram smoothing prior to decompose mortality rates based on age at death, period at death, and birth-cohort.Results Mortality rates from lung cancer increased more rapidly from age 52 years. It ended up to 325 deaths annually for 82 years on average. The mortality of younger cohorts was lower than older cohorts. The risk of lung cancer was lowered from period 1993 to recent periods.Conclusions The fitted Bayesian Age-Period-Cohort model with histogram smoothing prior is capable of explaining mortality rate of lung cancer. The reduction in carcinogens in cigarettes and increase in smoking cessation from around 1960 might led to decreasing trend of lung cancer mortality after calendar period 1993.
Bayesian modeling of recombination events in bacterial populations
Directory of Open Access Journals (Sweden)
Dowson Chris
2008-10-01
Full Text Available Abstract Background We consider the discovery of recombinant segments jointly with their origins within multilocus DNA sequences from bacteria representing heterogeneous populations of fairly closely related species. The currently available methods for recombination detection capable of probabilistic characterization of uncertainty have a limited applicability in practice as the number of strains in a data set increases. Results We introduce a Bayesian spatial structural model representing the continuum of origins over sites within the observed sequences, including a probabilistic characterization of uncertainty related to the origin of any particular site. To enable a statistically accurate and practically feasible approach to the analysis of large-scale data sets representing a single genus, we have developed a novel software tool (BRAT, Bayesian Recombination Tracker implementing the model and the corresponding learning algorithm, which is capable of identifying the posterior optimal structure and to estimate the marginal posterior probabilities of putative origins over the sites. Conclusion A multitude of challenging simulation scenarios and an analysis of real data from seven housekeeping genes of 120 strains of genus Burkholderia are used to illustrate the possibilities offered by our approach. The software is freely available for download at URL http://web.abo.fi/fak/mnf//mate/jc/software/brat.html.
Li, Lu; Xu, Chong-Yu; Engeland, Kolbjørn
2013-04-01
SummaryWith respect to model calibration, parameter estimation and analysis of uncertainty sources, various regression and probabilistic approaches are used in hydrological modeling. A family of Bayesian methods, which incorporates different sources of information into a single analysis through Bayes' theorem, is widely used for uncertainty assessment. However, none of these approaches can well treat the impact of high flows in hydrological modeling. This study proposes a Bayesian modularization uncertainty assessment approach in which the highest streamflow observations are treated as suspect information that should not influence the inference of the main bulk of the model parameters. This study includes a comprehensive comparison and evaluation of uncertainty assessments by our new Bayesian modularization method and standard Bayesian methods using the Metropolis-Hastings (MH) algorithm with the daily hydrological model WASMOD. Three likelihood functions were used in combination with standard Bayesian method: the AR(1) plus Normal model independent of time (Model 1), the AR(1) plus Normal model dependent on time (Model 2) and the AR(1) plus Multi-normal model (Model 3). The results reveal that the Bayesian modularization method provides the most accurate streamflow estimates measured by the Nash-Sutcliffe efficiency and provide the best in uncertainty estimates for low, medium and entire flows compared to standard Bayesian methods. The study thus provides a new approach for reducing the impact of high flows on the discharge uncertainty assessment of hydrological models via Bayesian method.
Application of Bayesian Model Selection for Metal Yield Models using ALEGRA and Dakota.
Energy Technology Data Exchange (ETDEWEB)
Portone, Teresa; Niederhaus, John Henry; Sanchez, Jason James; Swiler, Laura Painton
2018-02-01
This report introduces the concepts of Bayesian model selection, which provides a systematic means of calibrating and selecting an optimal model to represent a phenomenon. This has many potential applications, including for comparing constitutive models. The ideas described herein are applied to a model selection problem between different yield models for hardened steel under extreme loading conditions.
Aggregated Residential Load Modeling Using Dynamic Bayesian Networks
Energy Technology Data Exchange (ETDEWEB)
Vlachopoulou, Maria; Chin, George; Fuller, Jason C.; Lu, Shuai
2014-09-28
Abstract—It is already obvious that the future power grid will have to address higher demand for power and energy, and to incorporate renewable resources of different energy generation patterns. Demand response (DR) schemes could successfully be used to manage and balance power supply and demand under operating conditions of the future power grid. To achieve that, more advanced tools for DR management of operations and planning are necessary that can estimate the available capacity from DR resources. In this research, a Dynamic Bayesian Network (DBN) is derived, trained, and tested that can model aggregated load of Heating, Ventilation, and Air Conditioning (HVAC) systems. DBNs can provide flexible and powerful tools for both operations and planing, due to their unique analytical capabilities. The DBN model accuracy and flexibility of use is demonstrated by testing the model under different operational scenarios.
A Bayesian Analysis of Unobserved Component Models Using Ox
Directory of Open Access Journals (Sweden)
Charles S. Bos
2011-05-01
Full Text Available This article details a Bayesian analysis of the Nile river flow data, using a similar state space model as other articles in this volume. For this data set, Metropolis-Hastings and Gibbs sampling algorithms are implemented in the programming language Ox. These Markov chain Monte Carlo methods only provide output conditioned upon the full data set. For filtered output, conditioning only on past observations, the particle filter is introduced. The sampling methods are flexible, and this advantage is used to extend the model to incorporate a stochastic volatility process. The volatility changes both in the Nile data and also in daily S&P 500 return data are investigated. The posterior density of parameters and states is found to provide information on which elements of the model are easily identifiable, and which elements are estimated with less precision.
Model-based dispersive wave processing: A recursive Bayesian solution
International Nuclear Information System (INIS)
Candy, J.V.; Chambers, D.H.
1999-01-01
Wave propagation through dispersive media represents a significant problem in many acoustic applications, especially in ocean acoustics, seismology, and nondestructive evaluation. In this paper we propose a propagation model that can easily represent many classes of dispersive waves and proceed to develop the model-based solution to the wave processing problem. It is shown that the underlying wave system is nonlinear and time-variable requiring a recursive processor. Thus the general solution to the model-based dispersive wave enhancement problem is developed using a Bayesian maximum a posteriori (MAP) approach and shown to lead to the recursive, nonlinear extended Kalman filter (EKF) processor. The problem of internal wave estimation is cast within this framework. The specific processor is developed and applied to data synthesized by a sophisticated simulator demonstrating the feasibility of this approach. copyright 1999 Acoustical Society of America.
Advances in Bayesian Model Based Clustering Using Particle Learning
Energy Technology Data Exchange (ETDEWEB)
Merl, D M
2009-11-19
Recent work by Carvalho, Johannes, Lopes and Polson and Carvalho, Lopes, Polson and Taddy introduced a sequential Monte Carlo (SMC) alternative to traditional iterative Monte Carlo strategies (e.g. MCMC and EM) for Bayesian inference for a large class of dynamic models. The basis of SMC techniques involves representing the underlying inference problem as one of state space estimation, thus giving way to inference via particle filtering. The key insight of Carvalho et al was to construct the sequence of filtering distributions so as to make use of the posterior predictive distribution of the observable, a distribution usually only accessible in certain Bayesian settings. Access to this distribution allows a reversal of the usual propagate and resample steps characteristic of many SMC methods, thereby alleviating to a large extent many problems associated with particle degeneration. Furthermore, Carvalho et al point out that for many conjugate models the posterior distribution of the static variables can be parametrized in terms of [recursively defined] sufficient statistics of the previously observed data. For models where such sufficient statistics exist, particle learning as it is being called, is especially well suited for the analysis of streaming data do to the relative invariance of its algorithmic complexity with the number of data observations. Through a particle learning approach, a statistical model can be fit to data as the data is arriving, allowing at any instant during the observation process direct quantification of uncertainty surrounding underlying model parameters. Here we describe the use of a particle learning approach for fitting a standard Bayesian semiparametric mixture model as described in Carvalho, Lopes, Polson and Taddy. In Section 2 we briefly review the previously presented particle learning algorithm for the case of a Dirichlet process mixture of multivariate normals. In Section 3 we describe several novel extensions to the original
A contingency table approach to nonparametric testing
Rayner, JCW
2000-01-01
Most texts on nonparametric techniques concentrate on location and linear-linear (correlation) tests, with less emphasis on dispersion effects and linear-quadratic tests. Tests for higher moment effects are virtually ignored. Using a fresh approach, A Contingency Table Approach to Nonparametric Testing unifies and extends the popular, standard tests by linking them to tests based on models for data that can be presented in contingency tables.This approach unifies popular nonparametric statistical inference and makes the traditional, most commonly performed nonparametric analyses much more comp
Road network safety evaluation using Bayesian hierarchical joint model.
Wang, Jie; Huang, Helai
2016-05-01
Safety and efficiency are commonly regarded as two significant performance indicators of transportation systems. In practice, road network planning has focused on road capacity and transport efficiency whereas the safety level of a road network has received little attention in the planning stage. This study develops a Bayesian hierarchical joint model for road network safety evaluation to help planners take traffic safety into account when planning a road network. The proposed model establishes relationships between road network risk and micro-level variables related to road entities and traffic volume, as well as socioeconomic, trip generation and network density variables at macro level which are generally used for long term transportation plans. In addition, network spatial correlation between intersections and their connected road segments is also considered in the model. A road network is elaborately selected in order to compare the proposed hierarchical joint model with a previous joint model and a negative binomial model. According to the results of the model comparison, the hierarchical joint model outperforms the joint model and negative binomial model in terms of the goodness-of-fit and predictive performance, which indicates the reasonableness of considering the hierarchical data structure in crash prediction and analysis. Moreover, both random effects at the TAZ level and the spatial correlation between intersections and their adjacent segments are found to be significant, supporting the employment of the hierarchical joint model as an alternative in road-network-level safety modeling as well. Copyright © 2016 Elsevier Ltd. All rights reserved.
Modeling Land-Use Decision Behavior with Bayesian Belief Networks
Directory of Open Access Journals (Sweden)
Inge Aalders
2008-06-01
Full Text Available The ability to incorporate and manage the different drivers of land-use change in a modeling process is one of the key challenges because they are complex and are both quantitative and qualitative in nature. This paper uses Bayesian belief networks (BBN to incorporate characteristics of land managers in the modeling process and to enhance our understanding of land-use change based on the limited and disparate sources of information. One of the two models based on spatial data represented land managers in the form of a quantitative variable, the area of individual holdings, whereas the other model included qualitative data from a survey of land managers. Random samples from the spatial data provided evidence of the relationship between the different variables, which I used to develop the BBN structure. The model was tested for four different posterior probability distributions, and results showed that the trained and learned models are better at predicting land use than the uniform and random models. The inference from the model demonstrated the constraints that biophysical characteristics impose on land managers; for older land managers without heirs, there is a higher probability of the land use being arable agriculture. The results show the benefits of incorporating a more complex notion of land managers in land-use models, and of using different empirical data sources in the modeling process. Future research should focus on incorporating more complex social processes into the modeling structure, as well as incorporating spatio-temporal dynamics in a BBN.
Nonparametric statistical inference
Gibbons, Jean Dickinson
2010-01-01
Overall, this remains a very fine book suitable for a graduate-level course in nonparametric statistics. I recommend it for all people interested in learning the basic ideas of nonparametric statistical inference.-Eugenia Stoimenova, Journal of Applied Statistics, June 2012… one of the best books available for a graduate (or advanced undergraduate) text for a theory course on nonparametric statistics. … a very well-written and organized book on nonparametric statistics, especially useful and recommended for teachers and graduate students.-Biometrics, 67, September 2011This excellently presente
Evaluating Flight Crew Performance by a Bayesian Network Model
Directory of Open Access Journals (Sweden)
Wei Chen
2018-03-01
Full Text Available Flight crew performance is of great significance in keeping flights safe and sound. When evaluating the crew performance, quantitative detailed behavior information may not be available. The present paper introduces the Bayesian Network to perform flight crew performance evaluation, which permits the utilization of multidisciplinary sources of objective and subjective information, despite sparse behavioral data. In this paper, the causal factors are selected based on the analysis of 484 aviation accidents caused by human factors. Then, a network termed Flight Crew Performance Model is constructed. The Delphi technique helps to gather subjective data as a supplement to objective data from accident reports. The conditional probabilities are elicited by the leaky noisy MAX model. Two ways of inference for the BN—probability prediction and probabilistic diagnosis are used and some interesting conclusions are drawn, which could provide data support to make interventions for human error management in aviation safety.
GPU Computing in Bayesian Inference of Realized Stochastic Volatility Model
International Nuclear Information System (INIS)
Takaishi, Tetsuya
2015-01-01
The realized stochastic volatility (RSV) model that utilizes the realized volatility as additional information has been proposed to infer volatility of financial time series. We consider the Bayesian inference of the RSV model by the Hybrid Monte Carlo (HMC) algorithm. The HMC algorithm can be parallelized and thus performed on the GPU for speedup. The GPU code is developed with CUDA Fortran. We compare the computational time in performing the HMC algorithm on GPU (GTX 760) and CPU (Intel i7-4770 3.4GHz) and find that the GPU can be up to 17 times faster than the CPU. We also code the program with OpenACC and find that appropriate coding can achieve the similar speedup with CUDA Fortran
Modelling of population dynamics of red king crab using Bayesian approach
Directory of Open Access Journals (Sweden)
Bakanev Sergey ...
2012-10-01
Modeling population dynamics based on the Bayesian approach enables to successfully resolve the above issues. The integration of the data from various studies into a unified model based on Bayesian parameter estimation method provides a much more detailed description of the processes occurring in the population.
Inconsistency of Bayesian Inference for Misspecified Linear Models, and a Proposal for Repairing It
Grünwald, P.; van Ommen, T.
2017-01-01
We empirically show that Bayesian inference can be inconsistent under misspecification in simple linear regression problems, both in a model averaging/selection and in a Bayesian ridge regression setting. We use the standard linear model, which assumes homoskedasticity, whereas the data are
Dynamic Bayesian Network Modeling of Game Based Diagnostic Assessments. CRESST Report 837
Levy, Roy
2014-01-01
Digital games offer an appealing environment for assessing student proficiencies, including skills and misconceptions in a diagnostic setting. This paper proposes a dynamic Bayesian network modeling approach for observations of student performance from an educational video game. A Bayesian approach to model construction, calibration, and use in…
Inconsistency of Bayesian inference for misspecified linear models, and a proposal for repairing it
P.D. Grünwald (Peter); T. van Ommen (Thijs)
2017-01-01
textabstractWe empirically show that Bayesian inference can be inconsistent under misspecification in simple linear regression problems, both in a model averaging/selection and in a Bayesian ridge regression setting. We use the standard linear model, which assumes homoskedasticity, whereas the data
BDgraph: An R Package for Bayesian Structure Learning in Graphical Models
Mohammadi, A.; Wit, E.C.
2017-01-01
Graphical models provide powerful tools to uncover complicated patterns in multivariate data and are commonly used in Bayesian statistics and machine learning. In this paper, we introduce an R package BDgraph which performs Bayesian structure learning for general undirected graphical models with
Cooper, Richard J; Krueger, Tobias; Hiscock, Kevin M; Rawlins, Barry G
2014-11-01
Mixing models have become increasingly common tools for apportioning fluvial sediment load to various sediment sources across catchments using a wide variety of Bayesian and frequentist modeling approaches. In this study, we demonstrate how different model setups can impact upon resulting source apportionment estimates in a Bayesian framework via a one-factor-at-a-time (OFAT) sensitivity analysis. We formulate 13 versions of a mixing model, each with different error assumptions and model structural choices, and apply them to sediment geochemistry data from the River Blackwater, Norfolk, UK, to apportion suspended particulate matter (SPM) contributions from three sources (arable topsoils, road verges, and subsurface material) under base flow conditions between August 2012 and August 2013. Whilst all 13 models estimate subsurface sources to be the largest contributor of SPM (median ∼76%), comparison of apportionment estimates reveal varying degrees of sensitivity to changing priors, inclusion of covariance terms, incorporation of time-variant distributions, and methods of proportion characterization. We also demonstrate differences in apportionment results between a full and an empirical Bayesian setup, and between a Bayesian and a frequentist optimization approach. This OFAT sensitivity analysis reveals that mixing model structural choices and error assumptions can significantly impact upon sediment source apportionment results, with estimated median contributions in this study varying by up to 21% between model versions. Users of mixing models are therefore strongly advised to carefully consider and justify their choice of model structure prior to conducting sediment source apportionment investigations. An OFAT sensitivity analysis of sediment fingerprinting mixing models is conductedBayesian models display high sensitivity to error assumptions and structural choicesSource apportionment results differ between Bayesian and frequentist approaches.
International Nuclear Information System (INIS)
Das, Shiva K.; Zhou Sumin; Zhang, Junan; Yin, F.-F.; Dewhirst, Mark W.; Marks, Lawrence B.
2007-01-01
Purpose: To develop and test a model to predict for lung radiation-induced Grade 2+ pneumonitis. Methods and Materials: The model was built from a database of 234 lung cancer patients treated with radiotherapy (RT), of whom 43 were diagnosed with pneumonitis. The model augmented the predictive capability of the parametric dose-based Lyman normal tissue complication probability (LNTCP) metric by combining it with weighted nonparametric decision trees that use dose and nondose inputs. The decision trees were sequentially added to the model using a 'boosting' process that enhances the accuracy of prediction. The model's predictive capability was estimated by 10-fold cross-validation. To facilitate dissemination, the cross-validation result was used to extract a simplified approximation to the complicated model architecture created by boosting. Application of the simplified model is demonstrated in two example cases. Results: The area under the model receiver operating characteristics curve for cross-validation was 0.72, a significant improvement over the LNTCP area of 0.63 (p = 0.005). The simplified model used the following variables to output a measure of injury: LNTCP, gender, histologic type, chemotherapy schedule, and treatment schedule. For a given patient RT plan, injury prediction was highest for the combination of pre-RT chemotherapy, once-daily treatment, female gender and lowest for the combination of no pre-RT chemotherapy and nonsquamous cell histologic type. Application of the simplified model to the example cases revealed that injury prediction for a given treatment plan can range from very low to very high, depending on the settings of the nondose variables. Conclusions: Radiation pneumonitis prediction was significantly enhanced by decision trees that added the influence of nondose factors to the LNTCP formulation
A Bayesian Approach for Structural Learning with Hidden Markov Models
Directory of Open Access Journals (Sweden)
Cen Li
2002-01-01
Full Text Available Hidden Markov Models(HMM have proved to be a successful modeling paradigm for dynamic and spatial processes in many domains, such as speech recognition, genomics, and general sequence alignment. Typically, in these applications, the model structures are predefined by domain experts. Therefore, the HMM learning problem focuses on the learning of the parameter values of the model to fit the given data sequences. However, when one considers other domains, such as, economics and physiology, model structure capturing the system dynamic behavior is not available. In order to successfully apply the HMM methodology in these domains, it is important that a mechanism is available for automatically deriving the model structure from the data. This paper presents a HMM learning procedure that simultaneously learns the model structure and the maximum likelihood parameter values of a HMM from data. The HMM model structures are derived based on the Bayesian model selection methodology. In addition, we introduce a new initialization procedure for HMM parameter value estimation based on the K-means clustering method. Experimental results with artificially generated data show the effectiveness of the approach.
Bayesian network models for error detection in radiotherapy plans
International Nuclear Information System (INIS)
Kalet, Alan M; Ford, Eric C; Phillips, Mark H; Gennari, John H
2015-01-01
The purpose of this study is to design and develop a probabilistic network for detecting errors in radiotherapy plans for use at the time of initial plan verification. Our group has initiated a multi-pronged approach to reduce these errors. We report on our development of Bayesian models of radiotherapy plans. Bayesian networks consist of joint probability distributions that define the probability of one event, given some set of other known information. Using the networks, we find the probability of obtaining certain radiotherapy parameters, given a set of initial clinical information. A low probability in a propagated network then corresponds to potential errors to be flagged for investigation. To build our networks we first interviewed medical physicists and other domain experts to identify the relevant radiotherapy concepts and their associated interdependencies and to construct a network topology. Next, to populate the network’s conditional probability tables, we used the Hugin Expert software to learn parameter distributions from a subset of de-identified data derived from a radiation oncology based clinical information database system. These data represent 4990 unique prescription cases over a 5 year period. Under test case scenarios with approximately 1.5% introduced error rates, network performance produced areas under the ROC curve of 0.88, 0.98, and 0.89 for the lung, brain and female breast cancer error detection networks, respectively. Comparison of the brain network to human experts performance (AUC of 0.90 ± 0.01) shows the Bayes network model performs better than domain experts under the same test conditions. Our results demonstrate the feasibility and effectiveness of comprehensive probabilistic models as part of decision support systems for improved detection of errors in initial radiotherapy plan verification procedures. (paper)
Sparse Estimation Using Bayesian Hierarchical Prior Modeling for Real and Complex Linear Models
DEFF Research Database (Denmark)
Pedersen, Niels Lovmand; Manchón, Carles Navarro; Badiu, Mihai Alin
2015-01-01
In sparse Bayesian learning (SBL), Gaussian scale mixtures (GSMs) have been used to model sparsity-inducing priors that realize a class of concave penalty functions for the regression task in real-valued signal models. Motivated by the relative scarcity of formal tools for SBL in complex-valued m......In sparse Bayesian learning (SBL), Gaussian scale mixtures (GSMs) have been used to model sparsity-inducing priors that realize a class of concave penalty functions for the regression task in real-valued signal models. Motivated by the relative scarcity of formal tools for SBL in complex...... error, and robustness in low and medium signal-to-noise ratio regimes....
Large scale Bayesian nuclear data evaluation with consistent model defects
International Nuclear Information System (INIS)
Schnabel, G
2015-01-01
The aim of nuclear data evaluation is the reliable determination of cross sections and related quantities of the atomic nuclei. To this end, evaluation methods are applied which combine the information of experiments with the results of model calculations. The evaluated observables with their associated uncertainties and correlations are assembled into data sets, which are required for the development of novel nuclear facilities, such as fusion reactors for energy supply, and accelerator driven systems for nuclear waste incineration. The efficiency and safety of such future facilities is dependent on the quality of these data sets and thus also on the reliability of the applied evaluation methods. This work investigated the performance of the majority of available evaluation methods in two scenarios. The study indicated the importance of an essential component in these methods, which is the frequently ignored deficiency of nuclear models. Usually, nuclear models are based on approximations and thus their predictions may deviate from reliable experimental data. As demonstrated in this thesis, the neglect of this possibility in evaluation methods can lead to estimates of observables which are inconsistent with experimental data. Due to this finding, an extension of Bayesian evaluation methods is proposed to take into account the deficiency of the nuclear models. The deficiency is modeled as a random function in terms of a Gaussian process and combined with the model prediction. This novel formulation conserves sum rules and allows to explicitly estimate the magnitude of model deficiency. Both features are missing in available evaluation methods so far. Furthermore, two improvements of existing methods have been developed in the course of this thesis. The first improvement concerns methods relying on Monte Carlo sampling. A Metropolis-Hastings scheme with a specific proposal distribution is suggested, which proved to be more efficient in the studied scenarios than the
Bayesian inference and model comparison for metallic fatigue data
Babuska, Ivo
2016-01-06
In this work, we present a statistical treatment of stress-life (S-N) data drawn from a collection of records of fatigue experiments that were performed on 75S-T6 aluminum alloys. Our main objective is to predict the fatigue life of materials by providing a systematic approach to model calibration, model selection and model ranking with reference to S-N data. To this purpose, we consider fatigue-limit models and random fatigue-limit models that are specially designed to allow the treatment of the run-outs (right-censored data). We first fit the models to the data by maximum likelihood methods and estimate the quantiles of the life distribution of the alloy specimen. We then compare and rank the models by classical measures of fit based on information criteria. We also consider a Bayesian approach that provides, under the prior distribution of the model parameters selected by the user, their simulation-based posterior distributions.
iSEDfit: Bayesian spectral energy distribution modeling of galaxies
Moustakas, John
2017-08-01
iSEDfit uses Bayesian inference to extract the physical properties of galaxies from their observed broadband photometric spectral energy distribution (SED). In its default mode, the inputs to iSEDfit are the measured photometry (fluxes and corresponding inverse variances) and a measurement of the galaxy redshift. Alternatively, iSEDfit can be used to estimate photometric redshifts from the input photometry alone. After the priors have been specified, iSEDfit calculates the marginalized posterior probability distributions for the physical parameters of interest, including the stellar mass, star-formation rate, dust content, star formation history, and stellar metallicity. iSEDfit also optionally computes K-corrections and produces multiple "quality assurance" (QA) plots at each stage of the modeling procedure to aid in the interpretation of the prior parameter choices and subsequent fitting results. The software is distributed as part of the impro IDL suite.
A Bayesian modelling framework for tornado occurrences in North America.
Cheng, Vincent Y S; Arhonditsis, George B; Sills, David M L; Gough, William A; Auld, Heather
2015-03-25
Tornadoes represent one of nature's most hazardous phenomena that have been responsible for significant destruction and devastating fatalities. Here we present a Bayesian modelling approach for elucidating the spatiotemporal patterns of tornado activity in North America. Our analysis shows a significant increase in the Canadian Prairies and the Northern Great Plains during the summer, indicating a clear transition of tornado activity from the United States to Canada. The linkage between monthly-averaged atmospheric variables and likelihood of tornado events is characterized by distinct seasonality; the convective available potential energy is the predominant factor in the summer; vertical wind shear appears to have a strong signature primarily in the winter and secondarily in the summer; and storm relative environmental helicity is most influential in the spring. The present probabilistic mapping can be used to draw inference on the likelihood of tornado occurrence in any location in North America within a selected time period of the year.
Designing and testing inflationary models with Bayesian networks
Energy Technology Data Exchange (ETDEWEB)
Price, Layne C. [Carnegie Mellon Univ., Pittsburgh, PA (United States). Dept. of Physics; Auckland Univ. (New Zealand). Dept. of Physics; Peiris, Hiranya V. [Univ. College London (United Kingdom). Dept. of Physics and Astronomy; Frazer, Jonathan [DESY Hamburg (Germany). Theory Group; Univ. of the Basque Country, Bilbao (Spain). Dept. of Theoretical Physics; Basque Foundation for Science, Bilbao (Spain). IKERBASQUE; Easther, Richard [Auckland Univ. (New Zealand). Dept. of Physics
2015-11-15
Even simple inflationary scenarios have many free parameters. Beyond the variables appearing in the inflationary action, these include dynamical initial conditions, the number of fields, and couplings to other sectors. These quantities are often ignored but cosmological observables can depend on the unknown parameters. We use Bayesian networks to account for a large set of inflationary parameters, deriving generative models for the primordial spectra that are conditioned on a hierarchical set of prior probabilities describing the initial conditions, reheating physics, and other free parameters. We use N{sub f}-quadratic inflation as an illustrative example, finding that the number of e-folds N{sub *} between horizon exit for the pivot scale and the end of inflation is typically the most important parameter, even when the number of fields, their masses and initial conditions are unknown, along with possible conditional dependencies between these parameters.
Designing and testing inflationary models with Bayesian networks
Energy Technology Data Exchange (ETDEWEB)
Price, Layne C. [McWilliams Center for Cosmology, Department of Physics, Carnegie Mellon University, Pittsburgh, PA 15213 (United States); Peiris, Hiranya V. [Department of Physics and Astronomy, University College London, London WC1E 6BT (United Kingdom); Frazer, Jonathan [Deutsches Elektronen-Synchrotron DESY, Theory Group, 22603 Hamburg (Germany); Easther, Richard, E-mail: laynep@andrew.cmu.edu, E-mail: h.peiris@ucl.ac.uk, E-mail: jonathan.frazer@desy.de, E-mail: r.easther@auckland.ac.nz [Department of Physics, University of Auckland, Private Bag 92019, Auckland (New Zealand)
2016-02-01
Even simple inflationary scenarios have many free parameters. Beyond the variables appearing in the inflationary action, these include dynamical initial conditions, the number of fields, and couplings to other sectors. These quantities are often ignored but cosmological observables can depend on the unknown parameters. We use Bayesian networks to account for a large set of inflationary parameters, deriving generative models for the primordial spectra that are conditioned on a hierarchical set of prior probabilities describing the initial conditions, reheating physics, and other free parameters. We use N{sub f}-quadratic inflation as an illustrative example, finding that the number of e-folds N{sub *} between horizon exit for the pivot scale and the end of inflation is typically the most important parameter, even when the number of fields, their masses and initial conditions are unknown, along with possible conditional dependencies between these parameters.
Designing and testing inflationary models with Bayesian networks
International Nuclear Information System (INIS)
Price, Layne C.; Auckland Univ.; Peiris, Hiranya V.; Frazer, Jonathan; Univ. of the Basque Country, Bilbao; Basque Foundation for Science, Bilbao; Easther, Richard
2015-11-01
Even simple inflationary scenarios have many free parameters. Beyond the variables appearing in the inflationary action, these include dynamical initial conditions, the number of fields, and couplings to other sectors. These quantities are often ignored but cosmological observables can depend on the unknown parameters. We use Bayesian networks to account for a large set of inflationary parameters, deriving generative models for the primordial spectra that are conditioned on a hierarchical set of prior probabilities describing the initial conditions, reheating physics, and other free parameters. We use N f -quadratic inflation as an illustrative example, finding that the number of e-folds N * between horizon exit for the pivot scale and the end of inflation is typically the most important parameter, even when the number of fields, their masses and initial conditions are unknown, along with possible conditional dependencies between these parameters.
Forecast Accuracy and Economic Gains from Bayesian Model Averaging using Time Varying Weights
L.F. Hoogerheide (Lennart); R.H. Kleijn (Richard); H.K. van Dijk (Herman); M.J.C.M. Verbeek (Marno)
2009-01-01
textabstractSeveral Bayesian model combination schemes, including some novel approaches that simultaneously allow for parameter uncertainty, model uncertainty and robust time varying model weights, are compared in terms of forecast accuracy and economic gains using financial and macroeconomic time
Evidence on Features of a DSGE Business Cycle Model from Bayesian Model Averaging
R.W. Strachan (Rodney); H.K. van Dijk (Herman)
2012-01-01
textabstractThe empirical support for features of a Dynamic Stochastic General Equilibrium model with two technology shocks is valuated using Bayesian model averaging over vector autoregressions. The model features include equilibria, restrictions on long-run responses, a structural break of unknown
Nonparametric methods for volatility density estimation
Es, van Bert; Spreij, P.J.C.; Zanten, van J.H.
2009-01-01
Stochastic volatility modelling of financial processes has become increasingly popular. The proposed models usually contain a stationary volatility process. We will motivate and review several nonparametric methods for estimation of the density of the volatility process. Both models based on
Bayesian modeling of ChIP-chip data using latent variables.
Wu, Mingqi
2009-10-26
BACKGROUND: The ChIP-chip technology has been used in a wide range of biomedical studies, such as identification of human transcription factor binding sites, investigation of DNA methylation, and investigation of histone modifications in animals and plants. Various methods have been proposed in the literature for analyzing the ChIP-chip data, such as the sliding window methods, the hidden Markov model-based methods, and Bayesian methods. Although, due to the integrated consideration of uncertainty of the models and model parameters, Bayesian methods can potentially work better than the other two classes of methods, the existing Bayesian methods do not perform satisfactorily. They usually require multiple replicates or some extra experimental information to parametrize the model, and long CPU time due to involving of MCMC simulations. RESULTS: In this paper, we propose a Bayesian latent model for the ChIP-chip data. The new model mainly differs from the existing Bayesian models, such as the joint deconvolution model, the hierarchical gamma mixture model, and the Bayesian hierarchical model, in two respects. Firstly, it works on the difference between the averaged treatment and control samples. This enables the use of a simple model for the data, which avoids the probe-specific effect and the sample (control/treatment) effect. As a consequence, this enables an efficient MCMC simulation of the posterior distribution of the model, and also makes the model more robust to the outliers. Secondly, it models the neighboring dependence of probes by introducing a latent indicator vector. A truncated Poisson prior distribution is assumed for the latent indicator variable, with the rationale being justified at length. CONCLUSION: The Bayesian latent method is successfully applied to real and ten simulated datasets, with comparisons with some of the existing Bayesian methods, hidden Markov model methods, and sliding window methods. The numerical results indicate that the
A novel Bayesian hierarchical model for road safety hotspot prediction.
Fawcett, Lee; Thorpe, Neil; Matthews, Joseph; Kremer, Karsten
2017-02-01
In this paper, we propose a Bayesian hierarchical model for predicting accident counts in future years at sites within a pool of potential road safety hotspots. The aim is to inform road safety practitioners of the location of likely future hotspots to enable a proactive, rather than reactive, approach to road safety scheme implementation. A feature of our model is the ability to rank sites according to their potential to exceed, in some future time period, a threshold accident count which may be used as a criterion for scheme implementation. Our model specification enables the classical empirical Bayes formulation - commonly used in before-and-after studies, wherein accident counts from a single before period are used to estimate counterfactual counts in the after period - to be extended to incorporate counts from multiple time periods. This allows site-specific variations in historical accident counts (e.g. locally-observed trends) to offset estimates of safety generated by a global accident prediction model (APM), which itself is used to help account for the effects of global trend and regression-to-mean (RTM). The Bayesian posterior predictive distribution is exploited to formulate predictions and to properly quantify our uncertainty in these predictions. The main contributions of our model include (i) the ability to allow accident counts from multiple time-points to inform predictions, with counts in more recent years lending more weight to predictions than counts from time-points further in the past; (ii) where appropriate, the ability to offset global estimates of trend by variations in accident counts observed locally, at a site-specific level; and (iii) the ability to account for unknown/unobserved site-specific factors which may affect accident counts. We illustrate our model with an application to accident counts at 734 potential hotspots in the German city of Halle; we also propose some simple diagnostics to validate the predictive capability of our
International Nuclear Information System (INIS)
Storlie, Curtis B.; Swiler, Laura P.; Helton, Jon C.; Sallaberry, Cedric J.
2009-01-01
The analysis of many physical and engineering problems involves running complex computational models (simulation models, computer codes). With problems of this type, it is important to understand the relationships between the input variables (whose values are often imprecisely known) and the output. The goal of sensitivity analysis (SA) is to study this relationship and identify the most significant factors or variables affecting the results of the model. In this presentation, an improvement on existing methods for SA of complex computer models is described for use when the model is too computationally expensive for a standard Monte-Carlo analysis. In these situations, a meta-model or surrogate model can be used to estimate the necessary sensitivity index for each input. A sensitivity index is a measure of the variance in the response that is due to the uncertainty in an input. Most existing approaches to this problem either do not work well with a large number of input variables and/or they ignore the error involved in estimating a sensitivity index. Here, a new approach to sensitivity index estimation using meta-models and bootstrap confidence intervals is described that provides solutions to these drawbacks. Further, an efficient yet effective approach to incorporate this methodology into an actual SA is presented. Several simulated and real examples illustrate the utility of this approach. This framework can be extended to uncertainty analysis as well.
Bayesian Network Webserver: a comprehensive tool for biological network modeling.
Ziebarth, Jesse D; Bhattacharya, Anindya; Cui, Yan
2013-11-01
The Bayesian Network Webserver (BNW) is a platform for comprehensive network modeling of systems genetics and other biological datasets. It allows users to quickly and seamlessly upload a dataset, learn the structure of the network model that best explains the data and use the model to understand relationships between network variables. Many datasets, including those used to create genetic network models, contain both discrete (e.g. genotype) and continuous (e.g. gene expression traits) variables, and BNW allows for modeling hybrid datasets. Users of BNW can incorporate prior knowledge during structure learning through an easy-to-use structural constraint interface. After structure learning, users are immediately presented with an interactive network model, which can be used to make testable hypotheses about network relationships. BNW, including a downloadable structure learning package, is available at http://compbio.uthsc.edu/BNW. (The BNW interface for adding structural constraints uses HTML5 features that are not supported by current version of Internet Explorer. We suggest using other browsers (e.g. Google Chrome or Mozilla Firefox) when accessing BNW). ycui2@uthsc.edu. Supplementary data are available at Bioinformatics online.
Bayesian model ensembling using meta-trained recurrent neural networks
Ambrogioni, L.; Berezutskaya, Y.; Gü ç lü , U.; Borne, E.W.P. van den; Gü ç lü tü rk, Y.; Gerven, M.A.J. van; Maris, E.G.G.
2017-01-01
In this paper we demonstrate that a recurrent neural network meta-trained on an ensemble of arbitrary classification tasks can be used as an approximation of the Bayes optimal classifier. This result is obtained by relying on the framework of e-free approximate Bayesian inference, where the Bayesian
Touw, D J; Vinks, A A; Neef, C
1997-06-01
The availability of personal computer programs for individualizing drug dosage regimens has stimulated the interest in modelling population pharmacokinetics. Data from 82 adolescent and adult patients with cystic fibrosis (CF) who were treated with intravenous tobramycin because of an exacerbation of their pulmonary infection were analysed with a non-parametric expectation maximization (NPEM) algorithm. This algorithm estimates the entire discrete joint probability density of the pharmacokinetic parameters. It also provides traditional parametric statistics such as the means, standard deviation, median, covariances and correlations among the various parameters. It also provides graphic-2- and 3-dimensional representations of the marginal densities of the parameters investigated. Several models for intravenous tobramycin in adolescent and adult patients with CF were compared. Covariates were total body weight (for the volume of distribution) and creatinine clearance (for the total body clearance and elimination rate). Because of lack of data on patients with poor renal function, restricted models with non-renal clearance and the non-renal elimination rate constant fixed at literature values of 0.15 L/h and 0.01 h-1 were also included. In this population, intravenous tobramycin could be best described by median (+/-dispersion factor) volume of distribution per unit of total body weight of 0.28 +/- 0.05 L/kg, elimination rate constant of 0.25 +/- 0.10 h-1 and elimination rate constant per unit of creatinine clearance of 0.0008 +/- 0.0009 h-1/(ml/min/1.73 m2). Analysis of populations of increasing size showed that using a restricted model with a non-renal elimination rate constant fixed at 0.01 h-1, a model based on a population of only 10 to 20 patients, contained parameter values similar to those of the entire population and, using the full model, a larger population (at least 40 patients) was needed.
DEFF Research Database (Denmark)
Iglesias, J. E.; Sabuncu, M. R.; Van Leemput, Koen
2012-01-01
Many successful segmentation algorithms are based on Bayesian models in which prior anatomical knowledge is combined with the available image information. However, these methods typically have many free parameters that are estimated to obtain point estimates only, whereas a faithful Bayesian anal...
A Bayesian Approach to Person Fit Analysis in Item Response Theory Models. Research Report.
Glas, Cees A. W.; Meijer, Rob R.
A Bayesian approach to the evaluation of person fit in item response theory (IRT) models is presented. In a posterior predictive check, the observed value on a discrepancy variable is positioned in its posterior distribution. In a Bayesian framework, a Markov Chain Monte Carlo procedure can be used to generate samples of the posterior distribution…
Walter, G.M.; Augustin, Th.; Kneib, Thomas; Tutz, Gerhard
2010-01-01
The paper is concerned with Bayesian analysis under prior-data conflict, i.e. the situation when observed data are rather unexpected under the prior (and the sample size is not large enough to eliminate the influence of the prior). Two approaches for Bayesian linear regression modeling based on
Rational Irrationality: Modeling Climate Change Belief Polarization Using Bayesian Networks.
Cook, John; Lewandowsky, Stephan
2016-01-01
Belief polarization is said to occur when two people respond to the same evidence by updating their beliefs in opposite directions. This response is considered to be "irrational" because it involves contrary updating, a form of belief updating that appears to violate normatively optimal responding, as for example dictated by Bayes' theorem. In light of much evidence that people are capable of normatively optimal behavior, belief polarization presents a puzzling exception. We show that Bayesian networks, or Bayes nets, can simulate rational belief updating. When fit to experimental data, Bayes nets can help identify the factors that contribute to polarization. We present a study into belief updating concerning the reality of climate change in response to information about the scientific consensus on anthropogenic global warming (AGW). The study used representative samples of Australian and U.S. Among Australians, consensus information partially neutralized the influence of worldview, with free-market supporters showing a greater increase in acceptance of human-caused global warming relative to free-market opponents. In contrast, while consensus information overall had a positive effect on perceived consensus among U.S. participants, there was a reduction in perceived consensus and acceptance of human-caused global warming for strong supporters of unregulated free markets. Fitting a Bayes net model to the data indicated that under a Bayesian framework, free-market support is a significant driver of beliefs about climate change and trust in climate scientists. Further, active distrust of climate scientists among a small number of U.S. conservatives drives contrary updating in response to consensus information among this particular group. Copyright © 2016 Cognitive Science Society, Inc.
Model-based Bayesian signal extraction algorithm for peripheral nerves
Eggers, Thomas E.; Dweiri, Yazan M.; McCallum, Grant A.; Durand, Dominique M.
2017-10-01
Objective. Multi-channel cuff electrodes have recently been investigated for extracting fascicular-level motor commands from mixed neural recordings. Such signals could provide volitional, intuitive control over a robotic prosthesis for amputee patients. Recent work has demonstrated success in extracting these signals in acute and chronic preparations using spatial filtering techniques. These extracted signals, however, had low signal-to-noise ratios and thus limited their utility to binary classification. In this work a new algorithm is proposed which combines previous source localization approaches to create a model based method which operates in real time. Approach. To validate this algorithm, a saline benchtop setup was created to allow the precise placement of artificial sources within a cuff and interference sources outside the cuff. The artificial source was taken from five seconds of chronic neural activity to replicate realistic recordings. The proposed algorithm, hybrid Bayesian signal extraction (HBSE), is then compared to previous algorithms, beamforming and a Bayesian spatial filtering method, on this test data. An example chronic neural recording is also analyzed with all three algorithms. Main results. The proposed algorithm improved the signal to noise and signal to interference ratio of extracted test signals two to three fold, as well as increased the correlation coefficient between the original and recovered signals by 10-20%. These improvements translated to the chronic recording example and increased the calculated bit rate between the recovered signals and the recorded motor activity. Significance. HBSE significantly outperforms previous algorithms in extracting realistic neural signals, even in the presence of external noise sources. These results demonstrate the feasibility of extracting dynamic motor signals from a multi-fascicled intact nerve trunk, which in turn could extract motor command signals from an amputee for the end goal of
A Bayesian Model of Category-Specific Emotional Brain Responses
Wager, Tor D.; Kang, Jian; Johnson, Timothy D.; Nichols, Thomas E.; Satpute, Ajay B.; Barrett, Lisa Feldman
2015-01-01
Understanding emotion is critical for a science of healthy and disordered brain function, but the neurophysiological basis of emotional experience is still poorly understood. We analyzed human brain activity patterns from 148 studies of emotion categories (2159 total participants) using a novel hierarchical Bayesian model. The model allowed us to classify which of five categories—fear, anger, disgust, sadness, or happiness—is engaged by a study with 66% accuracy (43-86% across categories). Analyses of the activity patterns encoded in the model revealed that each emotion category is associated with unique, prototypical patterns of activity across multiple brain systems including the cortex, thalamus, amygdala, and other structures. The results indicate that emotion categories are not contained within any one region or system, but are represented as configurations across multiple brain networks. The model provides a precise summary of the prototypical patterns for each emotion category, and demonstrates that a sufficient characterization of emotion categories relies on (a) differential patterns of involvement in neocortical systems that differ between humans and other species, and (b) distinctive patterns of cortical-subcortical interactions. Thus, these findings are incompatible with several contemporary theories of emotion, including those that emphasize emotion-dedicated brain systems and those that propose emotion is localized primarily in subcortical activity. They are consistent with componential and constructionist views, which propose that emotions are differentiated by a combination of perceptual, mnemonic, prospective, and motivational elements. Such brain-based models of emotion provide a foundation for new translational and clinical approaches. PMID:25853490
Bayesian calibration of the Community Land Model using surrogates
Energy Technology Data Exchange (ETDEWEB)
Ray, Jaideep; Hou, Zhangshuan; Huang, Maoyi; Swiler, Laura Painton
2014-02-01
We present results from the Bayesian calibration of hydrological parameters of the Community Land Model (CLM), which is often used in climate simulations and Earth system models. A statistical inverse problem is formulated for three hydrological parameters, conditional on observations of latent heat surface fluxes over 48 months. Our calibration method uses polynomial and Gaussian process surrogates of the CLM, and solves the parameter estimation problem using a Markov chain Monte Carlo sampler. Posterior probability densities for the parameters are developed for two sites with different soil and vegetation covers. Our method also allows us to examine the structural error in CLM under two error models. We find that surrogate models can be created for CLM in most cases. The posterior distributions are more predictive than the default parameter values in CLM. Climatologically averaging the observations does not modify the parameters' distributions significantly. The structural error model reveals a correlation time-scale which can be used to identify the physical process that could be contributing to it. While the calibrated CLM has a higher predictive skill, the calibration is under-dispersive.
Ensemble bayesian model averaging using markov chain Monte Carlo sampling
Energy Technology Data Exchange (ETDEWEB)
Vrugt, Jasper A [Los Alamos National Laboratory; Diks, Cees G H [NON LANL; Clark, Martyn P [NON LANL
2008-01-01
Bayesian model averaging (BMA) has recently been proposed as a statistical method to calibrate forecast ensembles from numerical weather models. Successful implementation of BMA however, requires accurate estimates of the weights and variances of the individual competing models in the ensemble. In their seminal paper (Raftery etal. Mon Weather Rev 133: 1155-1174, 2(05)) has recommended the Expectation-Maximization (EM) algorithm for BMA model training, even though global convergence of this algorithm cannot be guaranteed. In this paper, we compare the performance of the EM algorithm and the recently developed Differential Evolution Adaptive Metropolis (DREAM) Markov Chain Monte Carlo (MCMC) algorithm for estimating the BMA weights and variances. Simulation experiments using 48-hour ensemble data of surface temperature and multi-model stream-flow forecasts show that both methods produce similar results, and that their performance is unaffected by the length of the training data set. However, MCMC simulation with DREAM is capable of efficiently handling a wide variety of BMA predictive distributions, and provides useful information about the uncertainty associated with the estimated BMA weights and variances.
Ferrarini, Luca; Veer, Ilya M; van Lew, Baldur; Oei, Nicole Y L; van Buchem, Mark A; Reiber, Johan H C; Rombouts, Serge A R B; Milles, J
2011-06-01
In recent years, graph theory has been successfully applied to study functional and anatomical connectivity networks in the human brain. Most of these networks have shown small-world topological characteristics: high efficiency in long distance communication between nodes, combined with highly interconnected local clusters of nodes. Moreover, functional studies performed at high resolutions have presented convincing evidence that resting-state functional connectivity networks exhibits (exponentially truncated) scale-free behavior. Such evidence, however, was mostly presented qualitatively, in terms of linear regressions of the degree distributions on log-log plots. Even when quantitative measures were given, these were usually limited to the r(2) correlation coefficient. However, the r(2) statistic is not an optimal estimator of explained variance, when dealing with (truncated) power-law models. Recent developments in statistics have introduced new non-parametric approaches, based on the Kolmogorov-Smirnov test, for the problem of model selection. In this work, we have built on this idea to statistically tackle the issue of model selection for the degree distribution of functional connectivity at rest. The analysis, performed at voxel level and in a subject-specific fashion, confirmed the superiority of a truncated power-law model, showing high consistency across subjects. Moreover, the most highly connected voxels were found to be consistently part of the default mode network. Our results provide statistically sound support to the evidence previously presented in literature for a truncated power-law model of resting-state functional connectivity. Copyright © 2010 Elsevier Inc. All rights reserved.
Bayesian Belief Networks Approach for Modeling Irrigation Behavior
Andriyas, S.; McKee, M.
2012-12-01
Canal operators need information to manage water deliveries to irrigators. Short-term irrigation demand forecasts can potentially valuable information for a canal operator who must manage an on-demand system. Such forecasts could be generated by using information about the decision-making processes of irrigators. Bayesian models of irrigation behavior can provide insight into the likely criteria which farmers use to make irrigation decisions. This paper develops a Bayesian belief network (BBN) to learn irrigation decision-making behavior of farmers and utilizes the resulting model to make forecasts of future irrigation decisions based on factor interaction and posterior probabilities. Models for studying irrigation behavior have been rarely explored in the past. The model discussed here was built from a combination of data about biotic, climatic, and edaphic conditions under which observed irrigation decisions were made. The paper includes a case study using data collected from the Canal B region of the Sevier River, near Delta, Utah. Alfalfa, barley and corn are the main crops of the location. The model has been tested with a portion of the data to affirm the model predictive capabilities. Irrigation rules were deduced in the process of learning and verified in the testing phase. It was found that most of the farmers used consistent rules throughout all years and across different types of crops. Soil moisture stress, which indicates the level of water available to the plant in the soil profile, was found to be one of the most significant likely driving forces for irrigation. Irrigations appeared to be triggered by a farmer's perception of soil stress, or by a perception of combined factors such as information about a neighbor irrigating or an apparent preference to irrigate on a weekend. Soil stress resulted in irrigation probabilities of 94.4% for alfalfa. With additional factors like weekend and irrigating when a neighbor irrigates, alfalfa irrigation
Bayesian Genomic Prediction with Genotype × Environment Interaction Kernel Models
Cuevas, Jaime; Crossa, José; Montesinos-López, Osval A.; Burgueño, Juan; Pérez-Rodríguez, Paulino; de los Campos, Gustavo
2016-01-01
The phenomenon of genotype × environment (G × E) interaction in plant breeding decreases selection accuracy, thereby negatively affecting genetic gains. Several genomic prediction models incorporating G × E have been recently developed and used in genomic selection of plant breeding programs. Genomic prediction models for assessing multi-environment G × E interaction are extensions of a single-environment model, and have advantages and limitations. In this study, we propose two multi-environment Bayesian genomic models: the first model considers genetic effects (u) that can be assessed by the Kronecker product of variance–covariance matrices of genetic correlations between environments and genomic kernels through markers under two linear kernel methods, linear (genomic best linear unbiased predictors, GBLUP) and Gaussian (Gaussian kernel, GK). The other model has the same genetic component as the first model (u) plus an extra component, f, that captures random effects between environments that were not captured by the random effects u. We used five CIMMYT data sets (one maize and four wheat) that were previously used in different studies. Results show that models with G × E always have superior prediction ability than single-environment models, and the higher prediction ability of multi-environment models with u and f over the multi-environment model with only u occurred 85% of the time with GBLUP and 45% of the time with GK across the five data sets. The latter result indicated that including the random effect f is still beneficial for increasing prediction ability after adjusting by the random effect u. PMID:27793970
Bayesian Genomic Prediction with Genotype × Environment Interaction Kernel Models
Directory of Open Access Journals (Sweden)
Jaime Cuevas
2017-01-01
Full Text Available The phenomenon of genotype × environment (G × E interaction in plant breeding decreases selection accuracy, thereby negatively affecting genetic gains. Several genomic prediction models incorporating G × E have been recently developed and used in genomic selection of plant breeding programs. Genomic prediction models for assessing multi-environment G × E interaction are extensions of a single-environment model, and have advantages and limitations. In this study, we propose two multi-environment Bayesian genomic models: the first model considers genetic effects ( u that can be assessed by the Kronecker product of variance–covariance matrices of genetic correlations between environments and genomic kernels through markers under two linear kernel methods, linear (genomic best linear unbiased predictors, GBLUP and Gaussian (Gaussian kernel, GK. The other model has the same genetic component as the first model ( u plus an extra component, f, that captures random effects between environments that were not captured by the random effects u . We used five CIMMYT data sets (one maize and four wheat that were previously used in different studies. Results show that models with G × E always have superior prediction ability than single-environment models, and the higher prediction ability of multi-environment models with u and f over the multi-environment model with only u occurred 85% of the time with GBLUP and 45% of the time with GK across the five data sets. The latter result indicated that including the random effect f is still beneficial for increasing prediction ability after adjusting by the random effect u .
Bayesian Genomic Prediction with Genotype × Environment Interaction Kernel Models.
Cuevas, Jaime; Crossa, José; Montesinos-López, Osval A; Burgueño, Juan; Pérez-Rodríguez, Paulino; de Los Campos, Gustavo
2017-01-05
The phenomenon of genotype × environment (G × E) interaction in plant breeding decreases selection accuracy, thereby negatively affecting genetic gains. Several genomic prediction models incorporating G × E have been recently developed and used in genomic selection of plant breeding programs. Genomic prediction models for assessing multi-environment G × E interaction are extensions of a single-environment model, and have advantages and limitations. In this study, we propose two multi-environment Bayesian genomic models: the first model considers genetic effects [Formula: see text] that can be assessed by the Kronecker product of variance-covariance matrices of genetic correlations between environments and genomic kernels through markers under two linear kernel methods, linear (genomic best linear unbiased predictors, GBLUP) and Gaussian (Gaussian kernel, GK). The other model has the same genetic component as the first model [Formula: see text] plus an extra component, F: , that captures random effects between environments that were not captured by the random effects [Formula: see text] We used five CIMMYT data sets (one maize and four wheat) that were previously used in different studies. Results show that models with G × E always have superior prediction ability than single-environment models, and the higher prediction ability of multi-environment models with [Formula: see text] over the multi-environment model with only u occurred 85% of the time with GBLUP and 45% of the time with GK across the five data sets. The latter result indicated that including the random effect f is still beneficial for increasing prediction ability after adjusting by the random effect [Formula: see text]. Copyright © 2017 Cuevas et al.
Rings, J.; Vrugt, J.A.; Schoups, G.; Huisman, J.A.; Vereecken, H.
2012-01-01
Bayesian model averaging (BMA) is a standard method for combining predictive distributions from different models. In recent years, this method has enjoyed widespread application and use in many fields of study to improve the spread-skill relationship of forecast ensembles. The BMA predictive
DEFF Research Database (Denmark)
Salarzadeh Jenatabadi, Hashem; Babashamsi, Peyman; Khajeheian, Datis
2016-01-01
There are many factors which could inﬂuence the sustainability of airlines. The main purpose of this study is to introduce a framework for a financial sustainability index and model it based on structural equation modeling (SEM) with maximum likelihood and Bayesian predictors. The introduced...
Bayesian inference for hybrid discrete-continuous stochastic kinetic models
International Nuclear Information System (INIS)
Sherlock, Chris; Golightly, Andrew; Gillespie, Colin S
2014-01-01
We consider the problem of efficiently performing simulation and inference for stochastic kinetic models. Whilst it is possible to work directly with the resulting Markov jump process (MJP), computational cost can be prohibitive for networks of realistic size and complexity. In this paper, we consider an inference scheme based on a novel hybrid simulator that classifies reactions as either ‘fast’ or ‘slow’ with fast reactions evolving as a continuous Markov process whilst the remaining slow reaction occurrences are modelled through a MJP with time-dependent hazards. A linear noise approximation (LNA) of fast reaction dynamics is employed and slow reaction events are captured by exploiting the ability to solve the stochastic differential equation driving the LNA. This simulation procedure is used as a proposal mechanism inside a particle MCMC scheme, thus allowing Bayesian inference for the model parameters. We apply the scheme to a simple application and compare the output with an existing hybrid approach and also a scheme for performing inference for the underlying discrete stochastic model. (paper)
Bayesian model calibration of ramp compression experiments on Z
Brown, Justin; Hund, Lauren
2017-06-01
Bayesian model calibration (BMC) is a statistical framework to estimate inputs for a computational model in the presence of multiple uncertainties, making it well suited to dynamic experiments which must be coupled with numerical simulations to interpret the results. Often, dynamic experiments are diagnosed using velocimetry and this output can be modeled using a hydrocode. Several calibration issues unique to this type of scenario including the functional nature of the output, uncertainty of nuisance parameters within the simulation, and model discrepancy identifiability are addressed, and a novel BMC process is proposed. As a proof of concept, we examine experiments conducted on Sandia National Laboratories' Z-machine which ramp compressed tantalum to peak stresses of 250 GPa. The proposed BMC framework is used to calibrate the cold curve of Ta (with uncertainty), and we conclude that the procedure results in simple, fast, and valid inferences. Sandia National Laboratories is a multi-mission laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.
Improving default risk prediction using Bayesian model uncertainty techniques.
Kazemi, Reza; Mosleh, Ali
2012-11-01
Credit risk is the potential exposure of a creditor to an obligor's failure or refusal to repay the debt in principal or interest. The potential of exposure is measured in terms of probability of default. Many models have been developed to estimate credit risk, with rating agencies dating back to the 19th century. They provide their assessment of probability of default and transition probabilities of various firms in their annual reports. Regulatory capital requirements for credit risk outlined by the Basel Committee on Banking Supervision have made it essential for banks and financial institutions to develop sophisticated models in an attempt to measure credit risk with higher accuracy. The Bayesian framework proposed in this article uses the techniques developed in physical sciences and engineering for dealing with model uncertainty and expert accuracy to obtain improved estimates of credit risk and associated uncertainties. The approach uses estimates from one or more rating agencies and incorporates their historical accuracy (past performance data) in estimating future default risk and transition probabilities. Several examples demonstrate that the proposed methodology can assess default probability with accuracy exceeding the estimations of all the individual models. Moreover, the methodology accounts for potentially significant departures from "nominal predictions" due to "upsetting events" such as the 2008 global banking crisis. © 2012 Society for Risk Analysis.
Bayesian Hierarchical Random Effects Models in Forensic Science
Directory of Open Access Journals (Sweden)
Colin G. G. Aitken
2018-04-01
Full Text Available Statistical modeling of the evaluation of evidence with the use of the likelihood ratio has a long history. It dates from the Dreyfus case at the end of the nineteenth century through the work at Bletchley Park in the Second World War to the present day. The development received a significant boost in 1977 with a seminal work by Dennis Lindley which introduced a Bayesian hierarchical random effects model for the evaluation of evidence with an example of refractive index measurements on fragments of glass. Many models have been developed since then. The methods have now been sufficiently well-developed and have become so widespread that it is timely to try and provide a software package to assist in their implementation. With that in mind, a project (SAILR: Software for the Analysis and Implementation of Likelihood Ratios was funded by the European Network of Forensic Science Institutes through their Monopoly programme to develop a software package for use by forensic scientists world-wide that would assist in the statistical analysis and implementation of the approach based on likelihood ratios. It is the purpose of this document to provide a short review of a small part of this history. The review also provides a background, or landscape, for the development of some of the models within the SAILR package and references to SAILR as made as appropriate.
Bayesian Hierarchical Random Effects Models in Forensic Science.
Aitken, Colin G G
2018-01-01
Statistical modeling of the evaluation of evidence with the use of the likelihood ratio has a long history. It dates from the Dreyfus case at the end of the nineteenth century through the work at Bletchley Park in the Second World War to the present day. The development received a significant boost in 1977 with a seminal work by Dennis Lindley which introduced a Bayesian hierarchical random effects model for the evaluation of evidence with an example of refractive index measurements on fragments of glass. Many models have been developed since then. The methods have now been sufficiently well-developed and have become so widespread that it is timely to try and provide a software package to assist in their implementation. With that in mind, a project (SAILR: Software for the Analysis and Implementation of Likelihood Ratios) was funded by the European Network of Forensic Science Institutes through their Monopoly programme to develop a software package for use by forensic scientists world-wide that would assist in the statistical analysis and implementation of the approach based on likelihood ratios. It is the purpose of this document to provide a short review of a small part of this history. The review also provides a background, or landscape, for the development of some of the models within the SAILR package and references to SAILR as made as appropriate.
Assessing fit in Bayesian models for spatial processes
Jun, M.
2014-09-16
© 2014 John Wiley & Sons, Ltd. Gaussian random fields are frequently used to model spatial and spatial-temporal data, particularly in geostatistical settings. As much of the attention of the statistics community has been focused on defining and estimating the mean and covariance functions of these processes, little effort has been devoted to developing goodness-of-fit tests to allow users to assess the models\\' adequacy. We describe a general goodness-of-fit test and related graphical diagnostics for assessing the fit of Bayesian Gaussian process models using pivotal discrepancy measures. Our method is applicable for both regularly and irregularly spaced observation locations on planar and spherical domains. The essential idea behind our method is to evaluate pivotal quantities defined for a realization of a Gaussian random field at parameter values drawn from the posterior distribution. Because the nominal distribution of the resulting pivotal discrepancy measures is known, it is possible to quantitatively assess model fit directly from the output of Markov chain Monte Carlo algorithms used to sample from the posterior distribution on the parameter space. We illustrate our method in a simulation study and in two applications.
Assessing fit in Bayesian models for spatial processes
Jun, M.; Katzfuss, M.; Hu, J.; Johnson, V. E.
2014-01-01
© 2014 John Wiley & Sons, Ltd. Gaussian random fields are frequently used to model spatial and spatial-temporal data, particularly in geostatistical settings. As much of the attention of the statistics community has been focused on defining and estimating the mean and covariance functions of these processes, little effort has been devoted to developing goodness-of-fit tests to allow users to assess the models' adequacy. We describe a general goodness-of-fit test and related graphical diagnostics for assessing the fit of Bayesian Gaussian process models using pivotal discrepancy measures. Our method is applicable for both regularly and irregularly spaced observation locations on planar and spherical domains. The essential idea behind our method is to evaluate pivotal quantities defined for a realization of a Gaussian random field at parameter values drawn from the posterior distribution. Because the nominal distribution of the resulting pivotal discrepancy measures is known, it is possible to quantitatively assess model fit directly from the output of Markov chain Monte Carlo algorithms used to sample from the posterior distribution on the parameter space. We illustrate our method in a simulation study and in two applications.
Dziak, John J; Li, Runze; Tan, Xianming; Shiffman, Saul; Shiyko, Mariya P
2015-12-01
Behavioral scientists increasingly collect intensive longitudinal data (ILD), in which phenomena are measured at high frequency and in real time. In many such studies, it is of interest to describe the pattern of change over time in important variables as well as the changing nature of the relationship between variables. Individuals' trajectories on variables of interest may be far from linear, and the predictive relationship between variables of interest and related covariates may also change over time in a nonlinear way. Time-varying effect models (TVEMs; see Tan, Shiyko, Li, Li, & Dierker, 2012) address these needs by allowing regression coefficients to be smooth, nonlinear functions of time rather than constants. However, it is possible that not only observed covariates but also unknown, latent variables may be related to the outcome. That is, regression coefficients may change over time and also vary for different kinds of individuals. Therefore, we describe a finite mixture version of TVEM for situations in which the population is heterogeneous and in which a single trajectory would conceal important, interindividual differences. This extended approach, MixTVEM, combines finite mixture modeling with non- or semiparametric regression modeling, to describe a complex pattern of change over time for distinct latent classes of individuals. The usefulness of the method is demonstrated in an empirical example from a smoking cessation study. We provide a versatile SAS macro and R function for fitting MixTVEMs. (c) 2015 APA, all rights reserved).
Estimating Parameters in Physical Models through Bayesian Inversion: A Complete Example
Allmaras, Moritz; Bangerth, Wolfgang; Linhart, Jean Marie; Polanco, Javier; Wang, Fang; Wang, Kainan; Webster, Jennifer; Zedler, Sarah
2013-01-01
All mathematical models of real-world phenomena contain parameters that need to be estimated from measurements, either for realistic predictions or simply to understand the characteristics of the model. Bayesian statistics provides a framework
Elsheikh, Ahmed H.; Wheeler, Mary Fanett; Hoteit, Ibrahim
2014-01-01
A Hybrid Nested Sampling (HNS) algorithm is proposed for efficient Bayesian model calibration and prior model selection. The proposed algorithm combines, Nested Sampling (NS) algorithm, Hybrid Monte Carlo (HMC) sampling and gradient estimation using
Essays on nonparametric econometrics of stochastic volatility
Zu, Y.
2012-01-01
Volatility is a concept that describes the variation of financial returns. Measuring and modelling volatility dynamics is an important aspect of financial econometrics. This thesis is concerned with nonparametric approaches to volatility measurement and volatility model validation.
International Nuclear Information System (INIS)
Iskandar, Ismed; Gondokaryono, Yudi Satria
2016-01-01
In reliability theory, the most important problem is to determine the reliability of a complex system from the reliability of its components. The weakness of most reliability theories is that the systems are described and explained as simply functioning or failed. In many real situations, the failures may be from many causes depending upon the age and the environment of the system and its components. Another problem in reliability theory is one of estimating the parameters of the assumed failure models. The estimation may be based on data collected over censored or uncensored life tests. In many reliability problems, the failure data are simply quantitatively inadequate, especially in engineering design and maintenance system. The Bayesian analyses are more beneficial than the classical one in such cases. The Bayesian estimation analyses allow us to combine past knowledge or experience in the form of an apriori distribution with life test data to make inferences of the parameter of interest. In this paper, we have investigated the application of the Bayesian estimation analyses to competing risk systems. The cases are limited to the models with independent causes of failure by using the Weibull distribution as our model. A simulation is conducted for this distribution with the objectives of verifying the models and the estimators and investigating the performance of the estimators for varying sample size. The simulation data are analyzed by using Bayesian and the maximum likelihood analyses. The simulation results show that the change of the true of parameter relatively to another will change the value of standard deviation in an opposite direction. For a perfect information on the prior distribution, the estimation methods of the Bayesian analyses are better than those of the maximum likelihood. The sensitivity analyses show some amount of sensitivity over the shifts of the prior locations. They also show the robustness of the Bayesian analysis within the range
International Nuclear Information System (INIS)
Hanachi, Houman; Liu, Jie; Banerjee, Avisekh; Chen, Ying
2015-01-01
Modern health management approaches for gas turbine engines (GTEs) aim to precisely estimate the health state of the GTE components to optimize maintenance decisions with respect to both economy and safety. In this research, we propose an advanced framework to identify the most likely degradation state of the turbine section in a GTE for prognostics and health management (PHM) applications. A novel nonlinear thermodynamic model is used to predict the performance parameters of the GTE given the measurements. The ratio between real efficiency of the GTE and simulated efficiency in the newly installed condition is defined as the health indicator and provided at each sequence. The symptom of nonrecoverable degradations in the turbine section, i.e. loss of turbine efficiency, is assumed to be the internal degradation state. A regularized auxiliary particle filter (RAPF) is developed to sequentially estimate the internal degradation state in nonuniform time sequences upon receiving sets of new measurements. The effectiveness of the technique is examined using the operating data over an entire time-between-overhaul cycle of a simple-cycle industrial GTE. The results clearly show the trend of degradation in the turbine section and the occasional fluctuations, which are well supported by the service history of the GTE. The research also suggests the efficacy of the proposed technique to monitor the health state of the turbine section of a GTE by implementing model-based PHM without the need for additional instrumentation. (paper)
Ma, Yanyuan; Wang, Yuanjia
2014-04-15
Huntington's disease (HD) is a neurodegenerative disorder with a dominant genetic mode of inheritance caused by an expansion of CAG repeats on chromosome 4. Typically, a longer sequence of CAG repeat length is associated with increased risk of experiencing earlier onset of HD. Previous studies of the association between HD onset age and CAG length have favored a logistic model, where the CAG repeat length enters the mean and variance components of the logistic model in a complex exponential-linear form. To relax the parametric assumption of the exponential-linear association to the true HD onset distribution, we propose to leave both mean and variance functions of the CAG repeat length unspecified and perform semiparametric estimation in this context through a local kernel and backfitting procedure. Motivated by including family history of HD information available in the family members of participants in the Cooperative Huntington's Observational Research Trial (COHORT), we develop the methodology in the context of mixture data, where some subjects have a positive probability of being risk free. We also allow censoring on the age at onset of disease and accommodate covariates other than the CAG length. We study the theoretical properties of the proposed estimator and derive its asymptotic distribution. Finally, we apply the proposed methods to the COHORT data to estimate the HD onset distribution using a group of study participants and the disease family history information available on their family members. Copyright © 2013 John Wiley & Sons, Ltd.
Konietschke, Frank; Libiger, Ondrej; Hothorn, Ludwig A
2012-01-01
Statistical association between a single nucleotide polymorphism (SNP) genotype and a quantitative trait in genome-wide association studies is usually assessed using a linear regression model, or, in the case of non-normally distributed trait values, using the Kruskal-Wallis test. While linear regression models assume an additive mode of inheritance via equi-distant genotype scores, Kruskal-Wallis test merely tests global differences in trait values associated with the three genotype groups. Both approaches thus exhibit suboptimal power when the underlying inheritance mode is dominant or recessive. Furthermore, these tests do not perform well in the common situations when only a few trait values are available in a rare genotype category (disbalance), or when the values associated with the three genotype categories exhibit unequal variance (variance heterogeneity). We propose a maximum test based on Marcus-type multiple contrast test for relative effect sizes. This test allows model-specific testing of either dominant, additive or recessive mode of inheritance, and it is robust against variance heterogeneity. We show how to obtain mode-specific simultaneous confidence intervals for the relative effect sizes to aid in interpreting the biological relevance of the results. Further, we discuss the use of a related all-pairwise comparisons contrast test with range preserving confidence intervals as an alternative to Kruskal-Wallis heterogeneity test. We applied the proposed maximum test to the Bogalusa Heart Study dataset, and gained a remarkable increase in the power to detect association, particularly for rare genotypes. Our simulation study also demonstrated that the proposed non-parametric tests control family-wise error rate in the presence of non-normality and variance heterogeneity contrary to the standard parametric approaches. We provide a publicly available R library nparcomp that can be used to estimate simultaneous confidence intervals or compatible
Flexible Bayesian Dynamic Modeling of Covariance and Correlation Matrices
Lan, Shiwei
2017-11-08
Modeling covariance (and correlation) matrices is a challenging problem due to the large dimensionality and positive-definiteness constraint. In this paper, we propose a novel Bayesian framework based on decomposing the covariance matrix into variance and correlation matrices. The highlight is that the correlations are represented as products of vectors on unit spheres. We propose a variety of distributions on spheres (e.g. the squared-Dirichlet distribution) to induce flexible prior distributions for covariance matrices that go beyond the commonly used inverse-Wishart prior. To handle the intractability of the resulting posterior, we introduce the adaptive $\\\\Delta$-Spherical Hamiltonian Monte Carlo. We also extend our structured framework to dynamic cases and introduce unit-vector Gaussian process priors for modeling the evolution of correlation among multiple time series. Using an example of Normal-Inverse-Wishart problem, a simulated periodic process, and an analysis of local field potential data (collected from the hippocampus of rats performing a complex sequence memory task), we demonstrated the validity and effectiveness of our proposed framework for (dynamic) modeling covariance and correlation matrices.
Context-dependent decision-making: a simple Bayesian model.
Lloyd, Kevin; Leslie, David S
2013-05-06
Many phenomena in animal learning can be explained by a context-learning process whereby an animal learns about different patterns of relationship between environmental variables. Differentiating between such environmental regimes or 'contexts' allows an animal to rapidly adapt its behaviour when context changes occur. The current work views animals as making sequential inferences about current context identity in a world assumed to be relatively stable but also capable of rapid switches to previously observed or entirely new contexts. We describe a novel decision-making model in which contexts are assumed to follow a Chinese restaurant process with inertia and full Bayesian inference is approximated by a sequential-sampling scheme in which only a single hypothesis about current context is maintained. Actions are selected via Thompson sampling, allowing uncertainty in parameters to drive exploration in a straightforward manner. The model is tested on simple two-alternative choice problems with switching reinforcement schedules and the results compared with rat behavioural data from a number of T-maze studies. The model successfully replicates a number of important behavioural effects: spontaneous recovery, the effect of partial reinforcement on extinction and reversal, the overtraining reversal effect, and serial reversal-learning effects.
A Bayesian approach to the modelling of α Cen A
Bazot, M.; Bourguignon, S.; Christensen-Dalsgaard, J.
2012-12-01
Determining the physical characteristics of a star is an inverse problem consisting of estimating the parameters of models for the stellar structure and evolution, and knowing certain observable quantities. We use a Bayesian approach to solve this problem for α Cen A, which allows us to incorporate prior information on the parameters to be estimated, in order to better constrain the problem. Our strategy is based on the use of a Markov chain Monte Carlo (MCMC) algorithm to estimate the posterior probability densities of the stellar parameters: mass, age, initial chemical composition, etc. We use the stellar evolutionary code ASTEC to model the star. To constrain this model both seismic and non-seismic observations were considered. Several different strategies were tested to fit these values, using either two free parameters or five free parameters in ASTEC. We are thus able to show evidence that MCMC methods become efficient with respect to more classical grid-based strategies when the number of parameters increases. The results of our MCMC algorithm allow us to derive estimates for the stellar parameters and robust uncertainties thanks to the statistical analysis of the posterior probability densities. We are also able to compute odds for the presence of a convective core in α Cen A. When using core-sensitive seismic observational constraints, these can rise above ˜40 per cent. The comparison of results to previous studies also indicates that these seismic constraints are of critical importance for our knowledge of the structure of this star.
Parameter Estimation of Structural Equation Modeling Using Bayesian Approach
Directory of Open Access Journals (Sweden)
Dewi Kurnia Sari
2016-05-01
Full Text Available Leadership is a process of influencing, directing or giving an example of employees in order to achieve the objectives of the organization and is a key element in the effectiveness of the organization. In addition to the style of leadership, the success of an organization or company in achieving its objectives can also be influenced by the commitment of the organization. Where organizational commitment is a commitment created by each individual for the betterment of the organization. The purpose of this research is to obtain a model of leadership style and organizational commitment to job satisfaction and employee performance, and determine the factors that influence job satisfaction and employee performance using SEM with Bayesian approach. This research was conducted at Statistics FNI employees in Malang, with 15 people. The result of this study showed that the measurement model, all significant indicators measure each latent variable. Meanwhile in the structural model, it was concluded there are a significant difference between the variables of Leadership Style and Organizational Commitment toward Job Satisfaction directly as well as a significant difference between Job Satisfaction on Employee Performance. As for the influence of Leadership Style and variable Organizational Commitment on Employee Performance directly declared insignificant.
Bayesian averaging over Decision Tree models for trauma severity scoring.
Schetinin, V; Jakaite, L; Krzanowski, W
2018-01-01
Health care practitioners analyse possible risks of misleading decisions and need to estimate and quantify uncertainty in predictions. We have examined the "gold" standard of screening a patient's conditions for predicting survival probability, based on logistic regression modelling, which is used in trauma care for clinical purposes and quality audit. This methodology is based on theoretical assumptions about data and uncertainties. Models induced within such an approach have exposed a number of problems, providing unexplained fluctuation of predicted survival and low accuracy of estimating uncertainty intervals within which predictions are made. Bayesian method, which in theory is capable of providing accurate predictions and uncertainty estimates, has been adopted in our study using Decision Tree models. Our approach has been tested on a large set of patients registered in the US National Trauma Data Bank and has outperformed the standard method in terms of prediction accuracy, thereby providing practitioners with accurate estimates of the predictive posterior densities of interest that are required for making risk-aware decisions. Copyright © 2017 Elsevier B.V. All rights reserved.
Bayesian network model of crowd emotion and negative behavior
Ramli, Nurulhuda; Ghani, Noraida Abdul; Hatta, Zulkarnain Ahmad; Hashim, Intan Hashimah Mohd; Sulong, Jasni; Mahudin, Nor Diana Mohd; Rahman, Shukran Abd; Saad, Zarina Mat
2014-12-01
The effects of overcrowding have become a major concern for event organizers. One aspect of this concern has been the idea that overcrowding can enhance the occurrence of serious incidents during events. As one of the largest Muslim religious gathering attended by pilgrims from all over the world, Hajj has become extremely overcrowded with many incidents being reported. The purpose of this study is to analyze the nature of human emotion and negative behavior resulting from overcrowding during Hajj events from data gathered in Malaysian Hajj Experience Survey in 2013. The sample comprised of 147 Malaysian pilgrims (70 males and 77 females). Utilizing a probabilistic model called Bayesian network, this paper models the dependence structure between different emotions and negative behaviors of pilgrims in the crowd. The model included the following variables of emotion: negative, negative comfortable, positive, positive comfortable and positive spiritual and variables of negative behaviors; aggressive and hazardous acts. The study demonstrated that emotions of negative, negative comfortable, positive spiritual and positive emotion have a direct influence on aggressive behavior whereas emotion of negative comfortable, positive spiritual and positive have a direct influence on hazardous acts behavior. The sensitivity analysis showed that a low level of negative and negative comfortable emotions leads to a lower level of aggressive and hazardous behavior. Findings of the study can be further improved to identify the exact cause and risk factors of crowd-related incidents in preventing crowd disasters during the mass gathering events.
Bayesian models for astrophysical data using R, JAGS, Python, and Stan
Hilbe, Joseph M; Ishida, Emille E O
2017-01-01
This comprehensive guide to Bayesian methods in astronomy enables hands-on work by supplying complete R, JAGS, Python, and Stan code, to use directly or to adapt. It begins by examining the normal model from both frequentist and Bayesian perspectives and then progresses to a full range of Bayesian generalized linear and mixed or hierarchical models, as well as additional types of models such as ABC and INLA. The book provides code that is largely unavailable elsewhere and includes details on interpreting and evaluating Bayesian models. Initial discussions offer models in synthetic form so that readers can easily adapt them to their own data; later the models are applied to real astronomical data. The consistent focus is on hands-on modeling, analysis of data, and interpretations that address scientific questions. A must-have for astronomers, its concrete approach will also be attractive to researchers in the sciences more generally.
BayesCLUMPY: BAYESIAN INFERENCE WITH CLUMPY DUSTY TORUS MODELS
International Nuclear Information System (INIS)
Asensio Ramos, A.; Ramos Almeida, C.
2009-01-01
Our aim is to present a fast and general Bayesian inference framework based on the synergy between machine learning techniques and standard sampling methods and apply it to infer the physical properties of clumpy dusty torus using infrared photometric high spatial resolution observations of active galactic nuclei. We make use of the Metropolis-Hastings Markov Chain Monte Carlo algorithm for sampling the posterior distribution function. Such distribution results from combining all a priori knowledge about the parameters of the model and the information introduced by the observations. The main difficulty resides in the fact that the model used to explain the observations is computationally demanding and the sampling is very time consuming. For this reason, we apply a set of artificial neural networks that are used to approximate and interpolate a database of models. As a consequence, models not present in the original database can be computed ensuring continuity. We focus on the application of this solution scheme to the recently developed public database of clumpy dusty torus models. The machine learning scheme used in this paper allows us to generate any model from the database using only a factor of 10 -4 of the original size of the database and a factor of 10 -3 in computing time. The posterior distribution obtained for each model parameter allows us to investigate how the observations constrain the parameters and which ones remain partially or completely undetermined, providing statistically relevant confidence intervals. As an example, the application to the nuclear region of Centaurus A shows that the optical depth of the clouds, the total number of clouds, and the radial extent of the cloud distribution zone are well constrained using only six filters. The code is freely available from the authors.
Bayesian uncertainty quantification in linear models for diffusion MRI.
Sjölund, Jens; Eklund, Anders; Özarslan, Evren; Herberthson, Magnus; Bånkestad, Maria; Knutsson, Hans
2018-03-29
Diffusion MRI (dMRI) is a valuable tool in the assessment of tissue microstructure. By fitting a model to the dMRI signal it is possible to derive various quantitative features. Several of the most popular dMRI signal models are expansions in an appropriately chosen basis, where the coefficients are determined using some variation of least-squares. However, such approaches lack any notion of uncertainty, which could be valuable in e.g. group analyses. In this work, we use a probabilistic interpretation of linear least-squares methods to recast popular dMRI models as Bayesian ones. This makes it possible to quantify the uncertainty of any derived quantity. In particular, for quantities that are affine functions of the coefficients, the posterior distribution can be expressed in closed-form. We simulated measurements from single- and double-tensor models where the correct values of several quantities are known, to validate that the theoretically derived quantiles agree with those observed empirically. We included results from residual bootstrap for comparison and found good agreement. The validation employed several different models: Diffusion Tensor Imaging (DTI), Mean Apparent Propagator MRI (MAP-MRI) and Constrained Spherical Deconvolution (CSD). We also used in vivo data to visualize maps of quantitative features and corresponding uncertainties, and to show how our approach can be used in a group analysis to downweight subjects with high uncertainty. In summary, we convert successful linear models for dMRI signal estimation to probabilistic models, capable of accurate uncertainty quantification. Copyright © 2018 Elsevier Inc. All rights reserved.
Bayesian Regression of Thermodynamic Models of Redox Active Materials
Energy Technology Data Exchange (ETDEWEB)
Johnston, Katherine [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
2017-09-01
Finding a suitable functional redox material is a critical challenge to achieving scalable, economically viable technologies for storing concentrated solar energy in the form of a defected oxide. Demonstrating e ectiveness for thermal storage or solar fuel is largely accomplished by using a thermodynamic model derived from experimental data. The purpose of this project is to test the accuracy of our regression model on representative data sets. Determining the accuracy of the model includes parameter tting the model to the data, comparing the model using di erent numbers of param- eters, and analyzing the entropy and enthalpy calculated from the model. Three data sets were considered in this project: two demonstrating materials for solar fuels by wa- ter splitting and the other of a material for thermal storage. Using Bayesian Inference and Markov Chain Monte Carlo (MCMC), parameter estimation was preformed on the three data sets. Good results were achieved, except some there was some deviations on the edges of the data input ranges. The evidence values were then calculated in a variety of ways and used to compare models with di erent number of parameters. It was believed that at least one of the parameters was unnecessary and comparing evidence values demonstrated that the parameter was need on one data set and not signi cantly helpful on another. The entropy was calculated by taking the derivative in one variable and integrating over another. and its uncertainty was also calculated by evaluating the entropy over multiple MCMC samples. Afterwards, all the parts were written up as a tutorial for the Uncertainty Quanti cation Toolkit (UQTk).
Macroscopic Models of Clique Tree Growth for Bayesian Networks
National Aeronautics and Space Administration — In clique tree clustering, inference consists of propagation in a clique tree compiled from a Bayesian network. In this paper, we develop an analytical approach to...
Rights, Jason D; Sterba, Sonya K
2016-11-01
Multilevel data structures are common in the social sciences. Often, such nested data are analysed with multilevel models (MLMs) in which heterogeneity between clusters is modelled by continuously distributed random intercepts and/or slopes. Alternatively, the non-parametric multilevel regression mixture model (NPMM) can accommodate the same nested data structures through discrete latent class variation. The purpose of this article is to delineate analytic relationships between NPMM and MLM parameters that are useful for understanding the indirect interpretation of the NPMM as a non-parametric approximation of the MLM, with relaxed distributional assumptions. We define how seven standard and non-standard MLM specifications can be indirectly approximated by particular NPMM specifications. We provide formulas showing how the NPMM can serve as an approximation of the MLM in terms of intraclass correlation, random coefficient means and (co)variances, heteroscedasticity of residuals at level 1, and heteroscedasticity of residuals at level 2. Further, we discuss how these relationships can be useful in practice. The specific relationships are illustrated with simulated graphical demonstrations, and direct and indirect interpretations of NPMM classes are contrasted. We provide an R function to aid in implementing and visualizing an indirect interpretation of NPMM classes. An empirical example is presented and future directions are discussed. © 2016 The British Psychological Society.
Hybrid elementary flux analysis/nonparametric modeling: application for bioprocess control
Directory of Open Access Journals (Sweden)
Alves Paula M
2007-01-01
Full Text Available Abstract Background The progress in the "-omic" sciences has allowed a deeper knowledge on many biological systems with industrial interest. This knowledge is still rarely used for advanced bioprocess monitoring and control at the bioreactor level. In this work, a bioprocess control method is presented, which is designed on the basis of the metabolic network of the organism under consideration. The bioprocess dynamics are formulated using hybrid rigorous/data driven systems and its inherent structure is defined by the metabolism elementary modes. Results The metabolic network of the system under study is decomposed into elementary modes (EMs, which are the simplest paths able to operate coherently in steady-state. A reduced reaction mechanism in the form of simplified reactions connecting substrates with end-products is obtained. A dynamical hybrid system integrating material balance equations, EMs reactions stoichiometry and kinetics was formulated. EMs kinetics were defined as the product of two terms: a mechanistic/empirical known term and an unknown term that must be identified from data, in a process optimisation perspective. This approach allows the quantification of fluxes carried by individual elementary modes which is of great help to identify dominant pathways as a function of environmental conditions. The methodology was employed to analyse experimental data of recombinant Baby Hamster Kidney (BHK-21A cultures producing a recombinant fusion glycoprotein. The identified EMs kinetics demonstrated typical glucose and glutamine metabolic responses during cell growth and IgG1-IL2 synthesis. Finally, an online optimisation study was conducted in which the optimal feeding strategies of glucose and glutamine were calculated after re-estimation of model parameters at each sampling time. An improvement in the final product concentration was obtained as a result of this online optimisation. Conclusion The main contribution of this work is a
Development of uncertainty-based work injury model using Bayesian structural equation modelling.
Chatterjee, Snehamoy
2014-01-01
This paper proposed a Bayesian method-based structural equation model (SEM) of miners' work injury for an underground coal mine in India. The environmental and behavioural variables for work injury were identified and causal relationships were developed. For Bayesian modelling, prior distributions of SEM parameters are necessary to develop the model. In this paper, two approaches were adopted to obtain prior distribution for factor loading parameters and structural parameters of SEM. In the first approach, the prior distributions were considered as a fixed distribution function with specific parameter values, whereas, in the second approach, prior distributions of the parameters were generated from experts' opinions. The posterior distributions of these parameters were obtained by applying Bayesian rule. The Markov Chain Monte Carlo sampling in the form Gibbs sampling was applied for sampling from the posterior distribution. The results revealed that all coefficients of structural and measurement model parameters are statistically significant in experts' opinion-based priors, whereas, two coefficients are not statistically significant when fixed prior-based distributions are applied. The error statistics reveals that Bayesian structural model provides reasonably good fit of work injury with high coefficient of determination (0.91) and less mean squared error as compared to traditional SEM.
Nitrate source apportionment in a subtropical watershed using Bayesian model
Energy Technology Data Exchange (ETDEWEB)
Yang, Liping; Han, Jiangpei; Xue, Jianlong; Zeng, Lingzao [College of Environmental and Natural Resource Sciences, Zhejiang Provincial Key Laboratory of Subtropical Soil and Plant Nutrition, Zhejiang University, Hangzhou, 310058 (China); Shi, Jiachun, E-mail: jcshi@zju.edu.cn [College of Environmental and Natural Resource Sciences, Zhejiang Provincial Key Laboratory of Subtropical Soil and Plant Nutrition, Zhejiang University, Hangzhou, 310058 (China); Wu, Laosheng, E-mail: laowu@zju.edu.cn [College of Environmental and Natural Resource Sciences, Zhejiang Provincial Key Laboratory of Subtropical Soil and Plant Nutrition, Zhejiang University, Hangzhou, 310058 (China); Jiang, Yonghai [State Key Laboratory of Environmental Criteria and Risk Assessment, Chinese Research Academy of Environmental Sciences, Beijing, 100012 (China)
2013-10-01
Nitrate (NO{sub 3}{sup −}) pollution in aquatic system is a worldwide problem. The temporal distribution pattern and sources of nitrate are of great concern for water quality. The nitrogen (N) cycling processes in a subtropical watershed located in Changxing County, Zhejiang Province, China were greatly influenced by the temporal variations of precipitation and temperature during the study period (September 2011 to July 2012). The highest NO{sub 3}{sup −} concentration in water was in May (wet season, mean ± SD = 17.45 ± 9.50 mg L{sup −1}) and the lowest concentration occurred in December (dry season, mean ± SD = 10.54 ± 6.28 mg L{sup −1}). Nevertheless, no water sample in the study area exceeds the WHO drinking water limit of 50 mg L{sup −1} NO{sub 3}{sup −}. Four sources of NO{sub 3}{sup −} (atmospheric deposition, AD; soil N, SN; synthetic fertilizer, SF; manure and sewage, M and S) were identified using both hydrochemical characteristics [Cl{sup −}, NO{sub 3}{sup −}, HCO{sub 3}{sup −}, SO{sub 4}{sup 2−}, Ca{sup 2+}, K{sup +}, Mg{sup 2+}, Na{sup +}, dissolved oxygen (DO)] and dual isotope approach (δ{sup 15}N–NO{sub 3}{sup −} and δ{sup 18}O–NO{sub 3}{sup −}). Both chemical and isotopic characteristics indicated that denitrification was not the main N cycling process in the study area. Using a Bayesian model (stable isotope analysis in R, SIAR), the contribution of each source was apportioned. Source apportionment results showed that source contributions differed significantly between the dry and wet season, AD and M and S contributed more in December than in May. In contrast, SN and SF contributed more NO{sub 3}{sup −} to water in May than that in December. M and S and SF were the major contributors in December and May, respectively. Moreover, the shortcomings and uncertainties of SIAR were discussed to provide implications for future works. With the assessment of temporal variation and sources of NO{sub 3}{sup −}, better
Nitrate source apportionment in a subtropical watershed using Bayesian model
International Nuclear Information System (INIS)
Yang, Liping; Han, Jiangpei; Xue, Jianlong; Zeng, Lingzao; Shi, Jiachun; Wu, Laosheng; Jiang, Yonghai
2013-01-01
Nitrate (NO 3 − ) pollution in aquatic system is a worldwide problem. The temporal distribution pattern and sources of nitrate are of great concern for water quality. The nitrogen (N) cycling processes in a subtropical watershed located in Changxing County, Zhejiang Province, China were greatly influenced by the temporal variations of precipitation and temperature during the study period (September 2011 to July 2012). The highest NO 3 − concentration in water was in May (wet season, mean ± SD = 17.45 ± 9.50 mg L −1 ) and the lowest concentration occurred in December (dry season, mean ± SD = 10.54 ± 6.28 mg L −1 ). Nevertheless, no water sample in the study area exceeds the WHO drinking water limit of 50 mg L −1 NO 3 − . Four sources of NO 3 − (atmospheric deposition, AD; soil N, SN; synthetic fertilizer, SF; manure and sewage, M and S) were identified using both hydrochemical characteristics [Cl − , NO 3 − , HCO 3 − , SO 4 2− , Ca 2+ , K + , Mg 2+ , Na + , dissolved oxygen (DO)] and dual isotope approach (δ 15 N–NO 3 − and δ 18 O–NO 3 − ). Both chemical and isotopic characteristics indicated that denitrification was not the main N cycling process in the study area. Using a Bayesian model (stable isotope analysis in R, SIAR), the contribution of each source was apportioned. Source apportionment results showed that source contributions differed significantly between the dry and wet season, AD and M and S contributed more in December than in May. In contrast, SN and SF contributed more NO 3 − to water in May than that in December. M and S and SF were the major contributors in December and May, respectively. Moreover, the shortcomings and uncertainties of SIAR were discussed to provide implications for future works. With the assessment of temporal variation and sources of NO 3 − , better agricultural management practices and sewage disposal programs can be implemented to sustain water quality in subtropical watersheds
Bayesian mixture models for source separation in MEG
International Nuclear Information System (INIS)
Calvetti, Daniela; Homa, Laura; Somersalo, Erkki
2011-01-01
This paper discusses the problem of imaging electromagnetic brain activity from measurements of the induced magnetic field outside the head. This imaging modality, magnetoencephalography (MEG), is known to be severely ill posed, and in order to obtain useful estimates for the activity map, complementary information needs to be used to regularize the problem. In this paper, a particular emphasis is on finding non-superficial focal sources that induce a magnetic field that may be confused with noise due to external sources and with distributed brain noise. The data are assumed to come from a mixture of a focal source and a spatially distributed possibly virtual source; hence, to differentiate between those two components, the problem is solved within a Bayesian framework, with a mixture model prior encoding the information that different sources may be concurrently active. The mixture model prior combines one density that favors strongly focal sources and another that favors spatially distributed sources, interpreted as clutter in the source estimation. Furthermore, to address the challenge of localizing deep focal sources, a novel depth sounding algorithm is suggested, and it is shown with simulated data that the method is able to distinguish between a signal arising from a deep focal source and a clutter signal. (paper)
On Cooper's Nonparametric Test.
Schmeidler, James
1978-01-01
The basic assumption of Cooper's nonparametric test for trend (EJ 125 069) is questioned. It is contended that the proper assumption alters the distribution of the statistic and reduces its usefulness. (JKS)
Bayesian inference for partially identified models exploring the limits of limited data
Gustafson, Paul
2015-01-01
Introduction Identification What Is against Us? What Is for Us? Some Simple Examples of Partially Identified ModelsThe Road Ahead The Structure of Inference in Partially Identified Models Bayesian Inference The Structure of Posterior Distributions in PIMs Computational Strategies Strength of Bayesian Updating, Revisited Posterior MomentsCredible Intervals Evaluating the Worth of Inference Partial Identification versus Model Misspecification The Siren Call of Identification Comp
International Nuclear Information System (INIS)
Elsheikh, Ahmed H.; Wheeler, Mary F.; Hoteit, Ibrahim
2014-01-01
A Hybrid Nested Sampling (HNS) algorithm is proposed for efficient Bayesian model calibration and prior model selection. The proposed algorithm combines, Nested Sampling (NS) algorithm, Hybrid Monte Carlo (HMC) sampling and gradient estimation using Stochastic Ensemble Method (SEM). NS is an efficient sampling algorithm that can be used for Bayesian calibration and estimating the Bayesian evidence for prior model selection. Nested sampling has the advantage of computational feasibility. Within the nested sampling algorithm, a constrained sampling step is performed. For this step, we utilize HMC to reduce the correlation between successive sampled states. HMC relies on the gradient of the logarithm of the posterior distribution, which we estimate using a stochastic ensemble method based on an ensemble of directional derivatives. SEM only requires forward model runs and the simulator is then used as a black box and no adjoint code is needed. The developed HNS algorithm is successfully applied for Bayesian calibration and prior model selection of several nonlinear subsurface flow problems
Energy Technology Data Exchange (ETDEWEB)
Elsheikh, Ahmed H., E-mail: aelsheikh@ices.utexas.edu [Institute for Computational Engineering and Sciences (ICES), University of Texas at Austin, TX (United States); Institute of Petroleum Engineering, Heriot-Watt University, Edinburgh EH14 4AS (United Kingdom); Wheeler, Mary F. [Institute for Computational Engineering and Sciences (ICES), University of Texas at Austin, TX (United States); Hoteit, Ibrahim [Department of Earth Sciences and Engineering, King Abdullah University of Science and Technology (KAUST), Thuwal (Saudi Arabia)
2014-02-01
A Hybrid Nested Sampling (HNS) algorithm is proposed for efficient Bayesian model calibration and prior model selection. The proposed algorithm combines, Nested Sampling (NS) algorithm, Hybrid Monte Carlo (HMC) sampling and gradient estimation using Stochastic Ensemble Method (SEM). NS is an efficient sampling algorithm that can be used for Bayesian calibration and estimating the Bayesian evidence for prior model selection. Nested sampling has the advantage of computational feasibility. Within the nested sampling algorithm, a constrained sampling step is performed. For this step, we utilize HMC to reduce the correlation between successive sampled states. HMC relies on the gradient of the logarithm of the posterior distribution, which we estimate using a stochastic ensemble method based on an ensemble of directional derivatives. SEM only requires forward model runs and the simulator is then used as a black box and no adjoint code is needed. The developed HNS algorithm is successfully applied for Bayesian calibration and prior model selection of several nonlinear subsurface flow problems.
A Bayesian, generalized frailty model for comet assays.
Ghebretinsae, Aklilu Habteab; Faes, Christel; Molenberghs, Geert; De Boeck, Marlies; Geys, Helena
2013-05-01
This paper proposes a flexible modeling approach for so-called comet assay data regularly encountered in preclinical research. While such data consist of non-Gaussian outcomes in a multilevel hierarchical structure, traditional analyses typically completely or partly ignore this hierarchical nature by summarizing measurements within a cluster. Non-Gaussian outcomes are often modeled using exponential family models. This is true not only for binary and count data, but also for, example, time-to-event outcomes. Two important reasons for extending this family are for (1) the possible occurrence of overdispersion, meaning that the variability in the data may not be adequately described by the models, which often exhibit a prescribed mean-variance link, and (2) the accommodation of a hierarchical structure in the data, owing to clustering in the data. The first issue is dealt with through so-called overdispersion models. Clustering is often accommodated through the inclusion of random subject-specific effects. Though not always, one conventionally assumes such random effects to be normally distributed. In the case of time-to-event data, one encounters, for example, the gamma frailty model (Duchateau and Janssen, 2007 ). While both of these issues may occur simultaneously, models combining both are uncommon. Molenberghs et al. ( 2010 ) proposed a broad class of generalized linear models accommodating overdispersion and clustering through two separate sets of random effects. Here, we use this method to model data from a comet assay with a three-level hierarchical structure. Although a conjugate gamma random effect is used for the overdispersion random effect, both gamma and normal random effects are considered for the hierarchical random effect. Apart from model formulation, we place emphasis on Bayesian estimation. Our proposed method has an upper hand over the traditional analysis in that it (1) uses the appropriate distribution stipulated in the literature; (2) deals
Ma, Lu; Yan, Xuedong
2014-06-01
This study seeks to inspect the nonparametric characteristics connecting the age of the driver to the relative risk of being an at-fault vehicle, in order to discover a more precise and smooth pattern of age impact, which has commonly been neglected in past studies. Records of drivers in two-vehicle rear-end collisions are selected from the general estimates system (GES) 2011 dataset. These extracted observations in fact constitute inherently matched driver pairs under certain matching variables including weather conditions, pavement conditions and road geometry design characteristics that are shared by pairs of drivers in rear-end accidents. The introduced data structure is able to guarantee that the variance of the response variable will not depend on the matching variables and hence provides a high power of statistical modeling. The estimation results exhibit a smooth cubic spline function for examining the nonlinear relationship between the age of the driver and the log odds of being at fault in a rear-end accident. The results are presented with respect to the main effect of age, the interaction effect between age and sex, and the effects of age under different scenarios of pre-crash actions by the leading vehicle. Compared to the conventional specification in which age is categorized into several predefined groups, the proposed method is more flexible and able to produce quantitatively explicit results. First, it confirms the U-shaped pattern of the age effect, and further shows that the risks of young and old drivers change rapidly with age. Second, the interaction effects between age and sex show that female and male drivers behave differently in rear-end accidents. Third, it is found that the pattern of age impact varies according to the type of pre-crash actions exhibited by the leading vehicle. Copyright © 2014 Elsevier Ltd. All rights reserved.
Reliability assessment using degradation models: bayesian and classical approaches
Directory of Open Access Journals (Sweden)
Marta Afonso Freitas
2010-04-01
Full Text Available Traditionally, reliability assessment of devices has been based on (accelerated life tests. However, for highly reliable products, little information about reliability is provided by life tests in which few or no failures are typically observed. Since most failures arise from a degradation mechanism at work for which there are characteristics that degrade over time, one alternative is monitor the device for a period of time and assess its reliability from the changes in performance (degradation observed during that period. The goal of this article is to illustrate how degradation data can be modeled and analyzed by using "classical" and Bayesian approaches. Four methods of data analysis based on classical inference are presented. Next we show how Bayesian methods can also be used to provide a natural approach to analyzing degradation data. The approaches are applied to a real data set regarding train wheels degradation.Tradicionalmente, o acesso à confiabilidade de dispositivos tem sido baseado em testes de vida (acelerados. Entretanto, para produtos altamente confiáveis, pouca informação a respeito de sua confiabilidade é fornecida por testes de vida no quais poucas ou nenhumas falhas são observadas. Uma vez que boa parte das falhas é induzida por mecanismos de degradação, uma alternativa é monitorar o dispositivo por um período de tempo e acessar sua confiabilidade através das mudanças em desempenho (degradação observadas durante aquele período. O objetivo deste artigo é ilustrar como dados de degradação podem ser modelados e analisados utilizando-se abordagens "clássicas" e Bayesiana. Quatro métodos de análise de dados baseados em inferência clássica são apresentados. A seguir, mostramos como os métodos Bayesianos podem também ser aplicados para proporcionar uma abordagem natural à análise de dados de degradação. As abordagens são aplicadas a um banco de dados real relacionado à degradação de rodas de trens.
International Nuclear Information System (INIS)
Ebrahimi, Nader
2009-01-01
This paper considers prediction of unknown number of failures in a future inspection of a group of in-service units based on number of failures observed from an earlier inspection. We develop a flexible Bayesian model and calculate Bayesian estimator for this unknown number and other quantities of interest. The paper also includes an illustration of our method in an example about heat exchanger. A main advantage of our approach is in its nonparametric nature. By nonparametric here we simply mean that no assumption is required about the failure time distribution of a unit
Yang, Ziheng; Zhu, Tianqi
2018-02-20
The Bayesian method is noted to produce spuriously high posterior probabilities for phylogenetic trees in analysis of large datasets, but the precise reasons for this overconfidence are unknown. In general, the performance of Bayesian selection of misspecified models is poorly understood, even though this is of great scientific interest since models are never true in real data analysis. Here we characterize the asymptotic behavior of Bayesian model selection and show that when the competing models are equally wrong, Bayesian model selection exhibits surprising and polarized behaviors in large datasets, supporting one model with full force while rejecting the others. If one model is slightly less wrong than the other, the less wrong model will eventually win when the amount of data increases, but the method may become overconfident before it becomes reliable. We suggest that this extreme behavior may be a major factor for the spuriously high posterior probabilities for evolutionary trees. The philosophical implications of our results to the application of Bayesian model selection to evaluate opposing scientific hypotheses are yet to be explored, as are the behaviors of non-Bayesian methods in similar situations.
Tauber, Sean; Navarro, Daniel J; Perfors, Amy; Steyvers, Mark
2017-07-01
Recent debates in the psychological literature have raised questions about the assumptions that underpin Bayesian models of cognition and what inferences they license about human cognition. In this paper we revisit this topic, arguing that there are 2 qualitatively different ways in which a Bayesian model could be constructed. The most common approach uses a Bayesian model as a normative standard upon which to license a claim about optimality. In the alternative approach, a descriptive Bayesian model need not correspond to any claim that the underlying cognition is optimal or rational, and is used solely as a tool for instantiating a substantive psychological theory. We present 3 case studies in which these 2 perspectives lead to different computational models and license different conclusions about human cognition. We demonstrate how the descriptive Bayesian approach can be used to answer different sorts of questions than the optimal approach, especially when combined with principled tools for model evaluation and model selection. More generally we argue for the importance of making a clear distinction between the 2 perspectives. Considerable confusion results when descriptive models and optimal models are conflated, and if Bayesians are to avoid contributing to this confusion it is important to avoid making normative claims when none are intended. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Bayesian leave-one-out cross-validation approximations for Gaussian latent variable models
DEFF Research Database (Denmark)
Vehtari, Aki; Mononen, Tommi; Tolvanen, Ville
2016-01-01
The future predictive performance of a Bayesian model can be estimated using Bayesian cross-validation. In this article, we consider Gaussian latent variable models where the integration over the latent values is approximated using the Laplace method or expectation propagation (EP). We study...... the properties of several Bayesian leave-one-out (LOO) cross-validation approximations that in most cases can be computed with a small additional cost after forming the posterior approximation given the full data. Our main objective is to assess the accuracy of the approximative LOO cross-validation estimators...
Bayesian Safety Risk Modeling of Human-Flightdeck Automation Interaction
Ancel, Ersin; Shih, Ann T.
2015-01-01
Usage of automatic systems in airliners has increased fuel efficiency, added extra capabilities, enhanced safety and reliability, as well as provide improved passenger comfort since its introduction in the late 80's. However, original automation benefits, including reduced flight crew workload, human errors or training requirements, were not achieved as originally expected. Instead, automation introduced new failure modes, redistributed, and sometimes increased workload, brought in new cognitive and attention demands, and increased training requirements. Modern airliners have numerous flight modes, providing more flexibility (and inherently more complexity) to the flight crew. However, the price to pay for the increased flexibility is the need for increased mode awareness, as well as the need to supervise, understand, and predict automated system behavior. Also, over-reliance on automation is linked to manual flight skill degradation and complacency in commercial pilots. As a result, recent accidents involving human errors are often caused by the interactions between humans and the automated systems (e.g., the breakdown in man-machine coordination), deteriorated manual flying skills, and/or loss of situational awareness due to heavy dependence on automated systems. This paper describes the development of the increased complexity and reliance on automation baseline model, named FLAP for FLightdeck Automation Problems. The model development process starts with a comprehensive literature review followed by the construction of a framework comprised of high-level causal factors leading to an automation-related flight anomaly. The framework was then converted into a Bayesian Belief Network (BBN) using the Hugin Software v7.8. The effects of automation on flight crew are incorporated into the model, including flight skill degradation, increased cognitive demand and training requirements along with their interactions. Besides flight crew deficiencies, automation system
Bayesian joint modelling of benefit and risk in drug development.
Costa, Maria J; Drury, Thomas
2018-05-01
To gain regulatory approval, a new medicine must demonstrate that its benefits outweigh any potential risks, ie, that the benefit-risk balance is favourable towards the new medicine. For transparency and clarity of the decision, a structured and consistent approach to benefit-risk assessment that quantifies uncertainties and accounts for underlying dependencies is desirable. This paper proposes two approaches to benefit-risk evaluation, both based on the idea of joint modelling of mixed outcomes that are potentially dependent at the subject level. Using Bayesian inference, the two approaches offer interpretability and efficiency to enhance qualitative frameworks. Simulation studies show that accounting for correlation leads to a more accurate assessment of the strength of evidence to support benefit-risk profiles of interest. Several graphical approaches are proposed that can be used to communicate the benefit-risk balance to project teams. Finally, the two approaches are illustrated in a case study using real clinical trial data. Copyright © 2018 John Wiley & Sons, Ltd.
Statistical modelling of railway track geometry degradation using Hierarchical Bayesian models
International Nuclear Information System (INIS)
Andrade, A.R.; Teixeira, P.F.
2015-01-01
Railway maintenance planners require a predictive model that can assess the railway track geometry degradation. The present paper uses a Hierarchical Bayesian model as a tool to model the main two quality indicators related to railway track geometry degradation: the standard deviation of longitudinal level defects and the standard deviation of horizontal alignment defects. Hierarchical Bayesian Models (HBM) are flexible statistical models that allow specifying different spatially correlated components between consecutive track sections, namely for the deterioration rates and the initial qualities parameters. HBM are developed for both quality indicators, conducting an extensive comparison between candidate models and a sensitivity analysis on prior distributions. HBM is applied to provide an overall assessment of the degradation of railway track geometry, for the main Portuguese railway line Lisbon–Oporto. - Highlights: • Rail track geometry degradation is analysed using Hierarchical Bayesian models. • A Gibbs sampling strategy is put forward to estimate the HBM. • Model comparison and sensitivity analysis find the most suitable model. • We applied the most suitable model to all the segments of the main Portuguese line. • Tackling spatial correlations using CAR structures lead to a better model fit
DEFF Research Database (Denmark)
Quinonero, Joaquin; Girard, Agathe; Larsen, Jan
2003-01-01
The object of Bayesian modelling is predictive distribution, which, in a forecasting scenario, enables evaluation of forecasted values and their uncertainties. We focus on reliably estimating the predictive mean and variance of forecasted values using Bayesian kernel based models such as the Gaus......The object of Bayesian modelling is predictive distribution, which, in a forecasting scenario, enables evaluation of forecasted values and their uncertainties. We focus on reliably estimating the predictive mean and variance of forecasted values using Bayesian kernel based models...... such as the Gaussian process and the relevance vector machine. We derive novel analytic expressions for the predictive mean and variance for Gaussian kernel shapes under the assumption of a Gaussian input distribution in the static case, and of a recursive Gaussian predictive density in iterative forecasting...
U.S. Environmental Protection Agency — The dataset is lake dissolved oxygen concentrations obtained form plots published by Gelda et al. (1996) and lake reaeration model simulated values using Bayesian...
Fenton, N.; Neil, M.; Lagnado, D.; Marsh, W.; Yet, B.; Constantinou, A.
2016-01-01
We show that existing Bayesian network (BN) modelling techniques cannot capture the correct intuitive reasoning in the important case when a set of mutually exclusive events need to be modelled as separate nodes instead of states of a single node. A previously proposed ‘solution’, which introduces a simple constraint node that enforces mutual exclusivity, fails to preserve the prior probabilities of the events, while other proposed solutions involve major changes to the original model. We pro...
Energy Technology Data Exchange (ETDEWEB)
Duarte, Juliana P.; Leite, Victor C.; Melo, P.F. Frutuoso e, E-mail: julianapduarte@poli.ufrj.br, E-mail: victor.coppo.leite@poli.ufrj.br, E-mail: frutuoso@nuclear.ufrj.br [Universidade Federal do Rio de Janeiro (UFRJ), Rio de Janeiro, RJ (Brazil)
2013-07-01
Bayesian networks have become a very handy tool for solving problems in various application areas. This paper discusses the use of Bayesian networks to treat dependent events in reliability engineering typically modeled by Markovian models. Dependent events play an important role as, for example, when treating load-sharing systems, bridge systems, common-cause failures, and switching systems (those for which a standby component is activated after the main one fails by means of a switching mechanism). Repair plays an important role in all these cases (as, for example, the number of repairmen). All Bayesian network calculations are performed by means of the Netica™ software, of Norsys Software Corporation, and Fortran 90 to evaluate them over time. The discussion considers the development of time-dependent reliability figures of merit, which are easily obtained, through Markovian models, but not through Bayesian networks, because these latter need probability figures as input and not failure and repair rates. Bayesian networks produced results in very good agreement with those of Markov models and pivotal decomposition. Static and discrete time (DTBN) Bayesian networks were used in order to check their capabilities of modeling specific situations, like switching failures in cold-standby systems. The DTBN was more flexible to modeling systems where the time of occurrence of an event is important, for example, standby failure and repair. However, the static network model showed as good results as DTBN by a much more simplified approach. (author)
International Nuclear Information System (INIS)
Duarte, Juliana P.; Leite, Victor C.; Melo, P.F. Frutuoso e
2013-01-01
Bayesian networks have become a very handy tool for solving problems in various application areas. This paper discusses the use of Bayesian networks to treat dependent events in reliability engineering typically modeled by Markovian models. Dependent events play an important role as, for example, when treating load-sharing systems, bridge systems, common-cause failures, and switching systems (those for which a standby component is activated after the main one fails by means of a switching mechanism). Repair plays an important role in all these cases (as, for example, the number of repairmen). All Bayesian network calculations are performed by means of the Netica™ software, of Norsys Software Corporation, and Fortran 90 to evaluate them over time. The discussion considers the development of time-dependent reliability figures of merit, which are easily obtained, through Markovian models, but not through Bayesian networks, because these latter need probability figures as input and not failure and repair rates. Bayesian networks produced results in very good agreement with those of Markov models and pivotal decomposition. Static and discrete time (DTBN) Bayesian networks were used in order to check their capabilities of modeling specific situations, like switching failures in cold-standby systems. The DTBN was more flexible to modeling systems where the time of occurrence of an event is important, for example, standby failure and repair. However, the static network model showed as good results as DTBN by a much more simplified approach. (author)
LEON-GONZALEZ, Roberto; VINAYAGATHASAN, Thanabalasingam
2013-01-01
This paper investigates the determinants of growth in the Asian developing economies. We use Bayesian model averaging (BMA) in the context of a dynamic panel data growth regression to overcome the uncertainty over the choice of control variables. In addition, we use a Bayesian algorithm to analyze a large number of competing models. Among the explanatory variables, we include a non-linear function of inflation that allows for threshold effects. We use an unbalanced panel data set of 27 Asian ...
Predicting water main failures using Bayesian model averaging and survival modelling approach
International Nuclear Information System (INIS)
Kabir, Golam; Tesfamariam, Solomon; Sadiq, Rehan
2015-01-01
To develop an effective preventive or proactive repair and replacement action plan, water utilities often rely on water main failure prediction models. However, in predicting the failure of water mains, uncertainty is inherent regardless of the quality and quantity of data used in the model. To improve the understanding of water main failure, a Bayesian framework is developed for predicting the failure of water mains considering uncertainties. In this study, Bayesian model averaging method (BMA) is presented to identify the influential pipe-dependent and time-dependent covariates considering model uncertainties whereas Bayesian Weibull Proportional Hazard Model (BWPHM) is applied to develop the survival curves and to predict the failure rates of water mains. To accredit the proposed framework, it is implemented to predict the failure of cast iron (CI) and ductile iron (DI) pipes of the water distribution network of the City of Calgary, Alberta, Canada. Results indicate that the predicted 95% uncertainty bounds of the proposed BWPHMs capture effectively the observed breaks for both CI and DI water mains. Moreover, the performance of the proposed BWPHMs are better compare to the Cox-Proportional Hazard Model (Cox-PHM) for considering Weibull distribution for the baseline hazard function and model uncertainties. - Highlights: • Prioritize rehabilitation and replacements (R/R) strategies of water mains. • Consider the uncertainties for the failure prediction. • Improve the prediction capability of the water mains failure models. • Identify the influential and appropriate covariates for different models. • Determine the effects of the covariates on failure
Bayesian Model Averaging of Artificial Intelligence Models for Hydraulic Conductivity Estimation
Nadiri, A.; Chitsazan, N.; Tsai, F. T.; Asghari Moghaddam, A.
2012-12-01
This research presents a Bayesian artificial intelligence model averaging (BAIMA) method that incorporates multiple artificial intelligence (AI) models to estimate hydraulic conductivity and evaluate estimation uncertainties. Uncertainty in the AI model outputs stems from error in model input as well as non-uniqueness in selecting different AI methods. Using one single AI model tends to bias the estimation and underestimate uncertainty. BAIMA employs Bayesian model averaging (BMA) technique to address the issue of using one single AI model for estimation. BAIMA estimates hydraulic conductivity by averaging the outputs of AI models according to their model weights. In this study, the model weights were determined using the Bayesian information criterion (BIC) that follows the parsimony principle. BAIMA calculates the within-model variances to account for uncertainty propagation from input data to AI model output. Between-model variances are evaluated to account for uncertainty due to model non-uniqueness. We employed Takagi-Sugeno fuzzy logic (TS-FL), artificial neural network (ANN) and neurofuzzy (NF) to estimate hydraulic conductivity for the Tasuj plain aquifer, Iran. BAIMA combined three AI models and produced better fitting than individual models. While NF was expected to be the best AI model owing to its utilization of both TS-FL and ANN models, the NF model is nearly discarded by the parsimony principle. The TS-FL model and the ANN model showed equal importance although their hydraulic conductivity estimates were quite different. This resulted in significant between-model variances that are normally ignored by using one AI model.
Bayesian network as a modelling tool for risk management in agriculture
DEFF Research Database (Denmark)
Rasmussen, Svend; Madsen, Anders L.; Lund, Mogens
. In this paper we use Bayesian networks as an integrated modelling approach for representing uncertainty and analysing risk management in agriculture. It is shown how historical farm account data may be efficiently used to estimate conditional probabilities, which are the core elements in Bayesian network models....... We further show how the Bayesian network model RiBay is used for stochastic simulation of farm income, and we demonstrate how RiBay can be used to simulate risk management at the farm level. It is concluded that the key strength of a Bayesian network is the transparency of assumptions......, and that it has the ability to link uncertainty from different external sources to budget figures and to quantify risk at the farm level....
Bayesian Proteoform Modeling Improves Protein Quantification of Global Proteomic Measurements
Energy Technology Data Exchange (ETDEWEB)
Webb-Robertson, Bobbie-Jo M.; Matzke, Melissa M.; Datta, Susmita; Payne, Samuel H.; Kang, Jiyun; Bramer, Lisa M.; Nicora, Carrie D.; Shukla, Anil K.; Metz, Thomas O.; Rodland, Karin D.; Smith, Richard D.; Tardiff, Mark F.; McDermott, Jason E.; Pounds, Joel G.; Waters, Katrina M.
2014-12-01
As the capability of mass spectrometry-based proteomics has matured, tens of thousands of peptides can be measured simultaneously, which has the benefit of offering a systems view of protein expression. However, a major challenge is that with an increase in throughput, protein quantification estimation from the native measured peptides has become a computational task. A limitation to existing computationally-driven protein quantification methods is that most ignore protein variation, such as alternate splicing of the RNA transcript and post-translational modifications or other possible proteoforms, which will affect a significant fraction of the proteome. The consequence of this assumption is that statistical inference at the protein level, and consequently downstream analyses, such as network and pathway modeling, have only limited power for biomarker discovery. Here, we describe a Bayesian model (BP-Quant) that uses statistically derived peptides signatures to identify peptides that are outside the dominant pattern, or the existence of multiple over-expressed patterns to improve relative protein abundance estimates. It is a research-driven approach that utilizes the objectives of the experiment, defined in the context of a standard statistical hypothesis, to identify a set of peptides exhibiting similar statistical behavior relating to a protein. This approach infers that changes in relative protein abundance can be used as a surrogate for changes in function, without necessarily taking into account the effect of differential post-translational modifications, processing, or splicing in altering protein function. We verify the approach using a dilution study from mouse plasma samples and demonstrate that BP-Quant achieves similar accuracy as the current state-of-the-art methods at proteoform identification with significantly better specificity. BP-Quant is available as a MatLab ® and R packages at https://github.com/PNNL-Comp-Mass-Spec/BP-Quant.
A Bayesian hierarchical model for demand curve analysis.
Ho, Yen-Yi; Nhu Vo, Tien; Chu, Haitao; Luo, Xianghua; Le, Chap T
2018-07-01
Drug self-administration experiments are a frequently used approach to assessing the abuse liability and reinforcing property of a compound. It has been used to assess the abuse liabilities of various substances such as psychomotor stimulants and hallucinogens, food, nicotine, and alcohol. The demand curve generated from a self-administration study describes how demand of a drug or non-drug reinforcer varies as a function of price. With the approval of the 2009 Family Smoking Prevention and Tobacco Control Act, demand curve analysis provides crucial evidence to inform the US Food and Drug Administration's policy on tobacco regulation, because it produces several important quantitative measurements to assess the reinforcing strength of nicotine. The conventional approach popularly used to analyze the demand curve data is individual-specific non-linear least square regression. The non-linear least square approach sets out to minimize the residual sum of squares for each subject in the dataset; however, this one-subject-at-a-time approach does not allow for the estimation of between- and within-subject variability in a unified model framework. In this paper, we review the existing approaches to analyze the demand curve data, non-linear least square regression, and the mixed effects regression and propose a new Bayesian hierarchical model. We conduct simulation analyses to compare the performance of these three approaches and illustrate the proposed approaches in a case study of nicotine self-administration in rats. We present simulation results and discuss the benefits of using the proposed approaches.
A Bayesian Approach for Summarizing and Modeling Time-Series Exposure Data with Left Censoring.
Houseman, E Andres; Virji, M Abbas
2017-08-01
Direct reading instruments are valuable tools for measuring exposure as they provide real-time measurements for rapid decision making. However, their use is limited to general survey applications in part due to issues related to their performance. Moreover, statistical analysis of real-time data is complicated by autocorrelation among successive measurements, non-stationary time series, and the presence of left-censoring due to limit-of-detection (LOD). A Bayesian framework is proposed that accounts for non-stationary autocorrelation and LOD issues in exposure time-series data in order to model workplace factors that affect exposure and estimate summary statistics for tasks or other covariates of interest. A spline-based approach is used to model non-stationary autocorrelation with relatively few assumptions about autocorrelation structure. Left-censoring is addressed by integrating over the left tail of the distribution. The model is fit using Markov-Chain Monte Carlo within a Bayesian paradigm. The method can flexibly account for hierarchical relationships, random effects and fixed effects of covariates. The method is implemented using the rjags package in R, and is illustrated by applying it to real-time exposure data. Estimates for task means and covariates from the Bayesian model are compared to those from conventional frequentist models including linear regression, mixed-effects, and time-series models with different autocorrelation structures. Simulations studies are also conducted to evaluate method performance. Simulation studies with percent of measurements below the LOD ranging from 0 to 50% showed lowest root mean squared errors for task means and the least biased standard deviations from the Bayesian model compared to the frequentist models across all levels of LOD. In the application, task means from the Bayesian model were similar to means from the frequentist models, while the standard deviations were different. Parameter estimates for covariates
Bayesian estimation of regularization parameters for deformable surface models
International Nuclear Information System (INIS)
Cunningham, G.S.; Lehovich, A.; Hanson, K.M.
1999-01-01
In this article the authors build on their past attempts to reconstruct a 3D, time-varying bolus of radiotracer from first-pass data obtained by the dynamic SPECT imager, FASTSPECT, built by the University of Arizona. The object imaged is a CardioWest total artificial heart. The bolus is entirely contained in one ventricle and its associated inlet and outlet tubes. The model for the radiotracer distribution at a given time is a closed surface parameterized by 482 vertices that are connected to make 960 triangles, with nonuniform intensity variations of radiotracer allowed inside the surface on a voxel-to-voxel basis. The total curvature of the surface is minimized through the use of a weighted prior in the Bayesian framework, as is the weighted norm of the gradient of the voxellated grid. MAP estimates for the vertices, interior intensity voxels and background count level are produced. The strength of the priors, or hyperparameters, are determined by maximizing the probability of the data given the hyperparameters, called the evidence. The evidence is calculated by first assuming that the posterior is approximately normal in the values of the vertices and voxels, and then by evaluating the integral of the multi-dimensional normal distribution. This integral (which requires evaluating the determinant of a covariance matrix) is computed by applying a recent algorithm from Bai et. al. that calculates the needed determinant efficiently. They demonstrate that the radiotracer is highly inhomogeneous in early time frames, as suspected in earlier reconstruction attempts that assumed a uniform intensity of radiotracer within the closed surface, and that the optimal choice of hyperparameters is substantially different for different time frames
Bayesian Subset Modeling for High-Dimensional Generalized Linear Models
Liang, Faming; Song, Qifan; Yu, Kai
2013-01-01
criterion model. The consistency of the resulting posterior is established under mild conditions. Further, a variable screening procedure is proposed based on the marginal inclusion probability, which shares the same properties of sure screening
Energy Technology Data Exchange (ETDEWEB)
Brown, Justin [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Hund, Lauren [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
2017-02-01
Dynamic compression experiments are being performed on complicated materials using increasingly complex drivers. The data produced in these experiments are beginning to reach a regime where traditional analysis techniques break down; requiring the solution of an inverse problem. A common measurement in dynamic experiments is an interface velocity as a function of time, and often this functional output can be simulated using a hydrodynamics code. Bayesian model calibration is a statistical framework to estimate inputs into a computational model in the presence of multiple uncertainties, making it well suited to measurements of this type. In this article, we apply Bayesian model calibration to high pressure (250 GPa) ramp compression measurements in tantalum. We address several issues speci c to this calibration including the functional nature of the output as well as parameter and model discrepancy identi ability. Speci cally, we propose scaling the likelihood function by an e ective sample size rather than modeling the autocorrelation function to accommodate the functional output and propose sensitivity analyses using the notion of `modularization' to assess the impact of experiment-speci c nuisance input parameters on estimates of material properties. We conclude that the proposed Bayesian model calibration procedure results in simple, fast, and valid inferences on the equation of state parameters for tantalum.
Nonparametric functional mapping of quantitative trait loci.
Yang, Jie; Wu, Rongling; Casella, George
2009-03-01
Functional mapping is a useful tool for mapping quantitative trait loci (QTL) that control dynamic traits. It incorporates mathematical aspects of biological processes into the mixture model-based likelihood setting for QTL mapping, thus increasing the power of QTL detection and the precision of parameter estimation. However, in many situations there is no obvious functional form and, in such cases, this strategy will not be optimal. Here we propose to use nonparametric function estimation, typically implemented with B-splines, to estimate the underlying functional form of phenotypic trajectories, and then construct a nonparametric test to find evidence of existing QTL. Using the representation of a nonparametric regression as a mixed model, the final test statistic is a likelihood ratio test. We consider two types of genetic maps: dense maps and general maps, and the power of nonparametric functional mapping is investigated through simulation studies and demonstrated by examples.
Dickhaus, Thorsten
2018-01-01
This textbook provides a self-contained presentation of the main concepts and methods of nonparametric statistical testing, with a particular focus on the theoretical foundations of goodness-of-fit tests, rank tests, resampling tests, and projection tests. The substitution principle is employed as a unified approach to the nonparametric test problems discussed. In addition to mathematical theory, it also includes numerous examples and computer implementations. The book is intended for advanced undergraduate, graduate, and postdoc students as well as young researchers. Readers should be familiar with the basic concepts of mathematical statistics typically covered in introductory statistics courses.
Directory of Open Access Journals (Sweden)
Mihaela Simionescu
2014-12-01
Full Text Available There are many types of econometric models used in predicting the inflation rate, but in this study we used a Bayesian shrinkage combination approach. This methodology is used in order to improve the predictions accuracy by including information that is not captured by the econometric models. Therefore, experts’ forecasts are utilized as prior information, for Romania these predictions being provided by Institute for Economic Forecasting (Dobrescu macromodel, National Commission for Prognosis and European Commission. The empirical results for Romanian inflation show the superiority of a fixed effects model compared to other types of econometric models like VAR, Bayesian VAR, simultaneous equations model, dynamic model, log-linear model. The Bayesian combinations that used experts’ predictions as priors, when the shrinkage parameter tends to infinite, improved the accuracy of all forecasts based on individual models, outperforming also zero and equal weights predictions and naïve forecasts.
Bayesian Modeling of ChIP-chip Data Through a High-Order Ising Model
Mo, Qianxing
2010-01-29
ChIP-chip experiments are procedures that combine chromatin immunoprecipitation (ChIP) and DNA microarray (chip) technology to study a variety of biological problems, including protein-DNA interaction, histone modification, and DNA methylation. The most important feature of ChIP-chip data is that the intensity measurements of probes are spatially correlated because the DNA fragments are hybridized to neighboring probes in the experiments. We propose a simple, but powerful Bayesian hierarchical approach to ChIP-chip data through an Ising model with high-order interactions. The proposed method naturally takes into account the intrinsic spatial structure of the data and can be used to analyze data from multiple platforms with different genomic resolutions. The model parameters are estimated using the Gibbs sampler. The proposed method is illustrated using two publicly available data sets from Affymetrix and Agilent platforms, and compared with three alternative Bayesian methods, namely, Bayesian hierarchical model, hierarchical gamma mixture model, and Tilemap hidden Markov model. The numerical results indicate that the proposed method performs as well as the other three methods for the data from Affymetrix tiling arrays, but significantly outperforms the other three methods for the data from Agilent promoter arrays. In addition, we find that the proposed method has better operating characteristics in terms of sensitivities and false discovery rates under various scenarios. © 2010, The International Biometric Society.
Bayesian spatial modeling of HIV mortality via zero-inflated Poisson models.
Musal, Muzaffer; Aktekin, Tevfik
2013-01-30
In this paper, we investigate the effects of poverty and inequality on the number of HIV-related deaths in 62 New York counties via Bayesian zero-inflated Poisson models that exhibit spatial dependence. We quantify inequality via the Theil index and poverty via the ratios of two Census 2000 variables, the number of people under the poverty line and the number of people for whom poverty status is determined, in each Zip Code Tabulation Area. The purpose of this study was to investigate the effects of inequality and poverty in addition to spatial dependence between neighboring regions on HIV mortality rate, which can lead to improved health resource allocation decisions. In modeling county-specific HIV counts, we propose Bayesian zero-inflated Poisson models whose rates are functions of both covariate and spatial/random effects. To show how the proposed models work, we used three different publicly available data sets: TIGER Shapefiles, Census 2000, and mortality index files. In addition, we introduce parameter estimation issues of Bayesian zero-inflated Poisson models and discuss MCMC method implications. Copyright © 2012 John Wiley & Sons, Ltd.
Use of SAMC for Bayesian analysis of statistical models with intractable normalizing constants
Jin, Ick Hoon
2014-03-01
Statistical inference for the models with intractable normalizing constants has attracted much attention. During the past two decades, various approximation- or simulation-based methods have been proposed for the problem, such as the Monte Carlo maximum likelihood method and the auxiliary variable Markov chain Monte Carlo methods. The Bayesian stochastic approximation Monte Carlo algorithm specifically addresses this problem: It works by sampling from a sequence of approximate distributions with their average converging to the target posterior distribution, where the approximate distributions can be achieved using the stochastic approximation Monte Carlo algorithm. A strong law of large numbers is established for the Bayesian stochastic approximation Monte Carlo estimator under mild conditions. Compared to the Monte Carlo maximum likelihood method, the Bayesian stochastic approximation Monte Carlo algorithm is more robust to the initial guess of model parameters. Compared to the auxiliary variable MCMC methods, the Bayesian stochastic approximation Monte Carlo algorithm avoids the requirement for perfect samples, and thus can be applied to many models for which perfect sampling is not available or very expensive. The Bayesian stochastic approximation Monte Carlo algorithm also provides a general framework for approximate Bayesian analysis. © 2012 Elsevier B.V. All rights reserved.
Lesaffre, Emmanuel
2012-01-01
The growth of biostatistics has been phenomenal in recent years and has been marked by considerable technical innovation in both methodology and computational practicality. One area that has experienced significant growth is Bayesian methods. The growing use of Bayesian methodology has taken place partly due to an increasing number of practitioners valuing the Bayesian paradigm as matching that of scientific discovery. In addition, computational advances have allowed for more complex models to be fitted routinely to realistic data sets. Through examples, exercises and a combination of introd
Parameterizing Bayesian network Representations of Social-Behavioral Models by Expert Elicitation
Energy Technology Data Exchange (ETDEWEB)
Walsh, Stephen J.; Dalton, Angela C.; Whitney, Paul D.; White, Amanda M.
2010-05-23
Bayesian networks provide a general framework with which to model many natural phenomena. The mathematical nature of Bayesian networks enables a plethora of model validation and calibration techniques: e.g parameter estimation, goodness of fit tests, and diagnostic checking of the model assumptions. However, they are not free of shortcomings. Parameter estimation from relevant extant data is a common approach to calibrating the model parameters. In practice it is not uncommon to find oneself lacking adequate data to reliably estimate all model parameters. In this paper we present the early development of a novel application of conjoint analysis as a method for eliciting and modeling expert opinions and using the results in a methodology for calibrating the parameters of a Bayesian network.
Fast and accurate Bayesian model criticism and conflict diagnostics using R-INLA
Ferkingstad, Egil
2017-10-16
Bayesian hierarchical models are increasingly popular for realistic modelling and analysis of complex data. This trend is accompanied by the need for flexible, general and computationally efficient methods for model criticism and conflict detection. Usually, a Bayesian hierarchical model incorporates a grouping of the individual data points, as, for example, with individuals in repeated measurement data. In such cases, the following question arises: Are any of the groups “outliers,” or in conflict with the remaining groups? Existing general approaches aiming to answer such questions tend to be extremely computationally demanding when model fitting is based on Markov chain Monte Carlo. We show how group-level model criticism and conflict detection can be carried out quickly and accurately through integrated nested Laplace approximations (INLA). The new method is implemented as a part of the open-source R-INLA package for Bayesian computing (http://r-inla.org).
Bhattacharya, Abhishek; Dunson, David B
2012-08-01
This article considers a broad class of kernel mixture density models on compact metric spaces and manifolds. Following a Bayesian approach with a nonparametric prior on the location mixing distribution, sufficient conditions are obtained on the kernel, prior and the underlying space for strong posterior consistency at any continuous density. The prior is also allowed to depend on the sample size n and sufficient conditions are obtained for weak and strong consistency. These conditions are verified on compact Euclidean spaces using multivariate Gaussian kernels, on the hypersphere using a von Mises-Fisher kernel and on the planar shape space using complex Watson kernels.
Boos, Moritz; Seer, Caroline; Lange, Florian; Kopp, Bruno
2016-01-01
Cognitive determinants of probabilistic inference were examined using hierarchical Bayesian modeling techniques. A classic urn-ball paradigm served as experimental strategy, involving a factorial two (prior probabilities) by two (likelihoods) design. Five computational models of cognitive processes were compared with the observed behavior. Parameter-free Bayesian posterior probabilities and parameter-free base rate neglect provided inadequate models of probabilistic inference. The introduction of distorted subjective probabilities yielded more robust and generalizable results. A general class of (inverted) S-shaped probability weighting functions had been proposed; however, the possibility of large differences in probability distortions not only across experimental conditions, but also across individuals, seems critical for the model's success. It also seems advantageous to consider individual differences in parameters of probability weighting as being sampled from weakly informative prior distributions of individual parameter values. Thus, the results from hierarchical Bayesian modeling converge with previous results in revealing that probability weighting parameters show considerable task dependency and individual differences. Methodologically, this work exemplifies the usefulness of hierarchical Bayesian modeling techniques for cognitive psychology. Theoretically, human probabilistic inference might be best described as the application of individualized strategic policies for Bayesian belief revision.
A Tutorial Introduction to Bayesian Models of Cognitive Development
2011-01-01
typewriter with an infinite amount of paper. There is a space of documents that it is capable of producing, which includes things like The Tempest and does...not include, say, a Vermeer painting or a poem written in Russian. This typewriter represents a means of generating the hypothesis space for a Bayesian...learner: each possible document that can be typed on it is a hypothesis, the infinite set of documents producible by the typewriter is the latent
Wheeler, David C.; Hickson, DeMarc A.; Waller, Lance A.
2010-01-01
Many diagnostic tools and goodness-of-fit measures, such as the Akaike information criterion (AIC) and the Bayesian deviance information criterion (DIC), are available to evaluate the overall adequacy of linear regression models. In addition, visually assessing adequacy in models has become an essential part of any regression analysis. In this paper, we focus on a spatial consideration of the local DIC measure for model selection and goodness-of-fit evaluation. We use a partitioning of the DIC into the local DIC, leverage, and deviance residuals to assess local model fit and influence for both individual observations and groups of observations in a Bayesian framework. We use visualization of the local DIC and differences in local DIC between models to assist in model selection and to visualize the global and local impacts of adding covariates or model parameters. We demonstrate the utility of the local DIC in assessing model adequacy using HIV prevalence data from pregnant women in the Butare province of Rwanda during 1989-1993 using a range of linear model specifications, from global effects only to spatially varying coefficient models, and a set of covariates related to sexual behavior. Results of applying the diagnostic visualization approach include more refined model selection and greater understanding of the models as applied to the data. PMID:21243121
Kaewprag, Pacharmon; Newton, Cheryl; Vermillion, Brenda; Hyun, Sookyung; Huang, Kun; Machiraju, Raghu
2017-07-05
We develop predictive models enabling clinicians to better understand and explore patient clinical data along with risk factors for pressure ulcers in intensive care unit patients from electronic health record data. Identifying accurate risk factors of pressure ulcers is essential to determining appropriate prevention strategies; in this work we examine medication, diagnosis, and traditional Braden pressure ulcer assessment scale measurements as patient features. In order to predict pressure ulcer incidence and better understand the structure of related risk factors, we construct Bayesian networks from patient features. Bayesian network nodes (features) and edges (conditional dependencies) are simplified with statistical network techniques. Upon reviewing a network visualization of our model, our clinician collaborators were able to identify strong relationships between risk factors widely recognized as associated with pressure ulcers. We present a three-stage framework for predictive analysis of patient clinical data: 1) Developing electronic health record feature extraction functions with assistance of clinicians, 2) simplifying features, and 3) building Bayesian network predictive models. We evaluate all combinations of Bayesian network models from different search algorithms, scoring functions, prior structure initializations, and sets of features. From the EHRs of 7,717 ICU patients, we construct Bayesian network predictive models from 86 medication, diagnosis, and Braden scale features. Our model not only identifies known and suspected high PU risk factors, but also substantially increases sensitivity of the prediction - nearly three times higher comparing to logistical regression models - without sacrificing the overall accuracy. We visualize a representative model with which our clinician collaborators identify strong relationships between risk factors widely recognized as associated with pressure ulcers. Given the strong adverse effect of pressure ulcers
Testing students' e-learning via Facebook through Bayesian structural equation modeling.
Salarzadeh Jenatabadi, Hashem; Moghavvemi, Sedigheh; Wan Mohamed Radzi, Che Wan Jasimah Bt; Babashamsi, Parastoo; Arashi, Mohammad
2017-01-01
Learning is an intentional activity, with several factors affecting students' intention to use new learning technology. Researchers have investigated technology acceptance in different contexts by developing various theories/models and testing them by a number of means. Although most theories/models developed have been examined through regression or structural equation modeling, Bayesian analysis offers more accurate data analysis results. To address this gap, the unified theory of acceptance and technology use in the context of e-learning via Facebook are re-examined in this study using Bayesian analysis. The data (S1 Data) were collected from 170 students enrolled in a business statistics course at University of Malaya, Malaysia, and tested with the maximum likelihood and Bayesian approaches. The difference between the two methods' results indicates that performance expectancy and hedonic motivation are the strongest factors influencing the intention to use e-learning via Facebook. The Bayesian estimation model exhibited better data fit than the maximum likelihood estimator model. The results of the Bayesian and maximum likelihood estimator approaches are compared and the reasons for the result discrepancy are deliberated.
Testing students' e-learning via Facebook through Bayesian structural equation modeling.
Directory of Open Access Journals (Sweden)
Hashem Salarzadeh Jenatabadi
Full Text Available Learning is an intentional activity, with several factors affecting students' intention to use new learning technology. Researchers have investigated technology acceptance in different contexts by developing various theories/models and testing them by a number of means. Although most theories/models developed have been examined through regression or structural equation modeling, Bayesian analysis offers more accurate data analysis results. To address this gap, the unified theory of acceptance and technology use in the context of e-learning via Facebook are re-examined in this study using Bayesian analysis. The data (S1 Data were collected from 170 students enrolled in a business statistics course at University of Malaya, Malaysia, and tested with the maximum likelihood and Bayesian approaches. The difference between the two methods' results indicates that performance expectancy and hedonic motivation are the strongest factors influencing the intention to use e-learning via Facebook. The Bayesian estimation model exhibited better data fit than the maximum likelihood estimator model. The results of the Bayesian and maximum likelihood estimator approaches are compared and the reasons for the result discrepancy are deliberated.
Directory of Open Access Journals (Sweden)
Moritz eBoos
2016-05-01
Full Text Available Cognitive determinants of probabilistic inference were examined using hierarchical Bayesian modelling techniques. A classic urn-ball paradigm served as experimental strategy, involving a factorial two (prior probabilities by two (likelihoods design. Five computational models of cognitive processes were compared with the observed behaviour. Parameter-free Bayesian posterior probabilities and parameter-free base rate neglect provided inadequate models of probabilistic inference. The introduction of distorted subjective probabilities yielded more robust and generalizable results. A general class of (inverted S-shaped probability weighting functions had been proposed; however, the possibility of large differences in probability distortions not only across experimental conditions, but also across individuals, seems critical for the model’s success. It also seems advantageous to consider individual differences in parameters of probability weighting as being sampled from weakly informative prior distributions of individual parameter values. Thus, the results from hierarchical Bayesian modelling converge with previous results in revealing that probability weighting parameters show considerable task dependency and individual differences. Methodologically, this work exemplifies the usefulness of hierarchical Bayesian modelling techniques for cognitive psychology. Theoretically, human probabilistic inference might be best described as the application of individualized strategic policies for Bayesian belief revision.
Directory of Open Access Journals (Sweden)
J. P. Werner
2015-03-01
Full Text Available Reconstructions of the late-Holocene climate rely heavily upon proxies that are assumed to be accurately dated by layer counting, such as measurements of tree rings, ice cores, and varved lake sediments. Considerable advances could be achieved if time-uncertain proxies were able to be included within these multiproxy reconstructions, and if time uncertainties were recognized and correctly modeled for proxies commonly treated as free of age model errors. Current approaches for accounting for time uncertainty are generally limited to repeating the reconstruction using each one of an ensemble of age models, thereby inflating the final estimated uncertainty – in effect, each possible age model is given equal weighting. Uncertainties can be reduced by exploiting the inferred space–time covariance structure of the climate to re-weight the possible age models. Here, we demonstrate how Bayesian hierarchical climate reconstruction models can be augmented to account for time-uncertain proxies. Critically, although a priori all age models are given equal probability of being correct, the probabilities associated with the age models are formally updated within the Bayesian framework, thereby reducing uncertainties. Numerical experiments show that updating the age model probabilities decreases uncertainty in the resulting reconstructions, as compared with the current de facto standard of sampling over all age models, provided there is sufficient information from other data sources in the spatial region of the time-uncertain proxy. This approach can readily be generalized to non-layer-counted proxies, such as those derived from marine sediments.
Lee, L.; Helsel, D.
2007-01-01
Analysis of low concentrations of trace contaminants in environmental media often results in left-censored data that are below some limit of analytical precision. Interpretation of values becomes complicated when there are multiple detection limits in the data-perhaps as a result of changing analytical precision over time. Parametric and semi-parametric methods, such as maximum likelihood estimation and robust regression on order statistics, can be employed to model distributions of multiply censored data and provide estimates of summary statistics. However, these methods are based on assumptions about the underlying distribution of data. Nonparametric methods provide an alternative that does not require such assumptions. A standard nonparametric method for estimating summary statistics of multiply-censored data is the Kaplan-Meier (K-M) method. This method has seen widespread usage in the medical sciences within a general framework termed "survival analysis" where it is employed with right-censored time-to-failure data. However, K-M methods are equally valid for the left-censored data common in the geosciences. Our S-language software provides an analytical framework based on K-M methods that is tailored to the needs of the earth and environmental sciences community. This includes routines for the generation of empirical cumulative distribution functions, prediction or exceedance probabilities, and related confidence limits computation. Additionally, our software contains K-M-based routines for nonparametric hypothesis testing among an unlimited number of grouping variables. A primary characteristic of K-M methods is that they do not perform extrapolation and interpolation. Thus, these routines cannot be used to model statistics beyond the observed data range or when linear interpolation is desired. For such applications, the aforementioned parametric and semi-parametric methods must be used.
A comprehensive probabilistic analysis model of oil pipelines network based on Bayesian network
Zhang, C.; Qin, T. X.; Jiang, B.; Huang, C.
2018-02-01
Oil pipelines network is one of the most important facilities of energy transportation. But oil pipelines network accident may result in serious disasters. Some analysis models for these accidents have been established mainly based on three methods, including event-tree, accident simulation and Bayesian network. Among these methods, Bayesian network is suitable for probabilistic analysis. But not all the important influencing factors are considered and the deployment rule of the factors has not been established. This paper proposed a probabilistic analysis model of oil pipelines network based on Bayesian network. Most of the important influencing factors, including the key environment condition and emergency response are considered in this model. Moreover, the paper also introduces a deployment rule for these factors. The model can be used in probabilistic analysis and sensitive analysis of oil pipelines network accident.
Bayesian Comparison of Alternative Graded Response Models for Performance Assessment Applications
Zhu, Xiaowen; Stone, Clement A.
2012-01-01
This study examined the relative effectiveness of Bayesian model comparison methods in selecting an appropriate graded response (GR) model for performance assessment applications. Three popular methods were considered: deviance information criterion (DIC), conditional predictive ordinate (CPO), and posterior predictive model checking (PPMC). Using…
Chen, Mingjie; Izady, Azizallah; Abdalla, Osman A.; Amerjeed, Mansoor
2018-02-01
Bayesian inference using Markov Chain Monte Carlo (MCMC) provides an explicit framework for stochastic calibration of hydrogeologic models accounting for uncertainties; however, the MCMC sampling entails a large number of model calls, and could easily become computationally unwieldy if the high-fidelity hydrogeologic model simulation is time consuming. This study proposes a surrogate-based Bayesian framework to address this notorious issue, and illustrates the methodology by inverse modeling a regional MODFLOW model. The high-fidelity groundwater model is approximated by a fast statistical model using Bagging Multivariate Adaptive Regression Spline (BMARS) algorithm, and hence the MCMC sampling can be efficiently performed. In this study, the MODFLOW model is developed to simulate the groundwater flow in an arid region of Oman consisting of mountain-coast aquifers, and used to run representative simulations to generate training dataset for BMARS model construction. A BMARS-based Sobol' method is also employed to efficiently calculate input parameter sensitivities, which are used to evaluate and rank their importance for the groundwater flow model system. According to sensitivity analysis, insensitive parameters are screened out of Bayesian inversion of the MODFLOW model, further saving computing efforts. The posterior probability distribution of input parameters is efficiently inferred from the prescribed prior distribution using observed head data, demonstrating that the presented BMARS-based Bayesian framework is an efficient tool to reduce parameter uncertainties of a groundwater system.
Couasnon, Anaïs; Sebastian, Antonia; Morales-Nápoles, Oswaldo
2017-04-01
Recent research has highlighted the increased risk of compound flooding in the U.S. In coastal catchments, an elevated downstream water level, resulting from high tide and/or storm surge, impedes drainage creating a backwater effect that may exacerbate flooding in the riverine environment. Catchments exposed to tropical cyclone activity along the Gulf of Mexico and Atlantic coasts are particularly vulnerable. However, conventional flood hazard models focus mainly on precipitation-induced flooding and few studies accurately represent the hazard associated with the interaction between discharge and elevated downstream water levels. This study presents a method to derive stochastic boundary conditions for a coastal watershed. Mean daily discharge and maximum daily residual water levels are used to build a non-parametric Bayesian network (BN) based on copulas. Stochastic boundary conditions for the watershed are extracted from the BN and input into a 1-D process-based hydraulic model to obtain water surface elevations in the main channel of the catchment. The method is applied to a section of the Houston Ship Channel (Buffalo Bayou) in Southeast Texas. Data at six stream gages and two tidal stations are used to build the BN and 100-year joint return period events are modeled. We find that the dependence relationship between the daily residual water level and the mean daily discharge in the catchment can be represented by a Gumbel copula (Spearman's rank correlation coefficient of 0.31) and that they result in higher water levels in the mid- to upstream reaches of the watershed than when modeled independently. This indicates that conventional (deterministic) methods may underestimate the flood hazard associated with compound flooding in the riverine environment and that such interactions should not be neglected in future coastal flood hazard studies.
International Nuclear Information System (INIS)
Zhou Zhongbao; Zhou Jinglun; Sun Quan
2007-01-01
Effect of Human factors on system safety is increasingly serious, which is often ignored in traditional probabilistic safety assessment methods however. A new probabilistic safety assessment model based on object-oriented Bayesian networks is proposed in this paper. Human factors are integrated into the existed event sequence diagrams. Then the classes of the object-oriented Bayesian networks are constructed which are converted to latent Bayesian networks for inference. Finally, the inference results are integrated into event sequence diagrams for probabilistic safety assessment. The new method is applied to the accident of loss of coolant in a nuclear power plant. the results show that the model is not only applicable to real-time situation assessment, but also applicable to situation assessment based certain amount of information. The modeling complexity is kept down and the new method is appropriate to large complex systems due to the thoughts of object-oriented. (authors)
A model-based Bayesian framework for ECG beat segmentation
International Nuclear Information System (INIS)
Sayadi, O; Shamsollahi, M B
2009-01-01
The study of electrocardiogram (ECG) waveform amplitudes, timings and patterns has been the subject of intense research, for it provides a deep insight into the diagnostic features of the heart's functionality. In some recent works, a Bayesian filtering paradigm has been proposed for denoising and compression of ECG signals. In this paper, it is shown that this framework may be effectively used for ECG beat segmentation and extraction of fiducial points. Analytic expressions for the determination of points and intervals are derived and evaluated on various real ECG signals. Simulation results show that the method can contribute to and enhance the clinical ECG beat segmentation performance