Bayesian online learning of the hazard rate in change-point problems.
Wilson, Robert C; Nassar, Matthew R; Gold, Joshua I
2010-09-01
Change-point models are generative models of time-varying data in which the underlying generative parameters undergo discontinuous changes at different points in time known as change points. Change-points often represent important events in the underlying processes, like a change in brain state reflected in EEG data or a change in the value of a company reflected in its stock price. However, change-points can be difficult to identify in noisy data streams. Previous attempts to identify change-points online using Bayesian inference relied on specifying in advance the rate at which they occur, called the hazard rate (h). This approach leads to predictions that can depend strongly on the choice of h and is unable to deal optimally with systems in which h is not constant in time. In this letter, we overcome these limitations by developing a hierarchical extension to earlier models. This approach allows h itself to be inferred from the data, which in turn helps to identify when change-points occur. We show that our approach can effectively identify change-points in both toy and real data sets with complex hazard rates and how it can be used as an ideal-observer model for human and animal behavior when faced with rapidly changing inputs.
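The fixed-hazard setting that this abstract describes as the earlier approach can be sketched in a few lines. The sketch below is not the hierarchical model of the letter: it assumes a constant, user-supplied hazard rate h, Gaussian observations with known noise variance, and a segment mean redrawn from a Gaussian prior at each change-point. The function name and all parameter names are illustrative assumptions.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def run_length_posteriors(data, h, prior_mean=0.0, prior_var=10.0, noise_var=1.0):
    """Return P(run length | data so far) after each observation.

    Assumption: h is a fixed hazard rate, i.e. the per-step probability
    that a change-point occurs. Entry k of each returned list is the
    probability that the current run has lasted k observations.
    """
    probs = [1.0]            # before any data, the run length is 0
    means = [prior_mean]     # posterior mean of the segment mean, per run length
    variances = [prior_var]  # posterior variance of the segment mean, per run length
    history = []
    for x in data:
        # Predictive density of x under each candidate run length.
        pred = [gaussian_pdf(x, m, v + noise_var) for m, v in zip(means, variances)]
        # Either the run grows by one step (no change) or it resets to 0 (change).
        growth = [p * q * (1.0 - h) for p, q in zip(probs, pred)]
        change = sum(p * q * h for p, q in zip(probs, pred))
        unnormalised = [change] + growth
        total = sum(unnormalised)
        probs = [p / total for p in unnormalised]
        # Conjugate Gaussian update of the segment-mean posterior per run length.
        new_means, new_vars = [prior_mean], [prior_var]
        for m, v in zip(means, variances):
            post_var = 1.0 / (1.0 / v + 1.0 / noise_var)
            new_means.append(post_var * (m / v + x / noise_var))
            new_vars.append(post_var)
        means, variances = new_means, new_vars
        history.append(probs)
    return history
```

Because h enters every update directly, a poorly chosen value shifts every run-length posterior, which is the sensitivity the abstract points to as motivation for inferring h from the data.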
Growth Curve Analysis and Change-Points Detection in Extremes
Meng, Rui
2016-05-15
The thesis consists of two coherent projects. The first project presents the results of evaluating salinity tolerance in barley using growth curve analysis where different growth trajectories are observed within barley families. The study of salinity tolerance in plants is crucial to understanding plant growth and productivity. Because fully-automated smarthouses with conveyor systems allow non-destructive and high-throughput phenotyping of large numbers of plants, it is now possible to apply advanced statistical tools to analyze daily measurements and to study salinity tolerance. To compare different growth patterns of barley variates, we use functional data analysis techniques to analyze the daily projected shoot areas. In particular, we apply the curve registration method to align all the curves from the same barley family in order to summarize the family-wise features. We also illustrate how to use statistical modeling to account for spatial variation in microclimate in smarthouses and for temporal variation across runs, which is crucial for identifying traits of the barley variates. In our analysis, we show that the concentrations of sodium and potassium in leaves are negatively correlated, and their interactions are associated with the degree of salinity tolerance. The second project studies change-points detection methods in extremes when multiple time series data are available. Motivated by the scientific question of whether the chances to experience extreme weather are different in different seasons of a year, we develop a change-points detection model to study changes in extremes or in the tail of a distribution. Most existing models identify seasons from multiple yearly time series assuming a season or a change-point location remains exactly the same across years. In this work, we propose a random effect model that allows the change-point to vary from year to year, following a given distribution. Both parametric and nonparametric methods are developed
The development of an information criterion for Change-Point Analysis
Wiggins, Paul A
2015-01-01
Change-point analysis is a flexible and computationally tractable tool for the analysis of time series data from systems that transition between discrete states and whose observables are corrupted by noise. The change-point algorithm is used to identify the time indices (change points) at which the system transitions between these discrete states. We present a unified information-based approach to testing for the existence of change points. This new approach reconciles two previously disparate approaches to Change-Point Analysis (frequentist and information-based) for testing transitions between states. The resulting method is statistically principled, parameter and prior free, and applicable to a wide range of change-point problems.
Gelman, Andrew; Stern, Hal S; Dunson, David B; Vehtari, Aki; Rubin, Donald B
2013-01-01
FUNDAMENTALS OF BAYESIAN INFERENCE: Probability and Inference; Single-Parameter Models; Introduction to Multiparameter Models; Asymptotics and Connections to Non-Bayesian Approaches; Hierarchical Models. FUNDAMENTALS OF BAYESIAN DATA ANALYSIS: Model Checking; Evaluating, Comparing, and Expanding Models; Modeling Accounting for Data Collection; Decision Analysis. ADVANCED COMPUTATION: Introduction to Bayesian Computation; Basics of Markov Chain Simulation; Computationally Efficient Markov Chain Simulation; Modal and Distributional Approximations. REGRESSION MODELS: Introduction to Regression Models; Hierarchical Linear
Achcar, J A; Martinez, E Z; Ruffino-Netto, A; Paulino, C D; Soares, P
2008-12-01
We considered a Bayesian analysis for the prevalence of tuberculosis cases in New York City from 1970 to 2000. This counting dataset presented two change-points during this period. We modelled this counting dataset considering non-homogeneous Poisson processes in the presence of the two change-points. A Bayesian analysis for the data is considered using Markov chain Monte Carlo methods. Simulated Gibbs samples for the parameters of interest were obtained using WinBUGS software.
Yuan, Ying; MacKinnon, David P.
2009-01-01
In this article, we propose Bayesian analysis of mediation effects. Compared with conventional frequentist mediation analysis, the Bayesian approach has several advantages. First, it allows researchers to incorporate prior information into the mediation analysis, thus potentially improving the efficiency of estimates. Second, under the Bayesian…
Bayesian analysis of longitudinal Johne's disease diagnostic data without a gold standard test
DEFF Research Database (Denmark)
Wang, C.; Turnbull, B.W.; Nielsen, Søren Saxmose
2011-01-01
A Bayesian methodology was developed based on a latent change-point model to evaluate the performance of milk ELISA and fecal culture tests for longitudinal Johne's disease diagnostic data. The situation of no perfect reference test was considered; that is, no “gold standard.” A change-point process with a Weibull survival hazard function was used to model the progression of the hidden disease status. The model adjusted for the fixed effects of covariate variables and random effects of subject on the diagnostic testing procedure. Markov chain Monte Carlo methods were used to compute... An application is presented to an analysis of ELISA and fecal culture test outcomes in the diagnostic testing of paratuberculosis (Johne's disease) for a Danish longitudinal study from January 2000 to March 2003. The posterior probability criterion based on the Bayesian model with 4 repeated observations has...
Segmentation and Estimation of Change-point Models
Fang, Xiao; Li, Jian; Siegmund, David
2016-01-01
To segment a sequence of independent random variables at an unknown number of change-points, we introduce new procedures that are based on thresholding the likelihood ratio statistic. We also study confidence regions based on the likelihood ratio statistic for the change-points and joint confidence regions for the change-points and the parameter values. Applications to segment an array CGH analysis of the BT474 cell line are discussed.
Bayesian Exploratory Factor Analysis
DEFF Research Database (Denmark)
Conti, Gabriella; Frühwirth-Schnatter, Sylvia; Heckman, James J.;
2014-01-01
This paper develops and applies a Bayesian approach to Exploratory Factor Analysis that improves on ad hoc classical approaches. Our framework relies on dedicated factor models and simultaneously determines the number of factors, the allocation of each measurement to a unique factor, and the corresponding factor loadings. Classical identification criteria are applied and integrated into our Bayesian procedure to generate models that are stable and clearly interpretable. A Monte Carlo study confirms the validity of the approach. The method is used to produce interpretable low dimensional aggregates...
Bayesian Independent Component Analysis
DEFF Research Database (Denmark)
Winther, Ole; Petersen, Kaare Brandt
2007-01-01
In this paper we present an empirical Bayesian framework for independent component analysis. The framework provides estimates of the sources, the mixing matrix and the noise parameters, and is flexible with respect to choice of source prior and the number of sources and sensors. Inside the engine...... in a Matlab toolbox, is demonstrated for non-negative decompositions and compared with non-negative matrix factorization....
Bayesian nonparametric data analysis
Müller, Peter; Jara, Alejandro; Hanson, Tim
2015-01-01
This book reviews nonparametric Bayesian methods and models that have proven useful in the context of data analysis. Rather than providing an encyclopedic review of probability models, the book’s structure follows a data analysis perspective. As such, the chapters are organized by traditional data analysis problems. In selecting specific nonparametric models, simpler and more traditional models are favored over specialized ones. The discussed methods are illustrated with a wealth of examples, including applications ranging from stylized examples to case studies from recent literature. The book also includes an extensive discussion of computational methods and details on their implementation. R code for many examples is included in on-line software pages.
Sociological Environmental Causes are Insufficient to Explain Autism Changepoints of Incidence.
Deisher, Theresa A; Doan, Ngoc V
2015-01-01
The Environmental Protection Agency (EPA) recently published a study analyzing time trends in the cumulative incidence of autistic disorder (AD) in the U.S., Denmark, and worldwide. A birth year changepoint (CP) around 1988 was identified. It has been argued that the epidemic rise in autism over the past three decades is partly due to a combination of sociologic factors along with the potential contribution of thimerosal containing vaccines. Our work conducted an expanded analysis of AD changepoints in CA and U.S., and determined whether changepoints in time trends of AD rates temporally coincide with changepoints for the proposed causative sociologic and environmental factors. Birth year changepoints were identified for 1980.9 [95% CI, 1978.6-1983.1], 1988.4 [95% CI, 1987.8-1989.0] and 1995.6 [95% CI, 1994.6-1996.6] for CA and U.S. data, confirming and expanding the EPA results. AD birth year changepoints significantly precede the changepoints calculated for indicators of increased social awareness of AD. Furthermore, the 1981 and 1996 AD birth year changepoints don't coincide with any predicted changepoints based on altered thimerosal content in vaccines nor on revised editions of the Diagnostic and Statistical Manual of Mental Disorders (DSM).
Bayesian Methods for Statistical Analysis
Puza, Borek
2015-01-01
Bayesian methods for statistical analysis is a book on statistical methods for analysing a wide variety of data. The book consists of 12 chapters, starting with basic concepts and covering numerous topics, including Bayesian estimation, decision theory, prediction, hypothesis testing, hierarchical models, Markov chain Monte Carlo methods, finite population inference, biased sampling and nonignorable nonresponse. The book contains many exercises, all with worked solutions, including complete c...
Bayesian analysis of rare events
Energy Technology Data Exchange (ETDEWEB)
Straub, Daniel; Papaioannou, Iason; Betz, Wolfgang
2016-06-01
In many areas of engineering and science there is an interest in predicting the probability of rare events, in particular in applications related to safety and security. Increasingly, such predictions are made through computer models of physical systems in an uncertainty quantification framework. Additionally, with advances in IT, monitoring and sensor technology, an increasing amount of data on the performance of the systems is collected. This data can be used to reduce uncertainty, improve the probability estimates and consequently enhance the management of rare events and associated risks. Bayesian analysis is the ideal method to include the data into the probabilistic model. It ensures a consistent probabilistic treatment of uncertainty, which is central in the prediction of rare events, where extrapolation from the domain of observation is common. We present a framework for performing Bayesian updating of rare event probabilities, termed BUS. It is based on a reinterpretation of the classical rejection-sampling approach to Bayesian analysis, which enables the use of established methods for estimating probabilities of rare events. By drawing upon these methods, the framework makes use of their computational efficiency. These methods include the First-Order Reliability Method (FORM), tailored importance sampling (IS) methods and Subset Simulation (SuS). In this contribution, we briefly review these methods in the context of the BUS framework and investigate their applicability to Bayesian analysis of rare events in different settings. We find that, for some applications, FORM can be highly efficient and is surprisingly accurate, enabling Bayesian analysis of rare events with just a few model evaluations. In a general setting, BUS implemented through IS and SuS is more robust and flexible.
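The classical rejection-sampling view of Bayesian updating that the abstract says BUS reinterprets can be illustrated as follows. This is only the plain rejection sampler, not BUS itself and not FORM, importance sampling or Subset Simulation. The toy model (standard normal prior, one Gaussian observation) and all names are assumptions chosen for illustration.

```python
import math
import random


def rejection_posterior_samples(n_draws, likelihood, prior_sampler, c, rng):
    """Classical rejection sampling for a posterior distribution.

    A prior draw theta is accepted when u <= c * likelihood(theta) with
    u ~ Uniform(0, 1). The constant c must satisfy c * likelihood(theta) <= 1
    for every theta so that the accepted draws follow the posterior.
    """
    accepted = []
    for _ in range(n_draws):
        theta = prior_sampler(rng)
        if rng.random() <= c * likelihood(theta):
            accepted.append(theta)
    return accepted


# Toy model (an assumption): theta ~ N(0, 1), one observation y = 1 with
# unit-variance Gaussian noise, so the exact posterior is N(0.5, 0.5).
Y_OBS = 1.0


def toy_likelihood(theta):
    # Unnormalised Gaussian likelihood; its maximum is 1, so c = 1 is valid.
    return math.exp(-0.5 * (Y_OBS - theta) ** 2)


def toy_prior_sampler(rng):
    return rng.gauss(0.0, 1.0)


samples = rejection_posterior_samples(
    20000, toy_likelihood, toy_prior_sampler, c=1.0, rng=random.Random(0)
)
posterior_mean = sum(samples) / len(samples)
```

The acceptance condition defines an event in the joint space of theta and u, which is what makes it natural to estimate posterior quantities with methods designed for rare-event probabilities.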
ANALYSIS OF BAYESIAN CLASSIFIER ACCURACY
Directory of Open Access Journals (Sweden)
Felipe Schneider Costa
2013-01-01
The naïve Bayes classifier is considered one of the most effective classification algorithms today, competing with more modern and sophisticated classifiers. Despite being based on the unrealistic (naïve) assumption that all variables are independent given the output class, the classifier provides proper results. However, depending on the scenario utilized (network structure, number of samples or training cases, number of variables), the network may not provide appropriate results. This study uses a process variable selection, using the chi-squared test to verify the existence of dependence between variables in the data model, in order to identify the reasons which prevent a Bayesian network from providing good performance. A detailed analysis of the data is also proposed, unlike other existing work, as well as adjustments in case of limit values between two adjacent classes. Furthermore, variable weights are used in the calculation of a posteriori probabilities, calculated with mutual information function. Tests were applied in both a naïve Bayesian network and a hierarchical Bayesian network. After testing, a significant reduction in error rate has been observed. The naïve Bayesian network presented a drop in error rates from twenty five percent to five percent, considering the initial results of the classification process. In the hierarchical network, there was not only a drop in fifteen percent error rate, but also the final result came to zero.
Staudacher, M.; Telser, S.; Amann, A.; Hinterhuber, H.; Ritsch-Marte, M.
2005-04-01
We present a novel scaling-dependent measure for time series analysis, the progressive detrended fluctuation analysis (PDFA). Since this method progressively includes and analyzes all data points of the time series, it is suitable for on-line change-point detection: Sudden changes in the statistics of the data points, in the type of correlation or in the statistical variance, or both, are reliably indicated and localized in time. This is first shown for numerous artificially generated data sets of Gaussian random numbers. Also time series with various non-stationarities, such as non-polynomial trends and “spiking”, are included as examples. Although generally applicable, our method was specifically developed as a tool for numerical sleep evaluation based on heart rate variability in the ECG-channel of polysomnographic whole night recordings. It is demonstrated that PDFA can detect specific sleep stage transitions, typically ascending transitions involving sympathetic activation as for example short episodes of wakefulness, and that the method is capable of discerning between NREM sleep and REM sleep.
Bayesian Model Averaging for Propensity Score Analysis
Kaplan, David; Chen, Jianshen
2013-01-01
The purpose of this study is to explore Bayesian model averaging in the propensity score context. Previous research on Bayesian propensity score analysis does not take into account model uncertainty. In this regard, an internally consistent Bayesian framework for model building and estimation must also account for model uncertainty. The…
Change-Point Estimates in Longitudinal Binary Data
Institute of Scientific and Technical Information of China (English)
WU Xiaoru; YANG Ying
2008-01-01
Most change-point models assume that the response is continuous or cross sectional binary. However, in many public health problems, the data is longitudinal binary. There are few studies of change-point problems for longitudinal outcomes. This paper describes a flexible change-point model which includes random-effects and takes into account the difference between various individuals in longitudinal binary data. A transition function is used to make the linear-linear logistic model differentiable at the change-point. The location of the change-point is estimated using the maximum likelihood method. Adjustment of the transition parameter from zero to one controls the sharpness of the transition. The performance of this estimation procedure is illustrated with simulations using SAS/proc nlmixed and a detailed analysis of data relating hormone levels and ovary functions based on data from the Obstetrics and Gynecology Hospital, Medical Center of Fudan University.
STATISTICAL BAYESIAN ANALYSIS OF EXPERIMENTAL DATA.
Directory of Open Access Journals (Sweden)
AHLAM LABDAOUI
2012-12-01
The Bayesian researcher should know the basic ideas underlying Bayesian methodology and the computational tools used in modern Bayesian econometrics. Some of the most important methods of posterior simulation are Monte Carlo integration, importance sampling, Gibbs sampling and the Metropolis-Hastings algorithm. The Bayesian should also be able to put the theory and computational tools together in the context of substantive empirical problems. We focus primarily on recent developments in Bayesian computation. Then we focus on particular models. Inevitably, we combine theory and computation in the context of particular models. Although we have tried to be reasonably complete in terms of covering the basic ideas of Bayesian theory and the computational tools most commonly used by the Bayesian, there is no way we can cover all the classes of models used in econometrics. We propose to the user of analysis of variance and linear regression model.
Multiview Bayesian Correlated Component Analysis
DEFF Research Database (Denmark)
Kamronn, Simon Due; Poulsen, Andreas Trier; Hansen, Lars Kai
2015-01-01
Correlated component analysis as proposed by Dmochowski, Sajda, Dias, and Parra (2012) is a tool for investigating brain process similarity in the responses to multiple views of a given stimulus. Correlated components are identified under the assumption that the involved spatial networks are iden... ...we denote Bayesian correlated component analysis, evaluates favorably against three relevant algorithms in simulated data. A well-established benchmark EEG data set is used to further validate the new model and infer the variability of spatial representations across multiple subjects.
Bayesian analysis of exoplanet and binary orbits
Schulze-Hartung, Tim; Launhardt, Ralf; Henning, Thomas
2012-01-01
We introduce BASE (Bayesian astrometric and spectroscopic exoplanet detection and characterisation tool), a novel program for the combined or separate Bayesian analysis of astrometric and radial-velocity measurements of potential exoplanet hosts and binary stars. The capabilities of BASE are demonstrated using all publicly available data of the binary Mizar A.
Detecting change-points in extremes
Dupuis, D. J.
2015-01-01
Even though most work on change-point estimation focuses on changes in the mean, changes in the variance or in the tail distribution can lead to more extreme events. In this paper, we develop a new method of detecting and estimating the change-points in the tail of multiple time series data. In addition, we adapt existing tail change-point detection methods to our specific problem and conduct a thorough comparison of different methods in terms of performance on the estimation of change-points and computational time. We also examine three locations on the U.S. northeast coast and demonstrate that the methods are useful for identifying changes in seasonally extreme warm temperatures.
Bayesian analysis of binary sequences
Torney, David C.
2005-03-01
This manuscript details Bayesian methodology for "learning by example", with binary n-sequences encoding the objects under consideration. Priors prove influential; conformable priors are described. Laplace approximation of Bayes integrals yields posterior likelihoods for all n-sequences. This involves the optimization of a definite function over a convex domain--efficiently effectuated by the sequential application of the quadratic program.
Grzegorczyk, Marco; Husmeier, Dirk
2013-01-01
To relax the homogeneity assumption of classical dynamic Bayesian networks (DBNs), various recent studies have combined DBNs with multiple changepoint processes. The underlying assumption is that the parameters associated with time series segments delimited by multiple changepoints are a priori inde
CHAMP: Changepoint Detection Using Approximate Model Parameters
2014-06-01
positions as a Markov chain in which the transition probabilities are defined by the time since the last changepoint: $p(\tau_{i+1} = t \mid \tau_i = s) = g(t - s)$, (1...experimentally verified using artificially generated data and are compared to those of Fearnhead and Liu [5]. 2 Related work Hidden Markov Models (HMMs) are...length α, and maximum number of particles M. Output: Viterbi path of changepoint times and models // Initialize data structures 1: max path, prev queue
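The changepoint prior in this excerpt, where the next changepoint depends only on the time elapsed since the previous one through g, can be sampled directly. The sketch below assumes a geometric form for g purely for illustration; the excerpt does not say which g the report uses, and the function names are hypothetical.

```python
import random


def sample_changepoints(n_steps, gap_sampler, rng):
    """Draw changepoint times 0 < tau_1 < tau_2 < ... <= n_steps.

    Each gap tau_{i+1} - tau_i is drawn independently from the
    distribution g, represented here by gap_sampler(rng).
    """
    times = []
    t = 0
    while True:
        t += gap_sampler(rng)
        if t > n_steps:
            return times
        times.append(t)


def geometric_gap(p):
    """Return a sampler for g(k) = (1 - p)^(k - 1) * p, k = 1, 2, ...

    Assumption for illustration: a geometric g corresponds to a constant
    per-step changepoint probability p.
    """
    def draw(rng):
        gap = 1
        while rng.random() >= p:
            gap += 1
        return gap
    return draw


changepoints = sample_changepoints(200, geometric_gap(0.05), random.Random(42))
```

Any other positive integer-valued gap distribution can be swapped in by passing a different gap_sampler.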
Bayesian analysis for kaon photoproduction
Energy Technology Data Exchange (ETDEWEB)
Marsainy, T.; Mart, T. [Department Fisika, FMIPA, Universitas Indonesia, Depok 16424 (Indonesia)]
2014-09-25
We have investigated the contribution of the nucleon resonances in the kaon photoproduction process by using an established statistical decision making method, i.e. the Bayesian method. This method not only evaluates the model over its entire parameter space, but also takes the prior information and experimental data into account. The result indicates that certain resonances have larger probabilities to contribute to the process.
Bayesian Meta-Analysis of Coefficient Alpha
Brannick, Michael T.; Zhang, Nanhua
2013-01-01
The current paper describes and illustrates a Bayesian approach to the meta-analysis of coefficient alpha. Alpha is the most commonly used estimate of the reliability or consistency (freedom from measurement error) for educational and psychological measures. The conventional approach to meta-analysis uses inverse variance weights to combine…
Detection of Acoustic Change-Points in Audio Streams and Signal Segmentation
Directory of Open Access Journals (Sweden)
J. Zdansky
2005-04-01
This contribution proposes an efficient method for the detection of relevant changes in continuous stream of sound. The detected change-points can then serve for the segmentation of long audio recordings into shorter and more or less homogenous sections. First, we discuss the task of a single change-point detection using the Bayes decision theory. We show that it leads to a quite simple and computationally efficient solution based on the Bayesian Information Criterion. Next, we extend this approach to formulate the algorithm for the detection of multiple change-points. Finally, the proposed algorithm is applied for the segmentation of broadcast news audio-streams into parts belonging to different speakers or different acoustic conditions. Such segmentation is necessary as the first step in the automatic speech-to-text transcription of TV or radio news.
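A single change-point test based on the Bayesian Information Criterion can be sketched for the simplest case. The code below is a generic formulation under stated assumptions, not the authors' exact algorithm: it treats a one-dimensional sequence, models each segment as Gaussian with its own mean and variance, and compares one Gaussian for the whole sequence against two Gaussians split at index i. The variance floor and the minimum segment length are assumptions added for numerical safety.

```python
import math


def _log_variance(xs, floor=1e-8):
    """Log of the maximum-likelihood variance, floored to avoid log(0)."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return math.log(max(var, floor))


def bic_change_point(xs, min_seg=3):
    """Return (index, delta_bic) of the best split, or (None, 0.0) if none.

    delta_bic(i) = 0.5 * (N log s^2 - i log s1^2 - (N - i) log s2^2) - log N,
    where the final term is the BIC penalty 0.5 * k * log N for the k = 2
    extra parameters (one mean, one variance) of the two-segment model.
    A split is reported only when delta_bic is positive.
    """
    n = len(xs)
    whole = n * _log_variance(xs)
    best_idx, best_delta = None, 0.0
    for i in range(min_seg, n - min_seg + 1):
        split = i * _log_variance(xs[:i]) + (n - i) * _log_variance(xs[i:])
        delta = 0.5 * (whole - split) - math.log(n)
        if delta > best_delta:
            best_idx, best_delta = i, delta
    return best_idx, best_delta
```

The returned index is the position of the first sample of the second segment. For real audio the observations would be multivariate feature vectors, which changes the variance terms to covariance determinants and increases the parameter count in the penalty.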
On Bayesian System Reliability Analysis
Energy Technology Data Exchange (ETDEWEB)
Soerensen Ringi, M.
1995-05-01
The view taken in this thesis is that reliability, the probability that a system will perform a required function for a stated period of time, depends on a person's state of knowledge. Reliability changes as this state of knowledge changes, i.e. when new relevant information becomes available. Most existing models for system reliability prediction are developed in a classical framework of probability theory and they overlook some information that is always present. Probability is just an analytical tool to handle uncertainty, based on judgement and subjective opinions. It is argued that the Bayesian approach gives a much more comprehensive understanding of the foundations of probability than the so called frequentistic school. A new model for system reliability prediction is given in two papers. The model encloses the fact that component failures are dependent because of a shared operational environment. The suggested model also naturally permits learning from failure data of similar components in non identical environments. 85 refs.
Bayesian analysis of cosmic structures
Kitaura, Francisco-Shu
2011-01-01
We revise the Bayesian inference steps required to analyse the cosmological large-scale structure. Here we place special emphasis on the complications which arise due to the non-Gaussian character of the galaxy and matter distribution. In particular we investigate the advantages and limitations of the Poisson-lognormal model and discuss how to extend this work. With the lognormal prior using the Hamiltonian sampling technique and on scales of about 4 h^{-1} Mpc we find that the over-dense regions are excellently reconstructed; however, under-dense regions (void statistics) are quantitatively poorly recovered. Contrary to the maximum a posteriori (MAP) solution, which was shown to over-estimate the density in the under-dense regions, we obtain lower densities than in N-body simulations. This is due to the fact that the MAP solution is conservative whereas the full posterior yields samples which are consistent with the prior statistics. The lognormal prior is not able to capture the full non-linear regime at scales ...
Asymptotic Distribution of the Jump Change-Point Estimator
Institute of Scientific and Technical Information of China (English)
Changchun TAN; Huifang NIU; Baiqi MIAO
2012-01-01
The asymptotic distribution of the change-point estimator in a jump change-point model is considered. For the jump change-point model $X_i = a + \theta I\{[n\tau_0] < i \le n\} + \varepsilon_i$, where $\varepsilon_i$ ($i = 1, \ldots, n$) are independent identically distributed random variables with $E\varepsilon_i = 0$ and $\operatorname{Var}(\varepsilon_i) < \infty$, with the help of the slip window method, the asymptotic distribution of the jump change-point estimator $\hat{\tau}$ is studied under the condition of the local alternative hypothesis.
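The jump model in this abstract, a constant level a that shifts by θ after a fixed fraction of the sample, is easy to simulate. The sketch below pairs the simulation with a least-squares estimator of the change fraction. That estimator is a swapped-in standard technique, not the slip window method the paper studies, and the Gaussian noise and all function names are assumptions.

```python
import random


def simulate_jump(n, tau0, a, theta, sigma, rng):
    """Simulate X_i = a + theta * 1{floor(n * tau0) < i <= n} + eps_i.

    Assumption: eps_i is Gaussian with mean 0 and standard deviation sigma
    (the abstract only requires zero mean and finite variance).
    """
    k0 = int(n * tau0)
    return [a + (theta if i > k0 else 0.0) + rng.gauss(0.0, sigma)
            for i in range(1, n + 1)]


def estimate_change_fraction(xs):
    """Least-squares estimate of the change fraction.

    Fitting separate means before and after a split k minimises the
    residual sum of squares exactly when k maximises
    k * (n - k) / n * (mean_left - mean_right)^2.
    """
    n = len(xs)
    total = sum(xs)
    best_k, best_stat = 1, -1.0
    left = 0.0
    for k in range(1, n):
        left += xs[k - 1]
        diff = left / k - (total - left) / (n - k)
        stat = k * (n - k) / n * diff * diff
        if stat > best_stat:
            best_k, best_stat = k, stat
    return best_k / n


noisy = simulate_jump(200, 0.25, 1.0, 2.0, 0.5, random.Random(7))
tau_hat = estimate_change_fraction(noisy)
```

Repeating the simulation over many seeds and shrinking θ with n gives an empirical picture of the estimator's distribution under a local alternative.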
A Bayesian nonparametric meta-analysis model.
Karabatsos, George; Talbott, Elizabeth; Walker, Stephen G
2015-03-01
In a meta-analysis, it is important to specify a model that adequately describes the effect-size distribution of the underlying population of studies. The conventional normal fixed-effect and normal random-effects models assume a normal effect-size population distribution, conditionally on parameters and covariates. For estimating the mean overall effect size, such models may be adequate, but for prediction, they surely are not if the effect-size distribution exhibits non-normal behavior. To address this issue, we propose a Bayesian nonparametric meta-analysis model, which can describe a wider range of effect-size distributions, including unimodal symmetric distributions, as well as skewed and more multimodal distributions. We demonstrate our model through the analysis of real meta-analytic data arising from behavioral-genetic research. We compare the predictive performance of the Bayesian nonparametric model against various conventional and more modern normal fixed-effects and random-effects models.
Bayesian data analysis tools for atomic physics
Trassinelli, Martino
2016-01-01
We present an introduction to some concepts of Bayesian data analysis in the context of atomic physics. Starting from basic rules of probability, we present Bayes' theorem and its applications. In particular we discuss how to calculate simple and joint probability distributions and the Bayesian evidence, a model dependent quantity that allows one to assign probabilities to different hypotheses from the analysis of the same data set. To give some practical examples, these methods are applied to two concrete cases. In the first example, the presence or not of a satellite line in an atomic spectrum is investigated. In the second example, we determine the most probable model among a set of possible profiles from the analysis of a statistically poor spectrum. We show also how to calculate the probability distribution of the main spectral component without having to determine uniquely the spectrum modeling. For these two studies, we implement the program Nested fit to calculate the different probability distrib...
Confidence Sets for a Change-Point.
1986-10-01
probability credible set for j. In fact, even without the explicit evaluation in (1), one knows from a general theorem of Stein (1965) and Hora and...confidence sets with smallest expected measure, Ann. Statist., 10, 1283-94. Hora, R. B. and Buehler, R. J. (1966), Fiducial theory and invariant...simple cumulative sum type statistic for the change-point problem with zero-one observations, Biometrika 67, 79-84. Raftery, A. E. and Akman, V
A Gentle Introduction to Bayesian Analysis : Applications to Developmental Research
Van de Schoot, Rens; Kaplan, David; Denissen, Jaap; Asendorpf, Jens B.; Neyer, Franz J.; van Aken, Marcel A G
2014-01-01
Bayesian statistical methods are becoming ever more popular in applied and fundamental research. In this study a gentle introduction to Bayesian analysis is provided. It is shown under what circumstances it is attractive to use Bayesian estimation, and how to interpret properly the results. First, t
A SAS Interface for Bayesian Analysis with WinBUGS
Zhang, Zhiyong; McArdle, John J.; Wang, Lijuan; Hamagami, Fumiaki
2008-01-01
Bayesian methods are becoming very popular despite some practical difficulties in implementation. To assist in the practical application of Bayesian methods, we show how to implement Bayesian analysis with WinBUGS as part of a standard set of SAS routines. This implementation procedure is first illustrated by fitting a multiple regression model…
Integrative Bayesian network analysis of genomic data.
Ni, Yang; Stingo, Francesco C; Baladandayuthapani, Veerabhadran
2014-01-01
Rapid development of genome-wide profiling technologies has made it possible to conduct integrative analysis on genomic data from multiple platforms. In this study, we develop a novel integrative Bayesian network approach to investigate the relationships between genetic and epigenetic alterations as well as how these mutations affect a patient's clinical outcome. We take a Bayesian network approach that admits a convenient decomposition of the joint distribution into local distributions. Exploiting the prior biological knowledge about regulatory mechanisms, we model each local distribution as linear regressions. This allows us to analyze multi-platform genome-wide data in a computationally efficient manner. We illustrate the performance of our approach through simulation studies. Our methods are motivated by and applied to a multi-platform glioblastoma dataset, from which we reveal several biologically relevant relationships that have been validated in the literature as well as new genes that could potentially be novel biomarkers for cancer progression.
Bayesian analysis of multiple direct detection experiments
Arina, Chiara
2013-01-01
Bayesian methods offer a coherent and efficient framework for implementing uncertainties into induction problems. In this article, we review how this approach applies to the analysis of dark matter direct detection experiments. In particular we discuss the exclusion limit of XENON100 and the debated hints of detection under the hypothesis of a WIMP signal. Within parameter inference, marginalizing consistently over uncertainties to extract robust posterior probability distributions, we find that the claimed tension between XENON100 and the other experiments can be partially alleviated in isospin violating scenario, while elastic scattering model appears to be compatible with the classical approach. We then move to model comparison, for which Bayesian methods are particularly well suited. Firstly, we investigate the annual modulation seen in CoGeNT data, finding that there is weak evidence for a modulation. Modulation models due to other physics compare unfavorably with the WIMP models, paying the price for th...
Book review: Bayesian analysis for population ecology
Link, William A.
2011-01-01
Brian Dennis described the field of ecology as “fertile, uncolonized ground for Bayesian ideas.” He continued: “The Bayesian propagule has arrived at the shore. Ecologists need to think long and hard about the consequences of a Bayesian ecology. The Bayesian outlook is a successful competitor, but is it a weed? I think so.” (Dennis 2004)
Change-point estimation for censored regression model
Institute of Scientific and Technical Information of China (English)
Zhan-feng WANG; Yao-hua WU; Lin-cheng ZHAO
2007-01-01
In this paper, we consider the change-point estimation in the censored regression model assuming that there exists one change point. A nonparametric estimate of the change-point is proposed and is shown to be strongly consistent. Furthermore, its convergence rate is also obtained.
Bayesian Analysis of Individual Level Personality Dynamics
Directory of Open Access Journals (Sweden)
Edward Cripps
2016-07-01
A Bayesian technique with analyses of within-person processes at the level of the individual is presented. The approach is used to examine if the patterns of within-person responses on a 12 trial simulation task are consistent with the predictions of ITA theory (Dweck, 1999). ITA theory states that the performance of an individual with an entity theory of ability is more likely to spiral down following a failure experience than the performance of an individual with an incremental theory of ability. This is because entity theorists interpret failure experiences as evidence of a lack of ability, which they believe is largely innate and therefore relatively fixed; whilst incremental theorists believe in the malleability of abilities and interpret failure experiences as evidence of more controllable factors such as poor strategy or lack of effort. The results of our analyses support ITA theory at both the within- and between-person levels of analyses and demonstrate the benefits of Bayesian techniques for the analysis of within-person processes. These include more formal specification of the theory and the ability to draw inferences about each individual, which allows for more nuanced interpretations of individuals within a personality category, such as differences in the individual probabilities of spiralling. While Bayesian techniques have many potential advantages for the analyses of within-person processes at the individual level, ease of use is not one of them for psychologists trained in traditional frequentist statistical techniques.
Bayesian Inference in Statistical Analysis
Box, George E P
2011-01-01
The Wiley Classics Library consists of selected books that have become recognized classics in their respective fields. With these new unabridged and inexpensive editions, Wiley hopes to extend the life of these important works by making them available to future generations of mathematicians and scientists. Currently available in the Series: T. W. Anderson The Statistical Analysis of Time Series T. S. Arthanari & Yadolah Dodge Mathematical Programming in Statistics Emil Artin Geometric Algebra Norman T. J. Bailey The Elements of Stochastic Processes with Applications to the Natural Sciences Rob
Analysis of COSIMA spectra: Bayesian approach
Directory of Open Access Journals (Sweden)
H. J. Lehto
2014-11-01
We describe the use of Bayesian analysis methods applied to TOF-SIMS spectra. The method finds the probability density functions of measured line parameters (number of lines, and their widths, peak amplitudes, integrated amplitudes, positions) in mass intervals over the whole spectrum. We discuss the results we can expect from this analysis. We discuss the effects the instrument dead time causes in the COSIMA TOF-SIMS. We address this issue in a new way. The derived line parameters can be used to further calibrate the mass scaling of TOF-SIMS and to feed the results into other analysis methods such as multivariate analyses of spectra. We intend to use the method in two ways, first as a comprehensive tool to perform quantitative analysis of spectra, and second as a fast tool for studying interesting targets for obtaining additional TOF-SIMS measurements of the sample, a property unique to COSIMA. Finally, we point out that the Bayesian method can be thought of as a means to solve inverse problems but with forward calculations only.
Bayesian Analysis of Type Ia Supernova Data
Institute of Scientific and Technical Information of China (English)
王晓峰; 周旭; 李宗伟; 陈黎
2003-01-01
Recently, the distances to type Ia supernovae (SN Ia) at z ~ 0.5 have been measured with the motivation of estimating cosmological parameters. However, different sleuthing techniques tend to give inconsistent measurements for SN Ia distances (~0.3 mag), which significantly affects the determination of cosmological parameters. A Bayesian "hyper-parameter" procedure, which considers the relative weights of different datasets, is used to analyse the current SN Ia data jointly. For a flat Universe, the combined analysis yields ΩM = 0.20 ± 0.07.
Bayesian global analysis of neutrino oscillation data
Bergstrom, Johannes; Maltoni, Michele; Schwetz, Thomas
2015-01-01
We perform a Bayesian analysis of current neutrino oscillation data. When estimating the oscillation parameters we find that the results generally agree with those of the $\chi^2$ method, with some differences involving $s_{23}^2$ and CP-violating effects. We discuss the additional subtleties caused by the circular nature of the CP-violating phase, and how it is possible to obtain correlation coefficients with $s_{23}^2$. When performing model comparison, we find that there is no significant evidence for any mass ordering, any octant of $s_{23}^2$ or a deviation from maximal mixing, nor the presence of CP-violation.
Doing bayesian data analysis a tutorial with R and BUGS
Kruschke, John K
2011-01-01
There is an explosion of interest in Bayesian statistics, primarily because recently created computational methods have finally made Bayesian analysis obtainable to a wide audience. Doing Bayesian Data Analysis, A Tutorial Introduction with R and BUGS provides an accessible approach to Bayesian data analysis, as material is explained clearly with concrete examples. The book begins with the basics, including essential concepts of probability and random sampling, and gradually progresses to advanced hierarchical modeling methods for realistic data. The text delivers comprehensive coverage of all
Bayesian Analysis of High Dimensional Classification
Mukhopadhyay, Subhadeep; Liang, Faming
2009-12-01
Modern data mining and bioinformatics have presented an important playground for statistical learning techniques, where the number of input variables is possibly much larger than the sample size of the training data. In supervised learning, logistic regression or probit regression can be used to model a binary output and form perceptron classification rules based on Bayesian inference. In these cases, there is a lot of interest in searching for sparse models in the high dimensional regression (or classification) setup. We first discuss two common challenges for analyzing high dimensional data. The first one is the curse of dimensionality: the complexity of many existing algorithms scales exponentially with the dimensionality of the space, so that the algorithms soon become computationally intractable and therefore inapplicable in many real applications. The second is multicollinearity among the predictors, which severely slows down the algorithm. In order to make Bayesian analysis operational in high dimensions, we propose a novel 'Hierarchical stochastic approximation Monte Carlo algorithm' (HSAMC), which overcomes the curse of dimensionality and the multicollinearity of predictors in high dimensions, and also possesses a self-adjusting mechanism to avoid local minima separated by high energy barriers. Models and methods are illustrated by simulations inspired by the field of genomics. Numerical results indicate that HSAMC can work as a general model selection sampler in high dimensional complex model spaces.
Bayesian analysis of multiple direct detection experiments
Arina, Chiara
2014-12-01
Bayesian methods offer a coherent and efficient framework for implementing uncertainties into induction problems. In this article, we review how this approach applies to the analysis of dark matter direct detection experiments. In particular we discuss the exclusion limit of XENON100 and the debated hints of detection under the hypothesis of a WIMP signal. Within parameter inference, marginalizing consistently over uncertainties to extract robust posterior probability distributions, we find that the claimed tension between XENON100 and the other experiments can be partially alleviated in the isospin violating scenario, while the elastic scattering model appears to be compatible with the frequentist statistical approach. We then move to model comparison, for which Bayesian methods are particularly well suited. Firstly, we investigate the annual modulation seen in CoGeNT data, finding that there is weak evidence for a modulation. Modulation models due to other physics compare unfavorably with the WIMP models, paying the price for their excessive complexity. Secondly, we confront several coherent scattering models to determine the current best physical scenario compatible with the experimental hints. We find that exothermic and inelastic dark matter are moderately disfavored against the elastic scenario, while the isospin violating model has a similar evidence. Lastly, the Bayes factor gives inconclusive evidence for an incompatibility between the data sets of XENON100 and the hints of detection. The same question assessed with goodness of fit would indicate a 2σ discrepancy. This suggests that more data are needed to settle this question.
Nearly Optimal Change-Point Detection with an Application to Cybersecurity
Polunchenko, Aleksey S; Mukhopadhyay, Nitis
2012-01-01
We address the sequential change-point detection problem for the Gaussian model where the baseline distribution is Gaussian with variance $\sigma^2$ and mean $\mu$ such that $\sigma^2 = a\mu$, where $a > 0$ is a known constant; the change is in $\mu$ from one known value to another. First, we carry out a comparative performance analysis of four detection procedures: the CUSUM procedure, the Shiryaev-Roberts (SR) procedure, and two of its modifications, the Shiryaev-Roberts-Pollak and Shiryaev-Roberts-r procedures. The performance is benchmarked via Pollak's maximal average delay to detection and Shiryaev's stationary average delay to detection, each subject to a fixed average run length to false alarm. The analysis shows that in practically interesting cases the accuracy of asymptotic approximations is "reasonable" to "excellent". We also consider an application of change-point detection to cybersecurity, for rapid anomaly detection in computer networks. Using real network data we show that statistically traffic's intensity can...
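To make the detection procedures named in this abstract concrete, here is a minimal Python sketch of the textbook CUSUM and Shiryaev-Roberts recursions for the Gaussian model with variance proportional to the mean. It is an illustration under stated assumptions, not the authors' implementation: the function names, the thresholds and the toy data are hypothetical, and the Shiryaev-Roberts-Pollak and Shiryaev-Roberts-r variants (which differ in how the statistic is initialised) are not covered.

```python
import math


def gaussian_llr(x, mu0, mu1, a):
    """Log-likelihood ratio log f1(x)/f0(x) for one observation x.

    Pre-change model:  N(mu0, a*mu0); post-change model: N(mu1, a*mu1).
    The variance is tied to the mean through the known constant a > 0.
    """
    var0, var1 = a * mu0, a * mu1
    return (0.5 * math.log(var0 / var1)
            - (x - mu1) ** 2 / (2.0 * var1)
            + (x - mu0) ** 2 / (2.0 * var0))


def cusum_alarm(xs, mu0, mu1, a, threshold):
    """Return the 1-based index of the first CUSUM alarm, or None."""
    w = 0.0  # CUSUM statistic, reset to zero whenever it goes negative
    for n, x in enumerate(xs, start=1):
        w = max(0.0, w + gaussian_llr(x, mu0, mu1, a))
        if w >= threshold:
            return n
    return None


def shiryaev_roberts_alarm(xs, mu0, mu1, a, threshold):
    """Return the 1-based index of the first Shiryaev-Roberts alarm, or None."""
    r = 0.0  # SR statistic started at zero (the classical SR procedure)
    for n, x in enumerate(xs, start=1):
        r = (1.0 + r) * math.exp(gaussian_llr(x, mu0, mu1, a))
        if r >= threshold:
            return n
    return None


# Toy, noise-free data (hypothetical): the mean jumps from 1 to 2 after sample 20.
toy_data = [1.0] * 20 + [2.0] * 20
cusum_time = cusum_alarm(toy_data, mu0=1.0, mu1=2.0, a=1.0, threshold=1.0)
sr_time = shiryaev_roberts_alarm(toy_data, mu0=1.0, mu1=2.0, a=1.0, threshold=10.0)
```

Both statistics are driven by the same per-sample log-likelihood ratio; they differ only in the recursion, and in practice the threshold is chosen to meet a target average run length to false alarm.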
The bugs book a practical introduction to Bayesian analysis
Lunn, David; Best, Nicky; Thomas, Andrew; Spiegelhalter, David
2012-01-01
Introduction: Probability and Parameters; Probability; Probability distributions; Calculating properties of probability distributions; Monte Carlo integration; Monte Carlo Simulations Using BUGS; Introduction to BUGS; DoodleBUGS; Using BUGS to simulate from distributions; Transformations of random variables; Complex calculations using Monte Carlo; Multivariate Monte Carlo analysis; Predictions with unknown parameters; Introduction to Bayesian Inference; Bayesian learning; Posterior predictive distributions; Conjugate Bayesian inference; Inference about a discrete parameter; Combinations of conjugate analyses; Bayesian and classica
Confirmation via Analogue Simulation: A Bayesian Analysis
Dardashti, Radin; Thebault, Karim P Y; Winsberg, Eric
2016-01-01
Analogue simulation is a novel mode of scientific inference found increasingly within modern physics, and yet all but neglected in the philosophical literature. Experiments conducted upon a table-top 'source system' are taken to provide insight into features of an inaccessible 'target system', based upon a syntactic isomorphism between the relevant modelling frameworks. An important example is the use of acoustic 'dumb hole' systems to simulate gravitational black holes. In a recent paper it was argued that there exist circumstances in which confirmation via analogue simulation can obtain; in particular when the robustness of the isomorphism is established via universality arguments. The current paper supports these claims via an analysis in terms of Bayesian confirmation theory.
BEAST: Bayesian evolutionary analysis by sampling trees
Directory of Open Access Journals (Sweden)
Drummond Alexei J
2007-11-01
Background: The evolutionary analysis of molecular sequence variation is a statistical enterprise. This is reflected in the increased use of probabilistic models for phylogenetic inference, multiple sequence alignment, and molecular population genetics. Here we present BEAST: a fast, flexible software architecture for Bayesian analysis of molecular sequences related by an evolutionary tree. A large number of popular stochastic models of sequence evolution are provided and tree-based models suitable for both within- and between-species sequence data are implemented. Results: BEAST version 1.4.6 consists of 81000 lines of Java source code, 779 classes and 81 packages. It provides models for DNA and protein sequence evolution, highly parametric coalescent analysis, relaxed clock phylogenetics, non-contemporaneous sequence data, statistical alignment and a wide range of options for prior distributions. BEAST source code is object-oriented, modular in design and freely available at http://beast-mcmc.googlecode.com/ under the GNU LGPL license. Conclusion: BEAST is a powerful and flexible evolutionary analysis package for molecular sequence variation. It also provides a resource for the further development of new models and statistical methods of evolutionary analysis.
Analysis of COSIMA spectra: Bayesian approach
Directory of Open Access Journals (Sweden)
H. J. Lehto
2015-06-01
secondary ion mass spectrometer (TOF-SIMS) spectra. The method is applied to the COmetary Secondary Ion Mass Analyzer (COSIMA) TOF-SIMS mass spectra where the analysis can be broken into subgroups of lines close to integer mass values. The effects of the instrumental dead time are discussed in a new way. The method finds the joint probability density functions of measured line parameters (number of lines, and their widths, peak amplitudes, integrated amplitudes and positions). In the case of two or more lines, these distributions can take complex forms. The derived line parameters can be used to further calibrate the mass scaling of TOF-SIMS and to feed the results into other analysis methods such as multivariate analyses of spectra. We intend to use the method, first as a comprehensive tool to perform quantitative analysis of spectra, and second as a fast tool for studying interesting targets for obtaining additional TOF-SIMS measurements of the sample, a property unique to COSIMA. Finally, we point out that the Bayesian method can be thought of as a means to solve inverse problems but with forward calculations only, with no iterative corrections or other manipulation of the observed data.
Homogeneity and change-point detection tests for multivariate data using rank statistics
Lung-Yut-Fong, Alexandre; Cappé, Olivier
2011-01-01
Detecting and locating changes in highly multivariate data is a major concern in several current statistical applications. In this context, the first contribution of the paper is a novel non-parametric two-sample homogeneity test for multivariate data based on the well-known Wilcoxon rank statistic. The proposed two-sample homogeneity test statistic can be extended to deal with ordinal or censored data as well as to test for the homogeneity of more than two samples. The second contribution of the paper concerns the use of the proposed test statistic to perform retrospective change-point analysis. It is first shown that the approach is computationally feasible even when looking for a large number of change-points thanks to the use of dynamic programming. Computable asymptotic $p$-values for the test are then provided in the case where a single potential change-point is to be detected. Compared to available alternatives, the proposed approach appears to be very reliable and robust. This is particularly true in ...
Bayesian data analysis in population ecology: motivations, methods, and benefits
Dorazio, Robert
2016-01-01
During the 20th century ecologists largely relied on the frequentist system of inference for the analysis of their data. However, in the past few decades ecologists have become increasingly interested in the use of Bayesian methods of data analysis. In this article I provide guidance to ecologists who would like to decide whether Bayesian methods can be used to improve their conclusions and predictions. I begin by providing a concise summary of Bayesian methods of analysis, including a comparison of differences between Bayesian and frequentist approaches to inference when using hierarchical models. Next I provide a list of problems where Bayesian methods of analysis may arguably be preferred over frequentist methods. These problems are usually encountered in analyses based on hierarchical models of data. I describe the essentials required for applying modern methods of Bayesian computation, and I use real-world examples to illustrate these methods. I conclude by summarizing what I perceive to be the main strengths and weaknesses of using Bayesian methods to solve ecological inference problems.
Stochastic back analysis of permeability coefficient using generalized Bayesian method
Institute of Scientific and Technical Information of China (English)
Zheng Guilan; Wang Yuan; Wang Fei; Yang Jian
2008-01-01
Owing to the fact that the conventional deterministic back analysis of the permeability coefficient cannot reflect the uncertainties of parameters, including the hydraulic head at the boundary, the permeability coefficient and measured hydraulic head, a stochastic back analysis taking consideration of uncertainties of parameters was performed using the generalized Bayesian method. Based on the stochastic finite element method (SFEM) for a seepage field, the variable metric algorithm and the generalized Bayesian method, formulas for stochastic back analysis of the permeability coefficient were derived. A case study of seepage analysis of a sluice foundation was performed to illustrate the proposed method. The results indicate that, with the generalized Bayesian method that considers the uncertainties of measured hydraulic head, the permeability coefficient and the hydraulic head at the boundary, both the mean and standard deviation of the permeability coefficient can be obtained and the standard deviation is less than that obtained by the conventional Bayesian method. Therefore, the present method is valid and applicable.
Bayesian Analysis of the Cosmic Microwave Background
Jewell, Jeffrey
2007-01-01
There is a wealth of cosmological information encoded in the spatial power spectrum of temperature anisotropies of the cosmic microwave background! Experiments designed to map the microwave sky are returning a flood of data (time streams of instrument response as a beam is swept over the sky) at several different frequencies (from 30 to 900 GHz), all with different resolutions and noise properties. The resulting analysis challenge is to estimate, and quantify our uncertainty in, the spatial power spectrum of the cosmic microwave background given the complexities of "missing data", foreground emission, and complicated instrumental noise. Bayesian formulation of this problem allows consistent treatment of many complexities including complicated instrumental noise and foregrounds, and can be numerically implemented with Gibbs sampling. Gibbs sampling has now been validated as an efficient, statistically exact, and practically useful method for low-resolution (as demonstrated on WMAP 1 and 3 year temperature and polarization data). Continuing development for Planck - the goal is to exploit the unique capabilities of Gibbs sampling to directly propagate uncertainties in both foreground and instrument models to total uncertainty in cosmological parameters.
Objective Bayesian Analysis of Skew-t Distributions
BRANCO, MARCIA D'ELIA
2012-02-27
We study the Jeffreys prior and its properties for the shape parameter of univariate skew-t distributions with linear and nonlinear Student's t skewing functions. In both cases, we show that the resulting priors for the shape parameter are symmetric around zero and proper. Moreover, we propose a Student's t approximation of the Jeffreys prior that makes an objective Bayesian analysis easy to perform. We carry out a Monte Carlo simulation study that demonstrates an overall better behaviour of the maximum a posteriori estimator compared with the maximum likelihood estimator. We also compare the frequentist coverage of the credible intervals based on the Jeffreys prior and its approximation and show that they are similar. We further discuss location-scale models under scale mixtures of skew-normal distributions and show some conditions for the existence of the posterior distribution and its moments. Finally, we present three numerical examples to illustrate the implications of our results on inference for skew-t distributions. © 2012 Board of the Foundation of the Scandinavian Journal of Statistics.
Bayesian analysis of MEG visual evoked responses
Energy Technology Data Exchange (ETDEWEB)
Schmidt, D.M.; George, J.S.; Wood, C.C.
1999-04-01
The authors developed a method for analyzing neural electromagnetic data that allows probabilistic inferences to be drawn about regions of activation. The method involves the generation of a large number of possible solutions which fit both the data and prior expectations about the nature of probable solutions, made explicit by a Bayesian formalism. In addition, they have introduced a model for the current distributions that produce MEG (and EEG) data that allows extended regions of activity, and can easily incorporate prior information such as anatomical constraints from MRI. To evaluate the feasibility and utility of the Bayesian approach with actual data, they analyzed MEG data from a visual evoked response experiment. They compared Bayesian analyses of MEG responses to visual stimuli in the left and right visual fields, in order to examine the sensitivity of the method to detect known features of human visual cortex organization. They also examined the changing pattern of cortical activation as a function of time.
Bayesian Analysis of Perceived Eye Level
Orendorff, Elaine E.; Kalesinskas, Laurynas; Palumbo, Robert T.; Albert, Mark V.
2016-01-01
To accurately perceive the world, people must efficiently combine internal beliefs and external sensory cues. We introduce a Bayesian framework that explains the role of internal balance cues and visual stimuli on perceived eye level (PEL)—a self-reported measure of elevation angle. This framework provides a single, coherent model explaining a set of experimentally observed PEL over a range of experimental conditions. Further, it provides a parsimonious explanation for the additive effect of low fidelity cues as well as the averaging effect of high fidelity cues, as also found in other Bayesian cue combination psychophysical studies. Our model accurately estimates the PEL and explains the form of previous equations used in describing PEL behavior. Most importantly, the proposed Bayesian framework for PEL is more powerful than previous behavioral modeling; it permits behavioral estimation in a wider range of cue combination and perceptual studies than models previously reported. PMID:28018204
A Bayesian Analysis of Spectral ARMA Model
Directory of Open Access Journals (Sweden)
Manoel I. Silvestre Bezerra
2012-01-01
Bezerra et al. (2008) proposed a new method, based on Yule-Walker equations, to estimate the ARMA spectral model. In this paper, a Bayesian approach is developed for this model by using the noninformative prior proposed by Jeffreys (1967). The Bayesian computation is carried out by Markov chain Monte Carlo (MCMC) simulation, and characteristics of the marginal posterior distributions, such as the Bayes estimator and confidence interval for the parameters of the ARMA model, are derived. Both methods are also compared with the traditional least squares and maximum likelihood approaches, and a numerical illustration with two examples of the ARMA model is presented to evaluate the performance of the procedures.
Directory of Open Access Journals (Sweden)
Jingjing Zhang
We present a simple framework for classifying mutually exclusive behavioural states within the geospatial lifelines of animals. This method involves use of three sequentially applied statistical procedures: (1) behavioural change point analysis to partition movement trajectories into discrete bouts of same-state behaviours, based on abrupt changes in the spatio-temporal autocorrelation structure of movement parameters; (2) hierarchical multivariate cluster analysis to determine the number of different behavioural states; and (3) k-means clustering to classify inferred bouts of same-state location observations into behavioural modes. We demonstrate application of the method by analysing synthetic trajectories of known 'artificial behaviours' comprised of different correlated random walks, as well as real foraging trajectories of little penguins (Eudyptula minor) obtained by global-positioning-system telemetry. Our results show that the modelling procedure correctly classified 92.5% of all individual location observations in the synthetic trajectories, demonstrating reasonable ability to successfully discriminate behavioural modes. Most individual little penguins were found to exhibit three unique behavioural states (resting, commuting/active searching, area-restricted foraging), with variation in the timing and locations of observations apparently related to ambient light, bathymetry, and proximity to coastlines and river mouths. Addition of k-means clustering extends the utility of behavioural change point analysis, by providing a simple means through which the behaviours inferred for the location observations comprising individual movement trajectories can be objectively classified.
Bayesian methods for the design and analysis of noninferiority trials.
Gamalo-Siebers, Margaret; Gao, Aijun; Lakshminarayanan, Mani; Liu, Guanghan; Natanegara, Fanni; Railkar, Radha; Schmidli, Heinz; Song, Guochen
2016-01-01
The gold standard for evaluating treatment efficacy of a medical product is a placebo-controlled trial. However, when the use of placebo is considered to be unethical or impractical, a viable alternative for evaluating treatment efficacy is through a noninferiority (NI) study where a test treatment is compared to an active control treatment. The minimal objective of such a study is to determine whether the test treatment is superior to placebo. An assumption is made that if the active control treatment remains efficacious, as was observed when it was compared against placebo, then a test treatment that has comparable efficacy with the active control, within a certain range, must also be superior to placebo. Because of this assumption, the design, implementation, and analysis of NI trials present challenges for sponsors and regulators. In designing and analyzing NI trials, substantial historical data are often required on the active control treatment and placebo. Bayesian approaches provide a natural framework for synthesizing the historical data in the form of prior distributions that can effectively be used in design and analysis of a NI clinical trial. Despite a flurry of recent research activities in the area of Bayesian approaches in medical product development, there are still substantial gaps in recognition and acceptance of Bayesian approaches in NI trial design and analysis. The Bayesian Scientific Working Group of the Drug Information Association provides a coordinated effort to target the education and implementation issues on Bayesian approaches for NI trials. In this article, we provide a review of both frequentist and Bayesian approaches in NI trials, and elaborate on the implementation for two common Bayesian methods including hierarchical prior method and meta-analytic-predictive approach. Simulations are conducted to investigate the properties of the Bayesian methods, and some real clinical trial examples are presented for illustration.
Bayesian analysis of Markov point processes
DEFF Research Database (Denmark)
Berthelsen, Kasper Klitgaard; Møller, Jesper
2006-01-01
Recently Møller, Pettitt, Berthelsen and Reeves introduced a new MCMC methodology for drawing samples from a posterior distribution when the likelihood function is only specified up to a normalising constant. We illustrate the method in the setting of Bayesian inference for Markov point processes...
PAC-Bayesian Analysis of Martingales and Multiarmed Bandits
Seldin, Yevgeny; Shawe-Taylor, John; Peters, Jan; Auer, Peter
2011-01-01
We present two alternative ways to apply PAC-Bayesian analysis to sequences of dependent random variables. The first is based on a new lemma that makes it possible to bound expectations of convex functions of certain dependent random variables by expectations of the same functions of independent Bernoulli random variables. This lemma provides an alternative tool to the Hoeffding-Azuma inequality for bounding the concentration of martingale values. Our second approach is based on integration of the Hoeffding-Azuma inequality with PAC-Bayesian analysis. We also introduce a way to apply PAC-Bayesian analysis in situations of limited feedback. We combine the new tools to derive PAC-Bayesian generalization and regret bounds for the multiarmed bandit problem. Although our regret bound is not yet as tight as state-of-the-art regret bounds based on other well-established techniques, our results significantly expand the range of potential applications of PAC-Bayesian analysis and introduce a new analysis tool to reinforcement learning and many ...
Sequential Analysis: Hypothesis Testing and Changepoint Detection
2014-07-11
health monitoring of bridges [24, 25, 43], wind turbines [178, 216], and aircraft [41, 102, 186, 188], detecting multiple sensor faults in an unmanned...describe a couple of signal processing problems, namely segmentation of signals and seismic signal processing. Mechanical systems integrity monitoring is...and useful in image segmentation and boundary tracking problems [96]. 1.3.4.2 Seismic Data Processing In many situations of seismic data processing
MATHEMATICAL RISK ANALYSIS: VIA NICHOLAS RISK MODEL AND BAYESIAN ANALYSIS
Directory of Open Access Journals (Sweden)
Anass BAYAGA
2010-07-01
The objective of this second part of a two-phased study was to explore the predictive power of the quantitative risk analysis (QRA) method and process within a Higher Education Institution (HEI). The method and process investigated the use of impact analysis via the Nicholas risk model and Bayesian analysis, with a sample of one hundred (100) risk analysts in a historically black South African university in the greater Eastern Cape Province. The first finding supported and confirmed previous literature (King III report, 2009; Nicholas and Steyn, 2008; Stoney, 2007; COSA, 2004) that there was a direct relationship between a risk factor, its likelihood and its impact, ceteris paribus. The second finding, concerning control of either the likelihood or the impact of the occurrence of risk (Nicholas risk model), was that to have a brighter risk reward it was important to control the likelihood of occurrence of risks rather than their impact, so as to have a direct effect on the entire university. On the Bayesian analysis, the third finding was that the impact of risk should be predicted along three aspects: the human impact (decisions made), the property impact (students and infrastructure based) and the business impact. Lastly, the study revealed that, although business cycles vary considerably depending on the industry and/or the institution, most impacts in the HEI (university) were within the period of one academic. The recommendation was that application of quantitative risk analysis should be related to the current legislative framework that affects HEIs.
Detecting change-points in multidimensional stochastic processes
de Gooijer, J.G.
2006-01-01
A general test statistic for detecting change-points in multidimensional stochastic processes with unknown parameters is proposed. The test statistic is specialized to the case of detecting changes in sequences of covariance matrices. Large-sample distributional results are presented for the test st
Elite Athletes Refine Their Internal Clocks: A Bayesian Analysis.
Chen, Yin-Hua; Verdinelli, Isabella; Cesari, Paola
2016-07-01
This paper carries out a full Bayesian analysis of a data set examined in Chen & Cesari (2015). These data were collected to assess people's ability to evaluate short intervals of time. Chen & Cesari (2015) showed evidence of the existence of two independent internal clocks for evaluating time intervals below and above one second. Here we reexamine the same question by performing a complete Bayesian statistical analysis of the data. The Bayesian approach can be used to analyze these data thanks to the specific trial design. Data were obtained from the evaluation of time ranges by two groups of individuals. More specifically, information gathered from a nontrained group (considered as the baseline) allowed us to build a prior distribution for the parameter(s) of interest, and data from the trained group determined the likelihood function. This paper's main goals are (i) showing how the Bayesian inferential method can be used in statistical analyses and (ii) showing that the Bayesian methodology gives additional support to the findings presented in Chen & Cesari (2015) regarding the existence of two internal clocks for assessing the duration of time intervals.
Analysis of Gumbel Model for Software Reliability Using Bayesian Paradigm
Directory of Open Access Journals (Sweden)
Raj Kumar
2012-12-01
In this paper, we illustrate the suitability of the Gumbel model for software reliability data. The model parameters are estimated using likelihood-based inferential procedures, both classical and Bayesian. The quasi Newton-Raphson algorithm is applied to obtain the maximum likelihood estimates and associated probability intervals. The Bayesian estimates of the parameters of the Gumbel model are obtained using the Markov chain Monte Carlo (MCMC) simulation method in OpenBUGS (established software for Bayesian analysis using Markov chain Monte Carlo methods). R functions are developed to study the statistical properties of the model, the model validation and comparison tools, and the output analysis of MCMC samples generated from OpenBUGS. Details of applying MCMC to parameter estimation for the Gumbel model are elaborated, and a real software reliability data set is considered to illustrate the methods of inference discussed in this paper.
On Bayesian analysis of on–off measurements
Energy Technology Data Exchange (ETDEWEB)
Nosek, Dalibor, E-mail: nosek@ipnp.troja.mff.cuni.cz [Charles University, Faculty of Mathematics and Physics, Prague (Czech Republic); Nosková, Jana [Czech Technical University, Faculty of Civil Engineering, Prague (Czech Republic)
2016-06-01
We propose an analytical solution to the on–off problem within the framework of Bayesian statistics. Both the statistical significance for the discovery of new phenomena and credible intervals on model parameters are presented in a consistent way. We use a large enough family of prior distributions of relevant parameters. The proposed analysis is designed to provide Bayesian solutions that can be used for any number of observed on–off events, including zero. The procedure is checked using Monte Carlo simulations. The usefulness of the method is demonstrated on examples from γ-ray astronomy.
Semiparametric bayesian analysis of gene-environment interactions
Lobach, I.
2010-01-01
A key component to prevention and control of complex diseases, such as cancer, diabetes, hypertension, is to analyze the genetic and environmental factors that lead to the development of these complex diseases. We propose a Bayesian approach for analysis of gene-environment interactions that efficiently models information available in the observed data and a priori biomedical knowledge.
Robust Mean Change-Point Detecting through Laplace Linear Regression Using EM Algorithm
Directory of Open Access Journals (Sweden)
Fengkai Yang
2014-01-01
normal distribution, we developed the expectation maximization (EM) algorithm to estimate the position of the mean change-point. We investigated the performance of the algorithm through different simulations, finding that our method is robust to the distribution of the errors and is effective at estimating the position of the mean change-point. Finally, we applied our method to the classical Holbert data and detected a change-point.
Multiple quantitative trait analysis using bayesian networks.
Scutari, Marco; Howell, Phil; Balding, David J; Mackay, Ian
2014-09-01
Models for genome-wide prediction and association studies usually target a single phenotypic trait. However, in animal and plant genetics it is common to record information on multiple phenotypes for each individual that will be genotyped. Modeling traits individually disregards the fact that they are most likely associated due to pleiotropy and shared biological basis, thus providing only a partial, confounded view of genetic effects and phenotypic interactions. In this article we use data from a Multiparent Advanced Generation Inter-Cross (MAGIC) winter wheat population to explore Bayesian networks as a convenient and interpretable framework for the simultaneous modeling of multiple quantitative traits. We show that they are equivalent to multivariate genetic best linear unbiased prediction (GBLUP) and that they are competitive with single-trait elastic net and single-trait GBLUP in predictive performance. Finally, we discuss their relationship with other additive-effects models and their advantages in inference and interpretation. MAGIC populations provide an ideal setting for this kind of investigation because the very low population structure and large sample size result in predictive models with good power and limited confounding due to relatedness.
Multiple change-points estimation of moving-average processes under dependence assumptions
Institute of Scientific and Technical Information of China (English)
ZHANG Lixin; LI Yunxia
2004-01-01
In this paper, some results of convergence for a least-square estimator in the problem of multiple change-points estimation are presented and the moving-average processes of ρ-mixing sequence in the mean shifts are discussed. When the number of change points is known, the consistency of change-points estimator is derived. When the number of changes is unknown, the consistency of the change-points number and the change-points estimator by penalized least-squares method are obtained. The results are also true for φ-mixing, α-mixing, associated and negative associated sequences under suitable conditions.
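As a rough illustration of the least-squares idea in the abstract above, the sketch below estimates a single mean-shift change-point by minimizing the total residual sum of squares over all possible split positions. It is a minimal sketch under simplifying assumptions (exactly one change, independent noise, no penalty term), not the estimator analysed in the paper; the names `rss`, `ls_change_point`, and `series` are hypothetical.

```python
import random


def rss(segment):
    """Residual sum of squares of a segment around its own mean."""
    mean = sum(segment) / len(segment)
    return sum((value - mean) ** 2 for value in segment)


def ls_change_point(x):
    """Least-squares estimate of a single mean-shift change-point.

    Returns (k, total_rss), where the series is split into x[:k] and x[k:]
    and k minimizes the summed residual sum of squares of the two segments.
    Ties are resolved in favour of the smallest k.
    """
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    best_k = min(range(1, n), key=lambda k: rss(x[:k]) + rss(x[k:]))
    return best_k, rss(x[:best_k]) + rss(x[best_k:])


# Illustrative data (assumed, not from the paper): mean 0 for 60 points,
# then mean 3 for 40 points, with unit-variance Gaussian noise.
random.seed(0)
series = [random.gauss(0.0, 1.0) for _ in range(60)]
series += [random.gauss(3.0, 1.0) for _ in range(40)]

k_hat, total = ls_change_point(series)
print("estimated change-point index:", k_hat)
```

When the number of changes is unknown, as in the second case the abstract mentions, a penalty that grows with the number of segments would be added to this criterion; that extension is not shown here.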
An Overview of Bayesian Methods for Neural Spike Train Analysis
Directory of Open Access Journals (Sweden)
Zhe Chen
2013-01-01
Neural spike train analysis is an important task in computational neuroscience which aims to understand neural mechanisms and gain insights into neural circuits. With the advancement of multielectrode recording and imaging technologies, it has become increasingly demanding to develop statistical tools for analyzing large neuronal ensemble spike activity. Here we present a tutorial overview of Bayesian methods and their representative applications in neural spike train analysis, at both single neuron and population levels. On the theoretical side, we focus on various approximate Bayesian inference techniques as applied to latent state and parameter estimation. On the application side, the topics include spike sorting, tuning curve estimation, neural encoding and decoding, deconvolution of spike trains from calcium imaging signals, and inference of neuronal functional connectivity and synchrony. Some research challenges and opportunities for neural spike train analysis are discussed.
A Bayesian Predictive Discriminant Analysis with Screened Data
Directory of Open Access Journals (Sweden)
Hea-Jung Kim
2015-09-01
In the application of discriminant analysis, a situation sometimes arises where individual measurements are screened by a multidimensional screening scheme. For this situation, a discriminant analysis with screened populations is considered from a Bayesian viewpoint, and an optimal predictive rule for the analysis is proposed. In order to establish a flexible method to incorporate the prior information of the screening mechanism, we propose a hierarchical screened scale mixture of normal (HSSMN) model, which makes provision for flexible modeling of the screened observations. A Markov chain Monte Carlo (MCMC) method using the Gibbs sampler and the Metropolis–Hastings algorithm within the Gibbs sampler is used to perform Bayesian inference on the HSSMN models and to approximate the optimal predictive rule. A simulation study is given to demonstrate the performance of the proposed predictive discrimination procedure.
Bayesian Methods for Analysis and Adaptive Scheduling of Exoplanet Observations
Loredo, Thomas J; Chernoff, David F; Clyde, Merlise A; Liu, Bin
2011-01-01
We describe work in progress by a collaboration of astronomers and statisticians developing a suite of Bayesian data analysis tools for extrasolar planet (exoplanet) detection, planetary orbit estimation, and adaptive scheduling of observations. Our work addresses analysis of stellar reflex motion data, where a planet is detected by observing the "wobble" of its host star as it responds to the gravitational tug of the orbiting planet. Newtonian mechanics specifies an analytical model for the resulting time series, but it is strongly nonlinear, yielding complex, multimodal likelihood functions; it is even more complex when multiple planets are present. The parameter spaces range in size from few-dimensional to dozens of dimensions, depending on the number of planets in the system, and the type of motion measured (line-of-sight velocity, or position on the sky). Since orbits are periodic, Bayesian generalizations of periodogram methods facilitate the analysis. This relies on the model being linearly separable, ...
Bayesian phylogeny analysis via stochastic approximation Monte Carlo.
Cheon, Sooyoung; Liang, Faming
2009-11-01
Monte Carlo methods have received much attention in the recent literature on phylogeny analysis. However, conventional Markov chain Monte Carlo algorithms, such as the Metropolis-Hastings algorithm, tend to get trapped in a local mode when simulating from the posterior distribution of phylogenetic trees, rendering the inference ineffective. In this paper, we apply an advanced Monte Carlo algorithm, the stochastic approximation Monte Carlo (SAMC) algorithm, to Bayesian phylogeny analysis. Our method is compared with two popular Bayesian phylogeny software packages, BAMBE and MrBayes, on simulated and real datasets. The numerical results indicate that our method outperforms BAMBE and MrBayes. Among the three methods, SAMC produces the consensus trees with the highest similarity to the true trees and the model parameter estimates with the smallest mean square errors, while costing the least CPU time.
Bayesian Variable Selection in Cost-Effectiveness Analysis
Directory of Open Access Journals (Sweden)
Miguel A. Negrín
2010-04-01
Linear regression models are often used to represent the cost and effectiveness of medical treatment. The covariates used may include sociodemographic variables, such as age, gender or race; clinical variables, such as initial health status, years of treatment or the existence of concomitant illnesses; and a binary variable indicating the treatment received. However, most studies estimate only one model, which usually includes all the covariates. This procedure ignores the question of uncertainty in model selection. In this paper, we examine four alternative Bayesian variable selection methods that have been proposed. In this analysis, we estimate the inclusion probability of each covariate in the real model conditional on the data. Variable selection can be useful for estimating incremental effectiveness and incremental cost, through Bayesian model averaging, as well as for subgroup analysis.
Bayesian imperfect information analysis for clinical recurrent data
Chang CK; Chang CC
2014-01-01
Chih-Kuang Chang (1), Chi-Chang Chang (2). (1) Department of Cardiology, Jen-Ai Hospital, Dali District, Taichung, Taiwan; (2) School of Medical Informatics, Chung Shan Medical University, Information Technology Office of Chung Shan Medical University Hospital, Taichung, Taiwan. Abstract: In medical research, clinical practice must often be undertaken with imperfect information from limited resources. This study applied Bayesian imperfect information-value analysis to realistic situations to produce likelih...
Bayesian methods for the analysis of inequality constrained contingency tables.
Laudy, Olav; Hoijtink, Herbert
2007-04-01
A Bayesian methodology for the analysis of inequality constrained models for contingency tables is presented. The problem of interest lies in obtaining the estimates of functions of cell probabilities subject to inequality constraints, testing hypotheses and selection of the best model. Constraints on conditional cell probabilities and on local, global, continuation and cumulative odds ratios are discussed. A Gibbs sampler to obtain a discrete representation of the posterior distribution of the inequality constrained parameters is used. Using this discrete representation, the credibility regions of functions of cell probabilities can be constructed. Posterior model probabilities are used for model selection and hypotheses are tested using posterior predictive checks. The Bayesian methodology proposed is illustrated in two examples.
BaTMAn: Bayesian Technique for Multi-image Analysis
Casado, J; García-Benito, R; Guidi, G; Choudhury, O S; Bellocchi, E; Sánchez, S; Díaz, A I
2016-01-01
This paper describes the Bayesian Technique for Multi-image Analysis (BaTMAn), a novel image segmentation technique based on Bayesian statistics, whose main purpose is to characterize an astronomical dataset containing spatial information and perform a tessellation based on the measurements and errors provided as input. The algorithm will iteratively merge spatial elements as long as they are statistically consistent with carrying the same information (i.e. signal compatible with being identical within the errors). We illustrate its operation and performance with a set of test cases that comprises both synthetic and real Integral-Field Spectroscopic (IFS) data. Our results show that the segmentations obtained by BaTMAn adapt to the underlying structure of the data, regardless of the precise details of their morphology and the statistical properties of the noise. The quality of the recovered signal represents an improvement with respect to the input, especially in those regions where the signal is actually con...
Bayesian tomography and integrated data analysis in fusion diagnostics
Li, Dong; Dong, Y. B.; Deng, Wei; Shi, Z. B.; Fu, B. Z.; Gao, J. M.; Wang, T. B.; Zhou, Yan; Liu, Yi; Yang, Q. W.; Duan, X. R.
2016-11-01
In this article, a Bayesian tomography method using a non-stationary Gaussian process as a prior is introduced. The Bayesian formalism allows quantities which bear uncertainty to be expressed in probabilistic form, so that the uncertainty of a final solution can be fully resolved from the confidence interval of a posterior probability. Moreover, a consistency check of that solution can be performed by checking whether the misfits between predicted and measured data are reasonably within an assumed data error. In particular, the accuracy of reconstructions is significantly improved by using the non-stationary Gaussian process, which can adapt to the varying smoothness of the emission distribution. An implementation of this method for the soft X-ray diagnostics on HL-2A has been used to explore relevant physics in equilibrium and MHD instability modes. This project is carried out within a large-scale inference framework, aiming at an integrated analysis of heterogeneous diagnostics.
A Bayesian Nonparametric Meta-Analysis Model
Karabatsos, George; Talbott, Elizabeth; Walker, Stephen G.
2015-01-01
In a meta-analysis, it is important to specify a model that adequately describes the effect-size distribution of the underlying population of studies. The conventional normal fixed-effect and normal random-effects models assume a normal effect-size population distribution, conditionally on parameters and covariates. For estimating the mean overall…
Bayesian networks for omics data analysis
Gavai, A.K.
2009-01-01
This thesis focuses on two aspects of high throughput technologies, i.e. data storage and data analysis, in particular in transcriptomics and metabolomics. Both technologies are part of a research field that is generally called ‘omics’ (or ‘-omics’, with a leading hyphen), which refers to genomics, transcriptomics, proteomics, or metabolomics. Although these techniques study different entities (genes, gene expression, proteins, or metabolites), they all have in common that they use high-throu...
Using change-point models to estimate empirical critical loads for nitrogen in mountain ecosystems.
Roth, Tobias; Kohli, Lukas; Rihm, Beat; Meier, Reto; Achermann, Beat
2017-01-01
To protect ecosystems and their services, the critical load concept has been implemented under the framework of the Convention on Long-range Transboundary Air Pollution (UNECE) to develop effects-oriented air pollution abatement strategies. Critical loads are thresholds below which damaging effects on sensitive habitats do not occur according to current knowledge. Here we use change-point models applied in a Bayesian context to overcome some of the difficulties when estimating empirical critical loads for nitrogen (N) from empirical data. We tested the method using simulated data with varying sample sizes, varying effects of confounding variables, and varying negative effects of N deposition on species richness. The method was applied to national-scale plant species richness data from mountain hay meadows and (sub)alpine scrub sites in Switzerland. Seven confounding factors (elevation, inclination, precipitation, calcareous content, aspect, and indicator values for humidity and light) were selected based on earlier studies examining numerous environmental factors to explain Swiss vascular plant diversity. The estimated critical load confirmed the existing empirical critical load of 5-15 kg N ha⁻¹ yr⁻¹ for (sub)alpine scrubs, while for mountain hay meadows the estimated critical load was at the lower end of the current empirical critical load range. Based on these results, we suggest narrowing the critical load range for mountain hay meadows to 10-15 kg N ha⁻¹ yr⁻¹.
A Bayesian analysis of pentaquark signals from CLAS data
Ireland, D G; Protopopescu, D; Ambrozewicz, P; Anghinolfi, M; Asryan, G; Avakian, H; Bagdasaryan, H; Baillie, N; Ball, J P; Baltzell, N A; Batourine, V; Battaglieri, M; Bedlinskiy, I; Bellis, M; Benmouna, N; Berman, B L; Biselli, A S; Blaszczyk, L; Bouchigny, S; Boiarinov, S; Bradford, R; Branford, D; Briscoe, W J; Brooks, W K; Burkert, V D; Butuceanu, C; Calarco, J R; Careccia, S L; Carman, D S; Casey, L; Chen, S; Cheng, L; Cole, P L; Collins, P; Coltharp, P; Crabb, D; Credé, V; Dashyan, N; De Masi, R; De Vita, R; De Sanctis, E; Degtyarenko, P V; Deur, A; Dickson, R; Djalali, C; Dodge, G E; Donnelly, J; Doughty, D; Dugger, M; Dzyubak, O P; Egiyan, H; Egiyan, K S; El Fassi, L; Elouadrhiri, L; Eugenio, P; Fedotov, G; Feldman, G; Fradi, A; Funsten, H; Garçon, M; Gavalian, G; Gevorgyan, N; Gilfoyle, G P; Giovanetti, K L; Girod, F X; Goetz, J T; Gohn, W; Gonenc, A; Gothe, R W; Griffioen, K A; Guidal, M; Guler, N; Guo, L; Gyurjyan, V; Hafidi, K; Hakobyan, H; Hanretty, C; Hassall, N; Hersman, F W; Hleiqawi, I; Holtrop, M; Hyde-Wright, C E; Ilieva, Y; Ishkhanov, B S; Isupov, E L; Jenkins, D; Jo, H S; Johnstone, J R; Joo, K; Jüngst, H G; Kalantarians, N; Kellie, J D; Khandaker, M; Kim, W; Klein, A; Klein, F J; Kossov, M; Krahn, Z; Kramer, L H; Kubarovski, V; Kühn, J; Kuleshov, S V; Kuznetsov, V; Lachniet, J; Laget, J M; Langheinrich, J; Lawrence, D; Livingston, K; Lu, H Y; MacCormick, M; Markov, N; Mattione, P; Mecking, B A; Mestayer, M D; Meyer, C A; Mibe, T; Mikhailov, K; Mirazita, M; Miskimen, R; Mokeev, V; Moreno, B; Moriya, K; Morrow, S A; Moteabbed, M; Munevar, E; Mutchler, G S; Nadel-Turonski, P; Nasseripour, R; Niccolai, S; Niculescu, G; Niculescu, I; Niczyporuk, B B; Niroula, M R; Niyazov, R A; Nozar, M; Osipenko, M; Ostrovidov, A I; Park, K; Pasyuk, E; Paterson, C; Anefalos Pereira, S; Pierce, J; Pivnyuk, N; Pogorelko, O; Pozdniakov, S; Price, J W; Procureur, S; Prok, Y; Raue, B A; Ricco, G; Ripani, M; Ritchie, B G; Ronchetti, F; Rosner, G; Rossi, P; Sabatie, F; 
Salamanca, J; Salgado, C; Santoro, J P; Sapunenko, V; Schumacher, R A; Serov, V S; Sharabyan, Yu G; Sharov, D; Shvedunov, N V; Smith, E S; Smith, L C; Sober, D I; Sokhan, D; Stavinsky, A; Stepanyan, S S; Stepanyan, S; Stokes, B E; Stoler, P; Strauch, S; Taiuti, M; Tedeschi, D J; Thoma, U; Tkabladze, A; Tkachenko, S; Tur, C; Ungaro, M; Vineyard, M F; Vlassov, A V; Watts, D P; Weinstein, L B; Weygand, D P; Williams, M; Wolin, E; Wood, M H; Yegneswaran, A; Zana, L; Zhang, J; Zhao, B; Zhao, Z W
2007-01-01
We examine the results of two measurements by the CLAS collaboration, one of which claimed evidence for a $\Theta^{+}$ pentaquark, whilst the other found no such evidence. The unique feature of these two experiments was that they were performed with the same experimental setup. Using a Bayesian analysis we find that the results of the two experiments are in fact compatible with each other, but that the first measurement did not contain sufficient information to determine unambiguously the existence of a $\Theta^{+}$. Further, we suggest a means by which the existence of a new candidate particle can be tested in a rigorous manner.
Bayesian Reasoning in Data Analysis A Critical Introduction
D'Agostini, Giulio
2003-01-01
This book provides a multi-level introduction to Bayesian reasoning (as opposed to "conventional statistics") and its applications to data analysis. The basic ideas of this "new" approach to the quantification of uncertainty are presented using examples from research and everyday life. Applications covered include: parametric inference; combination of results; treatment of uncertainty due to systematic errors and background; comparison of hypotheses; unfolding of experimental distributions; upper/lower bounds in frontier-type measurements. Approximate methods for routine use are derived and ar
Bayesian frequency analysis of HD 201433 observations with BRITE
Kallinger, T
2016-01-01
Multiple oscillation frequencies separated by close to or less than the formal frequency resolution of a data set are a serious problem in the frequency analysis of time series data. We present a new and fully automated Bayesian approach that searches for close frequencies in time series data and assesses their significance by comparison to no signal and a mono-periodic signal. We extensively test the approach with synthetic data sets and apply it to the 156 days-long high-precision BRITE photometry of the SPB star HD 201433, for which we find a sequence of nine statistically significant rotationally split dipole modes.
A Bayesian analysis of pentaquark signals from CLAS data
Energy Technology Data Exchange (ETDEWEB)
David Ireland; Bryan McKinnon; Dan Protopopescu; Pawel Ambrozewicz; Marco Anghinolfi; G. Asryan; Harutyun Avakian; H. Bagdasaryan; Nathan Baillie; Jacques Ball; Nathan Baltzell; V. Batourine; Marco Battaglieri; Ivan Bedlinski; Ivan Bedlinskiy; Matthew Bellis; Nawal Benmouna; Barry Berman; Angela Biselli; Lukasz Blaszczyk; Sylvain Bouchigny; Sergey Boyarinov; Robert Bradford; Derek Branford; William Briscoe; William Brooks; Volker Burkert; Cornel Butuceanu; John Calarco; Sharon Careccia; Daniel Carman; Liam Casey; Shifeng Chen; Lu Cheng; Philip Cole; Patrick Collins; Philip Coltharp; Donald Crabb; Volker Crede; Natalya Dashyan; Rita De Masi; Raffaella De Vita; Enzo De Sanctis; Pavel Degtiarenko; Alexandre Deur; Richard Dickson; Chaden Djalali; Gail Dodge; Joseph Donnelly; David Doughty; Michael Dugger; Oleksandr Dzyubak; Hovanes Egiyan; Kim Egiyan; Lamiaa Elfassi; Latifa Elouadrhiri; Paul Eugenio; Gleb Fedotov; Gerald Feldman; Ahmed Fradi; Herbert Funsten; Michel Garcon; Gagik Gavalian; Nerses Gevorgyan; Gerard Gilfoyle; Kevin Giovanetti; Francois-Xavier Girod; John Goetz; Wesley Gohn; Atilla Gonenc; Ralf Gothe; Keith Griffioen; Michel Guidal; Nevzat Guler; Lei Guo; Vardan Gyurjyan; Kawtar Hafidi; Hayk Hakobyan; Charles Hanretty; Neil Hassall; F. Hersman; Ishaq Hleiqawi; Maurik Holtrop; Charles Hyde; Yordanka Ilieva; Boris Ishkhanov; Eugeny Isupov; D. Jenkins; Hyon-Suk Jo; John Johnstone; Kyungseon Joo; Henry Juengst; Narbe Kalantarians; James Kellie; Mahbubul Khandaker; Wooyoung Kim; Andreas Klein; Franz Klein; Mikhail Kossov; Zebulun Krahn; Laird Kramer; Valery Kubarovsky; Joachim Kuhn; Sergey Kuleshov; Viacheslav Kuznetsov; Jeff Lachniet; Jean Laget; Jorn Langheinrich; D. 
Lawrence; Kenneth Livingston; Haiyun Lu; Marion MacCormick; Nikolai Markov; Paul Mattione; Bernhard Mecking; Mac Mestayer; Curtis Meyer; Tsutomu Mibe; Konstantin Mikhaylov; Marco Mirazita; Rory Miskimen; Viktor Mokeev; Brahim Moreno; Kei Moriya; Steven Morrow; Maryam Moteabbed; Edwin Munevar Espitia; Gordon Mutchler; Pawel Nadel-Turonski; Rakhsha Nasseripour; Silvia Niccolai; Gabriel Niculescu; Maria-Ioana Niculescu; Bogdan Niczyporuk; Megh Niroula; Rustam Niyazov; Mina Nozar; Mikhail Osipenko; Alexander Ostrovidov; Kijun Park; Evgueni Pasyuk; Craig Paterson; Sergio Pereira; Joshua Pierce; Nikolay Pivnyuk; Oleg Pogorelko; Sergey Pozdnyakov; John Price; Sebastien Procureur; Yelena Prok; Brian Raue; Giovanni Ricco; Marco Ripani; Barry Ritchie; Federico Ronchetti; Guenther Rosner; Patrizia Rossi; Franck Sabatie; Julian Salamanca; Carlos Salgado; Joseph Santoro; Vladimir Sapunenko; Reinhard Schumacher; Vladimir Serov; Youri Sharabian; Dmitri Sharov; Nikolay Shvedunov; Elton Smith; Lee Smith; Daniel Sober; Daria Sokhan; Aleksey Stavinskiy; Samuel Stepanyan; Stepan Stepanyan; Burnham Stokes; Paul Stoler; Steffen Strauch; Mauro Taiuti; David Tedeschi; Ulrike Thoma; Avtandil Tkabladze; Svyatoslav Tkachenko; Clarisse Tur; Maurizio Ungaro; Michael Vineyard; Alexander Vlassov; Daniel Watts; Lawrence Weinstein; Dennis Weygand; M. Williams; Elliott Wolin; M.H. Wood; Amrit Yegneswaran; Lorenzo Zana; Jixie Zhang; Bo Zhao; Zhiwen Zhao
2008-02-01
We examine the results of two measurements by the CLAS collaboration, one of which claimed evidence for a $\Theta^{+}$ pentaquark, whilst the other found no such evidence. The unique feature of these two experiments was that they were performed with the same experimental setup. Using a Bayesian analysis we find that the results of the two experiments are in fact compatible with each other, but that the first measurement did not contain sufficient information to determine unambiguously the existence of a $\Theta^{+}$. Further, we suggest a means by which the existence of a new candidate particle can be tested in a rigorous manner.
Safety Analysis of Liquid Rocket Engine Using Bayesian Networks
Institute of Scientific and Technical Information of China (English)
WANG Hua-wei; YAN Zhi-qiang
2007-01-01
Safety analysis of liquid rocket engines is valuable for shortening the development cycle, reducing development cost, and lowering development risk. The relationships between the structure and the components of a liquid rocket engine are complex, and test data are scarce in the development phase, so uncertainties exist in the safety analysis. A safety analysis model that integrates FMEA (failure mode and effect analysis) with Bayesian networks (BN) is proposed for liquid rocket engines; it can combine qualitative analysis with quantitative decision making. The method has the advantages of fusing information from multiple sources, requiring fewer samples, and achieving high accuracy. An example shows that the method is efficient.
Bayesian analysis of inflationary features in Planck and SDSS data
Benetti, Micol
2016-01-01
We perform a Bayesian analysis to study possible features in the primordial inflationary power spectrum of scalar perturbations. In particular, we analyse the possibility of detecting the imprint of these primordial features in the anisotropy temperature power spectrum of the Cosmic Microwave Background (CMB) and also in the matter power spectrum P(k). We use the most recent CMB data provided by the Planck Collaboration and P(k) measurements from the eleventh data release of the Sloan Digital Sky Survey. We focus our analysis on a class of potentials whose features are localised at different intervals of angular scales, corresponding to multipoles in the ranges 10 < l < 60 (Oscill-1) and 150 < l < 300 (Oscill-2). Our results show that one of the step-potentials (Oscill-1) provides a better fit to the CMB data than does the featureless LCDM scenario, with a moderate Bayesian evidence in favor of the former. Adding the P(k) data to the analysis weakens the evidence of the Oscill-1 potential relat...
Implementation of a Bayesian Engine for Uncertainty Analysis
Energy Technology Data Exchange (ETDEWEB)
Leng Vang; Curtis Smith; Steven Prescott
2014-08-01
In probabilistic risk assessment, it is important to have an environment where analysts have access to shared and secured high-performance computing and a statistical analysis tool package. As part of the advanced small modular reactor probabilistic risk analysis framework implementation, we have identified the need for advanced Bayesian computations. However, in order to make this technology available to non-specialists, there is also a need for a simplified tool that allows users to author models and evaluate them within this framework. As a proof-of-concept, we have implemented an advanced open source Bayesian inference tool, OpenBUGS, within the browser-based cloud risk analysis framework that is under development at the Idaho National Laboratory. This development, the “OpenBUGS Scripter”, has been implemented as a client-side, visual, web-based integrated development environment for creating OpenBUGS language scripts. It depends on the shared server environment to execute the generated scripts and to transmit results back to the user. The visual models are in the form of linked diagrams, from which we automatically create the applicable OpenBUGS script that matches the diagram. These diagrams can be saved locally or stored on the server environment to be shared with other users.
Evolutionary Sequential Monte Carlo Samplers for Change-Point Models
Directory of Open Access Journals (Sweden)
Arnaud Dufays
2016-03-01
Sequential Monte Carlo (SMC) methods are widely used for non-linear filtering purposes. However, the SMC scope encompasses wider applications such as estimating static model parameters so much that it is becoming a serious alternative to Markov-Chain Monte-Carlo (MCMC) methods. Not only do SMC algorithms draw posterior distributions of static or dynamic parameters but additionally they provide an estimate of the marginal likelihood. The tempered and time (TNT) algorithm, developed in this paper, combines (off-line) tempered SMC inference with on-line SMC inference for drawing realizations from many sequential posterior distributions without experiencing a particle degeneracy problem. Furthermore, it introduces a new MCMC rejuvenation step that is generic, automated and well-suited for multi-modal distributions. As this update relies on the wide heuristic optimization literature, numerous extensions are readily available. The algorithm is notably appropriate for estimating change-point models. As an example, we compare several change-point GARCH models through their marginal log-likelihoods over time.
Analysis of Wave Directional Spreading by Bayesian Parameter Estimation
Institute of Scientific and Technical Information of China (English)
钱桦; 莊士贤; 高家俊
2002-01-01
A spatial array of wave gauges installed on an observation platform has been designed and arranged to measure the local features of winter monsoon directional waves off the Taishi coast of Taiwan. A new method, named the Bayesian Parameter Estimation Method (BPEM), is developed and adopted to determine the main direction and the directional spreading parameter of directional spectra. The BPEM could be considered as a regression analysis to find the maximum joint probability of parameters, which best approximates the observed data from the Bayesian viewpoint. The result of the analysis of field wave data demonstrates the high dependency of the characteristics of normalized directional spreading on the wave age. The Mitsuyasu type empirical formula of directional spectrum is therefore modified to be representative of the monsoon wave field. Moreover, it is suggested that Smax could be expressed as a function of wave steepness. The values of Smax decrease with increasing steepness. Finally, a local directional spreading model, which is simple to use in engineering practice, is proposed.
Node Augmentation Technique in Bayesian Network Evidence Analysis and Marshaling
Energy Technology Data Exchange (ETDEWEB)
Keselman, Dmitry [Los Alamos National Laboratory; Tompkins, George H [Los Alamos National Laboratory; Leishman, Deborah A [Los Alamos National Laboratory
2010-01-01
Given a Bayesian network, sensitivity analysis is an important activity. This paper begins by describing a network augmentation technique which can simplify the analysis. Next, we present two techniques which allow the user to determine the probability distribution of a hypothesis node under conditions of uncertain evidence; i.e. the state of an evidence node or nodes is described by a user specified probability distribution. Finally, we conclude with a discussion of three criteria for ranking evidence nodes based on their influence on a hypothesis node. All of these techniques have been used in conjunction with a commercial software package. A Bayesian network based on a directed acyclic graph (DAG) G is a graphical representation of a system of random variables that satisfies the following Markov property: any node (random variable) is independent of its non-descendants given the state of all its parents (Neapolitan, 2004). For simplicity's sake, we consider only discrete variables with a finite number of states, though most of the conclusions may be generalized.
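The abstract does not say which two techniques the authors use for uncertain evidence. As a rough illustration only, the sketch below applies Jeffrey's rule, one standard way to condition on an evidence node whose state is given as a probability distribution, to a hypothetical two-node network H -> E. The node names, states, and all probabilities are assumptions made up for this example and do not come from the paper.

```python
def posterior_given_state(prior_h, lik_e_given_h, e):
    """P(H | E = e) by Bayes' rule for a two-node network H -> E."""
    joint = {h: prior_h[h] * lik_e_given_h[h][e] for h in prior_h}
    z = sum(joint.values())
    return {h: p / z for h, p in joint.items()}


def jeffrey_update(prior_h, lik_e_given_h, q_e):
    """Jeffrey's rule: average the hard-evidence posteriors P(H | E = e)
    with the user-specified weights q_e[e] on the evidence states."""
    out = {h: 0.0 for h in prior_h}
    for e, q in q_e.items():
        if q == 0.0:
            continue  # states with zero weight contribute nothing
        post = posterior_given_state(prior_h, lik_e_given_h, e)
        for h in out:
            out[h] += q * post[h]
    return out


# Hypothetical numbers, chosen only for illustration.
prior_h = {"true": 0.5, "false": 0.5}
lik_e_given_h = {
    "true": {"pos": 0.8, "neg": 0.2},
    "false": {"pos": 0.2, "neg": 0.8},
}
q_e = {"pos": 0.9, "neg": 0.1}  # uncertain evidence on node E

updated = jeffrey_update(prior_h, lik_e_given_h, q_e)
```

With hard evidence (all weight on one state of E) the function reduces to ordinary Bayesian conditioning, which is a useful sanity check on any such implementation.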
A Bayesian formulation of seismic fragility analysis of safety related equipment
Energy Technology Data Exchange (ETDEWEB)
Wang, Z-L.; Pandey, M.; Xie, W-C., E-mail: z268wang@uwaterloo.ca, E-mail: mdpandey@uwaterloo.ca, E-mail: xie@uwaterloo.ca [Univ. of Waterloo, Ontario (Canada)
2013-07-01
A Bayesian approach to seismic fragility analysis of safety-related equipment is formulated. Rather than treating the two sources of uncertainty in the parameter estimation separately in two steps using classical statistics, this article advocates a Bayesian hierarchical model for interpreting and combining the various uncertainties more clearly. In addition, with the availability of additional earthquake experience data and shaking table test results, a Bayesian approach to updating the fragility model of safety-related equipment is formulated by incorporating acquired failure and survivor evidence. Numerical results show the significance of the Bayesian approach in fragility analysis. (author)
A Bayesian subgroup analysis using collections of ANOVA models.
Liu, Jinzhong; Sivaganesan, Siva; Laud, Purushottam W; Müller, Peter
2017-03-20
We develop a Bayesian approach to subgroup analysis using ANOVA models with multiple covariates, extending an earlier work. We assume a two-arm clinical trial with normally distributed response variable. We also assume that the covariates for subgroup finding are categorical and are a priori specified, and parsimonious easy-to-interpret subgroups are preferable. We represent the subgroups of interest by a collection of models and use a model selection approach to finding subgroups with heterogeneous effects. We develop suitable priors for the model space and use an objective Bayesian approach that yields multiplicity adjusted posterior probabilities for the models. We use a structured algorithm based on the posterior probabilities of the models to determine which subgroup effects to report. Frequentist operating characteristics of the approach are evaluated using simulation. While our approach is applicable in more general cases, we mainly focus on the 2 × 2 case of two covariates each at two levels for ease of presentation. The approach is illustrated using a real data example.
A Bayesian Framework for Reliability Analysis of Spacecraft Deployments
Evans, John W.; Gallo, Luis; Kaminsky, Mark
2012-01-01
Deployable subsystems are essential to mission success of most spacecraft. These subsystems enable critical functions including power, communications and thermal control. The loss of any of these functions will generally result in loss of the mission. These subsystems and their components often consist of unique designs and applications for which various standardized data sources are not applicable for estimating reliability and for assessing risks. In this study, a two-stage sequential Bayesian framework for reliability estimation of spacecraft deployment was developed for this purpose. This process was then applied to the James Webb Space Telescope (JWST) Sunshield subsystem, a unique design intended for thermal control of the Optical Telescope Element. Initially, detailed studies of NASA deployment history, "heritage information", were conducted, extending over 45 years of spacecraft launches. This information was then coupled to a non-informative prior and a binomial likelihood function to create a posterior distribution for deployments of various subsystems using Markov chain Monte Carlo sampling. Select distributions were then coupled to a subsequent analysis, using test data and anomaly occurrences on successive ground test deployments of scale model test articles of JWST hardware, to update the NASA heritage data. This allowed for a realistic prediction for the reliability of the complex Sunshield deployment, with credibility limits, within this two-stage Bayesian framework.
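The two-stage idea can be sketched with a conjugate shortcut. The study itself samples the posterior by MCMC; the sketch below instead uses the closed-form Beta posterior that a binomial likelihood yields under a Beta prior, and it assumes a uniform Beta(1, 1) prior as one possible non-informative choice. The success and trial counts are invented placeholders, not NASA heritage or JWST test data.

```python
def beta_binomial_update(successes, trials, a=1.0, b=1.0):
    """Posterior Beta(a', b') after observing `successes` in `trials`
    with a binomial likelihood and a Beta(a, b) prior."""
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    return a + successes, b + (trials - successes)


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Stage 1: hypothetical heritage record, uniform Beta(1, 1) prior (assumed).
a1, b1 = beta_binomial_update(successes=48, trials=50)

# Stage 2: hypothetical ground-test results update the stage-1 posterior,
# which now plays the role of the prior.
a2, b2 = beta_binomial_update(successes=9, trials=10, a=a1, b=b1)

reliability_estimate = beta_mean(a2, b2)
```

Credibility limits would follow from the quantiles of the final Beta distribution; they are left out here to keep the sketch free of third-party dependencies.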
Developing and Testing a Bayesian Analysis of Fluorescence Lifetime Measurements
Needleman, Daniel J.
2017-01-01
FRET measurements can provide dynamic spatial information on length scales smaller than the diffraction limit of light. Several methods exist to measure FRET between fluorophores, including Fluorescence Lifetime Imaging Microscopy (FLIM), which relies on the reduction of fluorescence lifetime when a fluorophore is undergoing FRET. FLIM measurements take the form of histograms of photon arrival times, containing contributions from a mixed population of fluorophores both undergoing and not undergoing FRET, with the measured distribution being a mixture of exponentials of different lifetimes. Here, we present an analysis method based on Bayesian inference that rigorously takes into account several experimental complications. We test the precision and accuracy of our analysis on controlled experimental data and verify that we can faithfully extract model parameters, both in the low-photon and low-fraction regimes. PMID:28060890
BaTMAn: Bayesian Technique for Multi-image Analysis
Casado, J.; Ascasibar, Y.; García-Benito, R.; Guidi, G.; Choudhury, O. S.; Bellocchi, E.; Sánchez, S. F.; Díaz, A. I.
2016-12-01
Bayesian Technique for Multi-image Analysis (BaTMAn) characterizes any astronomical dataset containing spatial information and performs a tessellation based on the measurements and errors provided as input. The algorithm iteratively merges spatial elements as long as they are statistically consistent with carrying the same information (i.e. identical signal within the errors). The output segmentations successfully adapt to the underlying spatial structure, regardless of its morphology and/or the statistical properties of the noise. BaTMAn identifies (and keeps) all the statistically-significant information contained in the input multi-image (e.g. an IFS datacube). The main aim of the algorithm is to characterize spatially-resolved data prior to their analysis.
Bayesian Model Selection with Network Based Diffusion Analysis.
Whalen, Andrew; Hoppitt, William J E
2016-01-01
A number of recent studies have used Network Based Diffusion Analysis (NBDA) to detect the role of social transmission in the spread of a novel behavior through a population. In this paper we present a unified framework for performing NBDA in a Bayesian setting, and demonstrate how the Watanabe Akaike Information Criteria (WAIC) can be used for model selection. We present a specific example of applying this method to Time to Acquisition Diffusion Analysis (TADA). To examine the robustness of this technique, we performed a large scale simulation study and found that NBDA using WAIC could recover the correct model of social transmission under a wide range of cases, including under the presence of random effects, individual level variables, and alternative models of social transmission. This work suggests that NBDA is an effective and widely applicable tool for uncovering whether social transmission underpins the spread of a novel behavior, and may still provide accurate results even when key model assumptions are relaxed.
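For readers unfamiliar with WAIC, a minimal sketch of the computation follows. It uses the commonly quoted variance-based form, WAIC = -2 (lppd - p_WAIC), and assumes the pointwise log-likelihoods have already been evaluated at posterior draws. The function name and input layout are assumptions for illustration, and nothing here is specific to NBDA or to the authors' software.

```python
import math


def waic(log_lik):
    """WAIC on the deviance scale.

    log_lik[s][i] is log p(y_i | theta_s) for posterior draw s and
    observation i; at least two draws are required.
    """
    n_draws = len(log_lik)
    n_obs = len(log_lik[0])
    lppd = 0.0    # log pointwise predictive density
    p_waic = 0.0  # effective number of parameters (variance form)
    for i in range(n_obs):
        col = [log_lik[s][i] for s in range(n_draws)]
        m = max(col)
        # log of the posterior-mean likelihood, computed with log-sum-exp
        lppd += m + math.log(sum(math.exp(v - m) for v in col) / n_draws)
        mean = sum(col) / n_draws
        # sample variance of the log-likelihood across draws
        p_waic += sum((v - mean) ** 2 for v in col) / (n_draws - 1)
    return -2.0 * (lppd - p_waic)
```

Lower values indicate better expected predictive fit, so candidate models of social transmission would be compared by computing this quantity for each and preferring the smallest.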
BATMAN: Bayesian Technique for Multi-image Analysis
Casado, J.; Ascasibar, Y.; García-Benito, R.; Guidi, G.; Choudhury, O. S.; Bellocchi, E.; Sánchez, S. F.; Díaz, A. I.
2016-12-01
This paper describes the Bayesian Technique for Multi-image Analysis (BATMAN), a novel image-segmentation technique based on Bayesian statistics that characterizes any astronomical dataset containing spatial information and performs a tessellation based on the measurements and errors provided as input. The algorithm iteratively merges spatial elements as long as they are statistically consistent with carrying the same information (i.e. identical signal within the errors). We illustrate its operation and performance with a set of test cases including both synthetic and real Integral-Field Spectroscopic data. The output segmentations adapt to the underlying spatial structure, regardless of its morphology and/or the statistical properties of the noise. The quality of the recovered signal represents an improvement with respect to the input, especially in regions with low signal-to-noise ratio. However, the algorithm may be sensitive to small-scale random fluctuations, and its performance in presence of spatial gradients is limited. Due to these effects, errors may be underestimated by as much as a factor of two. Our analysis reveals that the algorithm prioritizes conservation of all the statistically-significant information over noise reduction, and that the precise choice of the input data has a crucial impact on the results. Hence, the philosophy of BATMAN is not to be used as a `black box' to improve the signal-to-noise ratio, but as a new approach to characterize spatially-resolved data prior to its analysis. The source code is publicly available at http://astro.ft.uam.es/SELGIFS/BaTMAn.
Japanese Dairy Cattle Productivity Analysis using Bayesian Network Model (BNM)
Directory of Open Access Journals (Sweden)
Iqbal Ahmed
2016-11-01
Japanese Dairy Cattle Productivity Analysis is carried out based on a Bayesian Network Model (BNM). Through an experiment with 280 Japanese anestrus Holstein dairy cows, it is found that estimating the presence of the estrous cycle using the BNM achieves almost 55% accuracy when all samples are considered. In contrast, almost 73% accurate estimation could be achieved when using suspended likelihood in the sample datasets. Moreover, when the proposed BNM has more confidence, the estimation accuracy lies between 93% and 100%. In addition, this research also reveals the optimum factors for finding the presence of the estrous cycle among the 270 individual dairy cows. The objective estimation methods using the BNM offer a way to overcome the error of subjective estimation of the estrous cycle among these Japanese dairy cattle.
Studies in Astronomical Time Series Analysis. VI. Bayesian Block Representations
Scargle, Jeffrey D; Jackson, Brad; Chiang, James
2012-01-01
This paper addresses the problem of detecting and characterizing local variability in time series and other forms of sequential data. The goal is to identify and characterize statistically significant variations, at the same time suppressing the inevitable corrupting observational errors. We present a simple nonparametric modeling technique and an algorithm implementing it - an improved and generalized version of Bayesian Blocks (Scargle 1998) - that finds the optimal segmentation of the data in the observation interval. The structure of the algorithm allows it to be used in either a real-time trigger mode, or a retrospective mode. Maximum likelihood or marginal posterior functions to measure model fitness are presented for events, binned counts, and measurements at arbitrary times with known error distributions. Problems addressed include those connected with data gaps, variable exposure, extension to piecewise linear and piecewise exponential representations, multi-variate time series data, analysis of vari...
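The optimal-segmentation step can be illustrated with a small dynamic program. The sketch below handles only the case of point measurements with known Gaussian errors, using the block fitness (sum of x/sigma^2)^2 / (2 * sum of 1/sigma^2) and a constant per-block penalty. The function name, the penalty value, and the toy data are assumptions for illustration; this is not the authors' reference implementation.

```python
def bayesian_blocks_points(x, sigma, ncp_prior):
    """Return the start index of each block in the best segmentation of x."""
    n = len(x)
    # Prefix sums of 1/sigma^2 and x/sigma^2 give any block's sums in O(1).
    w = [0.0]
    wx = [0.0]
    for xi, si in zip(x, sigma):
        w.append(w[-1] + 1.0 / si**2)
        wx.append(wx[-1] + xi / si**2)

    best = [0.0] * (n + 1)  # best[r]: best total fitness of the first r points
    last = [0] * (n + 1)    # last[r]: start index of the final block in that optimum
    for r in range(1, n + 1):
        best_val = None
        for l in range(r):  # candidate final block covers points l .. r-1
            a = w[r] - w[l]
            b = wx[r] - wx[l]
            val = best[l] + b * b / (2.0 * a) - ncp_prior
            if best_val is None or val > best_val:
                best_val, last[r] = val, l
        best[r] = best_val

    # Walk back through the stored block starts to recover the segmentation.
    starts = []
    r = n
    while r > 0:
        starts.append(last[r])
        r = last[r]
    return starts[::-1]


# Toy data with one step; unit errors and a hand-picked penalty (assumed).
data = [0.0, 0.0, 0.0, 0.0, 10.0, 10.0, 10.0, 10.0]
block_starts = bayesian_blocks_points(data, [1.0] * len(data), ncp_prior=1.0)
```

Because the best partition of the first r points is stored before point r + 1 is considered, the same loop can be run incrementally as data arrive, which is what makes a real-time trigger mode possible alongside the retrospective mode.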
A Bayesian analysis of regularised source inversions in gravitational lensing
Suyu, S H; Hobson, M P; Marshall, P J
2006-01-01
Strong gravitational lens systems with extended sources are of special interest because they provide additional constraints on the models of the lens systems. To use a gravitational lens system for measuring the Hubble constant, one would need to determine the lens potential and the source intensity distribution simultaneously. A linear inversion method to reconstruct a pixellated source distribution of a given lens potential model was introduced by Warren and Dye. In the inversion process, a regularisation on the source intensity is often needed to ensure a successful inversion with a faithful resulting source. In this paper, we use Bayesian analysis to determine the optimal regularisation constant (strength of regularisation) of a given form of regularisation and to objectively choose the optimal form of regularisation given a selection of regularisations. We consider and compare quantitatively three different forms of regularisation previously described in the literature for source inversions in gravitatio...
Objective Bayesian Comparison of Constrained Analysis of Variance Models.
Consonni, Guido; Paroli, Roberta
2016-10-04
In the social sciences we are often interested in comparing models specified by parametric equality or inequality constraints. For instance, when examining three group means [Formula: see text] through an analysis of variance (ANOVA), a model may specify that [Formula: see text], while another one may state that [Formula: see text], and finally a third model may instead suggest that all means are unrestricted. This is a challenging problem, because it involves a combination of nonnested models, as well as nested models having the same dimension. We adopt an objective Bayesian approach, requiring no prior specification from the user, and derive the posterior probability of each model under consideration. Our method is based on the intrinsic prior methodology, suitably modified to accommodate equality and inequality constraints. Focussing on normal ANOVA models, a comparative assessment is carried out through simulation studies. We also present an application to real data collected in a psychological experiment.
Bayesian analysis of factors associated with fibromyalgia syndrome subjects
Jayawardana, Veroni; Mondal, Sumona; Russek, Leslie
2015-01-01
Factors contributing to movement-related fear were assessed by Russek et al. (2014) for subjects with Fibromyalgia (FM), based on data collected by a national internet survey of community-based individuals. The study focused on the variables Activities-Specific Balance Confidence scale (ABC), Primary Care Post-Traumatic Stress Disorder screen (PC-PTSD), Tampa Scale of Kinesiophobia (TSK), a Joint Hypermobility Syndrome screen (JHS), Vertigo Symptom Scale (VSS-SF), Obsessive-Compulsive Personality Disorder (OCPD), Pain, work status and physical activity dependent from the "Revised Fibromyalgia Impact Questionnaire" (FIQR). The study presented in this paper revisits the same data with a Bayesian analysis in which appropriate priors were introduced for the variables selected in Russek's paper.
Bayesian large-scale structure inference and cosmic web analysis
Leclercq, Florent
2015-01-01
Surveys of the cosmic large-scale structure carry opportunities for building and testing cosmological theories about the origin and evolution of the Universe. This endeavor requires appropriate data assimilation tools, for establishing the contact between survey catalogs and models of structure formation. In this thesis, we present an innovative statistical approach for the ab initio simultaneous analysis of the formation history and morphology of the cosmic web: the BORG algorithm infers the primordial density fluctuations and produces physical reconstructions of the dark matter distribution that underlies observed galaxies, by assimilating the survey data into a cosmological structure formation model. The method, based on Bayesian probability theory, provides accurate means of uncertainty quantification. We demonstrate the application of BORG to the Sloan Digital Sky Survey data and describe the primordial and late-time large-scale structure in the observed volume. We show how the approach has led to the fi...
BASE-9: Bayesian Analysis for Stellar Evolution with nine variables
Robinson, Elliot; von Hippel, Ted; Stein, Nathan; Stenning, David; Wagner-Kaiser, Rachel; Si, Shijing; van Dyk, David
2016-08-01
The BASE-9 (Bayesian Analysis for Stellar Evolution with nine variables) software suite recovers star cluster and stellar parameters from photometry and is useful for analyzing single-age, single-metallicity star clusters, binaries, or single stars, and for simulating such systems. BASE-9 uses a Markov chain Monte Carlo (MCMC) technique along with brute force numerical integration to estimate the posterior probability distribution for the age, metallicity, helium abundance, distance modulus, line-of-sight absorption, and parameters of the initial-final mass relation (IFMR) for a cluster, and for the primary mass, secondary mass (if a binary), and cluster probability for every potential cluster member. The MCMC technique is used for the cluster quantities (the first six items listed above) and numerical integration is used for the stellar quantities (the last three items in the above list).
Reference priors of nuisance parameters in Bayesian sequential population analysis
Bousquet, Nicolas
2010-01-01
Prior distributions elicited for modelling the natural fluctuations or the uncertainty on parameters of Bayesian fishery population models can be chosen among a vast range of statistical laws. Since the statistical framework is defined by observational processes, observational parameters enter into the estimation and must be considered random, similarly to parameters or states of interest like population levels or real catches. The former are thus perceived as nuisance parameters whose values are intrinsically linked to the considered experiment, which also require noninformative priors. In fishery research Jeffreys methodology has been presented by Millar (2002) as a practical way to elicit such priors. However, they can present wrong properties in multiparameter contexts. Therefore we suggest using the elicitation method proposed by Berger and Bernardo to avoid paradoxical results raised by Jeffreys priors. These benchmark priors are derived here in the framework of sequential population analysis.
A Bayesian Seismic Hazard Analysis for the city of Naples
Faenza, Licia; Pierdominici, Simona; Hainzl, Sebastian; Cinti, Francesca R.; Sandri, Laura; Selva, Jacopo; Tonini, Roberto; Perfetti, Paolo
2016-04-01
In recent years many studies have focused on determining and defining the seismic, volcanic and tsunamigenic hazard in the city of Naples. The reason is that the town of Naples with its neighboring area is one of the most densely populated places in Italy. In addition, the risk is also increased by the type and condition of buildings and monuments in the city. It is therefore crucial to assess which active faults in Naples and the surrounding area could trigger an earthquake able to shake and damage the urban area. We collect data from the most reliable and complete databases of macroseismic intensity records (from 79 AD to present). Each seismic event has been associated with an active tectonic structure. Furthermore, a set of active faults located around the study area, well known from geological investigations and capable of shaking the city but not associated with any earthquake, has been taken into account in our study. This geological framework is the starting point for our Bayesian seismic hazard analysis for the city of Naples. We show the feasibility of formulating the hazard assessment procedure to include the information of past earthquakes in the probabilistic seismic hazard analysis. This strategy allows us, on the one hand, to enlarge the information used in the evaluation of the hazard, from alternative models for the earthquake generation process to past shaking, and, on the other hand, to explicitly account for all kinds of information and their uncertainties. The Bayesian scheme we propose is applied to evaluate the seismic hazard of Naples. We implement five different spatio-temporal models to parameterize the occurrence of earthquakes potentially dangerous for Naples. Subsequently we combine these hazard curves with ShakeMaps of past earthquakes that have been felt in Naples. The results are posterior hazard assessments for three exposure times, e.g., 50, 10 and 5 years, in a dense grid that covers the municipality of Naples, considering bedrock soil
Chung, Gregory K. W. K.; Dionne, Gary B.; Kaiser, William J.
2006-01-01
Our research question was whether we could develop a feasible technique, using Bayesian networks, to diagnose gaps in student knowledge. Thirty-four college-age participants completed tasks designed to measure conceptual knowledge, procedural knowledge, and problem-solving skills related to circuit analysis. A Bayesian network was used to model…
Causal Analysis Using Bayesian Networks (Bayesian Causal Analysis)
Institute of Scientific and Technical Information of China (English)
王双成; 林士敏; 陆玉昌
2000-01-01
The Bayesian causal analysis includes two techniques, one of which takes advantage of Bayesian network structure learning under the Causal Markov assumption and the presupposition that hidden variables are absent, and the other uses canonical form influence diagram. The two techniques possess their distinctive characteristics,and ought to be selected and put to use in the light of specific conditions.
Directory of Open Access Journals (Sweden)
Ildikó Ungvári
Genetic studies indicate a high number of potential factors related to asthma. Based on earlier linkage analyses we selected the 11q13 and 14q22 asthma susceptibility regions, for which we designed a partial genome screening study using 145 SNPs in 1201 individuals (436 asthmatic children and 765 controls). The results were evaluated with traditional frequentist methods and we applied a new statistical method, called Bayesian network based Bayesian multilevel analysis of relevance (BN-BMLA). This method uses Bayesian network representation to provide detailed characterization of the relevance of factors, such as joint significance, the type of dependency, and multi-target aspects. We estimated posteriors for these relations within the Bayesian statistical framework, in order to estimate the posterior probability that a variable is directly relevant or that its association is only mediated. With frequentist methods one SNP (rs3751464 in the FRMD6 gene) provided evidence for an association with asthma (OR = 1.43 (1.2-1.8); p = 3×10^-4). The possible role of the FRMD6 gene in asthma was also confirmed in an animal model and human asthmatics. In the BN-BMLA analysis altogether 5 SNPs in 4 genes were found relevant in connection with asthma phenotype: PRPF19 on chromosome 11, and FRMD6, PTGER2 and PTGDR on chromosome 14. In a subsequent step a partial dataset containing rhinitis and further clinical parameters was used, which allowed the analysis of relevance of SNPs for asthma and multiple targets. These analyses suggested that SNPs in the AHNAK and MS4A2 genes were indirectly associated with asthma. This paper indicates that BN-BMLA explores the relevant factors more comprehensively than traditional statistical methods and extends the scope of strong relevance based methods to include partial relevance, global characterization of relevance and multi-target relevance.
Bayesian Inference for NASA Probabilistic Risk and Reliability Analysis
Dezfuli, Homayoon; Kelly, Dana; Smith, Curtis; Vedros, Kurt; Galyean, William
2009-01-01
This document, Bayesian Inference for NASA Probabilistic Risk and Reliability Analysis, is intended to provide guidelines for the collection and evaluation of risk and reliability-related data. It is aimed at scientists and engineers familiar with risk and reliability methods and provides a hands-on approach to the investigation and application of a variety of risk and reliability data assessment methods, tools, and techniques. This document provides both a broad perspective on data analysis collection and evaluation issues and a narrow focus on the methods to implement a comprehensive information repository. The topics addressed herein cover the fundamentals of how data and information are to be used in risk and reliability analysis models and their potential role in decision making. Understanding these topics is essential to attaining a risk informed decision making environment that is being sought by NASA requirements and procedures such as 8000.4 (Agency Risk Management Procedural Requirements), NPR 8705.05 (Probabilistic Risk Assessment Procedures for NASA Programs and Projects), and the System Safety requirements of NPR 8715.3 (NASA General Safety Program Requirements).
Insights on the Bayesian spectral density method for operational modal analysis
Au, Siu-Kui
2016-01-01
This paper presents a study on the Bayesian spectral density method for operational modal analysis. The method makes Bayesian inference of the modal properties by using the sample power spectral density (PSD) matrix averaged over independent sets of ambient data. In the typical case with a single set of data, it is divided into non-overlapping segments and they are assumed to be independent. This study is motivated by a recent paper that reveals a mathematical equivalence of the method with the Bayesian FFT method. The latter does not require averaging concepts or the independent segment assumption. This study shows that the equivalence does not hold in reality because the theoretical long data asymptotic distribution of the PSD matrix may not be valid. A single time history can be considered long for the Bayesian FFT method but not necessarily for the Bayesian PSD method, depending on the number of segments.
Thermodynamically consistent Bayesian analysis of closed biochemical reaction systems
Directory of Open Access Journals (Sweden)
Goutsias John
2010-11-01
Background: Estimating the rate constants of a biochemical reaction system with known stoichiometry from noisy time series measurements of molecular concentrations is an important step for building predictive models of cellular function. Inference techniques currently available in the literature may produce rate constant values that defy necessary constraints imposed by the fundamental laws of thermodynamics. As a result, these techniques may lead to biochemical reaction systems whose concentration dynamics could not possibly occur in nature. Therefore, development of a thermodynamically consistent approach for estimating the rate constants of a biochemical reaction system is highly desirable. Results: We introduce a Bayesian analysis approach for computing thermodynamically consistent estimates of the rate constants of a closed biochemical reaction system with known stoichiometry given experimental data. Our method employs an appropriately designed prior probability density function that effectively integrates fundamental biophysical and thermodynamic knowledge into the inference problem. Moreover, it takes into account experimental strategies for collecting informative observations of molecular concentrations through perturbations. The proposed method employs a maximization-expectation-maximization algorithm that provides thermodynamically feasible estimates of the rate constant values and computes appropriate measures of estimation accuracy. We demonstrate various aspects of the proposed method on synthetic data obtained by simulating a subset of a well-known model of the EGF/ERK signaling pathway, and examine its robustness under conditions that violate key assumptions. Software, coded in MATLAB®, which implements all Bayesian analysis techniques discussed in this paper, is available free of charge at http://www.cis.jhu.edu/~goutsias/CSS%20lab/software.html. Conclusions: Our approach provides an attractive statistical methodology for
Modified Bayesian Kriging for Noisy Response Problems for Reliability Analysis
2015-01-01
CIE 2015 August 2-5, 2015, Boston, Massachusetts, USA [DRAFT] DETC2015-47370 MODIFIED BAYESIAN KRIGING FOR NOISY RESPONSE PROBLEMS FOR...Lamb US Army RDECOM/TARDEC Warren, MI 48397-5000, USA david.lamb@us.army.mil ABSTRACT This paper develops a new modified Bayesian Kriging (MBKG...surrogate modeling method for problems in which simulation analyses are inherently noisy and thus standard Kriging approaches fail to properly
Using Bayesian analysis in repeated preclinical in vivo studies for a more effective use of animals.
Walley, Rosalind; Sherington, John; Rastrick, Joe; Detrait, Eric; Hanon, Etienne; Watt, Gillian
2016-05-01
Whilst innovative Bayesian approaches are increasingly used in clinical studies, in the preclinical area Bayesian methods appear to be rarely used in the reporting of pharmacology data. This is particularly surprising in the context of regularly repeated in vivo studies where there is a considerable amount of data from historical control groups, which has potential value. This paper describes our experience with introducing Bayesian analysis for such studies using a Bayesian meta-analytic predictive approach. This leads naturally either to an informative prior for a control group as part of a full Bayesian analysis of the next study or using a predictive distribution to replace a control group entirely. We use quality control charts to illustrate study-to-study variation to the scientists and describe informative priors in terms of their approximate effective numbers of animals. We describe two case studies of animal models: the lipopolysaccharide-induced cytokine release model used in inflammation and the novel object recognition model used to screen cognitive enhancers, both of which show the advantage of a Bayesian approach over the standard frequentist analysis. We conclude that using Bayesian methods in stable repeated in vivo studies can result in a more effective use of animals, either by reducing the total number of animals used or by increasing the precision of key treatment differences. This will lead to clearer results and supports the "3Rs initiative" to Refine, Reduce and Replace animals in research. Copyright © 2016 John Wiley & Sons, Ltd.
Guidance on the implementation and reporting of a drug safety Bayesian network meta-analysis.
Ohlssen, David; Price, Karen L; Xia, H Amy; Hong, Hwanhee; Kerman, Jouni; Fu, Haoda; Quartey, George; Heilmann, Cory R; Ma, Haijun; Carlin, Bradley P
2014-01-01
The Drug Information Association Bayesian Scientific Working Group (BSWG) was formed in 2011 with a vision to ensure that Bayesian methods are well understood and broadly utilized for design and analysis and throughout the medical product development process, and to improve industrial, regulatory, and economic decision making. The group, composed of individuals from academia, industry, and regulatory, has as its mission to facilitate the appropriate use and contribute to the progress of Bayesian methodology. In this paper, the safety sub-team of the BSWG explores the use of Bayesian methods when applied to drug safety meta-analysis and network meta-analysis. Guidance is presented on the conduct and reporting of such analyses. We also discuss different structural model assumptions and provide discussion on prior specification. The work is illustrated through a case study involving a network meta-analysis related to the cardiovascular safety of non-steroidal anti-inflammatory drugs.
Using Bayesian Population Viability Analysis to Define Relevant Conservation Objectives.
Directory of Open Access Journals (Sweden)
Adam W Green
Adaptive management provides a useful framework for managing natural resources in the face of uncertainty. An important component of adaptive management is identifying clear, measurable conservation objectives that reflect the desired outcomes of stakeholders. A common objective is to have a sustainable population, or metapopulation, but it can be difficult to quantify a threshold above which such a population is likely to persist. We performed a Bayesian metapopulation viability analysis (BMPVA) using a dynamic occupancy model to quantify the characteristics of two wood frog (Lithobates sylvatica) metapopulations resulting in sustainable populations, and we demonstrate how the results could be used to define meaningful objectives that serve as the basis of adaptive management. We explored scenarios involving metapopulations with different numbers of patches (pools) using estimates of breeding occurrence and successful metamorphosis from two study areas to estimate the probability of quasi-extinction and calculate the proportion of vernal pools producing metamorphs. Our results suggest that ≥50 pools are required to ensure long-term persistence with approximately 16% of pools producing metamorphs in stable metapopulations. We demonstrate one way to incorporate the BMPVA results into a utility function that balances the trade-offs between ecological and financial objectives, which can be used in an adaptive management framework to make optimal, transparent decisions. Our approach provides a framework for using a standard method (i.e., PVA) and available information to inform a formal decision process to determine optimal and timely management policies.
A Bayesian model for the analysis of transgenerational epigenetic variation.
Varona, Luis; Munilla, Sebastián; Mouresan, Elena Flavia; González-Rodríguez, Aldemar; Moreno, Carlos; Altarriba, Juan
2015-01-23
Epigenetics has become one of the major areas of biological research. However, the degree of phenotypic variability that is explained by epigenetic processes still remains unclear. From a quantitative genetics perspective, the estimation of variance components is achieved by means of the information provided by the resemblance between relatives. In a previous study, this resemblance was described as a function of the epigenetic variance component and a reset coefficient that indicates the rate of dissipation of epigenetic marks across generations. Given these assumptions, we propose a Bayesian mixed model methodology that allows the estimation of epigenetic variance from a genealogical and phenotypic database. The methodology is based on the development of a matrix T of epigenetic relationships that depends on the reset coefficient. In addition, we present a simple procedure for the calculation of the inverse of this matrix (T^-1) and a Gibbs sampler algorithm that obtains posterior estimates of all the unknowns in the model. The new procedure was used with two simulated data sets and with a beef cattle database. In the simulated populations, the results of the analysis provided marginal posterior distributions that included the population parameters in the regions of highest posterior density. In the case of the beef cattle dataset, the posterior estimate of transgenerational epigenetic variability was very low and a model comparison test indicated that a model that did not include it was the most plausible.
Spatial Hierarchical Bayesian Analysis of the Historical Extreme Streamflow
Najafi, M. R.; Moradkhani, H.
2012-04-01
Analysis of the climate change impact on extreme hydro-climatic events is crucial for future hydrologic/hydraulic designs and water resources decision making. The purpose of this study is to investigate changes in the extreme value distribution parameters over time in order to assess the impact of climate change. We develop a statistical model using the observed streamflow data of the Columbia River Basin in the USA to estimate the changes of high flows as a function of time as well as other variables. The Generalized Pareto Distribution (GPD) is used to model the upper 95% flows during December through March for 31 gauge stations. In the process layer of the model, covariates including time, latitude, longitude, elevation and basin area are considered to assess the sensitivity of the model to each variable. A Markov Chain Monte Carlo (MCMC) method is used to estimate the parameters. The Spatial Hierarchical Bayesian technique models the GPD parameters spatially and borrows strength from other locations by pooling data together, while providing an explicit estimation of the uncertainties in all stages of modeling.
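To make the GPD step in this abstract more concrete, here is a minimal Python sketch of a peaks-over-threshold calculation: it extracts exceedances, evaluates the GPD log-likelihood (the quantity an MCMC sampler would typically evaluate), and computes quick method-of-moments estimates. The method-of-moments fit is a stand-in chosen for brevity, not the hierarchical MCMC estimation the study uses, and all function names are hypothetical.

```python
import math


def exceedances(data, threshold):
    """Amounts by which observations exceed the threshold."""
    return [v - threshold for v in data if v > threshold]


def gpd_log_likelihood(excess, shape, scale):
    """Log-likelihood of exceedances under a GPD(shape, scale)."""
    if scale <= 0:
        return -math.inf
    total = 0.0
    for y in excess:
        if abs(shape) < 1e-12:
            # shape -> 0 limit is the exponential distribution
            total += -math.log(scale) - y / scale
        else:
            z = 1.0 + shape * y / scale
            if z <= 0:
                return -math.inf  # observation outside the support
            total += -math.log(scale) - (1.0 / shape + 1.0) * math.log(z)
    return total


def gpd_moment_fit(excess):
    """Method-of-moments estimates (valid only when shape < 1/2)."""
    n = len(excess)
    mean = sum(excess) / n
    var = sum((e - mean) ** 2 for e in excess) / (n - 1)
    ratio = mean * mean / var  # equals 1 - 2*shape for a GPD
    return 0.5 * (1.0 - ratio), 0.5 * mean * (1.0 + ratio)


if __name__ == "__main__":
    flows = [3.0, 7.5, 12.0, 4.2, 15.8, 9.1, 20.3, 5.5]  # made-up values
    y = exceedances(flows, threshold=9.0)
    shape, scale = gpd_moment_fit(y)
    print(shape, scale, gpd_log_likelihood(y, shape, scale))
```

In a hierarchical setting such as the one described, the shape and scale would themselves depend on covariates and location; the sketch fixes them per site only to keep the example short.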
Studies in Astronomical Time Series Analysis. VI. Bayesian Block Representations
Scargle, Jeffrey D.; Norris, Jay P.; Jackson, Brad; Chiang, James
2013-01-01
This paper addresses the problem of detecting and characterizing local variability in time series and other forms of sequential data. The goal is to identify and characterize statistically significant variations, at the same time suppressing the inevitable corrupting observational errors. We present a simple nonparametric modeling technique and an algorithm implementing it, an improved and generalized version of Bayesian Blocks [Scargle 1998], that finds the optimal segmentation of the data in the observation interval. The structure of the algorithm allows it to be used in either a real-time trigger mode, or a retrospective mode. Maximum likelihood or marginal posterior functions to measure model fitness are presented for events, binned counts, and measurements at arbitrary times with known error distributions. Problems addressed include those connected with data gaps, variable exposure, extension to piecewise linear and piecewise exponential representations, multivariate time series data, analysis of variance, data on the circle, other data modes, and dispersed data. Simulations provide evidence that the detection efficiency for weak signals is close to a theoretical asymptotic limit derived by [Arias-Castro, Donoho and Huo 2003]. In the spirit of Reproducible Research [Donoho et al. (2008)] all of the code and data necessary to reproduce all of the figures in this paper are included as auxiliary material.
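The optimal-segmentation idea in this abstract can be sketched as a dynamic program over candidate block boundaries. The Python below is a simplified illustration for event data only, in retrospective mode. It assumes distinct event times, a block fitness of N(log N - log T) for N events in a block of length T, and a fixed per-block penalty `ncp_prior` whose default value is an arbitrary choice for this example; the function name is hypothetical and the code is not the authors' released implementation.

```python
import math


def bayesian_blocks_events(times, ncp_prior=4.0):
    """Return block edges for distinct event times via dynamic programming."""
    t = sorted(times)
    n = len(t)
    if n < 2 or any(b <= a for a, b in zip(t, t[1:])):
        raise ValueError("need at least two distinct event times")
    # Each event owns a cell bounded by the midpoints to its neighbours.
    edges = [t[0]] + [0.5 * (a + b) for a, b in zip(t, t[1:])] + [t[-1]]
    best = [0.0] * n  # best[r]: optimal total fitness of cells 0..r
    last = [0] * n    # last[r]: first cell of the final block in that optimum
    for r in range(n):
        best_val, best_k = -math.inf, 0
        for k in range(r + 1):
            count = r - k + 1                  # events in candidate block k..r
            width = edges[r + 1] - edges[k]    # duration of that block
            fitness = count * (math.log(count) - math.log(width)) - ncp_prior
            total = fitness + (best[k - 1] if k > 0 else 0.0)
            if total > best_val:
                best_val, best_k = total, k
        best[r], last[r] = best_val, best_k
    # Walk back through the stored block starts to recover the change points.
    starts = []
    r = n
    while r > 0:
        k = last[r - 1]
        starts.append(k)
        r = k
    starts.reverse()
    return [edges[k] for k in starts] + [edges[-1]]


if __name__ == "__main__":
    sparse = [float(i) for i in range(20)]
    dense = [20 + 0.05 * j for j in range(1, 41)]
    print(bayesian_blocks_events(sparse + dense))
```

The double loop makes the cost quadratic in the number of events. Because `best[r]` depends only on data up to cell `r`, the same recursion could in principle be updated as new events arrive, which is in the spirit of the real-time trigger mode the abstract mentions.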
Bayesian Spectral Analysis of Metal Abundance Deficient Stars
Sourlas, Epaminondas; van Dyk, David; Kashyap, Vinay; Drake, Jeremy; Pease, Deron
2002-01-01
Metallicity can be measured by analyzing the spectra in the X-ray region and comparing the flux in spectral lines to the flux in the underlying Bremsstrahlung continuum. In this paper we propose new Bayesian methods which directly model the Poisson nature of the data and thus are expected to exhibit improved sampling properties. Our model also accounts for the Poisson nature of background contamination of the observations, image blurring due to instrument response, and the absorption of photons in space. The resulting highly structured hierarchical model is fit using the Gibbs sampler, data augmentation and Metropolis-Hastings. We demonstrate our methods with the X-ray spectral analysis of several "Metal Abundance Deficient" stars. The model is designed to summarize the relative frequency of the energy of photons (X-ray or gamma-ray) arriving at a detector. Independent Poisson distributions are more appropriate to model the counts than the commonly used normal approximation. We model the high energy tail of th...
STUDIES IN ASTRONOMICAL TIME SERIES ANALYSIS. VI. BAYESIAN BLOCK REPRESENTATIONS
Energy Technology Data Exchange (ETDEWEB)
Scargle, Jeffrey D. [Space Science and Astrobiology Division, MS 245-3, NASA Ames Research Center, Moffett Field, CA 94035-1000 (United States); Norris, Jay P. [Physics Department, Boise State University, 2110 University Drive, Boise, ID 83725-1570 (United States); Jackson, Brad [The Center for Applied Mathematics and Computer Science, Department of Mathematics, San Jose State University, One Washington Square, MH 308, San Jose, CA 95192-0103 (United States); Chiang, James, E-mail: jeffrey.d.scargle@nasa.gov [W. W. Hansen Experimental Physics Laboratory, Kavli Institute for Particle Astrophysics and Cosmology, Department of Physics and SLAC National Accelerator Laboratory, Stanford University, Stanford, CA 94305 (United States)
2013-02-20
This paper addresses the problem of detecting and characterizing local variability in time series and other forms of sequential data. The goal is to identify and characterize statistically significant variations, at the same time suppressing the inevitable corrupting observational errors. We present a simple nonparametric modeling technique and an algorithm implementing it, an improved and generalized version of Bayesian Blocks, that finds the optimal segmentation of the data in the observation interval. The structure of the algorithm allows it to be used in either a real-time trigger mode, or a retrospective mode. Maximum likelihood or marginal posterior functions to measure model fitness are presented for events, binned counts, and measurements at arbitrary times with known error distributions. Problems addressed include those connected with data gaps, variable exposure, extension to piecewise linear and piecewise exponential representations, multivariate time series data, analysis of variance, data on the circle, other data modes, and dispersed data. Simulations provide evidence that the detection efficiency for weak signals is close to a theoretical asymptotic limit derived by Arias-Castro et al. In the spirit of Reproducible Research all of the code and data necessary to reproduce all of the figures in this paper are included as supplementary material.
Comparison of Bayesian and Classical Analysis of Weibull Regression Model: A Simulation Study
Directory of Open Access Journals (Sweden)
İmran KURT ÖMÜRLÜ
2011-01-01
Objective: The purpose of this study was to compare the performance of the classical Weibull Regression Model (WRM) and the Bayesian-WRM under varying conditions using Monte Carlo simulations. Material and Methods: Data were simulated and analyzed with both the classical WRM and the Bayesian-WRM under varying informative priors and sample sizes using our simulation algorithm. Sample sizes of n=50, 100 and 250 were used, and informative prior values for b1 were selected using a normal prior distribution. For each situation, 1000 simulations were performed. Results: Bayesian-WRM with a proper informative prior performed well, with very little bias. The bias of Bayesian-WRM increased as the priors moved away from reliable values, at all sample sizes. Furthermore, given proper priors, Bayesian-WRM produced predictions with smaller standard errors than the classical WRM in both small and large samples. Conclusion: In this simulation study, Bayesian-WRM performed better than the classical method when the analysis incorporated expert opinion and historical knowledge about the parameters. Consequently, Bayesian-WRM should be preferred when reliable informative priors exist; otherwise, the classical WRM should be preferred.
JBASE: Joint Bayesian Analysis of Subphenotypes and Epistasis
Colak, Recep; Kim, TaeHyung; Kazan, Hilal; Oh, Yoomi; Cruz, Miguel; Valladares-Salgado, Adan; Peralta, Jesus; Escobedo, Jorge; Parra, Esteban J.; Kim, Philip M.; Goldenberg, Anna
2016-01-01
Motivation: Rapid advances in genotyping and genome-wide association studies have enabled the discovery of many new genotype–phenotype associations at the resolution of individual markers. However, these associations explain only a small proportion of theoretically estimated heritability of most diseases. In this work, we propose an integrative mixture model called JBASE: joint Bayesian analysis of subphenotypes and epistasis. JBASE explores two major reasons for missing heritability: interactions between genetic variants, a phenomenon known as epistasis, and phenotypic heterogeneity, addressed via subphenotyping. Results: Our extensive simulations in a wide range of scenarios repeatedly demonstrate that JBASE can identify true underlying subphenotypes, including their associated variants and their interactions, with high precision. In the presence of phenotypic heterogeneity, JBASE has higher Power and lower Type 1 Error than five state-of-the-art approaches. We applied our method to a sample of individuals from Mexico with Type 2 diabetes and discovered two novel epistatic modules, including two loci each, that define two subphenotypes characterized by differences in body mass index and waist-to-hip ratio. We successfully replicated these subphenotypes and epistatic modules in an independent dataset from Mexico genotyped with a different platform. Availability and implementation: JBASE is implemented in C++, supported on Linux and is available at http://www.cs.toronto.edu/~goldenberg/JBASE/jbase.tar.gz. The genotype data underlying this study are available upon approval by the ethics review board of the Medical Centre Siglo XXI. Please contact Dr Miguel Cruz at mcruzl@yahoo.com for assistance with the application. Contact: anna.goldenberg@utoronto.ca Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26411870
Bayesian Analysis of Multiple Populations in Galactic Globular Clusters
Wagner-Kaiser, Rachel A.; Sarajedini, Ata; von Hippel, Ted; Stenning, David; Piotto, Giampaolo; Milone, Antonino; van Dyk, David A.; Robinson, Elliot; Stein, Nathan
2016-01-01
We use GO 13297 Cycle 21 Hubble Space Telescope (HST) observations and archival GO 10775 Cycle 14 HST ACS Treasury observations of Galactic Globular Clusters to find and characterize multiple stellar populations. Determining how globular clusters are able to create and retain enriched material to produce several generations of stars is key to understanding how these objects formed and how they have affected the structural, kinematic, and chemical evolution of the Milky Way. We employ a sophisticated Bayesian technique with an adaptive MCMC algorithm to simultaneously fit the age, distance, absorption, and metallicity for each cluster. At the same time, we also fit unique helium values to two distinct populations of the cluster and determine the relative proportions of those populations. Our unique numerical approach allows objective and precise analysis of these complicated clusters, providing posterior distribution functions for each parameter of interest. We use these results to gain a better understanding of multiple populations in these clusters and their role in the history of the Milky Way. Support for this work was provided by NASA through grant numbers HST-GO-10775 and HST-GO-13297 from the Space Telescope Science Institute, which is operated by AURA, Inc., under NASA contract NAS5-26555. This material is based upon work supported by the National Aeronautics and Space Administration under Grant NNX11AF34G issued through the Office of Space Science. This project was supported by the National Aeronautics & Space Administration through the University of Central Florida's NASA Florida Space Grant Consortium.
Model selection by LASSO methods in a change-point model
Ciuperca, Gabriela
2011-01-01
The paper considers a linear regression model with multiple change-points occurring at unknown times. The LASSO technique is attractive here because it performs parametric estimation, including estimation of the change-points, and automatic variable selection simultaneously. The asymptotic properties of the LASSO-type estimator (which includes the LASSO estimator as a particular case) and of the adaptive LASSO estimator are studied. For the latter, the oracle properties are proved. In both cases, a model selection criterion is proposed. Numerical examples are provided showing the performance of the adaptive LASSO estimator compared to the LS estimator.
Quantum System Identification: Hamiltonian Estimation using Spectral and Bayesian Analysis
Schirmer, S G
2009-01-01
Identifying the Hamiltonian of a quantum system from experimental data is considered. General limits on the identifiability of model parameters with limited experimental resources are investigated, and a specific Bayesian estimation procedure is proposed and evaluated for a model system where a-priori information about the Hamiltonian's structure is available.
Carvalho, Pedro; Marques, Rui Cunha
2016-02-15
This study aims to search for economies of size and scope in the Portuguese water sector applying Bayesian and classical statistics to make inference in stochastic frontier analysis (SFA). This study demonstrates the usefulness and advantages of applying Bayesian statistics for making inference in SFA over traditional SFA, which uses only classical statistics. The resulting Bayesian methods overcome some problems that arise in the application of traditional SFA, such as bias in small samples and skewness of residuals. In the present case study of the water sector in Portugal, these Bayesian methods provide more plausible and acceptable results. Based on the results obtained, we found that there are important economies of output density, economies of size, economies of vertical integration and economies of scope in the Portuguese water sector, pointing to the substantial advantages of undertaking mergers by joining the retail and wholesale components and by joining the drinking water and wastewater services.
Applied Bayesian Hierarchical Methods
Congdon, Peter D
2010-01-01
Bayesian methods facilitate the analysis of complex models and data structures. Emphasizing data applications, alternative modeling specifications, and computer implementation, this book provides a practical overview of methods for Bayesian analysis of hierarchical models.
Use of SAMC for Bayesian analysis of statistical models with intractable normalizing constants
Jin, Ick Hoon
2014-03-01
Statistical inference for the models with intractable normalizing constants has attracted much attention. During the past two decades, various approximation- or simulation-based methods have been proposed for the problem, such as the Monte Carlo maximum likelihood method and the auxiliary variable Markov chain Monte Carlo methods. The Bayesian stochastic approximation Monte Carlo algorithm specifically addresses this problem: It works by sampling from a sequence of approximate distributions with their average converging to the target posterior distribution, where the approximate distributions can be achieved using the stochastic approximation Monte Carlo algorithm. A strong law of large numbers is established for the Bayesian stochastic approximation Monte Carlo estimator under mild conditions. Compared to the Monte Carlo maximum likelihood method, the Bayesian stochastic approximation Monte Carlo algorithm is more robust to the initial guess of model parameters. Compared to the auxiliary variable MCMC methods, the Bayesian stochastic approximation Monte Carlo algorithm avoids the requirement for perfect samples, and thus can be applied to many models for which perfect sampling is not available or very expensive. The Bayesian stochastic approximation Monte Carlo algorithm also provides a general framework for approximate Bayesian analysis. © 2012 Elsevier B.V. All rights reserved.
A Dynamic Bayesian Approach to Computational Laban Shape Quality Analysis
Directory of Open Access Journals (Sweden)
Dilip Swaminathan
2009-01-01
kinesiology. LMA (especially Effort/Shape) emphasizes how internal feelings and intentions govern the patterning of movement throughout the whole body. As we argue, a complex understanding of intention via LMA is necessary for human-computer interaction to become embodied in ways that resemble interaction in the physical world. We thus introduce a novel, flexible Bayesian fusion approach for identifying LMA Shape qualities from raw motion capture data in real time. The method uses a dynamic Bayesian network (DBN) to fuse movement features across the body and across time and, as we discuss, can be readily adapted for low-cost video. It has delivered excellent performance in preliminary studies comprising improvisatory movements. Our approach has been incorporated in Response, a mixed-reality environment where users interact via natural, full-body human movement and enhance their bodily-kinesthetic awareness through immersive sound and light feedback, with applications to kinesiology training, Parkinson's patient rehabilitation, interactive dance, and many other areas.
A Bayesian Analysis of the Radioactive Releases of Fukushima
DEFF Research Database (Denmark)
Tomioka, Ryota; Mørup, Morten
2012-01-01
The Fukushima Daiichi disaster of 11 March 2011 is considered the largest nuclear accident since the 1986 Chernobyl disaster and has been rated at level 7 on the International Nuclear Event Scale. As different radioactive materials have different effects on the human body, it is important to know...... the types of nuclides and their levels of concentration from the recorded mixture of radiations to take necessary measures. We presently formulate a Bayesian generative model for the data available on radioactive releases from the Fukushima Daiichi disaster across Japan. From the sparsely sampled...... the Fukushima Daiichi plant we establish that the model is able to account for the data. We further demonstrate how the model extends to include all the available measurements recorded throughout Japan. The model can be considered a first attempt to apply Bayesian learning unsupervised in order to give a more...
Doubly Bayesian Analysis of Confidence in Perceptual Decision-Making.
Aitchison, L.; Bang, D; Bahrami, B.; Latham, P.E.
2015-01-01
Humans stand out from other animals in that they are able to explicitly report on the reliability of their internal operations. This ability, which is known as metacognition, is typically studied by asking people to report their confidence in the correctness of some decision. However, the computations underlying confidence reports remain unclear. In this paper, we present a fully Bayesian method for directly comparing models of confidence. Using a visual two-interval forced-choice task, we te...
Bayesian analysis of the flutter margin method in aeroelasticity
Khalil, Mohammad; Poirel, Dominique; Sarkar, Abhijit
2016-12-01
A Bayesian statistical framework is presented for the Zimmerman and Weissenburger flutter margin method which considers the uncertainties in aeroelastic modal parameters. The proposed methodology overcomes the limitations of the previously developed least-squares-based estimation technique, which relies on the Gaussian approximation of the flutter margin probability density function (pdf). Using the measured free-decay responses at subcritical (preflutter) airspeeds, the joint non-Gaussian posterior pdf of the modal parameters is sampled using the Metropolis-Hastings (MH) Markov chain Monte Carlo (MCMC) algorithm. The posterior MCMC samples of the modal parameters are then used to obtain the flutter margin pdfs and finally the flutter speed pdf. The usefulness of the Bayesian flutter margin method is demonstrated using synthetic data generated from a two-degree-of-freedom pitch-plunge aeroelastic model. The robustness of the statistical framework is demonstrated using different sets of measurement data. It will be shown that the probabilistic (Bayesian) approach reduces the number of test points required in providing a flutter speed estimate for a given accuracy and precision.
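The sampling step named in this abstract can be illustrated with a generic random-walk Metropolis-Hastings sampler. The Python sketch below targets a one-dimensional toy density rather than the joint posterior of aeroelastic modal parameters; the function name, the Gaussian proposal, and the step size are assumptions made for this example and are not taken from the paper.

```python
import math
import random


def metropolis_hastings(log_target, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings for a 1-D unnormalized log density."""
    rng = random.Random(seed)
    x = x0
    logp = log_target(x)
    samples = []
    for _ in range(n_samples):
        # Symmetric Gaussian proposal, so the Hastings ratio reduces to
        # the ratio of target densities.
        proposal = x + rng.gauss(0.0, step)
        logp_new = log_target(proposal)
        accept_prob = math.exp(min(0.0, logp_new - logp))
        if rng.random() < accept_prob:
            x, logp = proposal, logp_new
        samples.append(x)  # a rejected proposal repeats the current state
    return samples


if __name__ == "__main__":
    # Toy target: standard normal, known only up to a constant
    draws = metropolis_hastings(lambda x: -0.5 * x * x, x0=0.0, n_samples=20000)
    mean = sum(draws) / len(draws)
    print(mean)
```

In an application like the one described, each posterior draw would then be pushed through the flutter margin calculation, so that the spread of the draws carries over to the flutter speed estimate.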
A Gibbs sampler for Bayesian analysis of site-occupancy data
Dorazio, Robert M.; Rodriguez, Daniel Taylor
2012-01-01
1. A Bayesian analysis of site-occupancy data containing covariates of species occurrence and species detection probabilities is usually completed using Markov chain Monte Carlo methods in conjunction with software programs that can implement those methods for any statistical model, not just site-occupancy models. Although these software programs are quite flexible, considerable experience is often required to specify a model and to initialize the Markov chain so that summaries of the posterior distribution can be estimated efficiently and accurately. 2. As an alternative to these programs, we develop a Gibbs sampler for Bayesian analysis of site-occupancy data that include covariates of species occurrence and species detection probabilities. This Gibbs sampler is based on a class of site-occupancy models in which probabilities of species occurrence and detection are specified as probit-regression functions of site- and survey-specific covariate measurements. 3. To illustrate the Gibbs sampler, we analyse site-occupancy data of the blue hawker, Aeshna cyanea (Odonata, Aeshnidae), a common dragonfly species in Switzerland. Our analysis includes a comparison of results based on Bayesian and classical (non-Bayesian) methods of inference. We also provide code (based on the R software program) for conducting Bayesian and classical analyses of site-occupancy data.
Bayesian inference – a way to combine statistical data and semantic analysis meaningfully
Directory of Open Access Journals (Sweden)
Eila Lindfors
2011-11-01
This article focuses on presenting the possibilities of Bayesian modelling (Finite Mixture Modelling) in the semantic analysis of statistically modelled data. The probability of a hypothesis in relation to the data available is an important question in inductive reasoning. Bayesian modelling allows the researcher to use many models at a time and provides tools to evaluate the goodness of different models. The researcher should always be aware that there is no such thing as the exact probability of an exact event. This is the reason for using probabilistic models. Each model presents a different perspective on the phenomenon in focus, and the researcher has to choose the most probable model with a view to previous research and the knowledge available. The idea of Bayesian modelling is illustrated here by presenting two different sets of data, one from craft science research (n=167) and the other (n=63) from educational research (Lindfors, 2007, 2002). The principles of how to build models and how to combine different profiles are described in the light of the research mentioned. Bayesian modelling is an analysis based on calculating probabilities in relation to a specific set of quantitative data. It is a tool for handling data and interpreting it semantically. The reliability of the analysis arises from an argumentation of which model can be selected from the model space as the basis for an interpretation, and on which arguments. Keywords: method, sloyd, Bayesian modelling, student teachers. URN:NBN:no-29959
PAC-Bayesian Analysis of the Exploration-Exploitation Trade-off
Seldin, Yevgeny; Laviolette, François; Auer, Peter; Shawe-Taylor, John; Peters, Jan
2011-01-01
We develop a coherent framework for integrative simultaneous analysis of the exploration-exploitation and model order selection trade-offs. We improve over our preceding results on the same subject (Seldin et al., 2011) by combining PAC-Bayesian analysis with Bernstein-type inequality for martingales. Such a combination is also of independent interest for studies of multiple simultaneously evolving martingales.
Stakhovych, Stanislav; Bijmolt, Tammo H. A.; Wedel, Michel
2012-01-01
In this article, we present a Bayesian spatial factor analysis model. We extend previous work on confirmatory factor analysis by including geographically distributed latent variables and accounting for heterogeneity and spatial autocorrelation. The simulation study shows excellent recovery of the model parameters and demonstrates the consequences…
Nuclear stockpile stewardship and Bayesian image analysis (DARHT and the BIE)
Energy Technology Data Exchange (ETDEWEB)
Carroll, James L [Los Alamos National Laboratory]
2011-01-11
Since the end of nuclear testing, the reliability of our nation's nuclear weapon stockpile has been assessed using sub-critical hydrodynamic testing. These tests involve some pretty 'extreme' radiography. We will be discussing the challenges and solutions to these problems provided by DARHT (the world's premier hydrodynamic testing facility) and the BIE or Bayesian Inference Engine (a powerful radiography analysis software tool). We will discuss the application of Bayesian image analysis techniques to this important and difficult problem.
Quantum System Identification by Bayesian Analysis of Noisy Data: Beyond Hamiltonian Tomography
Schirmer, S G
2009-01-01
We consider how to characterize the dynamics of a quantum system from a restricted set of initial states and measurements using Bayesian analysis. Previous work has shown that Hamiltonian systems can be well estimated from analysis of noisy data. Here we show how to generalize this approach to systems with moderate dephasing in the eigenbasis of the Hamiltonian. We illustrate the process for a range of three-level quantum systems. The results suggest that the Bayesian estimation of the frequencies and dephasing rates is generally highly accurate and that the main source of error is errors in the reconstructed Hamiltonian basis.
Li, Shi; Mukherjee, Bhramar; Batterman, Stuart; Ghosh, Malay
2013-12-01
Case-crossover designs are widely used to study short-term exposure effects on the risk of acute adverse health events. While the frequentist literature on this topic is vast, there is no Bayesian work in this general area. The contribution of this paper is twofold. First, the paper establishes Bayesian equivalence results that require characterization of the set of priors under which the posterior distributions of the risk ratio parameters based on a case-crossover and time-series analysis are identical. Second, the paper studies inferential issues under case-crossover designs in a Bayesian framework. Traditionally, a conditional logistic regression is used for inference on risk-ratio parameters in case-crossover studies. We consider instead a more general full likelihood-based approach which makes less restrictive assumptions on the risk functions. Formulation of a full likelihood leads to growth in the number of parameters proportional to the sample size. We propose a semi-parametric Bayesian approach using a Dirichlet process prior to handle the random nuisance parameters that appear in a full likelihood formulation. We carry out a simulation study to compare the Bayesian methods based on full and conditional likelihood with the standard frequentist approaches for case-crossover and time-series analysis. The proposed methods are illustrated through the Detroit Asthma Morbidity, Air Quality and Traffic study, which examines the association between acute asthma risk and ambient air pollutant concentrations.
[Meta analysis of the use of Bayesian networks in breast cancer diagnosis].
Simões, Priscyla Waleska; Silva, Geraldo Doneda da; Moretti, Gustavo Pasquali; Simon, Carla Sasso; Winnikow, Erik Paul; Nassar, Silvia Modesto; Medeiros, Lidia Rosi; Rosa, Maria Inês
2015-01-01
The aim of this study was to determine the accuracy of Bayesian networks in supporting breast cancer diagnoses. Systematic review and meta-analysis were carried out, including articles and papers published between January 1990 and March 2013. We included prospective and retrospective cross-sectional studies of the accuracy of diagnoses of breast lesions (target conditions) made using Bayesian networks (index test). Four primary studies that included 1,223 breast lesions were analyzed; 89.52% (444/496) of the breast cancer cases and 6.33% (46/727) of the benign lesions were positive based on the Bayesian network analysis. The area under the curve (AUC) for the summary receiver operating characteristic curve (SROC) was 0.97, with a Q* value of 0.92. Using Bayesian networks to diagnose malignant lesions increased the pretest probability of a true positive from 40.03% to 90.05% and decreased the probability of a false negative to 6.44%. Therefore, our results demonstrated that Bayesian networks provide an accurate and non-invasive method to support breast cancer diagnosis.
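The two percentages quoted in this abstract can be re-derived from the counts it gives. The short sketch below does only that arithmetic; the labels "sensitivity", "false positive rate" and "specificity" are our reading of the figures, not terms taken from the abstract, and the specificity is derived here rather than reported by the authors.

```python
# Counts quoted in the abstract: positives among cancer cases and benign lesions.
cancer_positive, cancer_total = 444, 496
benign_positive, benign_total = 46, 727

# Share of cancer cases flagged positive (our label: sensitivity).
sensitivity = cancer_positive / cancer_total
# Share of benign lesions flagged positive (our label: false positive rate).
false_positive_rate = benign_positive / benign_total
# Derived here, not reported in the abstract.
specificity = 1.0 - false_positive_rate

print(f"total lesions       = {cancer_total + benign_total}")
print(f"sensitivity         = {sensitivity:.2%}")
print(f"false positive rate = {false_positive_rate:.2%}")
print(f"specificity         = {specificity:.2%}")
```

The counts sum to the 1,223 lesions stated in the abstract, which is a quick consistency check on the reported figures.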
Introduction to Bayesian statistics
Bolstad, William M
2017-01-01
There is a strong upsurge in the use of Bayesian methods in applied statistical analysis, yet most introductory statistics texts only present frequentist methods. Bayesian statistics has many important advantages that students should learn about if they are going into fields where statistics will be used. In this Third Edition, four newly-added chapters address topics that reflect the rapid advances in the field of Bayesian statistics. The author continues to provide a Bayesian treatment of introductory statistical topics, such as scientific data gathering, discrete random variables, robust Bayesian methods, and Bayesian approaches to inference for discrete random variables, binomial proportion, Poisson, normal mean, and simple linear regression. In addition, newly-developing topics in the field are presented in four new chapters: Bayesian inference with unknown mean and variance; Bayesian inference for Multivariate Normal mean vector; Bayesian inference for Multiple Linear Regression Model; and Computati...
Bayesian analysis of the dynamic structure in China's economic growth
Kyo, Koki; Noda, Hideo
2008-11-01
To analyze the dynamic structure in China's economic growth during the period 1952-1998, we introduce a model of the aggregate production function for the Chinese economy that considers total factor productivity (TFP) and output elasticities as time-varying parameters. Specifically, this paper is concerned with the relationship between the rate of economic growth in China and the trend in TFP. Here, we consider the time-varying parameters as random variables and introduce smoothness priors to construct a set of Bayesian linear models for parameter estimation. The results of the estimation are in agreement with the movements in China's social economy, thus illustrating the validity of the proposed methods.
Bayesian analysis of truncation errors in chiral effective field theory
Melendez, J.; Furnstahl, R. J.; Klco, N.; Phillips, D. R.; Wesolowski, S.
2016-09-01
In the Bayesian approach to effective field theory (EFT) expansions, truncation errors are derived from degree-of-belief (DOB) intervals for EFT predictions. By encoding expectations about the naturalness of EFT expansion coefficients for observables, this framework provides a statistical interpretation of the standard EFT procedure where truncation errors are estimated using the order-by-order convergence of the expansion. We extend and test previous calculations of DOB intervals for chiral EFT observables, examine correlations between contributions at different orders and energies, and explore methods to validate the statistical consistency of the EFT expansion parameter. Supported in part by the NSF and the DOE.
Bayesian inference for inverse problems occurring in uncertainty analysis
Fu, Shuai; Celeux, Gilles; Bousquet, Nicolas; Couplet, Mathieu
2012-01-01
The inverse problem considered here is to estimate the distribution of a non-observed random variable $X$ from some noisy observed data $Y$ linked to $X$ through a time-consuming physical model $H$. Bayesian inference is considered to take into account prior expert knowledge on $X$ in a small sample size setting. A Metropolis-Hastings within Gibbs algorithm is proposed to compute the posterior distribution of the parameters of $X$ through a data augmentation process. Since calls to $H$ are qu...
Bayesian item fit analysis for unidimensional item response theory models.
Sinharay, Sandip
2006-11-01
Assessing item fit for unidimensional item response theory models for dichotomous items has always been an issue of enormous interest, but there exists no unanimously agreed item fit diagnostic for these models, and hence there is room for further investigation of the area. This paper employs the posterior predictive model-checking method, a popular Bayesian model-checking tool, to examine item fit for the above-mentioned models. An item fit plot, comparing the observed and predicted proportion-correct scores of examinees with different raw scores, is suggested. This paper also suggests how to obtain posterior predictive p-values (which are natural Bayesian p-values) for the item fit statistics of Orlando and Thissen that summarize numerically the information in the above-mentioned item fit plots. A number of simulation studies and a real data application demonstrate the effectiveness of the suggested item fit diagnostics. The suggested techniques seem to have adequate power and reasonable Type I error rate, and psychometricians will find them promising.
Bayesian analysis of deterministic and stochastic prisoner's dilemma games
Directory of Open Access Journals (Sweden)
Howard Kunreuther
2009-08-01
This paper compares the behavior of individuals playing a classic two-person deterministic prisoner's dilemma (PD) game with choice data obtained from repeated interdependent security prisoner's dilemma games with varying probabilities of loss and the ability to learn (or not learn) about the actions of one's counterpart, an area of recent interest in experimental economics. This novel data set, from a series of controlled laboratory experiments, is analyzed using Bayesian hierarchical methods, the first application of such methods in this research domain. We find that individuals are much more likely to be cooperative when payoffs are deterministic than when the outcomes are probabilistic. A key factor explaining this difference is that subjects in a stochastic PD game respond not just to what their counterparts did but also to whether or not they suffered a loss. These findings are interpreted in the context of behavioral theories of commitment, altruism and reciprocity. The work provides a linkage between Bayesian statistics, experimental economics, and consumer psychology.
Risk Analysis of New Product Development Using Bayesian Networks
Directory of Open Access Journals (Sweden)
Mohammad Rahim Ramezanian
2012-06-01
The process of presenting new product development (NPD) to market is of great importance due to the variability of competitive rules in the business world. Product development teams face a lot of pressure due to the rapid growth of technology, increased risk-taking in world markets and increasing variation in customers' needs. However, the process of NPD is always associated with high uncertainties and complexities. To complete an NPD project successfully, existing risks should be identified and assessed. On the other hand, Bayesian networks, as a strong approach to decision-making modeling of uncertain situations, have attracted many researchers in various areas. These networks provide a decision support system for problems involving uncertainty or probabilistic reasoning. In this paper, the risk factors in product development were first identified in an electric company, and a Bayesian network was then used to model their interrelationships and evaluate the risk in the process. To determine the prior and conditional probabilities of the nodes, the viewpoints of experts in this area were applied. The risks in this process were divided into High (H), Medium (M) and Low (L) groups and analyzed with the Agena Risk software. The findings derived from the software output indicate that the production of the desired product has relatively high risk. In addition, predictive support and diagnostic support were performed on the model with two different scenarios.
A Bayesian Analysis of the Ages of Four Open Clusters
Jeffery, Elizabeth J; van Dyk, David A; Stenning, David C; Robinson, Elliot; Stein, Nathan; Jefferys, W H
2016-01-01
In this paper we apply a Bayesian technique to determine the best fit of stellar evolution models to find the main sequence turn off age and other cluster parameters of four intermediate-age open clusters: NGC 2360, NGC 2477, NGC 2660, and NGC 3960. Our algorithm utilizes a Markov chain Monte Carlo technique to fit these various parameters, objectively finding the best-fit isochrone for each cluster. The result is a high-precision isochrone fit. We compare these results with those of traditional "by-eye" isochrone fitting methods. By applying this Bayesian technique to NGC 2360, NGC 2477, NGC 2660, and NGC 3960, we determine the ages of these clusters to be 1.35 +/- 0.05, 1.02 +/- 0.02, 1.64 +/- 0.04, and 0.860 +/- 0.04 Gyr, respectively. The results of this paper continue our effort to determine cluster ages to higher precision than that offered by these traditional methods of isochrone fitting.
Energy Technology Data Exchange (ETDEWEB)
Carvalho, Pedro [Computational Modelling in Engineering and Geophysics Laboratory (LAMEMO), Department of Civil Engineering, COPPE, Federal University of Rio de Janeiro, Rio de Janeiro (Brazil); Center for Urban and Regional Systems (CESUR), CERIS, Instituto Superior Técnico, Universidade de Lisboa, Lisbon (Portugal)]; Marques, Rui Cunha [Center for Urban and Regional Systems (CESUR), CERIS, Instituto Superior Técnico, Universidade de Lisboa, Lisbon (Portugal)]
2016-02-15
This study aims to search for economies of size and scope in the Portuguese water sector, applying Bayesian and classical statistics to make inference in stochastic frontier analysis (SFA). This study shows the usefulness and advantages of applying Bayesian statistics for making inference in SFA over traditional SFA, which uses only classical statistics. The resulting Bayesian methods overcome some problems that arise in the application of traditional SFA, such as bias in small samples and skewness of residuals. In the present case study of the water sector in Portugal, these Bayesian methods provide more plausible and acceptable results. Based on the results obtained, we found that there are important economies of output density, economies of size, economies of vertical integration and economies of scope in the Portuguese water sector, pointing to the huge advantages of undertaking mergers by joining the retail and wholesale components and by joining the drinking water and wastewater services.
Xu, Chengcheng; Wang, Wei; Liu, Pan; Li, Zhibin
2015-12-01
This study aimed to develop a real-time crash risk model with limited data in China by using Bayesian meta-analysis and a Bayesian inference approach. A systematic review was first conducted by using three different Bayesian meta-analyses: the fixed effect meta-analysis, the random effect meta-analysis, and the meta-regression. The meta-analyses provided a numerical summary of the effects of traffic variables on crash risks by quantitatively synthesizing results from previous studies. The random effect meta-analysis and the meta-regression produced more conservative estimates of the effects of traffic variables than the fixed effect meta-analysis. Then, the meta-analysis results were used as informative priors for developing crash risk models with limited data. The three different meta-analyses significantly affected model fit and prediction accuracy. The model based on meta-regression can increase the prediction accuracy by about 15% as compared to the model that was directly developed with limited data. Finally, Bayesian predictive density analysis was used to identify the outliers in the limited data. It can further improve the prediction accuracy by 5.0%.
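The idea of pooling earlier results and reusing them as an informative prior can be sketched with a normal approximation. The code below is an illustration only: the function names, the inverse-variance pooling rule and all numbers are assumptions made for this sketch, and the abstract does not state which likelihood or prior family the study actually used.

```python
import math

def fixed_effect_pool(estimates, std_errors):
    """Inverse-variance (fixed effect) pooling of coefficient estimates."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * b for w, b in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

def normal_update(prior_mean, prior_sd, estimate, std_error):
    """Conjugate normal update: precision-weighted mean of prior and new estimate."""
    prior_prec = 1.0 / prior_sd ** 2
    data_prec = 1.0 / std_error ** 2
    post_prec = prior_prec + data_prec
    post_mean = (prior_prec * prior_mean + data_prec * estimate) / post_prec
    return post_mean, math.sqrt(1.0 / post_prec)

# Hypothetical coefficients for one traffic variable from three earlier studies.
prior_mean, prior_sd = fixed_effect_pool([0.30, 0.45, 0.38], [0.10, 0.15, 0.12])

# Hypothetical estimate of the same coefficient from a small local data set.
post_mean, post_sd = normal_update(prior_mean, prior_sd, estimate=0.60, std_error=0.25)

print(f"prior     : {prior_mean:.3f} (sd {prior_sd:.3f})")
print(f"posterior : {post_mean:.3f} (sd {post_sd:.3f})")
```

Because the local estimate has a large standard error, the posterior stays close to the pooled prior, which is the mechanism by which an informative prior can stabilise a model built on limited data.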
A Bayesian Surrogate Model for Rapid Time Series Analysis and Application to Exoplanet Observations
Ford, Eric B; Veras, Dimitri
2011-01-01
We present a Bayesian surrogate model for the analysis of periodic or quasi-periodic time series data. We describe a computationally efficient implementation that enables Bayesian model comparison. We apply this model to simulated and real exoplanet observations. We discuss the results and demonstrate some of the challenges for applying our surrogate model to realistic exoplanet data sets. In particular, we find that analyses of real world data should pay careful attention to the effects of uneven spacing of observations and the choice of prior for the "jitter" parameter.
A continuous-time Bayesian network reliability modeling and analysis framework
Boudali, H.; Dugan, J.B.
2006-01-01
We present a continuous-time Bayesian network (CTBN) framework for dynamic systems reliability modeling and analysis. Dynamic systems exhibit complex behaviors and interactions between their components; where not only the combination of failure events matters, but so does the sequence ordering of th
A Bayesian multidimensional scaling procedure for the spatial analysis of revealed choice data
DeSarbo, WS; Kim, Y; Fong, D
1999-01-01
We present a new Bayesian formulation of a vector multidimensional scaling procedure for the spatial analysis of binary choice data. The Gibbs sampler is gainfully employed to estimate the posterior distribution of the specified scalar products, bilinear model parameters. The computational procedure
Application of a data-mining method based on Bayesian networks to lesion-deficit analysis
Herskovits, Edward H.; Gerring, Joan P.
2003-01-01
Although lesion-deficit analysis (LDA) has provided extensive information about structure-function associations in the human brain, LDA has suffered from the difficulties inherent to the analysis of spatial data, i.e., there are many more variables than subjects, and data may be difficult to model using standard distributions, such as the normal distribution. We herein describe a Bayesian method for LDA; this method is based on data-mining techniques that employ Bayesian networks to represent structure-function associations. These methods are computationally tractable, and can represent complex, nonlinear structure-function associations. When applied to the evaluation of data obtained from a study of the psychiatric sequelae of traumatic brain injury in children, this method generates a Bayesian network that demonstrates complex, nonlinear associations among lesions in the left caudate, right globus pallidus, right side of the corpus callosum, right caudate, and left thalamus, and subsequent development of attention-deficit hyperactivity disorder, confirming and extending our previous statistical analysis of these data. Furthermore, analysis of simulated data indicates that methods based on Bayesian networks may be more sensitive and specific for detecting associations among categorical variables than methods based on chi-square and Fisher exact statistics.
Bayesian Network Meta-Analysis for Unordered Categorical Outcomes with Incomplete Data
Schmid, Christopher H.; Trikalinos, Thomas A.; Olkin, Ingram
2014-01-01
We develop a Bayesian multinomial network meta-analysis model for unordered (nominal) categorical outcomes that allows for partially observed data in which exact event counts may not be known for each category. This model properly accounts for correlations of counts in mutually exclusive categories and enables proper comparison and ranking of…
Zwick, Rebecca; Lenaburg, Lubella
2009-01-01
In certain data analyses (e.g., multiple discriminant analysis and multinomial log-linear modeling), classification decisions are made based on the estimated posterior probabilities that individuals belong to each of several distinct categories. In the Bayesian network literature, this type of classification is often accomplished by assigning…
Bayesian Factor Analysis as a Variable-Selection Problem: Alternative Priors and Consequences.
Lu, Zhao-Hua; Chow, Sy-Miin; Loken, Eric
2016-01-01
Factor analysis is a popular statistical technique for multivariate data analysis. Developments in the structural equation modeling framework have enabled the use of hybrid confirmatory/exploratory approaches in which factor-loading structures can be explored relatively flexibly within a confirmatory factor analysis (CFA) framework. Recently, Muthén & Asparouhov proposed a Bayesian structural equation modeling (BSEM) approach to explore the presence of cross loadings in CFA models. We show that the issue of determining factor-loading patterns may be formulated as a Bayesian variable selection problem in which Muthén and Asparouhov's approach can be regarded as a BSEM approach with ridge regression prior (BSEM-RP). We propose another Bayesian approach, denoted herein as the Bayesian structural equation modeling with spike-and-slab prior (BSEM-SSP), which serves as a one-stage alternative to the BSEM-RP. We review the theoretical advantages and disadvantages of both approaches and compare their empirical performance relative to two modification indices-based approaches and exploratory factor analysis with target rotation. A teacher stress scale data set is used to demonstrate our approach.
A Bayesian analysis of the unit root in real exchange rates
P.C. Schotman (Peter); H.K. van Dijk (Herman)
1991-01-01
We propose a posterior odds analysis of the hypothesis of a unit root in real exchange rates. From a Bayesian viewpoint the random walk hypothesis for real exchange rates is a posteriori as probable as a stationary AR(1) process for four out of eight time series investigated. The French
Bayesian Meta-Analysis of Cronbach's Coefficient Alpha to Evaluate Informative Hypotheses
Okada, Kensuke
2015-01-01
This paper proposes a new method to evaluate informative hypotheses for meta-analysis of Cronbach's coefficient alpha using a Bayesian approach. The coefficient alpha is one of the most widely used reliability indices. In meta-analyses of reliability, researchers typically form specific informative hypotheses beforehand, such as "alpha of…
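For readers unfamiliar with the quantity being meta-analysed, the sketch below computes the usual sample coefficient alpha, $\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_i s_i^2}{s_T^2}\right)$, where $k$ is the number of items, $s_i^2$ the variance of item $i$ and $s_T^2$ the variance of the total score. It is not the Bayesian method proposed in the paper; the function name and the data are hypothetical.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Sample coefficient alpha.

    item_scores: list of k items, each a list of scores (one per respondent).
    """
    k = len(item_scores)
    # Total score of each respondent across all items.
    totals = [sum(scores) for scores in zip(*item_scores)]
    item_var_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1.0 - item_var_sum / variance(totals))

# Hypothetical scores of four respondents on two items.
items = [
    [1, 2, 3, 4],
    [2, 1, 4, 3],
]
print(cronbach_alpha(items))
```

The function assumes at least two items and a non-zero variance of the total score; it raises an error otherwise.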
Family background variables as instruments for education in income regressions: A Bayesian analysis
L.F. Hoogerheide (Lennart); J.H. Block (Jörn); A.R. Thurik (Roy)
2012-01-01
The validity of family background variables instrumenting education in income regressions has been much criticized. In this paper, we use data from the 2004 German Socio-Economic Panel and Bayesian analysis to analyze to what degree violations of the strict validity assumption affect the
In this paper, the Genetic Algorithms (GA) and Bayesian model averaging (BMA) were combined to simultaneously conduct calibration and uncertainty analysis for the Soil and Water Assessment Tool (SWAT). In this hybrid method, several SWAT models with different structures are first selected; next GA i...
Unsupervised Transient Light Curve Analysis Via Hierarchical Bayesian Inference
Sanders, Nathan; Soderberg, Alicia
2014-01-01
Historically, light curve studies of supernovae (SNe) and other transient classes have focused on individual objects with copious and high signal-to-noise observations. In the nascent era of wide field transient searches, objects with detailed observations are decreasing as a fraction of the overall known SN population, and this strategy sacrifices the majority of the information contained in the data about the underlying population of transients. A population level modeling approach, simultaneously fitting all available observations of objects in a transient sub-class of interest, fully mines the data to infer the properties of the population and avoids certain systematic biases. We present a novel hierarchical Bayesian statistical model for population level modeling of transient light curves, and discuss its implementation using an efficient Hamiltonian Monte Carlo technique. As a test case, we apply this model to the Type IIP SN sample from the Pan-STARRS1 Medium Deep Survey, consisting of 18,837 photometr...
Bayesian Analysis of Multiple Populations I: Statistical and Computational Methods
Stenning, D C; Robinson, E; van Dyk, D A; von Hippel, T; Sarajedini, A; Stein, N
2016-01-01
We develop a Bayesian model for globular clusters composed of multiple stellar populations, extending earlier statistical models for open clusters composed of simple (single) stellar populations (van Dyk et al. 2009, Stein et al. 2013). Specifically, we model globular clusters with two populations that differ in helium abundance. Our model assumes a hierarchical structuring of the parameters in which physical properties---age, metallicity, helium abundance, distance, absorption, and initial mass---are common to (i) the cluster as a whole or to (ii) individual populations within a cluster, or are unique to (iii) individual stars. An adaptive Markov chain Monte Carlo (MCMC) algorithm is devised for model fitting that greatly improves convergence relative to its precursor non-adaptive MCMC algorithm. Our model and computational tools are incorporated into an open-source software suite known as BASE-9. We use numerical studies to demonstrate that our method can recover parameters of two-population clusters, and al...
A Software Risk Analysis Model Using Bayesian Belief Network
Institute of Scientific and Technical Information of China (English)
Yong Hu; Juhua Chen; Mei Liu; Yang Yun; Junbiao Tang
2006-01-01
The uncertainty during the period of software project development often brings huge risks to contractors and clients. If we can find an effective method to predict the cost and quality of software projects at the beginning of the project, based on facts like the project character and the cooperating capability of the two sides, we can reduce the risk. The Bayesian Belief Network (BBN) is a good tool for analyzing uncertain consequences, but it is difficult to produce a precise network structure and conditional probability table. In this paper, we built the network structure using the Delphi method for conditional probability table learning, and we continuously learn and update the probability table and the nodes' confidence levels according to application cases, which gives the evaluation network learning abilities and lets it evaluate the software development risk of an organization more accurately. This paper also introduces the EM algorithm, which will enhance the ability to produce hidden nodes caused by variant software projects.
Figueira, P.; Faria, J. P.; Adibekyan, V. Zh.; Oshagh, M.; Santos, N. C.
2016-05-01
We apply the Bayesian framework to assess the presence of a correlation between two quantities. To do so, we estimate the probability distribution of the parameter of interest, ρ, characterizing the strength of the correlation. We provide an implementation of these ideas and concepts using the Python programming language and the pyMC module in a very short (~130 lines of code, heavily commented) and user-friendly program. We used this tool to assess the presence and properties of the correlation between planetary surface gravity and stellar activity level as measured by the $\log(R'_{\mathrm{HK}})$ indicator. The results of the Bayesian analysis are qualitatively similar to those obtained via p-value analysis, and support the presence of a correlation in the data. The results are more robust in their derivation and more informative, revealing interesting features such as asymmetric posterior distributions or markedly different credible intervals, and allowing for a deeper exploration. We encourage the reader interested in this kind of problem to apply our code to his/her own scientific problems. The full understanding of what the Bayesian framework is can only be gained through the insight that comes by handling priors, assessing the convergence of Monte Carlo runs, and a multitude of other practical problems. We hope to contribute so that Bayesian analysis becomes a tool in the toolkit of researchers, and they understand by experience its advantages and limitations.
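A posterior distribution for ρ can be illustrated without MCMC. The sketch below is not the authors' pyMC program: it evaluates the posterior on a grid and assumes a bivariate normal likelihood, a uniform prior on ρ, and data standardised with their sample means and standard deviations (a plug-in simplification). The function names and the synthetic data are hypothetical.

```python
import numpy as np

def correlation_posterior(x, y, grid_size=999):
    """Grid posterior of rho under a uniform prior (illustrative assumptions)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Plug-in standardisation: treat means and variances as known.
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()
    n = x.size
    rho = np.linspace(-0.999, 0.999, grid_size)
    sxx, syy, sxy = np.sum(x * x), np.sum(y * y), np.sum(x * y)
    # Bivariate normal log-likelihood, up to a constant that does not depend on rho.
    log_like = (-0.5 * n * np.log1p(-rho ** 2)
                - (sxx - 2.0 * rho * sxy + syy) / (2.0 * (1.0 - rho ** 2)))
    weights = np.exp(log_like - log_like.max())
    weights /= weights.sum()
    return rho, weights

def credible_interval(rho, weights, mass=0.95):
    """Equal-tailed credible interval read off the discrete posterior."""
    cdf = np.cumsum(weights)
    lower = rho[np.searchsorted(cdf, (1.0 - mass) / 2.0)]
    upper = rho[np.searchsorted(cdf, 1.0 - (1.0 - mass) / 2.0)]
    return lower, upper

# Synthetic data with a true correlation of 0.8 (hypothetical example).
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 0.8 * x + 0.6 * rng.normal(size=200)

rho, weights = correlation_posterior(x, y)
posterior_mean = float(np.sum(rho * weights))
print("posterior mean of rho:", posterior_mean)
print("95% credible interval:", credible_interval(rho, weights))
```

A grid is practical here only because there is a single parameter; sampling methods such as the MCMC approach mentioned in the abstract become necessary once means, variances or hierarchical structure are inferred jointly.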
Schmidt, Paul; Schmid, Volker J; Gaser, Christian; Buck, Dorothea; Bührlen, Susanne; Förschler, Annette; Mühlau, Mark
2013-01-01
Aiming at iron-related T2-hypointensity, which is related to normal aging and neurodegenerative processes, we here present two practicable approaches, based on Bayesian inference, for preprocessing and statistical analysis of a complex set of structural MRI data. In particular, Markov Chain Monte Carlo methods were used to simulate posterior distributions. First, we rendered a segmentation algorithm that uses outlier detection based on model checking techniques within a Bayesian mixture model. Second, we rendered an analytical tool comprising a Bayesian regression model with smoothness priors (in the form of Gaussian Markov random fields) mitigating the necessity to smooth data prior to statistical analysis. For validation, we used simulated data and MRI data of 27 healthy controls (age: [Formula: see text]; range, [Formula: see text]). We first observed robust segmentation of both simulated T2-hypointensities and gray-matter regions known to be T2-hypointense. Second, simulated data and images of segmented T2-hypointensity were analyzed. We found not only robust identification of simulated effects but also a biologically plausible age-related increase of T2-hypointensity primarily within the dentate nucleus but also within the globus pallidus, substantia nigra, and red nucleus. Our results indicate that fully Bayesian inference can successfully be applied for preprocessing and statistical analysis of structural MRI data.
Figueira, P.; Faria, J. P.; Adibekyan, V. Zh.; Oshagh, M.; Santos, N. C.
2016-11-01
We apply the Bayesian framework to assess the presence of a correlation between two quantities. To do so, we estimate the probability distribution of the parameter of interest, ρ, characterizing the strength of the correlation. We provide an implementation of these ideas and concepts using the Python programming language and the pyMC module in a very short (∼130 lines of code, heavily commented) and user-friendly program. We used this tool to assess the presence and properties of the correlation between planetary surface gravity and stellar activity level as measured by the log($R'_{\mathrm{HK}}$) indicator. The results of the Bayesian analysis are qualitatively similar to those obtained via p-value analysis, and support the presence of a correlation in the data. The results are more robust in their derivation and more informative, revealing interesting features such as asymmetric posterior distributions or markedly different credible intervals, and allowing for a deeper exploration. We encourage the reader interested in this kind of problem to apply our code to his/her own scientific problems. The full understanding of what the Bayesian framework is can only be gained through the insight that comes by handling priors, assessing the convergence of Monte Carlo runs, and a multitude of other practical problems. We hope to contribute so that Bayesian analysis becomes a tool in the toolkit of researchers, and they understand by experience its advantages and limitations.
Kwon, Deukwoo; Hoffman, F Owen; Moroz, Brian E; Simon, Steven L
2016-02-10
Most conventional risk analysis methods rely on a single best estimate of exposure per person, which does not allow for adjustment for exposure-related uncertainty. Here, we propose a Bayesian model averaging method to properly quantify the relationship between radiation dose and disease outcomes by accounting for shared and unshared uncertainty in estimated dose. Our Bayesian risk analysis method utilizes multiple realizations of sets (vectors) of doses generated by a two-dimensional Monte Carlo simulation method that properly separates shared and unshared errors in dose estimation. The exposure model used in this work is taken from a study of the risk of thyroid nodules among a cohort of 2376 subjects who were exposed to fallout from nuclear testing in Kazakhstan. We assessed the performance of our method through an extensive series of simulations and comparisons against conventional regression risk analysis methods. When the estimated doses contain relatively small amounts of uncertainty, the Bayesian method using multiple a priori plausible draws of dose vectors gave similar results to the conventional regression-based methods of dose-response analysis. However, when large and complex mixtures of shared and unshared uncertainties are present, the Bayesian method using multiple dose vectors had significantly lower relative bias than conventional regression-based risk analysis methods and better coverage, that is, a markedly increased capability to include the true risk coefficient within the 95% credible interval of the Bayesian-based risk estimate. An evaluation of the dose-response using our method is presented for an epidemiological study of thyroid disease following radiation exposure.
Iskandar, Ismed; Satria Gondokaryono, Yudi
2016-02-01
In reliability theory, the most important problem is to determine the reliability of a complex system from the reliability of its components. The weakness of most reliability theories is that the systems are described and explained as simply functioning or failed. In many real situations, failures may arise from many causes, depending on the age and environment of the system and its components. Another problem in reliability theory is that of estimating the parameters of the assumed failure models. The estimation may be based on data collected from censored or uncensored life tests. In many reliability problems, the failure data are simply quantitatively inadequate, especially in engineering design and maintenance systems. Bayesian analyses are more beneficial than classical ones in such cases. Bayesian estimation allows us to combine past knowledge or experience, in the form of an a priori distribution, with life test data to make inferences about the parameter of interest. In this paper, we investigate the application of Bayesian estimation to competing risk systems. The cases are limited to models with independent causes of failure, using the Weibull distribution as our model. A simulation is conducted for this distribution with the objectives of verifying the models and the estimators and investigating the performance of the estimators for varying sample sizes. The simulation data are analyzed using Bayesian and maximum likelihood analyses. The simulation results show that a change in the true value of one parameter relative to another changes the value of the standard deviation in the opposite direction. Given perfect information on the prior distribution, the Bayesian estimation methods are better than maximum likelihood. The sensitivity analyses show some sensitivity to shifts in the prior locations. They also show the robustness of the Bayesian analysis within the range
Institute of Scientific and Technical Information of China (English)
刘晓星; 方琳; 张颖; 唐攀
2014-01-01
Liquidity is the lifeblood of the modern financial system. Analyses from a liquidity perspective show that a serious imbalance of funding, asset, and monetary liquidity was the main driver of the rapid spread of the European and American sovereign debt crisis. In addition to the traditional price and volume indicators, the paper introduces liquidity impact indicators and constructs a liquidity measurement system for stock markets. Building on the Spearman correlation coefficient method, the paper constructs a new change-point detection procedure using a bivariate copula and the probability integral transformation, and effectively detects change-points in liquidity shocks between the stock markets of the European and American debt-crisis countries, core developed countries such as the United Kingdom and Japan, and emerging countries such as China and India.
Case-control studies of gene-environment interaction: Bayesian design and analysis.
Mukherjee, Bhramar; Ahn, Jaeil; Gruber, Stephen B; Ghosh, Malay; Chatterjee, Nilanjan
2010-09-01
With increasing frequency, epidemiologic studies are addressing hypotheses regarding gene-environment interaction. In many well-studied candidate genes and for standard dietary and behavioral epidemiologic exposures, there is often substantial prior information available that may be used to analyze current data as well as for designing a new study. In this article, first, we propose a proper full Bayesian approach for analyzing studies of gene-environment interaction. The Bayesian approach provides a natural way to incorporate uncertainties around the assumption of gene-environment independence, often used in such an analysis. We then consider Bayesian sample size determination criteria for both estimation and hypothesis testing regarding the multiplicative gene-environment interaction parameter. We illustrate our proposed methods using data from a large ongoing case-control study of colorectal cancer investigating the interaction of N-acetyl transferase type 2 (NAT2) with smoking and red meat consumption. We use the existing data to elicit a design prior and show how to use this information in allocating cases and controls in planning a future study that investigates the same interaction parameters. The Bayesian design and analysis strategies are compared with their corresponding frequentist counterparts.
Villalba, Jesús
2015-01-01
In this document we are going to derive the equations needed to implement a Variational Bayes estimation of the parameters of the simplified probabilistic linear discriminant analysis (SPLDA) model. This can be used to adapt SPLDA from one database to another with few development data or to implement the fully Bayesian recipe. Our approach is similar to Bishop's VB PPCA.
Bayesian inversion analysis of nonlinear dynamics in surface heterogeneous reactions.
Omori, Toshiaki; Kuwatani, Tatsu; Okamoto, Atsushi; Hukushima, Koji
2016-09-01
It is essential to extract nonlinear dynamics from time-series data as an inverse problem in natural sciences. We propose a Bayesian statistical framework for extracting nonlinear dynamics of surface heterogeneous reactions from sparse and noisy observable data. Surface heterogeneous reactions are chemical reactions with conjugation of multiple phases, and their dynamics are intrinsically nonlinear because of the effect of the surface area between different phases. We adapt a belief propagation method and an expectation-maximization (EM) algorithm to the partial observation problem, in order to simultaneously estimate the time course of hidden variables and the kinetic parameters underlying the dynamics. The proposed belief propagation method is performed using a sequential Monte Carlo algorithm in order to estimate the nonlinear dynamical system. Using our proposed method, we show that the rate constants of dissolution and precipitation reactions, which are typical examples of surface heterogeneous reactions, as well as the temporal changes of solid reactants and products, were successfully estimated only from the observable temporal changes in the concentration of the dissolved intermediate product.
On the On-Off Problem: An Objective Bayesian Analysis
Ahnen, Max Ludwig
2015-01-01
The On-Off problem, a.k.a. the Li-Ma problem, is a statistical problem where a measured rate is the sum of two parts. The first is due to a signal and the second due to a background, both of which are unknown. Mostly frequentist solutions are being used that are only adequate for high count numbers. When the events are rare such an approximation is not good enough. Indeed, in high-energy astrophysics this is often the rule rather than the exception. I will present a universal objective Bayesian solution that depends only on the initial three parameters of the On-Off problem: the number of events in the "on" region, the number of events in the "off" region, and their ratio-of-exposure. With a two-step approach it is possible to infer the signal's significance, strength, uncertainty or upper limit in a unified way. The approach is valid without restrictions for any count number including zero and may be widely applied in particle physics, cosmic-ray physics and high-energy astrophysics. I apply the method to Gamma ...
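For orientation, the frequentist baseline that the abstract contrasts against can be written in a few lines. The sketch below implements the widely used Li and Ma (1983) likelihood-ratio significance from the same three inputs (on counts, off counts, exposure ratio). It is not the objective Bayesian solution described above. The function name `li_ma_significance` and the choice to reject zero counts are assumptions made for illustration.

```python
import math


def li_ma_significance(n_on, n_off, alpha):
    """Li & Ma (1983) likelihood-ratio significance for an On-Off measurement.

    n_on  : counts in the "on" region
    n_off : counts in the "off" region
    alpha : ratio of on-exposure to off-exposure

    The formula is asymptotic, so this sketch refuses zero counts instead of
    returning a misleading number (a design choice, not part of the formula).
    """
    if n_on <= 0 or n_off <= 0 or alpha <= 0:
        raise ValueError("asymptotic formula needs positive counts and alpha")
    total = n_on + n_off
    term_on = n_on * math.log((1 + alpha) / alpha * n_on / total)
    term_off = n_off * math.log((1 + alpha) * n_off / total)
    # The sum is non-negative in exact arithmetic; max() guards rounding.
    s = math.sqrt(max(2.0 * (term_on + term_off), 0.0))
    # Report deficits (fewer on-counts than expected background) as negative.
    return s if n_on >= alpha * n_off else -s
```

With 100 on-counts, 100 off-counts and an exposure ratio of 0.5, the expected background is 50 and the function returns roughly 4.85. For a handful of counts, or none, the approximation behind this number breaks down, which is the regime the Bayesian treatment targets.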
Heterogeneous multimodal biomarkers analysis for Alzheimer's disease via Bayesian network.
Jin, Yan; Su, Yi; Zhou, Xiao-Hua; Huang, Shuai
2016-12-01
By 2050, it is estimated that the number of worldwide Alzheimer's disease (AD) patients will quadruple from the current number of 36 million, while no proven disease-modifying treatments are available. At present, the underlying disease mechanisms remain under investigation, and recent studies suggest that the disease involves multiple etiological pathways. To better understand the disease and develop treatment strategies, a number of ongoing studies including the Alzheimer's Disease Neuroimaging Initiative (ADNI) enroll many study participants and acquire a large number of biomarkers from various modalities including demographic, genotyping, fluid biomarkers, neuroimaging, neuropsychometric test, and clinical assessments. However, a systematic approach that can integrate all the collected data is lacking. The overarching goal of our study is to use machine learning techniques to understand the relationships among different biomarkers and to establish a system-level model that can better describe the interactions among biomarkers and provide superior diagnostic and prognostic information. In this pilot study, we use Bayesian network (BN) to analyze multimodal data from ADNI, including demographics, volumetric MRI, PET, genotypes, and neuropsychometric measurements and demonstrate our approach to have superior prediction accuracy.
Directory of Open Access Journals (Sweden)
Yiannoutsos Constantin T
2009-06-01
Background: Mortality of HIV-infected patients initiating antiretroviral therapy in the developing world is very high immediately after the start of ART therapy and drops sharply thereafter. It is necessary to use models of survival time that reflect this change. Methods: In this endeavor, parametric models with changepoints such as Weibull models can be useful in order to explicitly model the underlying failure process, even in the case where abrupt changes in the mortality rate are present. Estimation of the temporal location of possible mortality changepoints has important implications on the effective management of these patients. We briefly describe these models and apply them to the case of estimating survival among HIV-infected patients who are initiating antiretroviral therapy in a care and treatment programme in sub-Saharan Africa. Results: As a first reported data-driven estimate of the existence and location of early mortality changepoints after antiretroviral therapy initiation, we show that there is an early change in risk of death at three months, followed by an intermediate risk period lasting up to 10 months after therapy. Conclusion: By explicitly modelling the underlying abrupt changes in mortality risk after initiation of antiretroviral therapy we are able to estimate their number and location in a rigorous, data-driven manner. The existence of a high early risk of death after initiation of antiretroviral therapy and the determination of its duration has direct implications for the optimal management of patients initiating therapy in this setting.
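A single mortality changepoint can be located with a profile likelihood. The sketch below is a deliberate simplification of the approach described above: it assumes a piecewise-constant hazard (one rate before the changepoint, another after) rather than the Weibull changepoint models of the study, and it handles right-censoring through exposure time. The function name `profile_changepoint` and the toy cohort are hypothetical and carry no real data.

```python
import math


def profile_changepoint(times, events, candidates):
    """Pick the changepoint tau that maximises a piecewise-exponential likelihood.

    times      : follow-up time per subject (e.g. months)
    events     : 1 if the subject died at `times[i]`, 0 if censored
    candidates : changepoint locations to try
    Returns (tau, early_hazard, late_hazard, log_likelihood).
    """
    best = None
    for tau in candidates:
        d1 = sum(1 for t, e in zip(times, events) if e and t < tau)
        d2 = sum(1 for t, e in zip(times, events) if e and t >= tau)
        e1 = sum(min(t, tau) for t in times)          # exposure before tau
        e2 = sum(max(t - tau, 0.0) for t in times)    # exposure after tau
        if d1 == 0 or d2 == 0 or e2 == 0:
            continue  # hazards are not estimable on an empty segment
        h1, h2 = d1 / e1, d2 / e2                     # per-segment MLEs
        log_lik = d1 * math.log(h1) - h1 * e1 + d2 * math.log(h2) - h2 * e2
        if best is None or log_lik > best[3]:
            best = (tau, h1, h2, log_lik)
    return best


# Toy cohort (invented): 30 early deaths within 3 months, 10 later deaths,
# and 60 subjects censored at 24 months.
early = [(2 * k - 1) / 20 for k in range(1, 31)]
late = [float(t) for t in range(5, 24, 2)]
times = early + late + [24.0] * 60
events = [1] * 40 + [0] * 60
tau, h_early, h_late, _ = profile_changepoint(
    times, events, [float(m) for m in range(1, 13)]
)
```

On this toy cohort the search settles on a changepoint at 3 months with an early hazard far above the late one. Estimating several changepoints, as the study does, would need a search over pairs or triples of candidates and a criterion for choosing how many to keep.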
Individual organisms as units of analysis: Bayesian-clustering alternatives in population genetics.
Mank, Judith E; Avise, John C
2004-12-01
Population genetic analyses traditionally focus on the frequencies of alleles or genotypes in 'populations' that are delimited a priori. However, there are potential drawbacks of amalgamating genetic data into such composite attributes of assemblages of specimens: genetic information on individual specimens is lost or submerged as an inherent part of the analysis. A potential also exists for circular reasoning when a population's initial identification and subsequent genetic characterization are coupled. In principle, these problems are circumvented by some newer methods of population identification and individual assignment based on statistical clustering of specimen genotypes. Here we evaluate a recent method in this genre, Bayesian clustering, using four genotypic data sets involving different types of molecular markers in non-model organisms from nature. As expected, measures of population genetic structure ($F_{ST}$ and $\Phi_{ST}$) tended to be significantly greater in Bayesian a posteriori data treatments than in analyses where populations were delimited a priori. In the four biological contexts examined, which involved both geographic population structures and hybrid zones, Bayesian clustering was able to recover differentiated populations, and Bayesian assignments were able to identify likely population sources of specific individuals.
A Bayesian analysis of kaon photoproduction with the Regge-plus-resonance model
De Cruz, Lesley; Vrancx, Tom; Vancraeyveld, Pieter
2012-01-01
We address the issue of unbiased model selection and propose a methodology based on Bayesian inference to extract physical information from kaon photoproduction $p(\gamma,K^+)\Lambda$ data. We use the single-channel Regge-plus-resonance (RPR) framework for $p(\gamma,K^+)\Lambda$ to illustrate the proposed strategy. The Bayesian evidence Z is a quantitative measure for the model's fitness given the world's data. We present a numerical method for performing the multidimensional integrals in the expression for the Bayesian evidence. We use the $p(\gamma,K^+)\Lambda$ data with an invariant energy W > 2.6 GeV in order to constrain the background contributions in the RPR framework with Bayesian inference. Next, the resonance information is extracted from the analysis of differential cross sections, single and double polarization observables. This background and resonance content constitutes the basis of a model which is coined RPR-2011. It is shown that RPR-2011 yields a comprehensive account of the kaon photoprodu...
UNSUPERVISED TRANSIENT LIGHT CURVE ANALYSIS VIA HIERARCHICAL BAYESIAN INFERENCE
Energy Technology Data Exchange (ETDEWEB)
Sanders, N. E.; Soderberg, A. M. [Harvard-Smithsonian Center for Astrophysics, 60 Garden Street, Cambridge, MA 02138 (United States); Betancourt, M., E-mail: nsanders@cfa.harvard.edu [Department of Statistics, University of Warwick, Coventry CV4 7AL (United Kingdom)
2015-02-10
Historically, light curve studies of supernovae (SNe) and other transient classes have focused on individual objects with copious and high signal-to-noise observations. In the nascent era of wide field transient searches, objects with detailed observations are decreasing as a fraction of the overall known SN population, and this strategy sacrifices the majority of the information contained in the data about the underlying population of transients. A population level modeling approach, simultaneously fitting all available observations of objects in a transient sub-class of interest, fully mines the data to infer the properties of the population and avoids certain systematic biases. We present a novel hierarchical Bayesian statistical model for population level modeling of transient light curves, and discuss its implementation using an efficient Hamiltonian Monte Carlo technique. As a test case, we apply this model to the Type IIP SN sample from the Pan-STARRS1 Medium Deep Survey, consisting of 18,837 photometric observations of 76 SNe, corresponding to a joint posterior distribution with 9176 parameters under our model. Our hierarchical model fits provide improved constraints on light curve parameters relevant to the physical properties of their progenitor stars relative to modeling individual light curves alone. Moreover, we directly evaluate the probability for occurrence rates of unseen light curve characteristics from the model hyperparameters, addressing observational biases in survey methodology. We view this modeling framework as an unsupervised machine learning technique with the ability to maximize scientific returns from data to be collected by future wide field transient searches like LSST.
A problem in particle physics and its Bayesian analysis
Landon, Joshua
An up and coming field in contemporary nuclear and particle physics is "Lattice Quantum Chromodynamics", henceforth Lattice QCD. Indeed the 2004 Nobel Prize in Physics went to the developers of equations that describe QCD. In this dissertation, following a layperson's introduction to the structure of matter, we outline the statistical aspects of a problem in Lattice QCD faced by particle physicists, and point out the difficulties encountered by them in trying to address the problem. The difficulties stem from the fact that one is required to estimate a large -- conceptually infinite -- number of parameters based on a finite number of non-linear equations, each of which is a sum of exponential functions. We then present a plausible approach for solving the problem. Our approach is Bayesian and is driven by a computationally intensive Markov Chain Monte Carlo based solution. However, in order to invoke our approach we first look at the underlying anatomy of the problem and synthesize its essentials. These essentials reveal a pattern that can be harnessed via some assumptions, and this in turn enables us to outline a pathway towards a solution. We demonstrate the viability of our approach via simulated data, followed by its validation against real data provided to us by our physicist colleagues. Our approach yields results that in the past were not obtainable via alternate approaches. The contribution of this dissertation is two-fold. The first is a use of computationally intensive statistical technology to produce results in physics that could not be obtained using physics based techniques. Since the statistical architecture of the problem considered here can arise in other contexts as well, the second contribution of this dissertation is to indicate a plausible approach for addressing a generic class of problems wherein the number of parameters to be estimated exceeds the number of constraints, each constraint being a non-linear equation that is the sum of
Bayesian methods for model uncertainty analysis with application to future sea level rise
Energy Technology Data Exchange (ETDEWEB)
Patwardhan, A.; Small, M.J. (Carnegie Mellon Univ., Pittsburgh, PA (United States))
1992-12-01
This paper addresses the use of data for identifying and characterizing uncertainties in model parameters and predictions. The Bayesian Monte Carlo method is formally presented and elaborated, and applied to the analysis of the uncertainty in a predictive model for global mean sea level change. The method uses observations of output variables, made with an assumed error structure, to determine a posterior distribution of model outputs. This is used to derive a posterior distribution for the model parameters. Results demonstrate the resolution of the uncertainty that is obtained as a result of the Bayesian analysis and also indicate the key contributors to the uncertainty in the sea level rise model. While the technique is illustrated with a simple, preliminary model, the analysis provides an iterative framework for model refinement. The methodology developed in this paper provides a mechanism for the incorporation of ongoing data collection and research in decision-making for problems involving uncertain environmental change.
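The weighting step at the heart of a Bayesian Monte Carlo analysis can be sketched briefly. This is one reading of the method and not the paper's implementation. The assumptions are: parameters are drawn from the prior, the model is run for each draw, and each draw is weighted by a Gaussian likelihood of the observations with a known error `sigma`. The name `bayesian_monte_carlo`, the linear toy model, and the observation values are all invented for illustration.

```python
import math
import random


def bayesian_monte_carlo(model, prior_sampler, observations, sigma,
                         n_samples=2000, seed=0):
    """Weight prior draws by the likelihood of the observed outputs.

    model         : callable (param, t) -> predicted output at time t
    prior_sampler : callable (rng) -> one parameter draw from the prior
    observations  : list of (t, observed_value) pairs
    sigma         : assumed standard deviation of the observation error
    Returns (params, weights) with weights normalised to sum to 1.
    """
    rng = random.Random(seed)
    params = [prior_sampler(rng) for _ in range(n_samples)]
    log_w = []
    for p in params:
        resid2 = sum((model(p, t) - y) ** 2 for t, y in observations)
        log_w.append(-0.5 * resid2 / sigma ** 2)
    peak = max(log_w)  # subtract the peak so the best draw has weight exp(0)
    w = [math.exp(lw - peak) for lw in log_w]
    total = sum(w)
    return params, [x / total for x in w]


def linear_rise(rate, years):
    """Toy stand-in for a sea level model: rise grows linearly with time."""
    return rate * years


obs = [(10.0, 20.0), (20.0, 40.0), (30.0, 60.0)]  # invented observations
params, weights = bayesian_monte_carlo(
    linear_rise, lambda rng: rng.uniform(0.0, 5.0), obs, sigma=1.0
)
posterior_mean = sum(p * w for p, w in zip(params, weights))
```

The weighted draws approximate the posterior for the parameter, and pushing them back through the model gives the posterior for any predicted output. New observations can be folded in by multiplying the weights again, which matches the iterative use described in the abstract.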
Application of Bayesian graphs to SN Ia data analysis and compression
Ma, Cong; Corasaniti, Pier-Stefano; Bassett, Bruce A.
2016-12-01
Bayesian graphical models are an efficient tool for modelling complex data and deriving self-consistent expressions of the posterior distribution of model parameters. We apply Bayesian graphs to perform statistical analyses of Type Ia supernova (SN Ia) luminosity distance measurements from the joint light-curve analysis (JLA) data set. In contrast to the χ² approach used in previous studies, the Bayesian inference allows us to fully account for the standard-candle parameter dependence of the data covariance matrix. Comparing with χ² analysis results, we find a systematic offset of the marginal model parameter bounds. We demonstrate that the bias is statistically significant in the case of the SN Ia standardization parameters with a maximal 6σ shift of the SN light-curve colour correction. In addition, we find that the evidence for a host galaxy correction is now only 2.4σ. Systematic offsets on the cosmological parameters remain small, but may increase by combining constraints from complementary cosmological probes. The bias of the χ² analysis is due to neglecting the parameter-dependent log-determinant of the data covariance, which gives more statistical weight to larger values of the standardization parameters. We find a similar effect on compressed distance modulus data. To this end, we implement a fully consistent compression method of the JLA data set that uses a Gaussian approximation of the posterior distribution for fast generation of compressed data. Overall, the results of our analysis emphasize the need for a fully consistent Bayesian statistical approach in the analysis of future large SN Ia data sets.
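The role of the log-determinant can be made explicit with the standard Gaussian likelihood. The notation here is assumed for illustration and is not taken from the paper: θ collects the cosmological and standardization parameters, Δμ(θ) is the vector of distance-modulus residuals, and C(θ) is the data covariance.

```latex
\[
  -2 \ln \mathcal{L}(\theta)
    = \underbrace{\Delta\mu(\theta)^{\mathsf{T}}\, C(\theta)^{-1}\, \Delta\mu(\theta)}_{\chi^2(\theta)}
    + \ln \det C(\theta) + \text{const}.
\]
```

Minimizing only the χ²(θ) term is equivalent to the full likelihood when C does not depend on θ. When the covariance grows with the standardization parameters, dropping ln det C(θ) removes the penalty for inflating C, so larger parameter values are favoured, consistent with the bias reported above.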
Application of Bayesian graphs to SN Ia data analysis and compression
Ma, Cong; Corasaniti, Pier-Stefano; Bassett, Bruce A.
2016-08-01
Bayesian graphical models are an efficient tool for modelling complex data and deriving self-consistent expressions of the posterior distribution of model parameters. We apply Bayesian graphs to perform statistical analyses of Type Ia supernova (SN Ia) luminosity distance measurements from the Joint Light-curve Analysis (JLA) dataset (Betoule et al. 2014). In contrast to the χ² approach used in previous studies, the Bayesian inference allows us to fully account for the standard-candle parameter dependence of the data covariance matrix. Comparing with χ² analysis results we find a systematic offset of the marginal model parameter bounds. We demonstrate that the bias is statistically significant in the case of the SN Ia standardization parameters with a maximal 6σ shift of the SN light-curve colour correction. In addition, we find that the evidence for a host galaxy correction is now only 2.4σ. Systematic offsets on the cosmological parameters remain small, but may increase by combining constraints from complementary cosmological probes. The bias of the χ² analysis is due to neglecting the parameter-dependent log-determinant of the data covariance, which gives more statistical weight to larger values of the standardization parameters. We find a similar effect on compressed distance modulus data. To this end we implement a fully consistent compression method of the JLA dataset that uses a Gaussian approximation of the posterior distribution for fast generation of compressed data. Overall, the results of our analysis emphasize the need for a fully consistent Bayesian statistical approach in the analysis of future large SN Ia datasets.
An integrated Bayesian analysis of LOH and copy number data
Directory of Open Access Journals (Sweden)
Hutter Marcus
2010-06-01
Background: Cancer and other disorders are due to genomic lesions. SNP-microarrays are able to measure simultaneously both genotype and copy number (CN) at several Single Nucleotide Polymorphisms (SNPs) along the genome. CN is defined as the number of DNA copies, and the normal is two, since we have two copies of each chromosome. The genotype of a SNP is the status given by the nucleotides (alleles) which are present on the two copies of DNA. It is defined homozygous or heterozygous if the two alleles are the same or if they differ, respectively. Loss of heterozygosity (LOH) is the loss of the heterozygous status due to genomic events. Combining CN and LOH data, it is possible to better identify different types of genomic aberrations. For example, a long sequence of homozygous SNPs might be caused by either the physical loss of one copy or a uniparental disomy event (UPD), i.e. each SNP has two identical nucleotides both derived from only one parent. In this situation, the knowledge of the CN can help in distinguishing between these two events. Results: To better identify genomic aberrations, we propose a method (called gBPCR) which infers the type of aberration that occurred, taking into account all the possible influences on the microarray detection of the homozygosity status of the SNPs resulting from an altered CN level. Namely, we model the distributions of the detected genotype, given a specific genomic alteration, and we estimate the parameters involved on public reference datasets. The estimation is performed similarly to the modified Bayesian Piecewise Constant Regression, but with improved estimators for the detection of the breakpoints. Using artificial and real data, we evaluate the quality of the estimation of gBPCR and we also show that it outperforms other well-known methods for LOH estimation. Conclusions: We propose a method (gBPCR) for the estimation of both LOH and CN aberrations, improving their estimation by integrating both types
Rodríguez-Ramilo, Silvia T; Wang, Jinliang
2012-09-01
The inference of population genetic structures is essential in many research areas in population genetics, conservation biology and evolutionary biology. Recently, unsupervised Bayesian clustering algorithms have been developed to detect a hidden population structure from genotypic data, assuming among others that individuals taken from the population are unrelated. Under this assumption, markers in a sample taken from a subpopulation can be considered to be in Hardy-Weinberg and linkage equilibrium. However, close relatives might be sampled from the same subpopulation, and consequently, might cause Hardy-Weinberg and linkage disequilibrium and thus bias a population genetic structure analysis. In this study, we used simulated and real data to investigate the impact of close relatives in a sample on Bayesian population structure analysis. We also showed that, when close relatives were identified by a pedigree reconstruction approach and removed, the accuracy of a population genetic structure analysis can be greatly improved. The results indicate that unsupervised Bayesian clustering algorithms cannot be used blindly to detect genetic structure in a sample with closely related individuals. Rather, when closely related individuals are suspected to be frequent in a sample, these individuals should be first identified and removed before conducting a population structure analysis.
Empirical Markov Chain Monte Carlo Bayesian analysis of fMRI data.
de Pasquale, F; Del Gratta, C; Romani, G L
2008-08-01
In this work an Empirical Markov Chain Monte Carlo Bayesian approach to analyse fMRI data is proposed. The Bayesian framework is appealing since complex models can be adopted in the analysis both for the image and noise model. Here, the noise autocorrelation is taken into account by adopting an AutoRegressive model of order one and a versatile non-linear model is assumed for the task-related activation. Model parameters include the noise variance and autocorrelation, activation amplitudes and the hemodynamic response function parameters. These are estimated at each voxel from samples of the Posterior Distribution. Prior information is included by means of a 4D spatio-temporal model for the interaction between neighbouring voxels in space and time. The results show that this model can provide smooth estimates from low SNR data while important spatial structures in the data can be preserved. A simulation study is presented in which the accuracy and bias of the estimates are addressed. Furthermore, some results on convergence diagnostic of the adopted algorithm are presented. To validate the proposed approach a comparison of the results with those from a standard GLM analysis, spatial filtering techniques and a Variational Bayes approach is provided. This comparison shows that our approach outperforms the classical analysis and is consistent with other Bayesian techniques. This is investigated further by means of the Bayes Factors and the analysis of the residuals. The proposed approach applied to Blocked Design and Event Related datasets produced reliable maps of activation.
Figueira, P; Adibekyan, V Zh; Oshagh, M; Santos, N C
2016-01-01
We apply the Bayesian framework to assess the presence of a correlation between two quantities. To do so, we estimate the probability distribution of the parameter of interest, $\rho$, characterizing the strength of the correlation. We provide an implementation of these ideas and concepts using the Python programming language and the pyMC module in a very short ($\sim$130 lines of code, heavily commented) and user-friendly program. We used this tool to assess the presence and properties of the correlation between planetary surface gravity and stellar activity level as measured by the log($R'_{\mathrm{HK}}$) indicator. The results of the Bayesian analysis are qualitatively similar to those obtained via p-value analysis, and support the presence of a correlation in the data. The results are more robust in their derivation and more informative, revealing interesting features such as asymmetric posterior distributions or markedly different credible intervals, and allowing for a deeper exploration. We encourage the re...
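To make the idea of a posterior distribution for the correlation coefficient concrete, here is a minimal sketch. It is not the authors' pyMC program: it assumes standardized data, a zero-mean bivariate normal likelihood with unit variances, and a uniform prior on the correlation, and it evaluates the posterior on a grid with NumPy. The function names `correlation_posterior` and `credible_interval` and the synthetic data are hypothetical.

```python
import numpy as np

def correlation_posterior(x, y, grid_size=2001):
    """Grid approximation to the posterior of rho (uniform prior, assumed)."""
    # Standardize both variables so only the correlation remains unknown.
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()
    n = len(x)
    rho = np.linspace(-0.999, 0.999, grid_size)
    sxx, syy, sxy = np.sum(x * x), np.sum(y * y), np.sum(x * y)
    one_minus = 1.0 - rho**2
    # Log-likelihood of a standardized bivariate normal, up to a constant.
    log_like = -0.5 * n * np.log(one_minus) - (sxx - 2 * rho * sxy + syy) / (2 * one_minus)
    post = np.exp(log_like - log_like.max())
    post /= post.sum()  # normalize to a probability mass over the grid
    return rho, post

def credible_interval(rho, post, mass=0.95):
    """Equal-tailed credible interval read off the cumulative posterior."""
    cdf = np.cumsum(post)
    lo = rho[np.searchsorted(cdf, (1 - mass) / 2)]
    hi = rho[np.searchsorted(cdf, 1 - (1 - mass) / 2)]
    return lo, hi

# Synthetic example (hypothetical data with true correlation 0.6).
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 0.6 * x + 0.8 * rng.normal(size=200)
rho, post = correlation_posterior(x, y)
lo, hi = credible_interval(rho, post)
post_mean = float(np.sum(rho * post))
```

Reading the interval from the full posterior, rather than from a point estimate, is what exposes the asymmetries the abstract mentions.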
Bayesian Switching Factor Analysis for Estimating Time-varying Functional Connectivity in fMRI.
Taghia, Jalil; Ryali, Srikanth; Chen, Tianwen; Supekar, Kaustubh; Cai, Weidong; Menon, Vinod
2017-03-03
There is growing interest in understanding the dynamical properties of functional interactions between distributed brain regions. However, robust estimation of temporal dynamics from functional magnetic resonance imaging (fMRI) data remains challenging due to limitations in extant multivariate methods for modeling time-varying functional interactions between multiple brain areas. Here, we develop a Bayesian generative model for fMRI time-series within the framework of hidden Markov models (HMMs). The model is a dynamic variant of the static factor analysis model (Ghahramani and Beal, 2000). We refer to this model as Bayesian switching factor analysis (BSFA) as it integrates factor analysis into a generative HMM in a unified Bayesian framework. In BSFA, brain dynamic functional networks are represented by latent states which are learnt from the data. Crucially, BSFA is a generative model which estimates the temporal evolution of brain states and transition probabilities between states as a function of time. An attractive feature of BSFA is the automatic determination of the number of latent states via Bayesian model selection arising from penalization of excessively complex models. Key features of BSFA are validated using extensive simulations on carefully designed synthetic data. We further validate BSFA using fingerprint analysis of multisession resting-state fMRI data from the Human Connectome Project (HCP). Our results show that modeling temporal dependencies in the generative model of BSFA results in improved fingerprinting of individual participants. Finally, we apply BSFA to elucidate the dynamic functional organization of the salience, central-executive, and default mode networks-three core neurocognitive systems with central role in cognitive and affective information processing (Menon, 2011). Across two HCP sessions, we demonstrate a high level of dynamic interactions between these networks and determine that the salience network has the highest temporal
A note on the robustness of a full Bayesian method for nonignorable missing data analysis
Zhang, Zhiyong; Wang, Lijuan
2012-01-01
A full Bayesian method utilizing data augmentation and Gibbs sampling algorithms is presented for analyzing nonignorable missing data. The discussion focuses on a simplified selection model for regression analysis. Regardless of missing mechanisms, it is assumed that missingness only depends on the missing variable itself. Simulation results demonstrate that the simplified selection model can recover regression model parameters under both correctly specified situations and many misspecified s...
Directory of Open Access Journals (Sweden)
Kelemen Arpad
2008-08-01
Background: This paper addresses key biological problems and statistical issues in the analysis of large gene expression data sets that describe systemic temporal response cascades to therapeutic doses in multiple tissues, such as liver, skeletal muscle, and kidney, from the same animals. Affymetrix U34A time-course gene expression data were obtained from three tissues: kidney, liver, and muscle. Our goal is not only to find the concordance of genes across tissues and to identify the genes that are commonly differentially expressed over time, but also to examine the reproducibility of the findings by integrating the results from multiple tissues through meta-analysis, in order to gain a significant increase in the power to detect differentially expressed genes over time and to find differences in how the three tissues respond to the drug. Results and conclusion: A Bayesian categorical model for estimating the proportion of the 'call' is used for pre-screening genes. A hierarchical Bayesian mixture model is further developed to identify differentially expressed genes across time and dynamic clusters. The deviance information criterion is applied to determine the number of components for model comparison and selection. The Bayesian mixture model produces the gene-specific posterior probability of differential/non-differential expression and the 95% credible interval, which is the basis for our further Bayesian meta-inference. Meta-analysis is performed to identify commonly expressed genes from multiple tissues that may serve as ideal targets for novel treatment strategies and to integrate the results across separate studies. We found the commonly expressed genes in the three tissues. However, the up/down/no regulation of these common genes differs at different time points. Moreover, the most differentially expressed genes were found in the liver, then in the kidney, and then in muscle.
MorePower 6.0 for ANOVA with relational confidence intervals and Bayesian analysis.
Campbell, Jamie I D; Thompson, Valerie A
2012-12-01
MorePower 6.0 is a flexible freeware statistical calculator that computes sample size, effect size, and power statistics for factorial ANOVA designs. It also calculates relational confidence intervals for ANOVA effects based on formulas from Jarmasz and Hollands (Canadian Journal of Experimental Psychology 63:124-138, 2009), as well as Bayesian posterior probabilities for the null and alternative hypotheses based on formulas in Masson (Behavior Research Methods 43:679-690, 2011). The program is unique in affording direct comparison of these three approaches to the interpretation of ANOVA tests. Its high numerical precision and ability to work with complex ANOVA designs could facilitate researchers' attention to issues of statistical power, Bayesian analysis, and the use of confidence intervals for data interpretation. MorePower 6.0 is available at https://wiki.usask.ca/pages/viewpageattachments.action?pageId=420413544 .
Risks Analysis of Logistics Financial Business Based on Evidential Bayesian Network
Directory of Open Access Journals (Sweden)
Ying Yan
2013-01-01
Risks in logistics financial business are identified and classified. Taking the failure of the business as the root node, a Bayesian network is constructed to measure the risk levels in the business. Three importance indexes are calculated to find the most important risks in the business. Moreover, to account for the epistemic uncertainties in the risks, evidence theory is combined with the Bayesian network to form an evidential network for the risk analysis of logistics finance. To find how much uncertainty in the root node is produced by each risk, a new index, epistemic importance, is defined. Numerical examples show that the proposed methods provide useful information with which effective approaches can be found to control and avoid these sensitive risks, thus keeping logistics financial business working more reliably. The proposed method also gives a quantitative measure of risk levels in logistics financial business, which provides guidance for the selection of financing solutions.
Bayesian analysis of non-homogeneous Markov chains: application to mental health data.
Sung, Minje; Soyer, Refik; Nhan, Nguyen
2007-07-10
In this paper we present a formal treatment of non-homogeneous Markov chains by introducing a hierarchical Bayesian framework. Our work is motivated by the analysis of correlated categorical data which arise in the assessment of psychiatric treatment programs. In our development, we introduce a Markovian structure to describe the non-homogeneity of transition patterns. In doing so, we introduce a logistic regression set-up for Markov chains and incorporate covariates in our model. We present a Bayesian model using Markov chain Monte Carlo methods and develop inference procedures to address issues encountered in the analyses of data from psychiatric treatment programs. Our model and inference procedures are applied to real data from a psychiatric treatment study.
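As a rough illustration of a logistic regression set-up for a non-homogeneous Markov chain, the sketch below builds a covariate-dependent transition matrix for a two-state chain and evaluates the log-likelihood of an observed state sequence. The two-state restriction, the single covariate, and the names `transition_matrix`, `log_likelihood`, `alpha`, and `beta` are assumptions made for the example; the paper's hierarchical priors and MCMC sampler are not reproduced here.

```python
import numpy as np

def transition_matrix(x_t, alpha, beta):
    """2x2 transition matrix at one time step; row index = previous state.

    alpha and beta hold one intercept and one slope per previous state
    (hypothetical parameterization).
    """
    p_one = 1.0 / (1.0 + np.exp(-(alpha + beta * x_t)))  # P(next = 1 | prev)
    return np.column_stack([1.0 - p_one, p_one])

def log_likelihood(states, x, alpha, beta):
    """Log-likelihood of a state sequence, conditioning on the first state."""
    ll = 0.0
    for t in range(1, len(states)):
        P = transition_matrix(x[t], alpha, beta)
        ll += np.log(P[states[t - 1], states[t]])
    return ll

# Small usage example with made-up numbers.
alpha = np.array([-1.0, 1.0])
beta = np.array([0.5, -0.5])
P = transition_matrix(0.3, alpha, beta)
states = np.array([0, 0, 1, 1, 0])
x = np.array([0.0, 0.1, 0.2, 0.3, 0.4])
ll_flat = log_likelihood(states, x, np.zeros(2), np.zeros(2))
```

Because the covariate enters through `x[t]`, the transition probabilities change from step to step, which is what makes the chain non-homogeneous.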
A PAC-Bayesian Analysis of Graph Clustering and Pairwise Clustering
Seldin, Yevgeny
2010-01-01
We formulate weighted graph clustering as a prediction problem: given a subset of edge weights we analyze the ability of graph clustering to predict the remaining edge weights. This formulation enables practical and theoretical comparison of different approaches to graph clustering as well as comparison of graph clustering with other possible ways to model the graph. We adapt the PAC-Bayesian analysis of co-clustering (Seldin and Tishby, 2008; Seldin, 2009) to derive a PAC-Bayesian generalization bound for graph clustering. The bound shows that graph clustering should optimize a trade-off between empirical data fit and the mutual information that clusters preserve on the graph nodes. A similar trade-off derived from information-theoretic considerations was already shown to produce state-of-the-art results in practice (Slonim et al., 2005; Yom-Tov and Slonim, 2009). This paper supports the empirical evidence by providing a better theoretical foundation, suggesting formal generalization guarantees, and offering...
Directory of Open Access Journals (Sweden)
Gianola Daniel
2007-09-01
Multivariate linear models are increasingly important in quantitative genetics. In high dimensional specifications, factor analysis (FA) may provide an avenue for structuring (co)variance matrices, thus reducing the number of parameters needed for describing (co)dispersion. We describe how FA can be used to model genetic effects in the context of a multivariate linear mixed model. An orthogonal common factor structure is used to model genetic effects under Gaussian assumption, so that the marginal likelihood is multivariate normal with a structured genetic (co)variance matrix. Under standard prior assumptions, all fully conditional distributions have closed form, and samples from the joint posterior distribution can be obtained via Gibbs sampling. The model and the algorithm developed for its Bayesian implementation were used to describe five repeated records of milk yield in dairy cattle, and a one common FA model was compared with a standard multiple trait model. The Bayesian Information Criterion favored the FA model.
Bayesian Propensity Score Analysis: Simulation and Case Study
Kaplan, David; Chen, Cassie J. S.
2011-01-01
Propensity score analysis (PSA) has been used in a variety of settings, such as education, epidemiology, and sociology. Most typically, propensity score analysis has been implemented within the conventional frequentist perspective of statistics. This perspective, as is well known, does not account for uncertainty in either the parameters of the…
bspmma: An R Package for Bayesian Semiparametric Models for Meta-Analysis
Directory of Open Access Journals (Sweden)
Deborah Burr
2012-07-01
We introduce an R package, bspmma, which implements a Dirichlet-based random effects model specific to meta-analysis. In meta-analysis, when combining effect estimates from several heterogeneous studies, it is common to use a random-effects model. The usual frequentist or Bayesian models specify a normal distribution for the true effects. However, in many situations, the effect distribution is not normal, e.g., it can have thick tails, be skewed, or be multi-modal. A Bayesian nonparametric model based on mixtures of Dirichlet process priors has been proposed in the literature, for the purpose of accommodating the non-normality. We review this model and then describe a competitor, a semiparametric version which has the feature that it allows for a well-defined centrality parameter convenient for determining whether the overall effect is significant. This second Bayesian model is based on a different version of the Dirichlet process prior, and we call it the "conditional Dirichlet model". The package contains functions to carry out analyses based on either the ordinary or the conditional Dirichlet model, functions for calculating certain Bayes factors that provide a check on the appropriateness of the conditional Dirichlet model, and functions that enable an empirical Bayes selection of the precision parameter of the Dirichlet process. We illustrate the use of the package on two examples, and give an interpretation of the results in these two different scenarios.
Application of Bayesian graphs to SN Ia data analysis and compression
Ma, Con; Bassett, Bruce A
2016-01-01
Bayesian graphical models are an efficient tool for modelling complex data and deriving self-consistent expressions of the posterior distribution of model parameters. We apply Bayesian graphs to perform statistical analyses of Type Ia supernova (SN Ia) luminosity distance measurements from the Joint Light-curve Analysis (JLA) dataset (Betoule et al. 2014, arXiv:1401.4064). In contrast to the $\chi^2$ approach used in previous studies, the Bayesian inference allows us to fully account for the standard-candle parameter dependence of the data covariance matrix. Comparing with $\chi^2$ analysis results we find a systematic offset of the marginal model parameter bounds. We demonstrate that the bias is statistically significant in the case of the SN Ia standardization parameters with a maximal $6\sigma$ shift of the SN light-curve colour correction. In addition, we find that the evidence for a host galaxy correction is now only $2.4\sigma$. Systematic offsets on the cosmological parameters remain small, but may incre...
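The difference between a $\chi^2$ fit and a full Gaussian likelihood when the covariance depends on the parameters can be written out in general form. The identity below is a standard property of the multivariate normal density and is not taken from the paper's specific graph; the symbols $d$ (data vector of length $N$), $\mu(\theta)$ (model prediction), and $C(\theta)$ (covariance matrix) are notation assumed here for illustration.

```latex
-2\ln\mathcal{L}(\theta) = \chi^2(\theta) + \ln\det C(\theta) + N\ln 2\pi,
\qquad
\chi^2(\theta) = \left[d - \mu(\theta)\right]^{\mathsf T} C(\theta)^{-1} \left[d - \mu(\theta)\right]
```

When $C$ does not depend on $\theta$, the $\ln\det C$ term is a constant and minimizing $\chi^2$ is equivalent to maximizing the likelihood. When $C$ does depend on $\theta$, dropping that term can shift the inferred parameters, which is the kind of offset the abstract reports.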
Time-varying nonstationary multivariate risk analysis using a dynamic Bayesian copula
Sarhadi, Ali; Burn, Donald H.; Concepción Ausín, María.; Wiper, Michael P.
2016-03-01
A time-varying risk analysis is proposed for an adaptive design framework in nonstationary conditions arising from climate change. A Bayesian, dynamic conditional copula is developed for modeling the time-varying dependence structure between mixed continuous and discrete multiattributes of multidimensional hydrometeorological phenomena. Joint Bayesian inference is carried out to fit the marginals and copula in an illustrative example using an adaptive, Gibbs Markov Chain Monte Carlo (MCMC) sampler. Posterior mean estimates and credible intervals are provided for the model parameters and the Deviance Information Criterion (DIC) is used to select the model that best captures different forms of nonstationarity over time. This study also introduces a fully Bayesian, time-varying joint return period for multivariate time-dependent risk analysis in nonstationary environments. The results demonstrate that the nature and the risk of extreme-climate multidimensional processes are changed over time under the impact of climate change, and accordingly the long-term decision making strategies should be updated based on the anomalies of the nonstationary environment.
Fox, Neil I.; Micheas, Athanasios C.; Peng, Yuqiang
2016-07-01
This paper introduces the use of Bayesian full Procrustes shape analysis in object-oriented meteorological applications. In particular, the Procrustes methodology is used to generate mean forecast precipitation fields from a set of ensemble forecasts. This approach has advantages over other ensemble averaging techniques in that it can produce a forecast that retains the morphological features of the precipitation structures and present the range of forecast outcomes represented by the ensemble. The production of the ensemble mean avoids the problems of smoothing that result from simple pixel or cell averaging, while producing credible sets that retain information on ensemble spread. Also in this paper, the full Bayesian Procrustes scheme is used as an object verification tool for precipitation forecasts. This is an extension of a previously presented Procrustes shape analysis based verification approach into a full Bayesian format designed to handle the verification of precipitation forecasts that match objects from an ensemble of forecast fields to a single truth image. The methodology is tested on radar reflectivity nowcasts produced in the Warning Decision Support System - Integrated Information (WDSS-II) by varying parameters in the K-means cluster tracking scheme.
PFG NMR and Bayesian analysis to characterise non-Newtonian fluids
Blythe, Thomas W.; Sederman, Andrew J.; Stitt, E. Hugh; York, Andrew P. E.; Gladden, Lynn F.
2017-01-01
Many industrial flow processes are sensitive to changes in the rheological behaviour of process fluids, and there therefore exists a need for methods that provide online, or inline, rheological characterisation necessary for process control and optimisation over timescales of minutes or less. Nuclear magnetic resonance (NMR) offers a non-invasive technique for this application, without limitation on optical opacity. We present a Bayesian analysis approach using pulsed field gradient (PFG) NMR to enable estimation of the rheological parameters of Herschel-Bulkley fluids in a pipe flow geometry, characterised by a flow behaviour index n , yield stress τ0 , and consistency factor k , by analysis of the signal in q -space. This approach eliminates the need for velocity image acquisition and expensive gradient hardware. We investigate the robustness of the proposed Bayesian NMR approach to noisy data and reduced sampling using simulated NMR data and show that even with a signal-to-noise ratio (SNR) of 100, only 16 points are required to be sampled to provide rheological parameters accurate to within 2% of the ground truth. Experimental validation is provided through an experimental case study on Carbopol 940 solutions (model Herschel-Bulkley fluids) using PFG NMR at a 1H resonance frequency of 85.2 MHz; for SNR > 1000, only 8 points are required to be sampled. This corresponds to a total acquisition time of non-Bayesian NMR methods demonstrates that the Bayesian NMR approach is in agreement with MR flow imaging to within the accuracy of the measurement. Furthermore, as we increase the concentration of Carbopol 940 we observe a change in rheological characteristics, probably due to shear history-dependent behaviour and the different geometries used. This behaviour highlights the need for online, or inline, rheological characterisation in industrial process applications.
Bayesian model-based cluster analysis for predicting macrofaunal communities
Braak, ter C.J.F.; Hoijtink, H.; Akkermans, W.; Verdonschot, P.F.M.
2003-01-01
To predict macrofaunal community composition from environmental data a two-step approach is often followed: (1) the water samples are clustered into groups on the basis of the macrofauna data and (2) the groups are related to the environmental data, e.g. by discriminant analysis. For the cluster ana
Bayesian latent variable models for the analysis of experimental psychology data.
Merkle, Edgar C; Wang, Ting
2016-03-18
In this paper, we address the use of Bayesian factor analysis and structural equation models to draw inferences from experimental psychology data. While such application is non-standard, the models are generally useful for the unified analysis of multivariate data that stem from, e.g., subjects' responses to multiple experimental stimuli. We first review the models and the parameter identification issues inherent in the models. We then provide details on model estimation via JAGS and on Bayes factor estimation. Finally, we use the models to re-analyze experimental data on risky choice, comparing the approach to simpler, alternative methods.
Alves, Nelson A; Rizzi, Leandro G
2015-01-01
Microcanonical thermostatistics analysis has become an important tool to reveal essential aspects of phase transitions in complex systems. An efficient way to estimate the microcanonical inverse temperature $\beta(E)$ and the microcanonical entropy $S(E)$ is achieved with the statistical temperature weighted histogram analysis method (ST-WHAM). The strength of this method lies in its flexibility, as it can be used to analyse data produced by algorithms with generalised sampling weights. However, for any sampling weight, ST-WHAM requires the calculation of derivatives of energy histograms $H(E)$, which leads to non-trivial and tedious binning tasks for models with continuous energy spectrum such as those for biomolecular and colloidal systems. Here, we discuss two alternative methods that avoid the need for such energy binning to obtain continuous estimates for $H(E)$ in order to evaluate $\beta(E)$ by using ST-WHAM: (i) a series expansion to estimate probability densities from the empirical cumulative distrib...
Bayesian meta-analysis models for microarray data: a comparative study
Directory of Open Access Journals (Sweden)
Song Joon J
2007-03-01
Background: With the growing abundance of microarray data, statistical methods are increasingly needed to integrate results across studies. Two common approaches for meta-analysis of microarrays include either combining gene expression measures across studies or combining summaries such as p-values, probabilities or ranks. Here, we compare two Bayesian meta-analysis models that are analogous to these methods. Results: Two Bayesian meta-analysis models for microarray data have recently been introduced. The first model combines standardized gene expression measures across studies into an overall mean, accounting for inter-study variability, while the second combines probabilities of differential expression without combining expression values. Both models produce the gene-specific posterior probability of differential expression, which is the basis for inference. Since the standardized expression integration model includes inter-study variability, it may improve accuracy of results versus the probability integration model. However, due to the small number of studies typical in microarray meta-analyses, the variability between studies is challenging to estimate. The probability integration model eliminates the need to model variability between studies, and thus its implementation is more straightforward. We found in simulations of two and five studies that combining probabilities outperformed combining standardized gene expression measures for three comparison values: the percent of true discovered genes in meta-analysis versus individual studies; the percent of true genes omitted in meta-analysis versus separate studies, and the number of true discovered genes for fixed levels of Bayesian false discovery. We identified similar results when pooling two independent studies of Bacillus subtilis. We assumed that each study was produced from the same microarray platform with only two conditions: a treatment and control, and that the data sets
OVERALL SENSITIVITY ANALYSIS UTILIZING BAYESIAN NETWORK FOR THE QUESTIONNAIRE INVESTIGATION ON SNS
Directory of Open Access Journals (Sweden)
Tsuyoshi Aburai
2013-11-01
Social Networking Services (SNS) have spread rapidly in Japan in recent years. The most popular ones are Facebook, mixi, and Twitter, which are used in various fields of life together with convenient tools such as smartphones. In this work, a questionnaire investigation is carried out in order to clarify the current usage conditions, issues, and desired functions. More than 1,000 samples are gathered. A Bayesian network is utilized for this analysis. Sensitivity analysis is carried out by setting evidence to all items, which enables an overall analysis for each item, and some useful results were obtained. We have previously presented a paper on this; because the volume became too large, the results were split, and this paper shows the latter half of the investigation results, obtained by setting evidence to Bayesian network parameters. Differences in usage objectives and SNS sites are made clear by the attributes and preferences of SNS users. They can be utilized effectively for marketing by clarifying the target customer through the sensitivity analysis.
Crash risk analysis for Shanghai urban expressways: A Bayesian semi-parametric modeling approach.
Yu, Rongjie; Wang, Xuesong; Yang, Kui; Abdel-Aty, Mohamed
2016-10-01
Urban expressway systems have developed rapidly in China in recent years; they have become a key part of city roadway networks, carrying large traffic volumes and providing high travel speeds. Along with the increase in traffic volume, traffic safety has become a major issue for Chinese urban expressways due to frequent crashes and the non-recurrent congestion they cause. To unveil crash occurrence mechanisms and further develop Active Traffic Management (ATM) control strategies to improve traffic safety, this study developed disaggregate crash risk analysis models with loop detector traffic data and historical crash data. Bayesian random effects logistic regression models were utilized because they can account for the unobserved heterogeneity among crashes. However, previous crash risk analysis studies formulated random effects distributions parametrically, assuming that they follow normal distributions. Because little is known about random effects distributions, a subjective parametric setting may be incorrect. In order to construct more flexible and robust random effects to capture the unobserved heterogeneity, a Bayesian semi-parametric inference technique was introduced to crash risk analysis in this study. Models with both inference techniques were developed for total crashes; the semi-parametric models provided substantially better goodness-of-fit, while the two models shared consistent coefficient estimates. Bayesian semi-parametric random effects logistic regression models were then developed for weekday peak hour crashes, weekday non-peak hour crashes, and weekend non-peak hour crashes to investigate different crash occurrence scenarios. Significant factors that affect crash risk were identified, and conclusions about crash mechanisms were drawn.
Nonparametric Bayesian Dictionary Learning for Analysis of Noisy and Incomplete Images
2010-04-01
[Table residue from the report: PSNR results in which each cell gives KSVD and BPFA results, respectively, indexed by σ; and BPFA image interpolation PSNR results using patch size 8×8, and BPFA RGB image interpolation PSNR results using patch size 7×7, indexed by data ratio. Test images: C.man, House, Peppers, Lena, Barbara, Boats, F.print, Couple, Hill. Reference fragment: T. Ferguson. A Bayesian analysis of some nonparametric problems. Annals of Statistics, 1:209–230.]
BayesLCA: An R Package for Bayesian Latent Class Analysis
Directory of Open Access Journals (Sweden)
Arthur White
2014-11-01
The BayesLCA package for R provides tools for performing latent class analysis within a Bayesian setting. Three methods for fitting the model are provided, incorporating an expectation-maximization algorithm, Gibbs sampling and a variational Bayes approximation. The article briefly outlines the methodology behind each of these techniques and discusses some of the technical difficulties associated with them. Methods to remedy these problems are also described. Visualization methods for each of these techniques are included, as well as criteria to aid model selection.
A novel framework of change-point detection for machine monitoring
Lu, Guoliang; Zhou, Yiqi; Lu, Changhou; Li, Xueyong
2017-01-01
The need for automatic machine monitoring has been well known in industry for many years. Although it has been widely accepted that a change in structural properties can indicate a fault in rotating machinery components (e.g., bearings and gears), automatic algorithms for this task are still in progress. In this paper, we propose a novel framework for change-point detection in machine monitoring. The framework includes two phases: (1) anomaly measure: on the basis of an autoregressive (AR) model, a new computation method is proposed to measure anomalies in a given time series, which does not require any reference data from other measurement(s); (2) change detection: a new statistical test based on martingales is employed to detect a potential change in the series, and it can be operated in an unsupervised and self-conducted manner. Experimental results on testing data captured in real scenarios demonstrated the effectiveness and the realizability of the proposed framework for change-point detection in machine monitoring, which suggests that our framework can be directly applicable in many real-world applications.
Directory of Open Access Journals (Sweden)
P. K. Kapur
2009-01-01
Software testing is an important phase of the software development life cycle. It controls the quality of the software product. Due to the complexity of the software system and incomplete understanding of the software, the testing team may not be able to remove/correct the fault perfectly on observation/detection of a failure, and the original fault may remain, resulting in a phenomenon known as imperfect debugging, or get replaced by another fault, causing fault generation. In the case of imperfect debugging, the fault content of the software remains the same, while in the case of fault generation, the fault content increases as testing progresses and removal/correction results in the introduction of new faults while removing/correcting old ones. During software testing, the fault detection/correction rate may not be the same throughout the whole testing process, but may change at any moment in time. In the literature, various software reliability models have been proposed incorporating the change-point concept. In this paper we propose a distribution-based change-point problem with two types of imperfect debugging in software reliability. The models developed have been validated and verified using real data sets. Estimated parameters and comparison criteria results are also presented.
Directory of Open Access Journals (Sweden)
Naveed Khan
2016-10-01
In recent years, smart phones with inbuilt sensors have become popular devices to facilitate activity recognition. The sensors capture a large amount of data, containing meaningful events, in a short period of time. The change points in this data are used to specify transitions to distinct events and can be used in various scenarios such as identifying change in a patient’s vital signs in the medical domain or requesting activity labels for generating real-world labeled activity datasets. Our work focuses on change-point detection to identify a transition from one activity to another. Within this paper, we extend our previous work on the multivariate exponentially weighted moving average (MEWMA) algorithm by using a genetic algorithm (GA) to identify the optimal set of parameters for online change-point detection. The proposed technique finds the maximum accuracy and F_measure by optimizing the different parameters of the MEWMA, which subsequently identifies the exact location of the change point from an existing activity to a new one. Optimal parameter selection facilitates an algorithm to detect accurate change points and minimize false alarms. Results have been evaluated based on two real datasets of accelerometer data collected from a set of different activities from two users, with a high degree of accuracy from 99.4% to 99.8% and F_measure of up to 66.7%.
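For readers unfamiliar with the MEWMA statistic, the following sketch shows the textbook form of the chart: an exponentially smoothed vector and a Hotelling-type distance with the exact time-dependent covariance scaling. It is not the authors' implementation and contains no genetic-algorithm tuning; the smoothing weight `lam`, the alarm `threshold`, the function names, and the synthetic three-axis data are all assumptions chosen for illustration.

```python
import numpy as np

def mewma_statistics(X, lam=0.2, mean=None, cov=None):
    """Return the MEWMA T^2 statistic for each row of X (samples x channels)."""
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=0) if mean is None else np.asarray(mean, dtype=float)
    sigma = np.cov(X, rowvar=False) if cov is None else np.asarray(cov, dtype=float)
    sigma_inv = np.linalg.inv(sigma)
    z = np.zeros(X.shape[1])
    t2 = np.empty(len(X))
    for i, x in enumerate(X, start=1):
        # Exponentially weighted moving average of the centered observations.
        z = lam * (x - mu) + (1.0 - lam) * z
        # Exact covariance scaling of z after i steps.
        scale = lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * i))
        t2[i - 1] = z @ sigma_inv @ z / scale
    return t2

def first_alarm(t2, threshold):
    """Index of the first sample whose statistic exceeds the threshold."""
    idx = np.flatnonzero(t2 > threshold)
    return int(idx[0]) if idx.size else None

# Synthetic accelerometer-like data: the mean shifts at sample 200.
rng = np.random.default_rng(1)
before = rng.normal(0.0, 1.0, size=(200, 3))
after = rng.normal(3.0, 1.0, size=(100, 3))
X = np.vstack([before, after])
t2 = mewma_statistics(X, lam=0.2, mean=before.mean(axis=0),
                      cov=np.cov(before, rowvar=False))
alarm = first_alarm(t2, threshold=30.0)
```

In this form, `lam` and `threshold` are the kind of parameters a search procedure such as a genetic algorithm could tune: a small `lam` smooths more and reacts more slowly, while a low threshold raises the false-alarm rate.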
Fuzzy Bayesian Network-Bow-Tie Analysis of Gas Leakage during Biomass Gasification.
Directory of Open Access Journals (Sweden)
Fang Yan
Biomass gasification technology has been rapidly developed recently. But fire and poisoning accidents caused by gas leakage restrict the development and promotion of biomass gasification. Therefore, probabilistic safety assessment (PSA) is necessary for biomass gasification systems. Subsequently, Bayesian network-bow-tie (BN-bow-tie) analysis was proposed by mapping bow-tie analysis into a Bayesian network (BN). Causes of gas leakage and the accidents triggered by gas leakage can be obtained by bow-tie analysis, and the BN was used to confirm the critical nodes of accidents by introducing three corresponding importance measures. Meanwhile, a definite occurrence probability of failure was needed in PSA. In view of the insufficient failure data for biomass gasification, the occurrence probability of failures that cannot be obtained from standard reliability data sources was determined by fuzzy methods based on expert judgment. An improved approach that uses expert weighting to aggregate fuzzy numbers, including triangular and trapezoidal numbers, was proposed, and the occurrence probability of failure was obtained. Finally, safety measures were indicated based on the obtained critical nodes. The theoretical occurrence probabilities in one year of gas leakage and the accidents caused by it were reduced to 1/10.3 of the original values by these safety measures.
Analysis of lifestyle and metabolic predictors of visceral obesity with Bayesian Networks
Directory of Open Access Journals (Sweden)
de Morais Sérgio
2010-09-01
Background: The aim of this study was to provide a framework for the analysis of visceral obesity and its determinants in women, where complex inter-relationships are observed among lifestyle, nutritional and metabolic predictors. Thirty-four predictors related to lifestyle, adiposity, body fat distribution, blood lipids and adipocyte sizes have been considered as potential correlates of visceral obesity in women. To properly address the difficulties in managing such interactions given our limited sample of 150 women, bootstrapped Bayesian networks were constructed based on novel constraint-based learning methods that appeared recently in the statistical learning community. Statistical significance of edge strengths was evaluated and the less reliable edges were pruned to increase the network robustness. To allow accessible interpretation and integrate biological knowledge into the final network, several undirected edges were afterwards directed with physiological expertise according to relevant literature. Results: Extensive experiments on synthetic data sampled from a known Bayesian network show that the algorithm, called Recursive Hybrid Parents and Children (RHPC), outperforms state-of-the-art algorithms that appeared in the recent literature. Regarding biological plausibility, we found that the inference results obtained with the proposed method were in excellent agreement with biological knowledge. For example, these analyses indicated that visceral adipose tissue accumulation is strongly related to blood lipid alterations independent of overall obesity level. Conclusions: Bayesian Networks are a useful tool for investigating and summarizing evidence when complex relationships exist among predictors, in particular, as in the case of multifactorial conditions like visceral obesity, when there is a concurrent incidence for several variables, interacting in a complex manner. The source code and the data sets used for the empirical tests
Application of evidence theory in information fusion of multiple sources in bayesian analysis
Institute of Scientific and Technical Information of China (English)
周忠宝; 蒋平; 武小悦
2004-01-01
How to obtain a proper prior distribution is one of the most critical problems in Bayesian analysis. In many practical cases, the prior information comes from different sources, and the form of the prior distribution can be readily determined while its parameters are hard to determine. In this paper, based on evidence theory, a new method is presented to fuse the information of multiple sources and determine the parameters of the prior distribution when the form is known. By taking the prior distributions which result from the information of multiple sources and converting them into corresponding mass functions which can be combined by the Dempster-Shafer (D-S) method, we get the combined mass function and the representative points of the prior distribution. These points are fitted to the given distribution form to determine the parameters of the prior distribution. The fused prior distribution is then obtained and Bayesian analysis can be performed. How to convert the prior distributions into mass functions properly and get the representative points of the fused prior distribution is the central question we address in this paper. The simulation example shows that the proposed method is effective.
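The combination step mentioned in this abstract can be sketched with Dempster's rule of combination for two mass functions. This is only a minimal illustration: the function name `dempster_combine`, the two-element frame and the example masses are assumptions, and the paper's own conversion from prior distributions to mass functions is not reproduced here.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two mass functions (dicts mapping frozenset -> mass) with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass assigned to the empty set
    if not combined:
        raise ValueError("sources are in total conflict")
    # Renormalise by the non-conflicting mass.
    return {s: w / (1.0 - conflict) for s, w in combined.items()}

# Hypothetical frame {"a", "b"} with two sources of evidence.
A, B = frozenset({"a"}), frozenset({"b"})
AB = A | B
m1 = {A: 0.6, AB: 0.4}
m2 = {B: 0.5, AB: 0.5}
fused = dempster_combine(m1, m2)
for subset, mass in fused.items():
    print(sorted(subset), round(mass, 4))
```

The renormalisation discards the conflicting mass, which is why highly conflicting sources can yield counter-intuitive fused results; that sensitivity is a known property of the rule rather than of this sketch.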
Directory of Open Access Journals (Sweden)
Nazia Afreen
2016-03-01
Dengue fever is the most important arboviral disease in the tropical and sub-tropical countries of the world. Delhi, the metropolitan capital state of India, has reported many dengue outbreaks, with the last outbreak occurring in 2013. We have recently reported predominance of dengue virus serotype 2 during 2011-2014 in Delhi. In the present study, we report molecular characterization and evolutionary analysis of dengue serotype 2 viruses which were detected in 2011-2014 in Delhi. Envelope genes of 42 DENV-2 strains were sequenced in the study. All DENV-2 strains grouped within the Cosmopolitan genotype and further clustered into three lineages; Lineage I, II and III. Lineage III replaced lineage I during dengue fever outbreak of 2013. Further, a novel mutation Thr404Ile was detected in the stem region of the envelope protein of a single DENV-2 strain in 2014. Nucleotide substitution rate and time to the most recent common ancestor were determined by molecular clock analysis using Bayesian methods. A change in effective population size of Indian DENV-2 viruses was investigated through Bayesian skyline plot. The study will be a vital road map for investigation of epidemiology and evolutionary pattern of dengue viruses in India.
Bayesian flux balance analysis applied to a skeletal muscle metabolic model.
Heino, Jenni; Tunyan, Knarik; Calvetti, Daniela; Somersalo, Erkki
2007-09-01
In this article, the steady state condition for the multi-compartment models for cellular metabolism is considered. The problem is to estimate the reaction and transport fluxes, as well as the concentrations in venous blood when the stoichiometry and bound constraints for the fluxes and the concentrations are given. The problem has been addressed previously by a number of authors, and optimization-based approaches as well as extreme pathway analysis have been proposed. These approaches are briefly discussed here. The main emphasis of this work is a Bayesian statistical approach to the flux balance analysis (FBA). We show how the bound constraints and optimality conditions such as maximizing the oxidative phosphorylation flux can be incorporated into the model in the Bayesian framework by proper construction of the prior densities. We propose an effective Markov chain Monte Carlo (MCMC) scheme to explore the posterior densities, and compare the results with those obtained via the previously studied linear programming (LP) approach. The proposed methodology, which is applied here to a two-compartment model for skeletal muscle metabolism, can be extended to more complex models.
Directory of Open Access Journals (Sweden)
Madsen Per
2003-03-01
A fully Bayesian analysis using Gibbs sampling and data augmentation in a multivariate model of Gaussian, right censored, and grouped Gaussian traits is described. The grouped Gaussian traits are either ordered categorical traits (with more than two categories) or binary traits, where the grouping is determined via thresholds on the underlying Gaussian scale, the liability scale. Allowances are made for unequal models, unknown covariance matrices and missing data. Having outlined the theory, strategies for implementation are reviewed. These include joint sampling of location parameters; efficient sampling from the fully conditional posterior distribution of augmented data, a multivariate truncated normal distribution; and sampling from the conditional inverse Wishart distribution, the fully conditional posterior distribution of the residual covariance matrix. Finally, a simulated dataset was analysed to illustrate the methodology. This paper concentrates on a model where residuals associated with liabilities of the binary traits are assumed to be independent. A Bayesian analysis using Gibbs sampling is outlined for the model where this assumption is relaxed.
Korsgaard, Inge Riis; Lund, Mogens Sandø; Sorensen, Daniel; Gianola, Daniel; Madsen, Per; Jensen, Just
2003-01-01
A fully Bayesian analysis using Gibbs sampling and data augmentation in a multivariate model of Gaussian, right censored, and grouped Gaussian traits is described. The grouped Gaussian traits are either ordered categorical traits (with more than two categories) or binary traits, where the grouping is determined via thresholds on the underlying Gaussian scale, the liability scale. Allowances are made for unequal models, unknown covariance matrices and missing data. Having outlined the theory, strategies for implementation are reviewed. These include joint sampling of location parameters; efficient sampling from the fully conditional posterior distribution of augmented data, a multivariate truncated normal distribution; and sampling from the conditional inverse Wishart distribution, the fully conditional posterior distribution of the residual covariance matrix. Finally, a simulated dataset was analysed to illustrate the methodology. This paper concentrates on a model where residuals associated with liabilities of the binary traits are assumed to be independent. A Bayesian analysis using Gibbs sampling is outlined for the model where this assumption is relaxed.
Risk analysis of emergent water pollution accidents based on a Bayesian Network.
Tang, Caihong; Yi, Yujun; Yang, Zhifeng; Sun, Jie
2016-01-01
To guarantee the security of water quality in water transfer channels, especially in open channels, analysis of potential emergent pollution sources in the water transfer process is critical. It is also indispensable for forewarnings and protection from emergent pollution accidents. Bridges above open channels with large amounts of truck traffic are the main locations where emergent accidents could occur. A Bayesian Network model, which consists of six root nodes and three middle layer nodes, was developed in this paper, and was employed to identify the possibility of potential pollution risk. Dianbei Bridge is reviewed as a typical bridge on an open channel of the Middle Route of the South to North Water Transfer Project where emergent traffic accidents could occur. This study focuses on the risk of water pollution caused by leakage of pollutants into the water. The risk for potential traffic accidents at the Dianbei Bridge implies a risk for water pollution in the canal. Based on survey data, statistical analysis, and domain specialist knowledge, a Bayesian Network model was established. The human factor of emergent accidents has been considered in this model. Additionally, this model has been employed to describe the probability of accidents and the risk level. The sensitive factors behind pollution accidents have been deduced. The case in which the sensitive factors are in the states most likely to lead to accidents has also been simulated.
Improving water quality assessments through a hierarchical Bayesian analysis of variability.
Gronewold, Andrew D; Borsuk, Mark E
2010-10-15
Water quality measurement error and variability, while well-documented in laboratory-scale studies, is rarely acknowledged or explicitly resolved in most model-based water body assessments, including those conducted in compliance with the United States Environmental Protection Agency (USEPA) Total Maximum Daily Load (TMDL) program. Consequently, proposed pollutant loading reductions in TMDLs and similar water quality management programs may be biased, resulting in either slower-than-expected rates of water quality restoration and designated use reinstatement or, in some cases, overly conservative management decisions. To address this problem, we present a hierarchical Bayesian approach for relating actual in situ or model-predicted pollutant concentrations to multiple sampling and analysis procedures, each with distinct sources of variability. We apply this method to recently approved TMDLs to investigate whether appropriate accounting for measurement error and variability will lead to different management decisions. We find that required pollutant loading reductions may in fact vary depending not only on how measurement variability is addressed but also on which water quality analysis procedure is used to assess standard compliance. As a general strategy, our Bayesian approach to quantifying variability may represent an alternative to the common practice of addressing all forms of uncertainty through an arbitrary margin of safety (MOS).
Afreen, Nazia; Naqvi, Irshad H; Broor, Shobha; Ahmed, Anwar; Kazim, Syed Naqui; Dohare, Ravins; Kumar, Manoj; Parveen, Shama
2016-03-01
Dengue fever is the most important arboviral disease in the tropical and sub-tropical countries of the world. Delhi, the metropolitan capital state of India, has reported many dengue outbreaks, with the last outbreak occurring in 2013. We have recently reported predominance of dengue virus serotype 2 during 2011-2014 in Delhi. In the present study, we report molecular characterization and evolutionary analysis of dengue serotype 2 viruses which were detected in 2011-2014 in Delhi. Envelope genes of 42 DENV-2 strains were sequenced in the study. All DENV-2 strains grouped within the Cosmopolitan genotype and further clustered into three lineages; Lineage I, II and III. Lineage III replaced lineage I during dengue fever outbreak of 2013. Further, a novel mutation Thr404Ile was detected in the stem region of the envelope protein of a single DENV-2 strain in 2014. Nucleotide substitution rate and time to the most recent common ancestor were determined by molecular clock analysis using Bayesian methods. A change in effective population size of Indian DENV-2 viruses was investigated through Bayesian skyline plot. The study will be a vital road map for investigation of epidemiology and evolutionary pattern of dengue viruses in India.
Bayesian analysis for OPC modeling with film stack properties and posterior predictive checking
Burbine, Andrew; Fenger, Germain; Sturtevant, John; Fryer, David
2016-10-01
The use of optical proximity correction (OPC) demands increasingly accurate models of the photolithographic process. Model building and analysis techniques in the data science community have seen great strides in the past two decades which make better use of available information. This paper expands upon Bayesian analysis methods for parameter selection in lithographic models by increasing the parameter set and employing posterior predictive checks. Work continues with a Markov chain Monte Carlo (MCMC) search algorithm to generate posterior distributions of parameters. Models now include wafer film stack refractive indices, n and k, as parameters, recognizing the uncertainties associated with these values. Posterior predictive checks are employed as a method to validate parameter vectors discovered by the analysis, akin to cross validation.
Bayesian sensitivity analysis of incomplete data: bridging pattern-mixture and selection models.
Kaciroti, Niko A; Raghunathan, Trivellore
2014-11-30
Pattern-mixture models (PMM) and selection models (SM) are alternative approaches for statistical analysis when faced with incomplete data and a nonignorable missing-data mechanism. Both models make empirically unverifiable assumptions and need additional constraints to identify the parameters. Here, we first introduce intuitive parameterizations to identify PMM for different types of outcome with distribution in the exponential family; then we translate these to their equivalent SM approach. This provides a unified framework for performing sensitivity analysis under either setting. These new parameterizations are transparent, easy-to-use, and provide dual interpretation from both the PMM and SM perspectives. A Bayesian approach is used to perform sensitivity analysis, deriving inferences using informative prior distributions on the sensitivity parameters. These models can be fitted using software that implements Gibbs sampling.
Current trends in Bayesian methodology with applications
Upadhyay, Satyanshu K; Dey, Dipak K; Loganathan, Appaia
2015-01-01
Collecting Bayesian material scattered throughout the literature, Current Trends in Bayesian Methodology with Applications examines the latest methodological and applied aspects of Bayesian statistics. The book covers biostatistics, econometrics, reliability and risk analysis, spatial statistics, image analysis, shape analysis, Bayesian computation, clustering, uncertainty assessment, high-energy astrophysics, neural networking, fuzzy information, objective Bayesian methodologies, empirical Bayes methods, small area estimation, and many more topics. Each chapter is self-contained and focuses on
Hsieh, Chueh-An; Maier, Kimberly S.
2009-01-01
The capacity of Bayesian methods in estimating complex statistical models is undeniable. Bayesian data analysis is seen as having a range of advantages, such as an intuitive probabilistic interpretation of the parameters of interest, the efficient incorporation of prior information to empirical data analysis, model averaging and model selection.…
A Bayesian analysis of the 69 highest energy cosmic rays detected by the Pierre Auger Observatory
Khanin, Alexander; Mortlock, Daniel J.
2016-08-01
The origins of ultrahigh energy cosmic rays (UHECRs) remain an open question. Several attempts have been made to cross-correlate the arrival directions of the UHECRs with catalogues of potential sources, but no definite conclusion has been reached. We report a Bayesian analysis of the 69 events from the Pierre Auger Observatory (PAO) that aims to determine the fraction of the UHECRs that originate from known AGNs in the Veron-Cetty & Veron (VCV) catalogue, as well as AGNs detected with the Swift Burst Alert Telescope (Swift-BAT), galaxies from the 2MASS Redshift Survey (2MRS), and an additional volume-limited sample of 17 nearby AGNs. The study makes use of a multilevel Bayesian model of UHECR injection, propagation and detection. We find that for reasonable ranges of prior parameters the Bayes factors disfavour a purely isotropic model. For fiducial values of the model parameters, we report 68 per cent credible intervals for the fraction of source originating UHECRs of 0.09^{+0.05}_{-0.04}, 0.25^{+0.09}_{-0.08}, 0.24^{+0.12}_{-0.10}, and 0.08^{+0.04}_{-0.03} for the VCV, Swift-BAT and 2MRS catalogues, and the sample of 17 AGNs, respectively.
Shi, Qi; Abdel-Aty, Mohamed; Yu, Rongjie
2016-03-01
In traffic safety studies, crash frequency modeling of total crashes is the cornerstone before proceeding to more detailed safety evaluation. The relationship between crash occurrence and factors such as traffic flow and roadway geometric characteristics has been extensively explored for a better understanding of crash mechanisms. In this study, a multi-level Bayesian framework has been developed in an effort to identify the crash contributing factors on an urban expressway in the Central Florida area. Two types of traffic data from the Automatic Vehicle Identification system, namely the processed data capped at the speed limit and the unprocessed data retaining the original speed, were incorporated in the analysis along with road geometric information. The model framework was proposed to account for the hierarchical data structure and the heterogeneity among the traffic and roadway geometric data. Multi-level and random parameters models were constructed and compared with the Negative Binomial model under the Bayesian inference framework. Results showed that the unprocessed traffic data was superior. Both multi-level models and random parameters models outperformed the Negative Binomial model, and the models with random parameters achieved the best model fitting. The contributing factors identified imply that on the urban expressway lower speed and higher speed variation could significantly increase the crash likelihood. Other significant geometric factors included auxiliary lanes and horizontal curvature.
Directory of Open Access Journals (Sweden)
Kai Cao
2016-05-01
Objective: To explore the spatial-temporal interaction effect within a Bayesian framework and to probe the ecological influential factors for tuberculosis. Methods: Six different statistical models containing parameters of time, space, spatial-temporal interaction and their combination were constructed based on a Bayesian framework. The optimum model was selected according to the deviance information criterion (DIC) value. Coefficients of climate variables were then estimated using the best fitting model. Results: The model containing the spatial-temporal interaction parameter was the best fitting one, with the smallest DIC value (−4,508,660). Ecological analysis results showed the relative risks (RRs) of average temperature, rainfall, wind speed, humidity, and air pressure were 1.00324 (95% CI, 1.00150–1.00550), 1.01010 (95% CI, 1.01007–1.01013), 0.83518 (95% CI, 0.93732–0.96138), 0.97496 (95% CI, 0.97181–1.01386), and 1.01007 (95% CI, 1.01003–1.01011), respectively. Conclusions: The spatial-temporal interaction was statistically meaningful and the prevalence of tuberculosis was influenced by the time and space interaction effect. Average temperature, rainfall, wind speed, and air pressure influenced tuberculosis. Average humidity had no influence on tuberculosis.
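The model-selection criterion named in this abstract can be made concrete with a small sketch. The code below computes the DIC as the posterior mean deviance plus the effective number of parameters p_D for a toy normal model with known standard deviation; the model, the data and the posterior draws are hypothetical and unrelated to the tuberculosis models in the study.

```python
import math

def deviance(data, mu, sigma=1.0):
    """Deviance (-2 * log-likelihood) of a normal model with known sigma."""
    n = len(data)
    sq = sum((x - mu) ** 2 for x in data)
    return sq / sigma ** 2 + n * math.log(2.0 * math.pi * sigma ** 2)

def dic(data, mu_draws, sigma=1.0):
    """Return (DIC, p_D) from posterior draws of the mean.

    DIC = mean deviance + p_D, where p_D = mean deviance - deviance at the posterior mean.
    """
    mean_dev = sum(deviance(data, m, sigma) for m in mu_draws) / len(mu_draws)
    dev_at_mean = deviance(data, sum(mu_draws) / len(mu_draws), sigma)
    p_d = mean_dev - dev_at_mean
    return mean_dev + p_d, p_d

# Hypothetical data and (unrealistically few) posterior draws, for illustration only.
value, p_d = dic([1.0, 2.0, 3.0], [1.5, 2.0, 2.5])
print(round(value, 4), round(p_d, 4))
```

When candidate models are compared on the same data, the one with the smaller DIC is preferred, which matches the selection rule described in the abstract.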
Busschaert, P; Geeraerd, A H; Uyttendaele, M; Van Impe, J F
2011-06-01
Microbiological contamination data are often censored because of the presence of non-detects or because measurement outcomes are known only to be smaller than, greater than, or between certain boundary values imposed by the laboratory procedures. Therefore, it is not straightforward to fit distributions that summarize contamination data for use in quantitative microbiological risk assessment, especially when variability and uncertainty are to be characterized separately. In this paper, distributions are fit using Bayesian analysis, and results are compared to results obtained with a methodology based on maximum likelihood estimation and the non-parametric bootstrap method. The Bayesian model is also extended hierarchically to estimate the effects of the individual elements of a covariate such as, for example, on a national level, the food processing company where the analyzed food samples were processed, or, on an international level, the geographical origin of contamination data. Including this extra information allows a risk assessor to differentiate between several scenarios and increase the specificity of the estimate of risk of illness, or to compare different scenarios to each other. Furthermore, inference is made on the predictive importance of several different covariates while taking into account uncertainty, making it possible to indicate which covariates are influential factors determining contamination.
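The handling of non-detects described above rests on a likelihood in which each censored observation contributes a cumulative probability rather than a density. The following is a minimal sketch under stated assumptions: concentrations are taken to be normally distributed on a log10 scale, only left-censoring is covered, and the function name `censored_loglik` and the example values are hypothetical.

```python
import math
from statistics import NormalDist

def censored_loglik(mu, sigma, detects, nondetect_limits):
    """Log-likelihood of a normal model with left-censored observations.

    `detects` are measured values; `nondetect_limits` are detection limits for
    samples known only to lie below that limit (assumed log10 scale).
    """
    dist = NormalDist(mu, sigma)
    ll = sum(math.log(dist.pdf(x)) for x in detects)             # exact observations
    ll += sum(math.log(dist.cdf(lim)) for lim in nondetect_limits)  # P(value < limit)
    return ll

# Hypothetical data: three quantified samples and two non-detects below 0.5 log10 units.
detects = [1.2, 0.8, 1.5]
limits = [0.5, 0.5]
print(censored_loglik(1.0, 0.5, detects, limits))
```

The same function could serve as the likelihood term in either a maximum likelihood fit or a Bayesian posterior, the two routes the abstract compares.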
A Bayesian analysis of the 69 highest energy cosmic rays detected by the Pierre Auger Observatory
Khanin, Alexander
2016-01-01
The origins of ultra-high energy cosmic rays (UHECRs) remain an open question. Several attempts have been made to cross-correlate the arrival directions of the UHECRs with catalogs of potential sources, but no definite conclusion has been reached. We report a Bayesian analysis of the 69 events from the Pierre Auger Observatory (PAO) that aims to determine the fraction of the UHECRs that originate from known AGNs in the Veron-Cetty & Veron (VCV) catalog, as well as AGNs detected with the Swift Burst Alert Telescope (Swift-BAT), galaxies from the 2MASS Redshift Survey (2MRS), and an additional volume-limited sample of 17 nearby AGNs. The study makes use of a multi-level Bayesian model of UHECR injection, propagation and detection. We find that for reasonable ranges of prior parameters, the Bayes factors disfavour a purely isotropic model. For fiducial values of the model parameters, we report 68% credible intervals for the fraction of source originating UHECRs of 0.09+0.05-0.04, 0.25+0.09-0.08, 0.24+0.12-0....
Onisko, Agnieszka; Druzdzel, Marek J.; Austin, R. Marshall
2016-01-01
Background: Classical statistics is a well-established approach in the analysis of medical data. While the medical community seems to be familiar with the concept of a statistical analysis and its interpretation, the Bayesian approach, argued by many of its proponents to be superior to the classical frequentist approach, is still not well-recognized in the analysis of medical data. Aim: The goal of this study is to encourage data analysts to use the Bayesian approach, such as modeling with graphical probabilistic networks, as an insightful alternative to classical statistical analysis of medical data. Materials and Methods: This paper offers a comparison of two approaches to analysis of medical time series data: (1) classical statistical approach, such as the Kaplan–Meier estimator and the Cox proportional hazards regression model, and (2) dynamic Bayesian network modeling. Our comparison is based on time series cervical cancer screening data collected at Magee-Womens Hospital, University of Pittsburgh Medical Center over 10 years. Results: The main outcomes of our comparison are cervical cancer risk assessments produced by the three approaches. However, our analysis discusses also several aspects of the comparison, such as modeling assumptions, model building, dealing with incomplete data, individualized risk assessment, results interpretation, and model validation. Conclusion: Our study shows that the Bayesian approach is (1) much more flexible in terms of modeling effort, and (2) it offers an individualized risk assessment, which is more cumbersome for classical statistical approaches. PMID:28163973
Status of the 2D Bayesian analysis of XENON100 data
Energy Technology Data Exchange (ETDEWEB)
Schindler, Stefan [JGU, Staudingerweg 7, 55128 Mainz (Germany)
2015-07-01
The XENON100 experiment is located in the underground laboratory at LNGS in Italy. Since Dark Matter particles will only interact very rarely with normal matter, an environment with ultra-low background, which is shielded from cosmic radiation, is needed. The standard analysis of XENON100 data has made use of the profile likelihood method (a frequentist approach) and still provides one of the most sensitive exclusion limits to WIMP Dark Matter. Here we present work towards a Bayesian approach to the analysis of XENON100 data, where we attempt to include the measured primary (S1) and secondary (S2) scintillation signals in a more complete way. The background and signal models in the S1-S2 space have to be defined and a corresponding likelihood function, describing these models, has to be constructed.
BaalChIP: Bayesian analysis of allele-specific transcription factor binding in cancer genomes.
de Santiago, Ines; Liu, Wei; Yuan, Ke; O'Reilly, Martin; Chilamakuri, Chandra Sekhar Reddy; Ponder, Bruce A J; Meyer, Kerstin B; Markowetz, Florian
2017-02-24
Allele-specific measurements of transcription factor binding from ChIP-seq data are key to dissecting the allelic effects of non-coding variants and their contribution to phenotypic diversity. However, most methods of detecting an allelic imbalance assume diploid genomes. This assumption severely limits their applicability to cancer samples with frequent DNA copy-number changes. Here we present a Bayesian statistical approach called BaalChIP to correct for the effect of background allele frequency on the observed ChIP-seq read counts. BaalChIP allows the joint analysis of multiple ChIP-seq samples across a single variant and outperforms competing approaches in simulations. Using 548 ENCODE ChIP-seq and six targeted FAIRE-seq samples, we show that BaalChIP effectively corrects allele-specific analysis for copy-number variation and increases the power to detect putative cis-acting regulatory variants in cancer genomes.
Bayesian methods for model uncertainty analysis with application to future sea level rise
Energy Technology Data Exchange (ETDEWEB)
Patwardhan, A.; Small, M.J.
1992-01-01
In no other area is the need for effective analysis of uncertainty more evident than in the problem of evaluating the consequences of increasing atmospheric concentrations of radiatively active gases. The major consequence of concern is global warming, with related environmental effects that include changes in local patterns of precipitation, soil moisture, forest and agricultural productivity, and a potential increase in global mean sea level. In order to identify an optimum set of responses to sea level change, a full characterization of the uncertainties associated with the predictions of future sea level rise is essential. The paper addresses the use of data for identifying and characterizing uncertainties in model parameters and predictions. The Bayesian Monte Carlo method is formally presented and elaborated, and applied to the analysis of the uncertainty in a predictive model for global mean sea level change.
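As commonly described, a Bayesian Monte Carlo analysis draws parameter values from the prior, runs the model for each draw, and weights each draw by the likelihood of the observed data. The sketch below illustrates that idea on a toy problem; `toy_model`, the uniform prior on [0, 10] and the Gaussian observation error are assumptions for illustration and do not represent the sea level model of the paper.

```python
import math
import random

def toy_model(theta):
    """Hypothetical stand-in for a predictive model: the prediction equals the parameter."""
    return theta

def bayesian_monte_carlo(observation, obs_sigma, n_draws=5000, seed=0):
    """Posterior mean of theta from prior draws weighted by a Gaussian likelihood."""
    rng = random.Random(seed)
    draws = [rng.uniform(0.0, 10.0) for _ in range(n_draws)]  # assumed uniform prior
    weights = [
        math.exp(-0.5 * ((observation - toy_model(th)) / obs_sigma) ** 2)
        for th in draws
    ]
    total = sum(weights)
    return sum(w * th for w, th in zip(weights, draws)) / total

# With an observation of 7.0 the weighted mean moves from the prior mean (5.0) towards 7.0.
print(bayesian_monte_carlo(observation=7.0, obs_sigma=0.5))
```

The same weights can be reused to form weighted predictive distributions, which is how uncertainty in parameters carries through to uncertainty in predictions.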
Bayesian Analysis of Inertial Confinement Fusion Experiments at the National Ignition Facility
Gaffney, J A; Sonnad, V; Libby, S B
2012-01-01
We develop a Bayesian inference method that allows the efficient determination of several interesting parameters from complicated high-energy-density experiments performed on the National Ignition Facility (NIF). The model is based on an exploration of phase space using the hydrodynamic code HYDRA. A linear model is used to describe the effect of nuisance parameters on the analysis, allowing an analytic likelihood to be derived that can be determined from a small number of HYDRA runs and then used in existing advanced statistical analysis methods. This approach is applied to a recent experiment in order to determine the carbon opacity and X-ray drive; it is found that the inclusion of prior expert knowledge and fluctuations in capsule dimensions and chemical composition significantly improve the agreement between experiment and theoretical opacity calculations. A parameterisation of HYDRA results is used to test the application of both Markov chain Monte Carlo (MCMC) and genetic algorithm (GA) techniques to e...
Bayesian Analysis of Hybrid EoS based on Astrophysical Observational Data
Alvarez-Castillo, David; Blaschke, David; Grigorian, Hovik
2014-01-01
The most basic features of a neutron star (NS) are its radius and mass, which so far have not been well determined simultaneously for a single object. In some cases masses are precisely measured, as in the case of binary systems, but radii are quite uncertain. On the other hand, for isolated neutron stars some radius and mass measurements exist but lack the necessary precision to probe their interiors. However, the present observational data allow a probabilistic estimation of the internal structure of the star. In this work, a preliminary probabilistic estimation of the superdense stellar matter equation of state using Bayesian analysis and modelling of relativistic configurations of neutron stars is shown. This analysis is important for research on the existence of quark-gluon plasma in massive (around 2 solar masses) neutron stars.
Bayesian Analysis of $C_{x'}$ and $C_{z'}$ Double Polarizations in Kaon Photoproduction
Hutauruk, P T P
2010-01-01
The latest experimental data for the $C_{x'}$ and $C_{z'}$ double polarizations in the $\gamma + p \to K^{+} + \Lambda$ reaction have been analyzed. In theoretical calculations, all of these observables can be classified into four Legendre classes and represented by associated Legendre polynomial functions \cite{fasano92}. In this analysis we attempt to determine the best data model for both observables. We use the Bayesian technique to select the best model by calculating the posterior probabilities and comparing them among the models. The posterior probabilities for each data model are computed using nested sampling integration. From this analysis we conclude that the $C_{x'}$ and $C_{z'}$ double polarizations require associated Legendre polynomials of order two and three, respectively, to describe the data well. The extracted coefficients of each observable are also presented; they show the structure of baryon resonances qualitatively.
Bayesian analysis for exponential random graph models using the adaptive exchange sampler
Jin, Ick Hoon
2013-01-01
Exponential random graph models have been widely used in social network analysis. However, these models are extremely difficult to handle from a statistical viewpoint, because of the existence of intractable normalizing constants. In this paper, we consider a fully Bayesian analysis for exponential random graph models using the adaptive exchange sampler, which solves the issue of intractable normalizing constants encountered in Markov chain Monte Carlo (MCMC) simulations. The adaptive exchange sampler can be viewed as a MCMC extension of the exchange algorithm, and it generates auxiliary networks via an importance sampling procedure from an auxiliary Markov chain running in parallel. The convergence of this algorithm is established under mild conditions. The adaptive exchange sampler is illustrated using a few social networks, including the Florentine business network, molecule synthetic network, and dolphins network. The results indicate that the adaptive exchange algorithm can produce more accurate estimates than approximate exchange algorithms, while maintaining the same computational efficiency.
Directory of Open Access Journals (Sweden)
Hakan Sarikaya
OBJECTIVE: To compare the effects of antiplatelets and anticoagulants on stroke and death in patients with acute cervical artery dissection. DESIGN: Systematic review with Bayesian meta-analysis. DATA SOURCES: The reviewers searched MEDLINE and EMBASE from inception to November 2012, checked reference lists, and contacted authors. STUDY SELECTION: Studies were eligible if they were randomised, quasi-randomised or observational comparisons of antiplatelets and anticoagulants in patients with cervical artery dissection. DATA EXTRACTION: Data were extracted by one reviewer and checked by another. Bayesian techniques were used to appropriately account for studies with scarce event data and imbalances in the size of comparison groups. DATA SYNTHESIS: Thirty-seven studies (1991 patients) were included. We found no randomised trial. The primary analysis revealed a large treatment effect in favour of antiplatelets for preventing the primary composite outcome of ischaemic stroke, intracranial haemorrhage or death within the first 3 months after treatment initiation (relative risk 0.32, 95% credibility interval 0.12 to 0.63), while the degree of between-study heterogeneity was moderate (τ² = 0.18). In an analysis restricted to studies of higher methodological quality, the possible advantage of antiplatelets over anticoagulants was less obvious than in the main analysis (relative risk 0.73, 95% credibility interval 0.17 to 2.30). CONCLUSION: In view of these results and the safety advantages, easier usage and lower cost of antiplatelets, we conclude that antiplatelets should be given precedence over anticoagulants as a first line treatment in patients with cervical artery dissection unless results of an adequately powered randomised trial suggest the opposite.
Dachian, Serguei
2010-01-01
Different change-point type models encountered in statistical inference for stochastic processes give rise to different limiting likelihood ratio processes. In a previous paper by one of the authors it was established that one of these likelihood ratios, which is an exponential functional of a two-sided Poisson process driven by some parameter, can be approximated (for sufficiently small values of the parameter) by another one, which is an exponential functional of a two-sided Brownian motion. In this paper we consider yet another likelihood ratio, which is the exponent of a two-sided compound Poisson process driven by some parameter. We establish that, similarly to the Poisson type one, the compound Poisson type likelihood ratio can be approximated by the Brownian type one for sufficiently small values of the parameter. We also discuss the asymptotics for large values of the parameter and illustrate the results by numerical simulations.
Vázquez-Polo, Francisco-Jose; Moreno, Elías; Negrín, Miguel A; Martel, Maria
2015-04-01
In most cases, including those of discrete random variables, statistical meta-analysis is carried out using the normal random effect model. The authors argue that normal approximation does not always properly reflect the underlying uncertainty of the original discrete data. Furthermore, in the presence of rare events the results from this approximation can be very poor. This review proposes a Bayesian meta-analysis to address binary outcomes from sparse data and also introduces a simple way to examine the sensitivity of the quantities of interest in the meta-analysis with respect to the structure dependence selected. The findings suggest that for binary outcomes data it is possible to develop a Bayesian procedure, which can be directly applied to sparse data without ad hoc corrections. By choosing a suitable class of linking distributions, the authors found that a Bayesian robustness study can be easily implemented. For illustrative purposes, an example with real data is analyzed using the proposed Bayesian meta-analysis for binomial sparse data.
Lee, Eun Gyung; Kim, Seung Won; Feigley, Charles E; Harper, Martin
2013-01-01
This study introduces two semi-quantitative methods, Structured Subjective Assessment (SSA) and Control of Substances Hazardous to Health (COSHH) Essentials, in conjunction with two-dimensional Monte Carlo simulations for determining prior probabilities. Prior distribution using expert judgment was included for comparison. Practical applications of the proposed methods were demonstrated using personal exposure measurements of isoamyl acetate in an electronics manufacturing facility and of isopropanol in a printing shop. Applicability of these methods in real workplaces was discussed based on the advantages and disadvantages of each method. Although these methods could not be completely independent of expert judgments, this study demonstrated a methodological improvement in the estimation of the prior distribution for the Bayesian decision analysis tool. The proposed methods provide a logical basis for the decision process by considering determinants of worker exposure.
Fermi's paradox, extraterrestrial life and the future of humanity: a Bayesian analysis
Verendel, Vilhelm; Häggström, Olle
2017-01-01
The Great Filter interpretation of Fermi's great silence asserts that Npq is not a very large number, where N is the number of potentially life-supporting planets in the observable universe, p is the probability that a randomly chosen such planet develops intelligent life to the level of present-day human civilization, and q is the conditional probability that it then goes on to develop a technological supercivilization visible all over the observable universe. Evidence suggests that N is huge, which implies that pq is very small. Hanson (1998) and Bostrom (2008) have argued that the discovery of extraterrestrial life would point towards p not being small and therefore a very small q, which can be seen as bad news for humanity's prospects of colonizing the universe. Here we investigate whether a Bayesian analysis supports their argument, and the answer turns out to depend critically on the choice of prior distribution.
Constraints on cosmic-ray propagation models from a global Bayesian analysis
Trotta, R; Moskalenko, I V; Porter, T A; de Austri, R Ruiz; Strong, A W
2010-01-01
Research in many areas of modern physics, such as indirect searches for dark matter and particle acceleration in SNR shocks, relies heavily on studies of cosmic rays (CRs) and associated diffuse emissions (radio, microwave, X-rays, gamma rays). While very detailed numerical models of CR propagation exist, a quantitative statistical analysis of such models has so far been hampered by the large computational effort that those models require. Although statistical analyses have been carried out before using semi-analytical models (where the computation is much faster), the evaluation of the results obtained from such models is difficult, as they necessarily suffer from many simplifying assumptions. The main objective of this paper is to present a working method for a full Bayesian parameter estimation for a numerical CR propagation model. For this study, we use the GALPROP code, the most advanced of its kind, that uses astrophysical information, nuclear and particle data as input to self-consistently predict ...
A Bayesian based functional mixed-effects model for analysis of LC-MS data.
Befekadu, Getachew K; Tadesse, Mahlet G; Ressom, Habtom W
2009-01-01
A Bayesian multilevel functional mixed-effects model with group specific random-effects is presented for analysis of liquid chromatography-mass spectrometry (LC-MS) data. The proposed framework allows alignment of LC-MS spectra with respect to both retention time (RT) and mass-to-charge ratio (m/z). Affine transformations are incorporated within the model to account for any variability along the RT and m/z dimensions. Simultaneous posterior inference of all unknown parameters is accomplished via Markov chain Monte Carlo method using the Gibbs sampling algorithm. The proposed approach is computationally tractable and allows incorporating prior knowledge in the inference process. We demonstrate the applicability of our approach for alignment of LC-MS spectra based on total ion count profiles derived from two LC-MS datasets.
Bayesian analysis of general failure data from an ageing distribution: advances in numerical methods
Energy Technology Data Exchange (ETDEWEB)
Procaccia, H.; Villain, B. [Electricite de France (EDF), 93 - Saint-Denis (France); Clarotti, C.A. [ENEA, Casaccia (Italy)
1996-12-31
EDF and ENEA carried out a joint research program for developing the numerical methods and computer codes needed for Bayesian analysis of component lives in the case of ageing. Early results of this study were presented at ESREL'94. Since then the following further steps have been taken: input data have been generalized to the case in which observed lives are censored both on the right and on the left; allowable life distributions are Weibull and gamma, whose parameters are both unknown and can be statistically dependent; allowable priors are histograms relative to different parametrizations of the life distribution of concern; first- and second-order moments of the posterior distributions can be computed. In particular, the covariance gives some important information about the degree of statistical dependence between the parameters of interest. An application of the code to the appearance of stress corrosion cracking in a tube of the PWR Steam Generator system is presented. (authors). 10 refs.
Shou, Yiyun; Smithson, Michael
2015-03-01
Conventional measures of predictor importance in linear models are applicable only when the assumption of homoscedasticity is satisfied. Moreover, they cannot be adapted to evaluating predictor importance in models of heteroscedasticity (i.e., dispersion), an issue that seems not to have been systematically addressed in the literature. We compare two suitable approaches, Dominance Analysis (DA) and Bayesian Model Averaging (BMA), for simultaneously evaluating predictor importance in models of location and dispersion. We apply them to the beta general linear model as a test-case, illustrating this with an example using real data. Simulations using several different model structures, sample sizes, and degrees of multicollinearity suggest that both DA and BMA largely agree on the relative importance of predictors of the mean, but differ when ranking predictors of dispersion. The main implication of these findings for researchers is that the choice between DA and BMA is most important when they wish to evaluate the importance of predictors of dispersion.
Directory of Open Access Journals (Sweden)
Miriam Marco
2017-02-01
This paper aimed to analyze the spatial distribution of drug-related police interventions and the neighborhood characteristics influencing these spatial patterns. To this end, police officers ranked each census block group in Valencia, Spain (N = 552), providing an index of drug-related police interventions. Data from the City Statistics Office and observational variables were used to analyze neighborhood characteristics. Distance to the police station was used as the control variable. A Bayesian ecological analysis was performed with a spatial beta regression model. Results indicated that high physical decay, low socioeconomic status, and high immigrant concentration were associated with high levels of drug-related police interventions after adjustment for distance to the police station. Results illustrate the importance of a spatial approach to understanding crime.
Selection of Trusted Service Providers by Enforcing Bayesian Analysis in iVCE
Institute of Scientific and Technical Information of China (English)
GU Bao-jun; LI Xiao-yong; WANG Wei-nong
2008-01-01
The initiative of the internet-based virtual computing environment (iVCE) aims to provide end users and applications with a harmonious, trustworthy and transparent integrated computing environment that facilitates the sharing of network resources and collaboration between applications. Trust management is an elementary component of iVCE. The uncertain and dynamic characteristics of iVCE require trust management to be subjective, based on historical evidence, and context dependent. This paper presents a Bayesian analysis-based trust model, which aims to help active agents select appropriate trusted services in iVCE. Simulations are carried out to analyze the properties of the trust model; they show that the subjective prior information strongly influences trust evaluation and that the model stimulates positive interactions.
Bayesian design and analysis of computer experiments: Use of derivatives in surface prediction
Energy Technology Data Exchange (ETDEWEB)
Morris, M.D.; Mitchell, T.J. (Oak Ridge National Lab., TN (USA)); Ylvisaker, D. (California Univ., Los Angeles, CA (USA). Dept. of Mathematics)
1991-06-01
The work of Currin et al. and others in developing "fast predictive approximations" of computer models is extended for the case in which derivatives of the output variable of interest with respect to input variables are available. In addition to describing the calculations required for the Bayesian analysis, the issue of experimental design is also discussed, and an algorithm is described for constructing "maximin distance" designs. An example is given based on a demonstration model of eight inputs and one output, in which predictions based on a maximin design, a Latin hypercube design, and two "compromise" designs are evaluated and compared. 12 refs., 2 figs., 6 tabs.
Takamizawa, Hisashi; Itoh, Hiroto; Nishiyama, Yutaka
2016-10-01
In order to understand neutron irradiation embrittlement in high fluence regions, statistical analysis using the Bayesian nonparametric (BNP) method was performed for the Japanese surveillance and material test reactor irradiation database. The BNP method is essentially expressed as an infinite summation of normal distributions, with input data being subdivided into clusters with identical statistical parameters, such as mean and standard deviation, for each cluster to estimate shifts in ductile-to-brittle transition temperature (DBTT). The clusters typically depend on chemical compositions, irradiation conditions, and the irradiation embrittlement. Specific variables contributing to the irradiation embrittlement include the content of Cu, Ni, P, Si, and Mn in the pressure vessel steels, neutron flux, neutron fluence, and irradiation temperatures. It was found that the measured shifts of DBTT correlated well with the calculated ones. Data associated with the same materials were subdivided into the same clusters even if neutron fluences were increased.
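As a rough illustration of the "infinite summation of normal distributions" described in the abstract above, one common way to write such a model is a Dirichlet process mixture with stick-breaking weights. The abstract does not say which construction was used, so the stick-breaking form and the symbols $\pi_k$, $v_k$, $\alpha$, $\mu_k$ and $\sigma_k$ below are assumptions made only for illustration:

```latex
p(\Delta T) = \sum_{k=1}^{\infty} \pi_k \, \mathcal{N}\!\left(\Delta T \mid \mu_k, \sigma_k^2\right),
\qquad
\pi_k = v_k \prod_{j<k} (1 - v_j),
\qquad
v_k \sim \mathrm{Beta}(1, \alpha)
```

Here $\Delta T$ stands for the DBTT shift, each component $k$ plays the role of a cluster with its own mean $\mu_k$ and standard deviation $\sigma_k$, and the weights $\pi_k$ sum to one. In practice only a finite number of clusters receive appreciable weight for a finite data set.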
Bayesian Reliability Analysis of Non-Stationarity in Multi-agent Systems
Directory of Open Access Journals (Sweden)
TONT Gabriela
2013-05-01
The Bayesian methods provide information about the meaningful parameters in a statistical analysis obtained by combining the prior and sampling distributions to form the posterior distribution of the parameters. The desired inferences are obtained from this joint posterior. An estimation strategy for hierarchical models, where the resulting joint distribution of the associated model parameters cannot be evaluated analytically, is to use sampling algorithms, known as Markov Chain Monte Carlo (MCMC) methods, from which approximate solutions can be obtained. Both serial and parallel configurations of subcomponents are permitted. The capability of the time-dependent method to describe a multi-state system is assessed in a case study of the operational situation of the studied system. The rationality and validity of the presented model are demonstrated via a case study. The effect of randomness of the structural parameters is also examined.
Intuitive logic revisited: new data and a Bayesian mixed model meta-analysis.
Singmann, Henrik; Klauer, Karl Christoph; Kellen, David
2014-01-01
Recent research on syllogistic reasoning suggests that the logical status (valid vs. invalid) of even difficult syllogisms can be intuitively detected via differences in conceptual fluency between logically valid and invalid syllogisms when participants are asked to rate how much they like a conclusion following from a syllogism (Morsanyi & Handley, 2012). These claims of an intuitive logic are at odds with most theories on syllogistic reasoning which posit that detecting the logical status of difficult syllogisms requires effortful and deliberate cognitive processes. We present new data replicating the effects reported by Morsanyi and Handley, but show that this effect is eliminated when controlling for a possible confound in terms of conclusion content. Additionally, we reanalyze three studies (n = 287) without this confound with a Bayesian mixed model meta-analysis (i.e., controlling for participant and item effects) which provides evidence for the null-hypothesis and against Morsanyi and Handley's claim.
Cha, YoonKyung; Kim, Young Mo; Choi, Jae-Woo; Sthiannopkao, Suthipong; Cho, Kyung Hwa
2016-01-01
In the Mekong River basin, groundwater from tube-wells is a major drinking water source. However, arsenic (As) contamination in groundwater resources has become a critical issue in the watershed. In this study, As species such as total As (AsTOT), As(III), and As(V), were monitored across the watershed to investigate their characteristics and inter-relationships with water quality parameters, including pH and redox potential (Eh). The data illustrated a dramatic change in the relationship between AsTOT and Eh over a specific Eh range, suggesting the importance of Eh in predicting AsTOT. Thus, a Bayesian change-point model was developed to predict AsTOT concentrations based on Eh and pH, to determine changes in the AsTOT-Eh relationship. The model captured the Eh change-point (∼-100±15mV), which was compatible with the data. Importantly, the inclusion of this change-point in the model resulted in improved model fit and prediction accuracy; AsTOT concentrations were strongly negatively related to Eh values higher than the change-point. The process underlying this relationship was subsequently posited to be the reductive dissolution of mineral oxides and As release. Overall, AsTOT showed a weak positive relationship with Eh at a lower range, similar to those commonly observed in the Mekong River basin delta. It is expected that these results would serve as a guide for establishing public health strategies in the Mekong River Basin.
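The change-point idea in the abstract above can be sketched in a few lines. This is a deliberately simplified toy, not the authors' model: it assumes a piecewise-constant mean response on either side of a single threshold, Gaussian noise with a known standard deviation `sigma`, flat priors on the two segment means, and a uniform prior over candidate thresholds. The function name `changepoint_posterior` and all parameter names are hypothetical.

```python
import numpy as np


def changepoint_posterior(x, y, sigma):
    """Posterior over a single change-point in x for a two-level mean model.

    Assumptions (illustrative only): y is Gaussian around one constant mean
    left of the change-point and another constant mean right of it, with
    known noise standard deviation `sigma`, flat priors on both means and a
    uniform prior over candidate split positions.
    """
    order = np.argsort(x)
    x = np.asarray(x, dtype=float)[order]
    y = np.asarray(y, dtype=float)[order]
    n = len(y)

    # Candidate splits: the first k sorted points form the left segment.
    # Keep at least two points in each segment.
    ks = np.arange(2, n - 1)
    log_post = np.empty(len(ks))
    for i, k in enumerate(ks):
        left, right = y[:k], y[k:]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        # Log marginal likelihood after integrating out the two segment
        # means under flat priors (constants common to all k are dropped).
        log_post[i] = -sse / (2.0 * sigma**2) - 0.5 * np.log(k) - 0.5 * np.log(n - k)

    # Normalise in a numerically stable way.
    log_post -= log_post.max()
    post = np.exp(log_post)
    post /= post.sum()

    # Report each split as the midpoint between its neighbouring x values.
    thresholds = 0.5 * (x[ks - 1] + x[ks])
    return thresholds, post
```

Calling `changepoint_posterior(eh, as_tot, sigma)` with arrays of predictor and response values returns the candidate thresholds and their posterior probabilities; a credible interval for the change-point can then be read off the cumulative sum of `post`. A model closer to the one described would replace the constant segment means with regressions on Eh and pH.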
Linkov, Igor; Massey, Olivia; Keisler, Jeff; Rusyn, Ivan; Hartung, Thomas
2015-01-01
"Weighing" available evidence in the process of decision-making is unavoidable, yet it is one step that routinely raises suspicions: what evidence should be used, how much does it weigh, and whose thumb may be tipping the scales? This commentary aims to evaluate the current state and future roles of various types of evidence for hazard assessment as it applies to environmental health. In its recent evaluation of the US Environmental Protection Agency's Integrated Risk Information System assessment process, the National Research Council committee singled out the term "weight of evidence" (WoE) for critique, deeming the process too vague and detractive to the practice of evaluating human health risks of chemicals. Moving the methodology away from qualitative, vague and controversial methods towards generalizable, quantitative and transparent methods for appropriately managing diverse lines of evidence is paramount for both regulatory and public acceptance of the hazard assessments. The choice of terminology notwithstanding, a number of recent Bayesian WoE-based methods, the emergence of multi criteria decision analysis for WoE applications, as well as the general principles behind the foundational concepts of WoE, show promise in how to move forward and regain trust in the data integration step of the assessments. We offer our thoughts on the current state of WoE as a whole and while we acknowledge that many WoE applications have been largely qualitative and subjective in nature, we see this as an opportunity to turn WoE towards a quantitative direction that includes Bayesian and multi criteria decision analysis.
Bayesian Analysis on Abduction
Institute of Scientific and Technical Information of China (English)
袁继红; 陈晓平
2014-01-01
Charles S. Peirce argued that abduction is a third kind of reasoning, different from both deduction and induction. However, Peirce's concept of abduction is ambiguous, which results in a paradox about abduction: on the one hand, abduction is distinct from induction; on the other hand, abduction belongs to induction. Based on the Bayesian analysis of induction, this paper discusses two typical approaches to resolving the paradox: Hintikka's distinction between definitory rules and strategic rules, and Lipton's inference to the best explanation (IBE). The analysis shows that the paradox about abduction cannot be eliminated by these two approaches but can be eliminated in a Bayesian framework, which can accommodate both abductive induction and abduction.
Zhang, Hua; Huo, Mingdong; Chao, Jianqian; Liu, Pei
2016-01-01
Background: Hepatitis B virus (HBV) infection is a major public health problem; timely antiviral treatment can significantly prevent the progression of liver damage from HBV by slowing or stopping viral replication. In this study we applied a Bayesian approach to cost-effectiveness analysis, using Markov chain Monte Carlo (MCMC) simulation methods for the relevant evidence input into the model, to evaluate the cost-effectiveness of entecavir (ETV) and lamivudine (LVD) therapy for chronic hepatitis B (CHB) in Jiangsu, China, thus providing information to the public health system on CHB therapy. Methods: An eight-stage Markov model was developed, and a hypothetical cohort of 35-year-old HBeAg-positive patients with CHB was entered into the model. Treatment regimens were LVD 100 mg daily and ETV 0.5 mg daily. The transition parameters were derived either from systematic reviews of the literature or from previous economic studies. The outcome measures were life-years, quality-adjusted life-years (QALYs), and expected costs associated with the treatments and disease progression. For the Bayesian models, all analyses were implemented in WinBUGS version 1.4. Results: Expected cost, life expectancy, and QALYs decreased with age. Cost-effectiveness increased with age. The expected cost of ETV was lower than that of LVD, while life expectancy and QALYs were higher than those of LVD; the ETV strategy was more cost-effective. Costs and benefits from the Monte Carlo simulation were very close to the exact-form results among the group, but the standard deviation of each group indicated large differences between individual patients. Conclusions: Compared with lamivudine, entecavir is the more cost-effective option. CHB patients should accept antiviral treatment as soon as possible, since the lower the age, the more cost-effective the treatment. The Monte Carlo simulation yielded cost and effectiveness distributions, indicating that our Markov model is robust. PMID:27574976
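The kind of Markov cohort calculation that underlies such a cost-effectiveness model can be sketched as follows. This is a generic, deterministic illustration and not the study's eight-stage model or its WinBUGS implementation: the function name `run_cohort`, the per-cycle discount rate, and any transition matrix, costs or utilities passed in are hypothetical placeholders chosen by the caller.

```python
import numpy as np


def run_cohort(P, cost, utility, start, cycles, discount=0.03):
    """Accumulate discounted cost and QALYs for a Markov cohort model.

    P       : (S, S) row-stochastic transition matrix per cycle
    cost    : length-S cost incurred per cycle in each state
    utility : length-S quality-of-life weight per cycle in each state
    start   : length-S initial distribution of the cohort over states
    All inputs are illustrative assumptions supplied by the caller.
    """
    P = np.asarray(P, dtype=float)
    cost = np.asarray(cost, dtype=float)
    utility = np.asarray(utility, dtype=float)
    state = np.asarray(start, dtype=float)

    total_cost = 0.0
    total_qaly = 0.0
    for t in range(cycles):
        weight = 1.0 / (1.0 + discount) ** t  # discount factor for cycle t
        total_cost += weight * (state @ cost)
        total_qaly += weight * (state @ utility)
        state = state @ P  # advance the cohort by one cycle
    return total_cost, total_qaly
```

Running the function once per treatment strategy, each with its own transition matrix and drug cost, gives the incremental cost and incremental QALYs whose ratio is the usual cost-effectiveness summary. A Bayesian analysis of the kind described would repeat this calculation over posterior draws of the transition parameters rather than a single fixed matrix.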
Michelioudakis, Dimitrios G.; Hobbs, Richard W.; Caiado, Camila C. S.
2016-04-01
multivariate posterior distribution. The novelty of our approach and the major difference compared to the traditional semblance spectrum velocity analysis procedure is the calculation of uncertainty of the output model. As the model is able to estimate the credibility intervals of the corresponding interval velocities, we can produce the most probable PSDM images in an iterative manner. The depths extracted using our statistical algorithm are in very good agreement with the key horizons retrieved from the drilled core DSDP-258, showing that the Bayesian model is able to control the depth migration of the seismic data and estimate the uncertainty to the drilling targets.
Sea-level variability in tide-gauge and geological records: An empirical Bayesian analysis (Invited)
Kopp, R. E.; Hay, C.; Morrow, E.; Mitrovica, J. X.; Horton, B.; Kemp, A.
2013-12-01
Sea level varies at a range of temporal and spatial scales, and understanding all its significant sources of variability is crucial to building sea-level rise projections relevant to local decision-making. In the twentieth-century record, sites along the U.S. east coast have exhibited typical year-to-year variability of several centimeters. A faster-than-global increase in sea-level rise in the northeastern United States since about 1990 has led some to hypothesize a 'sea-level rise hot spot' in this region, perhaps driven by a trend in the Atlantic Meridional Overturning Circulation related to anthropogenic climate change [1]. However, such hypotheses must be evaluated in the context of natural variability, as revealed by observational and paleo-records. Bayesian and empirical Bayesian statistical approaches are well suited for assimilating data from diverse sources, such as tide-gauges and peats, with differing data availability and uncertainties, and for identifying regionally covarying patterns within these data. We present empirical Bayesian analyses of twentieth-century tide gauge data [2]. We find that the mid-Atlantic region of the United States has experienced a clear acceleration of sea level relative to the global average since about 1990, but this acceleration does not appear to be unprecedented in the twentieth-century record. The rate and extent of this acceleration instead appears comparable to an acceleration observed in the 1930s and 1940s. Both during the earlier episode of acceleration and today, the effect appears to be significantly positively correlated with the Atlantic Multidecadal Oscillation and likely negatively correlated with the North Atlantic Oscillation [2]. The Holocene and Common Era database of geological sea-level rise proxies [3,4] may allow these relationships to be assessed beyond the span of the direct observational record. At a global scale, similar approaches can be employed to look for the spatial fingerprints of land ice
Hip fracture in the elderly: a re-analysis of the EPIDOS study with causal Bayesian networks.
Directory of Open Access Journals (Sweden)
Pascal Caillet
Hip fractures commonly result in permanent disability, institutionalization or death in the elderly. Existing hip-fracture predicting tools are underused in clinical practice, partly due to their lack of intuitive interpretation. By use of a graphical layer, Bayesian network models could increase the attractiveness of fracture prediction tools. Our aim was to study the potential contribution of a causal Bayesian network in this clinical setting. A logistic regression was performed as a standard control approach to check the robustness of the causal Bayesian network approach. EPIDOS is a multicenter study, conducted in an ambulatory care setting in five French cities between 1992 and 1996 and updated in 2010. The study included 7598 women aged 75 years or older, in which fractures were assessed quarterly during 4 years. A causal Bayesian network and a logistic regression were performed on EPIDOS data to describe major variables involved in hip fracture occurrences. Both models had similar association estimations and predictive performances. They detected gait speed and bone mineral density as the variables most involved in the fracture process. The causal Bayesian network showed that gait speed and bone mineral density were directly connected to fracture and seem to mediate the influence of all the other variables included in our model. The logistic regression approach detected multiple interactions involving psychotropic drug use, age and bone mineral density. Both approaches retrieved similar variables as predictors of hip fractures. However, the Bayesian network highlighted the whole web of relations between the variables involved in the analysis, suggesting a possible mechanism leading to hip fracture. According to the latter results, intervention focusing concomitantly on gait speed and bone mineral density may be necessary for an optimal prevention of hip fracture occurrence in elderly people.
Bayesian Analysis of Nonlinear Structural Equation Models with Nonignorable Missing Data
Lee, Sik-Yum
2006-01-01
A Bayesian approach is developed for analyzing nonlinear structural equation models with nonignorable missing data. The nonignorable missingness mechanism is specified by a logistic regression model. A hybrid algorithm that combines the Gibbs sampler and the Metropolis-Hastings algorithm is used to produce the joint Bayesian estimates of…
Integrated survival analysis using an event-time approach in a Bayesian framework
Walsh, Daniel P.; Dreitz, VJ; Heisey, Dennis M.
2015-01-01
Event-time or continuous-time statistical approaches have been applied throughout the biostatistical literature and have led to numerous scientific advances. However, these techniques have traditionally relied on knowing failure times. This has limited application of these analyses, particularly, within the ecological field where fates of marked animals may be unknown. To address these limitations, we developed an integrated approach within a Bayesian framework to estimate hazard rates in the face of unknown fates. We combine failure/survival times from individuals whose fates are known and times of which are interval-censored with information from those whose fates are unknown, and model the process of detecting animals with unknown fates. This provides the foundation for our integrated model and permits necessary parameter estimation. We provide the Bayesian model, its derivation, and use simulation techniques to investigate the properties and performance of our approach under several scenarios. Lastly, we apply our estimation technique using a piece-wise constant hazard function to investigate the effects of year, age, chick size and sex, sex of the tending adult, and nesting habitat on mortality hazard rates of the endangered mountain plover (Charadrius montanus) chicks. Traditional models were inappropriate for this analysis because fates of some individual chicks were unknown due to failed radio transmitters. Simulations revealed biases of posterior mean estimates were minimal (≤ 4.95%), and posterior distributions behaved as expected with RMSE of the estimates decreasing as sample sizes, detection probability, and survival increased. We determined mortality hazard rates for plover chicks were highest at <5 days old and were lower for chicks with larger birth weights and/or whose nest was within agricultural habitats. Based on its performance, our approach greatly expands the range of problems for which event-time analyses can be used by eliminating the
Bias correction and Bayesian analysis of aggregate counts in SAGE libraries
Directory of Open Access Journals (Sweden)
Briggs William M
2010-02-01
Background: Tag-based techniques, such as SAGE, are commonly used to sample the mRNA pool of an organism's transcriptome. Incomplete digestion during the tag formation process may allow for multiple tags to be generated from a given mRNA transcript. The probability of forming a tag varies with its relative location. As a result, the observed tag counts represent a biased sample of the actual transcript pool. In SAGE this bias can be avoided by ignoring all but the 3'-most tag, but doing so discards a large fraction of the observed data. Taking this bias into account should allow more of the available data to be used, leading to increased statistical power. Results: Three new hierarchical models, which directly embed a model for the variation in tag formation probability, are proposed and their associated Bayesian inference algorithms are developed. These models may be applied to libraries at both the tag and aggregate level. Simulation experiments and analysis of real data are used to contrast the accuracy of the various methods. The consequences of tag formation bias are discussed in the context of testing differential expression. A description is given as to how these algorithms can be applied in that context. Conclusions: Several Bayesian inference algorithms that account for tag formation effects are compared with the DPB algorithm, providing clear evidence of superior performance. The accuracy of inferences when using a particular non-informative prior is found to depend on the expression level of a given gene. The multivariate nature of the approach easily allows both univariate and joint tests of differential expression. Calculations demonstrate the potential for false positive and negative findings due to variation in tag formation probabilities across samples when testing for differential expression.
Contingency Plan Analysis of Bayesian Networks
Institute of Scientific and Technical Information of China (English)
徐立
2012-01-01
Uncertainty problems arise frequently during the evaluation, analysis, and execution of contingency plans. The critical path method (CPM), the traditional tool for preparing contingency plans, cannot handle uncertainty. Bayesian networks, which are recommended in this paper, have been widely applied in decision-support applications because of their ability to handle uncertainty, but their application to contingency plan evaluation and analysis is novel. This paper describes the use of Bayesian networks to analyze a contingency plan prepared with the traditional CPM.
Energy Technology Data Exchange (ETDEWEB)
Itagaki, H. [Yokohama National University, Yokohama (Japan). Faculty of Engineering; Asada, H.; Ito, S. [National Aerospace Laboratory, Tokyo (Japan); Shinozuka, M.
1996-12-31
Risk-assessed structural positions in the pressurized fuselage of a transport-type aircraft designed for damage tolerance are taken up as the subject of discussion. A small number of data obtained from inspections of these positions was used in a Bayesian reliability analysis that can estimate a proper non-periodic inspection schedule while also estimating proper values for uncertain factors. As a result, the time period for fatigue crack generation was determined according to the procedure for detailed visual inspections. The analysis method was found capable of estimating values that are thought reasonable, and a proper inspection schedule using these values, despite expressing fatigue crack progress in a very simple form and treating both factors as uncertain. Thus, the effectiveness of the present analysis method was verified. This study also discusses, from different viewpoints, the structural positions, the modeling of fatigue cracks that are generated and develop at those positions, conditions for destruction, damage factors, and inspection capability. This reliability analysis method is thought to be effective for other structures as well, such as offshore structures. 18 refs., 8 figs., 1 tab.
Directory of Open Access Journals (Sweden)
C. Mukherjee
2011-01-01
Inverse modeling applications in atmospheric chemistry are increasingly addressing the challenging statistical issues of data synthesis by adopting refined statistical analysis methods. This paper advances this line of research by addressing several central questions in inverse modeling, focusing specifically on Bayesian statistical computation. Motivated by problems of refining bottom-up estimates of source/sink fluxes of trace gas and aerosols based on increasingly high-resolution satellite retrievals of atmospheric chemical concentrations, we address head-on the need for integrating formal spatial statistical methods of residual error structure in global scale inversion models. We do this using analytically and computationally tractable spatial statistical models, known as conditional autoregressive spatial models, as components of a global inversion framework. We develop Markov chain Monte Carlo methods to explore and fit these spatial structures in an overall statistical framework that simultaneously estimates source fluxes. Additional aspects of the study extend the statistical framework to utilize priors in a more physically realistic manner, and to formally address and deal with missing data in satellite retrievals. We demonstrate the analysis in the context of inferring carbon monoxide (CO) sources constrained by satellite retrievals of column CO from the Measurement of Pollution in the Troposphere (MOPITT) instrument on the TERRA satellite, paying special attention to evaluating performance of the inverse approach using various statistical diagnostic metrics. This is developed using synthetic data generated to resemble MOPITT data to define a proof-of-concept and model assessment, and then in analysis of real MOPITT data.
Application of Bayesian and cost benefit risk analysis in water resources management
Varouchakis, E. A.; Palogos, I.; Karatzas, G. P.
2016-03-01
Decision making is a significant tool in water resources management applications. This technical note approaches a decision dilemma that has not yet been considered for the water resources management of a watershed. A common cost-benefit analysis approach, which is novel in the risk analysis of hydrologic/hydraulic applications, and a Bayesian decision analysis are applied to aid the decision making on whether or not to construct a water reservoir for irrigation purposes. The alternative option examined is a scaled parabolic fine variation in terms of over-pumping violations in contrast to common practices that usually consider short-term fines. The methodological steps are analytically presented associated with originally developed code. Such an application, and in such detail, represents new feedback. The results indicate that the probability uncertainty is the driving issue that determines the optimal decision with each methodology, and depending on the unknown probability handling, each methodology may lead to a different optimal decision. Thus, the proposed tool can help decision makers to examine and compare different scenarios using two different approaches before making a decision considering the cost of a hydrologic/hydraulic project and the varied economic charges that water table limit violations can cause inside an audit interval. In contrast to practices that assess the effect of each proposed action separately considering only current knowledge of the examined issue, this tool aids decision making by considering prior information and the sampling distribution of future successful audits.
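The kind of Bayesian decision analysis described in the abstract above can be sketched, under strong simplifying assumptions, as an expected-cost comparison in which the unknown probability of a violation is updated from audit data. Everything in this Python sketch is hypothetical: the Beta prior, the flat fine per violation (the paper considers a scaled parabolic fine), the cost figures, and the function names. It is not the authors' code.

```python
def posterior_violation_probability(prior_a, prior_b, violations, clean_audits):
    """Posterior mean of a Beta(prior_a, prior_b) prior after binomial audit data."""
    return (prior_a + violations) / (prior_a + prior_b + violations + clean_audits)

def expected_costs(p_violation, reservoir_cost, fine_per_violation, n_future_audits):
    """Expected cost of each action over the planning horizon (hypothetical units)."""
    return {
        "build_reservoir": reservoir_cost,
        "pay_fines": p_violation * fine_per_violation * n_future_audits,
    }

def best_action(costs):
    """Pick the action with the lowest expected cost."""
    return min(costs, key=costs.get)

# Hypothetical inputs: uniform prior, 3 violations in 10 past audits.
p = posterior_violation_probability(1.0, 1.0, violations=3, clean_audits=7)
costs = expected_costs(p, reservoir_cost=80.0, fine_per_violation=30.0, n_future_audits=10)
print(p, costs, best_action(costs))
```

Changing the prior or the audit record shifts the posterior probability and can flip the preferred action, which is consistent with the abstract's point that the handling of the unknown probability drives the decision.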
Energy Technology Data Exchange (ETDEWEB)
George, J.S.; Schmidt, D.M.; Wood, C.C.
1999-02-01
We have developed a Bayesian approach to the analysis of neural electromagnetic (MEG/EEG) data that can incorporate or fuse information from other imaging modalities and addresses the ill-posed inverse problem by sampling the many different solutions which could have produced the given data. From these samples one can draw probabilistic inferences about regions of activation. Our source model assumes a variable number of variable size cortical regions of stimulus-correlated activity. An active region consists of locations on the cortical surface, within a sphere centered on some location in cortex. The number and radii of active regions can vary to defined maximum values. The goal of the analysis is to determine the posterior probability distribution for the set of parameters that govern the number, location, and extent of active regions. Markov Chain Monte Carlo is used to generate a large sample of sets of parameters distributed according to the posterior distribution. This sample is representative of the many different source distributions that could account for given data, and allows identification of probable (i.e. consistent) features across solutions. Examples of the use of this analysis technique with both simulated and empirical MEG data are presented.
Wagner-Kaiser, R.; Stenning, D. C.; Sarajedini, A.; von Hippel, T.; van Dyk, D. A.; Robinson, E.; Stein, N.; Jefferys, W. H.
2016-12-01
We use Cycle 21 Hubble Space Telescope (HST) observations and HST archival ACS Treasury observations of 30 Galactic globular clusters to characterize two distinct stellar populations. A sophisticated Bayesian technique is employed to simultaneously sample the joint posterior distribution of age, distance, and extinction for each cluster, as well as unique helium values for two populations within each cluster and the relative proportion of those populations. We find the helium differences among the two populations in the clusters fall in the range of ~0.04 to 0.11. Because adequate models varying in carbon, nitrogen, and oxygen are not presently available, we view these spreads as upper limits and present them with statistical rather than observational uncertainties. Evidence supports previous studies suggesting an increase in helium content concurrent with increasing mass of the cluster and we also find that the proportion of the first population of stars increases with mass as well. Our results are examined in the context of proposed globular cluster formation scenarios. Additionally, we leverage our Bayesian technique to shed light on the inconsistencies between the theoretical models and the observed data.
Naganathan, Athi N; Perez-Jimenez, Raul; Muñoz, Victor; Sanchez-Ruiz, Jose M
2011-10-14
The realization that folding free energy barriers can be small enough to result in significant population of the species at the barrier top has given rise to several methods to estimate folding barriers from equilibrium experiments. Some of these approaches are based on fitting the experimental thermogram measured by differential scanning calorimetry (DSC) to a one-dimensional representation of the folding free-energy surface (FES). Different physical models have been used to represent the FES: (1) a Landau quartic polynomial as a function of the total enthalpy, which acts as an order parameter; (2) the projection onto a structural order parameter (i.e. number of native residues or native contacts) of the free energy of all the conformations generated by Ising-like statistical mechanical models; and (3) mean-field models that define conformational entropy and stabilization energy as functions of a continuous local order parameter. The fundamental question that emerges is how we can obtain robust, model-independent estimates of the thermodynamic folding barrier from the analysis of DSC experiments. Here we address this issue by comparing the performance of various FES models in interpreting the thermogram of a protein with a marginal folding barrier. We chose the small α-helical protein PDD, which folds and unfolds in microseconds, crossing a free energy barrier previously estimated as ~1 RT. The fits of the PDD thermogram to the various models and assumptions produce FES with a consistently small free energy barrier separating the folded and unfolded ensembles. However, the fits vary in quality as well as in the estimated barrier. Applying Bayesian probabilistic analysis, we rank the fit performance using a statistically rigorous criterion that leads to a global estimate of the folding barrier and its precision, which for PDD is 1.3 ± 0.4 kJ mol⁻¹. This result confirms that PDD folds over a minor barrier consistent with the downhill folding regime. We have further
A BAYESIAN HIERARCHICAL SPATIAL POINT PROCESS MODEL FOR MULTI-TYPE NEUROIMAGING META-ANALYSIS.
Kang, Jian; Nichols, Thomas E; Wager, Tor D; Johnson, Timothy D
2014-09-01
Neuroimaging meta-analysis is an important tool for finding consistent effects over studies that each usually have 20 or fewer subjects. Interest in meta-analysis in brain mapping is also driven by a recent focus on so-called "reverse inference": whereas traditional "forward inference" identifies the regions of the brain involved in a task, a reverse inference identifies the cognitive processes that a task engages. Such reverse inferences, however, require a set of meta-analyses, one for each possible cognitive domain. Existing methods for neuroimaging meta-analysis have significant limitations. Commonly used methods for neuroimaging meta-analysis are not model based, do not provide interpretable parameter estimates, and only produce null hypothesis inferences; further, they are generally designed for a single group of studies and cannot produce reverse inferences. In this work we address these limitations by adopting a non-parametric Bayesian approach for meta-analysis data from multiple classes or types of studies. In particular, foci from each type of study are modeled as a cluster process driven by a random intensity function that is modeled as a kernel convolution of a gamma random field. The type-specific gamma random fields are linked and modeled as a realization of a common gamma random field, shared by all types, that induces correlation between study types and mimics the behavior of a univariate mixed effects model. We illustrate our model on simulation studies and a meta-analysis of five emotions from 219 studies and check model fit by a posterior predictive assessment. In addition, we implement reverse inference by using the model to predict study type from a newly presented study. We evaluate this predictive performance via leave-one-out cross validation that is efficiently implemented using importance sampling techniques.
Fontanazza, C M; Freni, G; Notaro, V
2012-01-01
Flood damage in urbanized watersheds may be assessed by combining flood depth-damage curves and the outputs of urban flood models. The complexity of the physical processes that must be simulated and the limited amount of data available for model calibration may lead to high uncertainty in the model results and consequently in damage estimation. Moreover, depth-damage functions are usually affected by significant uncertainty related to the collected data and to the simplified structure of the regression law that is used. The present paper analyzes the uncertainty in flood damage estimates obtained by combining hydraulic models and depth-damage curves. A Bayesian inference analysis is proposed along with a probabilistic approach to parameter estimation. The analysis demonstrated that the Bayesian approach is very effective considering that the available databases are usually short.
Bayesian Analysis Made Simple An Excel GUI for WinBUGS
Woodward, Philip
2011-01-01
From simple NLMs to complex GLMMs, this book describes how to use the GUI for WinBUGS - BugsXLA - an Excel add-in written by the author that allows a range of Bayesian models to be easily specified. With case studies throughout, the text shows how to routinely apply even the more complex aspects of model specification, such as GLMMs, outlier robust models, random effects Emax models, auto-regressive errors, and Bayesian variable selection. It provides brief, up-to-date discussions of current issues in the practical application of Bayesian methods. The author also explains how to obtain free so
Bayesian analysis of the dynamic cosmic web in the SDSS galaxy survey
Leclercq, Florent; Wandelt, Benjamin
2015-01-01
Recent application of the Bayesian algorithm BORG to the Sloan Digital Sky Survey (SDSS) main sample galaxies resulted in the physical inference of the formation history of the observed large-scale structure from its origin to the present epoch. In this work, we use these inferences as inputs for a detailed probabilistic cosmic web-type analysis. To do so, we generate a large set of data-constrained realizations of the large-scale structure using a fast, fully non-linear gravitational model. We then perform a dynamic classification of the cosmic web into four distinct components (voids, sheets, filaments and clusters) on the basis of the tidal field. Our inference framework automatically and self-consistently propagates typical observational uncertainties to web-type classification. As a result, this study produces highly detailed and accurate cosmographic classification of large-scale structure elements in the SDSS volume. By also providing the history of these structure maps, the approach allows an analysis...
Analysis of traffic accidents on rural highways using Latent Class Clustering and Bayesian Networks.
de Oña, Juan; López, Griselda; Mujalli, Randa; Calvo, Francisco J
2013-03-01
One of the principal objectives of traffic accident analyses is to identify key factors that affect the severity of an accident. However, heterogeneity in the raw data makes the analysis of traffic accidents difficult. In this paper, Latent Class Cluster (LCC) is used as a preliminary tool for segmentation of 3229 accidents on rural highways in Granada (Spain) between 2005 and 2008. Next, Bayesian Networks (BNs) are used to identify the main factors involved in accident severity for both the entire database (EDB) and the clusters previously obtained by LCC. The results of these cluster-based analyses are compared with the results of a full-data analysis. The results show that the combined use of both techniques is valuable, as it reveals further information that would not have been obtained without prior segmentation of the data. BN inference is used to obtain the variables that best identify accidents in which people were killed or seriously injured. Accident type and sight distance have been identified in all the cases analysed; other variables such as time, occupant involved or age are identified in the EDB and in only one cluster; whereas the variables vehicles involved, number of injuries, atmospheric factors, pavement markings and pavement width are identified in only one cluster.
Batterbee, D C; Sims, N D; Becker, W; Worden, K; Rowson, J
2011-11-01
Non-accidental head injury in infants, or shaken baby syndrome, is a highly controversial and disputed topic. Biomechanical studies often suggest that shaking alone cannot cause the classical symptoms, yet many medical experts believe the contrary. Researchers have turned to finite element modelling for a more detailed understanding of the interactions between the brain, skull, cerebrospinal fluid (CSF), and surrounding tissues. However, the uncertainties in such models are significant; these can arise from theoretical approximations, lack of information, and inherent variability. Consequently, this study presents an uncertainty analysis of a finite element model of a human head subject to shaking. Although the model geometry was greatly simplified, fluid-structure-interaction techniques were used to model the brain, skull, and CSF using a Eulerian mesh formulation with penalty-based coupling. Uncertainty and sensitivity measurements were obtained using Bayesian sensitivity analysis, which is a technique that is relatively new to the engineering community. Uncertainty in nine different model parameters was investigated for two different shaking excitations: sinusoidal translation only, and sinusoidal translation plus rotation about the base of the head. The level and type of sensitivity in the results was found to be highly dependent on the excitation type.
Wagner-Kaiser, R; Sarajedini, A; von Hippel, T; van Dyk, D A; Robinson, E; Stein, N; Jefferys, W H
2016-01-01
We use Cycle 21 Hubble Space Telescope (HST) observations and HST archival ACS Treasury observations of 30 Galactic Globular Clusters to characterize two distinct stellar populations. A sophisticated Bayesian technique is employed to simultaneously sample the joint posterior distribution of age, distance, and extinction for each cluster, as well as unique helium values for two populations within each cluster and the relative proportion of those populations. We find the helium differences among the two populations in the clusters fall in the range of ~0.04 to 0.11. Because adequate models varying in CNO are not presently available, we view these spreads as upper limits and present them with statistical rather than observational uncertainties. Evidence supports previous studies suggesting an increase in helium content concurrent with increasing mass of the cluster, and we also find that the proportion of the first population of stars increases with mass as well. Our results are examined in the context of proposed g...
A comparison of Bayesian and Monte Carlo sensitivity analysis for unmeasured confounding.
McCandless, Lawrence C; Gustafson, Paul
2017-04-06
Bias from unmeasured confounding is a persistent concern in observational studies, and sensitivity analysis has been proposed as a solution. In recent years, probabilistic sensitivity analysis using either Monte Carlo sensitivity analysis (MCSA) or Bayesian sensitivity analysis (BSA) has emerged as a practical analytic strategy when there are multiple bias parameter inputs. BSA uses Bayes' theorem to formally combine evidence from the prior distribution and the data. In contrast, MCSA samples bias parameters directly from the prior distribution. Intuitively, one would think that BSA and MCSA ought to give similar results. Both methods use similar models and the same (prior) probability distributions for the bias parameters. In this paper, we illustrate the surprising finding that BSA and MCSA can give very different results. Specifically, we demonstrate that MCSA can give inaccurate uncertainty assessments (e.g. 95% intervals) that do not reflect the data's influence on uncertainty about unmeasured confounding. Using a data example from epidemiology and simulation studies, we show that certain combinations of data and prior distributions can result in dramatic prior-to-posterior changes in uncertainty about the bias parameters. This occurs because the application of Bayes' theorem in a non-identifiable model can sometimes rule out certain patterns of unmeasured confounding that are not compatible with the data. Consequently, the MCSA approach may give 95% intervals that are either too wide or too narrow and that do not have 95% frequentist coverage probability. Based on our findings, we recommend that analysts use BSA for probabilistic sensitivity analysis. Copyright © 2017 John Wiley & Sons, Ltd.
Kim, J.; Kwon, H. H.
2014-12-01
Existing regional frequency analysis has the disadvantage that it is difficult to consider geographical characteristics in estimating areal rainfall. In this regard, this study aims to develop a hierarchical Bayesian model-based regional frequency analysis in which spatial patterns of the design rainfall with geographical information are explicitly incorporated. This study assumes that the parameters of the Gumbel distribution are a function of geographical characteristics (e.g. altitude, latitude and longitude) within a general linear regression framework. Posterior distributions of the regression parameters are estimated by a Bayesian Markov Chain Monte Carlo (MCMC) method, and the identified functional relationship is used to spatially interpolate the parameters of the Gumbel distribution by using digital elevation models (DEM) as inputs. The proposed model is applied to derive design rainfalls over the entire Han-river watershed. It was found that the proposed Bayesian regional frequency analysis model showed results similar to those of L-moment based regional frequency analysis. In addition, the model showed an advantage in terms of quantifying the uncertainty of the design rainfall and estimating the areal rainfall considering geographical information. Acknowledgement: This research was supported by a grant (14AWMP-B079364-01) from Water Management Research Program funded by Ministry of Land, Infrastructure and Transport of Korean government.
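To make the idea in the abstract above concrete, here is a minimal Python sketch of Gumbel parameters expressed as regression functions of site covariates and then converted to a T-year design rainfall. The log link on the scale parameter, the coefficient values, and the function names are assumptions made for illustration; the abstract only states that the parameters are modeled within a general linear regression framework, and no MCMC estimation is shown here.

```python
import math

def site_parameters(loc_coefs, log_scale_coefs, covariates):
    """Gumbel location and scale from linear predictors with an intercept.

    Assumption: the scale uses a log link so that it stays positive.
    """
    x = (1.0, *covariates)
    if len(loc_coefs) != len(x) or len(log_scale_coefs) != len(x):
        raise ValueError("need one coefficient per covariate plus an intercept")
    mu = sum(c * v for c, v in zip(loc_coefs, x))
    beta = math.exp(sum(c * v for c, v in zip(log_scale_coefs, x)))
    return mu, beta

def gumbel_cdf(x, mu, beta):
    """F(x) = exp(-exp(-(x - mu) / beta))."""
    return math.exp(-math.exp(-(x - mu) / beta))

def gumbel_return_level(mu, beta, return_period):
    """Quantile with non-exceedance probability 1 - 1/T."""
    if return_period <= 1.0:
        raise ValueError("return period must exceed 1")
    if beta <= 0.0:
        raise ValueError("scale must be positive")
    p = 1.0 - 1.0 / return_period
    return mu - beta * math.log(-math.log(p))

# Hypothetical site with a single covariate (altitude in km) and made-up coefficients.
mu, beta = site_parameters((100.0, 20.0), (3.0, 0.1), (0.5,))
print(gumbel_return_level(mu, beta, 100.0))
```

In a hierarchical Bayesian setting, each posterior draw of the regression coefficients would give one return level per site, so the spread of those draws quantifies the uncertainty of the design rainfall.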
Scalable Bayesian modeling, monitoring and analysis of dynamic network flow data
2016-01-01
Traffic flow count data in networks arise in many applications, such as automobile or aviation transportation, certain directed social network contexts, and Internet studies. Using an example of Internet browser traffic flow through site-segments of an international news website, we present Bayesian analyses of two linked classes of models which, in tandem, allow fast, scalable and interpretable Bayesian inference. We first develop flexible state-space models for streaming count data, able to...
The Resistible Rise of Bayesian Thinking in Management: Historical Lessons From Decision Analysis
Cabantous, L.; Gond, J-P.
2015-01-01
This paper draws from a case study of decision analysis—a discipline rooted in Bayesianism aimed at supporting managerial decision making—to inform the current discussion on the adoption of Bayesian modes of thinking in management research and practice. Relying on concepts from the science, technology, and society field of study and actor-network theory, we approach the production of scientific knowledge as a cultural, practical, and material affair. Specifically, we analyze the activities de...
Iryna Lobach; Ruzong Fan
2012-01-01
A key component to understanding etiology of complex diseases, such as cancer, diabetes, alcohol dependence, is to investigate gene-environment interactions. This work is motivated by the following two concerns in the analysis of gene-environment interactions. First, multiple genetic markers in moderate linkage disequilibrium may be involved in susceptibility to a complex disease. Second, environmental factors may be subject to misclassification. We develop a genotype based Bayesian pseudolik...
Williams, Michael S.; Ebel, Eric D.; Hoeting, Jennifer A.
2011-01-01
Bayesian methods are becoming increasingly popular in the field of food-safety risk assessment. Risk assessment models often require the integration of a dose-response function over the distribution of all possible doses of a pathogen ingested with a specific food. This requires the evaluation of an integral for every sample for a Markov chain Monte Carlo analysis of a model. While many statistical software packages have functions that allow for the evaluation of the integral, this functional...
Institute of Scientific and Technical Information of China (English)
None
2007-01-01
Several software reliability growth models (SRGM) have been developed to monitor the reliability growth during the testing phase of software development. Most of the existing research in the literature assumes that a similar testing effort is required for each debugging effort. However, in practice, different types of faults may require different amounts of testing effort for their detection and removal. Consequently, faults are classified into three categories on the basis of severity: simple, hard and complex. This categorization may be extended to r types of faults on the basis of severity. Although some existing research in the literature has incorporated the concept that the fault removal rate (FRR) is different for different types of faults, it assumes that the FRR remains constant during the overall testing period. On the contrary, it has been observed that as testing progresses, FRR changes due to changing testing strategy, skill, environment and personnel resources. In this paper, a general discrete SRGM is proposed for errors of different severity in software systems using the change-point concept. Then, the models are formulated for two particular environments. The models were validated on two real-life data sets. The results show better fit and wider applicability of the proposed models to different types of failure datasets.
JAM: A Scalable Bayesian Framework for Joint Analysis of Marginal SNP Effects.
Newcombe, Paul J; Conti, David V; Richardson, Sylvia
2016-04-01
Recently, large scale genome-wide association study (GWAS) meta-analyses have boosted the number of known signals for some traits into the tens and hundreds. Typically, however, variants are only analysed one-at-a-time. This complicates the ability of fine-mapping to identify a small set of SNPs for further functional follow-up. We describe a new and scalable algorithm, joint analysis of marginal summary statistics (JAM), for the re-analysis of published marginal summary statistics under joint multi-SNP models. The correlation is accounted for according to estimates from a reference dataset, and models and SNPs that best explain the complete joint pattern of marginal effects are highlighted via an integrated Bayesian penalized regression framework. We provide both enumerated and Reversible Jump MCMC implementations of JAM and present some comparisons of performance. In a series of realistic simulation studies, JAM demonstrated identical performance to various alternatives designed for single region settings. In multi-region settings, where the only multivariate alternative involves stepwise selection, JAM offered greater power and specificity. We also present an application to real published results from MAGIC (meta-analysis of glucose and insulin related traits consortium) - a GWAS meta-analysis of more than 15,000 people. We re-analysed several genomic regions that produced multiple significant signals with glucose levels 2 hr after oral stimulation. Through joint multivariate modelling, JAM was able to formally rule out many SNPs, and for one gene, ADCY5, suggests that an additional SNP, which transpired to be more biologically plausible, should be followed up with equal priority to the reported index.
No control genes required: Bayesian analysis of qRT-PCR data.
Directory of Open Access Journals (Sweden)
Mikhail V Matz
BACKGROUND: Model-based analysis of data from quantitative reverse-transcription PCR (qRT-PCR) is potentially more powerful and versatile than traditional methods. Yet existing model-based approaches cannot properly deal with the higher sampling variances associated with low-abundance targets, nor do they provide a natural way to incorporate assumptions about the stability of control genes directly into the model-fitting process. RESULTS: In our method, raw qPCR data are represented as molecule counts, and described using generalized linear mixed models under Poisson-lognormal error. A Markov Chain Monte Carlo (MCMC) algorithm is used to sample from the joint posterior distribution over all model parameters, thereby estimating the effects of all experimental factors on the expression of every gene. The Poisson-based model allows for the correct specification of the mean-variance relationship of the PCR amplification process, and can also glean information from instances of no amplification (zero counts). Our method is very flexible with respect to control genes: any prior knowledge about the expected degree of their stability can be directly incorporated into the model. Yet the method provides sensible answers without such assumptions, or even in the complete absence of control genes. We also present a natural Bayesian analogue of the "classic" analysis, which uses standard data pre-processing steps (logarithmic transformation and multi-gene normalization) but estimates all gene expression changes jointly within a single model. The new methods are considerably more flexible and powerful than the standard delta-delta Ct analysis based on pairwise t-tests. CONCLUSIONS: Our methodology expands the applicability of the relative-quantification analysis protocol all the way to the lowest-abundance targets, and provides a novel opportunity to analyze qRT-PCR data without making any assumptions concerning target stability. These procedures have been
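The mean-variance relationship that the abstract above attributes to the Poisson-lognormal model can be illustrated with the standard moment formulas for a Poisson count whose log-rate is normally distributed. This Python sketch uses hypothetical parameter values and function names; it is not the authors' implementation and involves no MCMC.

```python
import math

def poisson_lognormal_moments(mu, sigma):
    """Mean and variance of Y, where Y | rate ~ Poisson(rate) and log(rate) ~ Normal(mu, sigma^2).

    E[Y]   = exp(mu + sigma^2 / 2)
    Var[Y] = E[Y] + (exp(sigma^2) - 1) * E[Y]^2
    """
    mean = math.exp(mu + sigma ** 2 / 2.0)
    variance = mean + (math.exp(sigma ** 2) - 1.0) * mean ** 2
    return mean, variance

def squared_cv(mu, sigma):
    """Squared coefficient of variation, Var[Y] / E[Y]^2 = 1 / E[Y] + exp(sigma^2) - 1."""
    mean, variance = poisson_lognormal_moments(mu, sigma)
    return variance / mean ** 2

# Hypothetical low- and high-abundance targets with the same log-scale spread.
print(squared_cv(0.0, 0.5))  # low abundance: the Poisson term 1 / E[Y] dominates
print(squared_cv(8.0, 0.5))  # high abundance: the lognormal term dominates
```

The 1 / E[Y] term shows why relative sampling variance grows for low-abundance targets, which a purely log-normal error model would not capture.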
Strauss, Jillian; Miranda-Moreno, Luis F; Morency, Patrick
2013-10-01
This study proposes a two-equation Bayesian modelling approach to simultaneously study cyclist injury occurrence and bicycle activity at signalized intersections as joint outcomes. This approach deals with the potential presence of endogeneity and unobserved heterogeneities and is used to identify factors associated with both cyclist injuries and volumes. Its application to identify high-risk corridors is also illustrated. Montreal, Quebec, Canada is the application environment, using an extensive inventory of a large sample of signalized intersections containing disaggregate motor-vehicle traffic volumes and bicycle flows, geometric design, traffic control and built environment characteristics in the vicinity of the intersections. Cyclist injury data for the period of 2003-2008 is used in this study. Also, manual bicycle counts were standardized using temporal and weather adjustment factors to obtain average annual daily volumes. Results confirm and quantify the effects of both bicycle and motor-vehicle flows on cyclist injury occurrence. Accordingly, more cyclists at an intersection translate into more cyclist injuries but lower injury rates due to the non-linear association between bicycle volume and injury occurrence. Furthermore, the results emphasize the importance of turning motor-vehicle movements. The presence of bus stops and total crosswalk length increase cyclist injury occurrence whereas the presence of a raised median has the opposite effect. Bicycle activity through intersections was found to increase as employment, number of metro stations, land use mix, area of commercial land use type, length of bicycle facilities and the presence of schools within 50-800 m of the intersection increase. Intersections with three approaches are expected to have fewer cyclists than those with four. Using Bayesian analysis, expected injury frequency and injury rates were estimated for each intersection and used to rank corridors. Corridors with high bicycle volumes
Bayesian analysis of sparse anisotropic universe models and application to the 5-yr WMAP data
Groeneboom, Nicolaas E
2008-01-01
We extend the previously described CMB Gibbs sampling framework to allow for exact Bayesian analysis of anisotropic universe models, and apply this method to the 5-year WMAP temperature observations. This involves adding support for non-diagonal signal covariance matrices, and implementing a general spectral parameter MCMC sampler. As a worked example we apply these techniques to the model recently introduced by Ackerman et al., describing for instance violations of rotational invariance during the inflationary epoch. After verifying the code with simulated data, we analyze the foreground-reduced 5-year WMAP temperature sky maps. For l < 400 and the W-band data, we find tentative evidence for a preferred direction pointing towards (l,b) = (110 deg, 10 deg) with an anisotropy amplitude of g* = 0.15 ± 0.039, nominally equivalent to a 3.8 sigma detection. Similar results are obtained from the V-band data [g* = 0.11 ± 0.039; (l,b) = (130 deg, 20 deg)]. Further, the preferred direction is stable with respect ...
Ryu, Duchwan; Li, Erning; Mallick, Bani K
2011-06-01
We consider nonparametric regression analysis in a generalized linear model (GLM) framework for data with covariates that are the subject-specific random effects of longitudinal measurements. The usual assumption that the effects of the longitudinal covariate processes are linear in the GLM may be unrealistic and if this happens it can cast doubt on the inference of observed covariate effects. Allowing the regression functions to be unknown, we propose to apply Bayesian nonparametric methods including cubic smoothing splines or P-splines for the possible nonlinearity and use an additive model in this complex setting. To improve computational efficiency, we propose the use of data-augmentation schemes. The approach allows flexible covariance structures for the random effects and within-subject measurement errors of the longitudinal processes. The posterior model space is explored through a Markov chain Monte Carlo (MCMC) sampler. The proposed methods are illustrated and compared to other approaches, the "naive" approach and the regression calibration, via simulations and by an application that investigates the relationship between obesity in adulthood and childhood growth curves.
Feng, Jie; Tomassetti, Nicola; Oliva, Alberto
2016-12-01
The AMS-02 experiment has reported a new measurement of the antiproton/proton ratio in Galactic cosmic rays (CRs). In the energy range E ∼ 60-450 GeV, this ratio is found to be remarkably constant. Using recent data on CR proton, helium, and carbon fluxes, 10Be/9Be and B/C ratios, we have performed a global Bayesian analysis based on a Markov chain Monte Carlo sampling algorithm under a "two halo model" of CR propagation. In this model, CRs are allowed to experience a different type of diffusion when they propagate in the region close to the Galactic disk. We found that the vertical extent of this region is about 900 pc above and below the disk, and the corresponding diffusion coefficient scales with energy as D ∝ E^0.15, describing well the observations on primary CR spectra, secondary/primary ratios, and anisotropy. Under this model, we have carried out improved calculations of antiparticle spectra arising from secondary CR production and their corresponding uncertainties. We made use of Monte Carlo generators and accelerator data to assess the antiproton production cross sections and their uncertainties. While the positron excess requires the contribution of additional unknown sources, we found that the new AMS-02 antiproton data are consistent, within the estimated uncertainties, with our calculations based on secondary production.
Assessment of occupational safety risks in Floridian solid waste systems using Bayesian analysis.
Bastani, Mehrad; Celik, Nurcin
2015-10-01
Safety risks embedded within solid waste management systems continue to be a significant issue and are prevalent at every step in the solid waste management process. To recognise and address these occupational hazards, it is necessary to discover the potential safety concerns that cause them, as well as their direct and/or indirect impacts on the different types of solid waste workers. In this research, our goal is to statistically assess occupational safety risks to solid waste workers in the state of Florida. Here, we first review the related standard industrial codes to major solid waste management methods including recycling, incineration, landfilling, and composting. Then, a quantitative assessment of major risks is conducted based on the data collected using a Bayesian data analysis and predictive methods. The risks estimated in this study for the period of 2005-2012 are then compared with historical statistics (1993-1997) from previous assessment studies. The results have shown that the injury rates among refuse collectors in both musculoskeletal and dermal injuries have decreased from 88 and 15 to 16 and three injuries per 1000 workers, respectively. However, a contrasting trend is observed for the injury rates among recycling workers, for whom musculoskeletal and dermal injuries have increased from 13 and four injuries to 14 and six injuries per 1000 workers, respectively. Lastly, a linear regression model has been proposed to identify major elements of the high number of musculoskeletal and dermal injuries.
Bayesian analysis of the linear reaction norm model with unknown covariates.
Su, G; Madsen, P; Lund, M S; Sorensen, D; Korsgaard, I R; Jensen, J
2006-07-01
The reaction norm model is becoming a popular approach for the analysis of genotype x environment interactions. In a classical reaction norm model, the expression of a genotype in different environments is described as a linear function (a reaction norm) of an environmental gradient or value. An environmental value is typically defined as the mean performance of all genotypes in the environment, which is usually unknown. One approximation is to estimate the mean phenotypic performance in each environment and then treat these estimates as known covariates in the model. However, a more satisfactory alternative is to infer environmental values simultaneously with the other parameters of the model. This study describes a method and its Bayesian Markov Chain Monte Carlo implementation that makes this possible. Frequentist properties of the proposed method are tested in a simulation study. Estimates of parameters of interest agree well with the true values. Further, inferences about genetic parameters from the proposed method are similar to those derived from a reaction norm model using true environmental values. On the other hand, using phenotypic means as proxies for environmental values results in poor inferences.
Bayesian time series analysis of segments of the Rocky Mountain trumpeter swan population
Wright, Christopher K.; Sojda, Richard S.; Goodman, Daniel
2002-01-01
A Bayesian time series analysis technique, the dynamic linear model, was used to analyze counts of Trumpeter Swans (Cygnus buccinator) summering in Idaho, Montana, and Wyoming from 1931 to 2000. For the Yellowstone National Park segment of white birds (sub-adults and adults combined) the estimated probability of a positive growth rate is 0.01. The estimated probability of achieving the Subcommittee on Rocky Mountain Trumpeter Swans 2002 population goal of 40 white birds for the Yellowstone segment is less than 0.01. Outside of Yellowstone National Park, Wyoming white birds are estimated to have a 0.79 probability of a positive growth rate with a 0.05 probability of achieving the 2002 objective of 120 white birds. In the Centennial Valley in southwest Montana, results indicate a probability of 0.87 that the white bird population is growing at a positive rate with considerable uncertainty. The estimated probability of achieving the 2002 Centennial Valley objective of 160 white birds is 0.14 but under an alternative model falls to 0.04. The estimated probability that the Targhee National Forest segment of white birds has a positive growth rate is 0.03. In Idaho outside of the Targhee National Forest, white birds are estimated to have a 0.97 probability of a positive growth rate with a 0.18 probability of attaining the 2002 goal of 150 white birds.
Rubio, Francisco J.
2016-02-09
We study Bayesian linear regression models with skew-symmetric scale mixtures of normal error distributions. These kinds of models can be used to capture departures from the usual assumption of normality of the errors in terms of heavy tails and asymmetry. We propose a general noninformative prior structure for these regression models and show that the corresponding posterior distribution is proper under mild conditions. We extend these propriety results to cases where the response variables are censored. The latter scenario is of interest in the context of accelerated failure time models, which are relevant in survival analysis. We present a simulation study that demonstrates good frequentist properties of the posterior credible intervals associated with the proposed priors. This study also sheds some light on the trade-off between increased model flexibility and the risk of over-fitting. We illustrate the performance of the proposed models with real data. Although we focus on models with univariate response variables, we also present some extensions to the multivariate case in the Supporting Information.
Bayesian analysis of diagnostic test accuracy when disease state is unverified for some subjects.
Pennello, Gene A
2011-09-01
Studies of the accuracy of medical tests to diagnose the presence or absence of disease can suffer from an inability to verify the true disease state in everyone. When verification is missing at random (MAR), the missing data mechanism can be ignored in likelihood-based inference. However, this assumption may not hold even approximately. When verification is nonignorably missing, the most general model of the distribution of disease state, test result, and verification indicator is overparameterized. Parameters are only partially identified, creating regions of ignorance for maximum likelihood estimators. For studies of a single test, we use Bayesian analysis to implement the most general nonignorable model, a reduced nonignorable model with identifiable parameters, and the MAR model. Simple Gibbs sampling algorithms are derived that enable computation of the posterior distribution of test accuracy parameters. In particular, the posterior distribution is easily obtained for the most general nonignorable model, which makes relatively weak assumptions about the missing data mechanism. For this model, the posterior distribution combines two sources of uncertainty: ignorance in the estimation of partially identified parameters, and imprecision due to finite sampling variability. We compare the three models on data from a study of the accuracy of scintigraphy to diagnose liver disease.
Directory of Open Access Journals (Sweden)
Brentani Helena
2004-08-01
Background: An important challenge for transcript counting methods such as Serial Analysis of Gene Expression (SAGE), "Digital Northern" or Massively Parallel Signature Sequencing (MPSS) is to carry out statistical analyses that account for the within-class variability, i.e., variability due to the intrinsic biological differences among sampled individuals of the same class, and not only variability due to technical sampling error. Results: We introduce a Bayesian model that accounts for the within-class variability by means of a mixture distribution. We show that the previously available approaches of aggregation in pools ("pseudo-libraries") and the Beta-Binomial model are particular cases of the mixture model. We illustrate our method with a brain tumor vs. normal comparison using SAGE data from public databases. We show examples of tags regarded as differentially expressed with high significance if the within-class variability is ignored, but clearly not so significant if one accounts for it. Conclusion: Using available information about biological replicates, one can transform a list of candidate transcripts showing differential expression to a more reliable one. Our method is freely available, under GPL/GNU copyleft, through a user friendly web-based on-line tool or as R language scripts at supplemental web-site.
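The difference between pure sampling error and within-class variability mentioned in the abstract above can be sketched with the textbook Beta-Binomial variance formula. This is only an illustration of why ignoring within-class variability understates uncertainty; it is not the authors' mixture model, and the library size and Beta parameters below are hypothetical.

```python
def binomial_variance(n, p):
    """Variance of a tag count under technical sampling error only."""
    return n * p * (1.0 - p)

def beta_binomial_variance(n, alpha, beta):
    """Variance when the tag proportion itself varies as Beta(alpha, beta)."""
    p = alpha / (alpha + beta)
    rho = 1.0 / (alpha + beta + 1.0)  # intra-class correlation
    return n * p * (1.0 - p) * (1.0 + (n - 1) * rho)

# Hypothetical library of 50,000 tags and a tag with mean proportion 1e-4.
n_tags = 50_000
alpha, beta = 2.0, 19_998.0
var_sampling = binomial_variance(n_tags, alpha / (alpha + beta))
var_within_class = beta_binomial_variance(n_tags, alpha, beta)
print(f"binomial variance:      {var_sampling:.3f}")
print(f"beta-binomial variance: {var_within_class:.3f}")
```

With these made-up numbers the Beta-Binomial variance is several times the binomial one, so a difference that looks significant under the binomial assumption can be much less convincing once replicate-to-replicate variation is allowed for.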
Wu, Fenfang; Wu, Di; Ren, Yong; Duan, Chongyang; Chen, Shangwu; Xu, Anlong
2016-07-26
Acute promyelocytic leukemia (APL) is a curable subtype of acute myeloid leukemia. The optimum regimen for newly diagnosed APL remains inconclusive. In this Bayesian network meta-analysis, we compared the effectiveness of five regimens: arsenic trioxide (ATO) + all-trans retinoic acid (ATRA), realgar-indigo naturalis formula (RIF), which contains arsenic tetrasulfide, + ATRA, ATRA + anthracycline-based chemotherapy (CT), ATO alone, and ATRA alone. The comparison was based on fourteen randomized controlled trials (RCTs), which included 1407 newly diagnosed APL patients. According to the results, the efficacy ranking of the treatments, including early death and complete remission in the induction stage, was the following: 1. ATO/RIF + ATRA; 2. ATRA + CT; 3. ATO, and 4. ATRA. For long-term benefit, ATO/RIF + ATRA significantly improved overall survival (OS) (hazard ratio = 0.35, 95%CI 0.15-0.82, p = 0.02) and event-free survival (EFS) (hazard ratio = 0.32, 95%CI 0.16-0.61, p = 0.001) over ATRA + CT regimen for the low-to-intermediate-risk patients. Thus, ATO + ATRA and RIF + ATRA might be considered the optimum treatments for the newly diagnosed APL and should be recommended as the standard care for frontline therapy.
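As a rough consistency check on the hazard ratios quoted in the abstract above, the reported intervals can be converted to a standard error on the log scale. The sketch assumes the intervals are symmetric on the log scale and uses a normal approximation; both are assumptions made here for illustration, not the authors' Bayesian computation, and the function name is hypothetical.

```python
import math

def log_hr_summary(hr, lower, upper, z_crit=1.96):
    """Approximate SE, z-score and two-sided p-value from a hazard ratio and its 95% interval."""
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided standard normal tail area
    return se, z, p

# Values as reported in the abstract for ATO/RIF + ATRA versus ATRA + CT.
os_se, os_z, os_p = log_hr_summary(0.35, 0.15, 0.82)
efs_se, efs_z, efs_p = log_hr_summary(0.32, 0.16, 0.61)
print(f"OS:  se={os_se:.3f}, z={os_z:.2f}, p={os_p:.4f}")
print(f"EFS: se={efs_se:.3f}, z={efs_z:.2f}, p={efs_p:.4f}")
```

Under this approximation both p-values come out at the same order of magnitude as the rounded values quoted in the abstract.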
Cross-validation analysis of bias models in Bayesian multi-model projections of climate
Huttunen, J. M. J.; Räisänen, J.; Nissinen, A.; Lipponen, A.; Kolehmainen, V.
2017-03-01
Climate change projections are commonly based on multi-model ensembles of climate simulations. In this paper we consider the choice of bias models in Bayesian multimodel predictions. Buser et al. (Clim Res 44(2-3):227-241, 2010a) introduced a hybrid bias model which combines commonly used constant bias and constant relation bias assumptions. The hybrid model includes a weighting parameter which balances these bias models. In this study, we use a cross-validation approach to study which bias model or bias parameter leads to, in a specific sense, optimal climate change projections. The analysis is carried out for summer and winter season means of 2 m-temperatures spatially averaged over the IPCC SREX regions, using 19 model runs from the CMIP5 data set. The cross-validation approach is applied to calculate optimal bias parameters (in the specific sense) for projecting the temperature change from the control period (1961-2005) to the scenario period (2046-2090). The results are compared to the results of the Buser et al. (Clim Res 44(2-3):227-241, 2010a) method which includes the bias parameter as one of the unknown parameters to be estimated from the data.
Paired Comparison Analysis of the van Baaren Model Using Bayesian Approach with Noninformative Prior
Directory of Open Access Journals (Sweden)
Saima Altaf
2012-03-01
One technique being commonly studied these days because of its attractive applications for the comparison of several objects is the method of paired comparisons. This technique permits the ranking of the objects by means of a score, which reflects the merit of the items on a linear scale. The present study is concerned with the Bayesian analysis of a paired comparison model, namely the van Baaren model VI using noninformative uniform prior. For this purpose, the joint posterior distribution for the parameters of the model, their marginal distributions, posterior estimates (means and modes), the posterior probabilities for comparing the two treatment parameters and the predictive probabilities are obtained.
A Bayesian Approach to the Design and Analysis of Computer Experiments
Energy Technology Data Exchange (ETDEWEB)
Currin, C.
1988-01-01
We consider the problem of designing and analyzing experiments for prediction of the function y(t), t ∈ T, where y is evaluated by means of a computer code (typically by solving complicated equations that model a physical system), and T represents the domain of inputs to the code. We use a Bayesian approach, in which uncertainty about y is represented by a spatial stochastic process (random function); here we restrict attention to stationary Gaussian processes. The posterior mean function can be used as an interpolating function, with uncertainties given by the posterior standard deviations. Instead of completely specifying the prior process, we consider several families of priors, and suggest some cross-validational methods for choosing one that performs relatively well on the function at hand. As a design criterion, we use the expected reduction in the entropy of the random vector y(T*), where T* ⊂ T is a given finite set of "sites" (input configurations) at which predictions are to be made. We describe an exchange algorithm for constructing designs that are optimal with respect to this criterion. To demonstrate the use of these design and analysis methods, several examples are given, including one experiment on a computer model of a thermal energy storage device and another on an integrated circuit simulator.
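The interpolation idea in the abstract above (posterior mean as interpolator, posterior standard deviation as uncertainty) can be sketched in a few lines of pure Python. Everything specific here is an assumption for illustration: a zero-mean prior, a squared-exponential covariance with unit length scale, a small diagonal jitter for numerical stability, and sin(t) standing in for the expensive computer code. The entropy-based design criterion and the exchange algorithm are not covered.

```python
import math

def sq_exp_kernel(a, b, length_scale=1.0, variance=1.0):
    """Stationary squared-exponential covariance between inputs a and b."""
    return variance * math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def gp_predict(ts, ys, t_new, jitter=1e-10):
    """Posterior mean and standard deviation at t_new under a zero-mean GP prior."""
    K = [[sq_exp_kernel(a, b) + (jitter if i == j else 0.0)
          for j, b in enumerate(ts)] for i, a in enumerate(ts)]
    k_star = [sq_exp_kernel(a, t_new) for a in ts]
    w = solve(K, k_star)  # w = K^{-1} k_star (K is symmetric)
    mean = sum(wi * yi for wi, yi in zip(w, ys))
    var = sq_exp_kernel(t_new, t_new) - sum(wi * ki for wi, ki in zip(w, k_star))
    return mean, math.sqrt(max(var, 0.0))

# Hypothetical design sites and code outputs (sin stands in for the simulator).
ts = [0.0, 1.0, 2.5, 4.0]
ys = [math.sin(t) for t in ts]
mean_at_site, sd_at_site = gp_predict(ts, ys, 1.0)
mean_between, sd_between = gp_predict(ts, ys, 3.2)
print(mean_at_site, sd_at_site)
print(mean_between, sd_between)
```

At a design site the posterior mean reproduces the observed output and the standard deviation collapses towards zero, while between sites the standard deviation is larger, which is what makes it usable as an uncertainty measure for predictions.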
Feng, Jie; Oliva, Alberto
2016-01-01
The AMS-02 experiment has reported a new measurement of the antiproton/proton ratio in Galactic cosmic rays (CRs). In the energy range $E\sim\,$60-450 GeV, this ratio is found to be remarkably constant. Using recent data on CR proton, helium, carbon, 10Be/9Be, and B/C ratio, we have performed a global Bayesian analysis based on a Markov-Chain Monte-Carlo sampling algorithm under a "two halo model" of CR propagation. In this model, CRs are allowed to experience a different type of diffusion when they propagate in the region close to the Galactic disk. We found that the vertical extent of this region is about 900 pc above and below the disk, and the corresponding diffusion coefficient scales with energy as $D\sim\,E^{0.15}$, describing well the observations on primary CR spectra, secondary/primary ratios and anisotropy. Under this model we have carried out improved calculations of antiparticle spectra arising from secondary CR production and their corresponding uncertainties. We made use of Monte-Carlo generato...
Bayesian Modeling of MPSS Data: Gene Expression Analysis of Bovine Salmonella Infection
Dhavala, Soma S.
2010-09-01
Massively Parallel Signature Sequencing (MPSS) is a high-throughput, counting-based technology available for gene expression profiling. It produces output that is similar to Serial Analysis of Gene Expression and is ideal for building complex relational databases for gene expression. Our goal is to compare the in vivo global gene expression profiles of tissues infected with different strains of Salmonella obtained using the MPSS technology. In this article, we develop an exact ANOVA type model for this count data using a zero-inflated Poisson distribution, different from existing methods that assume continuous densities. We adopt two Bayesian hierarchical models, one parametric and the other semiparametric with a Dirichlet process prior that has the ability to "borrow strength" across related signatures, where a signature is a specific arrangement of the nucleotides, usually 16-21 base pairs long. We utilize the discreteness of Dirichlet process prior to cluster signatures that exhibit similar differential expression profiles. Tests for differential expression are carried out using nonparametric approaches, while controlling the false discovery rate. We identify several differentially expressed genes that have important biological significance and conclude with a summary of the biological discoveries. This article has supplementary materials online. © 2010 American Statistical Association.
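The zero-inflated Poisson distribution named in the abstract above has a simple probability mass function, sketched here for reference. The parameter names (`lam` for the Poisson rate, `pi_zero` for the extra probability of a structural zero) and the example values are assumptions chosen for illustration; the hierarchical and Dirichlet process parts of the model are not shown.

```python
import math

def zip_pmf(k, lam, pi_zero):
    """P(count = k) under a zero-inflated Poisson with rate lam and zero-inflation pi_zero."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # A zero arises either structurally or from the Poisson component.
        return pi_zero + (1.0 - pi_zero) * poisson
    return (1.0 - pi_zero) * poisson

def zip_mean(lam, pi_zero):
    """Expected count: only the Poisson component contributes."""
    return (1.0 - pi_zero) * lam

# Hypothetical signature: Poisson rate 3 with 40% structural zeros.
lam, pi_zero = 3.0, 0.4
probabilities = [zip_pmf(k, lam, pi_zero) for k in range(60)]
print(f"P(0) = {probabilities[0]:.4f}, mean = {zip_mean(lam, pi_zero):.2f}")
```

Compared with a plain Poisson at the same rate, the zero-inflated version puts extra mass at zero, which suits count data in which many signatures are simply not observed.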
Ryu, Duchwan
2010-09-28
We consider nonparametric regression analysis in a generalized linear model (GLM) framework for data with covariates that are the subject-specific random effects of longitudinal measurements. The usual assumption that the effects of the longitudinal covariate processes are linear in the GLM may be unrealistic and if this happens it can cast doubt on the inference of observed covariate effects. Allowing the regression functions to be unknown, we propose to apply Bayesian nonparametric methods including cubic smoothing splines or P-splines for the possible nonlinearity and use an additive model in this complex setting. To improve computational efficiency, we propose the use of data-augmentation schemes. The approach allows flexible covariance structures for the random effects and within-subject measurement errors of the longitudinal processes. The posterior model space is explored through a Markov chain Monte Carlo (MCMC) sampler. The proposed methods are illustrated and compared to other approaches, the "naive" approach and the regression calibration, via simulations and by an application that investigates the relationship between obesity in adulthood and childhood growth curves. © 2010, The International Biometric Society.
A Bayesian parameter estimation approach to pulsar time-of-arrival analysis
Messenger, C; Demorest, P; Ransom, S
2011-01-01
The increasing sensitivities of pulsar timing arrays to ultra-low frequency (nHz) gravitational waves promise to achieve direct gravitational wave detection within the next 5-10 years. While there are many parallel efforts being made in the improvement of telescope sensitivity, the detection of stable millisecond pulsars and the improvement of the timing software, there are reasons to believe that the methods used to accurately determine the time-of-arrival (TOA) of pulses from radio pulsars can be improved upon. More specifically, the determination of the uncertainties on these TOAs, which strongly affect the ability to detect GWs through pulsar timing, may be unreliable. We propose two Bayesian methods for the generation of pulsar TOAs starting from pulsar "search-mode" data and pre-folded data. These methods are applied to simulated toy-model examples and in this initial work we focus on the issue of uncertainties in the folding period. The final results of our analysis are expressed in the form of poster...
MASSIVE: A Bayesian analysis of giant planet populations around low-mass stars
Lannier, J.; Delorme, P.; Lagrange, A. M.; Borgniet, S.; Rameau, J.; Schlieder, J. E.; Gagné, J.; Bonavita, M. A.; Malo, L.; Chauvin, G.; Bonnefoy, M.; Girard, J. H.
2016-12-01
Context. Direct imaging has led to the discovery of several giant planet and brown dwarf companions. These imaged companions populate a mass, separation and age domain (mass >1 MJup, orbits > 5 AU, age planetary formation models. Methods: We observed 58 young and nearby M-type dwarfs in L'-band with the VLT/NaCo instrument and used angular differential imaging algorithms to optimize the sensitivity to planetary-mass companions and to derive the best detection limits. We estimate the probability of detecting a planet as a function of its mass and physical separation around each target. We conduct a Bayesian analysis to determine the frequency of substellar companions orbiting low-mass stars, using a homogenous sub-sample of 54 stars. Results: We derive a frequency of for companions with masses in the range of 2-80 MJup, and % for planetary mass companions (2-14 MJup), at physical separations of 8 to 400 AU for both cases. Comparing our results with a previous survey targeting more massive stars, we find evidence that substellar companions more massive than 1 MJup with a low mass ratio Q with respect to their host star (Q 2 MJup might be independent from the mass of the host star.
Lesaffre, Emmanuel
2012-01-01
The growth of biostatistics has been phenomenal in recent years and has been marked by considerable technical innovation in both methodology and computational practicality. One area that has experienced significant growth is Bayesian methods. The growing use of Bayesian methodology has taken place partly due to an increasing number of practitioners valuing the Bayesian paradigm as matching that of scientific discovery. In addition, computational advances have allowed for more complex models to be fitted routinely to realistic data sets. Through examples, exercises and a combination of introd
新家, 健精
2013-01-01
© 2012 Springer Science+Business Media, LLC. All rights reserved. Article Outline: Glossary; Definition of the Subject and Introduction; The Bayesian Statistical Paradigm; Three Examples; Comparison with the Frequentist Statistical Paradigm; Future Directions; Bibliography
Wendling, Thierry; Tsamandouras, Nikolaos; Dumitras, Swati; Pigeolet, Etienne; Ogungbenro, Kayode; Aarons, Leon
2016-01-01
Whole-body physiologically based pharmacokinetic (PBPK) models are increasingly used in drug development for their ability to predict drug concentrations in clinically relevant tissues and to extrapolate across species, experimental conditions and sub-populations. A whole-body PBPK model can be fitted to clinical data using a Bayesian population approach. However, the analysis might be time consuming and numerically unstable if prior information on the model parameters is too vague given the complexity of the system. We suggest an approach where (i) a whole-body PBPK model is formally reduced using a Bayesian proper lumping method to retain the mechanistic interpretation of the system and account for parameter uncertainty, (ii) the simplified model is fitted to clinical data using Markov Chain Monte Carlo techniques and (iii) the optimised reduced PBPK model is used for extrapolation. A previously developed 16-compartment whole-body PBPK model for mavoglurant was reduced to 7 compartments while preserving plasma concentration-time profiles (median and variance) and giving emphasis to the brain (target site) and the liver (elimination site). The reduced model was numerically more stable than the whole-body model for the Bayesian analysis of mavoglurant pharmacokinetic data in healthy adult volunteers. Finally, the reduced yet mechanistic model could easily be scaled from adults to children and predict mavoglurant pharmacokinetics in children aged from 3 to 11 years with similar performance compared with the whole-body model. This study is a first example of the practicality of formal reduction of complex mechanistic models for Bayesian inference in drug development.
Bayesian analysis of the dynamic cosmic web in the SDSS galaxy survey
Leclercq, Florent; Jasche, Jens; Wandelt, Benjamin
2015-06-01
Recent application of the Bayesian algorithm BORG to the Sloan Digital Sky Survey (SDSS) main sample galaxies resulted in the physical inference of the formation history of the observed large-scale structure from its origin to the present epoch. In this work, we use these inferences as inputs for a detailed probabilistic cosmic web-type analysis. To do so, we generate a large set of data-constrained realizations of the large-scale structure using a fast, fully non-linear gravitational model. We then perform a dynamic classification of the cosmic web into four distinct components (voids, sheets, filaments, and clusters) on the basis of the tidal field. Our inference framework automatically and self-consistently propagates typical observational uncertainties to web-type classification. As a result, this study produces accurate cosmographic classification of large-scale structure elements in the SDSS volume. By also providing the history of these structure maps, the approach allows an analysis of the origin and growth of the early traces of the cosmic web present in the initial density field and of the evolution of global quantities such as the volume and mass filling fractions of different structures. For the problem of web-type classification, the results described in this work constitute the first connection between theory and observations at non-linear scales including a physical model of structure formation and the demonstrated capability of uncertainty quantification. A connection between cosmology and information theory using real data also naturally emerges from our probabilistic approach. Our results constitute quantitative chrono-cosmography of the complex web-like patterns underlying the observed galaxy distribution.
A Bayesian Network Approach for Offshore Risk Analysis Through Linguistic Variables
Institute of Scientific and Technical Information of China (English)
None
2007-01-01
This paper presents a new approach for offshore risk analysis that is capable of dealing with linguistic probabilities in Bayesian networks (BNs). In this paper, linguistic probabilities are used to describe occurrence likelihood of hazardous events that may cause possible accidents in offshore operations. In order to use fuzzy information, an f-weighted valuation function is proposed to transform linguistic judgements into crisp probability distributions which can be easily put into a BN to model causal relationships among risk factors. The use of linguistic variables makes it easier for human experts to express their knowledge, and the transformation of linguistic judgements into crisp probabilities can significantly reduce the cost of computing, modifying and maintaining a BN model. The flexibility of the method allows for multiple forms of information to be used to quantify model relationships, including formally assessed expert opinion when quantitative data are lacking, or when only qualitative or vague statements can be made. The model is a modular representation of uncertain knowledge arising from randomness, vagueness and ignorance. This makes the risk analysis of offshore engineering systems more functional and easier in many assessment contexts. Specifically, the proposed f-weighted valuation function takes into account not only the dominating values, but also the α-level values that are ignored by conventional valuation methods. A case study of the collision risk between a Floating Production, Storage and Off-loading (FPSO) unit and the authorised vessels due to human elements during operation is used to illustrate the application of the proposed model.
Health at the borders: Bayesian multilevel analysis of women's malnutrition determinants in Ethiopia
Directory of Open Access Journals (Sweden)
Tefera Darge Delbiso
2016-07-01
Background: Women's malnutrition, particularly undernutrition, remains an important public health challenge in Ethiopia. Although various studies examined the levels and determinants of women's nutritional status, the influence of living close to an international border on women's nutrition has not been investigated. Yet, Ethiopian borders are regularly affected by conflict and refugee flows, which might ultimately impact health. Objective: To investigate the impact of living close to borders on the nutritional status of women in Ethiopia, while considering other important covariates. Design: Our analysis was based on the body mass index (BMI) of 6,334 adult women aged 20–49 years, obtained from the 2011 Ethiopian Demographic and Health Survey (EDHS). A Bayesian multilevel multinomial logistic regression analysis was used to capture the clustered structure of the data and the possible correlation that may exist within and between clusters. Results: After controlling for potential confounders, women living close to borders (i.e. ≤100 km) in Ethiopia were 59% more likely to be underweight (posterior odds ratio [OR]=1.59; 95% credible interval [CrI]: 1.32–1.90) than their counterparts living far from the borders. This result was robust to different choices of border delineation (i.e. ≤50, ≤75, ≤125, and ≤150 km). Being from a poor family, having no access to improved toilets, residing in lowland areas, and being Muslim were independently associated with underweight. In contrast, more wealth, higher education, older age, access to improved toilets, being married, and living in urban or lowland areas were independently associated with overweight. Conclusions: The problem of undernutrition among women in Ethiopia is most worrisome in the border areas. Targeted interventions to improve nutritional status in these areas, such as improved access to sanitation, economic and livelihood support, are recommended.
Lander, Tonya A; Oddou-Muratorio, Sylvie; Prouillet-Leplat, Helene; Klein, Etienne K
2011-12-01
Range expansion and contraction has occurred in the history of most species and can seriously impact patterns of genetic diversity. Historical data about range change are rare and generally appropriate for studies at large scales, whereas the individual pollen and seed dispersal events that form the basis of geneflow and colonization generally occur at a local scale. In this study, we investigated range change in Fagus sylvatica on Mont Ventoux, France, using historical data from 1838 to the present and approximate Bayesian computation (ABC) analyses of genetic data. From the historical data, we identified a population minimum in 1845 and located remnant populations at least 200 years old. The ABC analysis selected a demographic scenario with three populations, corresponding to two remnant populations and one area of recent expansion. It also identified expansion from a smaller ancestral population but did not find that this expansion followed a population bottleneck, as suggested by the historical data. Despite a strong support to the selected scenario for our data set, the ABC approach showed a low power to discriminate among scenarios on average and a low ability to accurately estimate effective population sizes and divergence dates, probably due to the temporal scale of the study. This study provides an unusual opportunity to test ABC analysis in a system with a well-documented demographic history and identify discrepancies between the results of historical, classical population genetic and ABC analyses. The results also provide valuable insights into genetic processes at work at a fine spatial and temporal scale in range change and colonization.
How few countries will do? Comparative survey analysis from a Bayesian perspective
Directory of Open Access Journals (Sweden)
Joop J.C.M. Hox
2012-07-01
Meuleman and Billiet (2009) have carried out a simulation study addressing the question of how many countries are needed for accurate multilevel SEM estimation in comparative studies. The authors concluded that a sample of 50 to 100 countries is needed for accurate estimation. Recently, Bayesian estimation methods have been introduced in structural equation modeling which should work well with much lower sample sizes. The current study reanalyzes the simulation of Meuleman and Billiet using Bayesian estimation to find the lowest number of countries needed when conducting multilevel SEM. The main result of our simulations is that a sample of about 20 countries is sufficient for accurate Bayesian estimation, which makes multilevel SEM practicable for the number of countries commonly available in large scale comparative surveys.
DEFF Research Database (Denmark)
Pedersen, Thorkild Find
2003-01-01
Rotating and reciprocating mechanical machines emit acoustic noise and vibrations when they operate. Typically, the noise and vibrations are concentrated in narrow frequency bands related to the running speed of the machine. The frequency of the running speed is referred to as the fundamental...... of an adaptive comb filter is derived for tracking non-stationary signals. The estimation problem is then rephrased in terms of the Bayesian statistical framework. In the Bayesian framework both parameters and observations are considered stochastic processes. The result of the estimation is an expression...
Fancher, Chris M.; Han, Zhen; Levin, Igor; Page, Katharine; Reich, Brian J.; Smith, Ralph C.; Wilson, Alyson G.; Jones, Jacob L.
2016-01-01
A Bayesian inference method for refining crystallographic structures is presented. The distribution of model parameters is stochastically sampled using Markov chain Monte Carlo. Posterior probability distributions are constructed for all model parameters to properly quantify uncertainty by appropriately modeling the heteroskedasticity and correlation of the error structure. The proposed method is demonstrated by analyzing a National Institute of Standards and Technology silicon standard reference material. The results obtained by Bayesian inference are compared with those determined by Rietveld refinement. Posterior probability distributions of model parameters provide both estimates and uncertainties. The new method better estimates the true uncertainties in the model as compared to the Rietveld method. PMID:27550221
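The sampling step described above can be illustrated with a minimal sketch. This is a generic random-walk Metropolis sampler, not the authors' crystallographic refinement code: the toy data, the flat prior, the unit measurement variance, and the names `metropolis`, `log_post`, and `step` are all assumptions made for illustration.

```python
import math
import random

def metropolis(log_post, start, step, n_samples, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional log posterior."""
    rng = random.Random(seed)
    x, lp = start, log_post(start)
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        lp_new = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, lp_new - lp)):
            x, lp = proposal, lp_new
        samples.append(x)
    return samples

# Hypothetical toy data: three noisy measurements of one parameter,
# with unit measurement variance and a flat prior (assumptions).
data = [1.0, 2.0, 3.0]

def log_post(theta):
    # Log posterior up to an additive constant.
    return -0.5 * sum((y - theta) ** 2 for y in data)

samples = metropolis(log_post, start=0.0, step=1.0, n_samples=20000)
kept = samples[2000:]  # discard burn-in
post_mean = sum(kept) / len(kept)
post_var = sum((s - post_mean) ** 2 for s in kept) / len(kept)
print(post_mean, post_var)
```

For this toy problem the exact posterior is normal with mean 2 and variance 1/3, so the sample mean and variance give both an estimate and an uncertainty, which is the property the abstract emphasizes.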
Gilkey, Kelly M.; Myers, Jerry G.; McRae, Michael P.; Griffin, Elise A.; Kallrui, Aditya S.
2012-01-01
The Exploration Medical Capability project is creating a catalog of risk assessments using the Integrated Medical Model (IMM). The IMM is a software-based system intended to assist mission planners in preparing for spaceflight missions by helping them to make informed decisions about medical preparations and supplies needed for combating and treating various medical events using Probabilistic Risk Assessment. The objective is to use statistical analyses to inform the IMM decision tool with estimated probabilities of medical events occurring during an exploration mission. Because data regarding astronaut health are limited, Bayesian statistical analysis is used. Bayesian inference combines prior knowledge, such as data from the general U.S. population, the U.S. Submarine Force, or the analog astronaut population located at the NASA Johnson Space Center, with observed data for the medical condition of interest. The posterior results reflect the best evidence for specific medical events occurring in flight. Bayes theorem provides a formal mechanism for combining available observed data with data from similar studies to support the quantification process. The IMM team performed Bayesian updates on the following medical events: angina, appendicitis, atrial fibrillation, atrial flutter, dental abscess, dental caries, dental periodontal disease, gallstone disease, herpes zoster, renal stones, seizure, and stroke.
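One simple way to picture how prior data and observed data combine is a conjugate Beta-binomial update. The abstract does not say which model the IMM team used, so the sketch below is an assumption, and the counts are hypothetical numbers chosen only to show the arithmetic.

```python
from fractions import Fraction

def beta_binomial_update(prior_alpha, prior_beta, events, trials):
    """Return posterior Beta parameters after observing `events` in `trials`."""
    if not 0 <= events <= trials:
        raise ValueError("events must lie between 0 and trials")
    return prior_alpha + events, prior_beta + (trials - events)

def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution, as an exact fraction."""
    return Fraction(alpha) / (Fraction(alpha) + Fraction(beta))

# Hypothetical numbers, for illustration only (not taken from the IMM):
# the prior encodes 2 events in 200 person-years from an analog population,
# and the new data are 1 event in 50 person-years of flight.
prior = (2, 198)
posterior = beta_binomial_update(*prior, events=1, trials=50)
print(posterior, float(beta_mean(*posterior)))  # (3, 247) 0.012
```

The posterior mean (0.012) lies between the prior mean (0.01) and the observed rate (0.02), weighted by how much data each source contributes.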
Directory of Open Access Journals (Sweden)
Krzysztof Tomanek
2014-05-01
The purpose of this article is to present the basic methods for classifying text data. These methods draw on advances in areas such as natural language processing and the analysis of unstructured data. I introduce and compare two analytical techniques applied to text data. The first makes use of a thematic vocabulary tool (sentiment analysis). The second uses the idea of Bayesian classification and applies the so-called naive Bayes algorithm. My comparison grades the efficiency of these two analytical techniques. I emphasize solutions that can be used to build a dictionary accurate enough for the task of text classification. Then, I compare the effectiveness of supervised classification with that of automated unsupervised analysis. These results reinforce the conclusion that a dictionary which has received a good evaluation as a tool for classification should be subjected to review and modification procedures if it is to be applied to new empirical material. Adaptation procedures for the analytical dictionary become, in my proposed approach, the basic step in the methodology of textual data analysis.
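As a rough illustration of the naive Bayes technique named above, here is a minimal multinomial classifier with add-one (Laplace) smoothing. The toy documents, labels, and function names are hypothetical; the article's own data and implementation are not given in the abstract.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(docs):
    """docs: list of (tokens, label). Returns the counts needed for prediction."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def predict(model, tokens):
    """Return the label with the highest posterior log score."""
    class_counts, word_counts, vocab = model
    n_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        total = sum(word_counts[label].values())
        score = math.log(n / n_docs)  # log prior of the class
        for t in tokens:
            if t not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing keeps unseen (word, class) pairs from zeroing the score.
            score += math.log((word_counts[label][t] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy training set.
toy = [
    ("good great fine".split(), "pos"),
    ("great happy".split(), "pos"),
    ("bad awful".split(), "neg"),
    ("bad sad poor".split(), "neg"),
]
model = train_naive_bayes(toy)
print(predict(model, ["great", "good"]), predict(model, ["bad", "poor"]))  # pos neg
```

The "naive" part is the assumption that words are conditionally independent given the class, which is why the score is a plain sum of per-word log probabilities.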
New class of hybrid EoS and Bayesian M - R data analysis
Energy Technology Data Exchange (ETDEWEB)
Alvarez-Castillo, D. [JINR Dubna, Bogoliubov Laboratory of Theoretical Physics, Dubna (Russian Federation); Ayriyan, A.; Grigorian, H. [JINR Dubna, Laboratory of Information Technologies, Dubna (Russian Federation); Benic, S. [University of Zagreb, Department of Physics, Zagreb (Croatia); Blaschke, D. [JINR Dubna, Bogoliubov Laboratory of Theoretical Physics, Dubna (Russian Federation); National Research Nuclear University (MEPhI), Moscow (Russian Federation); Typel, S. [GSI Helmholtzzentrum fuer Schwerionenforschung GmbH, Darmstadt (Germany)
2016-03-15
We explore systematically a new class of two-phase equations of state (EoS) for hybrid stars that is characterized by three main features: (1) stiffening of the nuclear EoS at supersaturation densities due to quark exchange effects (Pauli blocking) between hadrons, modelled by an excluded volume correction; (2) stiffening of the quark matter EoS at high densities due to multiquark interactions; and (3) possibility for a strong first-order phase transition with an early onset and large density jump. The third feature results from a Maxwell construction for the possible transition from the nuclear to a quark matter phase and its properties depend on the two parameters used for (1) and (2), respectively. Varying these two parameters, one obtains a class of hybrid EoS that yields solutions of the Tolman-Oppenheimer-Volkoff (TOV) equations for sequences of hadronic and hybrid stars in the mass-radius diagram which cover the full range of patterns according to the Alford-Han-Prakash classification following which a hybrid star branch can be either absent, connected or disconnected with the hadronic one. The latter case often includes a tiny connected branch. The disconnected hybrid star branch, also called "third family", corresponds to high-mass twin stars characterized by the same gravitational mass but different radii. We perform a Bayesian analysis and demonstrate that the observation of such a pair of high-mass twin stars would have a sufficient discriminating power to favor hybrid EoS with a strong first-order phase transition over alternative EoS.
A method of spherical harmonic analysis in the geosciences via hierarchical Bayesian inference
Muir, J. B.; Tkalčić, H.
2015-11-01
The problem of decomposing irregular data on the sphere into a set of spherical harmonics is common in many fields of geosciences where it is necessary to build a quantitative understanding of a globally varying field. For example, in global seismology, a compressional or shear wave speed that emerges from tomographic images is used to interpret current state and composition of the mantle, and in geomagnetism, secular variation of magnetic field intensity measured at the surface is studied to better understand the changes in the Earth's core. Optimization methods are widely used for spherical harmonic analysis of irregular data, but they typically do not treat the dependence of the uncertainty estimates on the imposed regularization. This can cause significant difficulties in interpretation, especially when the best-fit model requires more variables as a result of underestimating data noise. Here, with the above limitations in mind, the problem of spherical harmonic expansion of irregular data is treated within the hierarchical Bayesian framework. The hierarchical approach significantly simplifies the problem by removing the need for regularization terms and user-supplied noise estimates. The use of the corrected Akaike Information Criterion for picking the optimal maximum degree of spherical harmonic expansion and the resulting spherical harmonic analyses are first illustrated on a noisy synthetic data set. Subsequently, the method is applied to two global data sets sensitive to the Earth's inner core and lowermost mantle, consisting of PKPab-df and PcP-P differential traveltime residuals relative to a spherically symmetric Earth model. The posterior probability distributions for each spherical harmonic coefficient are calculated via Markov Chain Monte Carlo sampling; the uncertainty obtained for the coefficients thus reflects the noise present in the real data and the imperfections in the spherical harmonic expansion.
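The model-selection step can be sketched as follows. The abstract does not give the exact form of the criterion the authors used, so this assumes the standard small-sample correction AICc = 2k - 2 ln L + 2k(k+1)/(n-k-1), with k taken as the (lmax+1)^2 coefficients of an expansion up to degree lmax (any extra noise parameters are ignored here). The log-likelihood values are hypothetical.

```python
def n_sh_coefficients(lmax):
    """Number of real spherical harmonic coefficients up to degree lmax."""
    return (lmax + 1) ** 2

def aicc(log_likelihood, k, n):
    """Corrected Akaike Information Criterion for k parameters and n data."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)

def pick_lmax(log_likelihoods, n):
    """log_likelihoods maps lmax -> maximized log-likelihood; return the lmax with the smallest AICc."""
    return min(
        log_likelihoods,
        key=lambda lmax: aicc(log_likelihoods[lmax], n_sh_coefficients(lmax), n),
    )

# Hypothetical fit results for n = 200 observations (illustration only).
fits = {1: -520.0, 2: -480.0, 3: -478.0, 4: -477.5}
best = pick_lmax(fits, n=200)
print(best)  # 2
```

In this toy case degree 2 wins: going to degree 3 or 4 improves the fit only slightly while adding 7 and 16 more coefficients, and the penalty term outweighs the gain.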
Bayesian approach to the analysis of neutron Brillouin scattering data on liquid metals
De Francesco, A.; Guarini, E.; Bafile, U.; Formisano, F.; Scaccia, L.
2016-08-01
When the dynamics of liquids and disordered systems at mesoscopic level is investigated by means of inelastic scattering (e.g., neutron or x ray), spectra are often characterized by a poor definition of the excitation lines and spectroscopic features in general, and one important issue is to establish how many of these lines need to be included in the modeling function and to estimate their parameters. Furthermore, when strongly damped excitations are present, commonly used and widespread fitting algorithms are particularly affected by the choice of initial values of the parameters. An inadequate choice may lead to an inefficient exploration of the parameter space, resulting in the algorithm getting stuck in a local minimum. In this paper, we present a Bayesian approach to the analysis of neutron Brillouin scattering data in which the number of excitation lines is treated as unknown and estimated along with the other model parameters. We propose a joint estimation procedure based on a reversible-jump Markov chain Monte Carlo algorithm, which efficiently explores the parameter space, producing a probabilistic measure to quantify the uncertainty on the number of excitation lines as well as reliable parameter estimates. The proposed method could prove to be of great importance in extracting physical information from experimental data, especially when the detection of spectral features is complicated not only by the properties of the sample, but also by the limited instrumental resolution and count statistics. The approach is tested on a generated data set and then applied to real experimental spectra of neutron Brillouin scattering from a liquid metal, previously analyzed in a more traditional way.
Cancer mortality inequalities in urban areas: a Bayesian small area analysis in Spanish cities
Directory of Open Access Journals (Sweden)
Martos Carmen M
2011-01-01
Background: Intra-urban inequalities in mortality have been infrequently analysed in European contexts. The aim of the present study was to analyse patterns of cancer mortality and their relationship with socioeconomic deprivation in small areas in 11 Spanish cities. Methods: This is a cross-sectional ecological design using mortality data (years 1996-2003). Units of analysis were the census tracts. A deprivation index was calculated for each census tract. In order to control the variability in estimating the risk of dying we used Bayesian models. We present the RR of the census tract with the highest deprivation vs. the census tract with the lowest deprivation. Results: In the case of men, socioeconomic inequalities are observed in total cancer mortality in all cities, except in Castellon, Cordoba and Vigo, while Barcelona (RR = 1.53, 95% CI 1.42-1.67), Madrid (RR = 1.57, 95% CI 1.49-1.65) and Seville (RR = 1.53, 95% CI 1.36-1.74) present the greatest inequalities. In general, Barcelona and Madrid present inequalities for most types of cancer. Among women, for total cancer mortality, inequalities have only been found in Barcelona and Zaragoza. The excess number of cancer deaths due to socioeconomic deprivation was 16,413 for men and 1,142 for women. Conclusion: This study has analysed inequalities in cancer mortality in small areas of cities in Spain, not only relating this mortality with socioeconomic deprivation, but also calculating the excess mortality which may be attributed to such deprivation. This knowledge is particularly useful to determine which geographical areas in each city need intersectorial policies in order to promote a healthy environment.
Analysis of ASR Clogging Investigations at Three Australian ASR Sites in a Bayesian Context
Directory of Open Access Journals (Sweden)
Peter Dillon
2016-10-01
When evaluating uncertainties in developing an aquifer storage and recovery (ASR) system, under normal budgetary constraints, a systematic approach is needed to prioritise investigations. Three case studies where field trials have been undertaken, and clogging evaluated, reveal the changing perceptions of viability of ASR from a clogging perspective as a result of the progress of investigations. Two stormwater and one recycled water ASR investigations in siliceous aquifers are described that involved different strategies to evaluate the potential for clogging. This paper reviews these sites, as well as earlier case studies and information relating water quality to clogging in column studies. Two novel theoretical concepts are introduced in the paper. Bayesian analysis is applied to demonstrate the increase in expected net benefit in developing a new ASR operation by undertaking clogging experiments (that have an assumed known reliability for predicting viability) for the injectant treatment options and aquifer material from the site. Results for an example situation demonstrate benefit cost ratios of experiments ranging from 1.5 to 6 and apply if decisions are based on experimental results whether success or failure are predicted. Additionally, a theoretical assessment of clogging rates characterised as acute and chronic is given, to explore their combined impact, for two operating parameters that define the onset of purging for recovery of reversible clogging and the onset of occasional advanced bore rehabilitation to address recovery of chronic clogging. These allow the assessment of net recharge and the proportion of water purged or redeveloped. Both analyses could inform economic decisions and help motivate an improved investigation methodology. It is expected that aquifer heterogeneity will result in differing injection rates among wells, so operational experience will ultimately be valuable in differentiating clogging behaviour under
Use of Bayesian event trees in semi-quantitative volcano eruption forecasting and hazard analysis
Wright, Heather; Pallister, John; Newhall, Chris
2015-04-01
Use of Bayesian event trees to forecast eruptive activity during volcano crises is an increasingly common practice for the USGS-USAID Volcano Disaster Assistance Program (VDAP) in collaboration with foreign counterparts. This semi-quantitative approach combines conceptual models of volcanic processes with current monitoring data and patterns of occurrence to reach consensus probabilities. This approach allows a response team to draw upon global datasets, local observations, and expert judgment, where the relative influence of these data depends upon the availability and quality of monitoring data and the degree to which the volcanic history is known. The construction of such event trees additionally relies upon existence and use of relevant global databases and documented past periods of unrest. Because relevant global databases may be underpopulated or nonexistent, uncertainty in probability estimations may be large. Our 'hybrid' approach of combining local and global monitoring data and expert judgment facilitates discussion and constructive debate between disciplines: including seismology, gas geochemistry, geodesy, petrology, physical volcanology and technology/engineering, where difference in opinion between response team members contributes to definition of the uncertainty in the probability estimations. In collaboration with foreign colleagues, we have created event trees for numerous areas experiencing volcanic unrest. Event trees are created for a specified time frame and are updated, revised, or replaced as the crisis proceeds. Creation of an initial tree is often prompted by a change in monitoring data, such that rapid assessment of probability is needed. These trees are intended as a vehicle for discussion and a way to document relevant data and models, where the target audience is the scientists themselves. However, the probabilities derived through the event-tree analysis can also be used to help inform communications with emergency managers and the
Impact of breed and sex on porcine endocrine transcriptome: a bayesian biometrical analysis
Directory of Open Access Journals (Sweden)
Ojeda Ana
2009-02-01
Background: Transcriptome variability is due to genetic and environmental causes, much like any other complex phenotype. Ascertaining the transcriptome differences between individuals is an important step to understand how selection and genetic drift may affect gene expression. To that end, extant divergent livestock breeds offer an ideal genetic material. Results: We have analyzed with microarrays five tissues from the endocrine axis (hypothalamus, adenohypophysis, thyroid gland, gonads and fat tissue) of 16 pigs from both sexes pertaining to four extreme breeds (Duroc, Large White, Iberian and a cross with SinoEuropean hybrid line). Using a Bayesian linear model approach, we observed that the largest breed variability corresponded to the male gonads, and was larger than at the remaining tissues, including ovaries. Measurement of sex hormones in peripheral blood at slaughter did not detect any breed-related differences. Not unexpectedly, the gonads were the tissue with the largest number of sex biased genes. There was a strong correlation between sex and breed bias expression, although the most breed biased genes were not the most sex biased genes. A combined analysis of connectivity and differential expression suggested three biological processes as being primarily different between breeds: spermatogenesis, muscle differentiation and several metabolic processes. Conclusion: These results suggest that differences across breeds in gene expression of the male gonads are larger than in other endocrine tissues in the pig. Nevertheless, the strong presence of breed biased genes in the male gonads cannot be explained solely by changes in spermatogenesis nor by differences in the reproductive tract development.
A Bayesian analysis of rare B decays with advanced Monte Carlo methods
Energy Technology Data Exchange (ETDEWEB)
Beaujean, Frederik
2012-11-12
Searching for new physics in rare B meson decays governed by $b \to s$ transitions, we perform a model-independent global fit of the short-distance couplings $C_7$, $C_9$, and $C_{10}$ of the $\Delta B = 1$ effective field theory. We assume the standard-model set of $b \to s\gamma$ and $b \to s\ell^+\ell^-$ operators with real-valued $C_i$. A total of 59 measurements by the experiments BaBar, Belle, CDF, CLEO, and LHCb of observables in $B \to K^*\gamma$, $B \to K^{(*)}\ell^+\ell^-$, and $B_s \to \mu^+\mu^-$ decays are used in the fit. Our analysis is the first of its kind to harness the full power of the Bayesian approach to probability theory. All main sources of theory uncertainty explicitly enter the fit in the form of nuisance parameters. We make optimal use of the experimental information to simultaneously constrain the Wilson coefficients as well as hadronic form factors - the dominant theory uncertainty. Generating samples from the posterior probability distribution to compute marginal distributions and predict observables by uncertainty propagation is a formidable numerical challenge for two reasons. First, the posterior has multiple well separated maxima and degeneracies. Second, the computation of the theory predictions is very time consuming. A single posterior evaluation requires O(1 s), and a few million evaluations are needed. Population Monte Carlo (PMC) provides a solution to both issues; a mixture density is iteratively adapted to the posterior, and samples are drawn in a massively parallel way using importance sampling. The major shortcoming of PMC is the need for cogent knowledge of the posterior at the initial stage. In an effort towards a general black-box Monte Carlo sampling algorithm, we present a new method to extract the necessary information in a reliable and automatic manner from Markov chains with the help of hierarchical clustering. Exploiting the latest 2012 measurements, the fit
Bayesian Analysis for Linearized Multi-Stage Models in Quantal Bioassay.
Kuo, Lynn; Cohen, Michael P.
Bayesian methods for estimating dose response curves in quantal bioassay are studied. A linearized multi-stage model is assumed for the shape of the curves. A Gibbs sampling approach with data augmentation is employed to compute the Bayes estimates. In addition, estimation of the "relative additional risk" and the "risk specific…
Another look at Bayesian analysis of AMMI models for genotype-environment data
Josse, J.; Eeuwijk, van F.A.; Piepho, H.P.; Denis, J.B.
2014-01-01
Linear–bilinear models are frequently used to analyze two-way data such as genotype-by-environment data. A well-known example of this class of models is the additive main effects and multiplicative interaction effects model (AMMI). We propose a new Bayesian treatment of such models offering a proper
Bayesian analysis of censored response data in family-based genetic association studies.
Del Greco M, Fabiola; Pattaro, Cristian; Minelli, Cosetta; Thompson, John R
2016-09-01
Biomarkers are subject to censoring whenever some measurements are not quantifiable given a laboratory detection limit. Methods for handling censoring have received less attention in genetic epidemiology, and censored data are still often replaced with a fixed value. We compared different strategies for handling a left-censored continuous biomarker in a family-based study, where the biomarker is tested for association with a genetic variant, S, adjusting for a covariate, X. Allowing different correlations between X and S, we compared simple substitution of censored observations with the detection limit followed by a linear mixed effect model (LMM), Bayesian model with noninformative priors, Tobit model with robust standard errors, the multiple imputation (MI) with and without S in the imputation followed by a LMM. Our comparison was based on real and simulated data in which 20% and 40% censoring were artificially induced. The complete data were also analyzed with a LMM. In the MICROS study, the Bayesian model gave results closer to those obtained with the complete data. In the simulations, simple substitution was always the most biased method, the Tobit approach gave the least biased estimates at all censoring levels and correlation values, the Bayesian model and both MI approaches gave slightly biased estimates but smaller root mean square errors. On the basis of these results the Bayesian approach is highly recommended for candidate gene studies; however, the computationally simpler Tobit and the MI without S are both good options for genome-wide studies.
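The difference between substituting the detection limit and modelling the censoring can be seen in the likelihood itself. The sketch below shows the log-likelihood contribution used by a Tobit-type model for a left-censored normal outcome: quantified values contribute a density term, censored values contribute the probability of falling below the limit. It is a simplified illustration under the assumption of independent observations with a common mean and no covariates, not the family-based models compared in the study. All names and numbers are hypothetical.

```python
import math

def normal_logpdf(x, mu, sigma):
    """Log density of a normal distribution."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2 * math.pi)

def normal_logcdf(x, mu, sigma):
    """Log of the normal cumulative distribution function."""
    z = (x - mu) / sigma
    return math.log(0.5 * math.erfc(-z / math.sqrt(2)))

def censored_loglik(values, censored, limit, mu, sigma):
    """Log-likelihood of left-censored normal data.

    values:   measurements (entries for censored observations are ignored)
    censored: booleans, True where the value fell below `limit`
    """
    total = 0.0
    for y, is_censored in zip(values, censored):
        if is_censored:
            total += normal_logcdf(limit, mu, sigma)  # P(Y < limit)
        else:
            total += normal_logpdf(y, mu, sigma)
    return total

# Hypothetical biomarker values with a detection limit of 1.0.
values = [2.1, 1.4, 1.0, 3.0, 1.0]
censored = [False, False, True, False, True]
print(censored_loglik(values, censored, limit=1.0, mu=1.8, sigma=0.9))
```

Simple substitution would instead treat the two censored entries as exact measurements of 1.0, which is one way to see why the abstract reports it as the most biased option.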
How few countries will do? Comparative survey analysis from a Bayesian perspective
Hox, Joop; van de Schoot, Rens; Matthijsse, Suzette
2012-01-01
Meuleman and Billiet (2009) have carried out a simulation study aimed at the question how many countries are needed for accurate multilevel SEM estimation in comparative studies. The authors concluded that a sample of 50 to 100 countries is needed for accurate estimation. Recently, Bayesian estimati
Bayesian Missile System Reliability from Point Estimates
2014-10-28
...Principle (MEP) to convert point estimates to probability distributions to be used as priors for Bayesian reliability analysis of missile data, and...illustrate this approach by applying the priors to a Bayesian reliability model of a missile system. Subject terms: priors, Bayesian, missile
Bayesian Inference on Gravitational Waves
Directory of Open Access Journals (Sweden)
Asad Ali
2015-12-01
The Bayesian approach is becoming increasingly popular among the astrophysics data analysis communities. However, the Pakistan statistics communities are unaware of this fertile interaction between the two disciplines. Bayesian methods have been in use to address astronomical problems since the very birth of Bayesian probability in the eighteenth century. Today the Bayesian methods for the detection and parameter estimation of gravitational waves have solid theoretical grounds with a strong promise for realistic applications. This article aims to introduce the Pakistan statistics communities to the applications of Bayesian Monte Carlo methods in the analysis of gravitational wave data, with an overview of the Bayesian signal detection and estimation methods and a demonstration using a couple of simplified examples.
Bayesian Information-Gap Decision Analysis Applied to a CO2 Leakage Problem
O'Malley, D.; Vesselinov, V. V.
2014-12-01
We describe a decision analysis in the presence of uncertainty that combines a non-probabilistic approach (information-gap decision theory) with a probabilistic approach (Bayes' theorem). Bayes' theorem is one of the most popular techniques for probabilistic uncertainty quantification (UQ). It is effective in many situations, because it updates our understanding of the uncertainties by conditioning on real data using a mathematically rigorous technique. However, the application of Bayes' theorem in science and engineering is not always rigorous. There are two reasons for this: (1) We can enumerate the possible outcomes of dice-rolling, but not the possible outcomes of real-world contamination remediation; (2) We can precisely determine conditional probabilities for coin-tossing, but substantial uncertainty surrounds the conditional probabilities for real-world contamination remediation. Of course, Bayes' theorem is rigorously applicable beyond dice-rolling and coin-tossing, but even in cases that are constructed to be simple with ostensibly good probabilistic models, applying Bayes' theorem to the real world may not work as well as one might expect. Bayes' theorem is rigorously applicable only if all possible events can be described, and their conditional probabilities can be derived rigorously. Outside of this domain, it may still be useful, but its use lacks at least some rigor. The information-gap approach allows us to circumvent some of the highlighted shortcomings of Bayes' theorem. In particular, it provides a way to account for possibilities beyond those described by our models, and a way to deal with uncertainty in the conditional distribution that forms the core of Bayesian analysis. We have developed a three-tiered technique that enables one to make scientifically defensible decisions in the face of severe uncertainty such as is found in many geologic problems. To demonstrate the applicability, we apply the technique to a CO2 leakage problem. The goal is to
Bernardo, Jose M
2000-01-01
This highly acclaimed text, now available in paperback, provides a thorough account of key concepts and theoretical results, with particular emphasis on viewing statistical inference as a special case of decision theory. Information-theoretic concepts play a central role in the development of the theory, which provides, in particular, a detailed discussion of the problem of specification of so-called prior ignorance. The work is written from the authors' committed Bayesian perspective, but an overview of non-Bayesian theories is also provided, and each chapter contains a wide-ranging critica
Chee, S Y
2015-05-25
The mitochondrial DNA (mtDNA) cytochrome oxidase I (COI) gene has been universally and successfully utilized as a barcoding gene, mainly because it can be amplified easily, applied across a wide range of taxa, and results can be obtained cheaply and quickly. However, in rare cases, the gene can fail to distinguish between species, particularly when exposed to highly sensitive methods of data analysis, such as the Bayesian method, or when taxa have undergone introgressive hybridization, over-splitting, or incomplete lineage sorting. Such cases require the use of alternative markers, and nuclear DNA markers are commonly used. In this study, a dendrogram produced by Bayesian analysis of an mtDNA COI dataset was compared with that of a nuclear DNA ATPS-α dataset, in order to evaluate the efficiency of COI in barcoding Malaysian nerites (Neritidae). In the COI dendrogram, most of the species were in individual clusters, except for two species: Nerita chamaeleon and N. histrio. These two species were placed in the same subcluster, whereas in the ATPS-α dendrogram they were in their own subclusters. Analysis of the ATPS-α gene also placed the two genera of nerites (Nerita and Neritina) in separate clusters, whereas COI gene analysis placed both genera in the same cluster. Therefore, in the case of the Neritidae, the ATPS-α gene is a better barcoding gene than the COI gene.
Osborne, S F
1984-02-01
The medical issues that arise in the isolated environment of a submarine can occasionally be grave. While crewmembers are carefully screened for health problems, they are still susceptible to serious acute illness. Currently, the submarine medical department representative, the hospital corpsman, utilizes a history and physical examination, clinical acumen, and limited laboratory testing in diagnosis. The application of a Bayesian method of analysis to an abdominal pain diagnostic system utilizing an onboard microcomputer is described herein. Early results from sea trials show an appropriate diagnosis in eight of 10 cases of abdominal pain, but the program should still be viewed as an extended "laboratory test" until proved effective at sea.
A Bayesian ridge regression analysis of congestion's impact on urban expressway safety.
Shi, Qi; Abdel-Aty, Mohamed; Lee, Jaeyoung
2016-03-01
With the rapid growth of traffic in urban areas, concerns about congestion and traffic safety have been heightened. This study leveraged both Automatic Vehicle Identification (AVI) system and Microwave Vehicle Detection System (MVDS) installed on an expressway in Central Florida to explore how congestion impacts the crash occurrence in urban areas. Multiple congestion measures from the two systems were developed. To ensure more precise estimates of the congestion's effects, the traffic data were aggregated into peak and non-peak hours. Multicollinearity among traffic parameters was examined. The results showed the presence of multicollinearity especially during peak hours. As a response, ridge regression was introduced to cope with this issue. Poisson models with uncorrelated random effects, correlated random effects, and both correlated random effects and random parameters were constructed within the Bayesian framework. It was proven that correlated random effects could significantly enhance model performance. The random parameters model has similar goodness-of-fit compared with the model with only correlated random effects. However, by accounting for the unobserved heterogeneity, more variables were found to be significantly related to crash frequency. The models indicated that congestion increased crash frequency during peak hours while during non-peak hours it was not a major crash contributing factor. Using the random parameter model, the three congestion measures were compared. It was found that all congestion indicators had similar effects while Congestion Index (CI) derived from MVDS data was a better congestion indicator for safety analysis. Also, analyses showed that the segments with higher congestion intensity could not only increase property damage only (PDO) crashes, but also more severe crashes. In addition, the issues regarding the necessity to incorporate specific congestion indicator for congestion's effects on safety and to take care of the
DEFF Research Database (Denmark)
Strathe, Anders Bjerring; Jørgensen, Henry; Kebreab, E
2012-01-01
The objective of the current study was to develop Bayesian simultaneous equation models for modelling energy intake and partitioning in growing pigs. A key feature of the Bayesian approach is that parameters are assigned prior distributions, which may reflect the current state... developed, reflecting current knowledge about metabolic scaling and partial efficiencies of PD and LD rates, whereas flat non-informative priors were used for the remainder of the parameters. The experimental data analysed originate from a balance and respiration trial with 17 cross-bred pigs of three... genders (barrows, boars and gilts) selected on the basis of similar birth weight. The pigs were fed four diets based on barley, wheat and soybean meal supplemented with crystalline amino acids to meet or exceed Danish nutrient requirement standards. Nutrient balances and gas exchanges were measured at c...
PARALLEL ADAPTIVE MULTILEVEL SAMPLING ALGORITHMS FOR THE BAYESIAN ANALYSIS OF MATHEMATICAL MODELS
Prudencio, Ernesto
2012-01-01
In recent years, Bayesian model updating techniques based on measured data have been applied to many engineering and applied science problems. At the same time, parallel computational platforms are becoming increasingly more powerful and are being used more frequently by the engineering and scientific communities. Bayesian techniques usually require the evaluation of multi-dimensional integrals related to the posterior probability density function (PDF) of uncertain model parameters. The fact that such integrals cannot be computed analytically motivates the research of stochastic simulation methods for sampling posterior PDFs. One such algorithm is the adaptive multilevel stochastic simulation algorithm (AMSSA). In this paper we discuss the parallelization of AMSSA, formulating the necessary load balancing step as a binary integer programming problem. We present a variety of results showing the effectiveness of load balancing on the overall performance of AMSSA in a parallel computational environment.
Bayesian analysis of spatial point processes in the neighbourhood of Voronoi networks
DEFF Research Database (Denmark)
Skare, Øivind; Møller, Jesper; Jensen, Eva B. Vedel
2007-01-01
A model for an inhomogeneous Poisson process with high intensity near the edges of a Voronoi tessellation in 2D or 3D is proposed. The model is analysed in a Bayesian setting with priors on nuclei of the Voronoi tessellation and other model parameters. An MCMC algorithm is constructed to sample f... from biology (animal territories) and material science (alumina grain structure) are presented.
DEFF Research Database (Denmark)
Ehsani, Alireza; Sørensen, Peter; Pomp, Daniel;
2012-01-01
Background: To understand the genetic architecture of complex traits and bridge the genotype-phenotype gap, it is useful to study intermediate -omics data, e.g. the transcriptome. The present study introduces a method for simultaneous quantification of the contributions from single nucleotide... polymorphisms (SNPs) and transcript abundances in explaining phenotypic variance, using Bayesian whole-omics models. Bayesian mixed models and variable selection models were used and, based on parameter samples from the model posterior distributions, explained variances were further partitioned at the level... -modal distribution of genomic values collapses, when gene expressions are added to the model. Conclusions: With increased availability of various -omics data, integrative approaches are promising tools for understanding the genetic architecture of complex traits. Partitioning of explained variances at the chromosome...
A Bayesian Analysis of Kepler-2b Using the EXONEST Algorithm
Placek, Ben
2014-01-01
The study of exoplanets (planets orbiting other stars) is revolutionizing the way we view our universe. High-precision photometric data provided by the Kepler Space Telescope (Kepler) enables not only the detection of such planets, but also their characterization. This presents a unique opportunity to apply Bayesian methods to better characterize the multitude of previously confirmed exoplanets. This paper focuses on applying the EXONEST algorithm to characterize the transiting short-period-hot-Jupiter, Kepler-2b. EXONEST evaluates a suite of exoplanet photometric models by applying Bayesian Model Selection, which is implemented with the MultiNest algorithm. These models take into account planetary effects, such as reflected light and thermal emissions, as well as the effect of the planetary motion on the host star, such as Doppler beaming, or boosting, of light from the reflex motion of the host star, and photometric variations due to the planet-induced ellipsoidal shape of the host star. By calculating mode...
Emmert-Streib, Frank; de Matos Simoes, Ricardo; Tripathi, Shailesh; Glazko, Galina V.; Dehmer, Matthias
2012-01-01
In this paper, we present a Bayesian approach to estimate a chromosome and a disorder network from the Online Mendelian Inheritance in Man (OMIM) database. In contrast to other approaches, we obtain statistical rather than deterministic networks, enabling parametric control of the uncertainty in the underlying disorder-disease gene associations contained in the OMIM, on which the networks are based. From a structural investigation of the chromosome network, we identify three chromosome subgrou...
Directory of Open Access Journals (Sweden)
Xian Shan
2017-01-01
Pipelines are the major mode of natural gas transportation. Leakage of natural gas pipelines may cause explosions and fires, resulting in casualties, environmental damage, and material loss. Efficient risk analysis is of great significance for preventing and mitigating such potential accidents. The objective of this study is to present a practical risk assessment method based on the Bow-tie model and a Bayesian network for risk analysis of natural gas pipeline leakage. First, the potential risk factors and consequences of the failure are identified. Then the Bow-tie model is constructed, and quantitative analysis of the Bayesian network is used to find the weak links in the system and to predict how control measures reduce the accident rate. To deal with the uncertainty in determining the probabilities of basic events, a fuzzy logic method is used. Results of a case study show that the most likely causes of natural gas pipeline leakage are parties ignoring signage, implicit signage, overload, and design defects of auxiliaries. Once leakage occurs, it is most likely to result in fire and explosion. Corresponding measures taken in time will reduce the severity of accidents to the greatest possible extent.
Sankararaman, Shankar
2016-01-01
This paper presents a computational framework for uncertainty characterization and propagation, and sensitivity analysis under the presence of aleatory and epistemic uncertainty, and develops a rigorous methodology for efficient refinement of epistemic uncertainty by identifying important epistemic variables that significantly affect the overall performance of an engineering system. The proposed methodology is illustrated using the NASA Langley Uncertainty Quantification Challenge (NASA-LUQC) problem that deals with uncertainty analysis of a generic transport model (GTM). First, Bayesian inference is used to infer subsystem-level epistemic quantities using the subsystem-level model and corresponding data. Second, tools of variance-based global sensitivity analysis are used to identify four important epistemic variables (this limitation specified in the NASA-LUQC is reflective of practical engineering situations where not all epistemic variables can be refined due to time/budget constraints) that significantly affect system-level performance. The most significant contribution of this paper is the development of the sequential refinement methodology, where epistemic variables for refinement are not identified all-at-once. Instead, only one variable is first identified, and then, Bayesian inference and global sensitivity calculations are repeated to identify the next important variable. This procedure is continued until all four variables are identified and the refinement in the system-level performance is computed. The advantages of the proposed sequential refinement methodology over the all-at-once uncertainty refinement approach are explained, and then applied to the NASA Langley Uncertainty Quantification Challenge problem.
Directory of Open Access Journals (Sweden)
Chris Bambey Guure
2012-01-01
The survival function of the Weibull distribution determines the probability that a unit or an individual will survive beyond a certain specified time, while the failure rate is the rate at which a randomly selected individual known to be alive at a given time will die at a specified later time. The classical approach for estimating the survival function and the failure rate is the maximum likelihood method. In this study, we strive to determine the best method by comparing the classical maximum likelihood estimator against the Bayesian estimators using an informative prior and a proposed data-dependent prior known as the generalised noninformative prior. The Bayesian estimation is considered under three loss functions. Because the integrals arising in the Bayesian estimators are difficult to evaluate, Lindley's approximation procedure is employed to approximate the ratio of the integrals. For the purpose of comparison, the mean squared error (MSE) and the absolute bias are obtained. This study is conducted via simulation using different sample sizes. We observed that the generalised prior we assumed performed better than the others under the linear exponential loss function with respect to MSE and under the general entropy loss function with respect to absolute bias.
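The two quantities named in the abstract above can be written down directly. The following is a minimal sketch, assuming the standard two-parameter Weibull form with survival function S(t) = exp(-(t/scale)^shape) and failure rate h(t) = (shape/scale)(t/scale)^(shape-1). The function names are hypothetical, and the maximum likelihood helper covers only the simplified case of complete (uncensored) data with a known shape parameter. It does not reproduce the study's Bayesian estimators or Lindley's approximation.

```python
import math


def weibull_survival(t, shape, scale):
    """Probability of surviving beyond time t: S(t) = exp(-(t/scale)**shape)."""
    return math.exp(-((t / scale) ** shape))


def weibull_failure_rate(t, shape, scale):
    """Failure (hazard) rate: h(t) = (shape/scale) * (t/scale)**(shape - 1)."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def weibull_mle_scale(times, shape):
    """Maximum likelihood estimate of the scale for a KNOWN shape and
    complete data (an assumption of this sketch):
    scale_hat = (mean(t_i**shape))**(1/shape)."""
    mean_power = sum(t ** shape for t in times) / len(times)
    return mean_power ** (1.0 / shape)


if __name__ == "__main__":
    lifetimes = [1.0, 2.0, 3.0]  # made-up illustrative data
    scale_hat = weibull_mle_scale(lifetimes, shape=1.0)
    print(scale_hat, weibull_survival(2.0, 1.0, scale_hat))
```

With shape equal to 1 the Weibull reduces to the exponential distribution, so the failure rate is constant and the scale estimate is simply the sample mean, which makes the sketch easy to check by hand.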
Gualandi, Adriano; Serpelloni, Enrico; Elina Belardinelli, Maria; Bonafede, Maurizio; Pezzo, Giuseppe; Tolomei, Cristiano
2015-04-01
A critical point in the analysis of ground displacement time series, such as those measured by modern space geodetic techniques (primarily continuous GPS/GNSS and InSAR), is the development of data-driven methods that make it possible to discern and characterize the different sources that generate the observed displacements. A widely used multivariate statistical technique is Principal Component Analysis (PCA), which reduces the dimensionality of the data space while retaining most of the explained variance of the dataset. It reproduces the original data using a limited number of principal components, but it also has deficiencies, since PCA does not perform well in solving the so-called Blind Source Separation (BSS) problem. Recovering and separating the different sources that generate the observed ground deformation is a fundamental task for assigning a physical meaning to those sources. PCA fails in the BSS problem because it looks for a new Euclidean space in which the projected data are uncorrelated. Usually, the uncorrelation condition is not strong enough, and it has been proven that the BSS problem can be tackled by requiring the components to be independent. Independent Component Analysis (ICA) is, in fact, another popular technique adopted to approach this problem, and it can be used in all those fields where PCA is also applied. An ICA approach enables us to explain the displacement time series while imposing fewer constraints on the model, and to reveal anomalies in the data such as transient deformation signals. However, the independence condition is not easy to impose, and it is often necessary to introduce some approximations. To work around this problem, we use a variational Bayesian ICA (vbICA) method, which models the probability density function (pdf) of each source signal using a mixture of Gaussian distributions. This technique allows for more flexibility in the description of the pdf of the sources
Bhadra, Anindya
2013-04-22
We describe a Bayesian technique to (a) perform a sparse joint selection of significant predictor variables and significant inverse covariance matrix elements of the response variables in a high-dimensional linear Gaussian sparse seemingly unrelated regression (SSUR) setting and (b) perform an association analysis between the high-dimensional sets of predictors and responses in such a setting. To search the high-dimensional model space, where both the number of predictors and the number of possibly correlated responses can be larger than the sample size, we demonstrate that a marginalization-based collapsed Gibbs sampler, in combination with spike and slab type of priors, offers a computationally feasible and efficient solution. As an example, we apply our method to an expression quantitative trait loci (eQTL) analysis on publicly available single nucleotide polymorphism (SNP) and gene expression data for humans where the primary interest lies in finding the significant associations between the sets of SNPs and possibly correlated genetic transcripts. Our method also allows for inference on the sparse interaction network of the transcripts (response variables) after accounting for the effect of the SNPs (predictor variables). We exploit properties of Gaussian graphical models to make statements concerning conditional independence of the responses. Our method compares favorably to existing Bayesian approaches developed for this purpose. © 2013, The International Biometric Society.
Gong, Maozhen
Selecting an appropriate prior distribution is a fundamental issue in Bayesian Statistics. In this dissertation, under the framework provided by Berger and Bernardo, I derive the reference priors for several models which include: Analysis of Variance (ANOVA)/Analysis of Covariance (ANCOVA) models with a categorical variable under common ordering constraints, the conditionally autoregressive (CAR) models and the simultaneous autoregressive (SAR) models with a spatial autoregression parameter rho considered. The performances of reference priors for ANOVA/ANCOVA models are evaluated by simulation studies with comparisons to Jeffreys' prior and Least Squares Estimation (LSE). The priors are then illustrated in a Bayesian model of the "Risk of Type 2 Diabetes in New Mexico" data, where the relationship between the type 2 diabetes risk (through Hemoglobin A1c) and different smoking levels is investigated. In both simulation studies and real data set modeling, the reference priors that incorporate internal order information show good performances and can be used as default priors. The reference priors for the CAR and SAR models are also illustrated in the "1999 SAT State Average Verbal Scores" data with a comparison to a Uniform prior distribution. Due to the complexity of the reference priors for both CAR and SAR models, only a portion (12 states in the Midwest) of the original data set is considered. The reference priors can give a different marginal posterior distribution compared to a Uniform prior, which provides an alternative for prior specifications for areal data in Spatial statistics.
Nisius, Britta; Vogt, Martin; Bajorath, Jürgen
2009-06-01
The contribution of individual fingerprint bit positions to similarity search performance is systematically evaluated. A method is introduced to determine bit significance on the basis of Kullback-Leibler divergence analysis of bit distributions in active and database compounds. Bit divergence analysis and Bayesian compound screening share a common methodological foundation. Hence, given the significance ranking of all individual bit positions comprising a fingerprint, subsets of bits are evaluated in the context of Bayesian screening, and minimal fingerprint representations are determined that meet or exceed the search performance of unmodified fingerprints. For fingerprints of different design evaluated on many compound activity classes, we consistently find that subsets of fingerprint bit positions are responsible for search performance. In part, these subsets are very small and contain in some cases only a few fingerprint bit positions. Structural or pharmacophore patterns captured by preferred bit positions can often be directly associated with characteristic features of active compounds. In some cases, reduced fingerprint representations clearly exceed the search performance of the original fingerprints. Thus, fingerprint reduction likely represents a promising approach for practical applications.
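The bit divergence idea in the abstract above can be illustrated with a small sketch. It assumes each fingerprint bit is treated as an independent Bernoulli variable and that bit significance is the Kullback-Leibler divergence between the bit's set frequency in active compounds and in database compounds. The abstract does not give the exact formulation, so the pseudocount smoothing and all function names here are assumptions for illustration.

```python
import math


def bit_frequencies(fingerprints, pseudocount=1.0):
    """Smoothed frequency with which each bit is set across a list of
    equal-length 0/1 fingerprints. The pseudocount (an assumption of this
    sketch) keeps frequencies away from exactly 0 or 1."""
    n = len(fingerprints)
    n_bits = len(fingerprints[0])
    return [
        (sum(fp[i] for fp in fingerprints) + pseudocount) / (n + 2.0 * pseudocount)
        for i in range(n_bits)
    ]


def bernoulli_kl(p, q):
    """KL divergence D(Bernoulli(p) || Bernoulli(q)) in nats."""
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def rank_bits(active_fps, database_fps):
    """Return (bit indices sorted by decreasing divergence, per-bit scores)."""
    p = bit_frequencies(active_fps)
    q = bit_frequencies(database_fps)
    scores = [bernoulli_kl(pi, qi) for pi, qi in zip(p, q)]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order, scores


if __name__ == "__main__":
    actives = [[1, 0, 1], [1, 0, 0], [1, 1, 1]]               # toy data
    database = [[0, 0, 1], [0, 1, 0], [1, 0, 1], [0, 1, 0]]   # toy data
    order, scores = rank_bits(actives, database)
    print(order, scores)
```

A reduced fingerprint in this sketch would simply keep the first few indices of `order`. How many bits to keep is a separate question that the abstract answers empirically by evaluating search performance.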
Maiti, Saumen; Tiwari, Ram Krishna
2010-10-01
A new probabilistic approach based on the concept of Bayesian neural network (BNN) learning theory is proposed for decoding litho-facies boundaries from well-log data. We show how a multi-layer-perceptron neural network model can be employed in a Bayesian framework to classify changes in litho-log successions. The method is then applied to the German Continental Deep Drilling Program (KTB) well-log data for classification and uncertainty estimation in the litho-facies boundaries. In this framework, the a posteriori distribution of the network parameters is estimated via the principle of Bayesian probabilistic theory, and an objective function is minimized following the scaled conjugate gradient optimization scheme. For the model development, we impose a suitable criterion, which provides probabilistic information by emulating different combinations of synthetic data. Uncertainty in the relationship between the data and the model space is appropriately taken care of by assuming a Gaussian a priori distribution of the network parameters (e.g., synaptic weights and biases). Prior to applying the new method to the real KTB data, we tested the proposed method on synthetic examples to examine the sensitivity of neural network hyperparameters in prediction. Within this framework, we examine the stability and efficiency of this new probabilistic approach using different kinds of synthetic data with different levels of correlated noise. Our data analysis suggests that the designed network topology based on the Bayesian paradigm is stable up to nearly 40% correlated noise; however, adding more noise (~50% or more) degrades the results. We perform uncertainty analyses on training, validation, and test data sets with and without intrinsic noise by making the Gaussian approximation of the a posteriori distribution about the peak model. We present a standard deviation error-map at the network output corresponding to the three types of the litho-facies present over the entire litho
Ursino, Mauro; Cuppini, Cristiano; Magosso, Elisa
2017-03-01
Recent theoretical and experimental studies suggest that in multisensory conditions, the brain performs a near-optimal Bayesian estimate of external events, giving more weight to the more reliable stimuli. However, the neural mechanisms responsible for this behavior, and its progressive maturation in a multisensory environment, are still insufficiently understood. The aim of this letter is to analyze this problem with a neural network model of audiovisual integration, based on probabilistic population coding: the idea that a population of neurons can encode probability functions to perform Bayesian inference. The model consists of two chains of unisensory neurons (auditory and visual) topologically organized. They receive the corresponding input through a plastic receptive field and reciprocally exchange plastic cross-modal synapses, which encode the spatial co-occurrence of visual-auditory inputs. A third chain of multisensory neurons performs a simple sum of auditory and visual excitations. The work includes a theoretical part and a computer simulation study. We show how a simple rule for synapse learning (consisting of Hebbian reinforcement and a decay term) can be used during training to shrink the receptive fields and encode the unisensory likelihood functions. Hence, after training, each unisensory area realizes a maximum likelihood estimate of stimulus position (auditory or visual). In cross-modal conditions, the same learning rule can encode information on prior probability into the cross-modal synapses. Computer simulations confirm the theoretical results and show that the proposed network can realize a maximum likelihood estimate of auditory (or visual) positions in unimodal conditions and a Bayesian estimate, with moderate deviations from optimality, in cross-modal conditions. Furthermore, the model explains the ventriloquism illusion and, looking at the activity in the multimodal neurons, explains the automatic reweighting of auditory and visual inputs
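The phrase "giving more weight to the more reliable stimuli" in the abstract above has a simple closed form in the textbook case. The sketch below assumes independent Gaussian likelihoods for the auditory and visual position cues and a flat prior, so each cue is weighted by its inverse variance. It illustrates the ideal-observer benchmark only, not the neural network model of the letter, and the function name is hypothetical.

```python
def combine_cues(x_auditory, sigma_auditory, x_visual, sigma_visual):
    """Reliability-weighted combination of two position cues.

    Assumes independent Gaussian noise on each cue and a flat prior
    (assumptions of this sketch). Returns the combined estimate and its
    standard deviation.
    """
    w_a = 1.0 / sigma_auditory ** 2  # reliability (precision) of audition
    w_v = 1.0 / sigma_visual ** 2    # reliability (precision) of vision
    estimate = (w_a * x_auditory + w_v * x_visual) / (w_a + w_v)
    sigma = (1.0 / (w_a + w_v)) ** 0.5
    return estimate, sigma


if __name__ == "__main__":
    # Sound at 10 degrees (noisy), flash at 0 degrees (precise): toy numbers.
    estimate, sigma = combine_cues(10.0, 2.0, 0.0, 1.0)
    print(estimate, sigma)
```

With these toy numbers the combined estimate lies much closer to the visual position than to the auditory one, which is the pattern usually associated with the ventriloquism illusion, and the combined standard deviation is smaller than that of either cue alone.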
Hedlund, Jonas
2014-01-01
This paper introduces private sender information into a sender-receiver game of Bayesian persuasion with monotonic sender preferences. I derive properties of increasing differences related to the precision of signals and use these to fully characterize the set of equilibria robust to the intuitive criterion. In particular, all such equilibria are either separating, i.e., the sender's choice of signal reveals his private information to the receiver, or fully disclosing, i.e., the outcome of th...
Kirstein, Roland
2005-01-01
This paper presents a modification of the inspection game: The "Bayesian Monitoring" model rests on the assumption that judges are interested in enforcing compliant behavior and making correct decisions. They may base their judgements on an informative but imperfect signal which can be generated costlessly. In the original inspection game, monitoring is costly and generates a perfectly informative signal. While the inspection game has only one mixed strategy equilibrium, three Perfect Bayesia...
Ouyang, Bichun; Sinha, Debajyoti; Slate, Elizabeth H; Van Bakel, Adrian B
2013-07-10
For a heart transplant patient, the risk of graft rejection and risk of death are likely to be associated. Two fully specified Bayesian models for recurrent events with dependent termination are applied to investigate the potential relationships between these two types of risk as well as association with risk factors. We particularly focus on the choice of priors, selection of the appropriate prediction model, and prediction methods for these two types of risk for an individual patient. Our prediction tools can be easily implemented and helpful to physicians for setting heart transplant patients' biopsy schedule.
Khanin, Alexander
2014-01-01
Cosmic rays (CRs) are protons and atomic nuclei that flow into our Solar system and reach the Earth with energies of up to ~10^21 eV. The sources of ultra-high energy cosmic rays (UHECRs) with E >~ 10^19 eV remain unknown, although there are theoretical reasons to think that at least some come from active galactic nuclei (AGNs). One way to assess the different hypotheses is by analysing the arrival directions of UHECRs, in particular their self-clustering. We have developed a fully Bayesian approach to analyzing the self-clustering of points on the sphere, which we apply to the UHECR arrival directions. The analysis is based on a multi-step approach that enables the application of Bayesian model comparison to cases with weak prior information. We have applied this approach to the 69 highest energy events recorded by the Pierre Auger Observatory (PAO), which is the largest current UHECR data set. We do not detect self-clustering, but simulations show that this is consistent with the AGN-sourced model for a dat...
Lobach, Iryna; Fan, Ruzong
A key component to understanding etiology of complex diseases, such as cancer, diabetes, alcohol dependence, is to investigate gene-environment interactions. This work is motivated by the following two concerns in the analysis of gene-environment interactions. First, multiple genetic markers in moderate linkage disequilibrium may be involved in susceptibility to a complex disease. Second, environmental factors may be subject to misclassification. We develop a genotype based Bayesian pseudolikelihood approach that accommodates linkage disequilibrium in genetic markers and misclassification in environmental factors. Since our approach is genotype based, it allows the observed genetic information to enter the model directly thus eliminating the need to infer haplotype phase and simplifying computations. Bayesian approach allows shrinking parameter estimates towards prior distribution to improve estimation and inference when environmental factors are subject to misclassification. Simulation experiments demonstrated that our method produced parameter estimates that are nearly unbiased even for small sample sizes. An application of our method is illustrated using a case-control study of interaction between early onset of drinking and genes involved in dopamine pathway.
Directory of Open Access Journals (Sweden)
Iryna Lobach
2012-01-01
A key component to understanding etiology of complex diseases, such as cancer, diabetes, alcohol dependence, is to investigate gene-environment interactions. This work is motivated by the following two concerns in the analysis of gene-environment interactions. First, multiple genetic markers in moderate linkage disequilibrium may be involved in susceptibility to a complex disease. Second, environmental factors may be subject to misclassification. We develop a genotype based Bayesian pseudolikelihood approach that accommodates linkage disequilibrium in genetic markers and misclassification in environmental factors. Since our approach is genotype based, it allows the observed genetic information to enter the model directly thus eliminating the need to infer haplotype phase and simplifying computations. Bayesian approach allows shrinking parameter estimates towards prior distribution to improve estimation and inference when environmental factors are subject to misclassification. Simulation experiments demonstrated that our method produced parameter estimates that are nearly unbiased even for small sample sizes. An application of our method is illustrated using a case-control study of interaction between early onset of drinking and genes involved in dopamine pathway.
Directory of Open Access Journals (Sweden)
Giulia Carreras
2012-09-01
Background: parameter uncertainty in the Markov model’s description of a disease course was addressed. Probabilistic sensitivity analysis (PSA) is now considered the only tool that properly permits the examination of parameter uncertainty. It consists of sampling values from the parameters’ probability distributions.
Methods: Markov models fitted with microsimulation were considered and methods for carrying out a PSA on transition probabilities were studied. Two Bayesian solutions were developed: for each row of the modeled transition matrix, the prior distribution was assumed to be either a product of Beta distributions or a Dirichlet distribution. The two solutions differ in the source of information: several different sources for each transition in the Beta approach and a single source for each transition from a given health state in the Dirichlet approach. The two methods were applied to a simple cervical cancer model.
Results: differences between posterior estimates from the two methods were negligible. Results showed that the prior variability strongly influences the posterior distribution.
Conclusions: the novelty of this work is the Bayesian approach that integrates the two distributions with a product of Binomial distributions likelihood. Such methods could be also applied to cohort data and their application to more complex models could be useful and unique in the cervical cancer context, as well as in other disease modeling.
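The Dirichlet variant of the sampling step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes that the transitions out of one health state are summarized as multinomial counts from a single source, so that a Dirichlet prior updates conjugately. The function name `sample_transition_rows`, the flat prior, and the counts are hypothetical.

```python
import numpy as np


def sample_transition_rows(prior_alpha, counts, n_draws, seed=0):
    """Draw rows of a transition matrix from a Dirichlet posterior.

    Assumes multinomial transition counts out of one health state, so the
    posterior is Dirichlet(prior_alpha + counts).
    """
    rng = np.random.default_rng(seed)
    posterior_alpha = np.asarray(prior_alpha, dtype=float) + np.asarray(counts, dtype=float)
    return rng.dirichlet(posterior_alpha, size=n_draws)


# Hypothetical example: three destination states, flat Dirichlet(1, 1, 1) prior,
# and made-up observed transition counts out of one state.
prior_alpha = [1.0, 1.0, 1.0]
counts = [80, 15, 5]

draws = sample_transition_rows(prior_alpha, counts, n_draws=5000)
posterior_mean = draws.mean(axis=0)
print("posterior mean of the row:", posterior_mean)
```

Each sampled row sums to one, so every draw can be passed directly to a microsimulation run; repeating the run over all draws gives the PSA distribution of the model output.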
Wagner-Kaiser, R; Robinson, E; von Hippel, T; Sarajedini, A; van Dyk, D A; Stein, N; Jefferys, W H
2016-01-01
We use Cycle 21 Hubble Space Telescope (HST) observations and HST archival ACS Treasury observations of Galactic Globular Clusters to find and characterize two stellar populations in NGC 5024 (M53), NGC 5272 (M3), and NGC 6352. For these three clusters, both single and double-population analyses are used to determine a best fit isochrone(s). We employ a sophisticated Bayesian analysis technique to simultaneously fit the cluster parameters (age, distance, absorption, and metallicity) that characterize each cluster. For the two-population analysis, unique population level helium values are also fit to each distinct population of the cluster and the relative proportions of the populations are determined. We find differences in helium ranging from $\sim$0.05 to 0.11 for these three clusters. Model grids with solar $\alpha$-element abundances ([$\alpha$/Fe] = 0.0) and enhanced $\alpha$-elements ([$\alpha$/Fe] = 0.4) are adopted.
OBJECTIVE BAYESIAN ANALYSIS OF "ON/OFF" MEASUREMENTS
Energy Technology Data Exchange (ETDEWEB)
Casadei, Diego, E-mail: diego.casadei@fhnw.ch [Visiting Scientist, Department of Physics and Astronomy, UCL, Gower Street, London WC1E 6BT (United Kingdom)
2015-01-01
In high-energy astrophysics, it is common practice to account for the background overlaid with counts from the source of interest with the help of auxiliary measurements carried out by pointing off-source. In this "on/off" measurement, one knows the number of photons detected while pointing toward the source, the number of photons collected while pointing away from the source, and how to estimate the background counts in the source region from the flux observed in the auxiliary measurements. For very faint sources, the number of photons detected is so low that the approximations that hold asymptotically are not valid. On the other hand, an analytical solution exists for the Bayesian statistical inference, which is valid at low and high counts. Here we illustrate the objective Bayesian solution based on the reference posterior and compare the result with the approach very recently proposed by Knoetig, and discuss its most delicate points. In addition, we propose to compute the significance of the excess with respect to the background-only expectation with a method that is able to account for any uncertainty on the background and is valid for any photon count. This method is compared to the widely used significance formula by Li and Ma, which is based on asymptotic properties.
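For orientation, the asymptotic Li and Ma significance that the abstract uses as its baseline can be written down compactly. The Python sketch below follows the formula as it is commonly quoted from Li and Ma (1983, their Eq. 17); it is not the objective Bayesian method proposed in the paper. The symbol `alpha` (ratio of on-source to off-source exposure) and the example counts are assumptions introduced here for illustration.

```python
import math


def li_ma_significance(n_on, n_off, alpha):
    """Asymptotic significance of an on/off excess (Li & Ma 1983, Eq. 17).

    n_on  : counts while pointing at the source
    n_off : counts while pointing off-source
    alpha : on-source exposure divided by off-source exposure (assumed symbol)
    """
    if n_on <= 0 or n_off <= 0 or alpha <= 0:
        raise ValueError("the formula needs positive counts and positive alpha")
    total = n_on + n_off
    term_on = n_on * math.log((1.0 + alpha) / alpha * n_on / total)
    term_off = n_off * math.log((1.0 + alpha) * n_off / total)
    # The sum is a likelihood ratio and is non-negative up to rounding error.
    s = math.sqrt(max(0.0, 2.0 * (term_on + term_off)))
    # Report a deficit (n_on below the scaled background) with a negative sign.
    return math.copysign(s, n_on - alpha * n_off)


# Made-up counts: the off region was observed twice as long as the on region.
print(li_ma_significance(n_on=150, n_off=200, alpha=0.5))
```

Because the formula rests on asymptotic arguments, it should not be trusted for the very faint sources the abstract discusses, which is the regime where the Bayesian treatment is meant to help.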
A Bayesian Analysis for Identifying DNA Copy Number Variations Using a Compound Poisson Process
Directory of Open Access Journals (Sweden)
Yiğiter Ayten
2010-01-01
To study chromosomal aberrations that may lead to cancer formation or genetic diseases, the array-based Comparative Genomic Hybridization (aCGH) technique is often used for detecting DNA copy number variants (CNVs). Various methods have been developed for gaining CNV information based on aCGH data. However, most of these methods make use of the log-intensity ratios in aCGH data without taking advantage of other information such as the DNA probe (e.g., biomarker) positions/distances contained in the data. Motivated by the specific features of aCGH data, we developed a novel method that takes into account the estimation of a change point or locus of the CNV in aCGH data with its associated biomarker position on the chromosome using a compound Poisson process. We used a Bayesian approach to derive the posterior probability for the estimation of the CNV locus. To detect loci of multiple CNVs in the data, a sliding window process combined with our derived Bayesian posterior probability was proposed. To evaluate the performance of the method in the estimation of the CNV locus, we first performed simulation studies. Finally, we applied our approach to real data from aCGH experiments, demonstrating its applicability.
Directory of Open Access Journals (Sweden)
Moslem Moradi
2015-06-01
Herein, an application of a new seismic inversion algorithm in one of Iran’s oilfields is described. Stochastic (geostatistical) seismic inversion, as a complementary method to deterministic inversion, is perceived as a combination of geostatistics and seismic inversion algorithms. This method integrates information from different data sources with different scales, as prior information in Bayesian statistics. Data integration leads to a probability density function (named the a posteriori probability) that can yield a model of the subsurface. The Markov Chain Monte Carlo (MCMC) method is used to sample the posterior probability distribution, and the subsurface model characteristics can be extracted by analyzing a set of the samples. In this study, the theory of stochastic seismic inversion in a Bayesian framework was described and applied to infer P-impedance and porosity models. The comparison between the stochastic seismic inversion and the deterministic model-based seismic inversion indicates that the stochastic seismic inversion can provide more detailed information on subsurface character. Since multiple realizations are extracted by this method, an estimation of pore volume and the uncertainty in the estimation were analyzed.
A Bayesian analysis of HAT-P-7b using the EXONEST algorithm
Energy Technology Data Exchange (ETDEWEB)
Placek, Ben [Department of Physics, University at Albany (SUNY), Albany NY (United States); Knuth, Kevin H. [Department of Physics, University at Albany (SUNY), Albany NY, USA and Department of Informatics, University at Albany (SUNY), Albany NY (United States)
2015-01-13
The study of exoplanets (planets orbiting other stars) is revolutionizing the way we view our universe. High-precision photometric data provided by the Kepler Space Telescope (Kepler) enables not only the detection of such planets, but also their characterization. This presents a unique opportunity to apply Bayesian methods to better characterize the multitude of previously confirmed exoplanets. This paper focuses on applying the EXONEST algorithm to characterize the transiting short-period-hot-Jupiter, HAT-P-7b (also referred to as Kepler-2b). EXONEST evaluates a suite of exoplanet photometric models by applying Bayesian Model Selection, which is implemented with the MultiNest algorithm. These models take into account planetary effects, such as reflected light and thermal emissions, as well as the effect of the planetary motion on the host star, such as Doppler beaming, or boosting, of light from the reflex motion of the host star, and photometric variations due to the planet-induced ellipsoidal shape of the host star. By calculating model evidences, one can determine which model best describes the observed data, thus identifying which effects dominate the planetary system. Presented are parameter estimates and model evidences for HAT-P-7b.
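The model-comparison step described above, in which evidences decide which photometric effects matter, reduces to simple arithmetic once the log-evidences are available. The Python sketch below shows that arithmetic only; it does not call MultiNest or EXONEST, and the model labels and log-evidence values are invented for illustration rather than taken from the paper.

```python
import numpy as np


def posterior_model_probabilities(log_evidences, log_priors=None):
    """Convert log-evidences into posterior model probabilities.

    With equal prior model probabilities (the default assumed here), the
    posterior probability of each model is proportional to its evidence.
    """
    log_z = np.asarray(log_evidences, dtype=float)
    if log_priors is None:
        log_priors = np.zeros_like(log_z)
    log_post = log_z + np.asarray(log_priors, dtype=float)
    log_post = log_post - log_post.max()  # subtract the maximum for numerical stability
    weights = np.exp(log_post)
    return weights / weights.sum()


# Hypothetical log-evidences for three candidate models (illustrative values only).
models = ["reflected light", "reflected + thermal", "reflected + thermal + beaming"]
log_evidences = [-1060.2, -1052.7, -1051.9]

probs = posterior_model_probabilities(log_evidences)
for name, p in zip(models, probs):
    print(f"{name}: {p:.3f}")
```

Working with differences of log-evidences avoids underflow, since the evidences themselves are usually far too small to represent directly.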
Sinha, Samiran
2009-08-10
We propose a semiparametric Bayesian method for handling measurement error in nutritional epidemiological data. Our goal is to estimate nonparametrically the form of association between a disease and exposure variable while the true values of the exposure are never observed. Motivated by nutritional epidemiological data, we consider the setting where a surrogate covariate is recorded in the primary data, and a calibration data set contains information on the surrogate variable and repeated measurements of an unbiased instrumental variable of the true exposure. We develop a flexible Bayesian method where not only is the relationship between the disease and exposure variable treated semiparametrically, but also the relationship between the surrogate and the true exposure is modeled semiparametrically. The two nonparametric functions are modeled simultaneously via B-splines. In addition, we model the distribution of the exposure variable as a Dirichlet process mixture of normal distributions, thus making its modeling essentially nonparametric and placing this work into the context of functional measurement error modeling. We apply our method to the NIH-AARP Diet and Health Study and examine its performance in a simulation study.
Lander, Tonya A; Klein, Etienne K; Oddou-Muratorio, Sylvie; Candau, Jean-Noël; Gidoin, Cindy; Chalon, Alain; Roig, Anne; Fallour, Delphine; Auger-Rozenberg, Marie-Anne; Boivin, Thomas
2014-12-01
Understanding how invasive species establish and spread is vital for developing effective management strategies for invaded areas and identifying new areas where the risk of invasion is highest. We investigated the explanatory power of dispersal histories reconstructed based on local-scale wind data and a regional-scale wind-dispersed particle trajectory model for the invasive seed chalcid wasp Megastigmus schimitscheki (Hymenoptera: Torymidae) in France. The explanatory power was tested by: (1) survival analysis of empirical data on M. schimitscheki presence, absence and year of arrival at 52 stands of the wasp's obligate hosts, Cedrus (true cedar trees); and (2) Approximate Bayesian analysis of M. schimitscheki genetic data using a coalescence model. The Bayesian demographic modeling and traditional population genetic analysis suggested that initial invasion across the range was the result of long-distance dispersal from the longest established sites. The survival analyses of the windborne expansion patterns derived from a particle dispersal model indicated that there was an informative correlation between the M. schimitscheki presence/absence data from the annual surveys and the scenarios based on regional-scale wind data. These three very different analyses produced highly congruent results supporting our proposal that wind is the most probable vector for passive long-distance dispersal of this invasive seed wasp. This result confirms that long-distance dispersal from introduction areas is a likely driver of secondary expansion of alien invasive species. Based on our results, management programs for this and other windborne invasive species may consider (1) focusing effort at the longest established sites and (2) monitoring outlying populations remains critically important due to their influence on rates of spread. We also suggest that there is a distinct need for new analysis methods that have the capacity to combine empirical spatiotemporal field data
Bayesian methods for measures of agreement
Broemeling, Lyle D
2009-01-01
Using WinBUGS to implement Bayesian inferences of estimation and testing hypotheses, Bayesian Methods for Measures of Agreement presents useful methods for the design and analysis of agreement studies. It focuses on agreement among the various players in the diagnostic process. The author employs a Bayesian approach to provide statistical inferences based on various models of intra- and interrater agreement. He presents many examples that illustrate the Bayesian mode of reasoning and explains elements of a Bayesian application, including prior information, experimental information, the likelihood function, posterior distribution, and predictive distribution. The appendices provide the necessary theoretical foundation to understand Bayesian methods as well as introduce the fundamentals of programming and executing the WinBUGS software. Taking a Bayesian approach to inference, this hands-on book explores numerous measures of agreement, including the Kappa coefficient, the G coefficient, and intraclass correlation...
Chen, Cong; Zhang, Guohui; Liu, Xiaoyue Cathy; Ci, Yusheng; Huang, Helai; Ma, Jianming; Chen, Yanyan; Guan, Hongzhi
2016-12-01
There is a high potential of severe injury outcomes in traffic crashes on rural interstate highways due to the significant amount of high speed traffic on these corridors. Hierarchical Bayesian models are capable of incorporating between-crash variance and within-crash correlations into traffic crash data analysis and are increasingly utilized in traffic crash severity analysis. This paper applies a hierarchical Bayesian logistic model to examine the significant factors at crash and vehicle/driver levels and their heterogeneous impacts on driver injury severity in rural interstate highway crashes. Analysis results indicate that the majority of the total variance is induced by the between-crash variance, showing the appropriateness of the utilized hierarchical modeling approach. Three crash-level variables and six vehicle/driver-level variables are found significant in predicting driver injury severities: road curve, maximum vehicle damage in a crash, number of vehicles in a crash, wet road surface, vehicle type, driver age, driver gender, driver seatbelt use and driver alcohol or drug involvement. Among these variables, road curve, functional and disabled vehicle damage in crash, single-vehicle crashes, female drivers, senior drivers, motorcycles and driver alcohol or drug involvement tend to increase the odds of drivers being incapably injured or killed in rural interstate crashes, while wet road surface, male drivers and driver seatbelt use are more likely to decrease the probability of severe driver injuries. The developed methodology and estimation results provide insightful understanding of the internal mechanism of rural interstate crashes and beneficial references for developing effective countermeasures for rural interstate crash prevention.
DEFF Research Database (Denmark)
Dashab, Golam Reza; Kadri, Naveen Kumar; Mahdi Shariati, Mohammad;
2012-01-01
) Mixed model analysis (MMA), 2) Random haplotype model (RHM), 3) Genealogy-based mixed model (GENMIX), and 4) Bayesian variable selection (BVS). The data consisted of phenotypes of 2000 animals from 20 sire families and were genotyped with 9990 SNPs on five chromosomes. Results: Out of the eight...
Chung, Hwan; Anthony, James C.
2013-01-01
This article presents a multiple-group latent class-profile analysis (LCPA) by taking a Bayesian approach in which a Markov chain Monte Carlo simulation is employed to achieve more robust estimates for latent growth patterns. This article describes and addresses a label-switching problem that involves the LCPA likelihood function, which has…
Bayesian Networks and Influence Diagrams
DEFF Research Database (Denmark)
Kjærulff, Uffe Bro; Madsen, Anders Læsø
Probabilistic networks, also known as Bayesian networks and influence diagrams, have become one of the most promising technologies in the area of applied artificial intelligence, offering intuitive, efficient, and reliable methods for diagnosis, prediction, decision making, classification......, troubleshooting, and data mining under uncertainty. Bayesian Networks and Influence Diagrams: A Guide to Construction and Analysis provides a comprehensive guide for practitioners who wish to understand, construct, and analyze intelligent systems for decision support based on probabilistic networks. Intended...
Bessiere, Pierre; Ahuactzin, Juan Manuel; Mekhnacha, Kamel
2013-01-01
Probability as an Alternative to Boolean Logic. While logic is the mathematical foundation of rational reasoning and the fundamental principle of computing, it is restricted to problems where information is both complete and certain. However, many real-world problems, from financial investments to email filtering, are incomplete or uncertain in nature. Probability theory and Bayesian computing together provide an alternative framework to deal with incomplete and uncertain data. Decision-Making Tools and Methods for Incomplete and Uncertain Data. Emphasizing probability as an alternative to Boolean
A Bayesian Analysis of a Random Effects Small Business Loan Credit Scoring Model
Directory of Open Access Journals (Sweden)
Patrick J. Farrell
2011-09-01
One of the most important aspects of credit scoring is constructing a model that has low misclassification rates and is also flexible enough to allow for random variation. It is also well known that, when there are a large number of highly correlated variables as is typical in studies involving questionnaire data, a method must be found to reduce the number of variables to those that have high predictive power. Here we propose a Bayesian multivariate logistic regression model with both fixed and random effects for small business loan credit scoring and a variable reduction method using Bayes factors. The method is illustrated on an interesting data set based on questionnaires sent to loan officers in Canadian banks and venture capital companies.
Default Bayesian analysis for multi-way tables: a data-augmentation approach
Polson, Nicholas G
2011-01-01
This paper proposes a strategy for regularized estimation in multi-way contingency tables, which are common in meta-analyses and multi-center clinical trials. Our approach is based on data augmentation, and appeals heavily to a novel class of Polya-Gamma distributions. Our main contributions are to build up the relevant distributional theory and to demonstrate three useful features of this data-augmentation scheme. First, it leads to simple EM and Gibbs-sampling algorithms for posterior inference, circumventing the need for analytic approximations, numerical integration, Metropolis--Hastings, or variational methods. Second, it allows modelers much more flexibility when choosing priors, which have traditionally come from the Dirichlet or logistic-normal family. For example, our approach allows users to incorporate Bayesian analogues of classical penalized-likelihood techniques (e.g. the lasso or bridge) in computing regularized estimates for log-odds ratios. Finally, our data-augmentation scheme naturally sugg...
Zhang, Yong; Isukapalli, Sastry S.; Bielory, Leonard; Georgopoulos, Panos G.
2013-04-01
A Bayesian framework is presented for modeling effects of climate change on pollen indices such as annual birch pollen count, maximum daily birch pollen count, start date of birch pollen season and the date of maximum daily birch pollen count. Annual mean CO2 concentration, mean spring temperature and the corresponding pollen index of prior year were found to be statistically significant accounting for effects of climate change on four pollen indices. Results suggest that annual productions and peak values from 2020 to 2100 under different scenarios will be 1.3-8.0 and 1.1-7.3 times higher respectively than the mean values for 2000, and start and peak dates will occur around two to four weeks earlier. These results have been partly confirmed by the available historical data. As a demonstration, the emission profiles in future years were generated by incorporating the predicted pollen indices into an existing emission model.
Bayesian timing analysis of giant flare of SGR 1806-20 by RXTE PCA
Hambaryan, V; Kokkotas, K D
2010-01-01
By detecting high frequency quasi-periodic oscillations (QPOs) and estimating frequencies of them during the decaying tail of giant flares from Soft Gamma-ray Repeaters (SGRs) useful constraints for the equation of state (EoS) of superdense matter may be obtained via comparison with theoretical predictions of eigenfrequencies. We used the data collected by the Rossi X-Ray Timing Explorer (RXTE/XTE) Proportional Counter Array (PCA) of a giant flare of SGR 1806-20 on 2004 Dec 27 and applied a Bayesian periodicity detection method (Gregory & Loredo, 1992) for the search of oscillations of transient nature. In addition to the already detected frequencies, we found a few new frequencies (f_{QPOs} ~ 16.9, 21.4, 36.4, 59.0, 116.3 Hz) of oscillations predicted by Colaiuda et al. (2009) based on the APR_{14} EoS (Akmal et al., 1998) for SGR 1806-20.
Directory of Open Access Journals (Sweden)
V. S.S. Yadavalli
2002-09-01
Bayesian estimation is presented for the stationary rate of disappointments, D∞, for two models (with different specifications) of intermittently used systems. The random variables in the system are considered to be independently exponentially distributed. Jeffreys’ prior is assumed for the unknown parameters in the system. Inference about D∞ is complicated in both models by the complex and non-linear definition of D∞. Monte Carlo simulation is used to derive the posterior distribution of D∞ and subsequently the highest posterior density (HPD) intervals. A numerical example where Bayes estimates and the HPD intervals are determined illustrates these results. This illustration is extended to determine the frequentist properties of this Bayes procedure, by calculating coverage proportions for each of these HPD intervals, assuming fixed values for the parameters.
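The Monte Carlo route to an HPD interval mentioned above can be illustrated with a small Python sketch. It assumes exponentially distributed data with Jeffreys' prior on each rate (prior proportional to 1/rate), for which the posterior of the rate is Gamma with shape n and rate equal to the sum of the observations. The abstract does not give the definition of D∞, so the function `derived_quantity` below is a hypothetical nonlinear stand-in, and all data summaries are made up.

```python
import numpy as np


def hpd_interval(samples, mass=0.95):
    """Shortest interval containing `mass` of the samples (unimodal case)."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    k = int(np.ceil(mass * n))  # number of samples the interval must contain
    if k < 2 or k > n:
        raise ValueError("need more samples for the requested mass")
    widths = x[k - 1:] - x[: n - k + 1]  # width of every window of k samples
    i = int(np.argmin(widths))
    return x[i], x[i + k - 1]


def derived_quantity(rate_a, rate_b):
    """Hypothetical nonlinear function of two rates; NOT the paper's D-infinity."""
    return rate_a * rate_b / (rate_a + rate_b)


rng = np.random.default_rng(1)
n_a, sum_a = 25, 50.0  # made-up sample size and total observed time, process A
n_b, sum_b = 30, 20.0  # made-up sample size and total observed time, process B

# Posterior draws of each exponential rate under Jeffreys' prior: Gamma(n, rate=sum).
rate_a = rng.gamma(shape=n_a, scale=1.0 / sum_a, size=20000)
rate_b = rng.gamma(shape=n_b, scale=1.0 / sum_b, size=20000)

d = derived_quantity(rate_a, rate_b)
lo, hi = hpd_interval(d, mass=0.95)
print(f"posterior mean {d.mean():.4f}, 95% HPD interval ({lo:.4f}, {hi:.4f})")
```

Pushing posterior draws of the parameters through the nonlinear function sidesteps any analytic derivation of its posterior, which is the reason simulation is attractive when the quantity of interest has a complicated definition.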
Improving PWR core simulations by Monte Carlo uncertainty analysis and Bayesian inference
Castro, Emilio; Buss, Oliver; Garcia-Herranz, Nuria; Hoefer, Axel; Porsch, Dieter
2016-01-01
A Monte Carlo-based Bayesian inference model is applied to the prediction of reactor operation parameters of a PWR nuclear power plant. In this non-perturbative framework, high-dimensional covariance information describing the uncertainty of microscopic nuclear data is combined with measured reactor operation data in order to provide statistically sound, well founded uncertainty estimates of integral parameters, such as the boron letdown curve and the burnup-dependent reactor power distribution. The performance of this methodology is assessed in a blind test approach, where we use measurements of a given reactor cycle to improve the prediction of the subsequent cycle. As it turns out, the resulting improvement of the prediction quality is impressive. In particular, the prediction uncertainty of the boron letdown curve, which is of utmost importance for the planning of the reactor cycle length, can be reduced by one order of magnitude by including the boron concentration measurement information of the previous...
Hierarchical Bayesian modeling and Markov chain Monte Carlo sampling for tuning-curve analysis.
Cronin, Beau; Stevenson, Ian H; Sur, Mriganka; Körding, Konrad P
2010-01-01
A central theme of systems neuroscience is to characterize the tuning of neural responses to sensory stimuli or the production of movement. Statistically, we often want to estimate the parameters of the tuning curve, such as preferred direction, as well as the associated degree of uncertainty, characterized by error bars. Here we present a new sampling-based, Bayesian method that allows the estimation of tuning-curve parameters, the estimation of error bars, and hypothesis testing. This method also provides a useful way of visualizing which tuning curves are compatible with the recorded data. We demonstrate the utility of this approach using recordings of orientation and direction tuning in primary visual cortex, direction of motion tuning in primary motor cortex, and simulated data.
Bayesian Analysis for Stellar Evolution with Nine Parameters (BASE-9): User's Manual
von Hippel, Ted; Jeffery, Elizabeth; Wagner-Kaiser, Rachel; DeGennaro, Steven; Stein, Nathan; Stenning, David; Jefferys, William H; van Dyk, David
2014-01-01
BASE-9 is a Bayesian software suite that recovers star cluster and stellar parameters from photometry. BASE-9 is useful for analyzing single-age, single-metallicity star clusters, binaries, or single stars, and for simulating such systems. BASE-9 uses Markov chain Monte Carlo and brute-force numerical integration techniques to estimate the posterior probability distributions for the age, metallicity, helium abundance, distance modulus, and line-of-sight absorption for a cluster, and the mass, binary mass ratio, and cluster membership probability for every stellar object. BASE-9 is provided as open source code on a version-controlled web server. The executables are also available as Amazon Elastic Compute Cloud images. This manual provides potential users with an overview of BASE-9, including instructions for installation and use.
Bayesian networks precipitation model based on hidden Markov analysis and its application
Institute of Scientific and Technical Information of China (English)
(none)
2010-01-01
Surface precipitation estimation is very important in hydrologic forecast. To account for the influence of the neighbors on the precipitation of an arbitrary grid in the network, Bayesian networks and Markov random field were adopted to estimate surface precipitation. Spherical coordinates and the expectation-maximization (EM) algorithm were used for region interpolation, and for estimation of the precipitation of arbitrary point in the region. Surface precipitation estimation of seven precipitation stations in Qinghai Lake region was performed. By comparing with other surface precipitation methods such as Thiessen polygon method, distance weighted mean method and arithmetic mean method, it is shown that the proposed method can judge the relationship of precipitation among different points in the area under complicated circumstances and the simulation results are more accurate and rational.
Directory of Open Access Journals (Sweden)
D.O. Olayungbo
2016-12-01
This paper examines the dynamic interactions between insurance and economic growth in eight African countries for the period of 1970–2013. Insurance demand is measured by insurance penetration, which accounts for income differences across the sample countries. A Bayesian Time Varying Parameter Vector Autoregression (TVP-VAR) model with stochastic volatility is used to analyze the short-run and long-run relationships among the variables of interest. Using insurance penetration as the measure of insurance, we find a positive relationship between insurance and economic growth for Egypt, while short-run negative and long-run positive effects are found for Kenya, Mauritius, and South Africa. On the contrary, negative effects are found for Algeria, Nigeria, Tunisia, and Zimbabwe. Implementation of sound financial reforms and wide insurance coverage are proposed recommendations for insurance development in the selected African countries.
Energy Technology Data Exchange (ETDEWEB)
Du Xiaodong, E-mail: xdu23@wisc.ed [Department of Agricultural and Applied Economics, University of Wisconsin-Madison, WI (United States); Yu, Cindy L., E-mail: cindyyu@iastate.ed [Department of Statistics, Iowa State University, IA (United States); Hayes, Dermot J., E-mail: dhayes@iastate.ed [Department of Economics and Department of Finance, Iowa State University, IA (United States)
2011-05-15
This paper assesses factors that potentially influence the volatility of crude oil prices and the possible linkage between this volatility and agricultural commodity markets. Stochastic volatility models are applied to weekly crude oil, corn, and wheat futures prices from November 1998 to January 2009. Model parameters are estimated using Bayesian Markov Chain Monte Carlo methods. Speculation, scalping, and petroleum inventories are found to be important in explaining the volatility of crude oil prices. Several properties of crude oil price dynamics are established, including mean-reversion, an asymmetry between returns and volatility, volatility clustering, and infrequent compound jumps. We find evidence of volatility spillover among crude oil, corn, and wheat markets after the fall of 2006. This can be largely explained by tightened interdependence between crude oil and these commodity markets induced by ethanol production.
Analysis and assessment of injury risk in female gymnastics: Bayesian Network approach
Directory of Open Access Journals (Sweden)
Lyudmila Dimitrova
2015-02-01
This paper presents a Bayesian network (BN) model for estimating injury risk in female artistic gymnastics. The model illustrates the connections between underlying injury risk factors through a series of causal dependencies. The quantitative part of the model, the conditional probability tables, is determined using the TNormal distribution with parameters derived by experts. The injury rates calculated by the network are in agreement with injury statistics and correctly report the impact of various risk factors on injury rates. The model is designed to assist coaches and supporting teams in planning the training activity so that injuries are minimized. This study provides important background for further data collection and research necessary to improve the precision of the quantitative predictions of the model.
Fuster-Parra, P; García-Mas, A; Ponseti, F J; Leo, F M
2015-04-01
The purpose of this paper was to discover the relationships among 22 relevant psychological features in semi-professional football players in order to study team's performance and collective efficacy via a Bayesian network (BN). The paper includes optimization of team's performance and collective efficacy using intercausal reasoning pattern which constitutes a very common pattern in human reasoning. The BN is used to make inferences regarding our problem, and therefore we obtain some conclusions; among them: maximizing the team's performance causes a decrease in collective efficacy and when team's performance achieves the minimum value it causes an increase in moderate/high values of collective efficacy. Similarly, we may reason optimizing team collective efficacy instead. It also allows us to determine the features that have the strongest influence on performance and which on collective efficacy. From the BN two different coaching styles were differentiated taking into account the local Markov property: training leadership and autocratic leadership.
Bayesian method for the analysis of the dust emission in the Far-Infrared and Submillimeter
Veneziani, M; Noriega-Crespo, A; Carey, S; Paladini, R; Paradis, D
2013-01-01
We present a method, based on Bayesian statistics, to fit the dust emission parameters in the far-infrared and submillimeter wavelengths. The method estimates the dust temperature and spectral emissivity index, plus their relationship, taking into account properly the statistical and systematic uncertainties. We test it on three sets of simulated sources detectable by the Herschel Space Observatory in the PACS and SPIRE spectral bands (70-500 micron), spanning over a wide range of dust temperatures. The simulated observations are a one-component Interstellar Medium, and two two-component sources, both warm (HII regions) and cold (cold clumps). We first define a procedure to identify the better model, then we recover the parameters of the model and measure their physical correlations by means of a Monte Carlo Markov Chain algorithm adopting multi-variate Gaussian priors. In this process we assess the reliability of the model recovery, and of parameters estimation. We conclude that the model and parameters are ...
Bayesian Analysis of Inflation III: Slow Roll Reconstruction Using Model Selection
Noreña, Jorge; Verde, Licia; Peiris, Hiranya V; Easther, Richard
2012-01-01
We implement Slow Roll Reconstruction -- an optimal solution to the inverse problem for inflationary cosmology -- within ModeCode, a publicly available solver for the inflationary dynamics. We obtain up-to-date constraints on the reconstructed inflationary potential, derived from the WMAP 7-year dataset and South Pole Telescope observations, combined with large scale structure data derived from SDSS Data Release 7. Using ModeCode in conjunction with the MultiNest sampler, we compute Bayesian evidence for the reconstructed potential at each order in the truncated slow roll hierarchy. We find that the data are well-described by the first two slow roll parameters, $\epsilon$ and $\eta$, and that there is no need to include a nontrivial $\xi$ parameter.
Converse, Sarah J.; Royle, J. Andrew; Urbanek, Richard P.
2012-01-01
Inbreeding depression is frequently a concern of managers interested in restoring endangered species. Decisions to reduce the potential for inbreeding depression by balancing genotypic contributions to reintroduced populations may exact a cost on long-term demographic performance of the population if those decisions result in reduced numbers of animals released and/or restriction of particularly successful genotypes (i.e., heritable traits of particular family lines). As part of an effort to restore a migratory flock of Whooping Cranes (Grus americana) to eastern North America using the offspring of captive breeders, we obtained a unique dataset which includes post-release mark-recapture data, as well as the pedigree of each released individual. We developed a Bayesian formulation of a multi-state model to analyze radio-telemetry, band-resight, and dead recovery data on reintroduced individuals, in order to track survival and breeding state transitions. We used studbook-based individual covariates to examine the comparative evidence for and degree of effects of inbreeding, genotype, and genotype quality on post-release survival of reintroduced individuals. We demonstrate implementation of the Bayesian multi-state model, which allows for the integration of imperfect detection, multiple data types, random effects, and individual- and time-dependent covariates. Our results provide only weak evidence for an effect of the quality of an individual's genotype in captivity on post-release survival as well as for an effect of inbreeding on post-release survival. We plan to integrate our results into a decision-analytic modeling framework that can explicitly examine tradeoffs between the effects of inbreeding and the effects of genotype and demographic stochasticity on population establishment.
Examination Paper Analysis Based on Bayesian Networks
Institute of Scientific and Technical Information of China (English)
王娜
2014-01-01
This paper briefly introduces an experiment in examination paper analysis based on Bayesian networks. The main tool used in the experiment is the BNT package, which is written in MATLAB. Through the experiment, we analyze how five factors, including regular attendance rate and homework submission rate, affect student grades.
Torres, Craig; Jones, Rachael; Boelter, Fred; Poole, James; Dell, Linda; Harper, Paul
2014-01-01
Bayesian Decision Analysis (BDA) uses Bayesian statistics to integrate multiple types of exposure information and classify exposures within the exposure rating categorization scheme promoted in American Industrial Hygiene Association (AIHA) publications. Prior distributions for BDA may be developed from existing monitoring data, mathematical models, or professional judgment. Professional judgments may misclassify exposures. We suggest that a structured qualitative risk assessment (QLRA) method can provide consistency and transparency in professional judgments. In this analysis, we use a structured QLRA method to define prior distributions (priors) for BDA. We applied this approach at three semiconductor facilities in South Korea, and present an evaluation of the performance of structured QLRA for determination of priors, and an evaluation of occupational exposures using BDA. Specifically, the structured QLRA was applied to chemical agents in similar exposure groups to identify provisional risk ratings. Standard priors were developed for each risk rating before review of historical monitoring data. Newly collected monitoring data were used to update priors informed by QLRA or historical monitoring data, and determine the posterior distribution. Exposure ratings were defined by the rating category with the highest probability--i.e., the most likely. We found the most likely exposure rating in the QLRA-informed priors to be consistent with historical and newly collected monitoring data, and the posterior exposure ratings developed with QLRA-informed priors to be equal to or greater than those developed with data-informed priors in 94% of comparisons. Overall, exposures at these facilities are consistent with well-controlled work environments. That is, the 95th percentile of exposure distributions are ≤50% of the occupational exposure limit (OEL) for all chemical-SEG combinations evaluated; and are ≤10% of the limit for 94% of chemical-SEG combinations evaluated.
Licquia, Timothy C.; Newman, Jeffrey A.
2016-11-01
The exponential scale length ($L_d$) of the Milky Way's (MW's) disk is a critical parameter for describing the global physical size of our Galaxy, important both for interpreting other Galactic measurements and helping us to understand how our Galaxy fits into extragalactic contexts. Unfortunately, current estimates span a wide range of values and are often statistically incompatible with one another. Here, we perform a Bayesian meta-analysis to determine an improved, aggregate estimate for $L_d$, utilizing a mixture-model approach to account for the possibility that any one measurement has not properly accounted for all statistical or systematic errors. Within this machinery, we explore a variety of ways of modeling the nature of problematic measurements, and then employ a Bayesian model averaging technique to derive net posterior distributions that incorporate any model-selection uncertainty. Our meta-analysis combines 29 different (15 visible and 14 infrared) photometric measurements of $L_d$ available in the literature; these involve a broad assortment of observational data sets, MW models and assumptions, and methodologies, all tabulated herein. Analyzing the visible and infrared measurements separately yields estimates for $L_d$ of $2.71^{+0.22}_{-0.20}$ kpc and $2.51^{+0.15}_{-0.13}$ kpc, respectively, whereas considering them all combined yields $2.64 \pm 0.13$ kpc. The ratio between the visible and infrared scale lengths determined here is very similar to that measured in external spiral galaxies. We use these results to update the model of the Galactic disk from our previous work, constraining its stellar mass to be $4.8^{+1.5}_{-1.1} \times 10^{10}\,M_\odot$, and the MW's total stellar mass to be $5.7^{+1.5}_{-1.1} \times 10^{10}\,M_\odot$.
Lobach, Iryna; Mallick, Bani; Carroll, Raymond J
2011-01-01
Case-control studies are widely used to detect gene-environment interactions in the etiology of complex diseases. Many variables that are of interest to biomedical researchers are difficult to measure on an individual level, e.g. nutrient intake, cigarette smoking exposure, long-term toxic exposure. Measurement error causes bias in parameter estimates, thus masking key features of data and leading to loss of power and spurious/masked associations. We develop a Bayesian methodology for analysis of case-control studies for the case when measurement error is present in an environmental covariate and the genetic variable has missing data. This approach offers several advantages. It allows prior information to enter the model to make estimation and inference more precise. The environmental covariates measured exactly are modeled completely nonparametrically. Further, information about the probability of disease can be incorporated in the estimation procedure to improve the quality of parameter estimates, which cannot be done in conventional case-control studies. A unique feature of the procedure under investigation is that the analysis is based on a pseudo-likelihood function; therefore, conventional Bayesian techniques may not be technically correct. We propose an approach using Markov Chain Monte Carlo sampling as well as a computationally simple method based on an asymptotic posterior distribution. Simulation experiments demonstrated that our method produced parameter estimates that are nearly unbiased even for small sample sizes. An application of our method is illustrated using a population-based case-control study of the association between calcium intake and the risk of colorectal adenoma development.
Energy Technology Data Exchange (ETDEWEB)
Sigeti, David E. [Los Alamos National Laboratory; Pelak, Robert A. [Los Alamos National Laboratory
2012-09-11
We present a Bayesian statistical methodology for identifying improvement in predictive simulations, including an analysis of the number of (presumably expensive) simulations that will need to be made in order to establish with a given level of confidence that an improvement has been observed. Our analysis assumes the ability to predict (or postdict) the same experiments with legacy and new simulation codes and uses a simple binomial model for the probability, $\theta$, that, in an experiment chosen at random, the new code will provide a better prediction than the old. This model makes it possible to do statistical analysis with an absolute minimum of assumptions about the statistics of the quantities involved, at the price of discarding some potentially important information in the data. In particular, the analysis depends only on whether or not the new code predicts better than the old in any given experiment, and not on the magnitude of the improvement. We show how the posterior distribution for $\theta$ may be used, in a kind of Bayesian hypothesis testing, both to decide if an improvement has been observed and to quantify our confidence in that decision. We quantify the predictive probability that should be assigned, prior to taking any data, to the possibility of achieving a given level of confidence, as a function of sample size. We show how this predictive probability depends on the true value of $\theta$ and, in particular, how there will always be a region around $\theta = 1/2$ where it is highly improbable that we will be able to identify an improvement in predictive capability, although the width of this region will shrink to zero as the sample size goes to infinity. We show how the posterior standard deviation may be used, as a kind of 'plan B metric' in the case that the analysis shows that $\theta$ is close to 1/2 and argue that such a plan B should generally be part of hypothesis testing. All the analysis presented in the paper is done with a
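The binomial model described in this abstract can be illustrated with a minimal sketch. It assumes a uniform Beta(1, 1) prior on θ, which the abstract does not specify, and the function names `prob_new_code_better` and `posterior_mean_sd` are hypothetical. With that prior, the posterior after `wins` successes in `trials` experiments is Beta(wins + 1, trials - wins + 1), and P(θ > 1/2 | data) reduces to a finite binomial sum.

```python
from math import comb


def prob_new_code_better(wins: int, trials: int) -> float:
    """P(theta > 1/2 | data), ASSUMING a uniform Beta(1, 1) prior on theta.

    The posterior is Beta(wins + 1, trials - wins + 1). For integer shape
    parameters the Beta CDF equals a binomial tail sum, so no special
    functions are needed.
    """
    if not 0 <= wins <= trials:
        raise ValueError("need 0 <= wins <= trials")
    n = trials + 1
    # P(theta > 1/2) = P(Binomial(n, 1/2) <= wins)
    return sum(comb(n, j) for j in range(wins + 1)) / 2 ** n


def posterior_mean_sd(wins: int, trials: int) -> tuple[float, float]:
    """Posterior mean and standard deviation of theta under the same prior."""
    a, b = wins + 1, trials - wins + 1
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var ** 0.5
```

For example, `prob_new_code_better(8, 10)` gives the posterior probability that the new code predicts better in more than half of all experiments after it won 8 of 10 comparisons; when that probability stays near 0.5, the posterior standard deviation from `posterior_mean_sd` shows how tightly θ is pinned near 1/2.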
Strong approximations and sequential change-point analysis for diffusion processes
DEFF Research Database (Denmark)
Mihalache, Stefan-Radu
2012-01-01
In this paper ergodic diffusion processes depending on a parameter in the drift are considered under the assumption that the processes can be observed continuously. Strong approximations by Wiener processes for a stochastic integral and for the estimator process constructed by the one-step procedure of Le Cam are obtained. Applying these approximations, a CUSUM-type procedure is developed for the sequential testing of changes in the parameter.
12th Brazilian Meeting on Bayesian Statistics
Louzada, Francisco; Rifo, Laura; Stern, Julio; Lauretto, Marcelo
2015-01-01
Through refereed papers, this volume focuses on the foundations of the Bayesian paradigm; their comparison to objectivistic or frequentist Statistics counterparts; and the appropriate application of Bayesian foundations. This research in Bayesian Statistics is applicable to data analysis in biostatistics, clinical trials, law, engineering, and the social sciences. EBEB, the Brazilian Meeting on Bayesian Statistics, is held every two years by the ISBrA, the International Society for Bayesian Analysis, one of the most active chapters of the ISBA. The 12th meeting took place March 10-14, 2014 in Atibaia. Interest in foundations of inductive Statistics has grown recently in accordance with the increasing availability of Bayesian methodological alternatives. Scientists need to deal with the ever more difficult choice of the optimal method to apply to their problem. This volume shows how Bayes can be the answer. The examination and discussion on the foundations work towards the goal of proper application of Bayesia...
Kibret, Taddele; Richer, Danielle; Beyene, Joseph
2014-01-01
Network meta-analysis (NMA) has emerged as a useful analytical tool allowing comparison of multiple treatments based on direct and indirect evidence. Commonly, a hierarchical Bayesian NMA model is used, which allows rank probabilities (the probability that each treatment is best, second best, and so on) to be calculated for decision making. However, the statistical properties of rank probabilities are not well understood. This study investigates how rank probabilities are affected by various factors such as unequal number of studies per comparison in the network, the sample size of individual studies, the network configuration, and effect sizes between treatments. In order to explore these factors, a simulation study of four treatments (three equally effective treatments and one less effective reference) was conducted. The simulation illustrated that estimates of rank probabilities are highly sensitive to both the number of studies per comparison and the overall network configuration. An unequal number of studies per comparison resulted in biased estimates of treatment rank probabilities for every network considered. The rank probability for the treatment that was included in the fewest number of studies was biased upward. Conversely, the rank of the treatment included in the most number of studies was consistently underestimated. When the simulation was altered to include three equally effective treatments and one superior treatment, the hierarchical Bayesian NMA model correctly identified the most effective treatment, regardless of all factors varied. The results of this study offer important insight into the ability of NMA models to rank treatments accurately under several scenarios. The authors recommend that health researchers use rank probabilities cautiously in making important decisions.
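As a rough illustration of the rank probabilities discussed in this abstract, the sketch below counts how often each treatment occupies each rank across posterior draws of treatment effects. The function name `rank_probabilities` and the input layout (one list of effects per posterior draw) are assumptions made for illustration, not the authors' implementation.

```python
def rank_probabilities(draws, higher_is_better=True):
    """Estimate P(treatment t has rank r) from posterior draws.

    draws: list of posterior samples, each a list with one effect per
           treatment (hypothetical layout, chosen for this sketch).
    Returns probs[t][r], where r = 0 means "best".
    """
    n_treat = len(draws[0])
    counts = [[0] * n_treat for _ in range(n_treat)]
    for draw in draws:
        # Order treatment indices from best to worst within this draw.
        order = sorted(range(n_treat), key=lambda t: draw[t],
                       reverse=higher_is_better)
        for rank, treatment in enumerate(order):
            counts[treatment][rank] += 1
    return [[c / len(draws) for c in row] for row in counts]


# Four hand-written "posterior draws" for three treatments.
example_draws = [
    [0.1, 0.5, 0.3],
    [0.4, 0.2, 0.3],
    [0.0, 0.9, 0.1],
    [0.2, 0.8, 0.5],
]
example_probs = rank_probabilities(example_draws)
```

Each row of the result sums to one. Because the estimate is a simple frequency over draws, a treatment with a wide posterior (for instance, one informed by few studies) can pick up a high probability of being best without having a larger mean effect, which is one way to picture the sensitivity the abstract reports.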
Turner, N L; Dias, S; Ades, A E; Welton, N J
2015-05-30
Missing outcome data are a common threat to the validity of the results from randomised controlled trials (RCTs), which, if not analysed appropriately, can lead to misleading treatment effect estimates. Studies with missing outcome data also threaten the validity of any meta-analysis that includes them. A conceptually simple Bayesian framework is proposed, to account for uncertainty due to missing binary outcome data in meta-analysis. A pattern-mixture model is fitted, which allows the incorporation of prior information on a parameter describing the missingness mechanism. We describe several alternative parameterisations, with the simplest being a prior on the probability of an event in the missing individuals. We describe a series of structural assumptions that can be made concerning the missingness parameters. We use some artificial data scenarios to demonstrate the ability of the model to produce a bias-adjusted estimate of treatment effect that accounts for uncertainty. A meta-analysis of haloperidol versus placebo for schizophrenia is used to illustrate the model. We end with a discussion of elicitation of priors, issues with poor reporting and potential extensions of the framework. Our framework allows one to make the best use of evidence produced from RCTs with missing outcome data in a meta-analysis, accounts for any uncertainty induced by missing data and fits easily into a wider evidence synthesis framework for medical decision making.
Stenning, D. C.; Wagner-Kaiser, R.; Robinson, E.; van Dyk, D. A.; von Hippel, T.; Sarajedini, A.; Stein, N.
2016-07-01
We develop a Bayesian model for globular clusters composed of multiple stellar populations, extending earlier statistical models for open clusters composed of simple (single) stellar populations. Specifically, we model globular clusters with two populations that differ in helium abundance. Our model assumes a hierarchical structuring of the parameters in which physical properties—age, metallicity, helium abundance, distance, absorption, and initial mass—are common to (i) the cluster as a whole or to (ii) individual populations within a cluster, or are unique to (iii) individual stars. An adaptive Markov chain Monte Carlo (MCMC) algorithm is devised for model fitting that greatly improves convergence relative to its precursor non-adaptive MCMC algorithm. Our model and computational tools are incorporated into an open-source software suite known as BASE-9. We use numerical studies to demonstrate that our method can recover parameters of two-population clusters, and also show how model misspecification can potentially be identified. As a proof of concept, we analyze the two stellar populations of globular cluster NGC 5272 using our model and methods. (BASE-9 is available from GitHub: https://github.com/argiopetech/base/releases).
Uncertainty analysis of strain modal parameters by Bayesian method using frequency response function
Institute of Scientific and Technical Information of China (English)
Xu Li; Yi Weijian; Zhihua Yi
2007-01-01
Structural strain modes are able to detect changes in local structural performance, but errors are inevitably intermixed in the measured data. In this paper, strain modal parameters are considered as random variables, and their uncertainty is analyzed by a Bayesian method based on the structural frequency response function (FRF). The estimates of strain modal parameters with maximal posterior probability are determined. Several independent measurements of the FRF of a four-story reinforced concrete frame structural model were performed in the laboratory. The ability to identify the stiffness change in a concrete column using the strain mode was verified. It is shown that the uncertainty of the natural frequency is very small. Compared with the displacement mode shape, the variations of strain mode shapes at each point are quite different. The damping ratios are more affected by the types of test systems. Except for the case where a high order strain mode does not identify local damage, the first order strain mode can provide an exact indication of the damage location.
A Bayesian threshold-normal mixture model for analysis of a continuous mastitis-related trait.
Ødegård, J; Madsen, P; Gianola, D; Klemetsdal, G; Jensen, J; Heringstad, B; Korsgaard, I R
2005-07-01
Mastitis is associated with elevated somatic cell count in milk, inducing a positive correlation between milk somatic cell score (SCS) and the absence or presence of the disease. In most countries, selection against mastitis has focused on selecting parents with genetic evaluations that have low SCS. Univariate or multivariate mixed linear models have been used for statistical description of SCS. However, an observation of SCS can be regarded as drawn from a 2- (or more) component mixture defined by the (usually) unknown health status of a cow at the test-day on which SCS is recorded. A hierarchical 2-component mixture model was developed, assuming that the health status affecting the recorded test-day SCS is completely specified by an underlying liability variable. Based on the observed SCS, inferences can be drawn about disease status and parameters of both SCS and liability to mastitis. The prior probability of putative mastitis was allowed to vary between subgroups (e.g., herds, families), by specifying fixed and random effects affecting both SCS and liability. Using simulation, it was found that a Bayesian model fitted to the data yielded parameter estimates close to their true values. The model provides selection criteria that are more appealing than selection for lower SCS. The proposed model can be extended to handle a wide range of problems related to genetic analyses of mixture traits.
A Bayesian analysis of redshifted 21-cm HI signal and foregrounds: Simulations for LOFAR
Ghosh, Abhik; Chapman, Emma; Jelic, Vibor
2015-01-01
Observations of the EoR with the 21-cm hyperfine emission of neutral hydrogen (HI) promise to open an entirely new window onto the formation of the first stars, galaxies and accreting black holes. In order to characterize the weak 21-cm signal, we need to develop imaging techniques which can reconstruct the extended emission very precisely. Here, we present an inversion technique for LOFAR baselines at NCP, based on a Bayesian formalism with optimal spatial regularization, which is used to reconstruct the diffuse foreground map directly from the simulated visibility data. We notice the spatial regularization de-noises the images to a large extent, allowing one to recover the 21-cm power-spectrum over a considerable $k_{\perp}-k_{\parallel}$ space in the range of $0.03\,{\rm Mpc^{-1}}
Mbakwe, Anthony C; Saka, Anthony A; Choi, Keechoo; Lee, Young-Jae
2016-08-01
Highway traffic accidents all over the world result in more than 1.3 million fatalities annually. An alarming number of these fatalities occur in developing countries. There are many risk factors that are associated with frequent accidents, heavy loss of lives, and property damage in developing countries. Unfortunately, poor record-keeping practices are a very difficult obstacle to overcome in striving to obtain near-accurate casualty and safety data. Because there are numerous accident causes, any attempt to curb the escalating death and injury rates in developing countries must include the identification of the primary accident causes. This paper, therefore, seeks to show that the Delphi Technique is a suitable alternative method that can be exploited in generating highway traffic accident data through which the major accident causes can be identified. In order to authenticate the technique used, Korea is chosen for this purpose: it underwent similar problems in its early stages of development and has excellent highway safety records in its database. Validation of the methodology confirms the technique is suitable for application in developing countries. Furthermore, the Delphi Technique, in combination with the Bayesian Network Model, is utilized in modeling highway traffic accidents and forecasting accident rates in the countries of research.
Bayesian analysis of cosmic-ray propagation: evidence against homogeneous diffusion
Jóhannesson, G; Vincent, A C; Moskalenko, I V; Orlando, E; Porter, T A; Strong, A W; Trotta, R; Feroz, F; Graff, P; Hobson, M P
2016-01-01
We present the results of the most complete ever scan of the parameter space for cosmic ray (CR) injection and propagation. We perform a Bayesian search of the main GALPROP parameters, using the MultiNest nested sampling algorithm, augmented by the BAMBI neural network machine learning package. This is the first such study to separate out low-mass isotopes ($p$, $\bar p$ and He) from the usual light elements (Be, B, C, N, O). We find that the propagation parameters that best fit $p$, $\bar p$, He data are significantly different from those that fit light elements, including the B/C and $^{10}$Be/$^9$Be secondary-to-primary ratios normally used to calibrate propagation parameters. This suggests each set of species is probing a very different interstellar medium, and that the standard approach of calibrating propagation parameters using B/C can lead to incorrect results. We present posterior distributions and best fit parameters for propagation of both sets of nuclei, as well as for the injection abundances of ...
Snyder, Carolyn W.
2016-09-01
Statistical challenges often preclude comparisons among different sea surface temperature (SST) reconstructions over the past million years. Inadequate consideration of uncertainty can result in misinterpretation, overconfidence, and biased conclusions. Here I apply Bayesian hierarchical regressions to analyze local SST responsiveness to climate changes for 54 SST reconstructions from across the globe over the past million years. I develop methods to account for multiple sources of uncertainty, including the quantification of uncertainty introduced from absolute dating into interrecord comparisons. The estimates of local SST responsiveness explain 64% (62% to 77%, 95% interval) of the total variation within each SST reconstruction with a single number. There is remarkable agreement between SST proxy methods, with the exception of Mg/Ca proxy methods estimating muted responses at high latitudes. The Indian Ocean exhibits a muted response in comparison to other oceans. I find a stable estimate of the proposed "universal curve" of change in local SST responsiveness to climate changes as a function of $\sin^2(\mathrm{latitude})$ over the past 400,000 years: SST change at 45°N/S is larger than the average tropical response by a factor of 1.9 (1.5 to 2.6, 95% interval) and explains 50% (35% to 58%, 95% interval) of the total variation between each SST reconstruction. These uncertainty and statistical methods are well suited for application across paleoclimate and environmental data series intercomparisons.
A Bayesian analysis of trends in ozone sounding data series from 9 Nordic stations
Christiansen, Bo; Jepsen, Nis; Larsen, Niels; Korsholm, Ulrik S.
2016-04-01
Ozone soundings from 9 Nordic stations have been homogenized and interpolated to standard pressure levels. The different stations have very different data coverage; the longest period with data is from the end of the 1980s to 2013. We apply a model which includes low-frequency variability in the form of a polynomial, an annual cycle with harmonics, the possibility of low-frequency variability in the annual amplitude and phasing, and either white noise or AR1 noise. The fitting of the parameters is performed with a Bayesian approach, giving not only the posterior mean values but also credible intervals. We find that all stations agree on a well-defined annual cycle in the free troposphere with a relatively confined maximum in the early summer. Regarding the low-frequency variability, we find that Scoresbysund, Ny Aalesund, and Sodankyla show similar structures with a maximum near 2005 followed by a decrease. However, these results are only weakly significant. A significant change in the amplitude of the annual cycle was only found for Ny Aalesund. Here the peak-to-peak amplitude changes from 0.9 to 0.8 mhPa between 1995-2000 and 2007-2012. The results are shown to be robust to the different settings of the model parameters (order of the polynomial, number of harmonics in the annual cycle, type of noise, etc.). The results are also shown to be characteristic for all pressure levels in the free troposphere.
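The deterministic part of the model described in this abstract (a polynomial trend plus an annual cycle with harmonics) can be sketched as a linear regression design. The sketch below is an assumption-laden illustration: it omits the slowly varying annual amplitude and phase and the white/AR1 noise terms, measures time in years, and uses the hypothetical names `design_row` and `model_mean`.

```python
import math


def design_row(t_years: float, poly_order: int, n_harmonics: int) -> list[float]:
    """Regressors at time t: [1, t, ..., t^p, cos(2*pi*t), sin(2*pi*t), ...].

    Harmonic k has a period of 1/k years, so k = 1 is the annual cycle.
    """
    row = [t_years ** p for p in range(poly_order + 1)]
    for k in range(1, n_harmonics + 1):
        angle = 2.0 * math.pi * k * t_years
        row.append(math.cos(angle))
        row.append(math.sin(angle))
    return row


def model_mean(t_years: float, coeffs: list[float],
               poly_order: int, n_harmonics: int) -> float:
    """Mean ozone value at time t for a given coefficient vector."""
    row = design_row(t_years, poly_order, n_harmonics)
    if len(coeffs) != len(row):
        raise ValueError("coefficient vector does not match the design")
    return sum(c * x for c, x in zip(coeffs, row))
```

In a Bayesian fit, priors would be placed on the coefficients and the noise parameters, and the credible intervals mentioned in the abstract would come from the resulting posterior; the polynomial order and number of harmonics are the settings the abstract reports the results to be robust against.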
Directory of Open Access Journals (Sweden)
Chen Yidong
2011-10-01
Background: Transcriptional regulation by transcription factors (TFs) controls the timing and abundance of mRNA transcription. Due to the limitations of current proteomics technologies, large-scale measurement of protein-level TF activities is usually infeasible, making computational reconstruction of transcriptional regulatory networks a difficult task. Results: We propose a novel Bayesian non-negative factor model for TF-mediated regulatory networks. In particular, the non-negative TF activities and sample clustering effect are modeled as the factors from a Dirichlet process mixture of rectified Gaussian distributions, and the sparse regulatory coefficients are modeled as the loadings from a sparse distribution that constrains its sparsity using knowledge from databases; in addition, a Gibbs sampling solution was developed to infer the underlying network structure and the unknown TF activities simultaneously. The developed approach has been applied to a simulated system and to breast cancer gene expression data. The results show that the proposed method was able to systematically uncover the TF-mediated transcriptional regulatory network structure, the regulatory coefficients, the TF protein-level activities and the sample clustering effect. The regulation target predictions are highly consistent with the prior knowledge, and the sample clustering shows superior performance over a previous molecular-based clustering method. Conclusions: The results demonstrate the validity and effectiveness of the proposed approach in reconstructing TF-mediated transcriptional networks on simulated systems and real data.
An empirical Bayesian analysis applied to the globular cluster pulsar population
Turk, P J
2013-01-01
We describe an empirical Bayesian approach to determine the most likely size of an astronomical population of sources of which only a small subset are observed above some limiting flux density threshold. The method is most naturally applied to astronomical source populations at a common distance (e.g., stellar populations in globular clusters), and can be applied even to populations where a survey detects no objects. The model allows for the inclusion of physical parameters of the stellar population and the detection process. As an example, we apply this method to the current sample of radio pulsars in Galactic globular clusters. Using the sample of flux density limits on pulsar surveys in 94 globular clusters published by Boyles et al., we examine a large number of population models with different dependencies. We find that models which include the globular cluster two-body encounter rate, $\Gamma$, are strongly favoured over models in which this is not a factor. The optimal model is one in which the mean num...
Directory of Open Access Journals (Sweden)
Zsolt Zador
Traumatic brain injury remains a global health problem. Understanding the relative importance of outcome predictors helps optimize our treatment strategies by informing assessment protocols, clinical decisions and trial designs. In this study we establish importance ranking for outcome predictors based on receiver operating indices to identify key predictors of outcome and create simple predictive models. We then explore the associations between key outcome predictors using Bayesian networks to gain further insight into predictor importance. We analyzed the corticosteroid randomization after significant head injury (CRASH) trial database of 10008 patients and included patients for whom demographics, injury characteristics, computer tomography (CT) findings and Glasgow Outcome Scale (GCS) were recorded (total of 13 predictors, which would be available to clinicians within a few hours following the injury) in 6945 patients. Predictions of clinical outcome (death or severe disability at 6 months) were performed using logistic regression models with 5-fold cross validation. Predictive performance was measured using standardized partial area (pAUC) under the receiver operating curve (ROC) and we used the DeLong test for comparisons. Variable importance ranking was based on pAUC targeted at specificity (pAUCSP) and sensitivity (pAUCSE) intervals of 90-100%. Probabilistic associations were depicted using Bayesian networks. Complete AUC analysis showed very good predictive power (AUC = 0.8237, 95% CI: 0.8138-0.8336) for the complete model. Specificity-focused importance ranking highlighted age, pupillary and motor responses, obliteration of basal cisterns/3rd ventricle and midline shift. Interestingly, when targeting model sensitivity, the highest-ranking variables were age, severe extracranial injury, verbal response, hematoma on CT and motor response. Simplified models, which included only these key predictors, had similar performance (pAUCSP = 0.6523, 95% CI: 0
Prior approval: the growth of Bayesian methods in psychology.
Andrews, Mark; Baguley, Thom
2013-02-01
Within the last few years, Bayesian methods of data analysis in psychology have proliferated. In this paper, we briefly review the history of the Bayesian approach to statistics, and consider the implications that Bayesian methods have for the theory and practice of data analysis in psychology.
Bayesian peak bagging analysis of 19 low-mass low-luminosity red giants observed with Kepler
Corsaro, E; García, R A
2015-01-01
The currently available Kepler light curves contain an outstanding amount of information but a detailed analysis of the individual oscillation modes in the observed power spectra, also known as peak bagging, is computationally demanding and challenging to perform on a large number of targets. Our intent is to perform for the first time a peak bagging analysis on a sample of 19 low-mass low-luminosity red giants observed by Kepler for more than four years. This allows us to provide high-quality asteroseismic measurements that can be exploited for an intensive testing of the physics used in stellar structure models, stellar evolution and pulsation codes, as well as for refining existing asteroseismic scaling relations in the red giant branch regime. For this purpose, powerful and sophisticated analysis tools are needed. We exploit the Bayesian code Diamonds, using an efficient nested sampling Monte Carlo algorithm, to perform both a fast fitting of the individual oscillation modes and a peak detection test base...
Krishnamurthy, Krish
2013-12-01
The intrinsic quantitative nature of NMR is increasingly exploited in areas ranging from complex mixture analysis (as in metabolomics and reaction monitoring) to quality assurance/control. Complex NMR spectra are more common than not, and therefore, extraction of quantitative information generally involves significant prior knowledge and/or operator interaction to characterize resonances of interest. Moreover, in most NMR-based metabolomic experiments, the signals from metabolites are normally present as a mixture of overlapping resonances, making quantification difficult. Time-domain Bayesian approaches have been reported to be better than conventional frequency-domain analysis at identifying subtle changes in signal amplitude. We discuss an approach that exploits Bayesian analysis to achieve a complete reduction to amplitude frequency table (CRAFT) in an automated and time-efficient fashion - thus converting the time-domain FID to a frequency-amplitude table. CRAFT uses a two-step approach to FID analysis. First, the FID is digitally filtered and downsampled to several sub FIDs, and secondly, these sub FIDs are then modeled as sums of decaying sinusoids using the Bayesian approach. CRAFT tables can be used for further data mining of quantitative information using fingerprint chemical shifts of compounds of interest and/or statistical analysis of modulation of chemical quantity in a biological study (metabolomics) or process study (reaction monitoring) or quality assurance/control. The basic principles behind this approach as well as results to evaluate the effectiveness of this approach in mixture analysis are presented.
Directory of Open Access Journals (Sweden)
Zhi-Qiang Cai
The prognosis of hepatocellular carcinoma (HCC) after hepatectomy involves many factors. Previous studies have evaluated the separate influences of single factors; few have considered the combined influence of various factors. This paper combines the Bayesian network (BN) with importance measures to identify key factors that have significant effects on survival time. A dataset of 299 patients with HCC after hepatectomy was studied to establish a BN using a tree-augmented naïve Bayes algorithm that could mine relationships between factors. The composite importance measure was applied to rank the impact of factors on survival time. 124 patients (>10 months) and 77 patients (≤10 months) were correctly classified. The accuracy of the BN model was 67.2%. For patients with long survival time (>10 months), the true-positive rate of the model was 83.22% and the false-positive rate was 48.67%. According to the model, the preoperative alpha fetoprotein (AFP) level and postoperative performance of transcatheter arterial chemoembolization (TACE) were independent factors for survival of HCC patients. The grade of preoperative liver function reflected the tendency for postoperative complications. Intraoperative blood loss, tumor size, portal vein tumor thrombosis (PVTT), time of clamping the porta hepatis, tumor number, operative method, and metastasis were dependent variables in survival time prediction. PVTT was considered the most significant for the prognosis of survival time. Using the BN and importance measures, PVTT was identified as the most significant predictor of survival time for patients with HCC after hepatectomy.
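The reported accuracy can be checked directly from the classification counts in the abstract above. A minimal sketch in Python; the variable names are illustrative and do not come from the paper:

```python
# Counts reported in the abstract above.
correct_long = 124     # survival > 10 months, correctly classified
correct_short = 77     # survival <= 10 months, correctly classified
total_patients = 299

# Overall accuracy = correctly classified patients / all patients.
accuracy = (correct_long + correct_short) / total_patients
print(f"accuracy = {accuracy:.1%}")
```

The result, 201 of 299 patients, agrees with the 67.2% stated in the abstract.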
Bayesian Analysis and Characterization of Multiple Populations in Galactic Globular Clusters
Wagner-Kaiser, Rachel A.; Stenning, David; Sarajedini, Ata; von Hippel, Ted; van Dyk, David A.; Robinson, Elliot; Stein, Nathan; Jefferys, William H.; BASE-9, HST UVIS Globular Cluster Treasury Program
2017-01-01
Globular clusters have long been important tools to unlock the early history of galaxies. Thus, it is crucial we understand the formation and characteristics of the globular clusters (GCs) themselves. Historically, GCs were thought to be simple and largely homogeneous populations, formed via collapse of a single molecular cloud. However, this classical view has been overwhelmingly invalidated by recent work. It is now clear that the vast majority of globular clusters in our Galaxy host two or more chemically distinct populations of stars, with variations in helium and light elements at discrete abundance levels. No coherent story has arisen that is able to fully explain the formation of multiple populations in globular clusters nor the mechanisms that drive stochastic variations from cluster to cluster. We use Cycle 21 Hubble Space Telescope (HST) observations and HST archival ACS Treasury observations of 30 Galactic Globular Clusters to characterize two distinct stellar populations. A sophisticated Bayesian technique is employed to simultaneously sample the joint posterior distribution of age, distance, and extinction for each cluster, as well as unique helium values for two populations within each cluster and the relative proportion of those populations. We find the helium differences between the two populations in the clusters fall in the range of 0.04 to 0.11. Because adequate models varying in CNO are not presently available, we view these spreads as upper limits and present them with statistical rather than observational uncertainties. Evidence supports previous studies suggesting an increase in helium content concurrent with increasing mass of the cluster. We also find that the proportion of the first population of stars increases with mass. Our results are examined in the context of proposed globular cluster formation scenarios.
Sensitivity issues in the Bayesian analysis of failures in repairable systems
Energy Technology Data Exchange (ETDEWEB)
Ruggero, F.
2001-07-01
The paper arises from a consulting project in which failures (gas escapes) are considered in the steel pipelines of an urban gas distribution network. The available data are the 33 failure times from 1978 to 1997 over a network of 275 kilometres. More details on the data can be found in (1). Steel pipelines are subject to ageing and the global system reliability is barely affected by one failure. Thus, we consider the network as a repairable system, since it keeps the same reliability when minimal repairs immediately follow failures. Failures of such systems are often described by non-homogeneous Poisson processes (NHPP), which take into account the degradation of the components; see, e.g., (2). In the paper we model the failure pattern with a NHPP with logarithmic intensity and present different sensitivity analyses when relaxing the assumption on the parametric model. We operate in a robust Bayesian framework; see (3) for a thorough illustration of the approach. In the paper we do not focus on the commonly addressed issue of sensitivity to the prior, but we are interested in model sensitivity and consider two ways to build classes of models around the NHPP. In the first approach, we consider the NHPP as an element of a class of processes, defined through differential equations whose solutions are mean value functions of NHPPs. In the second, nonparametric approach, we consider processes whose mean value functions are distribution functions of random measures. In both cases, we compare the models with the baseline, logarithmic NHPP. In Section 2 we analyse the logarithmic NHPP and the class of parametric models, whereas comparison between parametric and nonparametric models is performed in Section 3. Some concluding remarks are presented in Section. (Author) 12 refs.
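To make the idea of an NHPP with an increasing, logarithmic intensity concrete, the sketch below simulates failure times by thinning. The abstract does not give the functional form, so the intensity lambda(t) = a * log(1 + b * t), the parameter values, and all function names are assumptions chosen for illustration only; they are not the authors' fitted model.

```python
import math
import random

def intensity(t, a, b):
    """Assumed logarithmic intensity: lambda(t) = a * log(1 + b * t)."""
    return a * math.log(1.0 + b * t)

def mean_value(t, a, b):
    """Integral of the assumed intensity over [0, t]: expected failures by time t."""
    u = 1.0 + b * t
    return a * (u * math.log(u) - b * t) / b

def simulate_nhpp(horizon, a, b, rng):
    """Draw failure times on [0, horizon] by thinning a homogeneous Poisson process."""
    lam_max = intensity(horizon, a, b)  # the intensity is increasing, so this is its maximum
    t, events = 0.0, []
    while True:
        t += rng.expovariate(lam_max)   # candidate event from the dominating process
        if t > horizon:
            return events
        # Accept the candidate with probability lambda(t) / lam_max.
        if rng.random() * lam_max <= intensity(t, a, b):
            events.append(t)

rng = random.Random(0)
a, b, horizon = 1.0, 0.5, 20.0          # hypothetical values, not fitted to the gas-network data
times = simulate_nhpp(horizon, a, b, rng)
print(len(times), "simulated failures; expected", round(mean_value(horizon, a, b), 2))
```

The mean value function is the quantity the abstract's two model classes are built around, which is why it is exposed as a separate function here.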
Bayesian theory and applications
Dellaportas, Petros; Polson, Nicholas G; Stephens, David A
2013-01-01
The development of hierarchical models and Markov chain Monte Carlo (MCMC) techniques forms one of the most profound advances in Bayesian analysis since the 1970s and provides the basis for advances in virtually all areas of applied and theoretical Bayesian statistics. This volume guides the reader along a statistical journey that begins with the basic structure of Bayesian theory, and then provides details on most of the past and present advances in this field. The book has a unique format. There is an explanatory chapter devoted to each conceptual advance followed by journal-style chapters that provide applications or further advances on the concept. Thus, the volume is both a textbook and a compendium of papers covering a vast range of topics. It is appropriate for a well-informed novice interested in understanding the basic approach, methods and recent applications. Because of its advanced chapters and recent work, it is also appropriate for a more mature reader interested in recent applications and devel...
Walter, William D.; Smith, Rick; Vanderklok, Mike; VerCauterren, Kurt C.
2014-01-01
Bovine tuberculosis is a bacterial disease caused by Mycobacterium bovis in livestock and wildlife with hosts that include Eurasian badgers (Meles meles), brushtail possum (Trichosurus vulpecula), and white-tailed deer (Odocoileus virginianus). Risk-assessment efforts in Michigan have been initiated on farms to minimize interactions of cattle with wildlife hosts but research on M. bovis on cattle farms has not investigated the spatial context of disease epidemiology. To incorporate spatially explicit data, initial likelihood of infection probabilities for cattle farms tested for M. bovis, prevalence of M. bovis in white-tailed deer, deer density, and environmental variables for each farm were modeled in a Bayesian hierarchical framework. We used geo-referenced locations of 762 cattle farms that have been tested for M. bovis, white-tailed deer prevalence, and several environmental variables that may lead to long-term survival and viability of M. bovis on farms and surrounding habitats (i.e., soil type, habitat type). Bayesian hierarchical analyses identified deer prevalence and proportion of sandy soil within our sampling grid as the most supported model. Analysis of cattle farms tested for M. bovis showed that every 1% increase in sandy soil resulted in a 4% increase in the odds of infection. Our analysis revealed that the influence of prevalence of M. bovis in white-tailed deer was still a concern even after considerable efforts to prevent cattle interactions with white-tailed deer through on-farm mitigation and reduction in the deer population. Cattle farms test positive for M. bovis annually in our study area, suggesting that an environmental source either on farms or in the surrounding landscape may be contributing to new or re-infections with M. bovis. Our research provides an initial assessment of potential environmental factors that could be incorporated into additional modeling efforts as more knowledge of deer herd
Echeverria, Alex; Silva, Jorge F.; Mendez, Rene A.; Orchard, Marcos
2016-10-01
Context. The best precision that can be achieved to estimate the location of a stellar-like object is a topic of permanent interest in the astrometric community. Aims: We analyze bounds for the best position estimation of a stellar-like object on a CCD detector array in a Bayesian setting where the position is unknown, but where we have access to a prior distribution. In contrast to a parametric setting where we estimate a parameter from observations, the Bayesian approach estimates a random object (i.e., the position is a random variable) from observations that are statistically dependent on the position. Methods: We characterize the Bayesian Cramér-Rao (CR) bound on the minimum mean square error (MMSE) of the best estimator of the position of a point source on a linear CCD-like detector, as a function of the properties of the detector, the source, and the background. Results: We quantify and analyze the increase in astrometric performance from the use of a prior distribution of the object position, which is not available in the classical parametric setting. This gain is shown to be significant for various observational regimes, in particular in the case of faint objects or when the observations are taken under poor conditions. Furthermore, we present numerical evidence that the MMSE estimator of this problem tightly achieves the Bayesian CR bound. This is a remarkable result, demonstrating that all the performance gains presented in our analysis can be achieved with the MMSE estimator. Conclusions: The Bayesian CR bound can be used as a benchmark indicator of the expected maximum positional precision of a set of astrometric measurements in which prior information can be incorporated. This bound can be achieved through the conditional mean estimator, in contrast to the parametric case where no unbiased estimator precisely reaches the CR bound.
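For reference, the scalar Bayesian CR (Van Trees) inequality has the standard form below. The symbols are our own notation and not necessarily the authors': $\theta$ is the source position, $x$ the observed pixel data, $\pi$ the prior, and the expectation on the left is taken jointly over $x$ and $\theta$.

```latex
\mathbb{E}\!\left[\bigl(\hat{\theta}(x) - \theta\bigr)^{2}\right]
\;\ge\;
\frac{1}{\mathbb{E}_{\theta}\!\left[I(\theta)\right] + I(\pi)},
\qquad
I(\theta) = \mathbb{E}_{x \mid \theta}\!\left[\left(\frac{\partial}{\partial\theta}\log p(x \mid \theta)\right)^{2}\right],
\qquad
I(\pi) = \mathbb{E}_{\theta}\!\left[\left(\frac{d}{d\theta}\log \pi(\theta)\right)^{2}\right]
```

The prior term $I(\pi)$ is the part that has no counterpart in the classical parametric bound, which is why the gain is largest when the data term is small (faint sources, poor observing conditions). The conditional mean estimator mentioned in the abstract is $\hat{\theta}_{\mathrm{MMSE}}(x) = \mathbb{E}[\theta \mid x]$.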
Moore, Jeffrey E; Read, Andrew J
2008-12-01
Wildlife ecologists and managers are challenged to make the most of sparse information for understanding demography of many species, especially those that are long lived and difficult to observe. For many odontocete (dolphin, porpoise, toothed whale) populations, only fertility and age-at-death data are feasibly obtainable. We describe a Bayesian approach for using fertilities and two types of age-at-death data (i.e., age structure of deaths from all mortality sources and age structure of anthropogenic mortalities only) to estimate rate of increase, mortality rates, and impacts of anthropogenic mortality on those rates for a population assumed to be in a stable age structure. We used strandings data from 1977 to 1993 (n = 96) and observer bycatch data from 1989 to 1993 (n = 233) for the Gulf of Maine, USA, and Bay of Fundy, Canada, harbor porpoise (Phocoena phocoena) population as a case study. Our method combines mortality risk functions to estimate parameters describing age-specific natural and bycatch mortality rates. Separate functions are simultaneously fit to bycatch and strandings data, the latter of which are described as a mixture of natural and bycatch mortalities. Euler-Lotka equations and an estimate of longevity were used to constrain parameter estimates, and we included a parameter to account for unequal probabilities of natural vs. bycatch deaths occurring in a sample. We fit models under two scenarios intended to correct for possible data bias due to indirect bycatch of calves (i.e., death following bycatch mortality of mothers) being underrepresented in the bycatch sample. Results from the two scenarios were "model averaged" by sampling from both Markov Chain Monte Carlo (MCMC) chains with uniform probability. The median estimate for potential population growth (r(nat)) was 0.046 (90% credible interval [CRI] = 0.004-0.116). The median for actual growth (r) was -0.030 (90% CRI = -0.192 to +0.065). The probability of population decline due to added
Melucci, L M; Birchmeier, A N; Cappa, E P; Cantet, R J C
2009-10-01
An experimental Hereford herd established in 1960 was used from 1986 to 2006 to select for increased weaning weight (W) without increasing birth weight (B). Data were B and W collected over the 47 yr from 2,124 calves. Including ancestors, the pedigree file had 2,369 animals. Selection was practiced only in males. In the first stage (1986 to 1993), mass-selected bulls were chosen with the index I = B + 9374.76 RDG (relative daily gain). From 1994 to 2006, the selection criterion for bull i was I(i) = BLUP(i)(WD) - 2.33 BLUP(i)(BD), where the BLUP were for the direct BV of B (BD) and W (WD), respectively. Predictions were obtained from a 2-trait animal model with B having only BD, and W with WD and WM (maternal additive effects). Selection response was estimated using a Bayesian approach by means of the Gibbs sampler for a 2-trait animal model including BD, BM (maternal BV for B), WD, and WM. Estimated heritabilities for BD, BM, WD, and WM were 0.40, 0.23, 0.05, and 0.23, respectively. The correlation between BD and BM was close to zero (0.01), and between WD and WM was positive (0.37). The correlation between BD and WD was 0.07, and between BM and WM was 0.58. The 2 methods used to estimate selection response gave similar results. In both periods BD decreased, whereas BM increased. The reduction of BD due to selection was slightly larger in the second period than in the first one. The regression of BV for W increased due to selection in both stages, but selection response was 21.6% larger from 1986 to 1992 than from 1993 to 2006. The maternal effect, WM increased more than 3 times compared with WD in the first period, but ended up being almost the same value as WD in period 2. The Bulmer effect was manifested by the decrease in magnitude of all (co)variance components during selection. It is concluded that selection to increase BW at weaning in beef cattle, although not increasing BW at birth, was moderately effective.
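The second-stage selection criterion above, I(i) = BLUP(i)(WD) - 2.33 BLUP(i)(BD), is a simple linear index, so ranking candidates is a one-line computation. A minimal sketch; the bull names and BLUP values are hypothetical and only the weight 2.33 comes from the abstract:

```python
def selection_index(blup_wd, blup_bd, weight=2.33):
    """Index from the abstract: direct BV for weaning weight minus 2.33 times direct BV for birth weight."""
    return blup_wd - weight * blup_bd

# Hypothetical (BLUP_WD, BLUP_BD) pairs for three candidate bulls.
bulls = {
    "bull_A": (12.0, 1.5),
    "bull_B": (10.0, -0.5),
    "bull_C": (8.0, 2.0),
}

# Rank candidates from highest to lowest index value.
ranked = sorted(bulls, key=lambda name: selection_index(*bulls[name]), reverse=True)
for name in ranked:
    print(name, round(selection_index(*bulls[name]), 3))
```

In this made-up example the bull with the highest weaning-weight BLUP is not ranked first, because the index penalizes a positive breeding value for birth weight.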
Bayesian artificial intelligence
Korb, Kevin B
2003-01-01
As the power of Bayesian techniques has become more fully realized, the field of artificial intelligence has embraced Bayesian methodology and integrated it to the point where an introduction to Bayesian techniques is now a core course in many computer science programs. Unlike other books on the subject, Bayesian Artificial Intelligence keeps mathematical detail to a minimum and covers a broad range of topics. The authors integrate all of Bayesian net technology and learning Bayesian net technology and apply them both to knowledge engineering. They emphasize understanding and intuition but also provide the algorithms and technical background needed for applications. Software, exercises, and solutions are available on the authors' website.
Bayesian artificial intelligence
Korb, Kevin B
2010-01-01
Updated and expanded, Bayesian Artificial Intelligence, Second Edition provides a practical and accessible introduction to the main concepts, foundation, and applications of Bayesian networks. It focuses on both the causal discovery of networks and Bayesian inference procedures. Adopting a causal interpretation of Bayesian networks, the authors discuss the use of Bayesian networks for causal modeling. They also draw on their own applied research to illustrate various applications of the technology. New to the Second Edition: new chapter on Bayesian network classifiers; new section on object-oriente
Owens Chantelle J; Owusu-Edusei Kwame
2009-01-01
Background: Chlamydia continues to be the most prevalent disease in the United States. Effective spatial monitoring of chlamydia incidence is important for successful implementation of control and prevention programs. The objective of this study is to apply Bayesian smoothing and exploratory spatial data analysis (ESDA) methods to monitor Texas county-level chlamydia incidence rates by examining spatiotemporal patterns. We used county-level data on chlamydia incidence (for all ages, g...
Variational Bayesian method of estimating variance components.
Arakawa, Aisaku; Taniguchi, Masaaki; Hayashi, Takeshi; Mikawa, Satoshi
2016-07-01
We developed a Bayesian analysis approach by using a variational inference method, a so-called variational Bayesian method, to determine the posterior distributions of variance components. This variational Bayesian method and an alternative Bayesian method using Gibbs sampling were compared in estimating genetic and residual variance components from both simulated data and publicly available real pig data. In the simulated data set, we observed strong bias toward overestimation of genetic variance for the variational Bayesian method in the case of low heritability and low population size, and less bias was detected with larger population sizes in both methods examined. No differences in the estimates of variance components between the variational Bayesian method and Gibbs sampling were found in the real pig data. However, the posterior distributions of the variance components obtained with the variational Bayesian method had shorter tails than those obtained with Gibbs sampling. Consequently, the posterior standard deviations of the genetic and residual variances of the variational Bayesian method were lower than those of the method using Gibbs sampling. The computing time required was much shorter with the variational Bayesian method than with the method using Gibbs sampling.
Energy Technology Data Exchange (ETDEWEB)
Trucco, P. [Department of Management, Economics and Industrial Engineering-Politecnico di Milano, Piazza Leonardo da Vinci, 32, I-20133 Milan (Italy)], E-mail: paolo.trucco@polimi.it; Cagno, E. [Department of Management, Economics and Industrial Engineering-Politecnico di Milano, Piazza Leonardo da Vinci, 32, I-20133 Milan (Italy); Ruggeri, F. [CNR IMATI, via E.Bassini, 15, I-20133 Milan (Italy); Grande, O. [Department of Management, Economics and Industrial Engineering-Politecnico di Milano, Piazza Leonardo da Vinci, 32, I-20133 Milan (Italy)
2008-06-15
The paper presents an innovative approach to integrate Human and Organisational Factors (HOF) into risk analysis. The approach has been developed and applied to a case study in the maritime industry, but it can also be utilised in other sectors. A Bayesian Belief Network (BBN) has been developed to model the Maritime Transport System (MTS), by taking into account its different actors (i.e., ship-owner, shipyard, port and regulator) and their mutual influences. The latter have been modelled by means of a set of dependent variables whose combinations express the relevant functions performed by each actor. The BBN model of the MTS has been used in a case study for the quantification of HOF in the risk analysis carried out at the preliminary design stage of High Speed Craft (HSC). The study has focused on the hazard of a collision in open sea and was carried out by means of an original method that integrates a Fault Tree Analysis (FTA) of technical elements with a BBN model of the influences of organisational functions and regulations, as suggested by the International Maritime Organisation's (IMO) Guidelines for Formal Safety Assessment (FSA). The approach has allowed the identification of probabilistic correlations between the basic events of a collision accident and the BBN model of the operational and organisational conditions. The linkage can be exploited in different ways, especially to support identification and evaluation of risk control options also at the organisational level. Conditional probabilities for the BBN have been estimated by means of experts' judgments, collected from an international panel of different European countries. Finally, a sensitivity analysis has been carried out over the model to identify configurations of the MTS leading to a significant reduction of accident probability during the operation of the HSC.
Multilevel Group Analysis of fMRI Time Series Using a Bayesian Method
Institute of Scientific and Technical Information of China (English)
周广田; 杨丰; 田晓英
2016-01-01
This paper proposes a method based on Bayesian inference for group analysis of fMRI time series. The data are divided into multiple levels (voxel, subject, and group), and features compared across single subjects are used as priors to strengthen the posterior computation at the group level. Parameter estimation for individual subjects is combined with classical statistics (i.e., a t-test) to obtain voxel activation regions at the subject level, which serve as the prior for Bayesian inference at the group level. The approach effectively reduces computational cost and complexity, and Bayesian inference proves robust when applied to group analysis.
Indian Academy of Sciences (India)
Tao Wei; Xiao Xiao Jin; Tian Jun Xu
2013-08-01
To understand the phylogenetic position of Bostrychus sinensis in Eleotridae and the phylogenetic relationships of the family, we determined the nucleotide sequence of the mitochondrial (mt) genome of Bostrychus sinensis. It is the first complete mitochondrial genome sequence of the genus Bostrychus. The entire mtDNA sequence was 16508 bp in length with a standard set of 13 protein-coding genes, 22 transfer RNA genes (tRNAs), two ribosomal RNA genes (rRNAs) and a noncoding control region. The mitochondrial genome of B. sinensis had common features with those of other bony fishes with respect to gene arrangement, base composition, and tRNA structures. Phylogenetic hypotheses within Eleotridae fish have been controversial at the genus level. We used the mitochondrial cytochrome b (cytb) gene sequence to examine phylogenetic relationships of Eleotridae by using a partitioned Bayesian method. When the specific models and parameter estimates were presumed for partitioning the total data, the harmonic mean –lnL was improved. The phylogenetic analysis supported the monophyly of Hypseleotris and Gobiomorphs. In addition, Bostrychus was most closely related to Ophiocara, and Philypnodon was sister to Microphlypnus, based on the current datasets. Further extensive taxonomic sampling and more molecular information are needed to confirm the phylogenetic relationships in Eleotridae.
Directory of Open Access Journals (Sweden)
Alan F. Sasso
2012-01-01
A lipid-based physiologically based toxicokinetic (PBTK) model has been developed for a mixture of six polychlorinated biphenyls (PCBs) in rats. The aim of this study was to apply population Bayesian analysis to a lipid PBTK model, while incorporating an internal exposure-response model linking enzyme induction and metabolic rate. Lipid-based physiologically based toxicokinetic models are a subset of PBTK models that can simulate concentrations of highly lipophilic compounds in tissue lipids, without the need for partition coefficients. A hierarchical treatment of population metabolic parameters and a CYP450 induction model were incorporated into the lipid-based PBTK framework, and Markov-Chain Monte Carlo was applied to in vivo data. A mass balance of CYP1A and CYP2B in the liver was necessary to model PCB metabolism at high doses. The linked PBTK/induction model remained on a lipid basis and was capable of modeling PCB concentrations in multiple tissues for all dose levels and dose profiles.
Licquia, Timothy C
2014-01-01
We present improved estimates of several global properties of the Milky Way, including its current star formation rate (SFR), the stellar mass contained in its disk and bulge+bar components, as well as its total stellar mass. We do so by combining previous measurements from the literature using a hierarchical Bayesian (HB) statistical method that allows us to account for the possibility that any value may be incorrect or have underestimated errors. We show that this method is robust to a wide variety of assumptions about the nature of problems in individual measurements or error estimates. Ultimately, our analysis yields a SFR for the Galaxy of $\dot{\rm M}_\star=1.65\pm0.19$ ${\rm M}_\odot$ yr$^{-1}$. By combining HB methods with Monte Carlo simulations that incorporate the latest estimates of the Galactocentric radius of the Sun, $R_0$, the exponential scale-length of the disk, $L_d$, and the local surface density of stellar mass, $\Sigma_\star(R_0)$, we show that the mass of the Galactic bulge+bar is ${\rm...
Lyons, James E.; Kendall, William; Royle, J. Andrew; Converse, Sarah J.; Andres, Brad A.; Buchanan, Joseph B.
2016-01-01
We present a novel formulation of a mark–recapture–resight model that allows estimation of population size, stopover duration, and arrival and departure schedules at migration areas. Estimation is based on encounter histories of uniquely marked individuals and relative counts of marked and unmarked animals. We use a Bayesian analysis of a state–space formulation of the Jolly–Seber mark–recapture model, integrated with a binomial model for counts of unmarked animals, to derive estimates of population size and arrival and departure probabilities. We also provide a novel estimator for stopover duration that is derived from the latent state variable representing the interim between arrival and departure in the state–space model. We conduct a simulation study of field sampling protocols to understand the impact of superpopulation size, proportion marked, and number of animals sampled on bias and precision of estimates. Simulation results indicate that relative bias of estimates of the proportion of the population with marks was low for all sampling scenarios and never exceeded 2%. Our approach does not require enumeration of all unmarked animals detected or direct knowledge of the number of marked animals in the population at the time of the study. This provides flexibility and potential application in a variety of sampling situations (e.g., migratory birds, breeding seabirds, sea turtles, fish, pinnipeds, etc.). Application of the methods is demonstrated with data from a study of migratory sandpipers.
A default Bayesian hypothesis test for ANOVA designs
Wetzels, R.; Grasman, R.P.P.P.; Wagenmakers, E.J.
2012-01-01
This article presents a Bayesian hypothesis test for analysis of variance (ANOVA) designs. The test is an application of standard Bayesian methods for variable selection in regression models. We illustrate the effect of various g-priors on the ANOVA hypothesis test. The Bayesian test for ANOVA desig
Congdon, Peter
2014-01-01
This book provides an accessible approach to Bayesian computing and data analysis, with an emphasis on the interpretation of real data sets. Following in the tradition of the successful first edition, this book aims to make a wide range of statistical modeling applications accessible using tested code that can be readily adapted to the reader's own applications. The second edition has been thoroughly reworked and updated to take account of advances in the field. A new set of worked examples is included. The novel aspect of the first edition was the coverage of statistical modeling using WinBU
Directory of Open Access Journals (Sweden)
Simone Perna
2016-11-01
Hazelnuts are rich in monounsaturated fatty acids and antioxidant bioactive substances: their consumption has been associated with a decreased risk of cardiovascular disease events. A systematic review and a meta-analysis was performed to combine the results from several trials and to estimate the pooled (overall) effect of hazelnuts on blood lipids and body weight outcomes. Specifically, a Bayesian random effect meta-analysis of mean differences of Δ-changes from baseline across treatment (MDΔ) (i.e., hazelnut-enriched diet vs. control diet) has been conducted. Nine studies representing 425 participants were included in the analysis. The intervention diet lasted 28–84 days with a dosage of hazelnuts ranging from 29 to 69 g/day. Out of nine studies, three randomized studies have been meta-analyzed showing a significant reduction in low-density lipoprotein (LDL) cholesterol (pooled MDΔ = −0.150 mmol/L; 95% highest posterior density interval (95%HPD) = −0.308; −0.003) in favor of a hazelnut-enriched diet. Total cholesterol showed a marked trend toward a decrease (pooled MDΔ = −0.127 mmol/L; 95%HPD = −0.284; 0.014) and high-density lipoprotein (HDL) cholesterol remained substantially stable (pooled MDΔ = 0.002 mmol/L; 95%HPD = −0.140; 0.147). No effects on triglycerides (pooled MDΔ = 0.045 mmol/L; 95%HPD = −0.195; 0.269) and body mass index (BMI) (pooled MDΔ = 0.062 kg/m2; 95%HPD = −0.293; 0.469) were found. Hazelnut-enriched diet is associated with a decrease of LDL and total cholesterol, while HDL cholesterol, triglycerides and BMI remain substantially unchanged.
Perna, Simone; Giacosa, Attilio; Bonitta, Gianluca; Bologna, Chiara; Isu, Antonio; Guido, Davide; Rondanelli, Mariangela
2016-01-01
Hazelnuts are rich in monounsaturated fatty acids and antioxidant bioactive substances: their consumption has been associated with a decreased risk of cardiovascular disease events. A systematic review and a meta-analysis was performed to combine the results from several trials and to estimate the pooled (overall) effect of hazelnuts on blood lipids and body weight outcomes. Specifically, a Bayesian random effect meta-analysis of mean differences of Δ-changes from baseline across treatment (MDΔ) (i.e., hazelnut-enriched diet vs. control diet) has been conducted. Nine studies representing 425 participants were included in the analysis. The intervention diet lasted 28–84 days with a dosage of hazelnuts ranging from 29 to 69 g/day. Out of nine studies, three randomized studies have been meta-analyzed showing a significant reduction in low-density lipoprotein (LDL) cholesterol (pooled MDΔ = −0.150 mmol/L; 95% highest posterior density interval (95%HPD) = −0.308; −0.003) in favor of a hazelnut-enriched diet. Total cholesterol showed a marked trend toward a decrease (pooled MDΔ = −0.127 mmol/L; 95%HPD = −0.284; 0.014) and high-density lipoprotein (HDL) cholesterol remained substantially stable (pooled MDΔ = 0.002 mmol/L; 95%HPD = −0.140; 0.147). No effects on triglycerides (pooled MDΔ = 0.045 mmol/L; 95%HPD = −0.195; 0.269) and body mass index (BMI) (pooled MDΔ = 0.062 kg/m2; 95%HPD = −0.293; 0.469) were found. Hazelnut-enriched diet is associated with a decrease of LDL and total cholesterol, while HDL cholesterol, triglycerides and BMI remain substantially unchanged. PMID:27897978
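The abstract above reports 95% highest posterior density (HPD) intervals. As a rough illustration only, the sketch below shows one common way to approximate an HPD interval from posterior draws: take the shortest window that contains the requested share of sorted samples. The function name `hpd_interval` and the sample values are hypothetical, the posterior is assumed to be unimodal, and this is not the authors' code.

```python
import numpy as np


def hpd_interval(samples, mass=0.95):
    """Shortest interval containing `mass` of the samples.

    Assumes a unimodal posterior; `samples` are posterior draws
    (for example, MCMC output for a pooled mean difference).
    """
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    k = int(np.ceil(mass * n))  # number of draws the interval must cover
    if n == 0 or k < 1 or k > n:
        raise ValueError("need at least one sample and 0 < mass <= 1")
    # Width of every window of k consecutive sorted draws.
    widths = x[k - 1:] - x[: n - k + 1]
    i = int(np.argmin(widths))  # left edge of the narrowest window
    return float(x[i]), float(x[i + k - 1])


# Hypothetical draws: the narrowest half of the mass sits at the low end.
low, high = hpd_interval([0, 1, 2, 3, 10, 11], mass=0.5)
print(low, high)
```

Unlike an equal-tailed interval, the window found this way shifts toward the denser side of a skewed posterior.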
Directory of Open Access Journals (Sweden)
Robert W Burn
Elephant poaching and the ivory trade remain high on the agenda at meetings of the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES). Well-informed debates require robust estimates of trends, the spatial distribution of poaching, and drivers of poaching. We present an analysis of trends and drivers of an indicator of elephant poaching of all elephant species. The site-based monitoring system known as Monitoring the Illegal Killing of Elephants (MIKE), set up by the 10th Conference of the Parties of CITES in 1997, produces carcass encounter data reported mainly by anti-poaching patrols. Data analyzed were site by year totals of 6,337 carcasses from 66 sites in Africa and Asia from 2002-2009. Analysis of these observational data is a serious challenge to traditional statistical methods because of the opportunistic and non-random nature of patrols, and the heterogeneity across sites. Adopting a Bayesian hierarchical modeling approach, we used the proportion of carcasses that were illegally killed (PIKE) as a poaching index, to estimate the trend and the effects of site- and country-level factors associated with poaching. Important drivers of illegal killing that emerged at country level were poor governance and low levels of human development, and at site level, forest cover and area of the site in regions where human population density is low. After a drop from 2002, PIKE remained fairly constant from 2003 until 2006, after which it increased until 2008. The results for 2009 indicate a decline. Sites with PIKE ranging from the lowest to the highest were identified. The results of the analysis provide a sound information base for scientific evidence-based decision making in the CITES process.
Directory of Open Access Journals (Sweden)
Brian W Kunkle
In this study we have identified key genes that are critical in development of astrocytic tumors. Meta-analysis of microarray studies which compared normal tissue to astrocytoma revealed a set of 646 differentially expressed genes in the majority of astrocytoma. Reverse engineering of these 646 genes using Bayesian network analysis produced a gene network for each grade of astrocytoma (Grade I-IV), and 'key genes' within each grade were identified. Genes found to be most influential to development of the highest grade of astrocytoma, Glioblastoma multiforme were: COL4A1, EGFR, BTF3, MPP2, RAB31, CDK4, CD99, ANXA2, TOP2A, and SERBP1. All of these genes were up-regulated, except MPP2 (down regulated). These 10 genes were able to predict tumor status with 96-100% confidence when using logistic regression, cross validation, and the support vector machine analysis. Markov genes interact with NFkβ, ERK, MAPK, VEGF, growth hormone and collagen to produce a network whose top biological functions are cancer, neurological disease, and cellular movement. Three of the 10 genes - EGFR, COL4A1, and CDK4, in particular, seemed to be potential 'hubs of activity'. Modified expression of these 10 Markov Blanket genes increases lifetime risk of developing glioblastoma compared to the normal population. The glioblastoma risk estimates were dramatically increased with joint effects of 4 or more than 4 Markov Blanket genes. Joint interaction effects of 4, 5, 6, 7, 8, 9 or 10 Markov Blanket genes produced 9, 13, 20.9, 26.7, 52.8, 53.2, 78.1 or 85.9%, respectively, increase in lifetime risk of developing glioblastoma compared to normal population. In summary, it appears that modified expression of several 'key genes' may be required for the development of glioblastoma. Further studies are needed to validate these 'key genes' as useful tools for early detection and novel therapeutic options for these tumors.
Burn, Robert W; Underwood, Fiona M; Blanc, Julian
2011-01-01
Elephant poaching and the ivory trade remain high on the agenda at meetings of the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES). Well-informed debates require robust estimates of trends, the spatial distribution of poaching, and drivers of poaching. We present an analysis of trends and drivers of an indicator of elephant poaching of all elephant species. The site-based monitoring system known as Monitoring the Illegal Killing of Elephants (MIKE), set up by the 10th Conference of the Parties of CITES in 1997, produces carcass encounter data reported mainly by anti-poaching patrols. Data analyzed were site by year totals of 6,337 carcasses from 66 sites in Africa and Asia from 2002-2009. Analysis of these observational data is a serious challenge to traditional statistical methods because of the opportunistic and non-random nature of patrols, and the heterogeneity across sites. Adopting a Bayesian hierarchical modeling approach, we used the proportion of carcasses that were illegally killed (PIKE) as a poaching index, to estimate the trend and the effects of site- and country-level factors associated with poaching. Important drivers of illegal killing that emerged at country level were poor governance and low levels of human development, and at site level, forest cover and area of the site in regions where human population density is low. After a drop from 2002, PIKE remained fairly constant from 2003 until 2006, after which it increased until 2008. The results for 2009 indicate a decline. Sites with PIKE ranging from the lowest to the highest were identified. The results of the analysis provide a sound information base for scientific evidence-based decision making in the CITES process.
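PIKE, as defined in the abstract above, is the proportion of encountered carcasses that were illegally killed. The sketch below is a deliberately simplified, single-site illustration using a conjugate Beta-Binomial update. It is an assumption made for exposition and not the hierarchical model used by the authors; the function name `pike_posterior`, the uniform Beta(1, 1) prior, and the counts are all hypothetical.

```python
def pike_posterior(illegal, total, prior_a=1.0, prior_b=1.0):
    """Beta posterior for the proportion of illegally killed carcasses.

    illegal: carcasses judged illegally killed at one site in one year
    total:   all carcasses encountered at that site in that year
    Returns (posterior_a, posterior_b, posterior_mean).
    """
    if not 0 <= illegal <= total:
        raise ValueError("need 0 <= illegal <= total")
    post_a = prior_a + illegal            # prior pseudo-count + illegal kills
    post_b = prior_b + (total - illegal)  # prior pseudo-count + other deaths
    return post_a, post_b, post_a / (post_a + post_b)


# Hypothetical site-year: 3 of 10 carcasses were illegally killed.
a, b, mean = pike_posterior(3, 10)
print(a, b, round(mean, 3))
```

With small carcass counts the posterior mean is pulled toward the prior, which is one reason a pooled (hierarchical) treatment across sites can be attractive.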
Zhu, Lucheng; Liu, Jihong; Ma, Shenglin
2016-10-01
Fluoropyrimidine-based regimens are the most common treatments in advanced gastric cancer (AGC). We used a Bayesian network meta-analysis to identify the optimal fluoropyrimidine-based chemotherapy by comparing their relative efficacy and safety. We systematically searched databases and extracted data from randomized controlled trials (RCTs) that compared fluoropyrimidine-based regimens as first-line treatment in AGC. The main outcomes were overall survival (OS), progression-free survival (PFS), overall response rate (ORR), and grade 3 or 4 adverse events (AEs). A total of 12 RCTs with 4026 patients were included in our network meta-analysis. Pooled analysis showed S-1 and capecitabine had a significant OS benefit over 5-Fu, with hazard ratios of 0.90 (95%CI = 0.81-0.99) and 0.88 (95%CI = 0.80-0.96), respectively. The results also showed a trend toward prolonged PFS with S-1 and capecitabine compared with 5-Fu, with hazard ratios of 0.84 (95%CI = 0.66-1.02) and 0.84 (95%CI = 0.65-1.03), respectively. Additionally, all three fluoropyrimidine-based regimens were similar in terms of ORR and grade 3 or 4 AEs. Compared with regimens based on 5-Fu, regimens based on S-1 or capecitabine demonstrated a significant OS improvement without compromise of AEs as first-line treatment in AGC in the Asian population. S-1 and capecitabine can be used interchangeably according to their different emphasis on AEs.
Kunkle, Brian W; Yoo, Changwon; Roy, Deodutta
2013-01-01
In this study we have identified key genes that are critical in development of astrocytic tumors. Meta-analysis of microarray studies which compared normal tissue to astrocytoma revealed a set of 646 differentially expressed genes in the majority of astrocytoma. Reverse engineering of these 646 genes using Bayesian network analysis produced a gene network for each grade of astrocytoma (Grade I-IV), and 'key genes' within each grade were identified. Genes found to be most influential to development of the highest grade of astrocytoma, Glioblastoma multiforme were: COL4A1, EGFR, BTF3, MPP2, RAB31, CDK4, CD99, ANXA2, TOP2A, and SERBP1. All of these genes were up-regulated, except MPP2 (down regulated). These 10 genes were able to predict tumor status with 96-100% confidence when using logistic regression, cross validation, and the support vector machine analysis. Markov genes interact with NFkβ, ERK, MAPK, VEGF, growth hormone and collagen to produce a network whose top biological functions are cancer, neurological disease, and cellular movement. Three of the 10 genes - EGFR, COL4A1, and CDK4, in particular, seemed to be potential 'hubs of activity'. Modified expression of these 10 Markov Blanket genes increases lifetime risk of developing glioblastoma compared to the normal population. The glioblastoma risk estimates were dramatically increased with joint effects of 4 or more than 4 Markov Blanket genes. Joint interaction effects of 4, 5, 6, 7, 8, 9 or 10 Markov Blanket genes produced 9, 13, 20.9, 26.7, 52.8, 53.2, 78.1 or 85.9%, respectively, increase in lifetime risk of developing glioblastoma compared to normal population. In summary, it appears that modified expression of several 'key genes' may be required for the development of glioblastoma. Further studies are needed to validate these 'key genes' as useful tools for early detection and novel therapeutic options for these tumors.
Directory of Open Access Journals (Sweden)
Pin Carmen
2007-11-01
Background Microarrays are widely used for the study of gene expression; however deciding on whether observed differences in expression are significant remains a challenge. Results A computing tool (ArrayLeaRNA) has been developed for gene expression analysis. It implements a Bayesian approach which is based on the Gumbel distribution and uses printed genomic DNA control features for normalization and for estimation of the parameters of the Bayesian model and prior knowledge from predicted operon structure. The method is compared with two other approaches: the classical LOWESS normalization followed by a two-fold cut-off criterion and the OpWise method (Price, et al. 2006. BMC Bioinformatics. 7, 19), a published Bayesian approach also using predicted operon structure. The three methods were compared on experimental datasets with prior knowledge of gene expression. With ArrayLeaRNA, data normalization is carried out according to the genomic features which reflect the results of equally transcribed genes; also the statistical significance of the difference in expression is based on the variability of the equally transcribed genes. The operon information helps the classification of genes with low confidence measurements. ArrayLeaRNA is implemented in Visual Basic and freely available as an Excel add-in at http://www.ifr.ac.uk/safety/ArrayLeaRNA/ Conclusion We have introduced a novel Bayesian model and demonstrated that it is a robust method for analysing microarray expression profiles. ArrayLeaRNA showed a considerable improvement in data normalization, in the estimation of the experimental variability intrinsic to each hybridization and in the establishment of a clear boundary between non-changing and differentially expressed genes. The method is applicable to data derived from hybridizations of labelled cDNA samples as well as from hybridizations of labelled cDNA with genomic DNA and can be used for the analysis of datasets where
Veilleux, Andrea G.; Stedinger, Jery R.; Eash, David A.
2012-01-01
This paper summarizes methodological advances in regional log-space skewness analyses that support flood-frequency analysis with the log Pearson Type III (LP3) distribution. A Bayesian Weighted Least Squares/Generalized Least Squares (B-WLS/B-GLS) methodology that relates observed skewness coefficient estimators to basin characteristics in conjunction with diagnostic statistics represents an extension of the previously developed B-GLS methodology. B-WLS/B-GLS has been shown to be effective in two California studies. B-WLS/B-GLS uses B-WLS to generate stable estimators of model parameters and B-GLS to estimate the precision of those B-WLS regression parameters, as well as the precision of the model. The study described here employs this methodology to develop a regional skewness model for the State of Iowa. To provide cost-effective peak-flow data for smaller drainage basins in Iowa, the U.S. Geological Survey operates a large network of crest stage gages (CSGs) that only record flow values above an identified recording threshold (thus producing a censored data record). CSGs are different from continuous-record gages, which record almost all flow values and have been used in previous B-GLS and B-WLS/B-GLS regional skewness studies. The complexity of analyzing a large CSG network is addressed by using the B-WLS/B-GLS framework along with the Expected Moments Algorithm (EMA). Because EMA allows for the censoring of low outliers, as well as the use of estimated interval discharges for missing, censored, and historic data, it complicates the calculations of effective record length (and effective concurrent record length) used to describe the precision of sample estimators because the peak discharges are no longer solely represented by single values. Thus new record length calculations were developed. The regional skewness analysis for the State of Iowa illustrates the value of the new B-WLS/B-GLS methodology with these new extensions.
Spiegel, David S
2011-01-01
Life arose on Earth sometime in the first few hundred million years after the young planet had cooled to the point that it could support water-based organisms on its surface. The early emergence of life on Earth has been taken as evidence that the probability of abiogenesis is high, if starting from young-Earth-like conditions. We revisit this argument quantitatively in a Bayesian statistical framework. By constructing a simple model of the probability of abiogenesis, we calculate a Bayesian estimate of its posterior probability, given the data that life emerged fairly early in Earth's history and that, billions of years later, sentient creatures noted this fact and considered its implications. We find that, given only this very limited empirical information, the choice of Bayesian prior for the abiogenesis probability parameter has a dominant influence on the computed posterior probability. Thus, although life began on this planet fairly soon after the Earth became habitable, this fact is consistent with an ...
Bayesian hierarchical models for network meta-analysis incorporating nonignorable missingness.
Zhang, Jing; Chu, Haitao; Hong, Hwanhee; Virnig, Beth A; Carlin, Bradley P
2015-07-28
Network meta-analysis expands the scope of a conventional pairwise meta-analysis to simultaneously compare multiple treatments, synthesizing both direct and indirect information and thus strengthening inference. Since most trials only compare two treatments, a typical data set in a network meta-analysis managed as a trial-by-treatment matrix is extremely sparse, like an incomplete block structure with significant missing data. Zhang et al. proposed an arm-based method accounting for correlations among different treatments within the same trial and assuming that absent arms are missing at random. However, in randomized controlled trials, nonignorable missingness or missingness not at random may occur due to deliberate choices of treatments at the design stage. In addition, those undertaking a network meta-analysis may selectively choose treatments to include in the analysis, which may also lead to missingness not at random. In this paper, we extend our previous work to incorporate missingness not at random using selection models. The proposed method is then applied to two network meta-analyses and evaluated through extensive simulation studies. We also provide comprehensive comparisons of a commonly used contrast-based method and the arm-based method via simulations in a technical appendix under missing completely at random and missing at random.
Bayesian Networks and Influence Diagrams
DEFF Research Database (Denmark)
Kjærulff, Uffe Bro; Madsen, Anders Læsø
Bayesian Networks and Influence Diagrams: A Guide to Construction and Analysis, Second Edition, provides a comprehensive guide for practitioners who wish to understand, construct, and analyze intelligent systems for decision support based on probabilistic networks. This new edition contains six new...
Change-Point in the Mean of Heavy-Tailed Dependent Observations
Institute of Scientific and Technical Information of China (English)
韩四儿; 田铮; 王红军
2008-01-01
This paper studies the problem of a change point in the mean of heavy-tailed dependent observations. We prove the consistency of the CUSUM estimator of the change point and derive its rate of convergence. The Hájek-Rényi inequality is extended to the infinite-variance case.
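For readers unfamiliar with the CUSUM estimator mentioned above, the following is a minimal sketch of the textbook unweighted form for a single change in the mean: the estimate is the index k that maximizes |S_k − (k/n) S_n|, where S_k is the k-th partial sum. The abstract does not give the authors' exact statistic or any weighting, so the function name `cusum_change_point` and the toy data are assumptions for illustration only.

```python
import numpy as np


def cusum_change_point(x):
    """Estimate a single mean change point with the unweighted CUSUM statistic.

    Returns k in 1..n-1, read as "the first k observations belong to the
    first regime". Illustrative only; no weighting or heavy-tail correction.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    if n < 2:
        raise ValueError("need at least two observations")
    k = np.arange(1, n)                       # candidate split points
    partial = np.cumsum(x)[:-1]               # S_1, ..., S_{n-1}
    stat = np.abs(partial - k / n * x.sum())  # |S_k - (k/n) S_n|
    return int(k[np.argmax(stat)])


# Toy series: the mean jumps from 0 to 1 after the fifth observation.
print(cusum_change_point([0, 0, 0, 0, 0, 1, 1, 1, 1, 1]))
```

On noisy or heavy-tailed data the maximizer can land away from the true change, which is why consistency and convergence-rate results of the kind described in the abstract matter.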
Can, Seda; van de Schoot, Rens; Hox, Joop
2015-01-01
Because variables may be correlated in the social and behavioral sciences, multicollinearity might be problematic. This study investigates the effect of collinearity manipulated in within and between levels of a two-level confirmatory factor analysis by Monte Carlo simulation. Furthermore, the influence of the size of the intraclass correlation…
Can, Seda; van de Schoot, Rens; Hox, Joop
2014-01-01
Because variables may be correlated in the social and behavioral sciences, multicollinearity might be problematic. This study investigates the effect of collinearity manipulated in within and between levels of a two-level confirmatory factor analysis by Monte Carlo simulation. Furthermore, the influ
Analysis of simulated data for the KArlsruhe TRItium Neutrino experiment using Bayesian inference
DEFF Research Database (Denmark)
Riis, Anna Sejersen; Hannestad, Steen; Weinheimer, C.
2011-01-01
neutrinos. As an alternative to the frequentist minimization methods used in the analysis of the earlier experiments in Mainz and Troitsk we have been investigating Markov chain Monte Carlo (MCMC) methods which are very well suited for probing multiparameter spaces. We found that implementing the KATRIN χ2...
The US EPA’s ToxCast™ program seeks to combine advances in high-throughput screening technology with methodologies from statistics and computer science to develop high-throughput decision support tools for assessing chemical hazard and risk. To develop new methods of analysis of...
Integrative analysis of histone ChIP-seq and transcription data using Bayesian mixture models
DEFF Research Database (Denmark)
Klein, Hans-Ulrich; Schäfer, Martin; Porse, Bo T;
2014-01-01
Histone modifications are a key epigenetic mechanism to activate or repress the transcription of genes. Datasets of matched transcription data and histone modification data obtained by ChIP-seq exist, but methods for integrative analysis of both data types are still rare. Here, we present a novel...
Owusu-Edusei, Kwame; Owens, Chantelle J
2009-01-01
Background Chlamydia continues to be the most prevalent disease in the United States. Effective spatial monitoring of chlamydia incidence is important for successful implementation of control and prevention programs. The objective of this study is to apply Bayesian smoothing and exploratory spatial data analysis (ESDA) methods to monitor Texas county-level chlamydia incidence rates by examining spatiotemporal patterns. We used county-level data on chlamydia incidence (for all ages, gender and races) from the National Electronic Telecommunications System for Surveillance (NETSS) for 2004 and 2005. Results Bayesian-smoothed chlamydia incidence rates were spatially dependent both in levels and in relative changes. Erath county had significantly (p 300 cases per 100,000 residents) than its contiguous neighbors (195 or less) in both years. Gaines county experienced the highest relative increase in smoothed rates (173% – 139 to 379). The relative change in smoothed chlamydia rates in Newton county was significantly (p < 0.05) higher than its contiguous neighbors. Conclusion Bayesian smoothing and ESDA methods can assist programs in using chlamydia surveillance data to identify outliers, as well as relevant changes in chlamydia incidence in specific geographic units. Secondly, it may also indirectly help in assessing existing differences and changes in chlamydia surveillance systems over time. PMID:19245686
Directory of Open Access Journals (Sweden)
Mohammed Hussni O
2010-06-01
Background Cryptosporidium parvum is one of the most important biological contaminants in drinking water that produces life-threatening infection in people with compromised immune systems. Dairy calves are thought to be the primary source of C. parvum contamination in watersheds. Understanding the spatial and temporal variation in the risk of C. parvum infection in dairy cattle is essential for designing cost-effective watershed management strategies to protect drinking water sources. Crude and Bayesian seasonal risk estimates for Cryptosporidium in dairy calves were used to investigate the spatio-temporal dynamics of C. parvum infection on dairy farms in the New York City watershed. Results Both global (Global Moran's I) and specific (SaTScan) cluster analysis methods revealed a significant (p C. parvum infection in all herds in the summer (p = 0.002), compared to the rest of the year. Bayesian estimates did not show significant spatial autocorrelation in any season. Conclusions Although we were not able to identify seasonal clusters using a Bayesian approach, crude estimates highlighted both temporal and spatial clusters of C. parvum infection in dairy herds in a major watershed. We recommend that further studies focus on the factors that may lead to the presence of C. parvum clusters within the watershed, so that monitoring and prevention practices such as stream monitoring, riparian buffers, fencing and manure management can be prioritized and improved, to protect drinking water supplies and public health.
Directory of Open Access Journals (Sweden)
Owens Chantelle J
2009-02-01
Background Chlamydia continues to be the most prevalent disease in the United States. Effective spatial monitoring of chlamydia incidence is important for successful implementation of control and prevention programs. The objective of this study is to apply Bayesian smoothing and exploratory spatial data analysis (ESDA) methods to monitor Texas county-level chlamydia incidence rates by examining spatiotemporal patterns. We used county-level data on chlamydia incidence (for all ages, gender and races) from the National Electronic Telecommunications System for Surveillance (NETSS) for 2004 and 2005. Results Bayesian-smoothed chlamydia incidence rates were spatially dependent both in levels and in relative changes. Erath county had significantly (p 300 cases per 100,000 residents) than its contiguous neighbors (195 or less) in both years. Gaines county experienced the highest relative increase in smoothed rates (173% – 139 to 379). The relative change in smoothed chlamydia rates in Newton county was significantly (p < 0.05) higher than its contiguous neighbors. Conclusion Bayesian smoothing and ESDA methods can assist programs in using chlamydia surveillance data to identify outliers, as well as relevant changes in chlamydia incidence in specific geographic units. Secondly, it may also indirectly help in assessing existing differences and changes in chlamydia surveillance systems over time.
Directory of Open Access Journals (Sweden)
Goodacre Royston
2011-01-01
Background The rapid identification of Bacillus spores and bacterial identification are paramount because of their implications in food poisoning, pathogenesis and their use as potential biowarfare agents. Many automated analytical techniques such as Curie-point pyrolysis mass spectrometry (Py-MS) have been used to identify bacterial spores, giving rise to large amounts of analytical data. This high number of features makes interpretation of the data extremely difficult. We analysed Py-MS data from 36 different strains of aerobic endospore-forming bacteria encompassing seven different species. These bacteria were grown axenically on nutrient agar and vegetative biomass and spores were analyzed by Curie-point Py-MS. Results We develop a novel genetic algorithm-Bayesian network algorithm that accurately identifies and selects a small subset of key relevant mass spectra (biomarkers) to be further analysed. Once identified, this subset of relevant biomarkers was then used to identify Bacillus spores successfully and to identify Bacillus species via a Bayesian network model specifically built for this reduced set of features. Conclusions This final compact Bayesian network classification model is parsimonious, computationally fast to run and its graphical visualization allows easy interpretation of the probabilistic relationships among selected biomarkers. In addition, we compare the features selected by the genetic algorithm-Bayesian network approach with the features selected by partial least squares-discriminant analysis (PLS-DA). The classification accuracy results show that the set of features selected by the GA-BN is far superior to PLS-DA.
Rackham, Owen J L; Langley, Sarah R; Oates, Thomas; Vradi, Eleni; Harmston, Nathan; Srivastava, Prashant K; Behmoaras, Jacques; Dellaportas, Petros; Bottolo, Leonardo; Petretto, Enrico
2017-02-17
DNA methylation is a key epigenetic modification involved in gene regulation whose contribution to disease susceptibility remains to be fully understood. Here, we present a novel Bayesian smoothing approach (called ABBA) to detect differentially methylated regions (DMRs) from whole-genome bisulphite sequencing (WGBS). We also show how this approach can be leveraged to identify disease-associated changes in DNA methylation, suggesting mechanisms through which these alterations might affect disease. From a data modeling perspective, ABBA has the distinctive feature of automatically adapting to different correlation structures in CpG methylation levels across the genome whilst taking into account the distance between CpG sites as a covariate. Our simulation study shows that ABBA has greater power to detect DMRs than existing methods, providing an accurate identification of DMRs in the large majority of simulated cases. To empirically demonstrate the method's efficacy in generating biological hypotheses, we performed WGBS of primary macrophages derived from an experimental rat system of glomerulonephritis and used ABBA to identify >1,000 disease-associated DMRs. Investigation of these DMRs revealed differential DNA methylation localized to a 600bp region in the promoter of the Ifitm3 gene. This was confirmed by ChIP-seq and RNA-seq analyses, showing differential transcription factor binding at the Ifitm3 promoter by JunD (an established determinant of glomerulonephritis) and a consistent change in Ifitm3 expression. Our ABBA analysis allowed us to propose a new role for Ifitm3 in the pathogenesis of glomerulonephritis via a mechanism involving promoter hypermethylation that is associated with Ifitm3 repression in the rat strain susceptible to glomerulonephritis.
Directory of Open Access Journals (Sweden)
Myong Kim
To identify non-invasive clinical parameters to predict urodynamic bladder outlet obstruction (BOO) in patients with benign prostatic hyperplasia (BPH) using causal Bayesian networks (CBN). From October 2004 to August 2013, 1,381 eligible BPH patients with complete data were selected for analysis. The following clinical variables were considered: age, total prostate volume (TPV), transition zone volume (TZV), prostate specific antigen (PSA), maximum flow rate (Qmax), and post-void residual volume (PVR) on uroflowmetry, and International Prostate Symptom Score (IPSS). Among these variables, the independent predictors of BOO were selected using the CBN model. The predictive performance of the CBN model using the selected variables was verified through a logistic regression (LR) model with the same dataset. Mean age, TPV, and IPSS were 6.2 (±7.3, SD) years, 48.5 (±25.9) ml, and 17.9 (±7.9), respectively. The mean BOO index was 35.1 (±25.2) and 477 patients (34.5%) had urodynamic BOO (BOO index ≥40). By using the CBN model, we identified TPV, Qmax, and PVR as independent predictors of BOO. With these three variables, the BOO prediction accuracy was 73.5%. The LR model showed a similar accuracy (77.0%). However, the area under the receiver operating characteristic curve of the CBN model was statistically smaller than that of the LR model (0.772 vs. 0.798, p = 0.020). Our study demonstrated that TPV, Qmax, and PVR are independent predictors of urodynamic BOO.
Bayesian Statistics and Uncertainty Quantification for Safety Boundary Analysis in Complex Systems
He, Yuning; Davies, Misty Dawn
2014-01-01
The analysis of a safety-critical system often requires detailed knowledge of safe regions and their high-dimensional non-linear boundaries. We present a statistical approach to iteratively detect and characterize the boundaries, which are provided as parameterized shape candidates. Using methods from uncertainty quantification and active learning, we incrementally construct a statistical model from only a few simulation runs and obtain statistically sound estimates of the shape parameters for safety boundaries.
Nonparametric Bayesian Inference for Mean Residual Life Functions in Survival Analysis
Poynor, Valerie; Kottas, Athanasios
2014-01-01
Modeling and inference for survival analysis problems typically revolves around different functions related to the survival distribution. Here, we focus on the mean residual life function which provides the expected remaining lifetime given that a subject has survived (i.e., is event-free) up to a particular time. This function is of direct interest in reliability, medical, and actuarial fields. In addition to its practical interpretation, the mean residual life function characterizes the sur...
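The mean residual life function described above is m(t) = E[T − t | T > t], the expected remaining lifetime among subjects still event-free at time t. As a simple point of reference, the sketch below computes its plain empirical counterpart from fully observed (uncensored) lifetimes. It is not the nonparametric Bayesian method of the paper; the function name `empirical_mrl` and the data are hypothetical, and censoring is ignored by assumption.

```python
import numpy as np


def empirical_mrl(times, t):
    """Empirical mean residual life at time t from uncensored lifetimes.

    Averages (T_i - t) over subjects with T_i > t; returns NaN when no
    subject survives past t.
    """
    times = np.asarray(times, dtype=float)
    survivors = times[times > t]  # subjects still event-free at time t
    if survivors.size == 0:
        return float("nan")
    return float(np.mean(survivors - t))


# Hypothetical lifetimes: expected remaining life at t = 0 and at t = 2.
lifetimes = [1, 2, 3, 4]
print(empirical_mrl(lifetimes, 0), empirical_mrl(lifetimes, 2))
```

The estimate becomes unstable at large t, where few subjects remain at risk, which is one motivation for model-based smoothing of this function.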
DEFF Research Database (Denmark)
Burgess, Stephen; Thompson, Simon G; Thompson, Grahame
2010-01-01
Genetic markers can be used as instrumental variables, in an analogous way to randomization in a clinical trial, to estimate the causal relationship between a phenotype and an outcome variable. Our purpose is to extend the existing methods for such Mendelian randomization studies to the context...... of multiple genetic markers measured in multiple studies, based on the analysis of individual participant data. First, for a single genetic marker in one study, we show that the usual ratio of coefficients approach can be reformulated as a regression with heterogeneous error in the explanatory variable...
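The "ratio of coefficients approach" mentioned above divides the regression coefficient of the outcome on the genetic marker by the coefficient of the phenotype on the marker. The sketch below illustrates that ratio for one marker in one study using ordinary least-squares slopes. The function name `ratio_estimate` and the toy data are hypothetical, and the regression reformulation with heterogeneous error that the abstract describes is not reproduced here.

```python
import numpy as np


def ratio_estimate(g, x, y):
    """Ratio-of-coefficients estimate of the causal effect of x on y.

    g: genetic marker (for example, allele counts 0/1/2), used as instrument
    x: phenotype (exposure); y: outcome
    """
    g = np.asarray(g, dtype=float)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    beta_gx = np.polyfit(g, x, 1)[0]  # slope of phenotype on marker
    beta_gy = np.polyfit(g, y, 1)[0]  # slope of outcome on marker
    if np.isclose(beta_gx, 0.0):
        raise ValueError("marker is not associated with the phenotype")
    return float(beta_gy / beta_gx)


# Noise-free toy data in which y changes by 3 units per unit of x.
g = np.array([0, 1, 2, 0, 1, 2])
x = 2.0 * g + 1.0
y = 3.0 * x - 4.0
print(round(ratio_estimate(g, x, y), 6))
```

When the marker is only weakly associated with the phenotype, the denominator is close to zero and the ratio becomes very unstable.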
Directory of Open Access Journals (Sweden)
Joseph P. Yurko
2015-01-01
System codes for simulation of safety performance of nuclear plants may contain parameters whose values are not known very accurately. New information from tests or operating experience is incorporated into safety codes by a process known as calibration, which reduces uncertainty in the output of the code and thereby improves its support for decision-making. The work reported here implements several improvements on classic calibration techniques afforded by modern analysis techniques. The key innovation has come from development of code surrogate model (or code emulator) construction and prediction algorithms. Use of a fast emulator makes the calibration processes used here with Markov Chain Monte Carlo (MCMC) sampling feasible. This work uses Gaussian Process (GP) based emulators, which have been used previously to emulate computer codes in the nuclear field. The present work describes the formulation of an emulator that incorporates GPs into a factor analysis-type or pattern recognition-type model. This “function factorization” Gaussian Process (FFGP) model allows overcoming limitations present in standard GP emulators, thereby improving both accuracy and speed of the emulator-based calibration process. Calibration of a friction-factor example using a Method of Manufactured Solution is performed to illustrate key properties of the FFGP based process.
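The general idea of emulator-based calibration can be sketched in a few lines. The example below is only a toy illustration under stated assumptions: it uses a plain GP mean predictor with a fixed squared-exponential kernel (not the FFGP model of the paper), ignores the emulator's predictive variance, and calibrates a single parameter with random-walk Metropolis. The function `expensive_code`, the kernel length scale, the design points and the noise level are all hypothetical stand-ins.

```python
import numpy as np

def expensive_code(theta):
    # Hypothetical stand-in for a slow simulation code.
    return theta + 0.5 * theta ** 2

def rbf(a, b, length):
    # Squared-exponential kernel between two 1-D point sets.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def fit_gp_mean(x, y, length=0.3, jitter=1e-8):
    # GP posterior mean with fixed hyperparameters (an assumption;
    # in practice they would be estimated from the training runs).
    y_mean = y.mean()
    K = rbf(x, x, length) + jitter * np.eye(x.size)
    alpha = np.linalg.solve(K, y - y_mean)
    return lambda xs: y_mean + rbf(np.atleast_1d(xs), x, length) @ alpha

def metropolis(log_post, start, step, n_iter, rng):
    # Random-walk Metropolis sampler for a scalar parameter.
    chain = np.empty(n_iter)
    current, current_lp = start, log_post(start)
    for i in range(n_iter):
        proposal = current + step * rng.normal()
        lp = log_post(proposal)
        if np.log(rng.random()) < lp - current_lp:
            current, current_lp = proposal, lp
        chain[i] = current
    return chain

rng = np.random.default_rng(0)
design = np.linspace(0.0, 2.0, 12)            # training runs of the code
emulator = fit_gp_mean(design, expensive_code(design))

theta_true, sigma = 0.8, 0.05
data = expensive_code(theta_true) + rng.normal(0.0, sigma, size=5)

def log_post(theta):
    # Uniform prior on [0, 2]; Gaussian likelihood around the emulator.
    if not 0.0 <= theta <= 2.0:
        return -np.inf
    return -0.5 * np.sum((data - emulator(theta)[0]) ** 2) / sigma ** 2

chain = metropolis(log_post, start=1.0, step=0.05, n_iter=5000, rng=rng)
theta_hat = chain[1000:].mean()               # discard burn-in
```

The point of the design is that the sampler calls only the cheap emulator, so the 5000 MCMC steps cost 12 runs of the expensive code rather than 5000.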
A Bayesian network meta-analysis for binary outcome: how to do it.
Greco, Teresa; Landoni, Giovanni; Biondi-Zoccai, Giuseppe; D'Ascenzo, Fabrizio; Zangrillo, Alberto
2016-10-01
This study presents an overview of conceptual and practical issues of a network meta-analysis (NMA), particularly focusing on its application to randomised controlled trials with a binary outcome of interest. We start from general considerations on NMA to specifically appraise how to collect study data, structure the analytical network and specify the requirements for different models and parameter interpretations, with the ultimate goal of providing physicians and clinician-investigators a practical tool to understand pros and cons of NMA. Specifically, we outline the key steps, from the literature search to sensitivity analysis, necessary to perform a valid NMA of binomial data, exploiting Markov Chain Monte Carlo approaches. We also apply this analytical approach to a case study on the beneficial effects of volatile agents compared to total intravenous anaesthetics for surgery to further clarify the statistical details of the models, diagnostics and computations. Finally, datasets and models for the freeware WinBUGS package are presented for the anaesthetic agent example.
Bayesian Data Analysis with the Bivariate Hierarchical Ornstein-Uhlenbeck Process Model.
Oravecz, Zita; Tuerlinckx, Francis; Vandekerckhove, Joachim
2016-01-01
In this paper, we propose a multilevel process modeling approach to describing individual differences in within-person changes over time. To characterize changes within an individual, repeated measures over time are modeled in terms of three person-specific parameters: a baseline level, intraindividual variation around the baseline, and regulatory mechanisms adjusting toward baseline. Variation due to measurement error is separated from meaningful intraindividual variation. The proposed model allows for the simultaneous analysis of longitudinal measurements of two linked variables (bivariate longitudinal modeling) and captures their relationship via two person-specific parameters. Relationships between explanatory variables and model parameters can be studied in a one-stage analysis, meaning that model parameters and regression coefficients are estimated simultaneously. Mathematical details of the approach, including a description of the core process model (the Ornstein-Uhlenbeck model), are provided. We also describe a user-friendly, freely accessible software program that provides a straightforward graphical interface to carry out parameter estimation and inference. The proposed approach is illustrated by analyzing data collected via self-reports on affective states.
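As a rough illustration of the three person-specific parameters, the sketch below simulates a univariate Ornstein-Uhlenbeck process using its exact discrete-time transition. The function name `simulate_ou` and the parameter names `mu` (baseline), `gamma` (stationary intraindividual variance) and `beta` (regulation strength) are illustrative choices, not the notation or the software of the paper; the bivariate hierarchical structure and the measurement-error component are omitted.

```python
import numpy as np

def simulate_ou(mu, gamma, beta, times, rng):
    """Simulate an OU path at the given (possibly irregular) time points.

    mu    : baseline level the process reverts to
    gamma : stationary variance around the baseline
    beta  : strength of regulation toward the baseline (> 0)
    """
    times = np.asarray(times, dtype=float)
    x = np.empty(times.size)
    # Start from the stationary distribution N(mu, gamma).
    x[0] = rng.normal(mu, np.sqrt(gamma))
    for i in range(1, times.size):
        dt = times[i] - times[i - 1]
        decay = np.exp(-beta * dt)
        # Exact conditional mean and variance of the OU transition.
        mean = mu + decay * (x[i - 1] - mu)
        var = gamma * (1.0 - decay ** 2)
        x[i] = rng.normal(mean, np.sqrt(var))
    return x

rng = np.random.default_rng(0)
times = np.cumsum(rng.exponential(0.5, size=5000))   # irregular sampling
path = simulate_ou(mu=5.0, gamma=2.0, beta=1.5, times=times, rng=rng)
```

Because the transition is exact, unequally spaced measurement occasions (common in self-report studies) need no special handling: a larger gap simply means more decay toward the baseline.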
Institute of Scientific and Technical Information of China (English)
MING Zhimao; TAO Junyong; ZHANG Yunan; YI Xiaoshan; CHEN Xun
2009-01-01
New armament systems are subjected to multi-stage reliability-growth testing, which raises statistical problems involving diverse populations, in order to improve reliability before mass production starts. Because testing during the development of complex systems is expensive and sample sizes are small, specific methods are studied for processing Bayesian reliability-growth information from diverse populations. First, according to the characteristics of reliability growth during product development, the Bayesian method is used to integrate the multi-stage test information and the order relations of the distribution parameters. Then a Gamma-Beta prior distribution is proposed, based on the non-homogeneous Poisson process (NHPP) corresponding to the reliability growth process. The posterior distribution of the reliability parameters is obtained for the different stages of the product, and the reliability parameters are evaluated based on the posterior distribution. Finally, the Bayesian approach proposed in this paper for multi-stage reliability growth tests is applied to a small-sample test process in the astronautics field. The results of a numerical example show that the presented model can make use of the diverse information synthetically and pave the way for applying the Bayesian model to multi-stage reliability growth test evaluation with small sample sizes. The method is useful for evaluating multi-stage system reliability and for making rational reliability growth plans.
Directory of Open Access Journals (Sweden)
Sérgio L. Pereira
2008-01-01
Most Neotropical birds, including Pteroglossus aracaris, do not have an adequate fossil record to be used as time constraints in molecular dating. Hence, the evolutionary timeframe of the avian biota can only be inferred using alternative time constraints. We applied a Bayesian relaxed clock approach to propose an alternative interpretation for the historical biogeography of Pteroglossus based on mitochondrial DNA sequences, using different combinations of outgroups and time constraints obtained from outgroup fossils, vicariant barriers and molecular time estimates. The results indicated that outgroup choice has little effect on the Bayesian posterior distribution of divergence times within Pteroglossus, that geological and molecular time constraints seem equally suitable to estimate the Bayesian posterior distribution of divergence times for Pteroglossus, and that the fossil record alone overestimates divergence times within the fossil-lacking ingroup. The Bayesian estimates of divergence times suggest that the radiation of Pteroglossus occurred from the Late Miocene to the Pliocene (three times older than estimated by the “standard” mitochondrial rate of 2% sequence divergence per million years), likely triggered by Andean uplift, multiple episodes of marine transgressions in South America, and formation of present-day river basins. The time estimates are in agreement with other Neotropical taxa with similar geographic distributions.
DEFF Research Database (Denmark)
2010-01-01
Genetic markers can be used as instrumental variables, in an analogous way to randomization in a clinical trial, to estimate the causal relationship between a phenotype and an outcome variable. Our purpose is to extend the existing methods for such Mendelian randomization studies to the context...... an overall estimate of the causal relationship between the phenotype and the outcome, and an assessment of its heterogeneity across studies. As an example, we estimate the causal relationship of blood concentrations of C-reactive protein on fibrinogen levels using data from 11 studies. These methods provide...... a flexible framework for efficient estimation of causal relationships derived from multiple studies. Issues discussed include weak instrument bias, analysis of binary outcome data such as disease risk, missing genetic data, and the use of haplotypes....
Gaffney, Jim A; Sonnad, Vijay; Libby, Stephen B
2013-01-01
The complex nature of inertial confinement fusion (ICF) experiments results in a very large number of experimental parameters that are only known with limited reliability. These parameters, combined with the myriad physical models that govern target evolution, make the reliable extraction of physics from experimental campaigns very difficult. We develop an inference method that allows all important experimental parameters, and previous knowledge, to be taken into account when investigating underlying microphysics models. The result is framed as a modified $\chi^{2}$ analysis which is easy to implement in existing analyses, and quite portable. We present a first application to a recent convergent ablator experiment performed at the NIF, and investigate the effect of variations in all physical dimensions of the target (very difficult to do using other methods). We show that for well characterised targets in which dimensions vary at the 0.5% level there is little effect, but 3% variations change the results of i...
Parameter-expanded data augmentation for Bayesian analysis of capture-recapture models
Royle, J. Andrew; Dorazio, Robert M.
2012-01-01
Data augmentation (DA) is a flexible tool for analyzing closed and open population models of capture-recapture data, especially models which include sources of heterogeneity among individuals. The essential concept underlying DA, as we use the term, is based on adding "observations" to create a dataset composed of a known number of individuals. This new (augmented) dataset, which includes the unknown number of individuals N in the population, is then analyzed using a new model that includes a reformulation of the parameter N in the conventional model of the observed (unaugmented) data. In the context of capture-recapture models, we add a set of "all zero" encounter histories which are not, in practice, observable. The model of the augmented dataset is a zero-inflated version of either a binomial or a multinomial base model. Thus, our use of DA provides a general approach for analyzing both closed and open population models of all types. In doing so, this approach provides a unified framework for the analysis of a huge range of models that are treated as unrelated "black boxes" and named procedures in the classical literature. As a practical matter, analysis of the augmented dataset by MCMC is greatly simplified compared to other methods that require specialized algorithms. For example, complex capture-recapture models of an augmented dataset can be fitted with popular MCMC software packages (WinBUGS or JAGS) by providing a concise statement of the model's assumptions that usually involves only a few lines of pseudocode. In this paper, we review the basic technical concepts of data augmentation, and we provide examples of analyses of closed-population models ($M_0$, $M_h$, distance sampling, and spatial capture-recapture models) and open-population models (Jolly-Seber) with individual effects.
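The zero-inflated binomial formulation described above can be made concrete with a small Gibbs sampler for the simplest closed-population model (constant detection probability). This is a minimal sketch rather than the authors' WinBUGS/JAGS code: the function name `gibbs_m0`, the uniform Beta(1, 1) priors on the inclusion probability `psi` and detection probability `p`, and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def gibbs_m0(y_obs, K, M, n_iter, rng):
    """Gibbs sampler for model M0 under data augmentation.

    y_obs : detection counts (>= 1) of the n observed individuals
    K     : number of sampling occasions
    M     : size of the augmented dataset (M > n)
    Returns posterior draws of the population size N.
    """
    n = len(y_obs)
    # Augment with M - n all-zero encounter histories.
    y = np.concatenate([y_obs, np.zeros(M - n, dtype=int)])
    z = np.ones(M, dtype=int)          # z[i] = 1: individual i is real
    draws = np.empty(n_iter, dtype=int)
    for t in range(n_iter):
        # Conjugate updates under uniform Beta(1, 1) priors.
        psi = rng.beta(1 + z.sum(), 1 + M - z.sum())
        p = rng.beta(1 + y.sum(), 1 + K * z.sum() - y.sum())
        # Observed individuals keep z = 1; update z for the zeros only.
        q = psi * (1 - p) ** K
        prob_real = q / (q + 1 - psi)
        z[n:] = rng.random(M - n) < prob_real
        draws[t] = z.sum()             # N is the number of "real" rows
    return draws

rng = np.random.default_rng(1)
N_true, K, p_true = 100, 5, 0.3
counts = rng.binomial(K, p_true, size=N_true)
y_obs = counts[counts > 0]             # never-detected animals are unseen
draws = gibbs_m0(y_obs, K, M=400, n_iter=3000, rng=rng)
N_hat = draws[500:].mean()             # posterior mean after burn-in
```

The augmented size `M` only needs to be comfortably larger than any plausible N; if the posterior draws pile up near `M`, it should be increased.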
Approximate Bayesian computation.
Directory of Open Access Journals (Sweden)
Mikael Sunnåker
Approximate Bayesian computation (ABC) constitutes a class of computational methods rooted in Bayesian statistics. In all model-based statistical inference, the likelihood function is of central importance, since it expresses the probability of the observed data under a particular statistical model, and thus quantifies the support data lend to particular values of parameters and to choices among different models. For simple models, an analytical formula for the likelihood function can typically be derived. However, for more complex models, an analytical formula might be elusive or the likelihood function might be computationally very costly to evaluate. ABC methods bypass the evaluation of the likelihood function. In this way, ABC methods widen the realm of models for which statistical inference can be considered. ABC methods are mathematically well-founded, but they inevitably make assumptions and approximations whose impact needs to be carefully assessed. Furthermore, the wider application domain of ABC exacerbates the challenges of parameter estimation and model selection. ABC has rapidly gained popularity over the last years and in particular for the analysis of complex problems arising in biological sciences (e.g., in population genetics, ecology, epidemiology, and systems biology).
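The simplest member of this class, the ABC rejection sampler, shows how the likelihood is bypassed: parameters are drawn from the prior, data are simulated, and a draw is kept only if a summary of the simulated data is close to the observed summary. The sketch below is a toy instance under stated assumptions (a normal model with unknown mean, the sample mean as summary statistic, a uniform prior and a hand-picked tolerance `eps`); the function name `abc_rejection` is illustrative.

```python
import numpy as np

def abc_rejection(observed, simulate, prior_sample, summary, eps, n_draws, rng):
    """Keep prior draws whose simulated summary lies within eps of the data's."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)
        s_sim = summary(simulate(theta, rng))
        if abs(s_sim - s_obs) <= eps:
            accepted.append(theta)
    return np.array(accepted)

rng = np.random.default_rng(2)
observed = rng.normal(3.0, 1.0, size=50)   # toy data with unknown mean
posterior = abc_rejection(
    observed,
    simulate=lambda theta, r: r.normal(theta, 1.0, size=50),
    prior_sample=lambda r: r.uniform(-10.0, 10.0),
    summary=np.mean,
    eps=0.1,
    n_draws=20000,
    rng=rng,
)
```

The accepted draws approximate the posterior; a smaller `eps` reduces the approximation error but lowers the acceptance rate, which is one of the trade-offs that needs to be assessed in practice.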
Bayesian structural equation modeling in sport and exercise psychology.
Stenling, Andreas; Ivarsson, Andreas; Johnson, Urban; Lindwall, Magnus
2015-08-01
Bayesian statistics is on the rise in mainstream psychology, but applications in sport and exercise psychology research are scarce. In this article, the foundations of Bayesian analysis are introduced, and we will illustrate how to apply Bayesian structural equation modeling in a sport and exercise psychology setting. More specifically, we contrasted a confirmatory factor analysis on the Sport Motivation Scale II estimated with the most commonly used estimator, maximum likelihood, and a Bayesian approach with weakly informative priors for cross-loadings and correlated residuals. The results indicated that the model with Bayesian estimation and weakly informative priors provided a good fit to the data, whereas the model estimated with a maximum likelihood estimator did not produce a well-fitting model. The reasons for this discrepancy between maximum likelihood and Bayesian estimation are discussed as well as potential advantages and caveats with the Bayesian approach.
Kan, Shun-Li; Yuan, Zhi-Fang; Chen, Ling-Xiao; Sun, Jing-Cheng; Ning, Guang-Zhi; Feng, Shi-Qing
2017-01-01
Introduction: Osteoporotic vertebral compression fractures (OVCFs) commonly cause both acute and chronic back pain, substantial spinal deformity, functional disability and decreased quality of life and increase the risk of future vertebral fractures and mortality. Percutaneous vertebroplasty (PVP), balloon kyphoplasty (BK) and non-surgical treatment (NST) are mostly used for the treatment of OVCFs. However, which treatment is preferred is unknown. The purpose of this study is to comprehensively review the literature and ascertain the relative efficacy and safety of BK, PVP and NST for patients with OVCFs using a Bayesian network meta-analysis. Methods and analysis: We will comprehensively search PubMed, EMBASE and the Cochrane Central Register of Controlled Trials, to include randomised controlled trials that compare BK, PVP or NST for treating OVCFs. The risk of bias for individual studies will be assessed according to the Cochrane Handbook. Bayesian network meta-analysis will be performed to compare the efficacy and safety of BK, PVP and NST. The quality of evidence will be evaluated by GRADE. Ethics and dissemination: Ethical approval and patient consent are not required since this study is a meta-analysis based on published studies. The results of this network meta-analysis will be submitted to a peer-reviewed journal for publication. PROSPERO registration number: CRD42016039452; Pre-results. PMID:28093431
Topographic factor analysis: a Bayesian model for inferring brain networks from neural data.
Directory of Open Access Journals (Sweden)
Jeremy R Manning
The neural patterns recorded during a neuroscientific experiment reflect complex interactions between many brain regions, each comprising millions of neurons. However, the measurements themselves are typically abstracted from that underlying structure. For example, functional magnetic resonance imaging (fMRI) datasets comprise a time series of three-dimensional images, where each voxel in an image (roughly) reflects the activity of the brain structure(s), located at the corresponding point in space, at the time the image was collected. fMRI data often exhibit strong spatial correlations, whereby nearby voxels behave similarly over time as the underlying brain structure modulates its activity. Here we develop topographic factor analysis (TFA), a technique that exploits spatial correlations in fMRI data to recover the underlying structure that the images reflect. Specifically, TFA casts each brain image as a weighted sum of spatial functions. The parameters of those spatial functions, which may be learned by applying TFA to an fMRI dataset, reveal the locations and sizes of the brain structures activated while the data were collected, as well as the interactions between those structures.
Casellas, J; Cañas-Álvarez, J J; González-Rodríguez, A; Puig-Oliveras, A; Fina, M; Piedrafita, J; Molina, A; Díaz, C; Baró, J A; Varona, L
2017-02-01
Transmission ratio distortion (TRD) is the departure from the expected Mendelian ratio in offspring, a poorly investigated biological phenomenon in livestock species. Given the current availability of specific parametric methods for the analysis of segregation data, this study focused on the screening of TRD in 602 402 single nucleotide polymorphisms covering all autosomal chromosomes in seven Spanish beef cattle breeds. On average, 0.13% (n = 786) and 0.01% (n = 29) of genetic markers evidenced sire- or dam-specific TRD respectively. There were no single nucleotide polymorphisms accounting for both sire- and dam-specific TRD at the same time, and only one marker (rs43147474) accounted for (sire-specific) TRD in all seven breeds. It must be noted that rs43147474 is located in the fourth intronic region of the GTP-binding protein 10 gene, and this locus has been previously linked to the maintenance of mitochondria and nucleolar architectures. Alternatively, other candidate genes surround this hot-spot for sire-specific TRD in the cattle genome, and they are related to embryonic and postnatal lethality as well as prostate cancer, among others. This research characterized the distribution of TRD in the bovine genome, highlighting heterogeneous results when comparing across breeds.
A Trans-dimensional Bayesian Approach to Pulsar Timing Noise Analysis
Ellis, Justin
2016-01-01
The modeling of intrinsic noise in pulsar timing residual data is of crucial importance for Gravitational Wave (GW) detection and pulsar timing (astro)physics in general. The noise budget in pulsars is a collection of several well studied effects including radiometer noise, pulse-phase jitter noise, dispersion measure (DM) variations, and low frequency spin noise. However, as pulsar timing data continues to improve, non-stationary and non-powerlaw noise terms are beginning to manifest which are not well modeled by current noise analysis techniques. In this work we use a trans-dimensional approach to model these non-stationary and non-powerlaw effects through the use of a wavelet basis and an interpolation based adaptive spectral modeling. In both cases, the number of wavelets and the number of control points in the interpolated spectrum are free parameters that are constrained by the data and then marginalized over in the final inferences, thus fully incorporating our ignorance of the noise model. We show tha...
Younes, A.; Delay, F.; Fajraoui, N.; Fahs, M.; Mara, T. A.
2016-08-01
The concept of dual flowing continuum is a promising approach for modeling solute transport in porous media that includes biofilm phases. The highly dispersed transit time distributions often generated by these media are taken into consideration by simply stipulating that advection-dispersion transport occurs through both the porous and the biofilm phases. Both phases are coupled but assigned with contrasting hydrodynamic properties. However, the dual flowing continuum suffers from intrinsic equifinality in the sense that the outlet solute concentration can be the result of several parameter sets of the two flowing phases. To assess the applicability of the dual flowing continuum, we investigate how the model behaves with respect to its parameters. For the purpose of this study, a Global Sensitivity Analysis (GSA) and a Statistical Calibration (SC) of model parameters are performed for two transport scenarios that differ by the strength of interaction between the flowing phases. The GSA is shown to be a valuable tool to understand how the complex system behaves. The results indicate that the rate of mass transfer between the two phases is a key parameter of the model behavior and influences the identifiability of the other parameters. For weak mass exchanges, the output concentration is mainly controlled by the velocity in the porous medium and by the porosity of both flowing phases. In the case of large mass exchanges, the kinetics of this exchange also controls the output concentration. The SC results show that transport with large mass exchange between the flowing phases is more likely affected by equifinality than transport with weak exchange. The SC also indicates that weakly sensitive parameters, such as the dispersion in each phase, can be accurately identified. Removing them from calibration procedures is not recommended because it might result in biased estimations of the highly sensitive parameters.
Analysis on Wake Vortex Accident Mechanism Based on Bayesian Networks
Institute of Scientific and Technical Information of China (English)
陈芳; 孙瑶
2011-01-01
To address the limitations of fault tree analysis, a fault tree for wake vortex accidents was mapped onto a Bayesian network (BN). Quantitative analysis of the BN by inference identified the key accident-causing factors: excessive air traffic density, misjudgment of the separation between two aircraft by air traffic control (ATC), and ignoring of the short-term conflict alert (STCA) warning. Improvement measures targeting these causes were then introduced into the BN to assess their effectiveness. Compared with fault tree analysis, the BN expresses the uncertain relationships between variables better than logic gates do, which makes it easier to identify the key factors leading to an accident.
Bauwens, Luc; Korobilis, Dimitris
2011-01-01
This comprehensive Handbook presents the current state of the art in the theory and methodology of macroeconomic data analysis. It is intended as a reference for graduate students and researchers interested in exploring new methodologies, but can also be employed as a graduate text. The Handbook concentrates on the most important issues, models and techniques for research in macroeconomics, and highlights the core methodologies and their empirical application in an accessible manner. Each chapter...
Bayesian Games with Intentions
Directory of Open Access Journals (Sweden)
Adam Bjorndahl
2016-06-01
We show that standard Bayesian games cannot represent the full spectrum of belief-dependent preferences. However, by introducing a fundamental distinction between intended and actual strategies, we remove this limitation. We define Bayesian games with intentions, generalizing both Bayesian games and psychological games, and prove that Nash equilibria in psychological games correspond to a special class of equilibria as defined in our setting.
Kuczera, George; Kavetski, Dmitri; Franks, Stewart; Thyer, Mark
2006-11-01
Summary: Calibration and prediction in conceptual rainfall-runoff (CRR) modelling is affected by the uncertainty in the observed forcing/response data and the structural error in the model. This study works towards the goal of developing a robust framework for dealing with these sources of error and focuses on model error. The characterisation of model error in CRR modelling has been thwarted by the convenient but indefensible treatment of CRR models as deterministic descriptions of catchment dynamics. This paper argues that the fluxes in CRR models should be treated as stochastic quantities because their estimation involves spatial and temporal averaging. Acceptance that CRR models are intrinsically stochastic paves the way for a more rational characterisation of model error. The hypothesis advanced in this paper is that CRR model error can be characterised by storm-dependent random variation of one or more CRR model parameters. A simple sensitivity analysis is used to identify the parameters most likely to behave stochastically, with variation in these parameters yielding the largest changes in model predictions as measured by the Nash-Sutcliffe criterion. A Bayesian hierarchical model is then formulated to explicitly differentiate between forcing, response and model error. It provides a very general framework for calibration and prediction, as well as for testing hypotheses regarding model structure and data uncertainty. A case study calibrating a six-parameter CRR model to daily data from the Abercrombie catchment (Australia) demonstrates the considerable potential of this approach. Allowing storm-dependent variation in just two model parameters (with one of the parameters characterising model error and the other reflecting input uncertainty) yields a substantially improved model fit, raising the Nash-Sutcliffe statistic from 0.74 to 0.94. Of particular significance is the use of posterior diagnostics to test the key assumptions about the data and model errors.
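For readers unfamiliar with the Nash-Sutcliffe statistic quoted above, it is one minus the ratio of the residual sum of squares to the sum of squared deviations of the observations from their mean, so 1 indicates a perfect fit and 0 a model no better than predicting the observed mean. A minimal sketch follows; the function name `nash_sutcliffe` and the toy flow values are illustrative and not taken from the study.

```python
import numpy as np

def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - SSE / variability of the observations."""
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    residual = np.sum((observed - simulated) ** 2)
    spread = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - residual / spread

obs = np.array([1.0, 3.0, 2.0, 5.0, 4.0])   # toy observed flows
perfect = nash_sutcliffe(obs, obs)                               # exact fit
baseline = nash_sutcliffe(obs, np.full_like(obs, obs.mean()))    # mean-only model
```

On this scale, the reported change from 0.74 to 0.94 means the unexplained fraction of the observed variability fell from 26% to 6%.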
Myte, Robin; Gylling, Björn; Häggström, Jenny; Schneede, Jörn; Magne Ueland, Per; Hallmans, Göran; Johansson, Ingegerd; Palmqvist, Richard; Van Guelpen, Bethany
2017-01-01
The role of one-carbon metabolism (1CM), particularly folate, in colorectal cancer (CRC) development has been extensively studied, but with inconclusive results. Given the complexity of 1CM, the conventional approach, investigating components individually, may be insufficient. We used a machine learning-based Bayesian network approach to study, simultaneously, 14 circulating one-carbon metabolites, 17 related single nucleotide polymorphisms (SNPs), and several environmental factors in relation to CRC risk in 613 cases and 1190 controls from the prospective Northern Sweden Health and Disease Study. The estimated networks corresponded largely to known biochemical relationships. Plasma concentrations of folate (direct), vitamin B6 (pyridoxal 5-phosphate) (inverse), and vitamin B2 (riboflavin) (inverse) had the strongest independent associations with CRC risk. Our study demonstrates the importance of incorporating B-vitamins in future studies of 1CM and CRC development, and the usefulness of Bayesian network learning for investigating complex biological systems in relation to disease. PMID:28233834
Bayesian statistics an introduction
Lee, Peter M
2012-01-01
Bayesian Statistics is the school of thought that combines prior beliefs with the likelihood of a hypothesis to arrive at posterior beliefs. The first edition of Peter Lee’s book appeared in 1989, but the subject has moved ever onwards, with increasing emphasis on Monte Carlo based techniques. This new fourth edition looks at recent techniques such as variational methods, Bayesian importance sampling, approximate Bayesian computation and Reversible Jump Markov Chain Monte Carlo (RJMCMC), providing a concise account of the way in which the Bayesian approach to statistics develops as wel
Understanding Computational Bayesian Statistics
Bolstad, William M
2011-01-01
A hands-on introduction to computational statistics from a Bayesian point of view. Providing a solid grounding in statistics while uniquely covering the topics from a Bayesian perspective, Understanding Computational Bayesian Statistics successfully guides readers through this new, cutting-edge approach. With its hands-on treatment of the topic, the book shows how samples can be drawn from the posterior distribution when the formula giving its shape is all that is known, and how Bayesian inferences can be based on these samples from the posterior. These ideas are illustrated on common statistic
Ershadi, Ali
2013-05-01
The influence of uncertainty in land surface temperature, air temperature, and wind speed on the estimation of sensible heat flux is analyzed using a Bayesian inference technique applied to the Surface Energy Balance System (SEBS) model. The Bayesian approach allows for an explicit quantification of the uncertainties in input variables, a source of error generally ignored in surface heat flux estimation. An application using field measurements from the Soil Moisture Experiment 2002 is presented. The spatial variability of selected input meteorological variables in a multitower site is used to formulate the prior estimates for the sampling uncertainties, and the likelihood function is formulated assuming Gaussian errors in the SEBS model. Land surface temperature, air temperature, and wind speed were estimated by sampling their posterior distribution using a Markov chain Monte Carlo algorithm. Results verify that Bayesian-inferred air temperature and wind speed were generally consistent with those observed at the towers, suggesting that local observations of these variables were spatially representative. Uncertainties in the land surface temperature appear to have the strongest effect on the estimated sensible heat flux, with Bayesian-inferred values differing by up to ±5°C from the observed data. These differences suggest that the footprint of the in situ measured land surface temperature is not representative of the larger-scale variability. As such, these measurements should be used with caution in the calculation of surface heat fluxes. The results also highlight the importance of capturing the spatial variability in the land surface temperature, particularly for remote sensing retrieval algorithms that use this variable for flux estimation.
Bayesian Alternation During Tactile Augmentation
Directory of Open Access Journals (Sweden)
Caspar Mathias Goeke
2016-10-01
A large number of studies suggest that the integration of multisensory signals by humans is well described by Bayesian principles. However, there are very few reports about cue combination between a native and an augmented sense. In particular, we asked whether adult participants are able to integrate an augmented sensory cue with existing native sensory information. For the purpose of this study we built a tactile augmentation device and compared different hypotheses of how untrained adult participants combine information from a native and an augmented sense. In a two-interval forced choice (2IFC) task, while subjects were blindfolded and seated on a rotating platform, our sensory augmentation device translated information on whole body yaw rotation to tactile stimulation. Three conditions were realized: tactile stimulation only (augmented condition), rotation only (native condition), and both augmented and native information (bimodal condition). Participants had to choose which of two consecutive rotations had the higher angular rotation. For the analysis, we fitted the participants’ responses with a probit model and calculated the just noticeable difference (JND). Then we compared several models for predicting bimodal from unimodal responses. An objective Bayesian alternation model yielded a better prediction ($\chi^2_{\rm red}$ = 1.67) than the Bayesian integration model ($\chi^2_{\rm red}$ = 4.34). A non-Bayesian winner-takes-all model ($\chi^2_{\rm red}$ = 1.64), which used either only native or only augmented values per subject for prediction, showed slightly higher accuracy. However, the performance of the Bayesian alternation model could be substantially improved ($\chi^2_{\rm red}$ = 1.09) by utilizing subjective weights obtained from a questionnaire. As a result, the subjective Bayesian alternation model predicted bimodal performance most accurately among all tested models. These results suggest that information from augmented and existing sensory modalities in
Díaz, R F; Udry, S; Lovis, C; Pepe, F; Dumusque, X; Marmier, M; Alonso, R; Benz, W; Bouchy, F; Coffinet, A; Cameron, A Collier; Deleuil, M; Figueira, P; Gillon, M; Curto, G Lo; Mayor, M; Mordasini, C; Motalebi, F; Moutou, C; Pollacco, D; Pompei, E; Queloz, D; Santos, N; Wyttenbach, A
2016-01-01
We present the analysis of the entire HARPS observations of three stars that host planetary systems: HD1461, HD40307, and HD204313. The data set spans eight years and contains more than 200 nightly averaged velocity measurements for each star. This means that it is sensitive to both long-period and low-mass planets and also to the effects induced by stellar activity cycles. We modelled the data using Keplerian functions that correspond to planetary candidates and included the short- and long-term effects of magnetic activity. A Bayesian approach was taken both for the data modelling, which allowed us to include information from activity proxies such as $\log{(R'_{\rm HK})}$ in the velocity modelling, and for the model selection, which permitted determining the number of significant signals in the system. The Bayesian model comparison overcomes the limitations inherent to the traditional periodogram analysis. We report an additional super-Earth planet in the HD1461 system. Four out of the six planets previousl...