WorldWideScience

Sample records for nonparametric bayesian statistics

  1. Nonparametric Bayesian predictive distributions for future order statistics

    Science.gov (United States)

    Richard A. Johnson; James W. Evans; David W. Green

    1999-01-01

    We derive the predictive distribution for a specified order statistic, determined from a future random sample, under a Dirichlet process prior. Two variants of the approach are treated and some limiting cases studied. A practical application to monitoring the strength of lumber is discussed including choices of prior expectation and comparisons made to a Bayesian...

  2. Bayesian Nonparametric Statistical Inference for Shock Models and Wear Processes.

    Science.gov (United States)

    1979-12-01

    also note that the results in Section 2 do not depend on the support of F .) This shock model have been studied by Esary, Marshall and Proschan (1973...Barlow and Proschan (1975), among others. The analogy of the shock model in risk and acturial analysis has been given by BUhlmann (1970, Chapter 2... Mathematical Statistics, Vol. 4, pp. 894-906. Billingsley, P. (1968), CONVERGENCE OF PROBABILITY MEASURES, John Wiley, New York. BUhlmann, H. (1970

  3. Bayesian nonparametric data analysis

    CERN Document Server

    Müller, Peter; Jara, Alejandro; Hanson, Tim

    2015-01-01

    This book reviews nonparametric Bayesian methods and models that have proven useful in the context of data analysis. Rather than providing an encyclopedic review of probability models, the book’s structure follows a data analysis perspective. As such, the chapters are organized by traditional data analysis problems. In selecting specific nonparametric models, simpler and more traditional models are favored over specialized ones. The discussed methods are illustrated with a wealth of examples, including applications ranging from stylized examples to case studies from recent literature. The book also includes an extensive discussion of computational methods and details on their implementation. R code for many examples is included in on-line software pages.

  4. Statistical analysis using the Bayesian nonparametric method for irradiation embrittlement of reactor pressure vessels

    Energy Technology Data Exchange (ETDEWEB)

    Takamizawa, Hisashi, E-mail: takamizawa.hisashi@jaea.go.jp; Itoh, Hiroto, E-mail: ito.hiroto@jaea.go.jp; Nishiyama, Yutaka, E-mail: nishiyama.yutaka93@jaea.go.jp

    2016-10-15

    In order to understand neutron irradiation embrittlement in high fluence regions, statistical analysis using the Bayesian nonparametric (BNP) method was performed for the Japanese surveillance and material test reactor irradiation database. The BNP method is essentially expressed as an infinite summation of normal distributions, with input data being subdivided into clusters with identical statistical parameters, such as mean and standard deviation, for each cluster to estimate shifts in ductile-to-brittle transition temperature (DBTT). The clusters typically depend on chemical compositions, irradiation conditions, and the irradiation embrittlement. Specific variables contributing to the irradiation embrittlement include the content of Cu, Ni, P, Si, and Mn in the pressure vessel steels, neutron flux, neutron fluence, and irradiation temperatures. It was found that the measured shifts of DBTT correlated well with the calculated ones. Data associated with the same materials were subdivided into the same clusters even if neutron fluences were increased.

  5. Bayesian nonparametric hierarchical modeling.

    Science.gov (United States)

    Dunson, David B

    2009-04-01

    In biomedical research, hierarchical models are very widely used to accommodate dependence in multivariate and longitudinal data and for borrowing of information across data from different sources. A primary concern in hierarchical modeling is sensitivity to parametric assumptions, such as linearity and normality of the random effects. Parametric assumptions on latent variable distributions can be challenging to check and are typically unwarranted, given available prior knowledge. This article reviews some recent developments in Bayesian nonparametric methods motivated by complex, multivariate and functional data collected in biomedical studies. The author provides a brief review of flexible parametric approaches relying on finite mixtures and latent class modeling. Dirichlet process mixture models are motivated by the need to generalize these approaches to avoid assuming a fixed finite number of classes. Focusing on an epidemiology application, the author illustrates the practical utility and potential of nonparametric Bayes methods.

  6. Nonparametric statistical inference

    CERN Document Server

    Gibbons, Jean Dickinson

    2010-01-01

    Overall, this remains a very fine book suitable for a graduate-level course in nonparametric statistics. I recommend it for all people interested in learning the basic ideas of nonparametric statistical inference.-Eugenia Stoimenova, Journal of Applied Statistics, June 2012… one of the best books available for a graduate (or advanced undergraduate) text for a theory course on nonparametric statistics. … a very well-written and organized book on nonparametric statistics, especially useful and recommended for teachers and graduate students.-Biometrics, 67, September 2011This excellently presente

  7. Bayesian Nonparametric Longitudinal Data Analysis.

    Science.gov (United States)

    Quintana, Fernando A; Johnson, Wesley O; Waetjen, Elaine; Gold, Ellen

    2016-01-01

    Practical Bayesian nonparametric methods have been developed across a wide variety of contexts. Here, we develop a novel statistical model that generalizes standard mixed models for longitudinal data that include flexible mean functions as well as combined compound symmetry (CS) and autoregressive (AR) covariance structures. AR structure is often specified through the use of a Gaussian process (GP) with covariance functions that allow longitudinal data to be more correlated if they are observed closer in time than if they are observed farther apart. We allow for AR structure by considering a broader class of models that incorporates a Dirichlet Process Mixture (DPM) over the covariance parameters of the GP. We are able to take advantage of modern Bayesian statistical methods in making full predictive inferences and about characteristics of longitudinal profiles and their differences across covariate combinations. We also take advantage of the generality of our model, which provides for estimation of a variety of covariance structures. We observe that models that fail to incorporate CS or AR structure can result in very poor estimation of a covariance or correlation matrix. In our illustration using hormone data observed on women through the menopausal transition, biology dictates the use of a generalized family of sigmoid functions as a model for time trends across subpopulation categories.

  8. Nonparametric Bayesian inference in biostatistics

    CERN Document Server

    Müller, Peter

    2015-01-01

    As chapters in this book demonstrate, BNP has important uses in clinical sciences and inference for issues like unknown partitions in genomics. Nonparametric Bayesian approaches (BNP) play an ever expanding role in biostatistical inference from use in proteomics to clinical trials. Many research problems involve an abundance of data and require flexible and complex probability models beyond the traditional parametric approaches. As this book's expert contributors show, BNP approaches can be the answer. Survival Analysis, in particular survival regression, has traditionally used BNP, but BNP's potential is now very broad. This applies to important tasks like arrangement of patients into clinically meaningful subpopulations and segmenting the genome into functionally distinct regions. This book is designed to both review and introduce application areas for BNP. While existing books provide theoretical foundations, this book connects theory to practice through engaging examples and research questions. Chapters c...

  9. Nonparametric statistical inference

    CERN Document Server

    Gibbons, Jean Dickinson

    2014-01-01

    Thoroughly revised and reorganized, the fourth edition presents in-depth coverage of the theory and methods of the most widely used nonparametric procedures in statistical analysis and offers example applications appropriate for all areas of the social, behavioral, and life sciences. The book presents new material on the quantiles, the calculation of exact and simulated power, multiple comparisons, additional goodness-of-fit tests, methods of analysis of count data, and modern computer applications using MINITAB, SAS, and STATXACT. It includes tabular guides for simplified applications of tests and finding P values and confidence interval estimates.

  10. Nonparametric Bayesian Modeling of Complex Networks

    DEFF Research Database (Denmark)

    Schmidt, Mikkel Nørgaard; Mørup, Morten

    2013-01-01

    an infinite mixture model as running example, we go through the steps of deriving the model as an infinite limit of a finite parametric model, inferring the model parameters by Markov chain Monte Carlo, and checking the model?s fit and predictive performance. We explain how advanced nonparametric models......Modeling structure in complex networks using Bayesian nonparametrics makes it possible to specify flexible model structures and infer the adequate model complexity from the observed data. This article provides a gentle introduction to nonparametric Bayesian modeling of complex networks: Using...

  11. Technical Topic 3.2.2.d Bayesian and Non-Parametric Statistics: Integration of Neural Networks with Bayesian Networks for Data Fusion and Predictive Modeling

    Science.gov (United States)

    2016-05-31

    Distribution Unlimited UU UU UU UU 31-05-2016 15-Apr-2014 14-Jan-2015 Final Report: Technical Topic 3.2.2.d Bayesian and Non- parametric Statistics...of Papers published in non peer-reviewed journals: Final Report: Technical Topic 3.2.2.d Bayesian and Non- parametric Statistics: Integration of Neural...Transfer N/A Number of graduating undergraduates who achieved a 3.5 GPA to 4.0 (4.0 max scale ): Number of graduating undergraduates funded by a DoD funded

  12. A Bayesian Nonparametric Approach to Factor Analysis

    DEFF Research Database (Denmark)

    Piatek, Rémi; Papaspiliopoulos, Omiros

    2018-01-01

    This paper introduces a new approach for the inference of non-Gaussian factor models based on Bayesian nonparametric methods. It relaxes the usual normality assumption on the latent factors, widely used in practice, which is too restrictive in many settings. Our approach, on the contrary, does no...

  13. Decision support using nonparametric statistics

    CERN Document Server

    Beatty, Warren

    2018-01-01

    This concise volume covers nonparametric statistics topics that most are most likely to be seen and used from a practical decision support perspective. While many degree programs require a course in parametric statistics, these methods are often inadequate for real-world decision making in business environments. Much of the data collected today by business executives (for example, customer satisfaction opinions) requires nonparametric statistics for valid analysis, and this book provides the reader with a set of tools that can be used to validly analyze all data, regardless of type. Through numerous examples and exercises, this book explains why nonparametric statistics will lead to better decisions and how they are used to reach a decision, with a wide array of business applications. Online resources include exercise data, spreadsheets, and solutions.

  14. Network structure exploration via Bayesian nonparametric models

    International Nuclear Information System (INIS)

    Chen, Y; Wang, X L; Xiang, X; Tang, B Z; Bu, J Z

    2015-01-01

    Complex networks provide a powerful mathematical representation of complex systems in nature and society. To understand complex networks, it is crucial to explore their internal structures, also called structural regularities. The task of network structure exploration is to determine how many groups there are in a complex network and how to group the nodes of the network. Most existing structure exploration methods need to specify either a group number or a certain type of structure when they are applied to a network. In the real world, however, the group number and also the certain type of structure that a network has are usually unknown in advance. To explore structural regularities in complex networks automatically, without any prior knowledge of the group number or the certain type of structure, we extend a probabilistic mixture model that can handle networks with any type of structure but needs to specify a group number using Bayesian nonparametric theory. We also propose a novel Bayesian nonparametric model, called the Bayesian nonparametric mixture (BNPM) model. Experiments conducted on a large number of networks with different structures show that the BNPM model is able to explore structural regularities in networks automatically with a stable, state-of-the-art performance. (paper)

  15. Robustifying Bayesian nonparametric mixtures for count data.

    Science.gov (United States)

    Canale, Antonio; Prünster, Igor

    2017-03-01

    Our motivating application stems from surveys of natural populations and is characterized by large spatial heterogeneity in the counts, which makes parametric approaches to modeling local animal abundance too restrictive. We adopt a Bayesian nonparametric approach based on mixture models and innovate with respect to popular Dirichlet process mixture of Poisson kernels by increasing the model flexibility at the level both of the kernel and the nonparametric mixing measure. This allows to derive accurate and robust estimates of the distribution of local animal abundance and of the corresponding clusters. The application and a simulation study for different scenarios yield also some general methodological implications. Adding flexibility solely at the level of the mixing measure does not improve inferences, since its impact is severely limited by the rigidity of the Poisson kernel with considerable consequences in terms of bias. However, once a kernel more flexible than the Poisson is chosen, inferences can be robustified by choosing a prior more general than the Dirichlet process. Therefore, to improve the performance of Bayesian nonparametric mixtures for count data one has to enrich the model simultaneously at both levels, the kernel and the mixing measure. © 2016, The International Biometric Society.

  16. Bayesian Nonparametric Clustering for Positive Definite Matrices.

    Science.gov (United States)

    Cherian, Anoop; Morellas, Vassilios; Papanikolopoulos, Nikolaos

    2016-05-01

    Symmetric Positive Definite (SPD) matrices emerge as data descriptors in several applications of computer vision such as object tracking, texture recognition, and diffusion tensor imaging. Clustering these data matrices forms an integral part of these applications, for which soft-clustering algorithms (K-Means, expectation maximization, etc.) are generally used. As is well-known, these algorithms need the number of clusters to be specified, which is difficult when the dataset scales. To address this issue, we resort to the classical nonparametric Bayesian framework by modeling the data as a mixture model using the Dirichlet process (DP) prior. Since these matrices do not conform to the Euclidean geometry, rather belongs to a curved Riemannian manifold,existing DP models cannot be directly applied. Thus, in this paper, we propose a novel DP mixture model framework for SPD matrices. Using the log-determinant divergence as the underlying dissimilarity measure to compare these matrices, and further using the connection between this measure and the Wishart distribution, we derive a novel DPM model based on the Wishart-Inverse-Wishart conjugate pair. We apply this model to several applications in computer vision. Our experiments demonstrate that our model is scalable to the dataset size and at the same time achieves superior accuracy compared to several state-of-the-art parametric and nonparametric clustering algorithms.

  17. Bayesian nonparametric adaptive control using Gaussian processes.

    Science.gov (United States)

    Chowdhary, Girish; Kingravi, Hassan A; How, Jonathan P; Vela, Patricio A

    2015-03-01

    Most current model reference adaptive control (MRAC) methods rely on parametric adaptive elements, in which the number of parameters of the adaptive element are fixed a priori, often through expert judgment. An example of such an adaptive element is radial basis function networks (RBFNs), with RBF centers preallocated based on the expected operating domain. If the system operates outside of the expected operating domain, this adaptive element can become noneffective in capturing and canceling the uncertainty, thus rendering the adaptive controller only semiglobal in nature. This paper investigates a Gaussian process-based Bayesian MRAC architecture (GP-MRAC), which leverages the power and flexibility of GP Bayesian nonparametric models of uncertainty. The GP-MRAC does not require the centers to be preallocated, can inherently handle measurement noise, and enables MRAC to handle a broader set of uncertainties, including those that are defined as distributions over functions. We use stochastic stability arguments to show that GP-MRAC guarantees good closed-loop performance with no prior domain knowledge of the uncertainty. Online implementable GP inference methods are compared in numerical simulations against RBFN-MRAC with preallocated centers and are shown to provide better tracking and improved long-term learning.

  18. Introduction to Bayesian statistics

    CERN Document Server

    Bolstad, William M

    2017-01-01

    There is a strong upsurge in the use of Bayesian methods in applied statistical analysis, yet most introductory statistics texts only present frequentist methods. Bayesian statistics has many important advantages that students should learn about if they are going into fields where statistics will be used. In this Third Edition, four newly-added chapters address topics that reflect the rapid advances in the field of Bayesian staistics. The author continues to provide a Bayesian treatment of introductory statistical topics, such as scientific data gathering, discrete random variables, robust Bayesian methods, and Bayesian approaches to inferenfe cfor discrete random variables, bionomial proprotion, Poisson, normal mean, and simple linear regression. In addition, newly-developing topics in the field are presented in four new chapters: Bayesian inference with unknown mean and variance; Bayesian inference for Multivariate Normal mean vector; Bayesian inference for Multiple Linear RegressionModel; and Computati...

  19. Nonparametric statistics for social and behavioral sciences

    CERN Document Server

    Kraska-MIller, M

    2013-01-01

    Introduction to Research in Social and Behavioral SciencesBasic Principles of ResearchPlanning for ResearchTypes of Research Designs Sampling ProceduresValidity and Reliability of Measurement InstrumentsSteps of the Research Process Introduction to Nonparametric StatisticsData AnalysisOverview of Nonparametric Statistics and Parametric Statistics Overview of Parametric Statistics Overview of Nonparametric StatisticsImportance of Nonparametric MethodsMeasurement InstrumentsAnalysis of Data to Determine Association and Agreement Pearson Chi-Square Test of Association and IndependenceContingency

  20. Comparing nonparametric Bayesian tree priors for clonal reconstruction of tumors.

    Science.gov (United States)

    Deshwar, Amit G; Vembu, Shankar; Morris, Quaid

    2015-01-01

    Statistical machine learning methods, especially nonparametric Bayesian methods, have become increasingly popular to infer clonal population structure of tumors. Here we describe the treeCRP, an extension of the Chinese restaurant process (CRP), a popular construction used in nonparametric mixture models, to infer the phylogeny and genotype of major subclonal lineages represented in the population of cancer cells. We also propose new split-merge updates tailored to the subclonal reconstruction problem that improve the mixing time of Markov chains. In comparisons with the tree-structured stick breaking prior used in PhyloSub, we demonstrate superior mixing and running time using the treeCRP with our new split-merge procedures. We also show that given the same number of samples, TSSB and treeCRP have similar ability to recover the subclonal structure of a tumor…

  1. Nonparametric Bayesian inference for multidimensional compound Poisson processes

    NARCIS (Netherlands)

    Gugushvili, S.; van der Meulen, F.; Spreij, P.

    2015-01-01

    Given a sample from a discretely observed multidimensional compound Poisson process, we study the problem of nonparametric estimation of its jump size density r0 and intensity λ0. We take a nonparametric Bayesian approach to the problem and determine posterior contraction rates in this context,

  2. Understanding Computational Bayesian Statistics

    CERN Document Server

    Bolstad, William M

    2011-01-01

    A hands-on introduction to computational statistics from a Bayesian point of view Providing a solid grounding in statistics while uniquely covering the topics from a Bayesian perspective, Understanding Computational Bayesian Statistics successfully guides readers through this new, cutting-edge approach. With its hands-on treatment of the topic, the book shows how samples can be drawn from the posterior distribution when the formula giving its shape is all that is known, and how Bayesian inferences can be based on these samples from the posterior. These ideas are illustrated on common statistic

  3. Bayesian statistics an introduction

    CERN Document Server

    Lee, Peter M

    2012-01-01

    Bayesian Statistics is the school of thought that combines prior beliefs with the likelihood of a hypothesis to arrive at posterior beliefs. The first edition of Peter Lee’s book appeared in 1989, but the subject has moved ever onwards, with increasing emphasis on Monte Carlo based techniques. This new fourth edition looks at recent techniques such as variational methods, Bayesian importance sampling, approximate Bayesian computation and Reversible Jump Markov Chain Monte Carlo (RJMCMC), providing a concise account of the way in which the Bayesian approach to statistics develops as wel

  4. The Bayesian Score Statistic

    NARCIS (Netherlands)

    Kleibergen, F.R.; Kleijn, R.; Paap, R.

    2000-01-01

    We propose a novel Bayesian test under a (noninformative) Jeffreys'priorspecification. We check whether the fixed scalar value of the so-calledBayesian Score Statistic (BSS) under the null hypothesis is aplausiblerealization from its known and standardized distribution under thealternative. Unlike

  5. Effect on Prediction when Modeling Covariates in Bayesian Nonparametric Models.

    Science.gov (United States)

    Cruz-Marcelo, Alejandro; Rosner, Gary L; Müller, Peter; Stewart, Clinton F

    2013-04-01

    In biomedical research, it is often of interest to characterize biologic processes giving rise to observations and to make predictions of future observations. Bayesian nonparametric methods provide a means for carrying out Bayesian inference making as few assumptions about restrictive parametric models as possible. There are several proposals in the literature for extending Bayesian nonparametric models to include dependence on covariates. Limited attention, however, has been directed to the following two aspects. In this article, we examine the effect on fitting and predictive performance of incorporating covariates in a class of Bayesian nonparametric models by one of two primary ways: either in the weights or in the locations of a discrete random probability measure. We show that different strategies for incorporating continuous covariates in Bayesian nonparametric models can result in big differences when used for prediction, even though they lead to otherwise similar posterior inferences. When one needs the predictive density, as in optimal design, and this density is a mixture, it is better to make the weights depend on the covariates. We demonstrate these points via a simulated data example and in an application in which one wants to determine the optimal dose of an anticancer drug used in pediatric oncology.

  6. Statistics: a Bayesian perspective

    National Research Council Canada - National Science Library

    Berry, Donald A

    1996-01-01

    ...: it is the only introductory textbook based on Bayesian ideas, it combines concepts and methods, it presents statistics as a means of integrating data into the significant process, it develops ideas...

  7. Teaching Nonparametric Statistics Using Student Instrumental Values.

    Science.gov (United States)

    Anderson, Jonathan W.; Diddams, Margaret

    Nonparametric statistics are often difficult to teach in introduction to statistics courses because of the lack of real-world examples. This study demonstrated how teachers can use differences in the rankings and ratings of undergraduate and graduate values to discuss: (1) ipsative and normative scaling; (2) uses of the Mann-Whitney U-test; and…

  8. Bayesian nonparametric system reliability using sets of priors

    NARCIS (Netherlands)

    Walter, G.M.; Aslett, L.J.M.; Coolen, F.P.A.

    2016-01-01

    An imprecise Bayesian nonparametric approach to system reliability with multiple types of components is developed. This allows modelling partial or imperfect prior knowledge on component failure distributions in a flexible way through bounds on the functioning probability. Given component level test

  9. A Bayesian nonparametric approach to causal inference on quantiles.

    Science.gov (United States)

    Xu, Dandan; Daniels, Michael J; Winterstein, Almut G

    2018-02-25

    We propose a Bayesian nonparametric approach (BNP) for causal inference on quantiles in the presence of many confounders. In particular, we define relevant causal quantities and specify BNP models to avoid bias from restrictive parametric assumptions. We first use Bayesian additive regression trees (BART) to model the propensity score and then construct the distribution of potential outcomes given the propensity score using a Dirichlet process mixture (DPM) of normals model. We thoroughly evaluate the operating characteristics of our approach and compare it to Bayesian and frequentist competitors. We use our approach to answer an important clinical question involving acute kidney injury using electronic health records. © 2018, The International Biometric Society.

  10. Bayesian statistical inference

    Directory of Open Access Journals (Sweden)

    Bruno De Finetti

    2017-04-01

    Full Text Available This work was translated into English and published in the volume: Bruno De Finetti, Induction and Probability, Biblioteca di Statistica, eds. P. Monari, D. Cocchi, Clueb, Bologna, 1993.Bayesian statistical Inference is one of the last fundamental philosophical papers in which we can find the essential De Finetti's approach to the statistical inference.

  11. Nonparametric predictive inference in statistical process control

    NARCIS (Netherlands)

    Arts, G.R.J.; Coolen, F.P.A.; Laan, van der P.

    2000-01-01

    New methods for statistical process control are presented, where the inferences have a nonparametric predictive nature. We consider several problems in process control in terms of uncertainties about future observable random quantities, and we develop inferences for these random quantities hased on

  12. Nonparametric predictive inference in statistical process control

    NARCIS (Netherlands)

    Arts, G.R.J.; Coolen, F.P.A.; Laan, van der P.

    2004-01-01

    Statistical process control (SPC) is used to decide when to stop a process as confidence in the quality of the next item(s) is low. Information to specify a parametric model is not always available, and as SPC is of a predictive nature, we present a control chart developed using nonparametric

  13. Bayesian nonparametric meta-analysis using Polya tree mixture models.

    Science.gov (United States)

    Branscum, Adam J; Hanson, Timothy E

    2008-09-01

    Summary. A common goal in meta-analysis is estimation of a single effect measure using data from several studies that are each designed to address the same scientific inquiry. Because studies are typically conducted in geographically disperse locations, recent developments in the statistical analysis of meta-analytic data involve the use of random effects models that account for study-to-study variability attributable to differences in environments, demographics, genetics, and other sources that lead to heterogeneity in populations. Stemming from asymptotic theory, study-specific summary statistics are modeled according to normal distributions with means representing latent true effect measures. A parametric approach subsequently models these latent measures using a normal distribution, which is strictly a convenient modeling assumption absent of theoretical justification. To eliminate the influence of overly restrictive parametric models on inferences, we consider a broader class of random effects distributions. We develop a novel hierarchical Bayesian nonparametric Polya tree mixture (PTM) model. We present methodology for testing the PTM versus a normal random effects model. These methods provide researchers a straightforward approach for conducting a sensitivity analysis of the normality assumption for random effects. An application involving meta-analysis of epidemiologic studies designed to characterize the association between alcohol consumption and breast cancer is presented, which together with results from simulated data highlight the performance of PTMs in the presence of nonnormality of effect measures in the source population.

  14. Seismic Signal Compression Using Nonparametric Bayesian Dictionary Learning via Clustering

    Directory of Open Access Journals (Sweden)

    Xin Tian

    2017-06-01

    Full Text Available We introduce a seismic signal compression method based on nonparametric Bayesian dictionary learning method via clustering. The seismic data is compressed patch by patch, and the dictionary is learned online. Clustering is introduced for dictionary learning. A set of dictionaries could be generated, and each dictionary is used for one cluster’s sparse coding. In this way, the signals in one cluster could be well represented by their corresponding dictionaries. A nonparametric Bayesian dictionary learning method is used to learn the dictionaries, which naturally infers an appropriate dictionary size for each cluster. A uniform quantizer and an adaptive arithmetic coding algorithm are adopted to code the sparse coefficients. With comparisons to other state-of-the art approaches, the effectiveness of the proposed method could be validated in the experiments.

  15. Nonparametric Bayesian models through probit stick-breaking processes.

    Science.gov (United States)

    Rodríguez, Abel; Dunson, David B

    2011-03-01

    We describe a novel class of Bayesian nonparametric priors based on stick-breaking constructions where the weights of the process are constructed as probit transformations of normal random variables. We show that these priors are extremely flexible, allowing us to generate a great variety of models while preserving computational simplicity. Particular emphasis is placed on the construction of rich temporal and spatial processes, which are applied to two problems in finance and ecology.

  16. Probability and Bayesian statistics

    CERN Document Server

    1987-01-01

    This book contains selected and refereed contributions to the "Inter­ national Symposium on Probability and Bayesian Statistics" which was orga­ nized to celebrate the 80th birthday of Professor Bruno de Finetti at his birthplace Innsbruck in Austria. Since Professor de Finetti died in 1985 the symposium was dedicated to the memory of Bruno de Finetti and took place at Igls near Innsbruck from 23 to 26 September 1986. Some of the pa­ pers are published especially by the relationship to Bruno de Finetti's scientific work. The evolution of stochastics shows growing importance of probability as coherent assessment of numerical values as degrees of believe in certain events. This is the basis for Bayesian inference in the sense of modern statistics. The contributions in this volume cover a broad spectrum ranging from foundations of probability across psychological aspects of formulating sub­ jective probability statements, abstract measure theoretical considerations, contributions to theoretical statistics an...

  17. 10th Conference on Bayesian Nonparametrics

    Science.gov (United States)

    2016-05-08

    RETURN YOUR FORM TO THE ABOVE ADDRESS. North Carolina State University 2701 Sullivan Drive Admin Srvcs III, Box 7514 Raleigh, NC 27695 -7514 ABSTRACT...the conference. The findings from the conference is widely disseminated. The conference web site displays slides of the talks presented in the...being published by the Electronic Journal of Statistics consisting of about 20 papers read at the conference. The conference web site displays

  18. A Nonparametric Bayesian Approach For Emission Tomography Reconstruction

    International Nuclear Information System (INIS)

    Barat, Eric; Dautremer, Thomas

    2007-01-01

    We introduce a PET reconstruction algorithm following a nonparametric Bayesian (NPB) approach. In contrast with Expectation Maximization (EM), the proposed technique does not rely on any space discretization. Namely, the activity distribution--normalized emission intensity of the spatial poisson process--is considered as a spatial probability density and observations are the projections of random emissions whose distribution has to be estimated. This approach is nonparametric in the sense that the quantity of interest belongs to the set of probability measures on R k (for reconstruction in k-dimensions) and it is Bayesian in the sense that we define a prior directly on this spatial measure. In this context, we propose to model the nonparametric probability density as an infinite mixture of multivariate normal distributions. As a prior for this mixture we consider a Dirichlet Process Mixture (DPM) with a Normal-Inverse Wishart (NIW) model as base distribution of the Dirichlet Process. As in EM-family reconstruction, we use a data augmentation scheme where the set of hidden variables are the emission locations for each observed line of response in the continuous object space. Thanks to the data augmentation, we propose a Markov Chain Monte Carlo (MCMC) algorithm (Gibbs sampler) which is able to generate draws from the posterior distribution of the spatial intensity. A difference with EM is that one step of the Gibbs sampler corresponds to the generation of emission locations while only the expected number of emissions per pixel/voxel is used in EM. Another key difference is that the estimated spatial intensity is a continuous function such that there is no need to compute a projection matrix. Finally, draws from the intensity posterior distribution allow the estimation of posterior functionnals like the variance or confidence intervals. Results are presented for simulated data based on a 2D brain phantom and compared to Bayesian MAP-EM

  19. Statistical decisions under nonparametric a priori information

    International Nuclear Information System (INIS)

    Chilingaryan, A.A.

    1985-01-01

    The basic module of applied program package for statistical analysis of the ANI experiment data is described. By means of this module tasks of choosing theoretical model most adequately fitting to experimental data, selection of events of definte type, identification of elementary particles are carried out. For mentioned problems solving, the Bayesian rules, one-leave out test and KNN (K Nearest Neighbour) adaptive density estimation are utilized

  20. Adaptive nonparametric Bayesian inference using location-scale mixture priors

    NARCIS (Netherlands)

    Jonge, de R.; Zanten, van J.H.

    2010-01-01

    We study location-scale mixture priors for nonparametric statistical problems, including multivariate regression, density estimation and classification. We show that a rate-adaptive procedure can be obtained if the prior is properly constructed. In particular, we show that adaptation is achieved if

  1. Bayesian nonparametric dictionary learning for compressed sensing MRI.

    Science.gov (United States)

    Huang, Yue; Paisley, John; Lin, Qin; Ding, Xinghao; Fu, Xueyang; Zhang, Xiao-Ping

    2014-12-01

    We develop a Bayesian nonparametric model for reconstructing magnetic resonance images (MRIs) from highly undersampled k -space data. We perform dictionary learning as part of the image reconstruction process. To this end, we use the beta process as a nonparametric dictionary learning prior for representing an image patch as a sparse combination of dictionary elements. The size of the dictionary and patch-specific sparsity pattern are inferred from the data, in addition to other dictionary learning variables. Dictionary learning is performed directly on the compressed image, and so is tailored to the MRI being considered. In addition, we investigate a total variation penalty term in combination with the dictionary learning model, and show how the denoising property of dictionary learning removes dependence on regularization parameters in the noisy setting. We derive a stochastic optimization algorithm based on Markov chain Monte Carlo for the Bayesian model, and use the alternating direction method of multipliers for efficiently performing total variation minimization. We present empirical results on several MRI, which show that the proposed regularization framework can improve reconstruction accuracy over other methods.

  2. Nonparametric Bayesian density estimation on manifolds with applications to planar shapes.

    Science.gov (United States)

    Bhattacharya, Abhishek; Dunson, David B

    2010-12-01

    Statistical analysis on landmark-based shape spaces has diverse applications in morphometrics, medical diagnostics, machine vision and other areas. These shape spaces are non-Euclidean quotient manifolds. To conduct nonparametric inferences, one may define notions of centre and spread on this manifold and work with their estimates. However, it is useful to consider full likelihood-based methods, which allow nonparametric estimation of the probability density. This article proposes a broad class of mixture models constructed using suitable kernels on a general compact metric space and then on the planar shape space in particular. Following a Bayesian approach with a nonparametric prior on the mixing distribution, conditions are obtained under which the Kullback-Leibler property holds, implying large support and weak posterior consistency. Gibbs sampling methods are developed for posterior computation, and the methods are applied to problems in density estimation and classification with shape-based predictors. Simulation studies show improved estimation performance relative to existing approaches.

  3. Prior processes and their applications nonparametric Bayesian estimation

    CERN Document Server

    Phadia, Eswar G

    2016-01-01

    This book presents a systematic and comprehensive treatment of various prior processes that have been developed over the past four decades for dealing with Bayesian approach to solving selected nonparametric inference problems. This revised edition has been substantially expanded to reflect the current interest in this area. After an overview of different prior processes, it examines the now pre-eminent Dirichlet process and its variants including hierarchical processes, then addresses new processes such as dependent Dirichlet, local Dirichlet, time-varying and spatial processes, all of which exploit the countable mixture representation of the Dirichlet process. It subsequently discusses various neutral to right type processes, including gamma and extended gamma, beta and beta-Stacy processes, and then describes the Chinese Restaurant, Indian Buffet and infinite gamma-Poisson processes, which prove to be very useful in areas such as machine learning, information retrieval and featural modeling. Tailfree and P...

  4. Introduction to nonparametric statistics for the biological sciences using R

    CERN Document Server

    MacFarland, Thomas W

    2016-01-01

    This book contains a rich set of tools for nonparametric analyses, and the purpose of this supplemental text is to provide guidance to students and professional researchers on how R is used for nonparametric data analysis in the biological sciences: To introduce when nonparametric approaches to data analysis are appropriate To introduce the leading nonparametric tests commonly used in biostatistics and how R is used to generate appropriate statistics for each test To introduce common figures typically associated with nonparametric data analysis and how R is used to generate appropriate figures in support of each data set The book focuses on how R is used to distinguish between data that could be classified as nonparametric as opposed to data that could be classified as parametric, with both approaches to data classification covered extensively. Following an introductory lesson on nonparametric statistics for the biological sciences, the book is organized into eight self-contained lessons on various analyses a...

  5. A Bayesian nonparametric estimation of distributions and quantiles

    International Nuclear Information System (INIS)

    Poern, K.

    1988-11-01

    The report describes a Bayesian, nonparametric method for the estimation of a distribution function and its quantiles. The method, presupposing random sampling, is nonparametric, so the user has to specify a prior distribution on a space of distributions (and not on a parameter space). In the current application, where the method is used to estimate the uncertainty of a parametric calculational model, the Dirichlet prior distribution is to a large extent determined by the first batch of Monte Carlo-realizations. In this case the results of the estimation technique is very similar to the conventional empirical distribution function. The resulting posterior distribution is also Dirichlet, and thus facilitates the determination of probability (confidence) intervals at any given point in the space of interest. Another advantage is that also the posterior distribution of a specified quantitle can be derived and utilized to determine a probability interval for that quantile. The method was devised for use in the PROPER code package for uncertainty and sensitivity analysis. (orig.)

  6. Bayesian nonparametric estimation of hazard rate in monotone Aalen model

    Czech Academy of Sciences Publication Activity Database

    Timková, Jana

    2014-01-01

    Roč. 50, č. 6 (2014), s. 849-868 ISSN 0023-5954 Institutional support: RVO:67985556 Keywords : Aalen model * Bayesian estimation * MCMC Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.541, year: 2014 http://library.utia.cas.cz/separaty/2014/SI/timkova-0438210.pdf

  7. Modeling Non-Gaussian Time Series with Nonparametric Bayesian Model.

    Science.gov (United States)

    Xu, Zhiguang; MacEachern, Steven; Xu, Xinyi

    2015-02-01

    We present a class of Bayesian copula models whose major components are the marginal (limiting) distribution of a stationary time series and the internal dynamics of the series. We argue that these are the two features with which an analyst is typically most familiar, and hence that these are natural components with which to work. For the marginal distribution, we use a nonparametric Bayesian prior distribution along with a cdf-inverse cdf transformation to obtain large support. For the internal dynamics, we rely on the traditionally successful techniques of normal-theory time series. Coupling the two components gives us a family of (Gaussian) copula transformed autoregressive models. The models provide coherent adjustments of time scales and are compatible with many extensions, including changes in volatility of the series. We describe basic properties of the models, show their ability to recover non-Gaussian marginal distributions, and use a GARCH modification of the basic model to analyze stock index return series. The models are found to provide better fit and improved short-range and long-range predictions than Gaussian competitors. The models are extensible to a large variety of fields, including continuous time models, spatial models, models for multiple series, models driven by external covariate streams, and non-stationary models.

  8. DPpackage: Bayesian Semi- and Nonparametric Modeling in R

    Directory of Open Access Journals (Sweden)

    Alejandro Jara

    2011-04-01

    Full Text Available Data analysis sometimes requires the relaxation of parametric assumptions in order to gain modeling flexibility and robustness against mis-specification of the probability model. In the Bayesian context, this is accomplished by placing a prior distribution on a function space, such as the space of all probability distributions or the space of all regression functions. Unfortunately, posterior distributions ranging over function spaces are highly complex and hence sampling methods play a key role. This paper provides an introduction to a simple, yet comprehensive, set of programs for the implementation of some Bayesian nonparametric and semiparametric models in R, DPpackage. Currently, DPpackage includes models for marginal and conditional density estimation, receiver operating characteristic curve analysis, interval-censored data, binary regression data, item response data, longitudinal and clustered data using generalized linear mixed models, and regression data using generalized additive models. The package also contains functions to compute pseudo-Bayes factors for model comparison and for eliciting the precision parameter of the Dirichlet process prior, and a general purpose Metropolis sampling algorithm. To maximize computational efficiency, the actual sampling for each model is carried out using compiled C, C++ or Fortran code.

  9. 2nd Conference of the International Society for Nonparametric Statistics

    CERN Document Server

    Manteiga, Wenceslao; Romo, Juan

    2016-01-01

    This volume collects selected, peer-reviewed contributions from the 2nd Conference of the International Society for Nonparametric Statistics (ISNPS), held in Cádiz (Spain) between June 11–16 2014, and sponsored by the American Statistical Association, the Institute of Mathematical Statistics, the Bernoulli Society for Mathematical Statistics and Probability, the Journal of Nonparametric Statistics and Universidad Carlos III de Madrid. The 15 articles are a representative sample of the 336 contributed papers presented at the conference. They cover topics such as high-dimensional data modelling, inference for stochastic processes and for dependent data, nonparametric and goodness-of-fit testing, nonparametric curve estimation, object-oriented data analysis, and semiparametric inference. The aim of the ISNPS 2014 conference was to bring together recent advances and trends in several areas of nonparametric statistics in order to facilitate the exchange of research ideas, promote collaboration among researchers...

  10. Nonparametric statistics with applications to science and engineering

    CERN Document Server

    Kvam, Paul H

    2007-01-01

    A thorough and definitive book that fully addresses traditional and modern-day topics of nonparametric statistics This book presents a practical approach to nonparametric statistical analysis and provides comprehensive coverage of both established and newly developed methods. With the use of MATLAB, the authors present information on theorems and rank tests in an applied fashion, with an emphasis on modern methods in regression and curve fitting, bootstrap confidence intervals, splines, wavelets, empirical likelihood, and goodness-of-fit testing. Nonparametric Statistics with Applications to Science and Engineering begins with succinct coverage of basic results for order statistics, methods of categorical data analysis, nonparametric regression, and curve fitting methods. The authors then focus on nonparametric procedures that are becoming more relevant to engineering researchers and practitioners. The important fundamental materials needed to effectively learn and apply the discussed methods are also provide...

  11. Recent Advances and Trends in Nonparametric Statistics

    CERN Document Server

    Akritas, MG

    2003-01-01

    The advent of high-speed, affordable computers in the last two decades has given a new boost to the nonparametric way of thinking. Classical nonparametric procedures, such as function smoothing, suddenly lost their abstract flavour as they became practically implementable. In addition, many previously unthinkable possibilities became mainstream; prime examples include the bootstrap and resampling methods, wavelets and nonlinear smoothers, graphical methods, data mining, bioinformatics, as well as the more recent algorithmic approaches such as bagging and boosting. This volume is a collection o

  12. Non-parametric Bayesian networks: Improving theory and reviewing applications

    International Nuclear Information System (INIS)

    Hanea, Anca; Morales Napoles, Oswaldo; Ababei, Dan

    2015-01-01

    Applications in various domains often lead to high dimensional dependence modelling. A Bayesian network (BN) is a probabilistic graphical model that provides an elegant way of expressing the joint distribution of a large number of interrelated variables. BNs have been successfully used to represent uncertain knowledge in a variety of fields. The majority of applications use discrete BNs, i.e. BNs whose nodes represent discrete variables. Integrating continuous variables in BNs is an area fraught with difficulty. Several methods that handle discrete-continuous BNs have been proposed in the literature. This paper concentrates only on one method called non-parametric BNs (NPBNs). NPBNs were introduced in 2004 and they have been or are currently being used in at least twelve professional applications. This paper provides a short introduction to NPBNs, a couple of theoretical advances, and an overview of applications. The aim of the paper is twofold: one is to present the latest improvements of the theory underlying NPBNs, and the other is to complement the existing overviews of BNs applications with the NPNBs applications. The latter opens the opportunity to discuss some difficulties that applications pose to the theoretical framework and in this way offers some NPBN modelling guidance to practitioners. - Highlights: • The paper gives an overview of the current NPBNs methodology. • We extend the NPBN methodology by relaxing the conditions of one of its fundamental theorems. • We propose improvements of the data mining algorithm for the NPBNs. • We review the professional applications of the NPBNs.

  13. Bayesian nonparametric clustering in phylogenetics: modeling antigenic evolution in influenza.

    Science.gov (United States)

    Cybis, Gabriela B; Sinsheimer, Janet S; Bedford, Trevor; Rambaut, Andrew; Lemey, Philippe; Suchard, Marc A

    2018-01-30

    Influenza is responsible for up to 500,000 deaths every year, and antigenic variability represents much of its epidemiological burden. To visualize antigenic differences across many viral strains, antigenic cartography methods use multidimensional scaling on binding assay data to map influenza antigenicity onto a low-dimensional space. Analysis of such assay data ideally leads to natural clustering of influenza strains of similar antigenicity that correlate with sequence evolution. To understand the dynamics of these antigenic groups, we present a framework that jointly models genetic and antigenic evolution by combining multidimensional scaling of binding assay data, Bayesian phylogenetic machinery and nonparametric clustering methods. We propose a phylogenetic Chinese restaurant process that extends the current process to incorporate the phylogenetic dependency structure between strains in the modeling of antigenic clusters. With this method, we are able to use the genetic information to better understand the evolution of antigenicity throughout epidemics, as shown in applications of this model to H1N1 influenza. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.

  14. Introduction to Bayesian statistics

    CERN Document Server

    Koch, Karl-Rudolf

    2007-01-01

    This book presents Bayes' theorem, the estimation of unknown parameters, the determination of confidence regions and the derivation of tests of hypotheses for the unknown parameters. It does so in a simple manner that is easy to comprehend. The book compares traditional and Bayesian methods with the rules of probability presented in a logical way allowing an intuitive understanding of random variables and their probability distributions to be formed.

  15. Nonparametric, Coupled ,Bayesian ,Dictionary ,and Classifier Learning for Hyperspectral Classification.

    Science.gov (United States)

    Akhtar, Naveed; Mian, Ajmal

    2017-10-03

    We present a principled approach to learn a discriminative dictionary along a linear classifier for hyperspectral classification. Our approach places Gaussian Process priors over the dictionary to account for the relative smoothness of the natural spectra, whereas the classifier parameters are sampled from multivariate Gaussians. We employ two Beta-Bernoulli processes to jointly infer the dictionary and the classifier. These processes are coupled under the same sets of Bernoulli distributions. In our approach, these distributions signify the frequency of the dictionary atom usage in representing class-specific training spectra, which also makes the dictionary discriminative. Due to the coupling between the dictionary and the classifier, the popularity of the atoms for representing different classes gets encoded into the classifier. This helps in predicting the class labels of test spectra that are first represented over the dictionary by solving a simultaneous sparse optimization problem. The labels of the spectra are predicted by feeding the resulting representations to the classifier. Our approach exploits the nonparametric Bayesian framework to automatically infer the dictionary size--the key parameter in discriminative dictionary learning. Moreover, it also has the desirable property of adaptively learning the association between the dictionary atoms and the class labels by itself. We use Gibbs sampling to infer the posterior probability distributions over the dictionary and the classifier under the proposed model, for which, we derive analytical expressions. To establish the effectiveness of our approach, we test it on benchmark hyperspectral images. The classification performance is compared with the state-of-the-art dictionary learning-based classification methods.

  16. A nonparametric spatial scan statistic for continuous data.

    Science.gov (United States)

    Jung, Inkyung; Cho, Ho Jin

    2015-10-20

    Spatial scan statistics are widely used for spatial cluster detection, and several parametric models exist. For continuous data, a normal-based scan statistic can be used. However, the performance of the model has not been fully evaluated for non-normal data. We propose a nonparametric spatial scan statistic based on the Wilcoxon rank-sum test statistic and compared the performance of the method with parametric models via a simulation study under various scenarios. The nonparametric method outperforms the normal-based scan statistic in terms of power and accuracy in almost all cases under consideration in the simulation study. The proposed nonparametric spatial scan statistic is therefore an excellent alternative to the normal model for continuous data and is especially useful for data following skewed or heavy-tailed distributions.

  17. A nonparametric Bayesian approach for genetic evaluation in ...

    African Journals Online (AJOL)

    South African Journal of Animal Science ... the Bayesian and Classical models, a Bayesian procedure is provided which allows these random ... data from the Elsenburg Dormer sheep stud and data from a simulation experiment are utilized. >

  18. Application of nonparametric statistic method for DNBR limit calculation

    International Nuclear Information System (INIS)

    Dong Bo; Kuang Bo; Zhu Xuenong

    2013-01-01

    Background: Nonparametric statistical method is a kind of statistical inference method not depending on a certain distribution; it calculates the tolerance limits under certain probability level and confidence through sampling methods. The DNBR margin is one important parameter of NPP design, which presents the safety level of NPP. Purpose and Methods: This paper uses nonparametric statistical method basing on Wilks formula and VIPER-01 subchannel analysis code to calculate the DNBR design limits (DL) of 300 MW NPP (Nuclear Power Plant) during the complete loss of flow accident, simultaneously compared with the DL of DNBR through means of ITDP to get certain DNBR margin. Results: The results indicate that this method can gain 2.96% DNBR margin more than that obtained by ITDP methodology. Conclusions: Because of the reduction of the conservation during analysis process, the nonparametric statistical method can provide greater DNBR margin and the increase of DNBR margin is benefited for the upgrading of core refuel scheme. (authors)

  19. How to practise Bayesian statistics outside the Bayesian church: What philosophy for Bayesian statistical modelling?

    NARCIS (Netherlands)

    Borsboom, D.; Haig, B.D.

    2013-01-01

    Unlike most other statistical frameworks, Bayesian statistical inference is wedded to a particular approach in the philosophy of science (see Howson & Urbach, 2006); this approach is called Bayesianism. Rather than being concerned with model fitting, this position in the philosophy of science

  20. Methodology in robust and nonparametric statistics

    CERN Document Server

    Jurecková, Jana; Picek, Jan

    2012-01-01

    Introduction and SynopsisIntroductionSynopsisPreliminariesIntroductionInference in Linear ModelsRobustness ConceptsRobust and Minimax Estimation of LocationClippings from Probability and Asymptotic TheoryProblemsRobust Estimation of Location and RegressionIntroductionM-EstimatorsL-EstimatorsR-EstimatorsMinimum Distance and Pitman EstimatorsDifferentiable Statistical FunctionsProblemsAsymptotic Representations for L-Estimators

  1. A nonparametric Bayesian approach for genetic evaluation in ...

    African Journals Online (AJOL)

    Unknown

    Finally, one can report the whole of the posterior probability distributions of the parameters in ... the Markov Chain Monte Carlo Methods, and more specific Gibbs Sampling, these ...... Bayesian Methods in Animal Breeding Theory. J. Anim. Sci.

  2. Non-Parametric Bayesian Updating within the Assessment of Reliability for Offshore Wind Turbine Support Structures

    DEFF Research Database (Denmark)

    Ramirez, José Rangel; Sørensen, John Dalsgaard

    2011-01-01

    This work illustrates the updating and incorporation of information in the assessment of fatigue reliability for offshore wind turbine. The new information, coming from external and condition monitoring can be used to direct updating of the stochastic variables through a non-parametric Bayesian u...

  3. A Bayesian Beta-Mixture Model for Nonparametric IRT (BBM-IRT)

    Science.gov (United States)

    Arenson, Ethan A.; Karabatsos, George

    2017-01-01

    Item response models typically assume that the item characteristic (step) curves follow a logistic or normal cumulative distribution function, which are strictly monotone functions of person test ability. Such assumptions can be overly-restrictive for real item response data. We propose a simple and more flexible Bayesian nonparametric IRT model…

  4. A non-parametric Bayesian approach to decompounding from high frequency data

    NARCIS (Netherlands)

    Gugushvili, Shota; van der Meulen, F.H.; Spreij, Peter

    2016-01-01

    Given a sample from a discretely observed compound Poisson process, we consider non-parametric estimation of the density f0 of its jump sizes, as well as of its intensity λ0. We take a Bayesian approach to the problem and specify the prior on f0 as the Dirichlet location mixture of normal densities.

  5. Nonparametric statistics a step-by-step approach

    CERN Document Server

    Corder, Gregory W

    2014-01-01

    "…a very useful resource for courses in nonparametric statistics in which the emphasis is on applications rather than on theory.  It also deserves a place in libraries of all institutions where introductory statistics courses are taught."" -CHOICE This Second Edition presents a practical and understandable approach that enhances and expands the statistical toolset for readers. This book includes: New coverage of the sign test and the Kolmogorov-Smirnov two-sample test in an effort to offer a logical and natural progression to statistical powerSPSS® (Version 21) software and updated screen ca

  6. Scalable Bayesian nonparametric measures for exploring pairwise dependence via Dirichlet Process Mixtures.

    Science.gov (United States)

    Filippi, Sarah; Holmes, Chris C; Nieto-Barajas, Luis E

    2016-11-16

    In this article we propose novel Bayesian nonparametric methods using Dirichlet Process Mixture (DPM) models for detecting pairwise dependence between random variables while accounting for uncertainty in the form of the underlying distributions. A key criteria is that the procedures should scale to large data sets. In this regard we find that the formal calculation of the Bayes factor for a dependent-vs.-independent DPM joint probability measure is not feasible computationally. To address this we present Bayesian diagnostic measures for characterising evidence against a "null model" of pairwise independence. In simulation studies, as well as for a real data analysis, we show that our approach provides a useful tool for the exploratory nonparametric Bayesian analysis of large multivariate data sets.

  7. A menu-driven software package of Bayesian nonparametric (and parametric) mixed models for regression analysis and density estimation.

    Science.gov (United States)

    Karabatsos, George

    2017-02-01

    Most of applied statistics involves regression analysis of data. In practice, it is important to specify a regression model that has minimal assumptions which are not violated by data, to ensure that statistical inferences from the model are informative and not misleading. This paper presents a stand-alone and menu-driven software package, Bayesian Regression: Nonparametric and Parametric Models, constructed from MATLAB Compiler. Currently, this package gives the user a choice from 83 Bayesian models for data analysis. They include 47 Bayesian nonparametric (BNP) infinite-mixture regression models; 5 BNP infinite-mixture models for density estimation; and 31 normal random effects models (HLMs), including normal linear models. Each of the 78 regression models handles either a continuous, binary, or ordinal dependent variable, and can handle multi-level (grouped) data. All 83 Bayesian models can handle the analysis of weighted observations (e.g., for meta-analysis), and the analysis of left-censored, right-censored, and/or interval-censored data. Each BNP infinite-mixture model has a mixture distribution assigned one of various BNP prior distributions, including priors defined by either the Dirichlet process, Pitman-Yor process (including the normalized stable process), beta (two-parameter) process, normalized inverse-Gaussian process, geometric weights prior, dependent Dirichlet process, or the dependent infinite-probits prior. The software user can mouse-click to select a Bayesian model and perform data analysis via Markov chain Monte Carlo (MCMC) sampling. After the sampling completes, the software automatically opens text output that reports MCMC-based estimates of the model's posterior distribution and model predictive fit to the data. Additional text and/or graphical output can be generated by mouse-clicking other menu options. This includes output of MCMC convergence analyses, and estimates of the model's posterior predictive distribution, for selected

  8. MAP estimators and their consistency in Bayesian nonparametric inverse problems

    KAUST Repository

    Dashti, M.

    2013-09-01

    We consider the inverse problem of estimating an unknown function u from noisy measurements y of a known, possibly nonlinear, map applied to u. We adopt a Bayesian approach to the problem and work in a setting where the prior measure is specified as a Gaussian random field μ0. We work under a natural set of conditions on the likelihood which implies the existence of a well-posed posterior measure, μy. Under these conditions, we show that the maximum a posteriori (MAP) estimator is well defined as the minimizer of an Onsager-Machlup functional defined on the Cameron-Martin space of the prior; thus, we link a problem in probability with a problem in the calculus of variations. We then consider the case where the observational noise vanishes and establish a form of Bayesian posterior consistency for the MAP estimator. We also prove a similar result for the case where the observation of can be repeated as many times as desired with independent identically distributed noise. The theory is illustrated with examples from an inverse problem for the Navier-Stokes equation, motivated by problems arising in weather forecasting, and from the theory of conditioned diffusions, motivated by problems arising in molecular dynamics. © 2013 IOP Publishing Ltd.

  9. MAP estimators and their consistency in Bayesian nonparametric inverse problems

    International Nuclear Information System (INIS)

    Dashti, M; Law, K J H; Stuart, A M; Voss, J

    2013-01-01

    We consider the inverse problem of estimating an unknown function u from noisy measurements y of a known, possibly nonlinear, map G applied to u. We adopt a Bayesian approach to the problem and work in a setting where the prior measure is specified as a Gaussian random field μ 0 . We work under a natural set of conditions on the likelihood which implies the existence of a well-posed posterior measure, μ y . Under these conditions, we show that the maximum a posteriori (MAP) estimator is well defined as the minimizer of an Onsager–Machlup functional defined on the Cameron–Martin space of the prior; thus, we link a problem in probability with a problem in the calculus of variations. We then consider the case where the observational noise vanishes and establish a form of Bayesian posterior consistency for the MAP estimator. We also prove a similar result for the case where the observation of G(u) can be repeated as many times as desired with independent identically distributed noise. The theory is illustrated with examples from an inverse problem for the Navier–Stokes equation, motivated by problems arising in weather forecasting, and from the theory of conditioned diffusions, motivated by problems arising in molecular dynamics. (paper)

  10. A Bayesian approach to the analysis of quantal bioassay studies using nonparametric mixture models.

    Science.gov (United States)

    Fronczyk, Kassandra; Kottas, Athanasios

    2014-03-01

    We develop a Bayesian nonparametric mixture modeling framework for quantal bioassay settings. The approach is built upon modeling dose-dependent response distributions. We adopt a structured nonparametric prior mixture model, which induces a monotonicity restriction for the dose-response curve. Particular emphasis is placed on the key risk assessment goal of calibration for the dose level that corresponds to a specified response. The proposed methodology yields flexible inference for the dose-response relationship as well as for other inferential objectives, as illustrated with two data sets from the literature. © 2013, The International Biometric Society.

  11. Application of nonparametric statistics to material strength/reliability assessment

    International Nuclear Information System (INIS)

    Arai, Taketoshi

    1992-01-01

    An advanced material technology requires data base on a wide variety of material behavior which need to be established experimentally. It may often happen that experiments are practically limited in terms of reproducibility or a range of test parameters. Statistical methods can be applied to understanding uncertainties in such a quantitative manner as required from the reliability point of view. Statistical assessment involves determinations of a most probable value and the maximum and/or minimum value as one-sided or two-sided confidence limit. A scatter of test data can be approximated by a theoretical distribution only if the goodness of fit satisfies a test criterion. Alternatively, nonparametric statistics (NPS) or distribution-free statistics can be applied. Mathematical procedures by NPS are well established for dealing with most reliability problems. They handle only order statistics of a sample. Mathematical formulas and some applications to engineering assessments are described. They include confidence limits of median, population coverage of sample, required minimum number of a sample, and confidence limits of fracture probability. These applications demonstrate that a nonparametric statistical estimation is useful in logical decision making in the case a large uncertainty exists. (author)

  12. Bayesian Nonparametric Mixture Estimation for Time-Indexed Functional Data in R

    Directory of Open Access Journals (Sweden)

    Terrance D. Savitsky

    2016-08-01

    Full Text Available We present growfunctions for R that offers Bayesian nonparametric estimation models for analysis of dependent, noisy time series data indexed by a collection of domains. This data structure arises from combining periodically published government survey statistics, such as are reported in the Current Population Study (CPS. The CPS publishes monthly, by-state estimates of employment levels, where each state expresses a noisy time series. Published state-level estimates from the CPS are composed from household survey responses in a model-free manner and express high levels of volatility due to insufficient sample sizes. Existing software solutions borrow information over a modeled time-based dependence to extract a de-noised time series for each domain. These solutions, however, ignore the dependence among the domains that may be additionally leveraged to improve estimation efficiency. The growfunctions package offers two fully nonparametric mixture models that simultaneously estimate both a time and domain-indexed dependence structure for a collection of time series: (1 A Gaussian process (GP construction, which is parameterized through the covariance matrix, estimates a latent function for each domain. The covariance parameters of the latent functions are indexed by domain under a Dirichlet process prior that permits estimation of the dependence among functions across the domains: (2 An intrinsic Gaussian Markov random field prior construction provides an alternative to the GP that expresses different computation and estimation properties. In addition to performing denoised estimation of latent functions from published domain estimates, growfunctions allows estimation of collections of functions for observation units (e.g., households, rather than aggregated domains, by accounting for an informative sampling design under which the probabilities for inclusion of observation units are related to the response variable. growfunctions includes plot

  13. STATCAT, Statistical Analysis of Parametric and Non-Parametric Data

    International Nuclear Information System (INIS)

    David, Hugh

    1990-01-01

    1 - Description of program or function: A suite of 26 programs designed to facilitate the appropriate statistical analysis and data handling of parametric and non-parametric data, using classical and modern univariate and multivariate methods. 2 - Method of solution: Data is read entry by entry, using a choice of input formats, and the resultant data bank is checked for out-of- range, rare, extreme or missing data. The completed STATCAT data bank can be treated by a variety of descriptive and inferential statistical methods, and modified, using other standard programs as required

  14. Categorical and nonparametric data analysis choosing the best statistical technique

    CERN Document Server

    Nussbaum, E Michael

    2014-01-01

    Featuring in-depth coverage of categorical and nonparametric statistics, this book provides a conceptual framework for choosing the most appropriate type of test in various research scenarios. Class tested at the University of Nevada, the book's clear explanations of the underlying assumptions, computer simulations, and Exploring the Concept boxes help reduce reader anxiety. Problems inspired by actual studies provide meaningful illustrations of the techniques. The underlying assumptions of each test and the factors that impact validity and statistical power are reviewed so readers can explain

  15. Bayesian Nonparametric Model for Estimating Multistate Travel Time Distribution

    Directory of Open Access Journals (Sweden)

    Emmanuel Kidando

    2017-01-01

    Full Text Available Multistate models, that is, models with more than two distributions, are preferred over single-state probability models in modeling the distribution of travel time. Literature review indicated that the finite multistate modeling of travel time using lognormal distribution is superior to other probability functions. In this study, we extend the finite multistate lognormal model of estimating the travel time distribution to unbounded lognormal distribution. In particular, a nonparametric Dirichlet Process Mixture Model (DPMM with stick-breaking process representation was used. The strength of the DPMM is that it can choose the number of components dynamically as part of the algorithm during parameter estimation. To reduce computational complexity, the modeling process was limited to a maximum of six components. Then, the Markov Chain Monte Carlo (MCMC sampling technique was employed to estimate the parameters’ posterior distribution. Speed data from nine links of a freeway corridor, aggregated on a 5-minute basis, were used to calculate the corridor travel time. The results demonstrated that this model offers significant flexibility in modeling to account for complex mixture distributions of the travel time without specifying the number of components. The DPMM modeling further revealed that freeway travel time is characterized by multistate or single-state models depending on the inclusion of onset and offset of congestion periods.

  16. 12th Brazilian Meeting on Bayesian Statistics

    CERN Document Server

    Louzada, Francisco; Rifo, Laura; Stern, Julio; Lauretto, Marcelo

    2015-01-01

    Through refereed papers, this volume focuses on the foundations of the Bayesian paradigm; their comparison to objectivistic or frequentist Statistics counterparts; and the appropriate application of Bayesian foundations. This research in Bayesian Statistics is applicable to data analysis in biostatistics, clinical trials, law, engineering, and the social sciences. EBEB, the Brazilian Meeting on Bayesian Statistics, is held every two years by the ISBrA, the International Society for Bayesian Analysis, one of the most active chapters of the ISBA. The 12th meeting took place March 10-14, 2014 in Atibaia. Interest in foundations of inductive Statistics has grown recently in accordance with the increasing availability of Bayesian methodological alternatives. Scientists need to deal with the ever more difficult choice of the optimal method to apply to their problem. This volume shows how Bayes can be the answer. The examination and discussion on the foundations work towards the goal of proper application of Bayesia...

  17. A Bayesian nonparametric approach to reconstruction and prediction of random dynamical systems

    Science.gov (United States)

    Merkatas, Christos; Kaloudis, Konstantinos; Hatjispyros, Spyridon J.

    2017-06-01

    We propose a Bayesian nonparametric mixture model for the reconstruction and prediction from observed time series data, of discretized stochastic dynamical systems, based on Markov Chain Monte Carlo methods. Our results can be used by researchers in physical modeling interested in a fast and accurate estimation of low dimensional stochastic models when the size of the observed time series is small and the noise process (perhaps) is non-Gaussian. The inference procedure is demonstrated specifically in the case of polynomial maps of an arbitrary degree and when a Geometric Stick Breaking mixture process prior over the space of densities, is applied to the additive errors. Our method is parsimonious compared to Bayesian nonparametric techniques based on Dirichlet process mixtures, flexible and general. Simulations based on synthetic time series are presented.

  18. A Bayesian nonparametric approach to reconstruction and prediction of random dynamical systems.

    Science.gov (United States)

    Merkatas, Christos; Kaloudis, Konstantinos; Hatjispyros, Spyridon J

    2017-06-01

    We propose a Bayesian nonparametric mixture model for the reconstruction and prediction from observed time series data, of discretized stochastic dynamical systems, based on Markov Chain Monte Carlo methods. Our results can be used by researchers in physical modeling interested in a fast and accurate estimation of low dimensional stochastic models when the size of the observed time series is small and the noise process (perhaps) is non-Gaussian. The inference procedure is demonstrated specifically in the case of polynomial maps of an arbitrary degree and when a Geometric Stick Breaking mixture process prior over the space of densities, is applied to the additive errors. Our method is parsimonious compared to Bayesian nonparametric techniques based on Dirichlet process mixtures, flexible and general. Simulations based on synthetic time series are presented.

  19. Scalable Bayesian nonparametric regression via a Plackett-Luce model for conditional ranks

    Science.gov (United States)

    Gray-Davies, Tristan; Holmes, Chris C.; Caron, François

    2018-01-01

    We present a novel Bayesian nonparametric regression model for covariates X and continuous response variable Y ∈ ℝ. The model is parametrized in terms of marginal distributions for Y and X and a regression function which tunes the stochastic ordering of the conditional distributions F (y|x). By adopting an approximate composite likelihood approach, we show that the resulting posterior inference can be decoupled for the separate components of the model. This procedure can scale to very large datasets and allows for the use of standard, existing, software from Bayesian nonparametric density estimation and Plackett-Luce ranking estimation to be applied. As an illustration, we show an application of our approach to a US Census dataset, with over 1,300,000 data points and more than 100 covariates. PMID:29623150

  20. Driving Style Analysis Using Primitive Driving Patterns With Bayesian Nonparametric Approaches

    OpenAIRE

    Wang, Wenshuo; Xi, Junqiang; Zhao, Ding

    2017-01-01

    Analysis and recognition of driving styles are profoundly important to intelligent transportation and vehicle calibration. This paper presents a novel driving style analysis framework using the primitive driving patterns learned from naturalistic driving data. In order to achieve this, first, a Bayesian nonparametric learning method based on a hidden semi-Markov model (HSMM) is introduced to extract primitive driving patterns from time series driving data without prior knowledge of the number...

  1. A parametric interpretation of Bayesian Nonparametric Inference from Gene Genealogies: Linking ecological, population genetics and evolutionary processes.

    Science.gov (United States)

    Ponciano, José Miguel

    2017-11-22

    Using a nonparametric Bayesian approach Palacios and Minin (2013) dramatically improved the accuracy, precision of Bayesian inference of population size trajectories from gene genealogies. These authors proposed an extension of a Gaussian Process (GP) nonparametric inferential method for the intensity function of non-homogeneous Poisson processes. They found that not only the statistical properties of the estimators were improved with their method, but also, that key aspects of the demographic histories were recovered. The authors' work represents the first Bayesian nonparametric solution to this inferential problem because they specify a convenient prior belief without a particular functional form on the population trajectory. Their approach works so well and provides such a profound understanding of the biological process, that the question arises as to how truly "biology-free" their approach really is. Using well-known concepts of stochastic population dynamics, here I demonstrate that in fact, Palacios and Minin's GP model can be cast as a parametric population growth model with density dependence and environmental stochasticity. Making this link between population genetics and stochastic population dynamics modeling provides novel insights into eliciting biologically meaningful priors for the trajectory of the effective population size. The results presented here also bring novel understanding of GP as models for the evolution of a trait. Thus, the ecological principles foundation of Palacios and Minin (2013)'s prior adds to the conceptual and scientific value of these authors' inferential approach. I conclude this note by listing a series of insights brought about by this connection with Ecology. Copyright © 2017 The Author. Published by Elsevier Inc. All rights reserved.

  2. Bayesian models: A statistical primer for ecologists

    Science.gov (United States)

    Hobbs, N. Thompson; Hooten, Mevin B.

    2015-01-01

    Bayesian modeling has become an indispensable tool for ecological research because it is uniquely suited to deal with complexity in a statistically coherent way. This textbook provides a comprehensive and accessible introduction to the latest Bayesian methods—in language ecologists can understand. Unlike other books on the subject, this one emphasizes the principles behind the computations, giving ecologists a big-picture understanding of how to implement this powerful statistical approach.Bayesian Models is an essential primer for non-statisticians. It begins with a definition of probability and develops a step-by-step sequence of connected ideas, including basic distribution theory, network diagrams, hierarchical models, Markov chain Monte Carlo, and inference from single and multiple models. This unique book places less emphasis on computer coding, favoring instead a concise presentation of the mathematical statistics needed to understand how and why Bayesian analysis works. It also explains how to write out properly formulated hierarchical Bayesian models and use them in computing, research papers, and proposals.This primer enables ecologists to understand the statistical principles behind Bayesian modeling and apply them to research, teaching, policy, and management.Presents the mathematical and statistical foundations of Bayesian modeling in language accessible to non-statisticiansCovers basic distribution theory, network diagrams, hierarchical models, Markov chain Monte Carlo, and moreDeemphasizes computer coding in favor of basic principlesExplains how to write out properly factored statistical expressions representing Bayesian models

  3. 1st Conference of the International Society for Nonparametric Statistics

    CERN Document Server

    Lahiri, S; Politis, Dimitris

    2014-01-01

    This volume is composed of peer-reviewed papers that have developed from the First Conference of the International Society for NonParametric Statistics (ISNPS). This inaugural conference took place in Chalkidiki, Greece, June 15-19, 2012. It was organized with the co-sponsorship of the IMS, the ISI, and other organizations. M.G. Akritas, S.N. Lahiri, and D.N. Politis are the first executive committee members of ISNPS, and the editors of this volume. ISNPS has a distinguished Advisory Committee that includes Professors R.Beran, P.Bickel, R. Carroll, D. Cook, P. Hall, R. Johnson, B. Lindsay, E. Parzen, P. Robinson, M. Rosenblatt, G. Roussas, T. SubbaRao, and G. Wahba. The Charting Committee of ISNPS consists of more than 50 prominent researchers from all over the world.   The chapters in this volume bring forth recent advances and trends in several areas of nonparametric statistics. In this way, the volume facilitates the exchange of research ideas, promotes collaboration among researchers from all over the wo...

  4. Bayesian models a statistical primer for ecologists

    CERN Document Server

    Hobbs, N Thompson

    2015-01-01

    Bayesian modeling has become an indispensable tool for ecological research because it is uniquely suited to deal with complexity in a statistically coherent way. This textbook provides a comprehensive and accessible introduction to the latest Bayesian methods-in language ecologists can understand. Unlike other books on the subject, this one emphasizes the principles behind the computations, giving ecologists a big-picture understanding of how to implement this powerful statistical approach. Bayesian Models is an essential primer for non-statisticians. It begins with a definition of probabili

  5. Bayesian Non-Parametric Mixtures of GARCH(1,1 Models

    Directory of Open Access Journals (Sweden)

    John W. Lau

    2012-01-01

    Full Text Available Traditional GARCH models describe volatility levels that evolve smoothly over time, generated by a single GARCH regime. However, nonstationary time series data may exhibit abrupt changes in volatility, suggesting changes in the underlying GARCH regimes. Further, the number and times of regime changes are not always obvious. This article outlines a nonparametric mixture of GARCH models that is able to estimate the number and time of volatility regime changes by mixing over the Poisson-Kingman process. The process is a generalisation of the Dirichlet process typically used in nonparametric models for time-dependent data provides a richer clustering structure, and its application to time series data is novel. Inference is Bayesian, and a Markov chain Monte Carlo algorithm to explore the posterior distribution is described. The methodology is illustrated on the Standard and Poor's 500 financial index.

  6. Bayesian nonparametric estimation of continuous monotone functions with applications to dose-response analysis.

    Science.gov (United States)

    Bornkamp, Björn; Ickstadt, Katja

    2009-03-01

    In this article, we consider monotone nonparametric regression in a Bayesian framework. The monotone function is modeled as a mixture of shifted and scaled parametric probability distribution functions, and a general random probability measure is assumed as the prior for the mixing distribution. We investigate the choice of the underlying parametric distribution function and find that the two-sided power distribution function is well suited both from a computational and mathematical point of view. The model is motivated by traditional nonlinear models for dose-response analysis, and provides possibilities to elicitate informative prior distributions on different aspects of the curve. The method is compared with other recent approaches to monotone nonparametric regression in a simulation study and is illustrated on a data set from dose-response analysis.

  7. Analysing the length of care episode after hip fracture: a nonparametric and a parametric Bayesian approach.

    Science.gov (United States)

    Riihimäki, Jaakko; Sund, Reijo; Vehtari, Aki

    2010-06-01

    Effective utilisation of limited resources is a challenge for health care providers. Accurate and relevant information extracted from the length of stay distributions is useful for management purposes. Patient care episodes can be reconstructed from the comprehensive health registers, and in this paper we develop a Bayesian approach to analyse the length of care episode after a fractured hip. We model the large scale data with a flexible nonparametric multilayer perceptron network and with a parametric Weibull mixture model. To assess the performances of the models, we estimate expected utilities using predictive density as a utility measure. Since the model parameters cannot be directly compared, we focus on observables, and estimate the relevances of patient explanatory variables in predicting the length of stay. To demonstrate how the use of the nonparametric flexible model is advantageous for this complex health care data, we also study joint effects of variables in predictions, and visualise nonlinearities and interactions found in the data.

  8. Hierarchical Bayesian nonparametric mixture models for clustering with variable relevance determination.

    Science.gov (United States)

    Yau, Christopher; Holmes, Chris

    2011-07-01

    We propose a hierarchical Bayesian nonparametric mixture model for clustering when some of the covariates are assumed to be of varying relevance to the clustering problem. This can be thought of as an issue in variable selection for unsupervised learning. We demonstrate that by defining a hierarchical population based nonparametric prior on the cluster locations scaled by the inverse covariance matrices of the likelihood we arrive at a 'sparsity prior' representation which admits a conditionally conjugate prior. This allows us to perform full Gibbs sampling to obtain posterior distributions over parameters of interest including an explicit measure of each covariate's relevance and a distribution over the number of potential clusters present in the data. This also allows for individual cluster specific variable selection. We demonstrate improved inference on a number of canonical problems.

  9. Non-parametric Bayesian graph models reveal community structure in resting state fMRI

    DEFF Research Database (Denmark)

    Andersen, Kasper Winther; Madsen, Kristoffer H.; Siebner, Hartwig Roman

    2014-01-01

    Modeling of resting state functional magnetic resonance imaging (rs-fMRI) data using network models is of increasing interest. It is often desirable to group nodes into clusters to interpret the communication patterns between nodes. In this study we consider three different nonparametric Bayesian...... models for node clustering in complex networks. In particular, we test their ability to predict unseen data and their ability to reproduce clustering across datasets. The three generative models considered are the Infinite Relational Model (IRM), Bayesian Community Detection (BCD), and the Infinite...... between clusters. BCD restricts the between-cluster link probabilities to be strictly lower than within-cluster link probabilities to conform to the community structure typically seen in social networks. IDM only models a single between-cluster link probability, which can be interpreted as a background...

  10. Philosophy and the practice of Bayesian statistics.

    Science.gov (United States)

    Gelman, Andrew; Shalizi, Cosma Rohilla

    2013-02-01

    A substantial school in the philosophy of science identifies Bayesian inference with inductive inference and even rationality as such, and seems to be strengthened by the rise and practical success of Bayesian statistics. We argue that the most successful forms of Bayesian statistics do not actually support that particular philosophy but rather accord much better with sophisticated forms of hypothetico-deductivism. We examine the actual role played by prior distributions in Bayesian models, and the crucial aspects of model checking and model revision, which fall outside the scope of Bayesian confirmation theory. We draw on the literature on the consistency of Bayesian updating and also on our experience of applied work in social science. Clarity about these matters should benefit not just philosophy of science, but also statistical practice. At best, the inductivist view has encouraged researchers to fit and compare models without checking them; at worst, theorists have actively discouraged practitioners from performing model checking because it does not fit into their framework. © 2012 The British Psychological Society.

  11. Bayesian Inference in Statistical Analysis

    CERN Document Server

    Box, George E P

    2011-01-01

    The Wiley Classics Library consists of selected books that have become recognized classics in their respective fields. With these new unabridged and inexpensive editions, Wiley hopes to extend the life of these important works by making them available to future generations of mathematicians and scientists. Currently available in the Series: T. W. Anderson The Statistical Analysis of Time Series T. S. Arthanari & Yadolah Dodge Mathematical Programming in Statistics Emil Artin Geometric Algebra Norman T. J. Bailey The Elements of Stochastic Processes with Applications to the Natural Sciences Rob

  12. Efficient nonparametric and asymptotic Bayesian model selection methods for attributed graph clustering

    KAUST Repository

    Xu, Zhiqiang

    2017-02-16

    Attributed graph clustering, also known as community detection on attributed graphs, attracts much interests recently due to the ubiquity of attributed graphs in real life. Many existing algorithms have been proposed for this problem, which are either distance based or model based. However, model selection in attributed graph clustering has not been well addressed, that is, most existing algorithms assume the cluster number to be known a priori. In this paper, we propose two efficient approaches for attributed graph clustering with automatic model selection. The first approach is a popular Bayesian nonparametric method, while the second approach is an asymptotic method based on a recently proposed model selection criterion, factorized information criterion. Experimental results on both synthetic and real datasets demonstrate that our approaches for attributed graph clustering with automatic model selection significantly outperform the state-of-the-art algorithm.

  13. Efficient nonparametric and asymptotic Bayesian model selection methods for attributed graph clustering

    KAUST Repository

    Xu, Zhiqiang; Cheng, James; Xiao, Xiaokui; Fujimaki, Ryohei; Muraoka, Yusuke

    2017-01-01

    Attributed graph clustering, also known as community detection on attributed graphs, attracts much interests recently due to the ubiquity of attributed graphs in real life. Many existing algorithms have been proposed for this problem, which are either distance based or model based. However, model selection in attributed graph clustering has not been well addressed, that is, most existing algorithms assume the cluster number to be known a priori. In this paper, we propose two efficient approaches for attributed graph clustering with automatic model selection. The first approach is a popular Bayesian nonparametric method, while the second approach is an asymptotic method based on a recently proposed model selection criterion, factorized information criterion. Experimental results on both synthetic and real datasets demonstrate that our approaches for attributed graph clustering with automatic model selection significantly outperform the state-of-the-art algorithm.

  14. A BAYESIAN NONPARAMETRIC MIXTURE MODEL FOR SELECTING GENES AND GENE SUBNETWORKS.

    Science.gov (United States)

    Zhao, Yize; Kang, Jian; Yu, Tianwei

    2014-06-01

    It is very challenging to select informative features from tens of thousands of measured features in high-throughput data analysis. Recently, several parametric/regression models have been developed utilizing the gene network information to select genes or pathways strongly associated with a clinical/biological outcome. Alternatively, in this paper, we propose a nonparametric Bayesian model for gene selection incorporating network information. In addition to identifying genes that have a strong association with a clinical outcome, our model can select genes with particular expressional behavior, in which case the regression models are not directly applicable. We show that our proposed model is equivalent to an infinity mixture model for which we develop a posterior computation algorithm based on Markov chain Monte Carlo (MCMC) methods. We also propose two fast computing algorithms that approximate the posterior simulation with good accuracy but relatively low computational cost. We illustrate our methods on simulation studies and the analysis of Spellman yeast cell cycle microarray data.

  15. Bayesian Nonparametric Regression Analysis of Data with Random Effects Covariates from Longitudinal Measurements

    KAUST Repository

    Ryu, Duchwan

    2010-09-28

    We consider nonparametric regression analysis in a generalized linear model (GLM) framework for data with covariates that are the subject-specific random effects of longitudinal measurements. The usual assumption that the effects of the longitudinal covariate processes are linear in the GLM may be unrealistic and if this happens it can cast doubt on the inference of observed covariate effects. Allowing the regression functions to be unknown, we propose to apply Bayesian nonparametric methods including cubic smoothing splines or P-splines for the possible nonlinearity and use an additive model in this complex setting. To improve computational efficiency, we propose the use of data-augmentation schemes. The approach allows flexible covariance structures for the random effects and within-subject measurement errors of the longitudinal processes. The posterior model space is explored through a Markov chain Monte Carlo (MCMC) sampler. The proposed methods are illustrated and compared to other approaches, the "naive" approach and the regression calibration, via simulations and by an application that investigates the relationship between obesity in adulthood and childhood growth curves. © 2010, The International Biometric Society.

  16. Bayesian nonparametric inference on quantile residual life function: Application to breast cancer data.

    Science.gov (United States)

    Park, Taeyoung; Jeong, Jong-Hyeon; Lee, Jae Won

    2012-08-15

    There is often an interest in estimating a residual life function as a summary measure of survival data. For ease in presentation of the potential therapeutic effect of a new drug, investigators may summarize survival data in terms of the remaining life years of patients. Under heavy right censoring, however, some reasonably high quantiles (e.g., median) of a residual lifetime distribution cannot be always estimated via a popular nonparametric approach on the basis of the Kaplan-Meier estimator. To overcome the difficulties in dealing with heavily censored survival data, this paper develops a Bayesian nonparametric approach that takes advantage of a fully model-based but highly flexible probabilistic framework. We use a Dirichlet process mixture of Weibull distributions to avoid strong parametric assumptions on the unknown failure time distribution, making it possible to estimate any quantile residual life function under heavy censoring. Posterior computation through Markov chain Monte Carlo is straightforward and efficient because of conjugacy properties and partial collapse. We illustrate the proposed methods by using both simulated data and heavily censored survival data from a recent breast cancer clinical trial conducted by the National Surgical Adjuvant Breast and Bowel Project. Copyright © 2012 John Wiley & Sons, Ltd.

  17. Gaussian process-based Bayesian nonparametric inference of population size trajectories from gene genealogies.

    Science.gov (United States)

    Palacios, Julia A; Minin, Vladimir N

    2013-03-01

    Changes in population size influence genetic diversity of the population and, as a result, leave a signature of these changes in individual genomes in the population. We are interested in the inverse problem of reconstructing past population dynamics from genomic data. We start with a standard framework based on the coalescent, a stochastic process that generates genealogies connecting randomly sampled individuals from the population of interest. These genealogies serve as a glue between the population demographic history and genomic sequences. It turns out that only the times of genealogical lineage coalescences contain information about population size dynamics. Viewing these coalescent times as a point process, estimating population size trajectories is equivalent to estimating a conditional intensity of this point process. Therefore, our inverse problem is similar to estimating an inhomogeneous Poisson process intensity function. We demonstrate how recent advances in Gaussian process-based nonparametric inference for Poisson processes can be extended to Bayesian nonparametric estimation of population size dynamics under the coalescent. We compare our Gaussian process (GP) approach to one of the state-of-the-art Gaussian Markov random field (GMRF) methods for estimating population trajectories. Using simulated data, we demonstrate that our method has better accuracy and precision. Next, we analyze two genealogies reconstructed from real sequences of hepatitis C and human Influenza A viruses. In both cases, we recover more believed aspects of the viral demographic histories than the GMRF approach. We also find that our GP method produces more reasonable uncertainty estimates than the GMRF method. Copyright © 2013, The International Biometric Society.

  18. Bayesian approach to inverse statistical mechanics

    Science.gov (United States)

    Habeck, Michael

    2014-05-01

    Inverse statistical mechanics aims to determine particle interactions from ensemble properties. This article looks at this inverse problem from a Bayesian perspective and discusses several statistical estimators to solve it. In addition, a sequential Monte Carlo algorithm is proposed that draws the interaction parameters from their posterior probability distribution. The posterior probability involves an intractable partition function that is estimated along with the interactions. The method is illustrated for inverse problems of varying complexity, including the estimation of a temperature, the inverse Ising problem, maximum entropy fitting, and the reconstruction of molecular interaction potentials.

  19. Non-parametric Bayesian models of response function in dynamic image sequences

    Czech Academy of Sciences Publication Activity Database

    Tichý, Ondřej; Šmídl, Václav

    2016-01-01

    Roč. 151, č. 1 (2016), s. 90-100 ISSN 1077-3142 R&D Projects: GA ČR GA13-29225S Institutional support: RVO:67985556 Keywords : Response function * Blind source separation * Dynamic medical imaging * Probabilistic models * Bayesian methods Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 2.498, year: 2016 http://library.utia.cas.cz/separaty/2016/AS/tichy-0456983.pdf

  20. Single molecule force spectroscopy at high data acquisition: A Bayesian nonparametric analysis

    Science.gov (United States)

    Sgouralis, Ioannis; Whitmore, Miles; Lapidus, Lisa; Comstock, Matthew J.; Pressé, Steve

    2018-03-01

    Bayesian nonparametrics (BNPs) are poised to have a deep impact in the analysis of single molecule data as they provide posterior probabilities over entire models consistent with the supplied data, not just model parameters of one preferred model. Thus they provide an elegant and rigorous solution to the difficult problem encountered when selecting an appropriate candidate model. Nevertheless, BNPs' flexibility to learn models and their associated parameters from experimental data is a double-edged sword. Most importantly, BNPs are prone to increasing the complexity of the estimated models due to artifactual features present in time traces. Thus, because of experimental challenges unique to single molecule methods, naive application of available BNP tools is not possible. Here we consider traces with time correlations and, as a specific example, we deal with force spectroscopy traces collected at high acquisition rates. While high acquisition rates are required in order to capture dwells in short-lived molecular states, in this setup, a slow response of the optical trap instrumentation (i.e., trapped beads, ambient fluid, and tethering handles) distorts the molecular signals introducing time correlations into the data that may be misinterpreted as true states by naive BNPs. Our adaptation of BNP tools explicitly takes into consideration these response dynamics, in addition to drift and noise, and makes unsupervised time series analysis of correlated single molecule force spectroscopy measurements possible, even at acquisition rates similar to or below the trap's response times.

  1. Nonparametric Bayesian inference for mean residual life functions in survival analysis.

    Science.gov (United States)

    Poynor, Valerie; Kottas, Athanasios

    2018-01-19

    Modeling and inference for survival analysis problems typically revolves around different functions related to the survival distribution. Here, we focus on the mean residual life (MRL) function, which provides the expected remaining lifetime given that a subject has survived (i.e. is event-free) up to a particular time. This function is of direct interest in reliability, medical, and actuarial fields. In addition to its practical interpretation, the MRL function characterizes the survival distribution. We develop general Bayesian nonparametric inference for MRL functions built from a Dirichlet process mixture model for the associated survival distribution. The resulting model for the MRL function admits a representation as a mixture of the kernel MRL functions with time-dependent mixture weights. This model structure allows for a wide range of shapes for the MRL function. Particular emphasis is placed on the selection of the mixture kernel, taken to be a gamma distribution, to obtain desirable properties for the MRL function arising from the mixture model. The inference method is illustrated with a data set of two experimental groups and a data set involving right censoring. The supplementary material available at Biostatistics online provides further results on empirical performance of the model, using simulated data examples. © The Author 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  2. Bayesian Nonparametric Estimation of Targeted Agent Effects on Biomarker Change to Predict Clinical Outcome

    Science.gov (United States)

    Graziani, Rebecca; Guindani, Michele; Thall, Peter F.

    2015-01-01

    Summary The effect of a targeted agent on a cancer patient's clinical outcome putatively is mediated through the agent's effect on one or more early biological events. This is motivated by pre-clinical experiments with cells or animals that identify such events, represented by binary or quantitative biomarkers. When evaluating targeted agents in humans, central questions are whether the distribution of a targeted biomarker changes following treatment, the nature and magnitude of this change, and whether it is associated with clinical outcome. Major difficulties in estimating these effects are that a biomarker's distribution may be complex, vary substantially between patients, and have complicated relationships with clinical outcomes. We present a probabilistically coherent framework for modeling and estimation in this setting, including a hierarchical Bayesian nonparametric mixture model for biomarkers that we use to define a functional profile of pre-versus-post treatment biomarker distribution change. The functional is similar to the receiver operating characteristic used in diagnostic testing. The hierarchical model yields clusters of individual patient biomarker profile functionals, and we use the profile as a covariate in a regression model for clinical outcome. The methodology is illustrated by analysis of a dataset from a clinical trial in prostate cancer using imatinib to target platelet-derived growth factor, with the clinical aim to improve progression-free survival time. PMID:25319212

  3. A framework for Bayesian nonparametric inference for causal effects of mediation.

    Science.gov (United States)

    Kim, Chanmin; Daniels, Michael J; Marcus, Bess H; Roy, Jason A

    2017-06-01

    We propose a Bayesian non-parametric (BNP) framework for estimating causal effects of mediation, the natural direct, and indirect, effects. The strategy is to do this in two parts. Part 1 is a flexible model (using BNP) for the observed data distribution. Part 2 is a set of uncheckable assumptions with sensitivity parameters that in conjunction with Part 1 allows identification and estimation of the causal parameters and allows for uncertainty about these assumptions via priors on the sensitivity parameters. For Part 1, we specify a Dirichlet process mixture of multivariate normals as a prior on the joint distribution of the outcome, mediator, and covariates. This approach allows us to obtain a (simple) closed form of each marginal distribution. For Part 2, we consider two sets of assumptions: (a) the standard sequential ignorability (Imai et al., 2010) and (b) weakened set of the conditional independence type assumptions introduced in Daniels et al. (2012) and propose sensitivity analyses for both. We use this approach to assess mediation in a physical activity promotion trial. © 2016, The International Biometric Society.

  4. Marginal Bayesian nonparametric model for time to disease arrival of threatened amphibian populations.

    Science.gov (United States)

    Zhou, Haiming; Hanson, Timothy; Knapp, Roland

    2015-12-01

    The global emergence of Batrachochytrium dendrobatidis (Bd) has caused the extinction of hundreds of amphibian species worldwide. It has become increasingly important to be able to precisely predict time to Bd arrival in a population. The data analyzed herein present a unique challenge in terms of modeling because there is a strong spatial component to Bd arrival time and the traditional proportional hazards assumption is grossly violated. To address these concerns, we develop a novel marginal Bayesian nonparametric survival model for spatially correlated right-censored data. This class of models assumes that the logarithm of survival times marginally follow a mixture of normal densities with a linear-dependent Dirichlet process prior as the random mixing measure, and their joint distribution is induced by a Gaussian copula model with a spatial correlation structure. To invert high-dimensional spatial correlation matrices, we adopt a full-scale approximation that can capture both large- and small-scale spatial dependence. An efficient Markov chain Monte Carlo algorithm with delayed rejection is proposed for posterior computation, and an R package spBayesSurv is provided to fit the model. This approach is first evaluated through simulations, then applied to threatened frog populations in Sequoia-Kings Canyon National Park. © 2015, The International Biometric Society.

  5. msBP: An R Package to Perform Bayesian Nonparametric Inference Using Multiscale Bernstein Polynomials Mixtures

    Directory of Open Access Journals (Sweden)

    Antonio Canale

    2017-06-01

    Full Text Available msBP is an R package that implements a new method to perform Bayesian multiscale nonparametric inference introduced by Canale and Dunson (2016. The method, based on mixtures of multiscale beta dictionary densities, overcomes the drawbacks of Pólya trees and inherits many of the advantages of Dirichlet process mixture models. The key idea is that an infinitely-deep binary tree is introduced, with a beta dictionary density assigned to each node of the tree. Using a multiscale stick-breaking characterization, stochastically decreasing weights are assigned to each node. The result is an infinite mixture model. The package msBP implements a series of basic functions to deal with this family of priors such as random densities and numbers generation, creation and manipulation of binary tree objects, and generic functions to plot and print the results. In addition, it implements the Gibbs samplers for posterior computation to perform multiscale density estimation and multiscale testing of group differences described in Canale and Dunson (2016.

  6. Bayesian nonparametric generative models for causal inference with missing at random covariates.

    Science.gov (United States)

    Roy, Jason; Lum, Kirsten J; Zeldow, Bret; Dworkin, Jordan D; Re, Vincent Lo; Daniels, Michael J

    2018-03-26

    We propose a general Bayesian nonparametric (BNP) approach to causal inference in the point treatment setting. The joint distribution of the observed data (outcome, treatment, and confounders) is modeled using an enriched Dirichlet process. The combination of the observed data model and causal assumptions allows us to identify any type of causal effect-differences, ratios, or quantile effects, either marginally or for subpopulations of interest. The proposed BNP model is well-suited for causal inference problems, as it does not require parametric assumptions about the distribution of confounders and naturally leads to a computationally efficient Gibbs sampling algorithm. By flexibly modeling the joint distribution, we are also able to impute (via data augmentation) values for missing covariates within the algorithm under an assumption of ignorable missingness, obviating the need to create separate imputed data sets. This approach for imputing the missing covariates has the additional advantage of guaranteeing congeniality between the imputation model and the analysis model, and because we use a BNP approach, parametric models are avoided for imputation. The performance of the method is assessed using simulation studies. The method is applied to data from a cohort study of human immunodeficiency virus/hepatitis C virus co-infected patients. © 2018, The International Biometric Society.

  7. Examples of the Application of Nonparametric Information Geometry to Statistical Physics

    Directory of Open Access Journals (Sweden)

    Giovanni Pistone

    2013-09-01

    Full Text Available We review a nonparametric version of Amari’s information geometry in which the set of positive probability densities on a given sample space is endowed with an atlas of charts to form a differentiable manifold modeled on Orlicz Banach spaces. This nonparametric setting is used to discuss the setting of typical problems in machine learning and statistical physics, such as black-box optimization, Kullback-Leibler divergence, Boltzmann-Gibbs entropy and the Boltzmann equation.

  8. Topics in Bayesian statistics and maximum entropy

    International Nuclear Information System (INIS)

    Mutihac, R.; Cicuttin, A.; Cerdeira, A.; Stanciulescu, C.

    1998-12-01

    Notions of Bayesian decision theory and maximum entropy methods are reviewed with particular emphasis on probabilistic inference and Bayesian modeling. The axiomatic approach is considered as the best justification of Bayesian analysis and maximum entropy principle applied in natural sciences. Particular emphasis is put on solving the inverse problem in digital image restoration and Bayesian modeling of neural networks. Further topics addressed briefly include language modeling, neutron scattering, multiuser detection and channel equalization in digital communications, genetic information, and Bayesian court decision-making. (author)

  9. Assessing systematic risk in the S&P500 index between 2000 and 2011: A Bayesian nonparametric approach

    OpenAIRE

    Rodríguez, Abel; Wang, Ziwei; Kottas, Athanasios

    2017-01-01

    We develop a Bayesian nonparametric model to assess the effect of systematic risks on multiple financial markets, and apply it to understand the behavior of the S&P500 sector indexes between January 1, 2000 and December 31, 2011. More than prediction, our main goal is to understand the evolution of systematic and idiosyncratic risks in the U.S. economy over this particular time period, leading to novel sector-specific risk indexes. To accomplish this goal, we model the appearance of extreme l...

  10. An introduction to Bayesian statistics in health psychology

    NARCIS (Netherlands)

    Depaoli, Sarah; Rus, Holly; Clifton, James; van de Schoot, A.G.J.; Tiemensma, Jitske

    2017-01-01

    The aim of the current article is to provide a brief introduction to Bayesian statistics within the field of Health Psychology. Bayesian methods are increasing in prevalence in applied fields, and they have been shown in simulation research to improve the estimation accuracy of structural equation

  11. Applying Bayesian Statistics to Educational Evaluation. Theoretical Paper No. 62.

    Science.gov (United States)

    Brumet, Michael E.

    Bayesian statistical inference is unfamiliar to many educational evaluators. While the classical model is useful in educational research, it is not as useful in evaluation because of the need to identify solutions to practical problems based on a wide spectrum of information. The reason Bayesian analysis is effective for decision making is that it…

  12. An introduction to Bayesian statistics in health psychology.

    Science.gov (United States)

    Depaoli, Sarah; Rus, Holly M; Clifton, James P; van de Schoot, Rens; Tiemensma, Jitske

    2017-09-01

    The aim of the current article is to provide a brief introduction to Bayesian statistics within the field of health psychology. Bayesian methods are increasing in prevalence in applied fields, and they have been shown in simulation research to improve the estimation accuracy of structural equation models, latent growth curve (and mixture) models, and hierarchical linear models. Likewise, Bayesian methods can be used with small sample sizes since they do not rely on large sample theory. In this article, we discuss several important components of Bayesian statistics as they relate to health-based inquiries. We discuss the incorporation and impact of prior knowledge into the estimation process and the different components of the analysis that should be reported in an article. We present an example implementing Bayesian estimation in the context of blood pressure changes after participants experienced an acute stressor. We conclude with final thoughts on the implementation of Bayesian statistics in health psychology, including suggestions for reviewing Bayesian manuscripts and grant proposals. We have also included an extensive amount of online supplementary material to complement the content presented here, including Bayesian examples using many different software programmes and an extensive sensitivity analysis examining the impact of priors.

  13. Bayesian Information Criterion as an Alternative way of Statistical Inference

    Directory of Open Access Journals (Sweden)

    Nadejda Yu. Gubanova

    2012-05-01

    Full Text Available The article treats Bayesian information criterion as an alternative to traditional methods of statistical inference, based on NHST. The comparison of ANOVA and BIC results for psychological experiment is discussed.

  14. Daniel Goodman’s empirical approach to Bayesian statistics

    Science.gov (United States)

    Gerrodette, Tim; Ward, Eric; Taylor, Rebecca L.; Schwarz, Lisa K.; Eguchi, Tomoharu; Wade, Paul; Himes Boor, Gina

    2016-01-01

    Bayesian statistics, in contrast to classical statistics, uses probability to represent uncertainty about the state of knowledge. Bayesian statistics has often been associated with the idea that knowledge is subjective and that a probability distribution represents a personal degree of belief. Dr. Daniel Goodman considered this viewpoint problematic for issues of public policy. He sought to ground his Bayesian approach in data, and advocated the construction of a prior as an empirical histogram of “similar” cases. In this way, the posterior distribution that results from a Bayesian analysis combined comparable previous data with case-specific current data, using Bayes’ formula. Goodman championed such a data-based approach, but he acknowledged that it was difficult in practice. If based on a true representation of our knowledge and uncertainty, Goodman argued that risk assessment and decision-making could be an exact science, despite the uncertainties. In his view, Bayesian statistics is a critical component of this science because a Bayesian analysis produces the probabilities of future outcomes. Indeed, Goodman maintained that the Bayesian machinery, following the rules of conditional probability, offered the best legitimate inference from available data. We give an example of an informative prior in a recent study of Steller sea lion spatial use patterns in Alaska.

  15. Bayesian Bandwidth Selection for a Nonparametric Regression Model with Mixed Types of Regressors

    Directory of Open Access Journals (Sweden)

    Xibin Zhang

    2016-04-01

    Full Text Available This paper develops a sampling algorithm for bandwidth estimation in a nonparametric regression model with continuous and discrete regressors under an unknown error density. The error density is approximated by the kernel density estimator of the unobserved errors, while the regression function is estimated using the Nadaraya-Watson estimator admitting continuous and discrete regressors. We derive an approximate likelihood and posterior for bandwidth parameters, followed by a sampling algorithm. Simulation results show that the proposed approach typically leads to better accuracy of the resulting estimates than cross-validation, particularly for smaller sample sizes. This bandwidth estimation approach is applied to nonparametric regression model of the Australian All Ordinaries returns and the kernel density estimation of gross domestic product (GDP growth rates among the organisation for economic co-operation and development (OECD and non-OECD countries.

  16. Bayesian nonparametric areal wombling for small-scale maps with an application to urinary bladder cancer data from Connecticut.

    Science.gov (United States)

    Guhaniyogi, Rajarshi

    2017-11-10

    With increasingly abundant spatial data in the form of case counts or rates combined over areal regions (eg, ZIP codes, census tracts, or counties), interest turns to formal identification of difference "boundaries," or barriers on the map, in addition to the estimated statistical map itself. "Boundary" refers to a border that describes vastly disparate outcomes in the adjacent areal units, perhaps caused by latent risk factors. This article focuses on developing a model-based statistical tool, equipped to identify difference boundaries in maps with a small number of areal units, also referred to as small-scale maps. This article proposes a novel and robust nonparametric boundary detection rule based on nonparametric Dirichlet processes, later referred to as Dirichlet process wombling (DPW) rule, by employing Dirichlet process-based mixture models for small-scale maps. Unlike the recently proposed nonparametric boundary detection rules based on false discovery rates, the DPW rule is free of ad hoc parameters, computationally simple, and readily implementable in freely available software for public health practitioners such as JAGS and OpenBUGS and yet provides statistically interpretable boundary detection in small-scale wombling. We offer a detailed simulation study and an application of our proposed approach to a urinary bladder cancer incidence rates dataset between 1990 and 2012 in the 8 counties in Connecticut. Copyright © 2017 John Wiley & Sons, Ltd.

  17. Bayesian Nonparametric Hidden Markov Models with application to the analysis of copy-number-variation in mammalian genomes.

    Science.gov (United States)

    Yau, C; Papaspiliopoulos, O; Roberts, G O; Holmes, C

    2011-01-01

    We consider the development of Bayesian Nonparametric methods for product partition models such as Hidden Markov Models and change point models. Our approach uses a Mixture of Dirichlet Process (MDP) model for the unknown sampling distribution (likelihood) for the observations arising in each state and a computationally efficient data augmentation scheme to aid inference. The method uses novel MCMC methodology which combines recent retrospective sampling methods with the use of slice sampler variables. The methodology is computationally efficient, both in terms of MCMC mixing properties, and robustness to the length of the time series being investigated. Moreover, the method is easy to implement requiring little or no user-interaction. We apply our methodology to the analysis of genomic copy number variation.

  18. A spatio-temporal nonparametric Bayesian variable selection model of fMRI data for clustering correlated time courses.

    Science.gov (United States)

    Zhang, Linlin; Guindani, Michele; Versace, Francesco; Vannucci, Marina

    2014-07-15

    In this paper we present a novel wavelet-based Bayesian nonparametric regression model for the analysis of functional magnetic resonance imaging (fMRI) data. Our goal is to provide a joint analytical framework that allows to detect regions of the brain which exhibit neuronal activity in response to a stimulus and, simultaneously, infer the association, or clustering, of spatially remote voxels that exhibit fMRI time series with similar characteristics. We start by modeling the data with a hemodynamic response function (HRF) with a voxel-dependent shape parameter. We detect regions of the brain activated in response to a given stimulus by using mixture priors with a spike at zero on the coefficients of the regression model. We account for the complex spatial correlation structure of the brain by using a Markov random field (MRF) prior on the parameters guiding the selection of the activated voxels, therefore capturing correlation among nearby voxels. In order to infer association of the voxel time courses, we assume correlated errors, in particular long memory, and exploit the whitening properties of discrete wavelet transforms. Furthermore, we achieve clustering of the voxels by imposing a Dirichlet process (DP) prior on the parameters of the long memory process. For inference, we use Markov Chain Monte Carlo (MCMC) sampling techniques that combine Metropolis-Hastings schemes employed in Bayesian variable selection with sampling algorithms for nonparametric DP models. We explore the performance of the proposed model on simulated data, with both block- and event-related design, and on real fMRI data. Copyright © 2014 Elsevier Inc. All rights reserved.

  19. Bayesian methods for data analysis

    CERN Document Server

    Carlin, Bradley P.

    2009-01-01

    Approaches for statistical inference Introduction Motivating Vignettes Defining the Approaches The Bayes-Frequentist Controversy Some Basic Bayesian Models The Bayes approach Introduction Prior Distributions Bayesian Inference Hierarchical Modeling Model Assessment Nonparametric Methods Bayesian computation Introduction Asymptotic Methods Noniterative Monte Carlo Methods Markov Chain Monte Carlo Methods Model criticism and selection Bayesian Modeling Bayesian Robustness Model Assessment Bayes Factors via Marginal Density Estimation Bayes Factors

  20. Bayesian Nonparametric Measurement of Factor Betas and Clustering with Application to Hedge Fund Returns

    Directory of Open Access Journals (Sweden)

    Urbi Garay

    2016-03-01

    Full Text Available We define a dynamic and self-adjusting mixture of Gaussian Graphical Models to cluster financial returns, and provide a new method for extraction of nonparametric estimates of dynamic alphas (excess return and betas (to a choice set of explanatory factors in a multivariate setting. This approach, as well as the outputs, has a dynamic, nonstationary and nonparametric form, which circumvents the problem of model risk and parametric assumptions that the Kalman filter and other widely used approaches rely on. The by-product of clusters, used for shrinkage and information borrowing, can be of use to determine relationships around specific events. This approach exhibits a smaller Root Mean Squared Error than traditionally used benchmarks in financial settings, which we illustrate through simulation. As an illustration, we use hedge fund index data, and find that our estimated alphas are, on average, 0.13% per month higher (1.6% per year than alphas estimated through Ordinary Least Squares. The approach exhibits fast adaptation to abrupt changes in the parameters, as seen in our estimated alphas and betas, which exhibit high volatility, especially in periods which can be identified as times of stressful market events, a reflection of the dynamic positioning of hedge fund portfolio managers.

  1. The application of bayesian statistic in data fit processing

    International Nuclear Information System (INIS)

    Guan Xingyin; Li Zhenfu; Song Zhaohui

    2010-01-01

    The rationality and disadvantage of least squares fitting that is usually used in data processing is analyzed, and the theory and commonly method that Bayesian statistic is applied in data processing is shown in detail. As it is proved in analysis, Bayesian approach avoid the limitative hypothesis that least squares fitting has in data processing, and the result has traits that it is more scientific and more easily understood, may replace the least squares fitting to apply in data processing. (authors)

  2. Introduction to applied Bayesian statistics and estimation for social scientists

    CERN Document Server

    Lynch, Scott M

    2007-01-01

    ""Introduction to Applied Bayesian Statistics and Estimation for Social Scientists"" covers the complete process of Bayesian statistical analysis in great detail from the development of a model through the process of making statistical inference. The key feature of this book is that it covers models that are most commonly used in social science research - including the linear regression model, generalized linear models, hierarchical models, and multivariate regression models - and it thoroughly develops each real-data example in painstaking detail.The first part of the book provides a detailed

  3. Bayesian nonparametric modeling for comparison of single-neuron firing intensities.

    Science.gov (United States)

    Kottas, Athanasios; Behseta, Sam

    2010-03-01

    We propose a fully inferential model-based approach to the problem of comparing the firing patterns of a neuron recorded under two distinct experimental conditions. The methodology is based on nonhomogeneous Poisson process models for the firing times of each condition with flexible nonparametric mixture prior models for the corresponding intensity functions. We demonstrate posterior inferences from a global analysis, which may be used to compare the two conditions over the entire experimental time window, as well as from a pointwise analysis at selected time points to detect local deviations of firing patterns from one condition to another. We apply our method on two neurons recorded from the primary motor cortex area of a monkey's brain while performing a sequence of reaching tasks.

  4. Bayesian statistics in radionuclide metrology: measurement of a decaying source

    International Nuclear Information System (INIS)

    Bochud, F. O.; Bailat, C.J.; Laedermann, J.P.

    2007-01-01

    The most intuitive way of defining a probability is perhaps through the frequency at which it appears when a large number of trials are realized in identical conditions. The probability derived from the obtained histogram characterizes the so-called frequentist or conventional statistical approach. In this sense, probability is defined as a physical property of the observed system. By contrast, in Bayesian statistics, a probability is not a physical property or a directly observable quantity, but a degree of belief or an element of inference. The goal of this paper is to show how Bayesian statistics can be used in radionuclide metrology and what its advantages and disadvantages are compared with conventional statistics. This is performed through the example of an yttrium-90 source typically encountered in environmental surveillance measurement. Because of the very low activity of this kind of source and the small half-life of the radionuclide, this measurement takes several days, during which the source decays significantly. Several methods are proposed to compute simultaneously the number of unstable nuclei at a given reference time, the decay constant and the background. Asymptotically, all approaches give the same result. However, Bayesian statistics produces coherent estimates and confidence intervals in a much smaller number of measurements. Apart from the conceptual understanding of statistics, the main difficulty that could deter radionuclide metrologists from using Bayesian statistics is the complexity of the computation. (authors)

  5. Bayesian Sensitivity Analysis of a Nonlinear Dynamic Factor Analysis Model with Nonparametric Prior and Possible Nonignorable Missingness.

    Science.gov (United States)

    Tang, Niansheng; Chow, Sy-Miin; Ibrahim, Joseph G; Zhu, Hongtu

    2017-12-01

    Many psychological concepts are unobserved and usually represented as latent factors apprehended through multiple observed indicators. When multiple-subject multivariate time series data are available, dynamic factor analysis models with random effects offer one way of modeling patterns of within- and between-person variations by combining factor analysis and time series analysis at the factor level. Using the Dirichlet process (DP) as a nonparametric prior for individual-specific time series parameters further allows the distributional forms of these parameters to deviate from commonly imposed (e.g., normal or other symmetric) functional forms, arising as a result of these parameters' restricted ranges. Given the complexity of such models, a thorough sensitivity analysis is critical but computationally prohibitive. We propose a Bayesian local influence method that allows for simultaneous sensitivity analysis of multiple modeling components within a single fitting of the model of choice. Five illustrations and an empirical example are provided to demonstrate the utility of the proposed approach in facilitating the detection of outlying cases and common sources of misspecification in dynamic factor analysis models, as well as identification of modeling components that are sensitive to changes in the DP prior specification.

  6. Bayesian nonparametric variable selection as an exploratory tool for discovering differentially expressed genes.

    Science.gov (United States)

    Shahbaba, Babak; Johnson, Wesley O

    2013-05-30

    High-throughput scientific studies involving no clear a priori hypothesis are common. For example, a large-scale genomic study of a disease may examine thousands of genes without hypothesizing that any specific gene is responsible for the disease. In these studies, the objective is to explore a large number of possible factors (e.g., genes) in order to identify a small number that will be considered in follow-up studies that tend to be more thorough and on smaller scales. A simple, hierarchical, linear regression model with random coefficients is assumed for case-control data that correspond to each gene. The specific model used will be seen to be related to a standard Bayesian variable selection model. Relatively large regression coefficients correspond to potential differences in responses for cases versus controls and thus to genes that might 'matter'. For large-scale studies, and using a Dirichlet process mixture model for the regression coefficients, we are able to find clusters of regression effects of genes with increasing potential effect or 'relevance', in relation to the outcome of interest. One cluster will always correspond to genes whose coefficients are in a neighborhood that is relatively close to zero and will be deemed least relevant. Other clusters will correspond to increasing magnitudes of the random/latent regression coefficients. Using simulated data, we demonstrate that our approach could be quite effective in finding relevant genes compared with several alternative methods. We apply our model to two large-scale studies. The first study involves transcriptome analysis of infection by human cytomegalovirus. The second study's objective is to identify differentially expressed genes between two types of leukemia. Copyright © 2012 John Wiley & Sons, Ltd.

  7. Practical Statistics for LHC Physicists: Bayesian Inference (3/3)

    CERN Multimedia

    CERN. Geneva

    2015-01-01

    These lectures cover those principles and practices of statistics that are most relevant for work at the LHC. The first lecture discusses the basic ideas of descriptive statistics, probability and likelihood. The second lecture covers the key ideas in the frequentist approach, including confidence limits, profile likelihoods, p-values, and hypothesis testing. The third lecture covers inference in the Bayesian approach. Throughout, real-world examples will be used to illustrate the practical application of the ideas. No previous knowledge is assumed.

  8. The application of non-parametric statistical method for an ALARA implementation

    International Nuclear Information System (INIS)

    Cho, Young Ho; Herr, Young Hoi

    2003-01-01

    The cost-effective reduction of Occupational Radiation Dose (ORD) at a nuclear power plant could not be achieved without going through an extensive analysis of accumulated ORD data of existing plants. Through the data analysis, it is required to identify what are the jobs of repetitive high ORD at the nuclear power plant. In this study, Percentile Rank Sum Method (PRSM) is proposed to identify repetitive high ORD jobs, which is based on non-parametric statistical theory. As a case study, the method is applied to ORD data of maintenance and repair jobs at Kori units 3 and 4 that are pressurized water reactors with 950 MWe capacity and have been operated since 1986 and 1987, respectively in Korea. The results was verified and validated, and PRSM has been demonstrated to be an efficient method of analyzing the data

  9. Statistical assignment of DNA sequences using Bayesian phylogenetics

    DEFF Research Database (Denmark)

    Terkelsen, Kasper Munch; Boomsma, Wouter Krogh; Huelsenbeck, John P.

    2008-01-01

    We provide a new automated statistical method for DNA barcoding based on a Bayesian phylogenetic analysis. The method is based on automated database sequence retrieval, alignment, and phylogenetic analysis using a custom-built program for Bayesian phylogenetic analysis. We show on real data...... that the method outperforms Blast searches as a measure of confidence and can help eliminate 80% of all false assignment based on best Blast hit. However, the most important advance of the method is that it provides statistically meaningful measures of confidence. We apply the method to a re......-analysis of previously published ancient DNA data and show that, with high statistical confidence, most of the published sequences are in fact of Neanderthal origin. However, there are several cases of chimeric sequences that are comprised of a combination of both Neanderthal and modern human DNA....

  10. To P or Not to P: Backing Bayesian Statistics.

    Science.gov (United States)

    Buchinsky, Farrel J; Chadha, Neil K

    2017-12-01

    In biomedical research, it is imperative to differentiate chance variation from truth before we generalize what we see in a sample of subjects to the wider population. For decades, we have relied on null hypothesis significance testing, where we calculate P values for our data to decide whether to reject a null hypothesis. This methodology is subject to substantial misinterpretation and errant conclusions. Instead of working backward by calculating the probability of our data if the null hypothesis were true, Bayesian statistics allow us instead to work forward, calculating the probability of our hypothesis given the available data. This methodology gives us a mathematical means of incorporating our "prior probabilities" from previous study data (if any) to produce new "posterior probabilities." Bayesian statistics tell us how confidently we should believe what we believe. It is time to embrace and encourage their use in our otolaryngology research.

  11. A Bayesian statistical method for particle identification in shower counters

    International Nuclear Information System (INIS)

    Takashimizu, N.; Kimura, A.; Shibata, A.; Sasaki, T.

    2004-01-01

    We report an attempt on identifying particles using a Bayesian statistical method. We have developed the mathematical model and software for this purpose. We tried to identify electrons and charged pions in shower counters using this method. We designed an ideal shower counter and studied the efficiency of identification using Monte Carlo simulation based on Geant4. Without having any other information, e.g. charges of particles which are given by tracking detectors, we have achieved 95% identifications of both particles

  12. A nonparametric Bayesian approach for clustering bisulfate-based DNA methylation profiles.

    Science.gov (United States)

    Zhang, Lin; Meng, Jia; Liu, Hui; Huang, Yufei

    2012-01-01

    DNA methylation occurs in the context of a CpG dinucleotide. It is an important epigenetic modification, which can be inherited through cell division. The two major types of methylation include hypomethylation and hypermethylation. Unique methylation patterns have been shown to exist in diseases including various types of cancer. DNA methylation analysis promises to become a powerful tool in cancer diagnosis, treatment and prognostication. Large-scale methylation arrays are now available for studying methylation genome-wide. The Illumina methylation platform simultaneously measures cytosine methylation at more than 1500 CpG sites associated with over 800 cancer-related genes. Cluster analysis is often used to identify DNA methylation subgroups for prognosis and diagnosis. However, due to the unique non-Gaussian characteristics, traditional clustering methods may not be appropriate for DNA and methylation data, and the determination of optimal cluster number is still problematic. A Dirichlet process beta mixture model (DPBMM) is proposed that models the DNA methylation expressions as an infinite number of beta mixture distribution. The model allows automatic learning of the relevant parameters such as the cluster mixing proportion, the parameters of beta distribution for each cluster, and especially the number of potential clusters. Since the model is high dimensional and analytically intractable, we proposed a Gibbs sampling "no-gaps" solution for computing the posterior distributions, hence the estimates of the parameters. The proposed algorithm was tested on simulated data as well as methylation data from 55 Glioblastoma multiform (GBM) brain tissue samples. To reduce the computational burden due to the high data dimensionality, a dimension reduction method is adopted. The two GBM clusters yielded by DPBMM are based on data of different number of loci (P-value < 0.1), while hierarchical clustering cannot yield statistically significant clusters.

  13. Improving Transparency and Replication in Bayesian Statistics : The WAMBS-Checklist

    NARCIS (Netherlands)

    Depaoli, Sarah; van de Schoot, Rens

    2017-01-01

    Bayesian statistical methods are slowly creeping into all fields of science and are becoming ever more popular in applied research. Although it is very attractive to use Bayesian statistics, our personal experience has led us to believe that naively applying Bayesian methods can be dangerous for at

  14. Bayesian statistic methods and theri application in probabilistic simulation models

    Directory of Open Access Journals (Sweden)

    Sergio Iannazzo

    2007-03-01

    Full Text Available Bayesian statistic methods are facing a rapidly growing level of interest and acceptance in the field of health economics. The reasons of this success are probably to be found on the theoretical fundaments of the discipline that make these techniques more appealing to decision analysis. To this point should be added the modern IT progress that has developed different flexible and powerful statistical software framework. Among them probably one of the most noticeably is the BUGS language project and its standalone application for MS Windows WinBUGS. Scope of this paper is to introduce the subject and to show some interesting applications of WinBUGS in developing complex economical models based on Markov chains. The advantages of this approach reside on the elegance of the code produced and in its capability to easily develop probabilistic simulations. Moreover an example of the integration of bayesian inference models in a Markov model is shown. This last feature let the analyst conduce statistical analyses on the available sources of evidence and exploit them directly as inputs in the economic model.

  15. Bayesian statistical evaluation of peak area measurements in gamma spectrometry

    International Nuclear Information System (INIS)

    Silva, L.; Turkman, A.; Paulino, C.D.

    2010-01-01

    We analyze results from determinations of peak areas for a radioactive source containing several radionuclides. The statistical analysis was performed using Bayesian methods based on the usual Poisson model for observed counts. This model does not appear to be a very good assumption for the counting system under investigation, even though it is not questioned as a whole by the inferential procedures adopted. We conclude that, in order to avoid incorrect inferences on relevant quantities, one must proceed to a further study that allows us to include missing influence parameters and to select a model explaining the observed data much better.

  16. Bayesians versus frequentists a philosophical debate on statistical reasoning

    CERN Document Server

    Vallverdú, Jordi

    2016-01-01

    This book analyzes the origins of statistical thinking as well as its related philosophical questions, such as causality, determinism or chance. Bayesian and frequentist approaches are subjected to a historical, cognitive and epistemological analysis, making it possible to not only compare the two competing theories, but to also find a potential solution. The work pursues a naturalistic approach, proceeding from the existence of numerosity in natural environments to the existence of contemporary formulas and methodologies to heuristic pragmatism, a concept introduced in the book’s final section. This monograph will be of interest to philosophers and historians of science and students in related fields. Despite the mathematical nature of the topic, no statistical background is required, making the book a valuable read for anyone interested in the history of statistics and human cognition.

  17. Non-parametric order statistics method applied to uncertainty propagation in fuel rod calculations

    International Nuclear Information System (INIS)

    Arimescu, V.E.; Heins, L.

    2001-01-01

    Advances in modeling fuel rod behavior and accumulations of adequate experimental data have made possible the introduction of quantitative methods to estimate the uncertainty of predictions made with best-estimate fuel rod codes. The uncertainty range of the input variables is characterized by a truncated distribution which is typically a normal, lognormal, or uniform distribution. While the distribution for fabrication parameters is defined to cover the design or fabrication tolerances, the distribution of modeling parameters is inferred from the experimental database consisting of separate effects tests and global tests. The final step of the methodology uses a Monte Carlo type of random sampling of all relevant input variables and performs best-estimate code calculations to propagate these uncertainties in order to evaluate the uncertainty range of outputs of interest for design analysis, such as internal rod pressure and fuel centerline temperature. The statistical method underlying this Monte Carlo sampling is non-parametric order statistics, which is perfectly suited to evaluate quantiles of populations with unknown distribution. The application of this method is straightforward in the case of one single fuel rod, when a 95/95 statement is applicable: 'with a probability of 95% and confidence level of 95% the values of output of interest are below a certain value'. Therefore, the 0.95-quantile is estimated for the distribution of all possible values of one fuel rod with a statistical confidence of 95%. On the other hand, a more elaborate procedure is required if all the fuel rods in the core are being analyzed. In this case, the aim is to evaluate the following global statement: with 95% confidence level, the expected number of fuel rods which are not exceeding a certain value is all the fuel rods in the core except only a few fuel rods. In both cases, the thresholds determined by the analysis should be below the safety acceptable design limit. An indirect

  18. Integration of association statistics over genomic regions using Bayesian adaptive regression splines

    Directory of Open Access Journals (Sweden)

    Zhang Xiaohua

    2003-11-01

    Full Text Available Abstract In the search for genetic determinants of complex disease, two approaches to association analysis are most often employed, testing single loci or testing a small group of loci jointly via haplotypes for their relationship to disease status. It is still debatable which of these approaches is more favourable, and under what conditions. The former has the advantage of simplicity but suffers severely when alleles at the tested loci are not in linkage disequilibrium (LD with liability alleles; the latter should capture more of the signal encoded in LD, but is far from simple. The complexity of haplotype analysis could be especially troublesome for association scans over large genomic regions, which, in fact, is becoming the standard design. For these reasons, the authors have been evaluating statistical methods that bridge the gap between single-locus and haplotype-based tests. In this article, they present one such method, which uses non-parametric regression techniques embodied by Bayesian adaptive regression splines (BARS. For a set of markers falling within a common genomic region and a corresponding set of single-locus association statistics, the BARS procedure integrates these results into a single test by examining the class of smooth curves consistent with the data. The non-parametric BARS procedure generally finds no signal when no liability allele exists in the tested region (ie it achieves the specified size of the test and it is sensitive enough to pick up signals when a liability allele is present. The BARS procedure provides a robust and potentially powerful alternative to classical tests of association, diminishes the multiple testing problem inherent in those tests and can be applied to a wide range of data types, including genotype frequencies estimated from pooled samples.

  19. Bayesian Sensitivity Analysis of Statistical Models with Missing Data.

    Science.gov (United States)

    Zhu, Hongtu; Ibrahim, Joseph G; Tang, Niansheng

    2014-04-01

    Methods for handling missing data depend strongly on the mechanism that generated the missing values, such as missing completely at random (MCAR) or missing at random (MAR), as well as other distributional and modeling assumptions at various stages. It is well known that the resulting estimates and tests may be sensitive to these assumptions as well as to outlying observations. In this paper, we introduce various perturbations to modeling assumptions and individual observations, and then develop a formal sensitivity analysis to assess these perturbations in the Bayesian analysis of statistical models with missing data. We develop a geometric framework, called the Bayesian perturbation manifold, to characterize the intrinsic structure of these perturbations. We propose several intrinsic influence measures to perform sensitivity analysis and quantify the effect of various perturbations to statistical models. We use the proposed sensitivity analysis procedure to systematically investigate the tenability of the non-ignorable missing at random (NMAR) assumption. Simulation studies are conducted to evaluate our methods, and a dataset is analyzed to illustrate the use of our diagnostic measures.

  20. Notes on the Implementation of Non-Parametric Statistics within the Westinghouse Realistic Large Break LOCA Evaluation Model (ASTRUM)

    International Nuclear Information System (INIS)

    Frepoli, Cesare; Oriani, Luca

    2006-01-01

    In recent years, non-parametric or order statistics methods have been widely used to assess the impact of the uncertainties within Best-Estimate LOCA evaluation models. The bounding of the uncertainties is achieved with a direct Monte Carlo sampling of the uncertainty attributes, with the minimum trial number selected to 'stabilize' the estimation of the critical output values (peak cladding temperature (PCT), local maximum oxidation (LMO), and core-wide oxidation (CWO A non-parametric order statistics uncertainty analysis was recently implemented within the Westinghouse Realistic Large Break LOCA evaluation model, also referred to as 'Automated Statistical Treatment of Uncertainty Method' (ASTRUM). The implementation or interpretation of order statistics in safety analysis is not fully consistent within the industry. This has led to an extensive public debate among regulators and researchers which can be found in the open literature. The USNRC-approved Westinghouse method follows a rigorous implementation of the order statistics theory, which leads to the execution of 124 simulations within a Large Break LOCA analysis. This is a solid approach which guarantees that a bounding value (at 95% probability) of the 95 th percentile for each of the three 10 CFR 50.46 ECCS design acceptance criteria (PCT, LMO and CWO) is obtained. The objective of this paper is to provide additional insights on the ASTRUM statistical approach, with a more in-depth analysis of pros and cons of the order statistics and of the Westinghouse approach in the implementation of this statistical methodology. (authors)

  1. Speeding Up Non-Parametric Bootstrap Computations for Statistics Based on Sample Moments in Small/Moderate Sample Size Applications.

    Directory of Open Access Journals (Sweden)

    Elias Chaibub Neto

    Full Text Available In this paper we propose a vectorized implementation of the non-parametric bootstrap for statistics based on sample moments. Basically, we adopt the multinomial sampling formulation of the non-parametric bootstrap, and compute bootstrap replications of sample moment statistics by simply weighting the observed data according to multinomial counts instead of evaluating the statistic on a resampled version of the observed data. Using this formulation we can generate a matrix of bootstrap weights and compute the entire vector of bootstrap replications with a few matrix multiplications. Vectorization is particularly important for matrix-oriented programming languages such as R, where matrix/vector calculations tend to be faster than scalar operations implemented in a loop. We illustrate the application of the vectorized implementation in real and simulated data sets, when bootstrapping Pearson's sample correlation coefficient, and compared its performance against two state-of-the-art R implementations of the non-parametric bootstrap, as well as a straightforward one based on a for loop. Our investigations spanned varying sample sizes and number of bootstrap replications. The vectorized bootstrap compared favorably against the state-of-the-art implementations in all cases tested, and was remarkably/considerably faster for small/moderate sample sizes. The same results were observed in the comparison with the straightforward implementation, except for large sample sizes, where the vectorized bootstrap was slightly slower than the straightforward implementation due to increased time expenditures in the generation of weight matrices via multinomial sampling.

  2. The Bayesian New Statistics: Hypothesis testing, estimation, meta-analysis, and power analysis from a Bayesian perspective.

    Science.gov (United States)

    Kruschke, John K; Liddell, Torrin M

    2018-02-01

    In the practice of data analysis, there is a conceptual distinction between hypothesis testing, on the one hand, and estimation with quantified uncertainty on the other. Among frequentists in psychology, a shift of emphasis from hypothesis testing to estimation has been dubbed "the New Statistics" (Cumming 2014). A second conceptual distinction is between frequentist methods and Bayesian methods. Our main goal in this article is to explain how Bayesian methods achieve the goals of the New Statistics better than frequentist methods. The article reviews frequentist and Bayesian approaches to hypothesis testing and to estimation with confidence or credible intervals. The article also describes Bayesian approaches to meta-analysis, randomized controlled trials, and power analysis.

  3. Possible Solution to Publication Bias Through Bayesian Statistics, Including Proper Null Hypothesis Testing

    NARCIS (Netherlands)

    Konijn, Elly A.; van de Schoot, Rens; Winter, Sonja D.; Ferguson, Christopher J.

    2015-01-01

    The present paper argues that an important cause of publication bias resides in traditional frequentist statistics forcing binary decisions. An alternative approach through Bayesian statistics provides various degrees of support for any hypothesis allowing balanced decisions and proper null

  4. Systematic search of Bayesian statistics in the field of psychotraumatology

    NARCIS (Netherlands)

    van de Schoot, Rens; Schalken, Naomi; Olff, Miranda

    2017-01-01

    In many different disciplines there is a recent increase in interest of Bayesian analysis. Bayesian methods implement Bayes' theorem, which states that prior beliefs are updated with data, and this process produces updated beliefs about model parameters. The prior is based on how much information we

  5. [Do we always correctly interpret the results of statistical nonparametric tests].

    Science.gov (United States)

    Moczko, Jerzy A

    2014-01-01

    Mann-Whitney, Wilcoxon, Kruskal-Wallis and Friedman tests create a group of commonly used tests to analyze the results of clinical and laboratory data. These tests are considered to be extremely flexible and their asymptotic relative efficiency exceeds 95 percent. Compared with the corresponding parametric tests they do not require checking the fulfillment of the conditions such as the normality of data distribution, homogeneity of variance, the lack of correlation means and standard deviations, etc. They can be used both in the interval and or-dinal scales. The article presents an example Mann-Whitney test, that does not in any case the choice of these four nonparametric tests treated as a kind of gold standard leads to correct inference.

  6. Bayesian Statistics: Concepts and Applications in Animal Breeding – A Review

    Directory of Open Access Journals (Sweden)

    Lsxmikant-Sambhaji Kokate

    2011-07-01

    Full Text Available Statistics uses two major approaches- conventional (or frequentist and Bayesian approach. Bayesian approach provides a complete paradigm for both statistical inference and decision making under uncertainty. Bayesian methods solve many of the difficulties faced by conventional statistical methods, and extend the applicability of statistical methods. It exploits the use of probabilistic models to formulate scientific problems. To use Bayesian statistics, there is computational difficulty and secondly, Bayesian methods require specifying prior probability distributions. Markov Chain Monte-Carlo (MCMC methods were applied to overcome the computational difficulty, and interest in Bayesian methods was renewed. In Bayesian statistics, Bayesian structural equation model (SEM is used. It provides a powerful and flexible approach for studying quantitative traits for wide spectrum problems and thus it has no operational difficulties, with the exception of some complex cases. In this method, the problems are solved at ease, and the statisticians feel it comfortable with the particular way of expressing the results and employing the software available to analyze a large variety of problems.

  7. A nonparametric statistical method for determination of a confidence interval for the mean of a set of results obtained in a laboratory intercomparison

    International Nuclear Information System (INIS)

    Veglia, A.

    1981-08-01

    In cases where sets of data are obviously not normally distributed, the application of a nonparametric method for the estimation of a confidence interval for the mean seems to be more suitable than some other methods because such a method requires few assumptions about the population of data. A two-step statistical method is proposed which can be applied to any set of analytical results: elimination of outliers by a nonparametric method based on Tchebycheff's inequality, and determination of a confidence interval for the mean by a non-parametric method based on binominal distribution. The method is appropriate only for samples of size n>=10

  8. Performances of non-parametric statistics in sensitivity analysis and parameter ranking

    International Nuclear Information System (INIS)

    Saltelli, A.

    1987-01-01

    Twelve parametric and non-parametric sensitivity analysis techniques are compared in the case of non-linear model responses. The test models used are taken from the long-term risk analysis for the disposal of high level radioactive waste in a geological formation. They describe the transport of radionuclides through a set of engineered and natural barriers from the repository to the biosphere and to man. The output data from these models are the dose rates affecting the maximum exposed individual of a critical group at a given point in time. All the techniques are applied to the output from the same Monte Carlo simulations, where a modified version of Latin Hypercube method is used for the sample selection. Hypothesis testing is systematically applied to quantify the degree of confidence in the results given by the various sensitivity estimators. The estimators are ranked according to their robustness and stability, on the basis of two test cases. The conclusions are that no estimator can be considered the best from all points of view and recommend the use of more than just one estimator in sensitivity analysis

  9. Applying Bayesian statistics to the study of psychological trauma: A suggestion for future research.

    Science.gov (United States)

    Yalch, Matthew M

    2016-03-01

    Several contemporary researchers have noted the virtues of Bayesian methods of data analysis. Although debates continue about whether conventional or Bayesian statistics is the "better" approach for researchers in general, there are reasons why Bayesian methods may be well suited to the study of psychological trauma in particular. This article describes how Bayesian statistics offers practical solutions to the problems of data non-normality, small sample size, and missing data common in research on psychological trauma. After a discussion of these problems and the effects they have on trauma research, this article explains the basic philosophical and statistical foundations of Bayesian statistics and how it provides solutions to these problems using an applied example. Results of the literature review and the accompanying example indicates the utility of Bayesian statistics in addressing problems common in trauma research. Bayesian statistics provides a set of methodological tools and a broader philosophical framework that is useful for trauma researchers. Methodological resources are also provided so that interested readers can learn more. (c) 2016 APA, all rights reserved).

  10. Nonparametric Information Geometry: From Divergence Function to Referential-Representational Biduality on Statistical Manifolds

    Directory of Open Access Journals (Sweden)

    Jun Zhang

    2013-12-01

    Full Text Available Divergence functions are the non-symmetric “distance” on the manifold, Μθ, of parametric probability density functions over a measure space, (Χ,μ. Classical information geometry prescribes, on Μθ: (i a Riemannian metric given by the Fisher information; (ii a pair of dual connections (giving rise to the family of α-connections that preserve the metric under parallel transport by their joint actions; and (iii a family of divergence functions ( α-divergence defined on Μθ x Μθ, which induce the metric and the dual connections. Here, we construct an extension of this differential geometric structure from Μθ (that of parametric probability density functions to the manifold, Μ, of non-parametric functions on X, removing the positivity and normalization constraints. The generalized Fisher information and α-connections on M are induced by an α-parameterized family of divergence functions, reflecting the fundamental convex inequality associated with any smooth and strictly convex function. The infinite-dimensional manifold, M, has zero curvature for all these α-connections; hence, the generally non-zero curvature of M can be interpreted as arising from an embedding of Μθ into Μ. Furthermore, when a parametric model (after a monotonic scaling forms an affine submanifold, its natural and expectation parameters form biorthogonal coordinates, and such a submanifold is dually flat for α = ± 1, generalizing the results of Amari’s α-embedding. The present analysis illuminates two different types of duality in information geometry, one concerning the referential status of a point (measurable function expressed in the divergence function (“referential duality” and the other concerning its representation under an arbitrary monotone scaling (“representational duality”.

  11. Inferential, non-parametric statistics to assess the quality of probabilistic forecast systems

    NARCIS (Netherlands)

    Maia, A.H.N.; Meinke, H.B.; Lennox, S.; Stone, R.C.

    2007-01-01

    Many statistical forecast systems are available to interested users. To be useful for decision making, these systems must be based on evidence of underlying mechanisms. Once causal connections between the mechanism and its statistical manifestation have been firmly established, the forecasts must

  12. Rationalizing method of replacement intervals by using Bayesian statistics

    International Nuclear Information System (INIS)

    Kasai, Masao; Notoya, Junichi; Kusakari, Yoshiyuki

    2007-01-01

    This study represents the formulations for rationalizing the replacement intervals of equipments and/or parts taking into account the probability density functions (PDF) of the parameters of failure distribution functions (FDF) and compares the optimized intervals by our formulations with those by conventional formulations which uses only representative values of the parameters of FDF instead of using these PDFs. The failure data are generated by Monte Carlo simulations since the real failure data can not be available for us. The PDF of PDF parameters are obtained by Bayesian method and the representative values are obtained by likelihood estimation and Bayesian method. We found that the method using PDF by Bayesian method brings longer replacement intervals than one using the representative of the parameters. (author)

  13. Post-fire debris flow prediction in Western United States: Advancements based on a nonparametric statistical technique

    Science.gov (United States)

    Nikolopoulos, E. I.; Destro, E.; Bhuiyan, M. A. E.; Borga, M., Sr.; Anagnostou, E. N.

    2017-12-01

    Fire disasters affect modern societies at global scale inducing significant economic losses and human casualties. In addition to their direct impacts they have various adverse effects on hydrologic and geomorphologic processes of a region due to the tremendous alteration of the landscape characteristics (vegetation, soil properties etc). As a consequence, wildfires often initiate a cascade of hazards such as flash floods and debris flows that usually follow the occurrence of a wildfire thus magnifying the overall impact in a region. Post-fire debris flows (PFDF) is one such type of hazards frequently occurring in Western United States where wildfires are a common natural disaster. Prediction of PDFD is therefore of high importance in this region and over the last years a number of efforts from United States Geological Survey (USGS) and National Weather Service (NWS) have been focused on the development of early warning systems that will help mitigate PFDF risk. This work proposes a prediction framework that is based on a nonparametric statistical technique (random forests) that allows predicting the occurrence of PFDF at regional scale with a higher degree of accuracy than the commonly used approaches that are based on power-law thresholds and logistic regression procedures. The work presented is based on a recently released database from USGS that reports a total of 1500 storms that triggered and did not trigger PFDF in a number of fire affected catchments in Western United States. The database includes information on storm characteristics (duration, accumulation, max intensity etc) and other auxiliary information of land surface properties (soil erodibility index, local slope etc). Results show that the proposed model is able to achieve a satisfactory prediction accuracy (threat score > 0.6) superior of previously published prediction frameworks highlighting the potential of nonparametric statistical techniques for development of PFDF prediction systems.

  14. Benchmark of the non-parametric Bayesian deconvolution method implemented in the SINBAD code for X/γ rays spectra processing

    Energy Technology Data Exchange (ETDEWEB)

    Rohée, E. [CEA, LIST, Laboratoire Capteurs et Architectures Electroniques, F-91191 Gif-sur-Yvette (France); Coulon, R., E-mail: romain.coulon@cea.fr [CEA, LIST, Laboratoire Capteurs et Architectures Electroniques, F-91191 Gif-sur-Yvette (France); Carrel, F. [CEA, LIST, Laboratoire Capteurs et Architectures Electroniques, F-91191 Gif-sur-Yvette (France); Dautremer, T.; Barat, E.; Montagu, T. [CEA, LIST, Laboratoire de Modélisation et Simulation des Systèmes, F-91191 Gif-sur-Yvette (France); Normand, S. [CEA, DAM, Le Ponant, DPN/STXN, F-75015 Paris (France); Jammes, C. [CEA, DEN, Cadarache, DER/SPEx/LDCI, F-13108 Saint-Paul-lez-Durance (France)

    2016-11-11

    Radionuclide identification and quantification are a serious concern for many applications as for in situ monitoring at nuclear facilities, laboratory analysis, special nuclear materials detection, environmental monitoring, and waste measurements. High resolution gamma-ray spectrometry based on high purity germanium diode detectors is the best solution available for isotopic identification. Over the last decades, methods have been developed to improve gamma spectra analysis. However, some difficulties remain in the analysis when full energy peaks are folded together with high ratio between their amplitudes, and when the Compton background is much larger compared to the signal of a single peak. In this context, this study deals with the comparison between a conventional analysis based on “iterative peak fitting deconvolution” method and a “nonparametric Bayesian deconvolution” approach developed by the CEA LIST and implemented into the SINBAD code. The iterative peak fit deconvolution is used in this study as a reference method largely validated by industrial standards to unfold complex spectra from HPGe detectors. Complex cases of spectra are studied from IAEA benchmark protocol tests and with measured spectra. The SINBAD code shows promising deconvolution capabilities compared to the conventional method without any expert parameter fine tuning.

  15. Incorporating Nonparametric Statistics into Delphi Studies in Library and Information Science

    Science.gov (United States)

    Ju, Boryung; Jin, Tao

    2013-01-01

    Introduction: The Delphi technique is widely used in library and information science research. However, many researchers in the field fail to employ standard statistical tests when using this technique. This makes the technique vulnerable to criticisms of its reliability and validity. The general goal of this article is to explore how…

  16. Nonparametric and group-based person-fit statistics : a validity study and an empirical example

    NARCIS (Netherlands)

    Meijer, R.R.

    1994-01-01

    In person-fit analysis, the object is to investigate whether an item score pattern is improbable given the item score patterns of the other persons in the group or given what is expected on the basis of a test model. In this study, several existing group-based statistics to detect such improbable

  17. Embracing Uncertainty: The Interface of Bayesian Statistics and Cognitive Psychology

    Directory of Open Access Journals (Sweden)

    Judith L. Anderson

    1998-06-01

    Full Text Available Ecologists working in conservation and resource management are discovering the importance of using Bayesian analytic methods to deal explicitly with uncertainty in data analyses and decision making. However, Bayesian procedures require, as inputs and outputs, an idea that is problematic for the human brain: the probability of a hypothesis ("single-event probability". I describe several cognitive concepts closely related to single-event probabilities, and discuss how their interchangeability in the human mind results in "cognitive illusions," apparent deficits in reasoning about uncertainty. Each cognitive illusion implies specific possible pitfalls for the use of single-event probabilities in ecology and resource management. I then discuss recent research in cognitive psychology showing that simple tactics of communication, suggested by an evolutionary perspective on human cognition, help people to process uncertain information more effectively as they read and talk about probabilities. In addition, I suggest that carefully considered standards for methodology and conventions for presentation may also make Bayesian analyses easier to understand.

  18. Using continuous time stochastic modelling and nonparametric statistics to improve the quality of first principles models

    DEFF Research Database (Denmark)

    A methodology is presented that combines modelling based on first principles and data based modelling into a modelling cycle that facilitates fast decision-making based on statistical methods. A strong feature of this methodology is that given a first principles model along with process data......, the corresponding modelling cycle model of the given system for a given purpose. A computer-aided tool, which integrates the elements of the modelling cycle, is also presented, and an example is given of modelling a fed-batch bioreactor....

  19. On Cooper's Nonparametric Test.

    Science.gov (United States)

    Schmeidler, James

    1978-01-01

    The basic assumption of Cooper's nonparametric test for trend (EJ 125 069) is questioned. It is contended that the proper assumption alters the distribution of the statistic and reduces its usefulness. (JKS)

  20. Optimization of inspection and replacement period by using Bayesian statistics

    International Nuclear Information System (INIS)

    Kasai, Masao; Watanabe, Yasushi; Kusakari, Yoshiyuki; Notoya, Junichi

    2006-01-01

    This study describes the formulations to optimize the time interval of inspections and/or replacements of equipment/parts taking into account the probability density functions (PDF) for failure rates and parameters of failure distribution functions (FDF) and evaluates the optimized results of these time intervals using our formulations by comparing with those using only representative values of failure rates and the parameters of FDF instead of using these PDFs. The PDFs are obtained with Bayesian method and the representative values are obtained with likelihood estimation method. However, any significant difference is not observed between both optimized results within our preliminary calculations. (author)

  1. Statistical Bayesian method for reliability evaluation based on ADT data

    Science.gov (United States)

    Lu, Dawei; Wang, Lizhi; Sun, Yusheng; Wang, Xiaohong

    2018-05-01

    Accelerated degradation testing (ADT) is frequently conducted in the laboratory to predict the products’ reliability under normal operating conditions. Two kinds of methods, degradation path models and stochastic process models, are utilized to analyze degradation data and the latter one is the most popular method. However, some limitations like imprecise solution process and estimation result of degradation ratio still exist, which may affect the accuracy of the acceleration model and the extrapolation value. Moreover, the conducted solution of this problem, Bayesian method, lose key information when unifying the degradation data. In this paper, a new data processing and parameter inference method based on Bayesian method is proposed to handle degradation data and solve the problems above. First, Wiener process and acceleration model is chosen; Second, the initial values of degradation model and parameters of prior and posterior distribution under each level is calculated with updating and iteration of estimation values; Third, the lifetime and reliability values are estimated on the basis of the estimation parameters; Finally, a case study is provided to demonstrate the validity of the proposed method. The results illustrate that the proposed method is quite effective and accuracy in estimating the lifetime and reliability of a product.

  2. Bayesian statistics for the calibration of the LISA Pathfinder experiment

    International Nuclear Information System (INIS)

    Armano, M; Freschi, M; Audley, H; Born, M; Danzmann, K; Diepholz, I; Auger, G; Binetruy, P; Bortoluzzi, D; Brandt, N; Fitzsimons, E; Bursi, A; Caleno, M; Cavalleri, A; Cesarini, A; Dolesi, R; Ferroni, V; Cruise, M; Dunbar, N; Ferraioli, L

    2015-01-01

    The main goal of LISA Pathfinder (LPF) mission is to estimate the acceleration noise models of the overall LISA Technology Package (LTP) experiment on-board. This will be of crucial importance for the future space-based Gravitational-Wave (GW) detectors, like eLISA. Here, we present the Bayesian analysis framework to process the planned system identification experiments designed for that purpose. In particular, we focus on the analysis strategies to predict the accuracy of the parameters that describe the system in all degrees of freedom. The data sets were generated during the latest operational simulations organised by the data analysis team and this work is part of the LTPDA Matlab toolbox. (paper)

  3. Targeted search for continuous gravitational waves: Bayesian versus maximum-likelihood statistics

    International Nuclear Information System (INIS)

    Prix, Reinhard; Krishnan, Badri

    2009-01-01

    We investigate the Bayesian framework for detection of continuous gravitational waves (GWs) in the context of targeted searches, where the phase evolution of the GW signal is assumed to be known, while the four amplitude parameters are unknown. We show that the orthodox maximum-likelihood statistic (known as F-statistic) can be rediscovered as a Bayes factor with an unphysical prior in amplitude parameter space. We introduce an alternative detection statistic ('B-statistic') using the Bayes factor with a more natural amplitude prior, namely an isotropic probability distribution for the orientation of GW sources. Monte Carlo simulations of targeted searches show that the resulting Bayesian B-statistic is more powerful in the Neyman-Pearson sense (i.e., has a higher expected detection probability at equal false-alarm probability) than the frequentist F-statistic.

  4. A Hierarchical Multivariate Bayesian Approach to Ensemble Model output Statistics in Atmospheric Prediction

    Science.gov (United States)

    2017-09-01

    application of statistical inference. Even when human forecasters leverage their professional experience, which is often gained through long periods of... application throughout statistics and Bayesian data analysis. The multivariate form of 2( , )  (e.g., Figure 12) is similarly analytically...data (i.e., no systematic manipulations with analytical functions), it is common in the statistical literature to apply mathematical transformations

  5. Use of SAMC for Bayesian analysis of statistical models with intractable normalizing constants

    KAUST Repository

    Jin, Ick Hoon

    2014-03-01

    Statistical inference for the models with intractable normalizing constants has attracted much attention. During the past two decades, various approximation- or simulation-based methods have been proposed for the problem, such as the Monte Carlo maximum likelihood method and the auxiliary variable Markov chain Monte Carlo methods. The Bayesian stochastic approximation Monte Carlo algorithm specifically addresses this problem: It works by sampling from a sequence of approximate distributions with their average converging to the target posterior distribution, where the approximate distributions can be achieved using the stochastic approximation Monte Carlo algorithm. A strong law of large numbers is established for the Bayesian stochastic approximation Monte Carlo estimator under mild conditions. Compared to the Monte Carlo maximum likelihood method, the Bayesian stochastic approximation Monte Carlo algorithm is more robust to the initial guess of model parameters. Compared to the auxiliary variable MCMC methods, the Bayesian stochastic approximation Monte Carlo algorithm avoids the requirement for perfect samples, and thus can be applied to many models for which perfect sampling is not available or very expensive. The Bayesian stochastic approximation Monte Carlo algorithm also provides a general framework for approximate Bayesian analysis. © 2012 Elsevier B.V. All rights reserved.

  6. Probabilistic risk assessment framework for structural systems under multiple hazards using Bayesian statistics

    International Nuclear Information System (INIS)

    Kwag, Shinyoung; Gupta, Abhinav

    2017-01-01

    Highlights: • This study presents the development of Bayesian framework for probabilistic risk assessment (PRA) of structural systems under multiple hazards. • The concepts of Bayesian network and Bayesian inference are combined by mapping the traditionally used fault trees into a Bayesian network. • The proposed mapping allows for consideration of dependencies as well as correlations between events. • Incorporation of Bayesian inference permits a novel way for exploration of a scenario that is likely to result in a system level “vulnerability.” - Abstract: Conventional probabilistic risk assessment (PRA) methodologies (USNRC, 1983; IAEA, 1992; EPRI, 1994; Ellingwood, 2001) conduct risk assessment for different external hazards by considering each hazard separately and independent of each other. The risk metric for a specific hazard is evaluated by a convolution of the fragility and the hazard curves. The fragility curve for basic event is obtained by using empirical, experimental, and/or numerical simulation data for a particular hazard. Treating each hazard as an independently can be inappropriate in some cases as certain hazards are statistically correlated or dependent. Examples of such correlated events include but are not limited to flooding induced fire, seismically induced internal or external flooding, or even seismically induced fire. In the current practice, system level risk and consequence sequences are typically calculated using logic trees to express the causative relationship between events. In this paper, we present the results from a study on multi-hazard risk assessment that is conducted using a Bayesian network (BN) with Bayesian inference. The framework can consider statistical dependencies among risks from multiple hazards, allows updating by considering the newly available data/information at any level, and provide a novel way to explore alternative failure scenarios that may exist due to vulnerabilities.

  7. Probabilistic risk assessment framework for structural systems under multiple hazards using Bayesian statistics

    Energy Technology Data Exchange (ETDEWEB)

    Kwag, Shinyoung [North Carolina State University, Raleigh, NC 27695 (United States); Korea Atomic Energy Research Institute, Daejeon 305-353 (Korea, Republic of); Gupta, Abhinav, E-mail: agupta1@ncsu.edu [North Carolina State University, Raleigh, NC 27695 (United States)

    2017-04-15

    Highlights: • This study presents the development of Bayesian framework for probabilistic risk assessment (PRA) of structural systems under multiple hazards. • The concepts of Bayesian network and Bayesian inference are combined by mapping the traditionally used fault trees into a Bayesian network. • The proposed mapping allows for consideration of dependencies as well as correlations between events. • Incorporation of Bayesian inference permits a novel way for exploration of a scenario that is likely to result in a system level “vulnerability.” - Abstract: Conventional probabilistic risk assessment (PRA) methodologies (USNRC, 1983; IAEA, 1992; EPRI, 1994; Ellingwood, 2001) conduct risk assessment for different external hazards by considering each hazard separately and independent of each other. The risk metric for a specific hazard is evaluated by a convolution of the fragility and the hazard curves. The fragility curve for basic event is obtained by using empirical, experimental, and/or numerical simulation data for a particular hazard. Treating each hazard as an independently can be inappropriate in some cases as certain hazards are statistically correlated or dependent. Examples of such correlated events include but are not limited to flooding induced fire, seismically induced internal or external flooding, or even seismically induced fire. In the current practice, system level risk and consequence sequences are typically calculated using logic trees to express the causative relationship between events. In this paper, we present the results from a study on multi-hazard risk assessment that is conducted using a Bayesian network (BN) with Bayesian inference. The framework can consider statistical dependencies among risks from multiple hazards, allows updating by considering the newly available data/information at any level, and provide a novel way to explore alternative failure scenarios that may exist due to vulnerabilities.

  8. Bayesian Statistics in Educational Research: A Look at the Current State of Affairs

    Science.gov (United States)

    König, Christoph; van de Schoot, Rens

    2018-01-01

    The ability of a scientific discipline to build cumulative knowledge depends on its predominant method of data analysis. A steady accumulation of knowledge requires approaches which allow researchers to consider results from comparable prior research. Bayesian statistics is especially relevant for establishing a cumulative scientific discipline,…

  9. Bayesian statistical analysis of censored data in geotechnical engineering

    DEFF Research Database (Denmark)

    Ditlevsen, Ove Dalager; Tarp-Johansen, Niels Jacob; Denver, Hans

    2000-01-01

    The geotechnical engineer is often faced with the problem ofhow to assess the statistical properties of a soil parameter on the basis ofa sample measured in-situ or in the laboratory with the defect that somevalues have been replaced by interval bounds because the corresponding soilparameter values...

  10. Fully probabilistic design of hierarchical Bayesian models

    Czech Academy of Sciences Publication Activity Database

    Quinn, A.; Kárný, Miroslav; Guy, Tatiana Valentine

    2016-01-01

    Roč. 369, č. 1 (2016), s. 532-547 ISSN 0020-0255 R&D Projects: GA ČR GA13-13502S Institutional support: RVO:67985556 Keywords : Fully probabilistic design * Ideal distribution * Minimum cross-entropy principle * Bayesian conditioning * Kullback-Leibler divergence * Bayesian nonparametric modelling Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 4.832, year: 2016 http://library.utia.cas.cz/separaty/2016/AS/karny-0463052.pdf

  11. Theory of nonparametric tests

    CERN Document Server

    Dickhaus, Thorsten

    2018-01-01

    This textbook provides a self-contained presentation of the main concepts and methods of nonparametric statistical testing, with a particular focus on the theoretical foundations of goodness-of-fit tests, rank tests, resampling tests, and projection tests. The substitution principle is employed as a unified approach to the nonparametric test problems discussed. In addition to mathematical theory, it also includes numerous examples and computer implementations. The book is intended for advanced undergraduate, graduate, and postdoc students as well as young researchers. Readers should be familiar with the basic concepts of mathematical statistics typically covered in introductory statistics courses.

  12. Applied Bayesian statistical studies in biology and medicine

    CERN Document Server

    D’Amore, G; Scalfari, F

    2004-01-01

    It was written on another occasion· that "It is apparent that the scientific culture, if one means production of scientific papers, is growing exponentially, and chaotically, in almost every field of investigation". The biomedical sciences sensu lato and mathematical statistics are no exceptions. One might say then, and with good reason, that another collection of bio­ statistical papers would only add to the overflow and cause even more confusion. Nevertheless, this book may be greeted with some interest if we state that most of the papers in it are the result of a collaboration between biologists and statisticians, and partly the product of the Summer School th "Statistical Inference in Human Biology" which reaches its 10 edition in 2003 (information about the School can be obtained at the Web site http://www2. stat. unibo. itleventilSito%20scuolalindex. htm). is common experience - and not only This is rather important. Indeed, it in Italy - that encounters between statisticians and researchers are spora...

  13. Bayesian inference – a way to combine statistical data and semantic analysis meaningfully

    Directory of Open Access Journals (Sweden)

    Eila Lindfors

    2011-11-01

    Full Text Available This article focuses on presenting the possibilities of Bayesian modelling (Finite Mixture Modelling in the semantic analysis of statistically modelled data. The probability of a hypothesis in relation to the data available is an important question in inductive reasoning. Bayesian modelling allows the researcher to use many models at a time and provides tools to evaluate the goodness of different models. The researcher should always be aware that there is no such thing as the exact probability of an exact event. This is the reason for using probabilistic models. Each model presents a different perspective on the phenomenon in focus, and the researcher has to choose the most probable model with a view to previous research and the knowledge available.The idea of Bayesian modelling is illustrated here by presenting two different sets of data, one from craft science research (n=167 and the other (n=63 from educational research (Lindfors, 2007, 2002. The principles of how to build models and how to combine different profiles are described in the light of the research mentioned.Bayesian modelling is an analysis based on calculating probabilities in relation to a specific set of quantitative data. It is a tool for handling data and interpreting it semantically. The reliability of the analysis arises from an argumentation of which model can be selected from the model space as the basis for an interpretation, and on which arguments.Keywords: method, sloyd, Bayesian modelling, student teachersURN:NBN:no-29959

  14. Bayesian models based on test statistics for multiple hypothesis testing problems.

    Science.gov (United States)

    Ji, Yuan; Lu, Yiling; Mills, Gordon B

    2008-04-01

    We propose a Bayesian method for the problem of multiple hypothesis testing that is routinely encountered in bioinformatics research, such as the differential gene expression analysis. Our algorithm is based on modeling the distributions of test statistics under both null and alternative hypotheses. We substantially reduce the complexity of the process of defining posterior model probabilities by modeling the test statistics directly instead of modeling the full data. Computationally, we apply a Bayesian FDR approach to control the number of rejections of null hypotheses. To check if our model assumptions for the test statistics are valid for various bioinformatics experiments, we also propose a simple graphical model-assessment tool. Using extensive simulations, we demonstrate the performance of our models and the utility of the model-assessment tool. In the end, we apply the proposed methodology to an siRNA screening and a gene expression experiment.

  15. Conjunction analysis and propositional logic in fMRI data analysis using Bayesian statistics.

    Science.gov (United States)

    Rudert, Thomas; Lohmann, Gabriele

    2008-12-01

    To evaluate logical expressions over different effects in data analyses using the general linear model (GLM) and to evaluate logical expressions over different posterior probability maps (PPMs). In functional magnetic resonance imaging (fMRI) data analysis, the GLM was applied to estimate unknown regression parameters. Based on the GLM, Bayesian statistics can be used to determine the probability of conjunction, disjunction, implication, or any other arbitrary logical expression over different effects or contrast. For second-level inferences, PPMs from individual sessions or subjects are utilized. These PPMs can be combined to a logical expression and its probability can be computed. The methods proposed in this article are applied to data from a STROOP experiment and the methods are compared to conjunction analysis approaches for test-statistics. The combination of Bayesian statistics with propositional logic provides a new approach for data analyses in fMRI. Two different methods are introduced for propositional logic: the first for analyses using the GLM and the second for common inferences about different probability maps. The methods introduced extend the idea of conjunction analysis to a full propositional logic and adapt it from test-statistics to Bayesian statistics. The new approaches allow inferences that are not possible with known standard methods in fMRI. (c) 2008 Wiley-Liss, Inc.

  16. Statistical detection of EEG synchrony using empirical bayesian inference.

    Directory of Open Access Journals (Sweden)

    Archana K Singh

    Full Text Available There is growing interest in understanding how the brain utilizes synchronized oscillatory activity to integrate information across functionally connected regions. Computing phase-locking values (PLV between EEG signals is a popular method for quantifying such synchronizations and elucidating their role in cognitive tasks. However, high-dimensionality in PLV data incurs a serious multiple testing problem. Standard multiple testing methods in neuroimaging research (e.g., false discovery rate, FDR suffer severe loss of power, because they fail to exploit complex dependence structure between hypotheses that vary in spectral, temporal and spatial dimension. Previously, we showed that a hierarchical FDR and optimal discovery procedures could be effectively applied for PLV analysis to provide better power than FDR. In this article, we revisit the multiple comparison problem from a new Empirical Bayes perspective and propose the application of the local FDR method (locFDR; Efron, 2001 for PLV synchrony analysis to compute FDR as a posterior probability that an observed statistic belongs to a null hypothesis. We demonstrate the application of Efron's Empirical Bayes approach for PLV synchrony analysis for the first time. We use simulations to validate the specificity and sensitivity of locFDR and a real EEG dataset from a visual search study for experimental validation. We also compare locFDR with hierarchical FDR and optimal discovery procedures in both simulation and experimental analyses. Our simulation results showed that the locFDR can effectively control false positives without compromising on the power of PLV synchrony inference. Our results from the application locFDR on experiment data detected more significant discoveries than our previously proposed methods whereas the standard FDR method failed to detect any significant discoveries.

  17. Statistical detection of EEG synchrony using empirical bayesian inference.

    Science.gov (United States)

    Singh, Archana K; Asoh, Hideki; Takeda, Yuji; Phillips, Steven

    2015-01-01

    There is growing interest in understanding how the brain utilizes synchronized oscillatory activity to integrate information across functionally connected regions. Computing phase-locking values (PLV) between EEG signals is a popular method for quantifying such synchronizations and elucidating their role in cognitive tasks. However, high-dimensionality in PLV data incurs a serious multiple testing problem. Standard multiple testing methods in neuroimaging research (e.g., false discovery rate, FDR) suffer severe loss of power, because they fail to exploit complex dependence structure between hypotheses that vary in spectral, temporal and spatial dimension. Previously, we showed that a hierarchical FDR and optimal discovery procedures could be effectively applied for PLV analysis to provide better power than FDR. In this article, we revisit the multiple comparison problem from a new Empirical Bayes perspective and propose the application of the local FDR method (locFDR; Efron, 2001) for PLV synchrony analysis to compute FDR as a posterior probability that an observed statistic belongs to a null hypothesis. We demonstrate the application of Efron's Empirical Bayes approach for PLV synchrony analysis for the first time. We use simulations to validate the specificity and sensitivity of locFDR and a real EEG dataset from a visual search study for experimental validation. We also compare locFDR with hierarchical FDR and optimal discovery procedures in both simulation and experimental analyses. Our simulation results showed that the locFDR can effectively control false positives without compromising on the power of PLV synchrony inference. Our results from the application locFDR on experiment data detected more significant discoveries than our previously proposed methods whereas the standard FDR method failed to detect any significant discoveries.

  18. Statistical analysis of water-quality data containing multiple detection limits II: S-language software for nonparametric distribution modeling and hypothesis testing

    Science.gov (United States)

    Lee, L.; Helsel, D.

    2007-01-01

    Analysis of low concentrations of trace contaminants in environmental media often results in left-censored data that are below some limit of analytical precision. Interpretation of values becomes complicated when there are multiple detection limits in the data-perhaps as a result of changing analytical precision over time. Parametric and semi-parametric methods, such as maximum likelihood estimation and robust regression on order statistics, can be employed to model distributions of multiply censored data and provide estimates of summary statistics. However, these methods are based on assumptions about the underlying distribution of data. Nonparametric methods provide an alternative that does not require such assumptions. A standard nonparametric method for estimating summary statistics of multiply-censored data is the Kaplan-Meier (K-M) method. This method has seen widespread usage in the medical sciences within a general framework termed "survival analysis" where it is employed with right-censored time-to-failure data. However, K-M methods are equally valid for the left-censored data common in the geosciences. Our S-language software provides an analytical framework based on K-M methods that is tailored to the needs of the earth and environmental sciences community. This includes routines for the generation of empirical cumulative distribution functions, prediction or exceedance probabilities, and related confidence limits computation. Additionally, our software contains K-M-based routines for nonparametric hypothesis testing among an unlimited number of grouping variables. A primary characteristic of K-M methods is that they do not perform extrapolation and interpolation. Thus, these routines cannot be used to model statistics beyond the observed data range or when linear interpolation is desired. For such applications, the aforementioned parametric and semi-parametric methods must be used.

  19. Reasoning with data an introduction to traditional and Bayesian statistics using R

    CERN Document Server

    Stanton, Jeffrey M

    2017-01-01

    Engaging and accessible, this book teaches readers how to use inferential statistical thinking to check their assumptions, assess evidence about their beliefs, and avoid overinterpreting results that may look more promising than they really are. It provides step-by-step guidance for using both classical (frequentist) and Bayesian approaches to inference. Statistical techniques covered side by side from both frequentist and Bayesian approaches include hypothesis testing, replication, analysis of variance, calculation of effect sizes, regression, time series analysis, and more. Students also get a complete introduction to the open-source R programming language and its key packages. Throughout the text, simple commands in R demonstrate essential data analysis skills using real-data examples. The companion website provides annotated R code for the book's examples, in-class exercises, supplemental reading lists, and links to online videos, interactive materials, and other resources.

  20. Bayesian statistics applied to neutron activation data for reactor flux spectrum analysis

    International Nuclear Information System (INIS)

    Chiesa, Davide; Previtali, Ezio; Sisti, Monica

    2014-01-01

    Highlights: • Bayesian statistics to analyze the neutron flux spectrum from activation data. • Rigorous statistical approach for accurate evaluation of the neutron flux groups. • Cross section and activation data uncertainties included for the problem solution. • Flexible methodology applied to analyze different nuclear reactor flux spectra. • The results are in good agreement with the MCNP simulations of neutron fluxes. - Abstract: In this paper, we present a statistical method, based on Bayesian statistics, to analyze the neutron flux spectrum from the activation data of different isotopes. The experimental data were acquired during a neutron activation experiment performed at the TRIGA Mark II reactor of Pavia University (Italy) in four irradiation positions characterized by different neutron spectra. In order to evaluate the neutron flux spectrum, subdivided in energy groups, a system of linear equations, containing the group effective cross sections and the activation rate data, has to be solved. However, since the system’s coefficients are experimental data affected by uncertainties, a rigorous statistical approach is fundamental for an accurate evaluation of the neutron flux groups. For this purpose, we applied the Bayesian statistical analysis, that allows to include the uncertainties of the coefficients and the a priori information about the neutron flux. A program for the analysis of Bayesian hierarchical models, based on Markov Chain Monte Carlo (MCMC) simulations, was used to define the problem statistical model and solve it. The first analysis involved the determination of the thermal, resonance-intermediate and fast flux components and the dependence of the results on the Prior distribution choice was investigated to confirm the reliability of the Bayesian analysis. After that, the main resonances of the activation cross sections were analyzed to implement multi-group models with finer energy subdivisions that would allow to determine the

  1. Nonparametric tests for censored data

    CERN Document Server

    Bagdonavicus, Vilijandas; Nikulin, Mikhail

    2013-01-01

    This book concerns testing hypotheses in non-parametric models. Generalizations of many non-parametric tests to the case of censored and truncated data are considered. Most of the test results are proved and real applications are illustrated using examples. Theories and exercises are provided. The incorrect use of many tests applying most statistical software is highlighted and discussed.

  2. Conditional maximum-entropy method for selecting prior distributions in Bayesian statistics

    Science.gov (United States)

    Abe, Sumiyoshi

    2014-11-01

    The conditional maximum-entropy method (abbreviated here as C-MaxEnt) is formulated for selecting prior probability distributions in Bayesian statistics for parameter estimation. This method is inspired by a statistical-mechanical approach to systems governed by dynamics with largely separated time scales and is based on three key concepts: conjugate pairs of variables, dimensionless integration measures with coarse-graining factors and partial maximization of the joint entropy. The method enables one to calculate a prior purely from a likelihood in a simple way. It is shown, in particular, how it not only yields Jeffreys's rules but also reveals new structures hidden behind them.

  3. Online Dectection and Modeling of Safety Boundaries for Aerospace Application Using Bayesian Statistics

    Science.gov (United States)

    He, Yuning

    2015-01-01

    The behavior of complex aerospace systems is governed by numerous parameters. For safety analysis it is important to understand how the system behaves with respect to these parameter values. In particular, understanding the boundaries between safe and unsafe regions is of major importance. In this paper, we describe a hierarchical Bayesian statistical modeling approach for the online detection and characterization of such boundaries. Our method for classification with active learning uses a particle filter-based model and a boundary-aware metric for best performance. From a library of candidate shapes incorporated with domain expert knowledge, the location and parameters of the boundaries are estimated using advanced Bayesian modeling techniques. The results of our boundary analysis are then provided in a form understandable by the domain expert. We illustrate our approach using a simulation model of a NASA neuro-adaptive flight control system, as well as a system for the detection of separation violations in the terminal airspace.

  4. Application of classical versus bayesian statistical control charts to on-line radiological monitoring

    International Nuclear Information System (INIS)

    DeVol, T.A.; Gohres, A.A.; Williams, C.L.

    2009-01-01

    False positive and false negative incidence rates of radiological monitoring data from classical and Bayesian statistical process control chart techniques are compared. The on-line monitoring for illicit radioactive material with no false positives or false negatives is the goal of homeland security monitoring, but is unrealistic. However, statistical fluctuations in the detector signal, short detection times, large source to detector distances, and shielding effects make distinguishing between a radiation source and natural background particularly difficult. Experimental time series data were collected using a 1' x 1' LaCl 3 (Ce) based scintillation detector (Scionix, Orlando, FL) under various simulated conditions. Experimental parameters include radionuclide (gamma-ray) energy, activity, density thickness (source to detector distance and shielding), time, and temperature. All statistical algorithms were developed using MATLAB TM . The Shewhart (3-σ) control chart and the cumulative sum (CUSUM) control chart are the classical procedures adopted, while the Bayesian technique is the Shiryayev-Roberts (S-R) control chart. The Shiryayev-Roberts method was the best method for controlling the number of false positive detects, followed by the CUSUM method. However, The Shiryayev-Roberts method, used without modification, resulted in one of the highest false negative incidence rates independent of the signal strength. Modification of The Shiryayev-Roberts statistical analysis method reduced the number of false negatives, but resulted in an increase in the false positive incidence rate. (author)

  5. Elements of probability and statistics an introduction to probability with De Finetti’s approach and to Bayesian statistics

    CERN Document Server

    Biagini, Francesca

    2016-01-01

    This book provides an introduction to elementary probability and to Bayesian statistics using de Finetti's subjectivist approach. One of the features of this approach is that it does not require the introduction of sample space – a non-intrinsic concept that makes the treatment of elementary probability unnecessarily complicate – but introduces as fundamental the concept of random numbers directly related to their interpretation in applications. Events become a particular case of random numbers and probability a particular case of expectation when it is applied to events. The subjective evaluation of expectation and of conditional expectation is based on an economic choice of an acceptable bet or penalty. The properties of expectation and conditional expectation are derived by applying a coherence criterion that the evaluation has to follow. The book is suitable for all introductory courses in probability and statistics for students in Mathematics, Informatics, Engineering, and Physics.

  6. Statistical modelling of railway track geometry degradation using Hierarchical Bayesian models

    International Nuclear Information System (INIS)

    Andrade, A.R.; Teixeira, P.F.

    2015-01-01

    Railway maintenance planners require a predictive model that can assess the railway track geometry degradation. The present paper uses a Hierarchical Bayesian model as a tool to model the main two quality indicators related to railway track geometry degradation: the standard deviation of longitudinal level defects and the standard deviation of horizontal alignment defects. Hierarchical Bayesian Models (HBM) are flexible statistical models that allow specifying different spatially correlated components between consecutive track sections, namely for the deterioration rates and the initial qualities parameters. HBM are developed for both quality indicators, conducting an extensive comparison between candidate models and a sensitivity analysis on prior distributions. HBM is applied to provide an overall assessment of the degradation of railway track geometry, for the main Portuguese railway line Lisbon–Oporto. - Highlights: • Rail track geometry degradation is analysed using Hierarchical Bayesian models. • A Gibbs sampling strategy is put forward to estimate the HBM. • Model comparison and sensitivity analysis find the most suitable model. • We applied the most suitable model to all the segments of the main Portuguese line. • Tackling spatial correlations using CAR structures lead to a better model fit

  7. Bayesian Statistics at Work: the Troublesome Extraction of the CKM Phase {alpha}

    Energy Technology Data Exchange (ETDEWEB)

    Charles, J. [CPT, Luminy Case 907, F-13288 Marseille Cedex 9 (France); Hoecker, A. [CERN, CH-1211 Geneva 23 (Switzerland); Lacker, H. [TU Dresden, IKTP, D-01062 Dresden (Germany); Le Diberder, F.R. [LAL, CNRS/IN2P3, Universite Paris-Sud 11, Bat. 200, BP 34, F-91898 Orsay Cedex (France); T' Jampens, S. [LAPP, CNRS/IN2P3, Universite de Savoie, 9 Chemin de Bellevue, BP 110, F-74941 Annecy-le-Vieux Cedex (France)

    2007-04-15

    In Bayesian statistics, one's prior beliefs about underlying model parameters are revised with the information content of observed data from which, using Bayes' rule, a posterior belief is obtained. A non-trivial example taken from the isospin analysis of B {yields} PP (P = {pi} or {rho}) decays in heavy-flavor physics is chosen to illustrate the effect of the naive 'objective' choice of flat priors in a multi- dimensional parameter space in presence of mirror solutions. It is demonstrated that the posterior distribution for the parameter of interest, the phase {alpha}, strongly depends on the choice of the parameterization in which the priors are uniform, and on the validity range in which the (un-normalizable) priors are truncated. We prove that the most probable values found by the Bayesian treatment do not coincide with the explicit analytical solutions, in contrast to the frequentist approach. It is also shown in the appendix that the {alpha} {yields} 0 limit cannot be consistently treated in the Bayesian paradigm, because the latter violates the physical symmetries of the problem. (authors)

  8. Probabilistic Model for Untargeted Peak Detection in LC-MS Using Bayesian Statistics.

    Science.gov (United States)

    Woldegebriel, Michael; Vivó-Truyols, Gabriel

    2015-07-21

    We introduce a novel Bayesian probabilistic peak detection algorithm for liquid chromatography-mass spectroscopy (LC-MS). The final probabilistic result allows the user to make a final decision about which points in a chromatogram are affected by a chromatographic peak and which ones are only affected by noise. The use of probabilities contrasts with the traditional method in which a binary answer is given, relying on a threshold. By contrast, with the Bayesian peak detection presented here, the values of probability can be further propagated into other preprocessing steps, which will increase (or decrease) the importance of chromatographic regions into the final results. The present work is based on the use of the statistical overlap theory of component overlap from Davis and Giddings (Davis, J. M.; Giddings, J. Anal. Chem. 1983, 55, 418-424) as prior probability in the Bayesian formulation. The algorithm was tested on LC-MS Orbitrap data and was able to successfully distinguish chemical noise from actual peaks without any data preprocessing.

  9. Bayesian Statistics at Work: the Troublesome Extraction of the CKM Phase α

    International Nuclear Information System (INIS)

    Charles, J.; Hoecker, A.; Lacker, H.; Le Diberder, F.R.; T'Jampens, S.

    2007-04-01

    In Bayesian statistics, one's prior beliefs about underlying model parameters are revised with the information content of observed data from which, using Bayes' rule, a posterior belief is obtained. A non-trivial example taken from the isospin analysis of B → PP (P = π or ρ) decays in heavy-flavor physics is chosen to illustrate the effect of the naive 'objective' choice of flat priors in a multi- dimensional parameter space in presence of mirror solutions. It is demonstrated that the posterior distribution for the parameter of interest, the phase α, strongly depends on the choice of the parameterization in which the priors are uniform, and on the validity range in which the (un-normalizable) priors are truncated. We prove that the most probable values found by the Bayesian treatment do not coincide with the explicit analytical solutions, in contrast to the frequentist approach. It is also shown in the appendix that the α → 0 limit cannot be consistently treated in the Bayesian paradigm, because the latter violates the physical symmetries of the problem. (authors)

  10. Robust statistics for deterministic and stochastic gravitational waves in non-Gaussian noise. II. Bayesian analyses

    International Nuclear Information System (INIS)

    Allen, Bruce; Creighton, Jolien D.E.; Flanagan, Eanna E.; Romano, Joseph D.

    2003-01-01

    In a previous paper (paper I), we derived a set of near-optimal signal detection techniques for gravitational wave detectors whose noise probability distributions contain non-Gaussian tails. The methods modify standard methods by truncating or clipping sample values which lie in those non-Gaussian tails. The methods were derived, in the frequentist framework, by minimizing false alarm probabilities at fixed false detection probability in the limit of weak signals. For stochastic signals, the resulting statistic consisted of a sum of an autocorrelation term and a cross-correlation term; it was necessary to discard 'by hand' the autocorrelation term in order to arrive at the correct, generalized cross-correlation statistic. In the present paper, we present an alternative derivation of the same signal detection techniques from within the Bayesian framework. We compute, for both deterministic and stochastic signals, the probability that a signal is present in the data, in the limit where the signal-to-noise ratio squared per frequency bin is small, where the signal is nevertheless strong enough to be detected (integrated signal-to-noise ratio large compared to 1), and where the total probability in the non-Gaussian tail part of the noise distribution is small. We show that, for each model considered, the resulting probability is to a good approximation a monotonic function of the detection statistic derived in paper I. Moreover, for stochastic signals, the new Bayesian derivation automatically eliminates the problematic autocorrelation term

  11. Enhancing pediatric clinical trial feasibility through the use of Bayesian statistics.

    Science.gov (United States)

    Huff, Robin A; Maca, Jeff D; Puri, Mala; Seltzer, Earl W

    2017-11-01

    BackgroundPediatric clinical trials commonly experience recruitment challenges including limited number of patients and investigators, inclusion/exclusion criteria that further reduce the patient pool, and a competitive research landscape created by pediatric regulatory commitments. To overcome these challenges, innovative approaches are needed.MethodsThis article explores the use of Bayesian statistics to improve pediatric trial feasibility, using pediatric Type-2 diabetes as an example. Data for six therapies approved for adults were used to perform simulations to determine the impact on pediatric trial size.ResultsWhen the number of adult patients contributing to the simulation was assumed to be the same as the number of patients to be enrolled in the pediatric trial, the pediatric trial size was reduced by 75-78% when compared with a frequentist statistical approach, but was associated with a 34-45% false-positive rate. In subsequent simulations, greater control was exerted over the false-positive rate by decreasing the contribution of the adult data. A 30-33% reduction in trial size was achieved when false-positives were held to less than 10%.ConclusionReducing the trial size through the use of Bayesian statistics would facilitate completion of pediatric trials, enabling drugs to be labeled appropriately for children.

  12. Dissolution curve comparisons through the F(2) parameter, a Bayesian extension of the f(2) statistic.

    Science.gov (United States)

    Novick, Steven; Shen, Yan; Yang, Harry; Peterson, John; LeBlond, Dave; Altan, Stan

    2015-01-01

    Dissolution (or in vitro release) studies constitute an important aspect of pharmaceutical drug development. One important use of such studies is for justifying a biowaiver for post-approval changes which requires establishing equivalence between the new and old product. We propose a statistically rigorous modeling approach for this purpose based on the estimation of what we refer to as the F2 parameter, an extension of the commonly used f2 statistic. A Bayesian test procedure is proposed in relation to a set of composite hypotheses that capture the similarity requirement on the absolute mean differences between test and reference dissolution profiles. Several examples are provided to illustrate the application. Results of our simulation study comparing the performance of f2 and the proposed method show that our Bayesian approach is comparable to or in many cases superior to the f2 statistic as a decision rule. Further useful extensions of the method, such as the use of continuous-time dissolution modeling, are considered.

  13. 2nd Bayesian Young Statisticians Meeting

    CERN Document Server

    Bitto, Angela; Kastner, Gregor; Posekany, Alexandra

    2015-01-01

    The Second Bayesian Young Statisticians Meeting (BAYSM 2014) and the research presented here facilitate connections among researchers using Bayesian Statistics by providing a forum for the development and exchange of ideas. WU Vienna University of Business and Economics hosted BAYSM 2014 from September 18th to 19th. The guidance of renowned plenary lecturers and senior discussants is a critical part of the meeting and this volume, which follows publication of contributions from BAYSM 2013. The meeting's scientific program reflected the variety of fields in which Bayesian methods are currently employed or could be introduced in the future. Three brilliant keynote lectures by Chris Holmes (University of Oxford), Christian Robert (Université Paris-Dauphine), and Mike West (Duke University), were complemented by 24 plenary talks covering the major topics Dynamic Models, Applications, Bayesian Nonparametrics, Biostatistics, Bayesian Methods in Economics, and Models and Methods, as well as a lively poster session ...

  14. Assessing compositional variability through graphical analysis and Bayesian statistical approaches: case studies on transgenic crops.

    Science.gov (United States)

    Harrigan, George G; Harrison, Jay M

    2012-01-01

    New transgenic (GM) crops are subjected to extensive safety assessments that include compositional comparisons with conventional counterparts as a cornerstone of the process. The influence of germplasm, location, environment, and agronomic treatments on compositional variability is, however, often obscured in these pair-wise comparisons. Furthermore, classical statistical significance testing can often provide an incomplete and over-simplified summary of highly responsive variables such as crop composition. In order to more clearly describe the influence of the numerous sources of compositional variation we present an introduction to two alternative but complementary approaches to data analysis and interpretation. These include i) exploratory data analysis (EDA) with its emphasis on visualization and graphics-based approaches and ii) Bayesian statistical methodology that provides easily interpretable and meaningful evaluations of data in terms of probability distributions. The EDA case-studies include analyses of herbicide-tolerant GM soybean and insect-protected GM maize and soybean. Bayesian approaches are presented in an analysis of herbicide-tolerant GM soybean. Advantages of these approaches over classical frequentist significance testing include the more direct interpretation of results in terms of probabilities pertaining to quantities of interest and no confusion over the application of corrections for multiple comparisons. It is concluded that a standardized framework for these methodologies could provide specific advantages through enhanced clarity of presentation and interpretation in comparative assessments of crop composition.

  15. Intelligent Condition Diagnosis Method Based on Adaptive Statistic Test Filter and Diagnostic Bayesian Network.

    Science.gov (United States)

    Li, Ke; Zhang, Qiuju; Wang, Kun; Chen, Peng; Wang, Huaqing

    2016-01-08

    A new fault diagnosis method for rotating machinery based on adaptive statistic test filter (ASTF) and Diagnostic Bayesian Network (DBN) is presented in this paper. ASTF is proposed to obtain weak fault features under background noise, ASTF is based on statistic hypothesis testing in the frequency domain to evaluate similarity between reference signal (noise signal) and original signal, and remove the component of high similarity. The optimal level of significance α is obtained using particle swarm optimization (PSO). To evaluate the performance of the ASTF, evaluation factor Ipq is also defined. In addition, a simulation experiment is designed to verify the effectiveness and robustness of ASTF. A sensitive evaluation method using principal component analysis (PCA) is proposed to evaluate the sensitiveness of symptom parameters (SPs) for condition diagnosis. By this way, the good SPs that have high sensitiveness for condition diagnosis can be selected. A three-layer DBN is developed to identify condition of rotation machinery based on the Bayesian Belief Network (BBN) theory. Condition diagnosis experiment for rolling element bearings demonstrates the effectiveness of the proposed method.

  16. Intelligent Condition Diagnosis Method Based on Adaptive Statistic Test Filter and Diagnostic Bayesian Network

    Directory of Open Access Journals (Sweden)

    Ke Li

    2016-01-01

    Full Text Available A new fault diagnosis method for rotating machinery based on adaptive statistic test filter (ASTF and Diagnostic Bayesian Network (DBN is presented in this paper. ASTF is proposed to obtain weak fault features under background noise, ASTF is based on statistic hypothesis testing in the frequency domain to evaluate similarity between reference signal (noise signal and original signal, and remove the component of high similarity. The optimal level of significance α is obtained using particle swarm optimization (PSO. To evaluate the performance of the ASTF, evaluation factor Ipq is also defined. In addition, a simulation experiment is designed to verify the effectiveness and robustness of ASTF. A sensitive evaluation method using principal component analysis (PCA is proposed to evaluate the sensitiveness of symptom parameters (SPs for condition diagnosis. By this way, the good SPs that have high sensitiveness for condition diagnosis can be selected. A three-layer DBN is developed to identify condition of rotation machinery based on the Bayesian Belief Network (BBN theory. Condition diagnosis experiment for rolling element bearings demonstrates the effectiveness of the proposed method.

  17. Intelligent Condition Diagnosis Method Based on Adaptive Statistic Test Filter and Diagnostic Bayesian Network

    Science.gov (United States)

    Li, Ke; Zhang, Qiuju; Wang, Kun; Chen, Peng; Wang, Huaqing

    2016-01-01

    A new fault diagnosis method for rotating machinery based on adaptive statistic test filter (ASTF) and Diagnostic Bayesian Network (DBN) is presented in this paper. ASTF is proposed to obtain weak fault features under background noise, ASTF is based on statistic hypothesis testing in the frequency domain to evaluate similarity between reference signal (noise signal) and original signal, and remove the component of high similarity. The optimal level of significance α is obtained using particle swarm optimization (PSO). To evaluate the performance of the ASTF, evaluation factor Ipq is also defined. In addition, a simulation experiment is designed to verify the effectiveness and robustness of ASTF. A sensitive evaluation method using principal component analysis (PCA) is proposed to evaluate the sensitiveness of symptom parameters (SPs) for condition diagnosis. By this way, the good SPs that have high sensitiveness for condition diagnosis can be selected. A three-layer DBN is developed to identify condition of rotation machinery based on the Bayesian Belief Network (BBN) theory. Condition diagnosis experiment for rolling element bearings demonstrates the effectiveness of the proposed method. PMID:26761006

  18. Evaluation of Oceanic Transport Statistics By Use of Transient Tracers and Bayesian Methods

    Science.gov (United States)

    Trossman, D. S.; Thompson, L.; Mecking, S.; Bryan, F.; Peacock, S.

    2013-12-01

    Key variables that quantify the time scales over which atmospheric signals penetrate into the oceanic interior and their uncertainties are computed using Bayesian methods and transient tracers from both models and observations. First, the mean residence times, subduction rates, and formation rates of Subtropical Mode Water (STMW) and Subpolar Mode Water (SPMW) in the North Atlantic and Subantarctic Mode Water (SAMW) in the Southern Ocean are estimated by combining a model and observations of chlorofluorocarbon-11 (CFC-11) via Bayesian Model Averaging (BMA), statistical technique that weights model estimates according to how close they agree with observations. Second, a Bayesian method is presented to find two oceanic transport parameters associated with the age distribution of ocean waters, the transit-time distribution (TTD), by combining an eddying global ocean model's estimate of the TTD with hydrographic observations of CFC-11, temperature, and salinity. Uncertainties associated with objectively mapping irregularly spaced bottle data are quantified by making use of a thin-plate spline and then propagated via the two Bayesian techniques. It is found that the subduction of STMW, SPMW, and SAMW is mostly an advective process, but up to about one-third of STMW subduction likely owes to non-advective processes. Also, while the formation of STMW is mostly due to subduction, the formation of SPMW is mostly due to other processes. About half of the formation of SAMW is due to subduction and half is due to other processes. A combination of air-sea flux, acting on relatively short time scales, and turbulent mixing, acting on a wide range of time scales, is likely the dominant SPMW erosion mechanism. Air-sea flux is likely responsible for most STMW erosion, and turbulent mixing is likely responsible for most SAMW erosion. Two oceanic transport parameters, the mean age of a water parcel and the half-variance associated with the TTD, estimated using the model's tracers as

  19. Exploring the Connection Between Sampling Problems in Bayesian Inference and Statistical Mechanics

    Science.gov (United States)

    Pohorille, Andrew

    2006-01-01

    The Bayesian and statistical mechanical communities often share the same objective in their work - estimating and integrating probability distribution functions (pdfs) describing stochastic systems, models or processes. Frequently, these pdfs are complex functions of random variables exhibiting multiple, well separated local minima. Conventional strategies for sampling such pdfs are inefficient, sometimes leading to an apparent non-ergodic behavior. Several recently developed techniques for handling this problem have been successfully applied in statistical mechanics. In the multicanonical and Wang-Landau Monte Carlo (MC) methods, the correct pdfs are recovered from uniform sampling of the parameter space by iteratively establishing proper weighting factors connecting these distributions. Trivial generalizations allow for sampling from any chosen pdf. The closely related transition matrix method relies on estimating transition probabilities between different states. All these methods proved to generate estimates of pdfs with high statistical accuracy. In another MC technique, parallel tempering, several random walks, each corresponding to a different value of a parameter (e.g. "temperature"), are generated and occasionally exchanged using the Metropolis criterion. This method can be considered as a statistically correct version of simulated annealing. An alternative approach is to represent the set of independent variables as a Hamiltonian system. Considerab!e progress has been made in understanding how to ensure that the system obeys the equipartition theorem or, equivalently, that coupling between the variables is correctly described. Then a host of techniques developed for dynamical systems can be used. Among them, probably the most powerful is the Adaptive Biasing Force method, in which thermodynamic integration and biased sampling are combined to yield very efficient estimates of pdfs. The third class of methods deals with transitions between states described

  20. Application of Bayesian statistical decision theory to the optimization of generating set maintenance

    International Nuclear Information System (INIS)

    Procaccia, H.; Cordier, R.; Muller, S.

    1994-07-01

    Statistical decision theory could be a alternative for the optimization of preventive maintenance periodicity. In effect, this theory concerns the situation in which a decision maker has to make a choice between a set of reasonable decisions, and where the loss associated to a given decision depends on a probabilistic risk, called state of nature. In the case of maintenance optimization, the decisions to be analyzed are different periodicities proposed by the experts, given the observed feedback experience, the states of nature are the associated failure probabilities, and the losses are the expectations of the induced cost of maintenance and of consequences of the failures. As failure probabilities concern rare events, at the ultimate state of RCM analysis (failure of sub-component), and as expected foreseeable behaviour of equipment has to be evaluated by experts, Bayesian approach is successfully used to compute states of nature. In Bayesian decision theory, a prior distribution for failure probabilities is modeled from expert knowledge, and is combined with few stochastic information provided by feedback experience, giving a posterior distribution of failure probabilities. The optimized decision is the decision that minimizes the expected loss over the posterior distribution. This methodology has been applied to inspection and maintenance optimization of cylinders of diesel generator engines of 900 MW nuclear plants. In these plants, auxiliary electric power is supplied by 2 redundant diesel generators which are tested every 2 weeks during about 1 hour. Until now, during yearly refueling of each plant, one endoscopic inspection of diesel cylinders is performed, and every 5 operating years, all cylinders are replaced. RCM has shown that cylinder failures could be critical. So Bayesian decision theory has been applied, taking into account expert opinions, and possibility of aging when maintenance periodicity is extended. (authors). 8 refs., 5 figs., 1 tab

  1. Bayesian versus frequentist statistical inference for investigating a one-off cancer cluster reported to a health department

    Directory of Open Access Journals (Sweden)

    Wills Rachael A

    2009-05-01

    Full Text Available Abstract Background The problem of silent multiple comparisons is one of the most difficult statistical problems faced by scientists. It is a particular problem for investigating a one-off cancer cluster reported to a health department because any one of hundreds, or possibly thousands, of neighbourhoods, schools, or workplaces could have reported a cluster, which could have been for any one of several types of cancer or any one of several time periods. Methods This paper contrasts the frequentist approach with a Bayesian approach for dealing with silent multiple comparisons in the context of a one-off cluster reported to a health department. Two published cluster investigations were re-analysed using the Dunn-Sidak method to adjust frequentist p-values and confidence intervals for silent multiple comparisons. Bayesian methods were based on the Gamma distribution. Results Bayesian analysis with non-informative priors produced results similar to the frequentist analysis, and suggested that both clusters represented a statistical excess. In the frequentist framework, the statistical significance of both clusters was extremely sensitive to the number of silent multiple comparisons, which can only ever be a subjective "guesstimate". The Bayesian approach is also subjective: whether there is an apparent statistical excess depends on the specified prior. Conclusion In cluster investigations, the frequentist approach is just as subjective as the Bayesian approach, but the Bayesian approach is less ambitious in that it treats the analysis as a synthesis of data and personal judgements (possibly poor ones, rather than objective reality. Bayesian analysis is (arguably a useful tool to support complicated decision-making, because it makes the uncertainty associated with silent multiple comparisons explicit.

  2. Crossing statistic: Bayesian interpretation, model selection and resolving dark energy parametrization problem

    International Nuclear Information System (INIS)

    Shafieloo, Arman

    2012-01-01

    By introducing Crossing functions and hyper-parameters I show that the Bayesian interpretation of the Crossing Statistics [1] can be used trivially for the purpose of model selection among cosmological models. In this approach to falsify a cosmological model there is no need to compare it with other models or assume any particular form of parametrization for the cosmological quantities like luminosity distance, Hubble parameter or equation of state of dark energy. Instead, hyper-parameters of Crossing functions perform as discriminators between correct and wrong models. Using this approach one can falsify any assumed cosmological model without putting priors on the underlying actual model of the universe and its parameters, hence the issue of dark energy parametrization is resolved. It will be also shown that the sensitivity of the method to the intrinsic dispersion of the data is small that is another important characteristic of the method in testing cosmological models dealing with data with high uncertainties

  3. How to interpret the results of medical time series data analysis: Classical statistical approaches versus dynamic Bayesian network modeling.

    Science.gov (United States)

    Onisko, Agnieszka; Druzdzel, Marek J; Austin, R Marshall

    2016-01-01

    Classical statistics is a well-established approach in the analysis of medical data. While the medical community seems to be familiar with the concept of a statistical analysis and its interpretation, the Bayesian approach, argued by many of its proponents to be superior to the classical frequentist approach, is still not well-recognized in the analysis of medical data. The goal of this study is to encourage data analysts to use the Bayesian approach, such as modeling with graphical probabilistic networks, as an insightful alternative to classical statistical analysis of medical data. This paper offers a comparison of two approaches to analysis of medical time series data: (1) classical statistical approach, such as the Kaplan-Meier estimator and the Cox proportional hazards regression model, and (2) dynamic Bayesian network modeling. Our comparison is based on time series cervical cancer screening data collected at Magee-Womens Hospital, University of Pittsburgh Medical Center over 10 years. The main outcomes of our comparison are cervical cancer risk assessments produced by the three approaches. However, our analysis discusses also several aspects of the comparison, such as modeling assumptions, model building, dealing with incomplete data, individualized risk assessment, results interpretation, and model validation. Our study shows that the Bayesian approach is (1) much more flexible in terms of modeling effort, and (2) it offers an individualized risk assessment, which is more cumbersome for classical statistical approaches.

  4. Reliability of equipments and theory of frequency statistics and Bayesian decision

    International Nuclear Information System (INIS)

    Procaccia, H.; Clarotti, C.A.

    1992-01-01

    The rapid development of Bayesian techniques use in the domain of industrial risk is a recent phenomenon linked to the development of powerful computers. These techniques involve a reasoning well adapted to experimental logics, based on the dynamical knowledge enrichment with experience data. In the framework of reliability studies and statistical decision making, these methods differ slightly from the methods commonly used to evaluate the reliability of systems and from classical theoretical frequency statistics. This particular approach is described in this book and illustrated with many examples of application (power plants, pressure vessels, industrial installations etc..). These examples generally concern the risk management in the cases where the application of rules and the respect of norms become insufficient. It is now well known that the risk cannot be reduced to zero and that its evaluation must be performed using statistics, taking into account the possible accident processes and also the investments necessary to avoid them (service life, failure, maintenance costs and availability of materials). The result is the optimizing of a decision process about rare or uncertain events. (J.S.)

  5. A novel approach for choosing summary statistics in approximate Bayesian computation.

    Science.gov (United States)

    Aeschbacher, Simon; Beaumont, Mark A; Futschik, Andreas

    2012-11-01

    The choice of summary statistics is a crucial step in approximate Bayesian computation (ABC). Since statistics are often not sufficient, this choice involves a trade-off between loss of information and reduction of dimensionality. The latter may increase the efficiency of ABC. Here, we propose an approach for choosing summary statistics based on boosting, a technique from the machine-learning literature. We consider different types of boosting and compare them to partial least-squares regression as an alternative. To mitigate the lack of sufficiency, we also propose an approach for choosing summary statistics locally, in the putative neighborhood of the true parameter value. We study a demographic model motivated by the reintroduction of Alpine ibex (Capra ibex) into the Swiss Alps. The parameters of interest are the mean and standard deviation across microsatellites of the scaled ancestral mutation rate (θ(anc) = 4N(e)u) and the proportion of males obtaining access to matings per breeding season (ω). By simulation, we assess the properties of the posterior distribution obtained with the various methods. According to our criteria, ABC with summary statistics chosen locally via boosting with the L(2)-loss performs best. Applying that method to the ibex data, we estimate θ(anc)≈ 1.288 and find that most of the variation across loci of the ancestral mutation rate u is between 7.7 × 10(-4) and 3.5 × 10(-3) per locus per generation. The proportion of males with access to matings is estimated as ω≈ 0.21, which is in good agreement with recent independent estimates.

  6. Bayesian inference in an extended SEIR model with nonparametric disease transmission rate: an application to the Ebola epidemic in Sierra Leone.

    Science.gov (United States)

    Frasso, Gianluca; Lambert, Philippe

    2016-10-01

    SummaryThe 2014 Ebola outbreak in Sierra Leone is analyzed using a susceptible-exposed-infectious-removed (SEIR) epidemic compartmental model. The discrete time-stochastic model for the epidemic evolution is coupled to a set of ordinary differential equations describing the dynamics of the expected proportions of subjects in each epidemic state. The unknown parameters are estimated in a Bayesian framework by combining data on the number of new (laboratory confirmed) Ebola cases reported by the Ministry of Health and prior distributions for the transition rates elicited using information collected by the WHO during the follow-up of specific Ebola cases. The time-varying disease transmission rate is modeled in a flexible way using penalized B-splines. Our framework represents a valuable stochastic tool for the study of an epidemic dynamic even when only irregularly observed and possibly aggregated data are available. Simulations and the analysis of the 2014 Sierra Leone Ebola data highlight the merits of the proposed methodology. In particular, the flexible modeling of the disease transmission rate makes the estimation of the effective reproduction number robust to the misspecification of the initial epidemic states and to underreporting of the infectious cases. © The Author 2016. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  7. Bayesian-statistical decision threshold, detection limit, and confidence interval in nuclear radiation measurement

    International Nuclear Information System (INIS)

    Weise, K.

    1998-01-01

    When a contribution of a particular nuclear radiation is to be detected, for instance, a spectral line of interest for some purpose of radiation protection, and quantities and their uncertainties must be taken into account which, such as influence quantities, cannot be determined by repeated measurements or by counting nuclear radiation events, then conventional statistics of event frequencies is not sufficient for defining the decision threshold, the detection limit, and the limits of a confidence interval. These characteristic limits are therefore redefined on the basis of Bayesian statistics for a wider applicability and in such a way that the usual practice remains as far as possible unaffected. The principle of maximum entropy is applied to establish probability distributions from available information. Quantiles of these distributions are used for defining the characteristic limits. But such a distribution must not be interpreted as a distribution of event frequencies such as the Poisson distribution. It rather expresses the actual state of incomplete knowledge of a physical quantity. The different definitions and interpretations and their quantitative consequences are presented and discussed with two examples. The new approach provides a theoretical basis for the DIN 25482-10 standard presently in preparation for general applications of the characteristic limits. (orig.) [de

  8. A Bayesian statistical analysis of mouse dermal tumor promotion assay data for evaluating cigarette smoke condensate.

    Science.gov (United States)

    Kathman, Steven J; Potts, Ryan J; Ayres, Paul H; Harp, Paul R; Wilson, Cody L; Garner, Charles D

    2010-10-01

    The mouse dermal assay has long been used to assess the dermal tumorigenicity of cigarette smoke condensate (CSC). This mouse skin model has been developed for use in carcinogenicity testing utilizing the SENCAR mouse as the standard strain. Though the model has limitations, it remains as the most relevant method available to study the dermal tumor promoting potential of mainstream cigarette smoke. In the typical SENCAR mouse CSC bioassay, CSC is applied for 29 weeks following the application of a tumor initiator such as 7,12-dimethylbenz[a]anthracene (DMBA). Several endpoints are considered for analysis including: the percentage of animals with at least one mass, latency, and number of masses per animal. In this paper, a relatively straightforward analytic model and procedure is presented for analyzing the time course of the incidence of masses. The procedure considered here takes advantage of Bayesian statistical techniques, which provide powerful methods for model fitting and simulation. Two datasets are analyzed to illustrate how the model fits the data, how well the model may perform in predicting data from such trials, and how the model may be used as a decision tool when comparing the dermal tumorigenicity of cigarette smoke condensate from multiple cigarette types. The analysis presented here was developed as a statistical decision tool for differentiating between two or more prototype products based on the dermal tumorigenicity. Copyright (c) 2010 Elsevier Inc. All rights reserved.

  9. To be certain about the uncertainty: Bayesian statistics for 13 C metabolic flux analysis.

    Science.gov (United States)

    Theorell, Axel; Leweke, Samuel; Wiechert, Wolfgang; Nöh, Katharina

    2017-11-01

    13 C Metabolic Fluxes Analysis ( 13 C MFA) remains to be the most powerful approach to determine intracellular metabolic reaction rates. Decisions on strain engineering and experimentation heavily rely upon the certainty with which these fluxes are estimated. For uncertainty quantification, the vast majority of 13 C MFA studies relies on confidence intervals from the paradigm of Frequentist statistics. However, it is well known that the confidence intervals for a given experimental outcome are not uniquely defined. As a result, confidence intervals produced by different methods can be different, but nevertheless equally valid. This is of high relevance to 13 C MFA, since practitioners regularly use three different approximate approaches for calculating confidence intervals. By means of a computational study with a realistic model of the central carbon metabolism of E. coli, we provide strong evidence that confidence intervals used in the field depend strongly on the technique with which they were calculated and, thus, their use leads to misinterpretation of the flux uncertainty. In order to provide a better alternative to confidence intervals in 13 C MFA, we demonstrate that credible intervals from the paradigm of Bayesian statistics give more reliable flux uncertainty quantifications which can be readily computed with high accuracy using Markov chain Monte Carlo. In addition, the widely applied chi-square test, as a means of testing whether the model reproduces the data, is examined closer. © 2017 Wiley Periodicals, Inc.

  10. Nonparametric Statistics Test Software Package.

    Science.gov (United States)

    1983-09-01

    25 I1l,lCELL WRITE (NCF,12 ) IvE (I ,RCCT(I) 122 FORMAT(IlXt 3(H5 9 1) IF( IeLT *NCELL) WRITE (NOF1123 J PARTV(I1J 123 FORMAT( Xll----’,FIo.3J 25 CONT...the user’s entries. Its purpose is to write two types of files needed by the program Crunch: the data file, and the option file. 211 Iuill rateLchiavar...data file and communicate the choice of test and test parameters to Crunch. After a data file is written, Lochinvar prompts the writing of the

  11. OSL Age Determination of the Hearths in a Bronze Age Dwelling Site by using Bayesian Statistics

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Myung Jin [Neosiskorea Co. Ltd., Seoul (Korea, Republic of); Yang, Hye Jin [Baekje Cultural Properties Research Institute, Gongju (Korea, Republic of); Hong, Duk Geun [Kangwon National University, Chuncheon (Korea, Republic of)

    2011-06-15

    OSL dating for three hearths having the sequence of use and discard in No. 29 and 29-1 dwelling sites at Sogol cultural site was carried out. Resulting from the deconvolution of natural CW-OSL decay curve and thermal zeroing test, it was turned out that OSL signal was entirely composed of the heat- and light-sensitive fast component with high photoionization cross-section and all quartz OSL signals were thermally bleached under 300 .deg. C which is the minimum temperature related to heating and cooking in Bronze age. After dose recovery test and plateau test, paleodose of each hearth sample was evaluated by using SAR method, and OSL age was determined from the ratio of paleodose to annual dose rate. For the purpose of the precision improvement of OSL age, Bayesian statistics was applied to each hearth's age and the archaeological sequence information. Finally, it could be concluded to the accurate use period of each hearth from the resultant OSL ages.

  12. Automated parameter estimation for biological models using Bayesian statistical model checking.

    Science.gov (United States)

    Hussain, Faraz; Langmead, Christopher J; Mi, Qi; Dutta-Moscato, Joyeeta; Vodovotz, Yoram; Jha, Sumit K

    2015-01-01

    Probabilistic models have gained widespread acceptance in the systems biology community as a useful way to represent complex biological systems. Such models are developed using existing knowledge of the structure and dynamics of the system, experimental observations, and inferences drawn from statistical analysis of empirical data. A key bottleneck in building such models is that some system variables cannot be measured experimentally. These variables are incorporated into the model as numerical parameters. Determining values of these parameters that justify existing experiments and provide reliable predictions when model simulations are performed is a key research problem. Using an agent-based model of the dynamics of acute inflammation, we demonstrate a novel parameter estimation algorithm by discovering the amount and schedule of doses of bacterial lipopolysaccharide that guarantee a set of observed clinical outcomes with high probability. We synthesized values of twenty-eight unknown parameters such that the parameterized model instantiated with these parameter values satisfies four specifications describing the dynamic behavior of the model. We have developed a new algorithmic technique for discovering parameters in complex stochastic models of biological systems given behavioral specifications written in a formal mathematical logic. Our algorithm uses Bayesian model checking, sequential hypothesis testing, and stochastic optimization to automatically synthesize parameters of probabilistic biological models.

  13. OSL Age Determination of the Hearths in a Bronze Age Dwelling Site by using Bayesian Statistics

    International Nuclear Information System (INIS)

    Kim, Myung Jin; Yang, Hye Jin; Hong, Duk Geun

    2011-01-01

    OSL dating for three hearths having the sequence of use and discard in No. 29 and 29-1 dwelling sites at Sogol cultural site was carried out. Resulting from the deconvolution of natural CW-OSL decay curve and thermal zeroing test, it was turned out that OSL signal was entirely composed of the heat- and light-sensitive fast component with high photoionization cross-section and all quartz OSL signals were thermally bleached under 300 .deg. C which is the minimum temperature related to heating and cooking in Bronze age. After dose recovery test and plateau test, paleodose of each hearth sample was evaluated by using SAR method, and OSL age was determined from the ratio of paleodose to annual dose rate. For the purpose of the precision improvement of OSL age, Bayesian statistics was applied to each hearth's age and the archaeological sequence information. Finally, it could be concluded to the accurate use period of each hearth from the resultant OSL ages

  14. A Bayesian Framework for Multiple Trait Colo-calization from Summary Association Statistics.

    Science.gov (United States)

    Giambartolomei, Claudia; Zhenli Liu, Jimmy; Zhang, Wen; Hauberg, Mads; Shi, Huwenbo; Boocock, James; Pickrell, Joe; Jaffe, Andrew E; Pasaniuc, Bogdan; Roussos, Panos

    2018-03-19

    Most genetic variants implicated in complex diseases by genome-wide association studies (GWAS) are non-coding, making it challenging to understand the causative genes involved in disease. Integrating external information such as quantitative trait locus (QTL) mapping of molecular traits (e.g., expression, methylation) is a powerful approach to identify the subset of GWAS signals explained by regulatory effects. In particular, expression QTLs (eQTLs) help pinpoint the responsible gene among the GWAS regions that harbor many genes, while methylation QTLs (mQTLs) help identify the epigenetic mechanisms that impact gene expression which in turn affect disease risk. In this work we propose multiple-trait-coloc (moloc), a Bayesian statistical framework that integrates GWAS summary data with multiple molecular QTL data to identify regulatory effects at GWAS risk loci. We applied moloc to schizophrenia (SCZ) and eQTL/mQTL data derived from human brain tissue and identified 52 candidate genes that influence SCZ through methylation. Our method can be applied to any GWAS and relevant functional data to help prioritize disease associated genes. moloc is available for download as an R package (https://github.com/clagiamba/moloc). We also developed a web site to visualize the biological findings (icahn.mssm.edu/moloc). The browser allows searches by gene, methylation probe, and scenario of interest. claudia.giambartolomei@gmail.com. Supplementary data are available at Bioinformatics online.

  15. Application of Bayesian statistical decision theory for a maintenance optimization problem

    International Nuclear Information System (INIS)

    Procaccia, H.; Cordier, R.; Muller, S.

    1997-01-01

    Reliability-centered maintenance (RCM) is a rational approach that can be used to identify the equipment of facilities that may turn out to be critical with respect to safety, to availability, or to maintenance costs. Is is dor these critical pieces of equipment alone that a corrective (one waits for a failure) or preventive (the type and frequency are specified) maintenance policy is established. But this approach has limitations: - when there is little operating feedback and it concerns rare events affecting a piece of equipment judged critical on a priori grounds (how is it possible, in this case, to decide whether or not it is critical, since there is conflict between the gravity of the potential failure and its frequency?); - when the aim is propose an optimal maintenance frequency for a critical piece of equipment - changing the maintenance frequency hitherto applied may cause a significant drift in the observed reliability of the equipment, an aspect not generally taken into account in the RCM approach. In these two situations, expert judgments can be combined with the available operating feedback (Bayesian approach) and the combination of risk of failure and economic consequences taken into account (statistical decision theory) to achieve a true optimization of maintenance policy choices. This paper presents an application on the maintenance of diesel generator component

  16. A contingency table approach to nonparametric testing

    CERN Document Server

    Rayner, JCW

    2000-01-01

    Most texts on nonparametric techniques concentrate on location and linear-linear (correlation) tests, with less emphasis on dispersion effects and linear-quadratic tests. Tests for higher moment effects are virtually ignored. Using a fresh approach, A Contingency Table Approach to Nonparametric Testing unifies and extends the popular, standard tests by linking them to tests based on models for data that can be presented in contingency tables.This approach unifies popular nonparametric statistical inference and makes the traditional, most commonly performed nonparametric analyses much more comp

  17. Nonparametric inference of network structure and dynamics

    Science.gov (United States)

    Peixoto, Tiago P.

    The network structure of complex systems determine their function and serve as evidence for the evolutionary mechanisms that lie behind them. Despite considerable effort in recent years, it remains an open challenge to formulate general descriptions of the large-scale structure of network systems, and how to reliably extract such information from data. Although many approaches have been proposed, few methods attempt to gauge the statistical significance of the uncovered structures, and hence the majority cannot reliably separate actual structure from stochastic fluctuations. Due to the sheer size and high-dimensionality of many networks, this represents a major limitation that prevents meaningful interpretations of the results obtained with such nonstatistical methods. In this talk, I will show how these issues can be tackled in a principled and efficient fashion by formulating appropriate generative models of network structure that can have their parameters inferred from data. By employing a Bayesian description of such models, the inference can be performed in a nonparametric fashion, that does not require any a priori knowledge or ad hoc assumptions about the data. I will show how this approach can be used to perform model comparison, and how hierarchical models yield the most appropriate trade-off between model complexity and quality of fit based on the statistical evidence present in the data. I will also show how this general approach can be elegantly extended to networks with edge attributes, that are embedded in latent spaces, and that change in time. The latter is obtained via a fully dynamic generative network model, based on arbitrary-order Markov chains, that can also be inferred in a nonparametric fashion. Throughout the talk I will illustrate the application of the methods with many empirical networks such as the internet at the autonomous systems level, the global airport network, the network of actors and films, social networks, citations among

  18. Review of bayesian statistical analysis methods for cytogenetic radiation biodosimetry, with a practical example

    International Nuclear Information System (INIS)

    Ainsbury, Elizabeth A.; Lloyd, David C.; Rothkamm, Kai; Vinnikov, Volodymyr A.; Maznyk, Nataliya A.; Puig, Pedro; Higueras, Manuel

    2014-01-01

    Classical methods of assessing the uncertainty associated with radiation doses estimated using cytogenetic techniques are now extremely well defined. However, several authors have suggested that a Bayesian approach to uncertainty estimation may be more suitable for cytogenetic data, which are inherently stochastic in nature. The Bayesian analysis framework focuses on identification of probability distributions (for yield of aberrations or estimated dose), which also means that uncertainty is an intrinsic part of the analysis, rather than an 'afterthought'. In this paper Bayesian, as well as some more advanced classical, data analysis methods for radiation cytogenetics are reviewed that have been proposed in the literature. A practical overview of Bayesian cytogenetic dose estimation is also presented, with worked examples from the literature. (authors)

  19. Bayesian Statistical Analysis of Historical and Late Holocene Rates of Sea-Level Change

    Science.gov (United States)

    Cahill, Niamh; Parnell, Andrew; Kemp, Andrew; Horton, Benjamin

    2014-05-01

    A fundamental concern associated with climate change is the rate at which sea levels are rising. Studies of past sea level (particularly beyond the instrumental data range) allow modern sea-level rise to be placed in a more complete context. Considering this, we perform a Bayesian statistical analysis on historical and late Holocene rates of sea-level change. The data that form the input to the statistical model are tide-gauge measurements and proxy reconstructions from cores of coastal sediment. The aims are to estimate rates of sea-level rise, to determine when modern rates of sea-level rise began and to observe how these rates have been changing over time. Many of the current methods for doing this use simple linear regression to estimate rates. This is often inappropriate as it is too rigid and it can ignore uncertainties that arise as part of the data collection exercise. This can lead to over confidence in the sea-level trends being characterized. The proposed Bayesian model places a Gaussian process prior on the rate process (i.e. the process that determines how rates of sea-level are changing over time). The likelihood of the observed data is the integral of this process. When dealing with proxy reconstructions, this is set in an errors-in-variables framework so as to take account of age uncertainty. It is also necessary, in this case, for the model to account for glacio-isostatic adjustment, which introduces a covariance between individual age and sea-level observations. This method provides a flexible fit and it allows for the direct estimation of the rate process with full consideration of all sources of uncertainty. Analysis of tide-gauge datasets and proxy reconstructions in this way means that changing rates of sea level can be estimated more comprehensively and accurately than previously possible. The model captures the continuous and dynamic evolution of sea-level change and results show that not only are modern sea levels rising but that the rates

  20. Statistical analysis of modal parameters of a suspension bridge based on Bayesian spectral density approach and SHM data

    Science.gov (United States)

    Li, Zhijun; Feng, Maria Q.; Luo, Longxi; Feng, Dongming; Xu, Xiuli

    2018-01-01

    Uncertainty of modal parameters estimation appear in structural health monitoring (SHM) practice of civil engineering to quite some significant extent due to environmental influences and modeling errors. Reasonable methodologies are needed for processing the uncertainty. Bayesian inference can provide a promising and feasible identification solution for the purpose of SHM. However, there are relatively few researches on the application of Bayesian spectral method in the modal identification using SHM data sets. To extract modal parameters from large data sets collected by SHM system, the Bayesian spectral density algorithm was applied to address the uncertainty of mode extraction from output-only response of a long-span suspension bridge. The posterior most possible values of modal parameters and their uncertainties were estimated through Bayesian inference. A long-term variation and statistical analysis was performed using the sensor data sets collected from the SHM system of the suspension bridge over a one-year period. The t location-scale distribution was shown to be a better candidate function for frequencies of lower modes. On the other hand, the burr distribution provided the best fitting to the higher modes which are sensitive to the temperature. In addition, wind-induced variation of modal parameters was also investigated. It was observed that both the damping ratios and modal forces increased during the period of typhoon excitations. Meanwhile, the modal damping ratios exhibit significant correlation with the spectral intensities of the corresponding modal forces.

  1. A Bayesian Nonparametric Meta-Analysis Model

    Science.gov (United States)

    Karabatsos, George; Talbott, Elizabeth; Walker, Stephen G.

    2015-01-01

    In a meta-analysis, it is important to specify a model that adequately describes the effect-size distribution of the underlying population of studies. The conventional normal fixed-effect and normal random-effects models assume a normal effect-size population distribution, conditionally on parameters and covariates. For estimating the mean overall…

  2. Nonparametric Bayesian models for a spatial covariance.

    Science.gov (United States)

    Reich, Brian J; Fuentes, Montserrat

    2012-01-01

    A crucial step in the analysis of spatial data is to estimate the spatial correlation function that determines the relationship between a spatial process at two locations. The standard approach to selecting the appropriate correlation function is to use prior knowledge or exploratory analysis, such as a variogram analysis, to select the correct parametric correlation function. Rather that selecting a particular parametric correlation function, we treat the covariance function as an unknown function to be estimated from the data. We propose a flexible prior for the correlation function to provide robustness to the choice of correlation function. We specify the prior for the correlation function using spectral methods and the Dirichlet process prior, which is a common prior for an unknown distribution function. Our model does not require Gaussian data or spatial locations on a regular grid. The approach is demonstrated using a simulation study as well as an analysis of California air pollution data.

  3. Image Denoising via Bayesian Estimation of Statistical Parameter Using Generalized Gamma Density Prior in Gaussian Noise Model

    Science.gov (United States)

    Kittisuwan, Pichid

    2015-03-01

    The application of image processing in industry has shown remarkable success over the last decade, for example, in security and telecommunication systems. The denoising of natural image corrupted by Gaussian noise is a classical problem in image processing. So, image denoising is an indispensable step during image processing. This paper is concerned with dual-tree complex wavelet-based image denoising using Bayesian techniques. One of the cruxes of the Bayesian image denoising algorithms is to estimate the statistical parameter of the image. Here, we employ maximum a posteriori (MAP) estimation to calculate local observed variance with generalized Gamma density prior for local observed variance and Laplacian or Gaussian distribution for noisy wavelet coefficients. Evidently, our selection of prior distribution is motivated by efficient and flexible properties of generalized Gamma density. The experimental results show that the proposed method yields good denoising results.

  4. Improved parameterization of managed grassland in a global process-based vegetation model using Bayesian statistics

    Science.gov (United States)

    Rolinski, S.; Müller, C.; Lotze-Campen, H.; Bondeau, A.

    2010-12-01

    information on boundary conditions such as water and light availability or temperature sensibility. Based on the given limitation factors, a number of sensitive parameters are chosen, e.g. for the phenological development, biomass allocation, and different management regimes. These are introduced to a sensitivity analysis and Bayesian parameter evaluation using the R package FME (Soetart & Petzoldt, Journal of Statistical Software, 2010). Given the extremely different climatic conditions at the FluxNet grass sites, the premises for the global sensitivity analysis are very promising.

  5. Frequentist and Bayesian inference for Gaussian-log-Gaussian wavelet trees and statistical signal processing applications

    DEFF Research Database (Denmark)

    Jacobsen, Christian Robert Dahl; Møller, Jesper

    2017-01-01

    We introduce new estimation methods for a subclass of the Gaussian scale mixture models for wavelet trees by Wainwright, Simoncelli and Willsky that rely on modern results for composite likelihoods and approximate Bayesian inference. Our methodology is illustrated for denoising and edge detection...

  6. Optimum Inductive Methods. A study in Inductive Probability, Bayesian Statistics, and Verisimilitude.

    NARCIS (Netherlands)

    Festa, Roberto

    1992-01-01

    According to the Bayesian view, scientific hypotheses must be appraised in terms of their posterior probabilities relative to the available experimental data. Such posterior probabilities are derived from the prior probabilities of the hypotheses by applying Bayes'theorem. One of the most important

  7. Probabilistic model for untargeted peak detection in LC-MS using Bayesian statistics

    NARCIS (Netherlands)

    Woldegebriel, M.; Vivó-Truyols, G.

    2015-01-01

    We introduce a novel Bayesian probabilistic peak detection algorithm for liquid chromatography mass spectroscopy (LC-MS). The final probabilistic result allows the user to make a final decision about which points in a 2 chromatogram are affected by a chromatographic peak and which ones are only

  8. Two sample Bayesian prediction intervals for order statistics based on the inverse exponential-type distributions using right censored sample

    Directory of Open Access Journals (Sweden)

    M.M. Mohie El-Din

    2011-10-01

    Full Text Available In this paper, two sample Bayesian prediction intervals for order statistics (OS are obtained. This prediction is based on a certain class of the inverse exponential-type distributions using a right censored sample. A general class of prior density functions is used and the predictive cumulative function is obtained in the two samples case. The class of the inverse exponential-type distributions includes several important distributions such the inverse Weibull distribution, the inverse Burr distribution, the loglogistic distribution, the inverse Pareto distribution and the inverse paralogistic distribution. Special cases of the inverse Weibull model such as the inverse exponential model and the inverse Rayleigh model are considered.

  9. Nonparametric predictive inference in reliability

    International Nuclear Information System (INIS)

    Coolen, F.P.A.; Coolen-Schrijner, P.; Yan, K.J.

    2002-01-01

    We introduce a recently developed statistical approach, called nonparametric predictive inference (NPI), to reliability. Bounds for the survival function for a future observation are presented. We illustrate how NPI can deal with right-censored data, and discuss aspects of competing risks. We present possible applications of NPI for Bernoulli data, and we briefly outline applications of NPI for replacement decisions. The emphasis is on introduction and illustration of NPI in reliability contexts, detailed mathematical justifications are presented elsewhere

  10. Predicting uncertainty in future marine ice sheet volume using Bayesian statistical methods

    Science.gov (United States)

    Davis, A. D.

    2015-12-01

    The marine ice instability can trigger rapid retreat of marine ice streams. Recent observations suggest that marine ice systems in West Antarctica have begun retreating. However, unknown ice dynamics, computationally intensive mathematical models, and uncertain parameters in these models make predicting retreat rate and ice volume difficult. In this work, we fuse current observational data with ice stream/shelf models to develop probabilistic predictions of future grounded ice sheet volume. Given observational data (e.g., thickness, surface elevation, and velocity) and a forward model that relates uncertain parameters (e.g., basal friction and basal topography) to these observations, we use a Bayesian framework to define a posterior distribution over the parameters. A stochastic predictive model then propagates uncertainties in these parameters to uncertainty in a particular quantity of interest (QoI)---here, the volume of grounded ice at a specified future time. While the Bayesian approach can in principle characterize the posterior predictive distribution of the QoI, the computational cost of both the forward and predictive models makes this effort prohibitively expensive. To tackle this challenge, we introduce a new Markov chain Monte Carlo method that constructs convergent approximations of the QoI target density in an online fashion, yielding accurate characterizations of future ice sheet volume at significantly reduced computational cost.Our second goal is to attribute uncertainty in these Bayesian predictions to uncertainties in particular parameters. Doing so can help target data collection, for the purpose of constraining the parameters that contribute most strongly to uncertainty in the future volume of grounded ice. For instance, smaller uncertainties in parameters to which the QoI is highly sensitive may account for more variability in the prediction than larger uncertainties in parameters to which the QoI is less sensitive. We use global sensitivity

  11. Nonparametric analysis of blocked ordered categories data: some examples revisited

    Directory of Open Access Journals (Sweden)

    O. Thas

    2006-08-01

    Full Text Available Nonparametric analysis for general block designs can be given by using the Cochran-Mantel-Haenszel (CMH statistics. We demonstrate this with four examples and note that several well-known nonparametric statistics are special cases of CMH statistics.

  12. On the limitations of standard statistical modeling in biological systems: a full Bayesian approach for biology.

    Science.gov (United States)

    Gomez-Ramirez, Jaime; Sanz, Ricardo

    2013-09-01

    One of the most important scientific challenges today is the quantitative and predictive understanding of biological function. Classical mathematical and computational approaches have been enormously successful in modeling inert matter, but they may be inadequate to address inherent features of biological systems. We address the conceptual and methodological obstacles that lie in the inverse problem in biological systems modeling. We introduce a full Bayesian approach (FBA), a theoretical framework to study biological function, in which probability distributions are conditional on biophysical information that physically resides in the biological system that is studied by the scientist. Copyright © 2013 Elsevier Ltd. All rights reserved.

  13. Using Bayesian statistics for modeling PTSD through Latent Growth Mixture Modeling: implementation and discussion

    Directory of Open Access Journals (Sweden)

    Sarah Depaoli

    2015-03-01

    Full Text Available Background: After traumatic events, such as disaster, war trauma, and injuries including burns (which is the focus here, the risk to develop posttraumatic stress disorder (PTSD is approximately 10% (Breslau & Davis, 1992. Latent Growth Mixture Modeling can be used to classify individuals into distinct groups exhibiting different patterns of PTSD (Galatzer-Levy, 2015. Currently, empirical evidence points to four distinct trajectories of PTSD patterns in those who have experienced burn trauma. These trajectories are labeled as: resilient, recovery, chronic, and delayed onset trajectories (e.g., Bonanno, 2004; Bonanno, Brewin, Kaniasty, & Greca, 2010; Maercker, Gäbler, O'Neil, Schützwohl, & Müller, 2013; Pietrzak et al., 2013. The delayed onset trajectory affects only a small group of individuals, that is, about 4–5% (O'Donnell, Elliott, Lau, & Creamer, 2007. In addition to its low frequency, the later onset of this trajectory may contribute to the fact that these individuals can be easily overlooked by professionals. In this special symposium on Estimating PTSD trajectories (Van de Schoot, 2015a, we illustrate how to properly identify this small group of individuals through the Bayesian estimation framework using previous knowledge through priors (see, e.g., Depaoli & Boyajian, 2014; Van de Schoot, Broere, Perryck, Zondervan-Zwijnenburg, & Van Loey, 2015. Method: We used latent growth mixture modeling (LGMM (Van de Schoot, 2015b to estimate PTSD trajectories across 4 years that followed a traumatic burn. We demonstrate and compare results from traditional (maximum likelihood and Bayesian estimation using priors (see, Depaoli, 2012, 2013. Further, we discuss where priors come from and how to define them in the estimation process. Results: We demonstrate that only the Bayesian approach results in the desired theory-driven solution of PTSD trajectories. Since the priors are chosen subjectively, we also present a sensitivity analysis of the

  14. Understanding Short-Term Nonmigrating Tidal Variability in the Ionospheric Dynamo Region from SABER Using Information Theory and Bayesian Statistics

    Science.gov (United States)

    Kumari, K.; Oberheide, J.

    2017-12-01

    Nonmigrating tidal diagnostics of SABER temperature observations in the ionospheric dynamo region reveal a large amount of variability on time-scales of a few days to weeks. In this paper, we discuss the physical reasons for the observed short-term tidal variability using a novel approach based on Information theory and Bayesian statistics. We diagnose short-term tidal variability as a function of season, QBO, ENSO, and solar cycle and other drivers using time dependent probability density functions, Shannon entropy and Kullback-Leibler divergence. The statistical significance of the approach and its predictive capability is exemplified using SABER tidal diagnostics with emphasis on the responses to the QBO and solar cycle. Implications for F-region plasma density will be discussed.

  15. Uncertainty analysis of depth predictions from seismic reflection data using Bayesian statistics

    Science.gov (United States)

    Michelioudakis, Dimitrios G.; Hobbs, Richard W.; Caiado, Camila C. S.

    2018-03-01

    Estimating the depths of target horizons from seismic reflection data is an important task in exploration geophysics. To constrain these depths we need a reliable and accurate velocity model. Here, we build an optimum 2D seismic reflection data processing flow focused on pre - stack deghosting filters and velocity model building and apply Bayesian methods, including Gaussian process emulation and Bayesian History Matching (BHM), to estimate the uncertainties of the depths of key horizons near the borehole DSDP-258 located in the Mentelle Basin, south west of Australia, and compare the results with the drilled core from that well. Following this strategy, the tie between the modelled and observed depths from DSDP-258 core was in accordance with the ± 2σ posterior credibility intervals and predictions for depths to key horizons were made for the two new drill sites, adjacent the existing borehole of the area. The probabilistic analysis allowed us to generate multiple realizations of pre-stack depth migrated images, these can be directly used to better constrain interpretation and identify potential risk at drill sites. The method will be applied to constrain the drilling targets for the upcoming International Ocean Discovery Program (IODP), leg 369.

  16. Topics in Computational Bayesian Statistics With Applications to Hierarchical Models in Astronomy and Sociology

    Science.gov (United States)

    Sahai, Swupnil

    This thesis includes three parts. The overarching theme is how to analyze structured hierarchical data, with applications to astronomy and sociology. The first part discusses how expectation propagation can be used to parallelize the computation when fitting big hierarchical bayesian models. This methodology is then used to fit a novel, nonlinear mixture model to ultraviolet radiation from various regions of the observable universe. The second part discusses how the Stan probabilistic programming language can be used to numerically integrate terms in a hierarchical bayesian model. This technique is demonstrated on supernovae data to significantly speed up convergence to the posterior distribution compared to a previous study that used a Gibbs-type sampler. The third part builds a formal latent kernel representation for aggregate relational data as a way to more robustly estimate the mixing characteristics of agents in a network. In particular, the framework is applied to sociology surveys to estimate, as a function of ego age, the age and sex composition of the personal networks of individuals in the United States.

  17. Uncertainty analysis of depth predictions from seismic reflection data using Bayesian statistics

    Science.gov (United States)

    Michelioudakis, Dimitrios G.; Hobbs, Richard W.; Caiado, Camila C. S.

    2018-06-01

    Estimating the depths of target horizons from seismic reflection data is an important task in exploration geophysics. To constrain these depths we need a reliable and accurate velocity model. Here, we build an optimum 2-D seismic reflection data processing flow focused on pre-stack deghosting filters and velocity model building and apply Bayesian methods, including Gaussian process emulation and Bayesian History Matching, to estimate the uncertainties of the depths of key horizons near the Deep Sea Drilling Project (DSDP) borehole 258 (DSDP-258) located in the Mentelle Basin, southwest of Australia, and compare the results with the drilled core from that well. Following this strategy, the tie between the modelled and observed depths from DSDP-258 core was in accordance with the ±2σ posterior credibility intervals and predictions for depths to key horizons were made for the two new drill sites, adjacent to the existing borehole of the area. The probabilistic analysis allowed us to generate multiple realizations of pre-stack depth migrated images, these can be directly used to better constrain interpretation and identify potential risk at drill sites. The method will be applied to constrain the drilling targets for the upcoming International Ocean Discovery Program, leg 369.

  18. Nonparametric Inference for Periodic Sequences

    KAUST Repository

    Sun, Ying

    2012-02-01

    This article proposes a nonparametric method for estimating the period and values of a periodic sequence when the data are evenly spaced in time. The period is estimated by a "leave-out-one-cycle" version of cross-validation (CV) and complements the periodogram, a widely used tool for period estimation. The CV method is computationally simple and implicitly penalizes multiples of the smallest period, leading to a "virtually" consistent estimator of integer periods. This estimator is investigated both theoretically and by simulation.We also propose a nonparametric test of the null hypothesis that the data have constantmean against the alternative that the sequence of means is periodic. Finally, our methodology is demonstrated on three well-known time series: the sunspots and lynx trapping data, and the El Niño series of sea surface temperatures. © 2012 American Statistical Association and the American Society for Quality.

  19. Comparison between linear and non-parametric regression models for genome-enabled prediction in wheat.

    Science.gov (United States)

    Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne

    2012-12-01

    In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models.

  20. Quantifying Trace Amounts of Aggregates in Biopharmaceuticals Using Analytical Ultracentrifugation Sedimentation Velocity: Bayesian Analyses and F Statistics.

    Science.gov (United States)

    Wafer, Lucas; Kloczewiak, Marek; Luo, Yin

    2016-07-01

    Analytical ultracentrifugation-sedimentation velocity (AUC-SV) is often used to quantify high molar mass species (HMMS) present in biopharmaceuticals. Although these species are often present in trace quantities, they have received significant attention due to their potential immunogenicity. Commonly, AUC-SV data is analyzed as a diffusion-corrected, sedimentation coefficient distribution, or c(s), using SEDFIT to numerically solve Lamm-type equations. SEDFIT also utilizes maximum entropy or Tikhonov-Phillips regularization to further allow the user to determine relevant sample information, including the number of species present, their sedimentation coefficients, and their relative abundance. However, this methodology has several, often unstated, limitations, which may impact the final analysis of protein therapeutics. These include regularization-specific effects, artificial "ripple peaks," and spurious shifts in the sedimentation coefficients. In this investigation, we experimentally verified that an explicit Bayesian approach, as implemented in SEDFIT, can largely correct for these effects. Clear guidelines on how to implement this technique and interpret the resulting data, especially for samples containing micro-heterogeneity (e.g., differential glycosylation), are also provided. In addition, we demonstrated how the Bayesian approach can be combined with F statistics to draw more accurate conclusions and rigorously exclude artifactual peaks. Numerous examples with an antibody and an antibody-drug conjugate were used to illustrate the strengths and drawbacks of each technique.

  1. Nonparametric functional mapping of quantitative trait loci.

    Science.gov (United States)

    Yang, Jie; Wu, Rongling; Casella, George

    2009-03-01

    Functional mapping is a useful tool for mapping quantitative trait loci (QTL) that control dynamic traits. It incorporates mathematical aspects of biological processes into the mixture model-based likelihood setting for QTL mapping, thus increasing the power of QTL detection and the precision of parameter estimation. However, in many situations there is no obvious functional form and, in such cases, this strategy will not be optimal. Here we propose to use nonparametric function estimation, typically implemented with B-splines, to estimate the underlying functional form of phenotypic trajectories, and then construct a nonparametric test to find evidence of existing QTL. Using the representation of a nonparametric regression as a mixed model, the final test statistic is a likelihood ratio test. We consider two types of genetic maps: dense maps and general maps, and the power of nonparametric functional mapping is investigated through simulation studies and demonstrated by examples.

  2. Bayesian Statistics and Uncertainty Quantification for Safety Boundary Analysis in Complex Systems

    Science.gov (United States)

    He, Yuning; Davies, Misty Dawn

    2014-01-01

    The analysis of a safety-critical system often requires detailed knowledge of safe regions and their highdimensional non-linear boundaries. We present a statistical approach to iteratively detect and characterize the boundaries, which are provided as parameterized shape candidates. Using methods from uncertainty quantification and active learning, we incrementally construct a statistical model from only few simulation runs and obtain statistically sound estimates of the shape parameters for safety boundaries.

  3. Variational Bayesian labeled multi-Bernoulli filter with unknown sensor noise statistics

    Directory of Open Access Journals (Sweden)

    Qiu Hao

    2016-10-01

    Full Text Available It is difficult to build accurate model for measurement noise covariance in complex backgrounds. For the scenarios of unknown sensor noise variances, an adaptive multi-target tracking algorithm based on labeled random finite set and variational Bayesian (VB approximation is proposed. The variational approximation technique is introduced to the labeled multi-Bernoulli (LMB filter to jointly estimate the states of targets and sensor noise variances. Simulation results show that the proposed method can give unbiased estimation of cardinality and has better performance than the VB probability hypothesis density (VB-PHD filter and the VB cardinality balanced multi-target multi-Bernoulli (VB-CBMeMBer filter in harsh situations. The simulations also confirm the robustness of the proposed method against the time-varying noise variances. The computational complexity of proposed method is higher than the VB-PHD and VB-CBMeMBer in extreme cases, while the mean execution times of the three methods are close when targets are well separated.

  4. A Bayesian statistical method for quantifying model form uncertainty and two model combination methods

    International Nuclear Information System (INIS)

    Park, Inseok; Grandhi, Ramana V.

    2014-01-01

    Apart from parametric uncertainty, model form uncertainty as well as prediction error may be involved in the analysis of engineering system. Model form uncertainty, inherently existing in selecting the best approximation from a model set cannot be ignored, especially when the predictions by competing models show significant differences. In this research, a methodology based on maximum likelihood estimation is presented to quantify model form uncertainty using the measured differences of experimental and model outcomes, and is compared with a fully Bayesian estimation to demonstrate its effectiveness. While a method called the adjustment factor approach is utilized to propagate model form uncertainty alone into the prediction of a system response, a method called model averaging is utilized to incorporate both model form uncertainty and prediction error into it. A numerical problem of concrete creep is used to demonstrate the processes for quantifying model form uncertainty and implementing the adjustment factor approach and model averaging. Finally, the presented methodology is applied to characterize the engineering benefits of a laser peening process

  5. Bayesian statistics applied to the location of the source of explosions at Stromboli Volcano, Italy

    Science.gov (United States)

    Saccorotti, G.; Chouet, B.; Martini, M.; Scarpa, R.

    1998-01-01

    We present a method for determining the location and spatial extent of the source of explosions at Stromboli Volcano, Italy, based on a Bayesian inversion of the slowness vector derived from frequency-slowness analyses of array data. The method searches for source locations that minimize the error between the expected and observed slowness vectors. For a given set of model parameters, the conditional probability density function of slowness vectors is approximated by a Gaussian distribution of expected errors. The method is tested with synthetics using a five-layer velocity model derived for the north flank of Stromboli and a smoothed velocity model derived from a power-law approximation of the layered structure. Application to data from Stromboli allows for a detailed examination of uncertainties in source location due to experimental errors and incomplete knowledge of the Earth model. Although the solutions are not constrained in the radial direction, excellent resolution is achieved in both transverse and depth directions. Under the assumption that the horizontal extent of the source does not exceed the crater dimension, the 90% confidence region in the estimate of the explosive source location corresponds to a small volume extending from a depth of about 100 m to a maximum depth of about 300 m beneath the active vents, with a maximum likelihood source region located in the 120- to 180-m-depth interval.

  6. Use of SAMC for Bayesian analysis of statistical models with intractable normalizing constants

    KAUST Repository

    Jin, Ick Hoon; Liang, Faming

    2014-01-01

    Statistical inference for the models with intractable normalizing constants has attracted much attention. During the past two decades, various approximation- or simulation-based methods have been proposed for the problem, such as the Monte Carlo

  7. Nonparametric identification of copula structures

    KAUST Repository

    Li, Bo

    2013-06-01

    We propose a unified framework for testing a variety of assumptions commonly made about the structure of copulas, including symmetry, radial symmetry, joint symmetry, associativity and Archimedeanity, and max-stability. Our test is nonparametric and based on the asymptotic distribution of the empirical copula process.We perform simulation experiments to evaluate our test and conclude that our method is reliable and powerful for assessing common assumptions on the structure of copulas, particularly when the sample size is moderately large. We illustrate our testing approach on two datasets. © 2013 American Statistical Association.

  8. Bayesian Modelling of Functional Whole Brain Connectivity

    DEFF Research Database (Denmark)

    Røge, Rasmus

    the prevalent strategy of standardizing of fMRI time series and model data using directional statistics or we model the variability in the signal across the brain and across multiple subjects. In either case, we use Bayesian nonparametric modeling to automatically learn from the fMRI data the number......This thesis deals with parcellation of whole-brain functional magnetic resonance imaging (fMRI) using Bayesian inference with mixture models tailored to the fMRI data. In the three included papers and manuscripts, we analyze two different approaches to modeling fMRI signal; either we accept...... of funcional units, i.e. parcels. We benchmark the proposed mixture models against state of the art methods of brain parcellation, both probabilistic and non-probabilistic. The time series of each voxel are most often standardized using z-scoring which projects the time series data onto a hypersphere...

  9. Nonparametric Transfer Function Models

    Science.gov (United States)

    Liu, Jun M.; Chen, Rong; Yao, Qiwei

    2009-01-01

    In this paper a class of nonparametric transfer function models is proposed to model nonlinear relationships between ‘input’ and ‘output’ time series. The transfer function is smooth with unknown functional forms, and the noise is assumed to be a stationary autoregressive-moving average (ARMA) process. The nonparametric transfer function is estimated jointly with the ARMA parameters. By modeling the correlation in the noise, the transfer function can be estimated more efficiently. The parsimonious ARMA structure improves the estimation efficiency in finite samples. The asymptotic properties of the estimators are investigated. The finite-sample properties are illustrated through simulations and one empirical example. PMID:20628584

  10. Bayesian analysis applied to statistical uncertainties of extreme response distributions of offshore wind turbines

    NARCIS (Netherlands)

    Cheng, P.W.; Kuik, van G.A.M.; Bussel, van G.J.W.; Vrouwenvelder, A.C.W.M.

    2002-01-01

    Extreme response is an important design variable for wind turbines. The statistical uncertainties concerning the extreme response distribution are simulated here with data concerning physical characteristics obtained from measurements. The extreme responses are the flap moment at the blade root and

  11. Data Mining Foundations and Intelligent Paradigms Volume 2 Statistical, Bayesian, Time Series and other Theoretical Aspects

    CERN Document Server

    Jain, Lakhmi

    2012-01-01

    Data mining is one of the most rapidly growing research areas in computer science and statistics. In Volume 2 of this three volume series, we have brought together contributions from some of the most prestigious researchers in theoretical data mining. Each of the chapters is self contained. Statisticians and applied scientists/ engineers will find this volume valuable. Additionally, it provides a sourcebook for graduate students interested in the current direction of research in data mining.

  12. Bayesian benefits with JASP

    NARCIS (Netherlands)

    Marsman, M.; Wagenmakers, E.-J.

    2017-01-01

    We illustrate the Bayesian approach to data analysis using the newly developed statistical software program JASP. With JASP, researchers are able to take advantage of the benefits that the Bayesian framework has to offer in terms of parameter estimation and hypothesis testing. The Bayesian

  13. The Bayesian statistical decision theory applied to the optimization of generating set maintenance

    International Nuclear Information System (INIS)

    Procaccia, H.; Cordier, R.; Muller, S.

    1994-11-01

    The difficulty in RCM methodology is the allocation of a new periodicity of preventive maintenance on one equipment when a critical failure has been identified: until now this new allocation has been based on the engineer's judgment, and one must wait for a full cycle of feedback experience before to validate it. Statistical decision theory could be a more rational alternative for the optimization of preventive maintenance periodicity. This methodology has been applied to inspection and maintenance optimization of cylinders of diesel generator engines of 900 MW nuclear plants, and has shown that previous preventive maintenance periodicity can be extended. (authors). 8 refs., 5 figs

  14. Application of a Bayesian algorithm for the Statistical Energy model updating of a railway coach

    DEFF Research Database (Denmark)

    Sadri, Mehran; Brunskog, Jonas; Younesian, Davood

    2016-01-01

    into account based on published data on comparison between experimental and theoretical results, so that the variance of the theory is estimated. The Monte Carlo Metropolis Hastings algorithm is employed to estimate the modified values of the parameters. It is shown that the algorithm can be efficiently used......The classical statistical energy analysis (SEA) theory is a common approach for vibroacoustic analysis of coupled complex structures, being efficient to predict high-frequency noise and vibration of engineering systems. There are however some limitations in applying the conventional SEA...... the performance of the proposed strategy, the SEA model updating of a railway passenger coach is carried out. First, a sensitivity analysis is carried out to select the most sensitive parameters of the SEA model. For the selected parameters of the model, prior probability density functions are then taken...

  15. A Structural Labor Supply Model with Nonparametric Preferences

    NARCIS (Netherlands)

    van Soest, A.H.O.; Das, J.W.M.; Gong, X.

    2000-01-01

    Nonparametric techniques are usually seen as a statistic device for data description and exploration, and not as a tool for estimating models with a richer economic structure, which are often required for policy analysis.This paper presents an example where nonparametric flexibility can be attained

  16. CADDIS Volume 4. Data Analysis: PECBO Appendix - R Scripts for Non-Parametric Regressions

    Science.gov (United States)

    Script for computing nonparametric regression analysis. Overview of using scripts to infer environmental conditions from biological observations, statistically estimating species-environment relationships, statistical scripts.

  17. Search for transient ultralight dark matter signatures with networks of precision measurement devices using a Bayesian statistics method

    Science.gov (United States)

    Roberts, B. M.; Blewitt, G.; Dailey, C.; Derevianko, A.

    2018-04-01

    We analyze the prospects of employing a distributed global network of precision measurement devices as a dark matter and exotic physics observatory. In particular, we consider the atomic clocks of the global positioning system (GPS), consisting of a constellation of 32 medium-Earth orbit satellites equipped with either Cs or Rb microwave clocks and a number of Earth-based receiver stations, some of which employ highly-stable H-maser atomic clocks. High-accuracy timing data is available for almost two decades. By analyzing the satellite and terrestrial atomic clock data, it is possible to search for transient signatures of exotic physics, such as "clumpy" dark matter and dark energy, effectively transforming the GPS constellation into a 50 000 km aperture sensor array. Here we characterize the noise of the GPS satellite atomic clocks, describe the search method based on Bayesian statistics, and test the method using simulated clock data. We present the projected discovery reach using our method, and demonstrate that it can surpass the existing constrains by several order of magnitude for certain models. Our method is not limited in scope to GPS or atomic clock networks, and can also be applied to other networks of precision measurement devices.

  18. Exploring neighborhood inequality in female breast cancer incidence in Tehran using Bayesian spatial models and a spatial scan statistic

    Directory of Open Access Journals (Sweden)

    Erfan Ayubi

    2017-05-01

    Full Text Available OBJECTIVES The aim of this study was to explore the spatial pattern of female breast cancer (BC incidence at the neighborhood level in Tehran, Iran. METHODS The present study included all registered incident cases of female BC from March 2008 to March 2011. The raw standardized incidence ratio (SIR of BC for each neighborhood was estimated by comparing observed cases relative to expected cases. The estimated raw SIRs were smoothed by a Besag, York, and Mollie spatial model and the spatial empirical Bayesian method. The purely spatial scan statistic was used to identify spatial clusters. RESULTS There were 4,175 incident BC cases in the study area from 2008 to 2011, of which 3,080 were successfully geocoded to the neighborhood level. Higher than expected rates of BC were found in neighborhoods located in northern and central Tehran, whereas lower rates appeared in southern areas. The most likely cluster of higher than expected BC incidence involved neighborhoods in districts 3 and 6, with an observed-to-expected ratio of 3.92 (p<0.001, whereas the most likely cluster of lower than expected rates involved neighborhoods in districts 17, 18, and 19, with an observed-to-expected ratio of 0.05 (p<0.001. CONCLUSIONS Neighborhood-level inequality in the incidence of BC exists in Tehran. These findings can serve as a basis for resource allocation and preventive strategies in at-risk areas.

  19. Quantal Response: Nonparametric Modeling

    Science.gov (United States)

    2017-01-01

    capture the behavior of observed phenomena. Higher-order polynomial and finite-dimensional spline basis models allow for more complicated responses as the...flexibility as these are nonparametric (not constrained to any particular functional form). These should be useful in identifying nonstandard behavior via... deviance ∆ = −2 log(Lreduced/Lfull) is defined in terms of the likelihood function L. For normal error, Lfull = 1, and based on Eq. A-2, we have log

  20. Paleotempestological chronology developed from gas ion source AMS analysis of carbonates determined through real-time Bayesian statistical approach

    Science.gov (United States)

    Wallace, D. J.; Rosenheim, B. E.; Roberts, M. L.; Burton, J. R.; Donnelly, J. P.; Woodruff, J. D.

    2014-12-01

    Is a small quantity of high-precision ages more robust than a higher quantity of lower-precision ages for sediment core chronologies? AMS Radiocarbon ages have been available to researchers for several decades now, and precision of the technique has continued to improve. Analysis and time cost is high, though, and projects are often limited in terms of the number of dates that can be used to develop a chronology. The Gas Ion Source at the National Ocean Sciences Accelerator Mass Spectrometry Facility (NOSAMS), while providing lower-precision (uncertainty of order 100 14C y for a sample), is significantly less expensive and far less time consuming than conventional age dating and offers the unique opportunity for large amounts of ages. Here we couple two approaches, one analytical and one statistical, to investigate the utility of an age model comprised of these lower-precision ages for paleotempestology. We use a gas ion source interfaced to a gas-bench type device to generate radiocarbon dates approximately every 5 minutes while determining the order of sample analysis using the published Bayesian accumulation histories for deposits (Bacon). During two day-long sessions, several dates were obtained from carbonate shells in living position in a sediment core comprised of sapropel gel from Mangrove Lake, Bermuda. Samples were prepared where large shells were available, and the order of analysis was determined by the depth with the highest uncertainty according to Bacon. We present the results of these analyses as well as a prognosis for a future where such age models can be constructed from many dates that are quickly obtained relative to conventional radiocarbon dates. This technique currently is limited to carbonates, but development of a system for organic material dating is underway. We will demonstrate the extent to which sacrificing some analytical precision in favor of more dates improves age models.

  1. A Bayesian Markov geostatistical model for estimation of hydrogeological properties

    International Nuclear Information System (INIS)

    Rosen, L.; Gustafson, G.

    1996-01-01

    A geostatistical methodology based on Markov-chain analysis and Bayesian statistics was developed for probability estimations of hydrogeological and geological properties in the siting process of a nuclear waste repository. The probability estimates have practical use in decision-making on issues such as siting, investigation programs, and construction design. The methodology is nonparametric which makes it possible to handle information that does not exhibit standard statistical distributions, as is often the case for classified information. Data do not need to meet the requirements on additivity and normality as with the geostatistical methods based on regionalized variable theory, e.g., kriging. The methodology also has a formal way for incorporating professional judgments through the use of Bayesian statistics, which allows for updating of prior estimates to posterior probabilities each time new information becomes available. A Bayesian Markov Geostatistical Model (BayMar) software was developed for implementation of the methodology in two and three dimensions. This paper gives (1) a theoretical description of the Bayesian Markov Geostatistical Model; (2) a short description of the BayMar software; and (3) an example of application of the model for estimating the suitability for repository establishment with respect to the three parameters of lithology, hydraulic conductivity, and rock quality designation index (RQD) at 400--500 meters below ground surface in an area around the Aespoe Hard Rock Laboratory in southeastern Sweden

  2. Predicting Market Impact Costs Using Nonparametric Machine Learning Models.

    Directory of Open Access Journals (Sweden)

    Saerom Park

    Full Text Available Market impact cost is the most significant portion of implicit transaction costs that can reduce the overall transaction cost, although it cannot be measured directly. In this paper, we employed the state-of-the-art nonparametric machine learning models: neural networks, Bayesian neural network, Gaussian process, and support vector regression, to predict market impact cost accurately and to provide the predictive model that is versatile in the number of variables. We collected a large amount of real single transaction data of US stock market from Bloomberg Terminal and generated three independent input variables. As a result, most nonparametric machine learning models outperformed a-state-of-the-art benchmark parametric model such as I-star model in four error measures. Although these models encounter certain difficulties in separating the permanent and temporary cost directly, nonparametric machine learning models can be good alternatives in reducing transaction costs by considerably improving in prediction performance.

  3. Predicting Market Impact Costs Using Nonparametric Machine Learning Models.

    Science.gov (United States)

    Park, Saerom; Lee, Jaewook; Son, Youngdoo

    2016-01-01

    Market impact cost is the most significant portion of implicit transaction costs that can reduce the overall transaction cost, although it cannot be measured directly. In this paper, we employed the state-of-the-art nonparametric machine learning models: neural networks, Bayesian neural network, Gaussian process, and support vector regression, to predict market impact cost accurately and to provide the predictive model that is versatile in the number of variables. We collected a large amount of real single transaction data of US stock market from Bloomberg Terminal and generated three independent input variables. As a result, most nonparametric machine learning models outperformed a-state-of-the-art benchmark parametric model such as I-star model in four error measures. Although these models encounter certain difficulties in separating the permanent and temporary cost directly, nonparametric machine learning models can be good alternatives in reducing transaction costs by considerably improving in prediction performance.

  4. Nonparametric combinatorial sequence models.

    Science.gov (United States)

    Wauthier, Fabian L; Jordan, Michael I; Jojic, Nebojsa

    2011-11-01

    This work considers biological sequences that exhibit combinatorial structures in their composition: groups of positions of the aligned sequences are "linked" and covary as one unit across sequences. If multiple such groups exist, complex interactions can emerge between them. Sequences of this kind arise frequently in biology but methodologies for analyzing them are still being developed. This article presents a nonparametric prior on sequences which allows combinatorial structures to emerge and which induces a posterior distribution over factorized sequence representations. We carry out experiments on three biological sequence families which indicate that combinatorial structures are indeed present and that combinatorial sequence models can more succinctly describe them than simpler mixture models. We conclude with an application to MHC binding prediction which highlights the utility of the posterior distribution over sequence representations induced by the prior. By integrating out the posterior, our method compares favorably to leading binding predictors.

  5. Bayesian non parametric modelling of Higgs pair production

    Directory of Open Access Journals (Sweden)

    Scarpa Bruno

    2017-01-01

    Full Text Available Statistical classification models are commonly used to separate a signal from a background. In this talk we face the problem of isolating the signal of Higgs pair production using the decay channel in which each boson decays into a pair of b-quarks. Typically in this context non parametric methods are used, such as Random Forests or different types of boosting tools. We remain in the same non-parametric framework, but we propose to face the problem following a Bayesian approach. A Dirichlet process is used as prior for the random effects in a logit model which is fitted by leveraging the Polya-Gamma data augmentation. Refinements of the model include the insertion in the simple model of P-splines to relate explanatory variables with the response and the use of Bayesian trees (BART to describe the atoms in the Dirichlet process.

  6. Speaker Linking and Applications using Non-Parametric Hashing Methods

    Science.gov (United States)

    2016-09-08

    nonparametric estimate of a multivariate density function,” The Annals of Math- ematical Statistics , vol. 36, no. 3, pp. 1049–1051, 1965. [9] E. A. Patrick...Speaker Linking and Applications using Non-Parametric Hashing Methods† Douglas Sturim and William M. Campbell MIT Lincoln Laboratory, Lexington, MA...with many approaches [1, 2]. For this paper, we focus on using i-vectors [2], but the methods apply to any embedding. For the task of speaker QBE and

  7. Nonparametric identification of copula structures

    KAUST Repository

    Li, Bo; Genton, Marc G.

    2013-01-01

    We propose a unified framework for testing a variety of assumptions commonly made about the structure of copulas, including symmetry, radial symmetry, joint symmetry, associativity and Archimedeanity, and max-stability. Our test is nonparametric

  8. Estimation of the limit of detection with a bootstrap-derived standard error by a partly non-parametric approach. Application to HPLC drug assays

    DEFF Research Database (Denmark)

    Linnet, Kristian

    2005-01-01

    Bootstrap, HPLC, limit of blank, limit of detection, non-parametric statistics, type I and II errors......Bootstrap, HPLC, limit of blank, limit of detection, non-parametric statistics, type I and II errors...

  9. Impulse response identification with deterministic inputs using non-parametric methods

    International Nuclear Information System (INIS)

    Bhargava, U.K.; Kashyap, R.L.; Goodman, D.M.

    1985-01-01

    This paper addresses the problem of impulse response identification using non-parametric methods. Although the techniques developed herein apply to the truncated, untruncated, and the circulant models, we focus on the truncated model which is useful in certain applications. Two methods of impulse response identification will be presented. The first is based on the minimization of the C/sub L/ Statistic, which is an estimate of the mean-square prediction error; the second is a Bayesian approach. For both of these methods, we consider the effects of using both the identity matrix and the Laplacian matrix as weights on the energy in the impulse response. In addition, we present a method for estimating the effective length of the impulse response. Estimating the length is particularly important in the truncated case. Finally, we develop a method for estimating the noise variance at the output. Often, prior information on the noise variance is not available, and a good estimate is crucial to the success of estimating the impulse response with a nonparametric technique

  10. Can a significance test be genuinely Bayesian?

    OpenAIRE

    Pereira, Carlos A. de B.; Stern, Julio Michael; Wechsler, Sergio

    2008-01-01

    The Full Bayesian Significance Test, FBST, is extensively reviewed. Its test statistic, a genuine Bayesian measure of evidence, is discussed in detail. Its behavior in some problems of statistical inference like testing for independence in contingency tables is discussed.

  11. Current trends in Bayesian methodology with applications

    CERN Document Server

    Upadhyay, Satyanshu K; Dey, Dipak K; Loganathan, Appaia

    2015-01-01

    Collecting Bayesian material scattered throughout the literature, Current Trends in Bayesian Methodology with Applications examines the latest methodological and applied aspects of Bayesian statistics. The book covers biostatistics, econometrics, reliability and risk analysis, spatial statistics, image analysis, shape analysis, Bayesian computation, clustering, uncertainty assessment, high-energy astrophysics, neural networking, fuzzy information, objective Bayesian methodologies, empirical Bayes methods, small area estimation, and many more topics.Each chapter is self-contained and focuses on

  12. Statistical trend analysis methods for temporal phenomena

    International Nuclear Information System (INIS)

    Lehtinen, E.; Pulkkinen, U.; Poern, K.

    1997-04-01

    We consider point events occurring in a random way in time. In many applications the pattern of occurrence is of intrinsic interest as indicating a trend or some other systematic feature in the rate of occurrence. The purpose of this report is to survey briefly different statistical trend analysis methods and illustrate their applicability to temporal phenomena in particular. The trend testing of point events is usually seen as the testing of the hypotheses concerning the intensity of the occurrence of events. When the intensity function is parametrized, the testing of trend is a typical parametric testing problem. In industrial applications the operational experience generally does not suggest any specified model and method in advance. Therefore, and particularly, if the Poisson process assumption is very questionable, it is desirable to apply tests that are valid for a wide variety of possible processes. The alternative approach for trend testing is to use some non-parametric procedure. In this report we have presented four non-parametric tests: The Cox-Stuart test, the Wilcoxon signed ranks test, the Mann test, and the exponential ordered scores test. In addition to the classical parametric and non-parametric approaches we have also considered the Bayesian trend analysis. First we discuss a Bayesian model, which is based on a power law intensity model. The Bayesian statistical inferences are based on the analysis of the posterior distribution of the trend parameters, and the probability of trend is immediately seen from these distributions. We applied some of the methods discussed in an example case. It should be noted, that this report is a feasibility study rather than a scientific evaluation of statistical methods, and the examples can only be seen as demonstrations of the methods

  13. Statistical trend analysis methods for temporal phenomena

    Energy Technology Data Exchange (ETDEWEB)

    Lehtinen, E.; Pulkkinen, U. [VTT Automation, (Finland); Poern, K. [Poern Consulting, Nykoeping (Sweden)

    1997-04-01

    We consider point events occurring in a random way in time. In many applications the pattern of occurrence is of intrinsic interest as indicating a trend or some other systematic feature in the rate of occurrence. The purpose of this report is to survey briefly different statistical trend analysis methods and illustrate their applicability to temporal phenomena in particular. The trend testing of point events is usually seen as the testing of the hypotheses concerning the intensity of the occurrence of events. When the intensity function is parametrized, the testing of trend is a typical parametric testing problem. In industrial applications the operational experience generally does not suggest any specified model and method in advance. Therefore, and particularly, if the Poisson process assumption is very questionable, it is desirable to apply tests that are valid for a wide variety of possible processes. The alternative approach for trend testing is to use some non-parametric procedure. In this report we have presented four non-parametric tests: The Cox-Stuart test, the Wilcoxon signed ranks test, the Mann test, and the exponential ordered scores test. In addition to the classical parametric and non-parametric approaches we have also considered the Bayesian trend analysis. First we discuss a Bayesian model, which is based on a power law intensity model. The Bayesian statistical inferences are based on the analysis of the posterior distribution of the trend parameters, and the probability of trend is immediately seen from these distributions. We applied some of the methods discussed in an example case. It should be noted, that this report is a feasibility study rather than a scientific evaluation of statistical methods, and the examples can only be seen as demonstrations of the methods. 14 refs, 10 figs.

  14. Parametric vs. Nonparametric Regression Modelling within Clinical Decision Support

    Czech Academy of Sciences Publication Activity Database

    Kalina, Jan; Zvárová, Jana

    2017-01-01

    Roč. 5, č. 1 (2017), s. 21-27 ISSN 1805-8698 R&D Projects: GA ČR GA17-01251S Institutional support: RVO:67985807 Keywords : decision support systems * decision rules * statistical analysis * nonparametric regression Subject RIV: IN - Informatics, Computer Science OBOR OECD: Statistics and probability

  15. Bayesian community detection

    DEFF Research Database (Denmark)

    Mørup, Morten; Schmidt, Mikkel N

    2012-01-01

    Many networks of scientific interest naturally decompose into clusters or communities with comparatively fewer external than internal links; however, current Bayesian models of network communities do not exert this intuitive notion of communities. We formulate a nonparametric Bayesian model...... for community detection consistent with an intuitive definition of communities and present a Markov chain Monte Carlo procedure for inferring the community structure. A Matlab toolbox with the proposed inference procedure is available for download. On synthetic and real networks, our model detects communities...... consistent with ground truth, and on real networks, it outperforms existing approaches in predicting missing links. This suggests that community structure is an important structural property of networks that should be explicitly modeled....

  16. IZI: INFERRING THE GAS PHASE METALLICITY (Z) AND IONIZATION PARAMETER (q) OF IONIZED NEBULAE USING BAYESIAN STATISTICS

    International Nuclear Information System (INIS)

    Blanc, Guillermo A.; Kewley, Lisa; Vogt, Frédéric P. A.; Dopita, Michael A.

    2015-01-01

    We present a new method for inferring the metallicity (Z) and ionization parameter (q) of H II regions and star-forming galaxies using strong nebular emission lines (SELs). We use Bayesian inference to derive the joint and marginalized posterior probability density functions for Z and q given a set of observed line fluxes and an input photoionization model. Our approach allows the use of arbitrary sets of SELs and the inclusion of flux upper limits. The method provides a self-consistent way of determining the physical conditions of ionized nebulae that is not tied to the arbitrary choice of a particular SEL diagnostic and uses all the available information. Unlike theoretically calibrated SEL diagnostics, the method is flexible and not tied to a particular photoionization model. We describe our algorithm, validate it against other methods, and present a tool that implements it called IZI. Using a sample of nearby extragalactic H II regions, we assess the performance of commonly used SEL abundance diagnostics. We also use a sample of 22 local H II regions having both direct and recombination line (RL) oxygen abundance measurements in the literature to study discrepancies in the abundance scale between different methods. We find that oxygen abundances derived through Bayesian inference using currently available photoionization models in the literature can be in good (∼30%) agreement with RL abundances, although some models perform significantly better than others. We also confirm that abundances measured using the direct method are typically ∼0.2 dex lower than both RL and photoionization-model-based abundances

  17. IZI: INFERRING THE GAS PHASE METALLICITY (Z) AND IONIZATION PARAMETER (q) OF IONIZED NEBULAE USING BAYESIAN STATISTICS

    Energy Technology Data Exchange (ETDEWEB)

    Blanc, Guillermo A. [Observatories of the Carnegie Institution for Science, 813 Santa Barbara Street, Pasadena, CA 91101 (United States); Kewley, Lisa; Vogt, Frédéric P. A.; Dopita, Michael A. [Research School of Astronomy and Astrophysics, Australian National University, Cotter Road, Weston, ACT 2611 (Australia)

    2015-01-10

    We present a new method for inferring the metallicity (Z) and ionization parameter (q) of H II regions and star-forming galaxies using strong nebular emission lines (SELs). We use Bayesian inference to derive the joint and marginalized posterior probability density functions for Z and q given a set of observed line fluxes and an input photoionization model. Our approach allows the use of arbitrary sets of SELs and the inclusion of flux upper limits. The method provides a self-consistent way of determining the physical conditions of ionized nebulae that is not tied to the arbitrary choice of a particular SEL diagnostic and uses all the available information. Unlike theoretically calibrated SEL diagnostics, the method is flexible and not tied to a particular photoionization model. We describe our algorithm, validate it against other methods, and present a tool that implements it called IZI. Using a sample of nearby extragalactic H II regions, we assess the performance of commonly used SEL abundance diagnostics. We also use a sample of 22 local H II regions having both direct and recombination line (RL) oxygen abundance measurements in the literature to study discrepancies in the abundance scale between different methods. We find that oxygen abundances derived through Bayesian inference using currently available photoionization models in the literature can be in good (∼30%) agreement with RL abundances, although some models perform significantly better than others. We also confirm that abundances measured using the direct method are typically ∼0.2 dex lower than both RL and photoionization-model-based abundances.

  18. Using polarimetric radar observations and probabilistic inference to develop the Bayesian Observationally-constrained Statistical-physical Scheme (BOSS), a novel microphysical parameterization framework

    Science.gov (United States)

    van Lier-Walqui, M.; Morrison, H.; Kumjian, M. R.; Prat, O. P.

    2016-12-01

    Microphysical parameterization schemes have reached an impressive level of sophistication: numerous prognostic hydrometeor categories, and either size-resolved (bin) particle size distributions, or multiple prognostic moments of the size distribution. Yet, uncertainty in model representation of microphysical processes and the effects of microphysics on numerical simulation of weather has not shown a improvement commensurate with the advanced sophistication of these schemes. We posit that this may be caused by unconstrained assumptions of these schemes, such as ad-hoc parameter value choices and structural uncertainties (e.g. choice of a particular form for the size distribution). We present work on development and observational constraint of a novel microphysical parameterization approach, the Bayesian Observationally-constrained Statistical-physical Scheme (BOSS), which seeks to address these sources of uncertainty. Our framework avoids unnecessary a priori assumptions, and instead relies on observations to provide probabilistic constraint of the scheme structure and sensitivities to environmental and microphysical conditions. We harness the rich microphysical information content of polarimetric radar observations to develop and constrain BOSS within a Bayesian inference framework using a Markov Chain Monte Carlo sampler (see Kumjian et al., this meeting for details on development of an associated polarimetric forward operator). Our work shows how knowledge of microphysical processes is provided by polarimetric radar observations of diverse weather conditions, and which processes remain highly uncertain, even after considering observations.

  19. Bayesian analysis of systems with random chemical composition: renormalization-group approach to Dirichlet distributions and the statistical theory of dilution.

    Science.gov (United States)

    Vlad, Marcel Ovidiu; Tsuchiya, Masa; Oefner, Peter; Ross, John

    2002-01-01

    We investigate the statistical properties of systems with random chemical composition and try to obtain a theoretical derivation of the self-similar Dirichlet distribution, which is used empirically in molecular biology, environmental chemistry, and geochemistry. We consider a system made up of many chemical species and assume that the statistical distribution of the abundance of each chemical species in the system is the result of a succession of a variable number of random dilution events, which can be described by using the renormalization-group theory. A Bayesian approach is used for evaluating the probability density of the chemical composition of the system in terms of the probability densities of the abundances of the different chemical species. We show that for large cascades of dilution events, the probability density of the composition vector of the system is given by a self-similar probability density of the Dirichlet type. We also give an alternative formal derivation for the Dirichlet law based on the maximum entropy approach, by assuming that the average values of the chemical potentials of different species, expressed in terms of molar fractions, are constant. Although the maximum entropy approach leads formally to the Dirichlet distribution, it does not clarify the physical origin of the Dirichlet statistics and has serious limitations. The random theory of dilution provides a physical picture for the emergence of Dirichlet statistics and makes it possible to investigate its validity range. We discuss the implications of our theory in molecular biology, geochemistry, and environmental science.

  20. International Conference on Robust Rank-Based and Nonparametric Methods

    CERN Document Server

    McKean, Joseph

    2016-01-01

    The contributors to this volume include many of the distinguished researchers in this area. Many of these scholars have collaborated with Joseph McKean to develop underlying theory for these methods, obtain small sample corrections, and develop efficient algorithms for their computation. The papers cover the scope of the area, including robust nonparametric rank-based procedures through Bayesian and big data rank-based analyses. Areas of application include biostatistics and spatial areas. Over the last 30 years, robust rank-based and nonparametric methods have developed considerably. These procedures generalize traditional Wilcoxon-type methods for one- and two-sample location problems. Research into these procedures has culminated in complete analyses for many of the models used in practice including linear, generalized linear, mixed, and nonlinear models. Settings are both multivariate and univariate. With the development of R packages in these areas, computation of these procedures is easily shared with r...

  1. Investigating the collision energy dependence of η /s in the beam energy scan at the BNL Relativistic Heavy Ion Collider using Bayesian statistics

    Science.gov (United States)

    Auvinen, Jussi; Bernhard, Jonah E.; Bass, Steffen A.; Karpenko, Iurii

    2018-04-01

    We determine the probability distributions of the shear viscosity over the entropy density ratio η /s in the quark-gluon plasma formed in Au + Au collisions at √{sN N}=19.6 ,39 , and 62.4 GeV , using Bayesian inference and Gaussian process emulators for a model-to-data statistical analysis that probes the full input parameter space of a transport + viscous hydrodynamics hybrid model. We find the most likely value of η /s to be larger at smaller √{sN N}, although the uncertainties still allow for a constant value between 0.10 and 0.15 for the investigated collision energy range.

  2. Bayesian Statistical Mechanics: Entropy-Enthalpy Compensation and Universal Equation of State at the Tip of Pen

    Directory of Open Access Journals (Sweden)

    Evgeni B. Starikov

    2018-02-01

    Full Text Available This work has shown the way to put the formal statistical-mechanical basement under the hotly debated notion of enthalpy-entropy compensation. The possibility of writing down the universal equation of state based upon the statistical mechanics is discussed here.

  3. Supremum Norm Posterior Contraction and Credible Sets for Nonparametric Multivariate Regression

    NARCIS (Netherlands)

    Yoo, W.W.; Ghosal, S

    2016-01-01

    In the setting of nonparametric multivariate regression with unknown error variance, we study asymptotic properties of a Bayesian method for estimating a regression function f and its mixed partial derivatives. We use a random series of tensor product of B-splines with normal basis coefficients as a

  4. A non-parametric hierarchical model to discover behavior dynamics from tracks

    NARCIS (Netherlands)

    Kooij, J.F.P.; Englebienne, G.; Gavrila, D.M.

    2012-01-01

    We present a novel non-parametric Bayesian model to jointly discover the dynamics of low-level actions and high-level behaviors of tracked people in open environments. Our model represents behaviors as Markov chains of actions which capture high-level temporal dynamics. Actions may be shared by

  5. A Bayesian Nonparametric Causal Model for Regression Discontinuity Designs

    Science.gov (United States)

    Karabatsos, George; Walker, Stephen G.

    2013-01-01

    The regression discontinuity (RD) design (Thistlewaite & Campbell, 1960; Cook, 2008) provides a framework to identify and estimate causal effects from a non-randomized design. Each subject of a RD design is assigned to the treatment (versus assignment to a non-treatment) whenever her/his observed value of the assignment variable equals or…

  6. Non-parametric Bayesian inference for inhomogeneous Markov point processes

    DEFF Research Database (Denmark)

    Berthelsen, Kasper Klitgaard; Møller, Jesper; Johansen, Per Michael

    is a shot noise process, and the interaction function for a pair of points depends only on the distance between the two points and is a piecewise linear function modelled by a marked Poisson process. Simulation of the resulting posterior using a Metropolis-Hastings algorithm in the "conventional" way...

  7. Nonparametric Bayesian Context Learning for Buried Threat Detection

    Science.gov (United States)

    2012-01-01

    example, in recommending mu- sic to listeners, leading algorithms identify the genre of a recording solely based on frequency-domain features of the...subject matter) as well as in complementary media such as books or films that the customer also enjoys. The consumption of media by a user’s friends may

  8. Online Nonparametric Bayesian Activity Mining and Analysis From Surveillance Video.

    Science.gov (United States)

    Bastani, Vahid; Marcenaro, Lucio; Regazzoni, Carlo S

    2016-05-01

    A method for online incremental mining of activity patterns from the surveillance video stream is presented in this paper. The framework consists of a learning block in which Dirichlet process mixture model is employed for the incremental clustering of trajectories. Stochastic trajectory pattern models are formed using the Gaussian process regression of the corresponding flow functions. Moreover, a sequential Monte Carlo method based on Rao-Blackwellized particle filter is proposed for tracking and online classification as well as the detection of abnormality during the observation of an object. Experimental results on real surveillance video data are provided to show the performance of the proposed algorithm in different tasks of trajectory clustering, classification, and abnormality detection.

  9. MAP estimators and their consistency in Bayesian nonparametric inverse problems

    KAUST Repository

    Dashti, M.; Law, K. J H; Stuart, A. M.; Voss, J.

    2013-01-01

    with examples from an inverse problem for the Navier-Stokes equation, motivated by problems arising in weather forecasting, and from the theory of conditioned diffusions, motivated by problems arising in molecular dynamics. © 2013 IOP Publishing Ltd.

  10. Bayesian dynamic mediation analysis.

    Science.gov (United States)

    Huang, Jing; Yuan, Ying

    2017-12-01

    Most existing methods for mediation analysis assume that mediation is a stationary, time-invariant process, which overlooks the inherently dynamic nature of many human psychological processes and behavioral activities. In this article, we consider mediation as a dynamic process that continuously changes over time. We propose Bayesian multilevel time-varying coefficient models to describe and estimate such dynamic mediation effects. By taking the nonparametric penalized spline approach, the proposed method is flexible and able to accommodate any shape of the relationship between time and mediation effects. Simulation studies show that the proposed method works well and faithfully reflects the true nature of the mediation process. By modeling mediation effect nonparametrically as a continuous function of time, our method provides a valuable tool to help researchers obtain a more complete understanding of the dynamic nature of the mediation process underlying psychological and behavioral phenomena. We also briefly discuss an alternative approach of using dynamic autoregressive mediation model to estimate the dynamic mediation effect. The computer code is provided to implement the proposed Bayesian dynamic mediation analysis. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  11. Bayesian Analysis for Penalized Spline Regression Using WinBUGS

    Directory of Open Access Journals (Sweden)

    Ciprian M. Crainiceanu

    2005-09-01

    Full Text Available Penalized splines can be viewed as BLUPs in a mixed model framework, which allows the use of mixed model software for smoothing. Thus, software originally developed for Bayesian analysis of mixed models can be used for penalized spline regression. Bayesian inference for nonparametric models enjoys the flexibility of nonparametric models and the exact inference provided by the Bayesian inferential machinery. This paper provides a simple, yet comprehensive, set of programs for the implementation of nonparametric Bayesian analysis in WinBUGS. Good mixing properties of the MCMC chains are obtained by using low-rank thin-plate splines, while simulation times per iteration are reduced employing WinBUGS specific computational tricks.

  12. Bayesian non- and semi-parametric methods and applications

    CERN Document Server

    Rossi, Peter

    2014-01-01

    This book reviews and develops Bayesian non-parametric and semi-parametric methods for applications in microeconometrics and quantitative marketing. Most econometric models used in microeconomics and marketing applications involve arbitrary distributional assumptions. As more data becomes available, a natural desire to provide methods that relax these assumptions arises. Peter Rossi advocates a Bayesian approach in which specific distributional assumptions are replaced with more flexible distributions based on mixtures of normals. The Bayesian approach can use either a large but fixed number

  13. Testing discontinuities in nonparametric regression

    KAUST Repository

    Dai, Wenlin

    2017-01-19

    In nonparametric regression, it is often needed to detect whether there are jump discontinuities in the mean function. In this paper, we revisit the difference-based method in [13 H.-G. Müller and U. Stadtmüller, Discontinuous versus smooth regression, Ann. Stat. 27 (1999), pp. 299–337. doi: 10.1214/aos/1018031100

  14. Testing discontinuities in nonparametric regression

    KAUST Repository

    Dai, Wenlin; Zhou, Yuejin; Tong, Tiejun

    2017-01-01

    In nonparametric regression, it is often needed to detect whether there are jump discontinuities in the mean function. In this paper, we revisit the difference-based method in [13 H.-G. Müller and U. Stadtmüller, Discontinuous versus smooth regression, Ann. Stat. 27 (1999), pp. 299–337. doi: 10.1214/aos/1018031100

  15. Bayesian computation with R

    CERN Document Server

    Albert, Jim

    2009-01-01

    There has been a dramatic growth in the development and application of Bayesian inferential methods. Some of this growth is due to the availability of powerful simulation-based algorithms to summarize posterior distributions. There has been also a growing interest in the use of the system R for statistical analyses. R's open source nature, free availability, and large number of contributor packages have made R the software of choice for many statisticians in education and industry. Bayesian Computation with R introduces Bayesian modeling by the use of computation using the R language. The earl

  16. A non-parametric meta-analysis approach for combining independent microarray datasets: application using two microarray datasets pertaining to chronic allograft nephropathy

    Directory of Open Access Journals (Sweden)

    Archer Kellie J

    2008-02-01

    Full Text Available Abstract Background With the popularity of DNA microarray technology, multiple groups of researchers have studied the gene expression of similar biological conditions. Different methods have been developed to integrate the results from various microarray studies, though most of them rely on distributional assumptions, such as the t-statistic based, mixed-effects model, or Bayesian model methods. However, often the sample size for each individual microarray experiment is small. Therefore, in this paper we present a non-parametric meta-analysis approach for combining data from independent microarray studies, and illustrate its application on two independent Affymetrix GeneChip studies that compared the gene expression of biopsies from kidney transplant recipients with chronic allograft nephropathy (CAN to those with normal functioning allograft. Results The simulation study comparing the non-parametric meta-analysis approach to a commonly used t-statistic based approach shows that the non-parametric approach has better sensitivity and specificity. For the application on the two CAN studies, we identified 309 distinct genes that expressed differently in CAN. By applying Fisher's exact test to identify enriched KEGG pathways among those genes called differentially expressed, we found 6 KEGG pathways to be over-represented among the identified genes. We used the expression measurements of the identified genes as predictors to predict the class labels for 6 additional biopsy samples, and the predicted results all conformed to their pathologist diagnosed class labels. Conclusion We present a new approach for combining data from multiple independent microarray studies. This approach is non-parametric and does not rely on any distributional assumptions. The rationale behind the approach is logically intuitive and can be easily understood by researchers not having advanced training in statistics. Some of the identified genes and pathways have been

  17. Bayesian Networks An Introduction

    CERN Document Server

    Koski, Timo

    2009-01-01

    Bayesian Networks: An Introduction provides a self-contained introduction to the theory and applications of Bayesian networks, a topic of interest and importance for statisticians, computer scientists and those involved in modelling complex data sets. The material has been extensively tested in classroom teaching and assumes a basic knowledge of probability, statistics and mathematics. All notions are carefully explained and feature exercises throughout. Features include:.: An introduction to Dirichlet Distribution, Exponential Families and their applications.; A detailed description of learni

  18. Nonparametric factor analysis of time series

    OpenAIRE

    Rodríguez-Poo, Juan M.; Linton, Oliver Bruce

    1998-01-01

    We introduce a nonparametric smoothing procedure for nonparametric factor analaysis of multivariate time series. The asymptotic properties of the proposed procedures are derived. We present an application based on the residuals from the Fair macromodel.

  19. Bayesian Penalized Likelihood Image Reconstruction (Q.Clear) in 82Rb Cardiac PET: Impact of Count Statistics

    DEFF Research Database (Denmark)

    Christensen, Nana Louise; Tolbod, Lars Poulsen

    PET scans. 3) Static and dynamic images from a set of 7 patients (BSA: 1.6-2.2 m2) referred for 82Rb cardiac PET was analyzed using a range of beta factors. Results were compared to the institution’s standard clinical practice reconstruction protocol. All scans were performed on GE DMI Digital......Aim: Q.Clear reconstruction is expected to improve detection of perfusion defects in cardiac PET due to the high degree of image convergence and effective noise suppression. However, 82Rb (T½=76s) possess a special problem, since count statistics vary significantly not only between patients...... statistics using a cardiac PET phantom as well as a selection of clinical patients referred for 82Rb cardiac PET. Methods: The study consistent of 3 parts: 1) A thorax-cardiac phantom was scanned for 10 minutes after injection of 1110 MBq 82Rb. Frames at 3 different times after infusion were reconstructed...

  20. Nonparametric regression using the concept of minimum energy

    International Nuclear Information System (INIS)

    Williams, Mike

    2011-01-01

    It has recently been shown that an unbinned distance-based statistic, the energy, can be used to construct an extremely powerful nonparametric multivariate two sample goodness-of-fit test. An extension to this method that makes it possible to perform nonparametric regression using multiple multivariate data sets is presented in this paper. The technique, which is based on the concept of minimizing the energy of the system, permits determination of parameters of interest without the need for parametric expressions of the parent distributions of the data sets. The application and performance of this new method is discussed in the context of some simple example analyses.

  1. Comparação de duas metodologias de amostragem atmosférica com ferramenta estatística não paramétrica Comparison of two atmospheric sampling methodologies with non-parametric statistical tools

    Directory of Open Access Journals (Sweden)

    Maria João Nunes

    2005-03-01

    Full Text Available In atmospheric aerosol sampling, it is inevitable that the air that carries particles is in motion, as a result of both externally driven wind and the sucking action of the sampler itself. High or low air flow sampling speeds may lead to significant particle size bias. The objective of this work is the validation of measurements enabling the comparison of species concentration from both air flow sampling techniques. The presence of several outliers and increase of residuals with concentration becomes obvious, requiring non-parametric methods, recommended for the handling of data which may not be normally distributed. This way, conversion factors are obtained for each of the various species under study using Kendall regression.

  2. Basics of Bayesian methods.

    Science.gov (United States)

    Ghosh, Sujit K

    2010-01-01

    Bayesian methods are rapidly becoming popular tools for making statistical inference in various fields of science including biology, engineering, finance, and genetics. One of the key aspects of Bayesian inferential method is its logical foundation that provides a coherent framework to utilize not only empirical but also scientific information available to a researcher. Prior knowledge arising from scientific background, expert judgment, or previously collected data is used to build a prior distribution which is then combined with current data via the likelihood function to characterize the current state of knowledge using the so-called posterior distribution. Bayesian methods allow the use of models of complex physical phenomena that were previously too difficult to estimate (e.g., using asymptotic approximations). Bayesian methods offer a means of more fully understanding issues that are central to many practical problems by allowing researchers to build integrated models based on hierarchical conditional distributions that can be estimated even with limited amounts of data. Furthermore, advances in numerical integration methods, particularly those based on Monte Carlo methods, have made it possible to compute the optimal Bayes estimators. However, there is a reasonably wide gap between the background of the empirically trained scientists and the full weight of Bayesian statistical inference. Hence, one of the goals of this chapter is to bridge the gap by offering elementary to advanced concepts that emphasize linkages between standard approaches and full probability modeling via Bayesian methods.

  3. Restoration of Bi-Contrast MRI Data for Intensity Uniformity with Bayesian Coring of Co-Occurrence Statistics

    Directory of Open Access Journals (Sweden)

    Stathis Hadjidemetriou

    2017-12-01

    Full Text Available The reconstruction of MRI data assumes a uniform radio-frequency field. However, in practice, the radio-frequency field is inhomogeneous and leads to anatomically inconsequential intensity non-uniformities across an image. An anatomic region can be imaged with multiple contrasts reconstructed independently and be suffering from different non-uniformities. These artifacts can complicate the further automated analysis of the images. A method is presented for the joint intensity uniformity restoration of two such images. The effect of the intensity distortion on the auto-co-occurrence statistics of each image as well as on the joint-co-occurrence statistics of the two images is modeled and used for their non-stationary restoration followed by their back-projection to the images. Several constraints that ensure a stable restoration are also imposed. Moreover, the method considers the inevitable differences between the signal regions of the two images. The method has been evaluated extensively with BrainWeb phantom brain data as well as with brain anatomic data from the Human Connectome Project (HCP and with data of Parkinson’s disease patients. The performance of the proposed method has been compared with that of the N4ITK tool. The proposed method increases tissues contrast at least 4 . 62 times more than the N4ITK tool for the BrainWeb images. The dynamic range with the N4ITK method for the same images is increased by up to +29.77%, whereas, for the proposed method, it has a corresponding limited decrease of - 1 . 15 % , as expected. The validation has demonstrated the accuracy and stability of the proposed method and hence its ability to reduce the requirements for additional calibration scans.

  4. Exact nonparametric inference for detection of nonlinear determinism

    OpenAIRE

    Luo, Xiaodong; Zhang, Jie; Small, Michael; Moroz, Irene

    2005-01-01

    We propose an exact nonparametric inference scheme for the detection of nonlinear determinism. The essential fact utilized in our scheme is that, for a linear stochastic process with jointly symmetric innovations, its ordinary least square (OLS) linear prediction error is symmetric about zero. Based on this viewpoint, a class of linear signed rank statistics, e.g. the Wilcoxon signed rank statistic, can be derived with the known null distributions from the prediction error. Thus one of the ad...

  5. Monte Carlo Bayesian inference on a statistical model of sub-gridcolumn moisture variability using high-resolution cloud observations. Part 1: Method

    Science.gov (United States)

    Norris, Peter M.; da Silva, Arlindo M.

    2018-01-01

    A method is presented to constrain a statistical model of sub-gridcolumn moisture variability using high-resolution satellite cloud data. The method can be used for large-scale model parameter estimation or cloud data assimilation. The gridcolumn model includes assumed probability density function (PDF) intra-layer horizontal variability and a copula-based inter-layer correlation model. The observables used in the current study are Moderate Resolution Imaging Spectroradiometer (MODIS) cloud-top pressure, brightness temperature and cloud optical thickness, but the method should be extensible to direct cloudy radiance assimilation for a small number of channels. The algorithm is a form of Bayesian inference with a Markov chain Monte Carlo (MCMC) approach to characterizing the posterior distribution. This approach is especially useful in cases where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach is not gradient-based and allows jumps into regions of non-zero cloud probability. The current study uses a skewed-triangle distribution for layer moisture. The article also includes a discussion of the Metropolis and multiple-try Metropolis versions of MCMC. PMID:29618847

  6. Monte Carlo Bayesian Inference on a Statistical Model of Sub-Gridcolumn Moisture Variability Using High-Resolution Cloud Observations. Part 1: Method

    Science.gov (United States)

    Norris, Peter M.; Da Silva, Arlindo M.

    2016-01-01

    A method is presented to constrain a statistical model of sub-gridcolumn moisture variability using high-resolution satellite cloud data. The method can be used for large-scale model parameter estimation or cloud data assimilation. The gridcolumn model includes assumed probability density function (PDF) intra-layer horizontal variability and a copula-based inter-layer correlation model. The observables used in the current study are Moderate Resolution Imaging Spectroradiometer (MODIS) cloud-top pressure, brightness temperature and cloud optical thickness, but the method should be extensible to direct cloudy radiance assimilation for a small number of channels. The algorithm is a form of Bayesian inference with a Markov chain Monte Carlo (MCMC) approach to characterizing the posterior distribution. This approach is especially useful in cases where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach is not gradient-based and allows jumps into regions of non-zero cloud probability. The current study uses a skewed-triangle distribution for layer moisture. The article also includes a discussion of the Metropolis and multiple-try Metropolis versions of MCMC.

  7. An approximate inversion method of geoelectrical sounding data using linear and bayesian statistical approaches. Examples of Tritrivakely volcanic lake and Mahitsy area (central part of Madagascar)

    International Nuclear Information System (INIS)

    Ranaivo Nomenjanahary, F.; Rakoto, H.; Ratsimbazafy, J.B.

    1994-08-01

    This paper is concerned with resistivity sounding measurements performed from single site (vertical sounding) or from several sites (profiles) within a bounded area. The objective is to present an accurate information about the study area and to estimate the likelihood of the produced quantitative models. The achievement of this objective obviously requires quite relevant data and processing methods. It also requires interpretation methods which should take into account the probable effect of an heterogeneous structure. In front of such difficulties, the interpretation of resistivity sounding data inevitably involves the use of inversion methods. We suggest starting the interpretation in simple situation (1-D approximation), and using the rough but correct model obtained as an a-priori model for any more refined interpretation. Related to this point of view, special attention should be paid for the inverse problem applied to the resistivity sounding data. This inverse problem is nonlinear, while linearity inherent in the functional response used to describe the physical experiment. Two different approaches are used to build an approximate but higher dimensional inversion of geoelectrical data: the linear approach and the bayesian statistical approach. Some illustrations of their application in resistivity sounding data acquired at Tritrivakely volcanic lake (single site) and at Mahitsy area (several sites) will be given. (author). 28 refs, 7 figs

  8. A novel variational Bayes multiple locus Z-statistic for genome-wide association studies with Bayesian model averaging

    Science.gov (United States)

    Logsdon, Benjamin A.; Carty, Cara L.; Reiner, Alexander P.; Dai, James Y.; Kooperberg, Charles

    2012-01-01

    Motivation: For many complex traits, including height, the majority of variants identified by genome-wide association studies (GWAS) have small effects, leaving a significant proportion of the heritable variation unexplained. Although many penalized multiple regression methodologies have been proposed to increase the power to detect associations for complex genetic architectures, they generally lack mechanisms for false-positive control and diagnostics for model over-fitting. Our methodology is the first penalized multiple regression approach that explicitly controls Type I error rates and provide model over-fitting diagnostics through a novel normally distributed statistic defined for every marker within the GWAS, based on results from a variational Bayes spike regression algorithm. Results: We compare the performance of our method to the lasso and single marker analysis on simulated data and demonstrate that our approach has superior performance in terms of power and Type I error control. In addition, using the Women's Health Initiative (WHI) SNP Health Association Resource (SHARe) GWAS of African-Americans, we show that our method has power to detect additional novel associations with body height. These findings replicate by reaching a stringent cutoff of marginal association in a larger cohort. Availability: An R-package, including an implementation of our variational Bayes spike regression (vBsr) algorithm, is available at http://kooperberg.fhcrc.org/soft.html. Contact: blogsdon@fhcrc.org Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22563072

  9. Semiparametric Bayesian Analysis of Nutritional Epidemiology Data in the Presence of Measurement Error

    KAUST Repository

    Sinha, Samiran; Mallick, Bani K.; Kipnis, Victor; Carroll, Raymond J.

    2009-01-01

    We propose a semiparametric Bayesian method for handling measurement error in nutritional epidemiological data. Our goal is to estimate nonparametrically the form of association between a disease and exposure variable while the true values

  10. Nonparametric predictive pairwise comparison with competing risks

    International Nuclear Information System (INIS)

    Coolen-Maturi, Tahani

    2014-01-01

    In reliability, failure data often correspond to competing risks, where several failure modes can cause a unit to fail. This paper presents nonparametric predictive inference (NPI) for pairwise comparison with competing risks data, assuming that the failure modes are independent. These failure modes could be the same or different among the two groups, and these can be both observed and unobserved failure modes. NPI is a statistical approach based on few assumptions, with inferences strongly based on data and with uncertainty quantified via lower and upper probabilities. The focus is on the lower and upper probabilities for the event that the lifetime of a future unit from one group, say Y, is greater than the lifetime of a future unit from the second group, say X. The paper also shows how the two groups can be compared based on particular failure mode(s), and the comparison of the two groups when some of the competing risks are combined is discussed

  11. Inference in hybrid Bayesian networks

    DEFF Research Database (Denmark)

    Lanseth, Helge; Nielsen, Thomas Dyhre; Rumí, Rafael

    2009-01-01

    Since the 1980s, Bayesian Networks (BNs) have become increasingly popular for building statistical models of complex systems. This is particularly true for boolean systems, where BNs often prove to be a more efficient modelling framework than traditional reliability-techniques (like fault trees...... decade's research on inference in hybrid Bayesian networks. The discussions are linked to an example model for estimating human reliability....

  12. Mathematical statistics

    CERN Document Server

    Pestman, Wiebe R

    2009-01-01

    This textbook provides a broad and solid introduction to mathematical statistics, including the classical subjects hypothesis testing, normal regression analysis, and normal analysis of variance. In addition, non-parametric statistics and vectorial statistics are considered, as well as applications of stochastic analysis in modern statistics, e.g., Kolmogorov-Smirnov testing, smoothing techniques, robustness and density estimation. For students with some elementary mathematical background. With many exercises. Prerequisites from measure theory and linear algebra are presented.

  13. A course in statistics with R

    CERN Document Server

    Tattar, Prabhanjan N; Manjunath, B G

    2016-01-01

    Integrates the theory and applications of statistics using R A Course in Statistics with R has been written to bridge the gap between theory and applications and explain how mathematical expressions are converted into R programs. The book has been primarily designed as a useful companion for a Masters student during each semester of the course, but will also help applied statisticians in revisiting the underpinnings of the subject. With this dual goal in mind, the book begins with R basics and quickly covers visualization and exploratory analysis. Probability and statistical inference, inclusive of classical, nonparametric, and Bayesian schools, is developed with definitions, motivations, mathematical expression and R programs in a way which will help the reader to understand the mathematical development as well as R implementation. Linear regression models, experimental designs, multivariate analysis, and categorical data analysis are treated in a way which makes effective use of visualization techniques and...

  14. Nonparametric Mixture of Regression Models.

    Science.gov (United States)

    Huang, Mian; Li, Runze; Wang, Shaoli

    2013-07-01

    Motivated by an analysis of US house price index data, we propose nonparametric finite mixture of regression models. We study the identifiability issue of the proposed models, and develop an estimation procedure by employing kernel regression. We further systematically study the sampling properties of the proposed estimators, and establish their asymptotic normality. A modified EM algorithm is proposed to carry out the estimation procedure. We show that our algorithm preserves the ascent property of the EM algorithm in an asymptotic sense. Monte Carlo simulations are conducted to examine the finite sample performance of the proposed estimation procedure. An empirical analysis of the US house price index data is illustrated for the proposed methodology.

  15. Copula Based Factorization in Bayesian Multivariate Infinite Mixture Models

    OpenAIRE

    Martin Burda; Artem Prokhorov

    2012-01-01

    Bayesian nonparametric models based on infinite mixtures of density kernels have been recently gaining in popularity due to their flexibility and feasibility of implementation even in complicated modeling scenarios. In economics, they have been particularly useful in estimating nonparametric distributions of latent variables. However, these models have been rarely applied in more than one dimension. Indeed, the multivariate case suffers from the curse of dimensionality, with a rapidly increas...

  16. Statistics

    CERN Document Server

    Hayslett, H T

    1991-01-01

    Statistics covers the basic principles of Statistics. The book starts by tackling the importance and the two kinds of statistics; the presentation of sample data; the definition, illustration and explanation of several measures of location; and the measures of variation. The text then discusses elementary probability, the normal distribution and the normal approximation to the binomial. Testing of statistical hypotheses and tests of hypotheses about the theoretical proportion of successes in a binomial population and about the theoretical mean of a normal population are explained. The text the

  17. Bayesian methods in reliability

    Science.gov (United States)

    Sander, P.; Badoux, R.

    1991-11-01

    The present proceedings from a course on Bayesian methods in reliability encompasses Bayesian statistical methods and their computational implementation, models for analyzing censored data from nonrepairable systems, the traits of repairable systems and growth models, the use of expert judgment, and a review of the problem of forecasting software reliability. Specific issues addressed include the use of Bayesian methods to estimate the leak rate of a gas pipeline, approximate analyses under great prior uncertainty, reliability estimation techniques, and a nonhomogeneous Poisson process. Also addressed are the calibration sets and seed variables of expert judgment systems for risk assessment, experimental illustrations of the use of expert judgment for reliability testing, and analyses of the predictive quality of software-reliability growth models such as the Weibull order statistics.

  18. Bayesian Kernel Mixtures for Counts.

    Science.gov (United States)

    Canale, Antonio; Dunson, David B

    2011-12-01

    Although Bayesian nonparametric mixture models for continuous data are well developed, there is a limited literature on related approaches for count data. A common strategy is to use a mixture of Poissons, which unfortunately is quite restrictive in not accounting for distributions having variance less than the mean. Other approaches include mixing multinomials, which requires finite support, and using a Dirichlet process prior with a Poisson base measure, which does not allow smooth deviations from the Poisson. As a broad class of alternative models, we propose to use nonparametric mixtures of rounded continuous kernels. An efficient Gibbs sampler is developed for posterior computation, and a simulation study is performed to assess performance. Focusing on the rounded Gaussian case, we generalize the modeling framework to account for multivariate count data, joint modeling with continuous and categorical variables, and other complications. The methods are illustrated through applications to a developmental toxicity study and marketing data. This article has supplementary material online.

  19. Nonparametric correlation models for portfolio allocation

    DEFF Research Database (Denmark)

    Aslanidis, Nektarios; Casas, Isabel

    2013-01-01

    This article proposes time-varying nonparametric and semiparametric estimators of the conditional cross-correlation matrix in the context of portfolio allocation. Simulations results show that the nonparametric and semiparametric models are best in DGPs with substantial variability or structural ...... currencies. Results show the nonparametric model generally dominates the others when evaluating in-sample. However, the semiparametric model is best for out-of-sample analysis....

  20. Bayesian methods for proteomic biomarker development

    Directory of Open Access Journals (Sweden)

    Belinda Hernández

    2015-12-01

    In this review we provide an introduction to Bayesian inference and demonstrate some of the advantages of using a Bayesian framework. We summarize how Bayesian methods have been used previously in proteomics and other areas of bioinformatics. Finally, we describe some popular and emerging Bayesian models from the statistical literature and provide a worked tutorial including code snippets to show how these methods may be applied for the evaluation of proteomic biomarkers.

  1. Projection of future climate change conditions using IPCC simulations, neural networks and Bayesian statistics. Part 2: Precipitation mean state and seasonal cycle in South America

    Energy Technology Data Exchange (ETDEWEB)

    Boulanger, Jean-Philippe [LODYC, UMR CNRS/IRD/UPMC, Tour 45-55/Etage 4/Case 100, UPMC, Paris Cedex 05 (France); University of Buenos Aires, Departamento de Ciencias de la Atmosfera y los Oceanos, Facultad de Ciencias Exactas y Naturales, Buenos Aires (Argentina); Martinez, Fernando; Segura, Enrique C. [University of Buenos Aires, Departamento de Computacion, Facultad de Ciencias Exactas y Naturales, Buenos Aires (Argentina)

    2007-02-15

    Evaluating the response of climate to greenhouse gas forcing is a major objective of the climate community, and the use of large ensemble of simulations is considered as a significant step toward that goal. The present paper thus discusses a new methodology based on neural network to mix ensemble of climate model simulations. Our analysis consists of one simulation of seven Atmosphere-Ocean Global Climate Models, which participated in the IPCC Project and provided at least one simulation for the twentieth century (20c3m) and one simulation for each of three SRES scenarios: A2, A1B and B1. Our statistical method based on neural networks and Bayesian statistics computes a transfer function between models and observations. Such a transfer function was then used to project future conditions and to derive what we would call the optimal ensemble combination for twenty-first century climate change projections. Our approach is therefore based on one statement and one hypothesis. The statement is that an optimal ensemble projection should be built by giving larger weights to models, which have more skill in representing present climate conditions. The hypothesis is that our method based on neural network is actually weighting the models that way. While the statement is actually an open question, which answer may vary according to the region or climate signal under study, our results demonstrate that the neural network approach indeed allows to weighting models according to their skills. As such, our method is an improvement of existing Bayesian methods developed to mix ensembles of simulations. However, the general low skill of climate models in simulating precipitation mean climatology implies that the final projection maps (whatever the method used to compute them) may significantly change in the future as models improve. Therefore, the projection results for late twenty-first century conditions are presented as possible projections based on the &apos

  2. Bayesian Inference on Gravitational Waves

    Directory of Open Access Journals (Sweden)

    Asad Ali

    2015-12-01

    Full Text Available The Bayesian approach is increasingly becoming popular among the astrophysics data analysis communities. However, the Pakistan statistics communities are unaware of this fertile interaction between the two disciplines. Bayesian methods have been in use to address astronomical problems since the very birth of the Bayes probability in eighteenth century. Today the Bayesian methods for the detection and parameter estimation of gravitational waves have solid theoretical grounds with a strong promise for the realistic applications. This article aims to introduce the Pakistan statistics communities to the applications of Bayesian Monte Carlo methods in the analysis of gravitational wave data with an  overview of the Bayesian signal detection and estimation methods and demonstration by a couple of simplified examples.

  3. Monte Carlo Bayesian inference on a statistical model of sub-gridcolumn moisture variability using high-resolution cloud observations. Part 2: Sensitivity tests and results

    Science.gov (United States)

    Norris, Peter M.; da Silva, Arlindo M.

    2018-01-01

    Part 1 of this series presented a Monte Carlo Bayesian method for constraining a complex statistical model of global circulation model (GCM) sub-gridcolumn moisture variability using high-resolution Moderate Resolution Imaging Spectroradiometer (MODIS) cloud data, thereby permitting parameter estimation and cloud data assimilation for large-scale models. This article performs some basic testing of this new approach, verifying that it does indeed reduce mean and standard deviation biases significantly with respect to the assimilated MODIS cloud optical depth, brightness temperature and cloud-top pressure and that it also improves the simulated rotational–Raman scattering cloud optical centroid pressure (OCP) against independent (non-assimilated) retrievals from the Ozone Monitoring Instrument (OMI). Of particular interest, the Monte Carlo method does show skill in the especially difficult case where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach allows non-gradient-based jumps into regions of non-zero cloud probability. In the example provided, the method is able to restore marine stratocumulus near the Californian coast, where the background state has a clear swath. This article also examines a number of algorithmic and physical sensitivities of the new method and provides guidance for its cost-effective implementation. One obvious difficulty for the method, and other cloud data assimilation methods as well, is the lack of information content in passive-radiometer-retrieved cloud observables on cloud vertical structure, beyond cloud-top pressure and optical thickness, thus necessitating strong dependence on the background vertical moisture structure. It is found that a simple flow-dependent correlation modification from Riishojgaard provides some help in this respect, by

  4. First study of correlation between oleic acid content and SAD gene polymorphism in olive oil samples through statistical and bayesian modeling analyses.

    Science.gov (United States)

    Ben Ayed, Rayda; Ennouri, Karim; Ercişli, Sezai; Ben Hlima, Hajer; Hanana, Mohsen; Smaoui, Slim; Rebai, Ahmed; Moreau, Fabienne

    2018-04-10

    Virgin olive oil is appreciated for its particular aroma and taste and is recognized worldwide for its nutritional value and health benefits. The olive oil contains a vast range of healthy compounds such as monounsaturated free fatty acids, especially, oleic acid. The SAD.1 polymorphism localized in the Stearoyl-acyl carrier protein desaturase gene (SAD) was genotyped and showed that it is associated with the oleic acid composition of olive oil samples. However, the effect of polymorphisms in fatty acid-related genes on olive oil monounsaturated and saturated fatty acids distribution in the Tunisian olive oil varieties is not understood. Seventeen Tunisian olive-tree varieties were selected for fatty acid content analysis by gas chromatography. The association of SAD.1 genotypes with the fatty acids composition was studied by statistical and Bayesian modeling analyses. Fatty acid content analysis showed interestingly that some Tunisian virgin olive oil varieties could be classified as a functional food and nutraceuticals due to their particular richness in oleic acid. In fact, the TT-SAD.1 genotype was found to be associated with a higher proportion of mono-unsaturated fatty acids (MUFA), mainly oleic acid (C18:1) (r = - 0.79, p SAD.1 association with the oleic acid composition of olive oil was identified among the studied varieties. This correlation fluctuated between studied varieties, which might elucidate variability in lipidic composition among them and therefore reflecting genetic diversity through differences in gene expression and biochemical pathways. SAD locus would represent an excellent marker for identifying interesting amongst virgin olive oil lipidic composition.

  5. Monte Carlo Bayesian Inference on a Statistical Model of Sub-gridcolumn Moisture Variability Using High-resolution Cloud Observations . Part II; Sensitivity Tests and Results

    Science.gov (United States)

    da Silva, Arlindo M.; Norris, Peter M.

    2013-01-01

    Part I presented a Monte Carlo Bayesian method for constraining a complex statistical model of GCM sub-gridcolumn moisture variability using high-resolution MODIS cloud data, thereby permitting large-scale model parameter estimation and cloud data assimilation. This part performs some basic testing of this new approach, verifying that it does indeed significantly reduce mean and standard deviation biases with respect to the assimilated MODIS cloud optical depth, brightness temperature and cloud top pressure, and that it also improves the simulated rotational-Ramman scattering cloud optical centroid pressure (OCP) against independent (non-assimilated) retrievals from the OMI instrument. Of particular interest, the Monte Carlo method does show skill in the especially difficult case where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach allows finite jumps into regions of non-zero cloud probability. In the example provided, the method is able to restore marine stratocumulus near the Californian coast where the background state has a clear swath. This paper also examines a number of algorithmic and physical sensitivities of the new method and provides guidance for its cost-effective implementation. One obvious difficulty for the method, and other cloud data assimilation methods as well, is the lack of information content in the cloud observables on cloud vertical structure, beyond cloud top pressure and optical thickness, thus necessitating strong dependence on the background vertical moisture structure. It is found that a simple flow-dependent correlation modification due to Riishojgaard (1998) provides some help in this respect, by better honoring inversion structures in the background state.

  6. Monte Carlo Bayesian Inference on a Statistical Model of Sub-Gridcolumn Moisture Variability Using High-Resolution Cloud Observations. Part 2: Sensitivity Tests and Results

    Science.gov (United States)

    Norris, Peter M.; da Silva, Arlindo M.

    2016-01-01

    Part 1 of this series presented a Monte Carlo Bayesian method for constraining a complex statistical model of global circulation model (GCM) sub-gridcolumn moisture variability using high-resolution Moderate Resolution Imaging Spectroradiometer (MODIS) cloud data, thereby permitting parameter estimation and cloud data assimilation for large-scale models. This article performs some basic testing of this new approach, verifying that it does indeed reduce mean and standard deviation biases significantly with respect to the assimilated MODIS cloud optical depth, brightness temperature and cloud-top pressure and that it also improves the simulated rotational-Raman scattering cloud optical centroid pressure (OCP) against independent (non-assimilated) retrievals from the Ozone Monitoring Instrument (OMI). Of particular interest, the Monte Carlo method does show skill in the especially difficult case where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach allows non-gradient-based jumps into regions of non-zero cloud probability. In the example provided, the method is able to restore marine stratocumulus near the Californian coast, where the background state has a clear swath. This article also examines a number of algorithmic and physical sensitivities of the new method and provides guidance for its cost-effective implementation. One obvious difficulty for the method, and other cloud data assimilation methods as well, is the lack of information content in passive-radiometer-retrieved cloud observables on cloud vertical structure, beyond cloud-top pressure and optical thickness, thus necessitating strong dependence on the background vertical moisture structure. It is found that a simple flow-dependent correlation modification from Riishojgaard provides some help in this respect, by

  7. Quantum-Like Bayesian Networks for Modeling Decision Making

    Directory of Open Access Journals (Sweden)

    Catarina eMoreira

    2016-01-01

    Full Text Available In this work, we explore an alternative quantum structure to perform quantum probabilistic inferences to accommodate the paradoxical findings of the Sure Thing Principle. We propose a Quantum-Like Bayesian Network, which consists in replacing classical probabilities by quantum probability amplitudes. However, since this approach suffers from the problem of exponential growth of quantum parameters, we also propose a similarity heuristic that automatically fits quantum parameters through vector similarities. This makes the proposed model general and predictive in contrast to the current state of the art models, which cannot be generalized for more complex decision scenarios and that only provide an explanatory nature for the observed paradoxes. In the end, the model that we propose consists in a nonparametric method for estimating inference effects from a statistical point of view. It is a statistical model that is simpler than the previous quantum dynamic and quantum-like models proposed in the literature. We tested the proposed network with several empirical data from the literature, mainly from the Prisoner's Dilemma game and the Two Stage Gambling game. The results obtained show that the proposed quantum Bayesian Network is a general method that can accommodate violations of the laws of classical probability theory and make accurate predictions regarding human decision-making in these scenarios.

  8. Statistics

    Science.gov (United States)

    Links to sources of cancer-related statistics, including the Surveillance, Epidemiology and End Results (SEER) Program, SEER-Medicare datasets, cancer survivor prevalence data, and the Cancer Trends Progress Report.

  9. Bayesian biostatistics

    CERN Document Server

    Lesaffre, Emmanuel

    2012-01-01

    The growth of biostatistics has been phenomenal in recent years and has been marked by considerable technical innovation in both methodology and computational practicality. One area that has experienced significant growth is Bayesian methods. The growing use of Bayesian methodology has taken place partly due to an increasing number of practitioners valuing the Bayesian paradigm as matching that of scientific discovery. In addition, computational advances have allowed for more complex models to be fitted routinely to realistic data sets. Through examples, exercises and a combination of introd

  10. Nonparametric e-Mixture Estimation.

    Science.gov (United States)

    Takano, Ken; Hino, Hideitsu; Akaho, Shotaro; Murata, Noboru

    2016-12-01

    This study considers the common situation in data analysis when there are few observations of the distribution of interest or the target distribution, while abundant observations are available from auxiliary distributions. In this situation, it is natural to compensate for the lack of data from the target distribution by using data sets from these auxiliary distributions-in other words, approximating the target distribution in a subspace spanned by a set of auxiliary distributions. Mixture modeling is one of the simplest ways to integrate information from the target and auxiliary distributions in order to express the target distribution as accurately as possible. There are two typical mixtures in the context of information geometry: the [Formula: see text]- and [Formula: see text]-mixtures. The [Formula: see text]-mixture is applied in a variety of research fields because of the presence of the well-known expectation-maximazation algorithm for parameter estimation, whereas the [Formula: see text]-mixture is rarely used because of its difficulty of estimation, particularly for nonparametric models. The [Formula: see text]-mixture, however, is a well-tempered distribution that satisfies the principle of maximum entropy. To model a target distribution with scarce observations accurately, this letter proposes a novel framework for a nonparametric modeling of the [Formula: see text]-mixture and a geometrically inspired estimation algorithm. As numerical examples of the proposed framework, a transfer learning setup is considered. The experimental results show that this framework works well for three types of synthetic data sets, as well as an EEG real-world data set.

  11. Statistics

    International Nuclear Information System (INIS)

    2005-01-01

    For the years 2004 and 2005 the figures shown in the tables of Energy Review are partly preliminary. The annual statistics published in Energy Review are presented in more detail in a publication called Energy Statistics that comes out yearly. Energy Statistics also includes historical time-series over a longer period of time (see e.g. Energy Statistics, Statistics Finland, Helsinki 2004.) The applied energy units and conversion coefficients are shown in the back cover of the Review. Explanatory notes to the statistical tables can be found after tables and figures. The figures presents: Changes in GDP, energy consumption and electricity consumption, Carbon dioxide emissions from fossile fuels use, Coal consumption, Consumption of natural gas, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices in heat production, Fuel prices in electricity production, Price of electricity by type of consumer, Average monthly spot prices at the Nord pool power exchange, Total energy consumption by source and CO 2 -emissions, Supplies and total consumption of electricity GWh, Energy imports by country of origin in January-June 2003, Energy exports by recipient country in January-June 2003, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Price of natural gas by type of consumer, Price of electricity by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes, precautionary stock fees and oil pollution fees

  12. Nonparametric Change Point Diagnosis Method of Concrete Dam Crack Behavior Abnormality

    Directory of Open Access Journals (Sweden)

    Zhanchao Li

    2013-01-01

    Full Text Available The study on diagnosis method of concrete crack behavior abnormality has always been a hot spot and difficulty in the safety monitoring field of hydraulic structure. Based on the performance of concrete dam crack behavior abnormality in parametric statistical model and nonparametric statistical model, the internal relation between concrete dam crack behavior abnormality and statistical change point theory is deeply analyzed from the model structure instability of parametric statistical model and change of sequence distribution law of nonparametric statistical model. On this basis, through the reduction of change point problem, the establishment of basic nonparametric change point model, and asymptotic analysis on test method of basic change point problem, the nonparametric change point diagnosis method of concrete dam crack behavior abnormality is created in consideration of the situation that in practice concrete dam crack behavior may have more abnormality points. And the nonparametric change point diagnosis method of concrete dam crack behavior abnormality is used in the actual project, demonstrating the effectiveness and scientific reasonableness of the method established. Meanwhile, the nonparametric change point diagnosis method of concrete dam crack behavior abnormality has a complete theoretical basis and strong practicality with a broad application prospect in actual project.

  13. Bayesian Utilitarianism

    OpenAIRE

    ZHOU, Lin

    1996-01-01

    In this paper I consider social choices under uncertainty. I prove that any social choice rule that satisfies independence of irrelevant alternatives, translation invariance, and weak anonymity is consistent with ex post Bayesian utilitarianism

  14. Statistics

    International Nuclear Information System (INIS)

    2001-01-01

    For the year 2000, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy Review appear in more detail from the publication Energiatilastot - Energy Statistics issued annually, which also includes historical time series over a longer period (see e.g. Energiatilastot 1999, Statistics Finland, Helsinki 2000, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after tables and figures. The figures presents: Changes in the volume of GNP and energy consumption, Changes in the volume of GNP and electricity, Coal consumption, Natural gas consumption, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices for heat production, Fuel prices for electricity production, Carbon dioxide emissions from the use of fossil fuels, Total energy consumption by source and CO 2 -emissions, Electricity supply, Energy imports by country of origin in 2000, Energy exports by recipient country in 2000, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Average electricity price by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes and precautionary stock fees on oil products

  15. Statistics

    International Nuclear Information System (INIS)

    2000-01-01

    For the year 1999 and 2000, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy Review appear in more detail from the publication Energiatilastot - Energy Statistics issued annually, which also includes historical time series over a longer period (see e.g., Energiatilastot 1998, Statistics Finland, Helsinki 1999, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after tables and figures. The figures presents: Changes in the volume of GNP and energy consumption, Changes in the volume of GNP and electricity, Coal consumption, Natural gas consumption, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices for heat production, Fuel prices for electricity production, Carbon dioxide emissions, Total energy consumption by source and CO 2 -emissions, Electricity supply, Energy imports by country of origin in January-March 2000, Energy exports by recipient country in January-March 2000, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Average electricity price by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes and precautionary stock fees on oil products

  16. Statistics

    International Nuclear Information System (INIS)

    1999-01-01

    For the year 1998 and the year 1999, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy Review appear in more detail from the publication Energiatilastot - Energy Statistics issued annually, which also includes historical time series over a longer period (see e.g. Energiatilastot 1998, Statistics Finland, Helsinki 1999, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after tables and figures. The figures presents: Changes in the volume of GNP and energy consumption, Changes in the volume of GNP and electricity, Coal consumption, Natural gas consumption, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices for heat production, Fuel prices for electricity production, Carbon dioxide emissions, Total energy consumption by source and CO 2 -emissions, Electricity supply, Energy imports by country of origin in January-June 1999, Energy exports by recipient country in January-June 1999, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Average electricity price by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes and precautionary stock fees on oil products

  17. Internal representations of temporal statistics and feedback calibrate motor-sensory interval timing.

    Directory of Open Access Journals (Sweden)

    Luigi Acerbi

    Full Text Available Humans have been shown to adapt to the temporal statistics of timing tasks so as to optimize the accuracy of their responses, in agreement with the predictions of Bayesian integration. This suggests that they build an internal representation of both the experimentally imposed distribution of time intervals (the prior and of the error (the loss function. The responses of a Bayesian ideal observer depend crucially on these internal representations, which have only been previously studied for simple distributions. To study the nature of these representations we asked subjects to reproduce time intervals drawn from underlying temporal distributions of varying complexity, from uniform to highly skewed or bimodal while also varying the error mapping that determined the performance feedback. Interval reproduction times were affected by both the distribution and feedback, in good agreement with a performance-optimizing Bayesian observer and actor model. Bayesian model comparison highlighted that subjects were integrating the provided feedback and represented the experimental distribution with a smoothed approximation. A nonparametric reconstruction of the subjective priors from the data shows that they are generally in agreement with the true distributions up to third-order moments, but with systematically heavier tails. In particular, higher-order statistical features (kurtosis, multimodality seem much harder to acquire. Our findings suggest that humans have only minor constraints on learning lower-order statistical properties of unimodal (including peaked and skewed distributions of time intervals under the guidance of corrective feedback, and that their behavior is well explained by Bayesian decision theory.

  18. Statistics

    International Nuclear Information System (INIS)

    2003-01-01

    For the year 2002, part of the figures shown in the tables of the Energy Review are partly preliminary. The annual statistics of the Energy Review also includes historical time-series over a longer period (see e.g. Energiatilastot 2001, Statistics Finland, Helsinki 2002). The applied energy units and conversion coefficients are shown in the inside back cover of the Review. Explanatory notes to the statistical tables can be found after tables and figures. The figures presents: Changes in GDP, energy consumption and electricity consumption, Carbon dioxide emissions from fossile fuels use, Coal consumption, Consumption of natural gas, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices in heat production, Fuel prices in electricity production, Price of electricity by type of consumer, Average monthly spot prices at the Nord pool power exchange, Total energy consumption by source and CO 2 -emissions, Supply and total consumption of electricity GWh, Energy imports by country of origin in January-June 2003, Energy exports by recipient country in January-June 2003, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Price of natural gas by type of consumer, Price of electricity by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Excise taxes, precautionary stock fees on oil pollution fees on energy products

  19. Statistics

    International Nuclear Information System (INIS)

    2004-01-01

    For the year 2003 and 2004, the figures shown in the tables of the Energy Review are partly preliminary. The annual statistics of the Energy Review also includes historical time-series over a longer period (see e.g. Energiatilastot, Statistics Finland, Helsinki 2003, ISSN 0785-3165). The applied energy units and conversion coefficients are shown in the inside back cover of the Review. Explanatory notes to the statistical tables can be found after tables and figures. The figures presents: Changes in GDP, energy consumption and electricity consumption, Carbon dioxide emissions from fossile fuels use, Coal consumption, Consumption of natural gas, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices in heat production, Fuel prices in electricity production, Price of electricity by type of consumer, Average monthly spot prices at the Nord pool power exchange, Total energy consumption by source and CO 2 -emissions, Supplies and total consumption of electricity GWh, Energy imports by country of origin in January-March 2004, Energy exports by recipient country in January-March 2004, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Price of natural gas by type of consumer, Price of electricity by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Excise taxes, precautionary stock fees on oil pollution fees

  20. Statistics

    International Nuclear Information System (INIS)

    2000-01-01

    For the year 1999 and 2000, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy also includes historical time series over a longer period (see e.g., Energiatilastot 1999, Statistics Finland, Helsinki 2000, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after tables and figures. The figures presents: Changes in the volume of GNP and energy consumption, Changes in the volume of GNP and electricity, Coal consumption, Natural gas consumption, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices for heat production, Fuel prices for electricity production, Carbon dioxide emissions, Total energy consumption by source and CO 2 -emissions, Electricity supply, Energy imports by country of origin in January-June 2000, Energy exports by recipient country in January-June 2000, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Average electricity price by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes and precautionary stock fees on oil products

  1. High throughput nonparametric probability density estimation.

    Science.gov (United States)

    Farmer, Jenny; Jacobs, Donald

    2018-01-01

    In high throughput applications, such as those found in bioinformatics and finance, it is important to determine accurate probability distribution functions despite only minimal information about data characteristics, and without using human subjectivity. Such an automated process for univariate data is implemented to achieve this goal by merging the maximum entropy method with single order statistics and maximum likelihood. The only required properties of the random variables are that they are continuous and that they are, or can be approximated as, independent and identically distributed. A quasi-log-likelihood function based on single order statistics for sampled uniform random data is used to empirically construct a sample size invariant universal scoring function. Then a probability density estimate is determined by iteratively improving trial cumulative distribution functions, where better estimates are quantified by the scoring function that identifies atypical fluctuations. This criterion resists under and over fitting data as an alternative to employing the Bayesian or Akaike information criterion. Multiple estimates for the probability density reflect uncertainties due to statistical fluctuations in random samples. Scaled quantile residual plots are also introduced as an effective diagnostic to visualize the quality of the estimated probability densities. Benchmark tests show that estimates for the probability density function (PDF) converge to the true PDF as sample size increases on particularly difficult test probability densities that include cases with discontinuities, multi-resolution scales, heavy tails, and singularities. These results indicate the method has general applicability for high throughput statistical inference.

  2. Genomic breeding value estimation using nonparametric additive regression models

    Directory of Open Access Journals (Sweden)

    Solberg Trygve

    2009-01-01

    Full Text Available Abstract Genomic selection refers to the use of genomewide dense markers for breeding value estimation and subsequently for selection. The main challenge of genomic breeding value estimation is the estimation of many effects from a limited number of observations. Bayesian methods have been proposed to successfully cope with these challenges. As an alternative class of models, non- and semiparametric models were recently introduced. The present study investigated the ability of nonparametric additive regression models to predict genomic breeding values. The genotypes were modelled for each marker or pair of flanking markers (i.e. the predictors separately. The nonparametric functions for the predictors were estimated simultaneously using additive model theory, applying a binomial kernel. The optimal degree of smoothing was determined by bootstrapping. A mutation-drift-balance simulation was carried out. The breeding values of the last generation (genotyped was predicted using data from the next last generation (genotyped and phenotyped. The results show moderate to high accuracies of the predicted breeding values. A determination of predictor specific degree of smoothing increased the accuracy.

  3. Application of Bayesian methods to habitat selection modeling of the northern spotted owl in California: new statistical methods for wildlife research

    Science.gov (United States)

    Howard B. Stauffer; Cynthia J. Zabel; Jeffrey R. Dunk

    2005-01-01

    We compared a set of competing logistic regression habitat selection models for Northern Spotted Owls (Strix occidentalis caurina) in California. The habitat selection models were estimated, compared, evaluated, and tested using multiple sample datasets collected on federal forestlands in northern California. We used Bayesian methods in interpreting...

  4. An imprecise Dirichlet model for Bayesian analysis of failure data including right-censored observations

    International Nuclear Information System (INIS)

    Coolen, F.P.A.

    1997-01-01

    This paper is intended to make researchers in reliability theory aware of a recently introduced Bayesian model with imprecise prior distributions for statistical inference on failure data, that can also be considered as a robust Bayesian model. The model consists of a multinomial distribution with Dirichlet priors, making the approach basically nonparametric. New results for the model are presented, related to right-censored observations, where estimation based on this model is closely related to the product-limit estimator, which is an important statistical method to deal with reliability or survival data including right-censored observations. As for the product-limit estimator, the model considered in this paper aims at not using any information other than that provided by observed data, but our model fits into the robust Bayesian context which has the advantage that all inferences can be based on probabilities or expectations, or bounds for probabilities or expectations. The model uses a finite partition of the time-axis, and as such it is also related to life-tables

  5. Screen Wars, Star Wars, and Sequels: Nonparametric Reanalysis of Movie Profitability

    OpenAIRE

    W. D. Walls

    2012-01-01

    In this paper we use nonparametric statistical tools to quantify motion-picture profit. We quantify the unconditional distribution of profit, the distribution of profit conditional on stars and sequels, and we also model the conditional expectation of movie profits using a non- parametric data-driven regression model. The flexibility of the non-parametric approach accommodates the full range of possible relationships among the variables without prior specification of a functional form, thereb...

  6. Bayesian semiparametric regression models to characterize molecular evolution

    Directory of Open Access Journals (Sweden)

    Datta Saheli

    2012-10-01

    Full Text Available Abstract Background Statistical models and methods that associate changes in the physicochemical properties of amino acids with natural selection at the molecular level typically do not take into account the correlations between such properties. We propose a Bayesian hierarchical regression model with a generalization of the Dirichlet process prior on the distribution of the regression coefficients that describes the relationship between the changes in amino acid distances and natural selection in protein-coding DNA sequence alignments. Results The Bayesian semiparametric approach is illustrated with simulated data and the abalone lysin sperm data. Our method identifies groups of properties which, for this particular dataset, have a similar effect on evolution. The model also provides nonparametric site-specific estimates for the strength of conservation of these properties. Conclusions The model described here is distinguished by its ability to handle a large number of amino acid properties simultaneously, while taking into account that such data can be correlated. The multi-level clustering ability of the model allows for appealing interpretations of the results in terms of properties that are roughly equivalent from the standpoint of molecular evolution.

  7. Advances in Bayesian Modeling in Educational Research

    Science.gov (United States)

    Levy, Roy

    2016-01-01

    In this article, I provide a conceptually oriented overview of Bayesian approaches to statistical inference and contrast them with frequentist approaches that currently dominate conventional practice in educational research. The features and advantages of Bayesian approaches are illustrated with examples spanning several statistical modeling…

  8. Single versus mixture Weibull distributions for nonparametric satellite reliability

    International Nuclear Information System (INIS)

    Castet, Jean-Francois; Saleh, Joseph H.

    2010-01-01

    Long recognized as a critical design attribute for space systems, satellite reliability has not yet received the proper attention as limited on-orbit failure data and statistical analyses can be found in the technical literature. To fill this gap, we recently conducted a nonparametric analysis of satellite reliability for 1584 Earth-orbiting satellites launched between January 1990 and October 2008. In this paper, we provide an advanced parametric fit, based on mixture of Weibull distributions, and compare it with the single Weibull distribution model obtained with the Maximum Likelihood Estimation (MLE) method. We demonstrate that both parametric fits are good approximations of the nonparametric satellite reliability, but that the mixture Weibull distribution provides significant accuracy in capturing all the failure trends in the failure data, as evidenced by the analysis of the residuals and their quasi-normal dispersion.

  9. Nonparametric tests for equality of psychometric functions.

    Science.gov (United States)

    García-Pérez, Miguel A; Núñez-Antón, Vicente

    2017-12-07

    Many empirical studies measure psychometric functions (curves describing how observers' performance varies with stimulus magnitude) because these functions capture the effects of experimental conditions. To assess these effects, parametric curves are often fitted to the data and comparisons are carried out by testing for equality of mean parameter estimates across conditions. This approach is parametric and, thus, vulnerable to violations of the implied assumptions. Furthermore, testing for equality of means of parameters may be misleading: Psychometric functions may vary meaningfully across conditions on an observer-by-observer basis with no effect on the mean values of the estimated parameters. Alternative approaches to assess equality of psychometric functions per se are thus needed. This paper compares three nonparametric tests that are applicable in all situations of interest: The existing generalized Mantel-Haenszel test, a generalization of the Berry-Mielke test that was developed here, and a split variant of the generalized Mantel-Haenszel test also developed here. Their statistical properties (accuracy and power) are studied via simulation and the results show that all tests are indistinguishable as to accuracy but they differ non-uniformly as to power. Empirical use of the tests is illustrated via analyses of published data sets and practical recommendations are given. The computer code in MATLAB and R to conduct these tests is available as Electronic Supplemental Material.

  10. Rate-optimal Bayesian intensity smoothing for inhomogeneous Poisson processes

    NARCIS (Netherlands)

    Belitser, E.N.; Serra, P.; van Zanten, H.

    2015-01-01

    We apply nonparametric Bayesian methods to study the problem of estimating the intensity function of an inhomogeneous Poisson process. To motivate our results we start by analyzing count data coming from a call center which we model as a Poisson process. This analysis is carried out using a certain

  11. Rate-optimal Bayesian intensity smoothing for inhomogeneous Poisson processes

    NARCIS (Netherlands)

    Belitser, E.; Andrade Serra, De P.J.; Zanten, van J.H.

    2013-01-01

    We apply nonparametric Bayesian methods to study the problem of estimating the intensity function of an inhomogeneous Poisson process. We exhibit a prior on intensities which both leads to a computationally feasible method and enjoys desirable theoretical optimality properties. The prior we use is

  12. Nonparametric Bayes Classification and Hypothesis Testing on Manifolds

    Science.gov (United States)

    Bhattacharya, Abhishek; Dunson, David

    2012-01-01

    Our first focus is prediction of a categorical response variable using features that lie on a general manifold. For example, the manifold may correspond to the surface of a hypersphere. We propose a general kernel mixture model for the joint distribution of the response and predictors, with the kernel expressed in product form and dependence induced through the unknown mixing measure. We provide simple sufficient conditions for large support and weak and strong posterior consistency in estimating both the joint distribution of the response and predictors and the conditional distribution of the response. Focusing on a Dirichlet process prior for the mixing measure, these conditions hold using von Mises-Fisher kernels when the manifold is the unit hypersphere. In this case, Bayesian methods are developed for efficient posterior computation using slice sampling. Next we develop Bayesian nonparametric methods for testing whether there is a difference in distributions between groups of observations on the manifold having unknown densities. We prove consistency of the Bayes factor and develop efficient computational methods for its calculation. The proposed classification and testing methods are evaluated using simulation examples and applied to spherical data applications. PMID:22754028

  13. Essays on nonparametric econometrics of stochastic volatility

    NARCIS (Netherlands)

    Zu, Y.

    2012-01-01

    Volatility is a concept that describes the variation of financial returns. Measuring and modelling volatility dynamics is an important aspect of financial econometrics. This thesis is concerned with nonparametric approaches to volatility measurement and volatility model validation.

  14. Nonparametric methods for volatility density estimation

    NARCIS (Netherlands)

    Es, van Bert; Spreij, P.J.C.; Zanten, van J.H.

    2009-01-01

    Stochastic volatility modelling of financial processes has become increasingly popular. The proposed models usually contain a stationary volatility process. We will motivate and review several nonparametric methods for estimation of the density of the volatility process. Both models based on

  15. A Rigorous Statistical Framework for the Mathematics of Sensing, Exploitation and Execution

    Science.gov (United States)

    2015-05-01

    2013). Bayesian factorizations of big sparse tensors . arXiv:1306.1598. • L. Ren, Y. Wang, D. Dunson and L. Carin, “The kernel beta process, Neural... Votes , to appear in Bayesian Analysis • M. Ding, L. He, D. Dunson and L. Carin, “Nonparametric Bayesian Segmentation of Multi- variate Inhomogeneous

  16. Approximate Bayesian recursive estimation

    Czech Academy of Sciences Publication Activity Database

    Kárný, Miroslav

    2014-01-01

    Roč. 285, č. 1 (2014), s. 100-111 ISSN 0020-0255 R&D Projects: GA ČR GA13-13502S Institutional support: RVO:67985556 Keywords : Approximate parameter estimation * Bayesian recursive estimation * Kullback–Leibler divergence * Forgetting Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 4.038, year: 2014 http://library.utia.cas.cz/separaty/2014/AS/karny-0425539.pdf

  17. Sparse reconstruction using distribution agnostic bayesian matching pursuit

    KAUST Repository

    Masood, Mudassir; Al-Naffouri, Tareq Y.

    2013-01-01

    A fast matching pursuit method using a Bayesian approach is introduced for sparse signal recovery. This method performs Bayesian estimates of sparse signals even when the signal prior is non-Gaussian or unknown. It is agnostic on signal statistics

  18. Modern nonparametric, robust and multivariate methods festschrift in honour of Hannu Oja

    CERN Document Server

    Taskinen, Sara

    2015-01-01

    Written by leading experts in the field, this edited volume brings together the latest findings in the area of nonparametric, robust and multivariate statistical methods. The individual contributions cover a wide variety of topics ranging from univariate nonparametric methods to robust methods for complex data structures. Some examples from statistical signal processing are also given. The volume is dedicated to Hannu Oja on the occasion of his 65th birthday and is intended for researchers as well as PhD students with a good knowledge of statistics.

  19. Bayesian model selection techniques as decision support for shaping a statistical analysis plan of a clinical trial: An example from a vertigo phase III study with longitudinal count data as primary endpoint

    Directory of Open Access Journals (Sweden)

    Adrion Christine

    2012-09-01

    Full Text Available Abstract Background A statistical analysis plan (SAP is a critical link between how a clinical trial is conducted and the clinical study report. To secure objective study results, regulatory bodies expect that the SAP will meet requirements in pre-specifying inferential analyses and other important statistical techniques. To write a good SAP for model-based sensitivity and ancillary analyses involves non-trivial decisions on and justification of many aspects of the chosen setting. In particular, trials with longitudinal count data as primary endpoints pose challenges for model choice and model validation. In the random effects setting, frequentist strategies for model assessment and model diagnosis are complex and not easily implemented and have several limitations. Therefore, it is of interest to explore Bayesian alternatives which provide the needed decision support to finalize a SAP. Methods We focus on generalized linear mixed models (GLMMs for the analysis of longitudinal count data. A series of distributions with over- and under-dispersion is considered. Additionally, the structure of the variance components is modified. We perform a simulation study to investigate the discriminatory power of Bayesian tools for model criticism in different scenarios derived from the model setting. We apply the findings to the data from an open clinical trial on vertigo attacks. These data are seen as pilot data for an ongoing phase III trial. To fit GLMMs we use a novel Bayesian computational approach based on integrated nested Laplace approximations (INLAs. The INLA methodology enables the direct computation of leave-one-out predictive distributions. These distributions are crucial for Bayesian model assessment. We evaluate competing GLMMs for longitudinal count data according to the deviance information criterion (DIC or probability integral transform (PIT, and by using proper scoring rules (e.g. the logarithmic score. Results The instruments under study

  20. Bayesian model selection techniques as decision support for shaping a statistical analysis plan of a clinical trial: an example from a vertigo phase III study with longitudinal count data as primary endpoint.

    Science.gov (United States)

    Adrion, Christine; Mansmann, Ulrich

    2012-09-10

    A statistical analysis plan (SAP) is a critical link between how a clinical trial is conducted and the clinical study report. To secure objective study results, regulatory bodies expect that the SAP will meet requirements in pre-specifying inferential analyses and other important statistical techniques. To write a good SAP for model-based sensitivity and ancillary analyses involves non-trivial decisions on and justification of many aspects of the chosen setting. In particular, trials with longitudinal count data as primary endpoints pose challenges for model choice and model validation. In the random effects setting, frequentist strategies for model assessment and model diagnosis are complex and not easily implemented and have several limitations. Therefore, it is of interest to explore Bayesian alternatives which provide the needed decision support to finalize a SAP. We focus on generalized linear mixed models (GLMMs) for the analysis of longitudinal count data. A series of distributions with over- and under-dispersion is considered. Additionally, the structure of the variance components is modified. We perform a simulation study to investigate the discriminatory power of Bayesian tools for model criticism in different scenarios derived from the model setting. We apply the findings to the data from an open clinical trial on vertigo attacks. These data are seen as pilot data for an ongoing phase III trial. To fit GLMMs we use a novel Bayesian computational approach based on integrated nested Laplace approximations (INLAs). The INLA methodology enables the direct computation of leave-one-out predictive distributions. These distributions are crucial for Bayesian model assessment. We evaluate competing GLMMs for longitudinal count data according to the deviance information criterion (DIC) or probability integral transform (PIT), and by using proper scoring rules (e.g. the logarithmic score). The instruments under study provide excellent tools for preparing decisions

  1. Bayesian modelling of the emission spectrum of the Joint European Torus Lithium Beam Emission Spectroscopy system.

    Science.gov (United States)

    Kwak, Sehyun; Svensson, J; Brix, M; Ghim, Y-C

    2016-02-01

    A Bayesian model of the emission spectrum of the JET lithium beam has been developed to infer the intensity of the Li I (2p-2s) line radiation and associated uncertainties. The detected spectrum for each channel of the lithium beam emission spectroscopy system is here modelled by a single Li line modified by an instrumental function, Bremsstrahlung background, instrumental offset, and interference filter curve. Both the instrumental function and the interference filter curve are modelled with non-parametric Gaussian processes. All free parameters of the model, the intensities of the Li line, Bremsstrahlung background, and instrumental offset, are inferred using Bayesian probability theory with a Gaussian likelihood for photon statistics and electronic background noise. The prior distributions of the free parameters are chosen as Gaussians. Given these assumptions, the intensity of the Li line and corresponding uncertainties are analytically available using a Bayesian linear inversion technique. The proposed approach makes it possible to extract the intensity of Li line without doing a separate background subtraction through modulation of the Li beam.

  2. Applied Bayesian modelling

    CERN Document Server

    Congdon, Peter

    2014-01-01

    This book provides an accessible approach to Bayesian computing and data analysis, with an emphasis on the interpretation of real data sets. Following in the tradition of the successful first edition, this book aims to make a wide range of statistical modeling applications accessible using tested code that can be readily adapted to the reader's own applications. The second edition has been thoroughly reworked and updated to take account of advances in the field. A new set of worked examples is included. The novel aspect of the first edition was the coverage of statistical modeling using WinBU

  3. The Development of Bayesian Theory and Its Applications in Business and Bioinformatics

    Science.gov (United States)

    Zhang, Yifei

    2018-03-01

    Bayesian Theory originated from an Essay of a British mathematician named Thomas Bayes in 1763, and after its development in 20th century, Bayesian Statistics has been taking a significant part in statistical study of all fields. Due to the recent breakthrough of high-dimensional integral, Bayesian Statistics has been improved and perfected, and now it can be used to solve problems that Classical Statistics failed to solve. This paper summarizes Bayesian Statistics’ history, concepts and applications, which are illustrated in five parts: the history of Bayesian Statistics, the weakness of Classical Statistics, Bayesian Theory and its development and applications. The first two parts make a comparison between Bayesian Statistics and Classical Statistics in a macroscopic aspect. And the last three parts focus on Bayesian Theory in specific -- from introducing some particular Bayesian Statistics’ concepts to listing their development and finally their applications.

  4. Bayesian theory and applications

    CERN Document Server

    Dellaportas, Petros; Polson, Nicholas G; Stephens, David A

    2013-01-01

    The development of hierarchical models and Markov chain Monte Carlo (MCMC) techniques forms one of the most profound advances in Bayesian analysis since the 1970s and provides the basis for advances in virtually all areas of applied and theoretical Bayesian statistics. This volume guides the reader along a statistical journey that begins with the basic structure of Bayesian theory, and then provides details on most of the past and present advances in this field. The book has a unique format. There is an explanatory chapter devoted to each conceptual advance followed by journal-style chapters that provide applications or further advances on the concept. Thus, the volume is both a textbook and a compendium of papers covering a vast range of topics. It is appropriate for a well-informed novice interested in understanding the basic approach, methods and recent applications. Because of its advanced chapters and recent work, it is also appropriate for a more mature reader interested in recent applications and devel...

  5. Nonparametric Change Point Diagnosis Method of Concrete Dam Crack Behavior Abnormality

    OpenAIRE

    Li, Zhanchao; Gu, Chongshi; Wu, Zhongru

    2013-01-01

    The study on diagnosis method of concrete crack behavior abnormality has always been a hot spot and difficulty in the safety monitoring field of hydraulic structure. Based on the performance of concrete dam crack behavior abnormality in parametric statistical model and nonparametric statistical model, the internal relation between concrete dam crack behavior abnormality and statistical change point theory is deeply analyzed from the model structure instability of parametric statistical model ...

  6. Nonparametric statistics: a step-by-step approach

    National Research Council Canada - National Science Library

    Corder, Gregory W; Foreman, Dale I

    2014-01-01

    .... The book continues to follow the same format in all chapters to aid in reader comprehension, and each chapter begins with a general introduction and a list of the chapter's main learning objectives...

  7. Promotion time cure rate model with nonparametric form of covariate effects.

    Science.gov (United States)

    Chen, Tianlei; Du, Pang

    2018-05-10

    Survival data with a cured portion are commonly seen in clinical trials. Motivated from a biological interpretation of cancer metastasis, promotion time cure model is a popular alternative to the mixture cure rate model for analyzing such data. The existing promotion cure models all assume a restrictive parametric form of covariate effects, which can be incorrectly specified especially at the exploratory stage. In this paper, we propose a nonparametric approach to modeling the covariate effects under the framework of promotion time cure model. The covariate effect function is estimated by smoothing splines via the optimization of a penalized profile likelihood. Point-wise interval estimates are also derived from the Bayesian interpretation of the penalized profile likelihood. Asymptotic convergence rates are established for the proposed estimates. Simulations show excellent performance of the proposed nonparametric method, which is then applied to a melanoma study. Copyright © 2018 John Wiley & Sons, Ltd.

  8. Multivariate nonparametric regression and visualization with R and applications to finance

    CERN Document Server

    Klemelä, Jussi

    2014-01-01

    A modern approach to statistical learning and its applications through visualization methods With a unique and innovative presentation, Multivariate Nonparametric Regression and Visualization provides readers with the core statistical concepts to obtain complete and accurate predictions when given a set of data. Focusing on nonparametric methods to adapt to the multiple types of data generatingmechanisms, the book begins with an overview of classification and regression. The book then introduces and examines various tested and proven visualization techniques for learning samples and functio

  9. 3rd Bayesian Young Statisticians Meeting

    CERN Document Server

    Lanzarone, Ettore; Villalobos, Isadora; Mattei, Alessandra

    2017-01-01

    This book is a selection of peer-reviewed contributions presented at the third Bayesian Young Statisticians Meeting, BAYSM 2016, Florence, Italy, June 19-21. The meeting provided a unique opportunity for young researchers, M.S. students, Ph.D. students, and postdocs dealing with Bayesian statistics to connect with the Bayesian community at large, to exchange ideas, and to network with others working in the same field. The contributions develop and apply Bayesian methods in a variety of fields, ranging from the traditional (e.g., biostatistics and reliability) to the most innovative ones (e.g., big data and networks).

  10. Approximate Bayesian computation.

    Directory of Open Access Journals (Sweden)

    Mikael Sunnåker

    Full Text Available Approximate Bayesian computation (ABC constitutes a class of computational methods rooted in Bayesian statistics. In all model-based statistical inference, the likelihood function is of central importance, since it expresses the probability of the observed data under a particular statistical model, and thus quantifies the support data lend to particular values of parameters and to choices among different models. For simple models, an analytical formula for the likelihood function can typically be derived. However, for more complex models, an analytical formula might be elusive or the likelihood function might be computationally very costly to evaluate. ABC methods bypass the evaluation of the likelihood function. In this way, ABC methods widen the realm of models for which statistical inference can be considered. ABC methods are mathematically well-founded, but they inevitably make assumptions and approximations whose impact needs to be carefully assessed. Furthermore, the wider application domain of ABC exacerbates the challenges of parameter estimation and model selection. ABC has rapidly gained popularity over the last years and in particular for the analysis of complex problems arising in biological sciences (e.g., in population genetics, ecology, epidemiology, and systems biology.

  11. Bayesian programming

    CERN Document Server

    Bessiere, Pierre; Ahuactzin, Juan Manuel; Mekhnacha, Kamel

    2013-01-01

    Probability as an Alternative to Boolean LogicWhile logic is the mathematical foundation of rational reasoning and the fundamental principle of computing, it is restricted to problems where information is both complete and certain. However, many real-world problems, from financial investments to email filtering, are incomplete or uncertain in nature. Probability theory and Bayesian computing together provide an alternative framework to deal with incomplete and uncertain data. Decision-Making Tools and Methods for Incomplete and Uncertain DataEmphasizing probability as an alternative to Boolean

  12. A Gentle Introduction to Bayesian Analysis : Applications to Developmental Research

    NARCIS (Netherlands)

    Van de Schoot, Rens; Kaplan, David; Denissen, Jaap; Asendorpf, Jens B.; Neyer, Franz J.; van Aken, Marcel A G

    2014-01-01

    Bayesian statistical methods are becoming ever more popular in applied and fundamental research. In this study a gentle introduction to Bayesian analysis is provided. It is shown under what circumstances it is attractive to use Bayesian estimation, and how to interpret properly the results. First,

  13. A gentle introduction to Bayesian analysis : Applications to developmental research

    NARCIS (Netherlands)

    van de Schoot, R.; Kaplan, D.; Denissen, J.J.A.; Asendorpf, J.B.; Neyer, F.J.; van Aken, M.A.G.

    2014-01-01

    Bayesian statistical methods are becoming ever more popular in applied and fundamental research. In this study a gentle introduction to Bayesian analysis is provided. It is shown under what circumstances it is attractive to use Bayesian estimation, and how to interpret properly the results. First,

  14. Prior approval: the growth of Bayesian methods in psychology.

    Science.gov (United States)

    Andrews, Mark; Baguley, Thom

    2013-02-01

    Within the last few years, Bayesian methods of data analysis in psychology have proliferated. In this paper, we briefly review the history or the Bayesian approach to statistics, and consider the implications that Bayesian methods have for the theory and practice of data analysis in psychology.

  15. Strong consistency of nonparametric Bayes density estimation on compact metric spaces with applications to specific manifolds.

    Science.gov (United States)

    Bhattacharya, Abhishek; Dunson, David B

    2012-08-01

    This article considers a broad class of kernel mixture density models on compact metric spaces and manifolds. Following a Bayesian approach with a nonparametric prior on the location mixing distribution, sufficient conditions are obtained on the kernel, prior and the underlying space for strong posterior consistency at any continuous density. The prior is also allowed to depend on the sample size n and sufficient conditions are obtained for weak and strong consistency. These conditions are verified on compact Euclidean spaces using multivariate Gaussian kernels, on the hypersphere using a von Mises-Fisher kernel and on the planar shape space using complex Watson kernels.

  16. Nonparametric conditional predictive regions for time series

    NARCIS (Netherlands)

    de Gooijer, J.G.; Zerom Godefay, D.

    2000-01-01

    Several nonparametric predictors based on the Nadaraya-Watson kernel regression estimator have been proposed in the literature. They include the conditional mean, the conditional median, and the conditional mode. In this paper, we consider three types of predictive regions for these predictors — the

  17. Non-Parametric Estimation of Correlation Functions

    DEFF Research Database (Denmark)

    Brincker, Rune; Rytter, Anders; Krenk, Steen

    In this paper three methods of non-parametric correlation function estimation are reviewed and evaluated: the direct method, estimation by the Fast Fourier Transform and finally estimation by the Random Decrement technique. The basic ideas of the techniques are reviewed, sources of bias are point...

  18. Nonparametric estimation in models for unobservable heterogeneity

    OpenAIRE

    Hohmann, Daniel

    2014-01-01

    Nonparametric models which allow for data with unobservable heterogeneity are studied. The first publication introduces new estimators and their asymptotic properties for conditional mixture models. The second publication considers estimation of a function from noisy observations of its Radon transform in a Gaussian white noise model.

  19. Nonparametric estimation of location and scale parameters

    KAUST Repository

    Potgieter, C.J.; Lombard, F.

    2012-01-01

    Two random variables X and Y belong to the same location-scale family if there are constants μ and σ such that Y and μ+σX have the same distribution. In this paper we consider non-parametric estimation of the parameters μ and σ under minimal

  20. Panel data specifications in nonparametric kernel regression

    DEFF Research Database (Denmark)

    Czekaj, Tomasz Gerard; Henningsen, Arne

    parametric panel data estimators to analyse the production technology of Polish crop farms. The results of our nonparametric kernel regressions generally differ from the estimates of the parametric models but they only slightly depend on the choice of the kernel functions. Based on economic reasoning, we...

  1. Application of Fragment Ion Information as Further Evidence in Probabilistic Compound Screening Using Bayesian Statistics and Machine Learning: A Leap Toward Automation.

    Science.gov (United States)

    Woldegebriel, Michael; Zomer, Paul; Mol, Hans G J; Vivó-Truyols, Gabriel

    2016-08-02

    In this work, we introduce an automated, efficient, and elegant model to combine all pieces of evidence (e.g., expected retention times, peak shapes, isotope distributions, fragment-to-parent ratio) obtained from liquid chromatography-tandem mass spectrometry (LC-MS/MS/MS) data for screening purposes. Combining all these pieces of evidence requires a careful assessment of the uncertainties in the analytical system as well as all possible outcomes. To-date, the majority of the existing algorithms are highly dependent on user input parameters. Additionally, the screening process is tackled as a deterministic problem. In this work we present a Bayesian framework to deal with the combination of all these pieces of evidence. Contrary to conventional algorithms, the information is treated in a probabilistic way, and a final probability assessment of the presence/absence of a compound feature is computed. Additionally, all the necessary parameters except the chromatographic band broadening for the method are learned from the data in training and learning phase of the algorithm, avoiding the introduction of a large number of user-defined parameters. The proposed method was validated with a large data set and has shown improved sensitivity and specificity in comparison to a threshold-based commercial software package.

  2. Structure-based bayesian sparse reconstruction

    KAUST Repository

    Quadeer, Ahmed Abdul; Al-Naffouri, Tareq Y.

    2012-01-01

    Sparse signal reconstruction algorithms have attracted research attention due to their wide applications in various fields. In this paper, we present a simple Bayesian approach that utilizes the sparsity constraint and a priori statistical

  3. A non-parametric method for correction of global radiation observations

    DEFF Research Database (Denmark)

    Bacher, Peder; Madsen, Henrik; Perers, Bengt

    2013-01-01

    in the observations are corrected. These are errors such as: tilt in the leveling of the sensor, shadowing from surrounding objects, clipping and saturation in the signal processing, and errors from dirt and wear. The method is based on a statistical non-parametric clear-sky model which is applied to both...

  4. Bayesian Graphical Models

    DEFF Research Database (Denmark)

    Jensen, Finn Verner; Nielsen, Thomas Dyhre

    2016-01-01

    Mathematically, a Bayesian graphical model is a compact representation of the joint probability distribution for a set of variables. The most frequently used type of Bayesian graphical models are Bayesian networks. The structural part of a Bayesian graphical model is a graph consisting of nodes...

  5. Vital statistics

    CERN Document Server

    MacKenzie, Dana

    2004-01-01

    The drawbacks of using 19th-century mathematics in physics and astronomy are illustrated. To continue with the expansion of the knowledge about the cosmos, the scientists will have to come in terms with modern statistics. Some researchers have deliberately started importing techniques that are used in medical research. However, the physicists need to identify the brand of statistics that will be suitable for them, and make a choice between the Bayesian and the frequentists approach. (Edited abstract).

  6. Inference in hybrid Bayesian networks

    International Nuclear Information System (INIS)

    Langseth, Helge; Nielsen, Thomas D.; Rumi, Rafael; Salmeron, Antonio

    2009-01-01

    Since the 1980s, Bayesian networks (BNs) have become increasingly popular for building statistical models of complex systems. This is particularly true for boolean systems, where BNs often prove to be a more efficient modelling framework than traditional reliability techniques (like fault trees and reliability block diagrams). However, limitations in the BNs' calculation engine have prevented BNs from becoming equally popular for domains containing mixtures of both discrete and continuous variables (the so-called hybrid domains). In this paper we focus on these difficulties, and summarize some of the last decade's research on inference in hybrid Bayesian networks. The discussions are linked to an example model for estimating human reliability.

  7. Polarised neutron diffraction measurements of PrBa2Cu3O6+X and Bayesian statistical analysis of such data

    International Nuclear Information System (INIS)

    Markvardsen, A.J.

    2000-01-01

    The physics of the series Pr y Y 1-y Ba 2 CU 3 O 6+x , and ability of Pr to suppress superconductivity, has been a subject of frequent discussions in the literature for more than a decade. This thesis describes a polarised neutron diffraction (PND) experiment performed on PrBa 2 Cu 3 O 6.24 designed to find out something about the electron structure. This experiment pushed the limits of what can be done using the PND technique. The problem is one of a limited number of measured Fourier components that need to be inverted to form a real space image. To accomplish this inversion the maximum entropy technique has been employed. In some cases, the maximum entropy technique has the ability to increase the resolution of 'inverted' data immensely, but this ability is found to depend critically on the choice of constants used in the method. To investigate this a Bayesian robustness analysis of the maximum entropy method is carried out, resulting in an improvement of the maximum entropy technique for analysing PND data. Some results for nickel in the literature have been te-analysed and a comparison is made with different maximum entropy algorithms. Equipped with an improved data analysis technique and carefully measured PND data for PrBa 2 Cu 3 O 6.24 a number of new interesting features are observed, putting constraints on existing theoretical models of Pr y Y 1-y Ba 2 Cu 3 O 6+x and leaving room for more questions to be answered. (author)

  8. NONPARAMETRIC FIXED EFFECT PANEL DATA MODELS: RELATIONSHIP BETWEEN AIR POLLUTION AND INCOME FOR TURKEY

    Directory of Open Access Journals (Sweden)

    Rabia Ece OMAY

    2013-06-01

    Full Text Available In this study, relationship between gross domestic product (GDP per capita and sulfur dioxide (SO2 and particulate matter (PM10 per capita is modeled for Turkey. Nonparametric fixed effect panel data analysis is used for the modeling. The panel data covers 12 territories, in first level of Nomenclature of Territorial Units for Statistics (NUTS, for period of 1990-2001. Modeling of the relationship between GDP and SO2 and PM10 for Turkey, the non-parametric models have given good results.

  9. Nonparametric method for failures diagnosis in the actuating subsystem of aircraft control system

    Science.gov (United States)

    Terentev, M. N.; Karpenko, S. S.; Zybin, E. Yu; Kosyanchuk, V. V.

    2018-02-01

    In this paper we design a nonparametric method for failures diagnosis in the aircraft control system that uses the measurements of the control signals and the aircraft states only. It doesn’t require a priori information of the aircraft model parameters, training or statistical calculations, and is based on analytical nonparametric one-step-ahead state prediction approach. This makes it possible to predict the behavior of unidentified and failure dynamic systems, to weaken the requirements to control signals, and to reduce the diagnostic time and problem complexity.

  10. Non-parametric correlative uncertainty quantification and sensitivity analysis: Application to a Langmuir bimolecular adsorption model

    Science.gov (United States)

    Feng, Jinchao; Lansford, Joshua; Mironenko, Alexander; Pourkargar, Davood Babaei; Vlachos, Dionisios G.; Katsoulakis, Markos A.

    2018-03-01

    We propose non-parametric methods for both local and global sensitivity analysis of chemical reaction models with correlated parameter dependencies. The developed mathematical and statistical tools are applied to a benchmark Langmuir competitive adsorption model on a close packed platinum surface, whose parameters, estimated from quantum-scale computations, are correlated and are limited in size (small data). The proposed mathematical methodology employs gradient-based methods to compute sensitivity indices. We observe that ranking influential parameters depends critically on whether or not correlations between parameters are taken into account. The impact of uncertainty in the correlation and the necessity of the proposed non-parametric perspective are demonstrated.

  11. Non-parametric correlative uncertainty quantification and sensitivity analysis: Application to a Langmuir bimolecular adsorption model

    Directory of Open Access Journals (Sweden)

    Jinchao Feng

    2018-03-01

    Full Text Available We propose non-parametric methods for both local and global sensitivity analysis of chemical reaction models with correlated parameter dependencies. The developed mathematical and statistical tools are applied to a benchmark Langmuir competitive adsorption model on a close packed platinum surface, whose parameters, estimated from quantum-scale computations, are correlated and are limited in size (small data. The proposed mathematical methodology employs gradient-based methods to compute sensitivity indices. We observe that ranking influential parameters depends critically on whether or not correlations between parameters are taken into account. The impact of uncertainty in the correlation and the necessity of the proposed non-parametric perspective are demonstrated.

  12. Parametric and Non-Parametric System Modelling

    DEFF Research Database (Denmark)

    Nielsen, Henrik Aalborg

    1999-01-01

    the focus is on combinations of parametric and non-parametric methods of regression. This combination can be in terms of additive models where e.g. one or more non-parametric term is added to a linear regression model. It can also be in terms of conditional parametric models where the coefficients...... considered. It is shown that adaptive estimation in conditional parametric models can be performed by combining the well known methods of local polynomial regression and recursive least squares with exponential forgetting. The approach used for estimation in conditional parametric models also highlights how...... networks is included. In this paper, neural networks are used for predicting the electricity production of a wind farm. The results are compared with results obtained using an adaptively estimated ARX-model. Finally, two papers on stochastic differential equations are included. In the first paper, among...

  13. Nonparametric Bayes Modeling of Multivariate Categorical Data.

    Science.gov (United States)

    Dunson, David B; Xing, Chuanhua

    2012-01-01

    Modeling of multivariate unordered categorical (nominal) data is a challenging problem, particularly in high dimensions and cases in which one wishes to avoid strong assumptions about the dependence structure. Commonly used approaches rely on the incorporation of latent Gaussian random variables or parametric latent class models. The goal of this article is to develop a nonparametric Bayes approach, which defines a prior with full support on the space of distributions for multiple unordered categorical variables. This support condition ensures that we are not restricting the dependence structure a priori. We show this can be accomplished through a Dirichlet process mixture of product multinomial distributions, which is also a convenient form for posterior computation. Methods for nonparametric testing of violations of independence are proposed, and the methods are applied to model positional dependence within transcription factor binding motifs.

  14. Parametric Bayesian Estimation of Differential Entropy and Relative Entropy

    Directory of Open Access Journals (Sweden)

    Maya Gupta

    2010-04-01

    Full Text Available Given iid samples drawn from a distribution with known parametric form, we propose the minimization of expected Bregman divergence to form Bayesian estimates of differential entropy and relative entropy, and derive such estimators for the uniform, Gaussian, Wishart, and inverse Wishart distributions. Additionally, formulas are given for a log gamma Bregman divergence and the differential entropy and relative entropy for the Wishart and inverse Wishart. The results, as always with Bayesian estimates, depend on the accuracy of the prior parameters, but example simulations show that the performance can be substantially improved compared to maximum likelihood or state-of-the-art nonparametric estimators.

  15. The bootstrap and Bayesian bootstrap method in assessing bioequivalence

    International Nuclear Information System (INIS)

    Wan Jianping; Zhang Kongsheng; Chen Hui

    2009-01-01

    Parametric method for assessing individual bioequivalence (IBE) may concentrate on the hypothesis that the PK responses are normal. Nonparametric method for evaluating IBE would be bootstrap method. In 2001, the United States Food and Drug Administration (FDA) proposed a draft guidance. The purpose of this article is to evaluate the IBE between test drug and reference drug by bootstrap and Bayesian bootstrap method. We study the power of bootstrap test procedures and the parametric test procedures in FDA (2001). We find that the Bayesian bootstrap method is the most excellent.

  16. Bayesian analysis of CCDM models

    Science.gov (United States)

    Jesus, J. F.; Valentim, R.; Andrade-Oliveira, F.

    2017-09-01

    Creation of Cold Dark Matter (CCDM), in the context of Einstein Field Equations, produces a negative pressure term which can be used to explain the accelerated expansion of the Universe. In this work we tested six different spatially flat models for matter creation using statistical criteria, in light of SNe Ia data: Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC) and Bayesian Evidence (BE). These criteria allow to compare models considering goodness of fit and number of free parameters, penalizing excess of complexity. We find that JO model is slightly favoured over LJO/ΛCDM model, however, neither of these, nor Γ = 3αH0 model can be discarded from the current analysis. Three other scenarios are discarded either because poor fitting or because of the excess of free parameters. A method of increasing Bayesian evidence through reparameterization in order to reducing parameter degeneracy is also developed.

  17. Bayesian analysis of CCDM models

    Energy Technology Data Exchange (ETDEWEB)

    Jesus, J.F. [Universidade Estadual Paulista (Unesp), Câmpus Experimental de Itapeva, Rua Geraldo Alckmin 519, Vila N. Sra. de Fátima, Itapeva, SP, 18409-010 Brazil (Brazil); Valentim, R. [Departamento de Física, Instituto de Ciências Ambientais, Químicas e Farmacêuticas—ICAQF, Universidade Federal de São Paulo (UNIFESP), Unidade José Alencar, Rua São Nicolau No. 210, Diadema, SP, 09913-030 Brazil (Brazil); Andrade-Oliveira, F., E-mail: jfjesus@itapeva.unesp.br, E-mail: valentim.rodolfo@unifesp.br, E-mail: felipe.oliveira@port.ac.uk [Institute of Cosmology and Gravitation—University of Portsmouth, Burnaby Road, Portsmouth, PO1 3FX United Kingdom (United Kingdom)

    2017-09-01

    Creation of Cold Dark Matter (CCDM), in the context of Einstein Field Equations, produces a negative pressure term which can be used to explain the accelerated expansion of the Universe. In this work we tested six different spatially flat models for matter creation using statistical criteria, in light of SNe Ia data: Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC) and Bayesian Evidence (BE). These criteria allow to compare models considering goodness of fit and number of free parameters, penalizing excess of complexity. We find that JO model is slightly favoured over LJO/ΛCDM model, however, neither of these, nor Γ = 3α H {sub 0} model can be discarded from the current analysis. Three other scenarios are discarded either because poor fitting or because of the excess of free parameters. A method of increasing Bayesian evidence through reparameterization in order to reducing parameter degeneracy is also developed.

  18. portfolio optimization based on nonparametric estimation methods

    Directory of Open Access Journals (Sweden)

    mahsa ghandehari

    2017-03-01

    Full Text Available One of the major issues investors are facing with in capital markets is decision making about select an appropriate stock exchange for investing and selecting an optimal portfolio. This process is done through the risk and expected return assessment. On the other hand in portfolio selection problem if the assets expected returns are normally distributed, variance and standard deviation are used as a risk measure. But, the expected returns on assets are not necessarily normal and sometimes have dramatic differences from normal distribution. This paper with the introduction of conditional value at risk ( CVaR, as a measure of risk in a nonparametric framework, for a given expected return, offers the optimal portfolio and this method is compared with the linear programming method. The data used in this study consists of monthly returns of 15 companies selected from the top 50 companies in Tehran Stock Exchange during the winter of 1392 which is considered from April of 1388 to June of 1393. The results of this study show the superiority of nonparametric method over the linear programming method and the nonparametric method is much faster than the linear programming method.

  19. Nonparametric Mixture Models for Supervised Image Parcellation.

    Science.gov (United States)

    Sabuncu, Mert R; Yeo, B T Thomas; Van Leemput, Koen; Fischl, Bruce; Golland, Polina

    2009-09-01

    We present a nonparametric, probabilistic mixture model for the supervised parcellation of images. The proposed model yields segmentation algorithms conceptually similar to the recently developed label fusion methods, which register a new image with each training image separately. Segmentation is achieved via the fusion of transferred manual labels. We show that in our framework various settings of a model parameter yield algorithms that use image intensity information differently in determining the weight of a training subject during fusion. One particular setting computes a single, global weight per training subject, whereas another setting uses locally varying weights when fusing the training data. The proposed nonparametric parcellation approach capitalizes on recently developed fast and robust pairwise image alignment tools. The use of multiple registrations allows the algorithm to be robust to occasional registration failures. We report experiments on 39 volumetric brain MRI scans with expert manual labels for the white matter, cerebral cortex, ventricles and subcortical structures. The results demonstrate that the proposed nonparametric segmentation framework yields significantly better segmentation than state-of-the-art algorithms.

  20. Fast Bayesian Inference in Dirichlet Process Mixture Models.

    Science.gov (United States)

    Wang, Lianming; Dunson, David B

    2011-01-01

    There has been increasing interest in applying Bayesian nonparametric methods in large samples and high dimensions. As Markov chain Monte Carlo (MCMC) algorithms are often infeasible, there is a pressing need for much faster algorithms. This article proposes a fast approach for inference in Dirichlet process mixture (DPM) models. Viewing the partitioning of subjects into clusters as a model selection problem, we propose a sequential greedy search algorithm for selecting the partition. Then, when conjugate priors are chosen, the resulting posterior conditionally on the selected partition is available in closed form. This approach allows testing of parametric models versus nonparametric alternatives based on Bayes factors. We evaluate the approach using simulation studies and compare it with four other fast nonparametric methods in the literature. We apply the proposed approach to three datasets including one from a large epidemiologic study. Matlab codes for the simulation and data analyses using the proposed approach are available online in the supplemental materials.

  1. Bayesian Latent Class Analysis Tutorial.

    Science.gov (United States)

    Li, Yuelin; Lord-Bessen, Jennifer; Shiyko, Mariya; Loeb, Rebecca

    2018-01-01

    This article is a how-to guide on Bayesian computation using Gibbs sampling, demonstrated in the context of Latent Class Analysis (LCA). It is written for students in quantitative psychology or related fields who have a working knowledge of Bayes Theorem and conditional probability and have experience in writing computer programs in the statistical language R . The overall goals are to provide an accessible and self-contained tutorial, along with a practical computation tool. We begin with how Bayesian computation is typically described in academic articles. Technical difficulties are addressed by a hypothetical, worked-out example. We show how Bayesian computation can be broken down into a series of simpler calculations, which can then be assembled together to complete a computationally more complex model. The details are described much more explicitly than what is typically available in elementary introductions to Bayesian modeling so that readers are not overwhelmed by the mathematics. Moreover, the provided computer program shows how Bayesian LCA can be implemented with relative ease. The computer program is then applied in a large, real-world data set and explained line-by-line. We outline the general steps in how to extend these considerations to other methodological applications. We conclude with suggestions for further readings.

  2. STUDIES IN ASTRONOMICAL TIME SERIES ANALYSIS. VI. BAYESIAN BLOCK REPRESENTATIONS

    Energy Technology Data Exchange (ETDEWEB)

    Scargle, Jeffrey D. [Space Science and Astrobiology Division, MS 245-3, NASA Ames Research Center, Moffett Field, CA 94035-1000 (United States); Norris, Jay P. [Physics Department, Boise State University, 2110 University Drive, Boise, ID 83725-1570 (United States); Jackson, Brad [The Center for Applied Mathematics and Computer Science, Department of Mathematics, San Jose State University, One Washington Square, MH 308, San Jose, CA 95192-0103 (United States); Chiang, James, E-mail: jeffrey.d.scargle@nasa.gov [W. W. Hansen Experimental Physics Laboratory, Kavli Institute for Particle Astrophysics and Cosmology, Department of Physics and SLAC National Accelerator Laboratory, Stanford University, Stanford, CA 94305 (United States)

    2013-02-20

    This paper addresses the problem of detecting and characterizing local variability in time series and other forms of sequential data. The goal is to identify and characterize statistically significant variations, at the same time suppressing the inevitable corrupting observational errors. We present a simple nonparametric modeling technique and an algorithm implementing it-an improved and generalized version of Bayesian Blocks-that finds the optimal segmentation of the data in the observation interval. The structure of the algorithm allows it to be used in either a real-time trigger mode, or a retrospective mode. Maximum likelihood or marginal posterior functions to measure model fitness are presented for events, binned counts, and measurements at arbitrary times with known error distributions. Problems addressed include those connected with data gaps, variable exposure, extension to piecewise linear and piecewise exponential representations, multivariate time series data, analysis of variance, data on the circle, other data modes, and dispersed data. Simulations provide evidence that the detection efficiency for weak signals is close to a theoretical asymptotic limit derived by Arias-Castro et al. In the spirit of Reproducible Research all of the code and data necessary to reproduce all of the figures in this paper are included as supplementary material.

  3. Studies in Astronomical Time Series Analysis. VI. Bayesian Block Representations

    Science.gov (United States)

    Scargle, Jeffrey D.; Norris, Jay P.; Jackson, Brad; Chiang, James

    2013-01-01

    This paper addresses the problem of detecting and characterizing local variability in time series and other forms of sequential data. The goal is to identify and characterize statistically significant variations, at the same time suppressing the inevitable corrupting observational errors. We present a simple nonparametric modeling technique and an algorithm implementing it-an improved and generalized version of Bayesian Blocks [Scargle 1998]-that finds the optimal segmentation of the data in the observation interval. The structure of the algorithm allows it to be used in either a real-time trigger mode, or a retrospective mode. Maximum likelihood or marginal posterior functions to measure model fitness are presented for events, binned counts, and measurements at arbitrary times with known error distributions. Problems addressed include those connected with data gaps, variable exposure, extension to piece- wise linear and piecewise exponential representations, multivariate time series data, analysis of variance, data on the circle, other data modes, and dispersed data. Simulations provide evidence that the detection efficiency for weak signals is close to a theoretical asymptotic limit derived by [Arias-Castro, Donoho and Huo 2003]. In the spirit of Reproducible Research [Donoho et al. (2008)] all of the code and data necessary to reproduce all of the figures in this paper are included as auxiliary material.

  4. STUDIES IN ASTRONOMICAL TIME SERIES ANALYSIS. VI. BAYESIAN BLOCK REPRESENTATIONS

    International Nuclear Information System (INIS)

    Scargle, Jeffrey D.; Norris, Jay P.; Jackson, Brad; Chiang, James

    2013-01-01

    This paper addresses the problem of detecting and characterizing local variability in time series and other forms of sequential data. The goal is to identify and characterize statistically significant variations, at the same time suppressing the inevitable corrupting observational errors. We present a simple nonparametric modeling technique and an algorithm implementing it—an improved and generalized version of Bayesian Blocks—that finds the optimal segmentation of the data in the observation interval. The structure of the algorithm allows it to be used in either a real-time trigger mode, or a retrospective mode. Maximum likelihood or marginal posterior functions to measure model fitness are presented for events, binned counts, and measurements at arbitrary times with known error distributions. Problems addressed include those connected with data gaps, variable exposure, extension to piecewise linear and piecewise exponential representations, multivariate time series data, analysis of variance, data on the circle, other data modes, and dispersed data. Simulations provide evidence that the detection efficiency for weak signals is close to a theoretical asymptotic limit derived by Arias-Castro et al. In the spirit of Reproducible Research all of the code and data necessary to reproduce all of the figures in this paper are included as supplementary material.

  5. Statistical Analysis of Data for Timber Strengths

    DEFF Research Database (Denmark)

    Sørensen, John Dalsgaard; Hoffmeyer, P.

    Statistical analyses are performed for material strength parameters from approximately 6700 specimens of structural timber. Non-parametric statistical analyses and fits to the following distributions types have been investigated: Normal, Lognormal, 2 parameter Weibull and 3-parameter Weibull...

  6. Towards Bayesian Inference of the Fast-Ion Distribution Function

    DEFF Research Database (Denmark)

    Stagner, L.; Heidbrink, W.W.; Salewski, Mirko

    2012-01-01

    sensitivity of the measurements are incorporated into Bayesian likelihood probabilities, while prior probabilities enforce physical constraints. As an initial step, this poster uses Bayesian statistics to infer the DIII-D electron density profile from multiple diagnostic measurements. Likelihood functions....... However, when theory and experiment disagree (for one or more diagnostics), it is unclear how to proceed. Bayesian statistics provides a framework to infer the DF, quantify errors, and reconcile discrepant diagnostic measurements. Diagnostic errors and ``weight functions" that describe the phase space...

  7. New applications of statistical tools in plant pathology.

    Science.gov (United States)

    Garrett, K A; Madden, L V; Hughes, G; Pfender, W F

    2004-09-01

    ABSTRACT The series of papers introduced by this one address a range of statistical applications in plant pathology, including survival analysis, nonparametric analysis of disease associations, multivariate analyses, neural networks, meta-analysis, and Bayesian statistics. Here we present an overview of additional applications of statistics in plant pathology. An analysis of variance based on the assumption of normally distributed responses with equal variances has been a standard approach in biology for decades. Advances in statistical theory and computation now make it convenient to appropriately deal with discrete responses using generalized linear models, with adjustments for overdispersion as needed. New nonparametric approaches are available for analysis of ordinal data such as disease ratings. Many experiments require the use of models with fixed and random effects for data analysis. New or expanded computing packages, such as SAS PROC MIXED, coupled with extensive advances in statistical theory, allow for appropriate analyses of normally distributed data using linear mixed models, and discrete data with generalized linear mixed models. Decision theory offers a framework in plant pathology for contexts such as the decision about whether to apply or withhold a treatment. Model selection can be performed using Akaike's information criterion. Plant pathologists studying pathogens at the population level have traditionally been the main consumers of statistical approaches in plant pathology, but new technologies such as microarrays supply estimates of gene expression for thousands of genes simultaneously and present challenges for statistical analysis. Applications to the study of the landscape of the field and of the genome share the risk of pseudoreplication, the problem of determining the appropriate scale of the experimental unit and of obtaining sufficient replication at that scale.

  8. Bayesian artificial intelligence

    CERN Document Server

    Korb, Kevin B

    2010-01-01

    Updated and expanded, Bayesian Artificial Intelligence, Second Edition provides a practical and accessible introduction to the main concepts, foundation, and applications of Bayesian networks. It focuses on both the causal discovery of networks and Bayesian inference procedures. Adopting a causal interpretation of Bayesian networks, the authors discuss the use of Bayesian networks for causal modeling. They also draw on their own applied research to illustrate various applications of the technology.New to the Second EditionNew chapter on Bayesian network classifiersNew section on object-oriente

  9. Bayesian artificial intelligence

    CERN Document Server

    Korb, Kevin B

    2003-01-01

    As the power of Bayesian techniques has become more fully realized, the field of artificial intelligence has embraced Bayesian methodology and integrated it to the point where an introduction to Bayesian techniques is now a core course in many computer science programs. Unlike other books on the subject, Bayesian Artificial Intelligence keeps mathematical detail to a minimum and covers a broad range of topics. The authors integrate all of Bayesian net technology and learning Bayesian net technology and apply them both to knowledge engineering. They emphasize understanding and intuition but also provide the algorithms and technical background needed for applications. Software, exercises, and solutions are available on the authors' website.

  10. Nonparametric Estimation of Distributions in Random Effects Models

    KAUST Repository

    Hart, Jeffrey D.

    2011-01-01

    We propose using minimum distance to obtain nonparametric estimates of the distributions of components in random effects models. A main setting considered is equivalent to having a large number of small datasets whose locations, and perhaps scales, vary randomly, but which otherwise have a common distribution. Interest focuses on estimating the distribution that is common to all datasets, knowledge of which is crucial in multiple testing problems where a location/scale invariant test is applied to every small dataset. A detailed algorithm for computing minimum distance estimates is proposed, and the usefulness of our methodology is illustrated by a simulation study and an analysis of microarray data. Supplemental materials for the article, including R-code and a dataset, are available online. © 2011 American Statistical Association.

  11. Nonparametric estimation of benchmark doses in environmental risk assessment

    Science.gov (United States)

    Piegorsch, Walter W.; Xiong, Hui; Bhattacharya, Rabi N.; Lin, Lizhen

    2013-01-01

    Summary An important statistical objective in environmental risk analysis is estimation of minimum exposure levels, called benchmark doses (BMDs), that induce a pre-specified benchmark response in a dose-response experiment. In such settings, representations of the risk are traditionally based on a parametric dose-response model. It is a well-known concern, however, that if the chosen parametric form is misspecified, inaccurate and possibly unsafe low-dose inferences can result. We apply a nonparametric approach for calculating benchmark doses, based on an isotonic regression method for dose-response estimation with quantal-response data (Bhattacharya and Kong, 2007). We determine the large-sample properties of the estimator, develop bootstrap-based confidence limits on the BMDs, and explore the confidence limits’ small-sample properties via a short simulation study. An example from cancer risk assessment illustrates the calculations. PMID:23914133

  12. Bayesian analysis of energy and count rate data for detection of low count rate radioactive sources

    Energy Technology Data Exchange (ETDEWEB)

    Klumpp, John [Colorado State University, Department of Environmental and Radiological Health Sciences, Molecular and Radiological Biosciences Building, Colorado State University, Fort Collins, Colorado, 80523 (United States)

    2013-07-01

    We propose a radiation detection system which generates its own discrete sampling distribution based on past measurements of background. The advantage to this approach is that it can take into account variations in background with respect to time, location, energy spectra, detector-specific characteristics (i.e. different efficiencies at different count rates and energies), etc. This would therefore be a 'machine learning' approach, in which the algorithm updates and improves its characterization of background over time. The system would have a 'learning mode,' in which it measures and analyzes background count rates, and a 'detection mode,' in which it compares measurements from an unknown source against its unique background distribution. By characterizing and accounting for variations in the background, general purpose radiation detectors can be improved with little or no increase in cost. The statistical and computational techniques to perform this kind of analysis have already been developed. The necessary signal analysis can be accomplished using existing Bayesian algorithms which account for multiple channels, multiple detectors, and multiple time intervals. Furthermore, Bayesian machine-learning techniques have already been developed which, with trivial modifications, can generate appropriate decision thresholds based on the comparison of new measurements against a nonparametric sampling distribution. (authors)

  13. Non-parametric smoothing of experimental data

    International Nuclear Information System (INIS)

    Kuketayev, A.T.; Pen'kov, F.M.

    2007-01-01

    Full text: Rapid processing of experimental data samples in nuclear physics often requires differentiation in order to find extrema. Therefore, even at the preliminary stage of data analysis, a range of noise reduction methods are used to smooth experimental data. There are many non-parametric smoothing techniques: interval averages, moving averages, exponential smoothing, etc. Nevertheless, it is more common to use a priori information about the behavior of the experimental curve in order to construct smoothing schemes based on the least squares techniques. The latter methodology's advantage is that the area under the curve can be preserved, which is equivalent to conservation of total speed of counting. The disadvantages of this approach include the lack of a priori information. For example, very often the sums of undifferentiated (by a detector) peaks are replaced with one peak during the processing of data, introducing uncontrolled errors in the determination of the physical quantities. The problem is solvable only by having experienced personnel, whose skills are much greater than the challenge. We propose a set of non-parametric techniques, which allows the use of any additional information on the nature of experimental dependence. The method is based on a construction of a functional, which includes both experimental data and a priori information. Minimum of this functional is reached on a non-parametric smoothed curve. Euler (Lagrange) differential equations are constructed for these curves; then their solutions are obtained analytically or numerically. The proposed approach allows for automated processing of nuclear physics data, eliminating the need for highly skilled laboratory personnel. Pursuant to the proposed approach is the possibility to obtain smoothing curves in a given confidence interval, e.g. according to the χ 2 distribution. This approach is applicable when constructing smooth solutions of ill-posed problems, in particular when solving

  14. Decompounding random sums: A nonparametric approach

    DEFF Research Database (Denmark)

    Hansen, Martin Bøgsted; Pitts, Susan M.

    Observations from sums of random variables with a random number of summands, known as random, compound or stopped sums arise within many areas of engineering and science. Quite often it is desirable to infer properties of the distribution of the terms in the random sum. In the present paper we...... review a number of applications and consider the nonlinear inverse problem of inferring the cumulative distribution function of the components in the random sum. We review the existing literature on non-parametric approaches to the problem. The models amenable to the analysis are generalized considerably...

  15. A Nonparametric Test for Seasonal Unit Roots

    OpenAIRE

    Kunst, Robert M.

    2009-01-01

    Abstract: We consider a nonparametric test for the null of seasonal unit roots in quarterly time series that builds on the RUR (records unit root) test by Aparicio, Escribano, and Sipols. We find that the test concept is more promising than a formalization of visual aids such as plots by quarter. In order to cope with the sensitivity of the original RUR test to autocorrelation under its null of a unit root, we suggest an augmentation step by autoregression. We present some evidence on the siz...

  16. Testing for constant nonparametric effects in general semiparametric regression models with interactions

    KAUST Repository

    Wei, Jiawei

    2011-07-01

    We consider the problem of testing for a constant nonparametric effect in a general semi-parametric regression model when there is the potential for interaction between the parametrically and nonparametrically modeled variables. The work was originally motivated by a unique testing problem in genetic epidemiology (Chatterjee, et al., 2006) that involved a typical generalized linear model but with an additional term reminiscent of the Tukey one-degree-of-freedom formulation, and their interest was in testing for main effects of the genetic variables, while gaining statistical power by allowing for a possible interaction between genes and the environment. Later work (Maity, et al., 2009) involved the possibility of modeling the environmental variable nonparametrically, but they focused on whether there was a parametric main effect for the genetic variables. In this paper, we consider the complementary problem, where the interest is in testing for the main effect of the nonparametrically modeled environmental variable. We derive a generalized likelihood ratio test for this hypothesis, show how to implement it, and provide evidence that our method can improve statistical power when compared to standard partially linear models with main effects only. We use the method for the primary purpose of analyzing data from a case-control study of colorectal adenoma.

  17. Bayesian inversion of microtremor array dispersion data in southwestern British Columbia

    Science.gov (United States)

    Molnar, Sheri; Dosso, Stan E.; Cassidy, John F.

    2010-11-01

    This paper applies Bayesian inversion, with evaluation of data errors and model parametrization, to produce the most-probable shear-wave velocity profile together with quantitative uncertainty estimates from microtremor array dispersion data. Generally, the most important property for characterizing earthquake site response is the shear-wave velocity (VS) profile. The microtremor array method determines phase velocity dispersion of Rayleigh surface waves from multi-instrument recordings of urban noise. Inversion of dispersion curves for VS structure is a non-unique and non-linear problem such that meaningful evaluation of confidence intervals is required. Quantitative uncertainty estimation requires not only a non-linear inversion approach that samples models proportional to their probability, but also rigorous estimation of the data error statistics and an appropriate model parametrization. This paper applies a Bayesian formulation that represents the solution of the inverse problem in terms of the posterior probability density (PPD) of the geophysical model parameters. Markov-chain Monte Carlo methods are used with an efficient implementation of Metropolis-Hastings sampling to provide an unbiased sample from the PPD to compute parameter uncertainties and inter-relationships. Nonparametric estimation of a data error covariance matrix from residual analysis is applied with rigorous a posteriori statistical tests to validate the covariance estimate and the assumption of a Gaussian error distribution. The most appropriate model parametrization is determined using the Bayesian information criterion, which provides the simplest model consistent with the resolving power of the data. Parametrizations considered vary in the number of layers, and include layers with uniform, linear and power-law gradients. Parameter uncertainties are found to be underestimated when data error correlations are neglected and when compressional-wave velocity and/or density (nuisance

  18. Estimating extreme river discharges in Europe through a Bayesian network

    Science.gov (United States)

    Paprotny, Dominik; Morales-Nápoles, Oswaldo

    2017-06-01

    Large-scale hydrological modelling of flood hazards requires adequate extreme discharge data. In practise, models based on physics are applied alongside those utilizing only statistical analysis. The former require enormous computational power, while the latter are mostly limited in accuracy and spatial coverage. In this paper we introduce an alternate, statistical approach based on Bayesian networks (BNs), a graphical model for dependent random variables. We use a non-parametric BN to describe the joint distribution of extreme discharges in European rivers and variables representing the geographical characteristics of their catchments. Annual maxima of daily discharges from more than 1800 river gauges (stations with catchment areas ranging from 1.4 to 807 000 km2) were collected, together with information on terrain, land use and local climate. The (conditional) correlations between the variables are modelled through copulas, with the dependency structure defined in the network. The results show that using this method, mean annual maxima and return periods of discharges could be estimated with an accuracy similar to existing studies using physical models for Europe and better than a comparable global statistical model. Performance of the model varies slightly between regions of Europe, but is consistent between different time periods, and remains the same in a split-sample validation. Though discharge prediction under climate change is not the main scope of this paper, the BN was applied to a large domain covering all sizes of rivers in the continent both for present and future climate, as an example. Results show substantial variation in the influence of climate change on river discharges. The model can be used to provide quick estimates of extreme discharges at any location for the purpose of obtaining input information for hydraulic modelling.

  19. Bayesian Mediation Analysis

    OpenAIRE

    Yuan, Ying; MacKinnon, David P.

    2009-01-01

    This article proposes Bayesian analysis of mediation effects. Compared to conventional frequentist mediation analysis, the Bayesian approach has several advantages. First, it allows researchers to incorporate prior information into the mediation analysis, thus potentially improving the efficiency of estimates. Second, under the Bayesian mediation analysis, inference is straightforward and exact, which makes it appealing for studies with small samples. Third, the Bayesian approach is conceptua...

  20. Statistics Anxiety and Business Statistics: The International Student

    Science.gov (United States)

    Bell, James A.

    2008-01-01

    Does the international student suffer from statistics anxiety? To investigate this, the Statistics Anxiety Rating Scale (STARS) was administered to sixty-six beginning statistics students, including twelve international students and fifty-four domestic students. Due to the small number of international students, nonparametric methods were used to…

  1. Variations on Bayesian Prediction and Inference

    Science.gov (United States)

    2016-05-09

    inference 2.2.1 Background There are a number of statistical inference problems that are not generally formulated via a full probability model...problem of inference about an unknown parameter, the Bayesian approach requires a full probability 1. REPORT DATE (DD-MM-YYYY) 4. TITLE AND...the problem of inference about an unknown parameter, the Bayesian approach requires a full probability model/likelihood which can be an obstacle

  2. A Bayesian Method for Weighted Sampling

    OpenAIRE

    Lo, Albert Y.

    1993-01-01

    Bayesian statistical inference for sampling from weighted distribution models is studied. Small-sample Bayesian bootstrap clone (BBC) approximations to the posterior distribution are discussed. A second-order property for the BBC in unweighted i.i.d. sampling is given. A consequence is that BBC approximations to a posterior distribution of the mean and to the sampling distribution of the sample average, can be made asymptotically accurate by a proper choice of the random variables that genera...

  3. CATDAT - A program for parametric and nonparametric categorical data analysis user's manual, Version 1.0

    International Nuclear Information System (INIS)

    Peterson, James R.; Haas, Timothy C.; Lee, Danny C.

    2000-01-01

    Natural resource professionals are increasingly required to develop rigorous statistical models that relate environmental data to categorical responses data. Recent advances in the statistical and computing sciences have led to the development of sophisticated methods for parametric and nonparametric analysis of data with categorical responses. The statistical software package CATDAT was designed to make some of these relatively new and powerful techniques available to scientists. The CATDAT statistical package includes 4 analytical techniques: generalized logit modeling; binary classification tree; extended K-nearest neighbor classification; and modular neural network

  4. Bayesian framework for prediction of future number of failures from a single group of units in the field

    International Nuclear Information System (INIS)

    Ebrahimi, Nader

    2009-01-01

    This paper considers prediction of unknown number of failures in a future inspection of a group of in-service units based on number of failures observed from an earlier inspection. We develop a flexible Bayesian model and calculate Bayesian estimator for this unknown number and other quantities of interest. The paper also includes an illustration of our method in an example about heat exchanger. A main advantage of our approach is in its nonparametric nature. By nonparametric here we simply mean that no assumption is required about the failure time distribution of a unit

  5. Nonparametric predictive inference for combining diagnostic tests with parametric copula

    Science.gov (United States)

    Muhammad, Noryanti; Coolen, F. P. A.; Coolen-Maturi, T.

    2017-09-01

    Measuring the accuracy of diagnostic tests is crucial in many application areas including medicine and health care. The Receiver Operating Characteristic (ROC) curve is a popular statistical tool for describing the performance of diagnostic tests. The area under the ROC curve (AUC) is often used as a measure of the overall performance of the diagnostic test. In this paper, we interest in developing strategies for combining test results in order to increase the diagnostic accuracy. We introduce nonparametric predictive inference (NPI) for combining two diagnostic test results with considering dependence structure using parametric copula. NPI is a frequentist statistical framework for inference on a future observation based on past data observations. NPI uses lower and upper probabilities to quantify uncertainty and is based on only a few modelling assumptions. While copula is a well-known statistical concept for modelling dependence of random variables. A copula is a joint distribution function whose marginals are all uniformly distributed and it can be used to model the dependence separately from the marginal distributions. In this research, we estimate the copula density using a parametric method which is maximum likelihood estimator (MLE). We investigate the performance of this proposed method via data sets from the literature and discuss results to show how our method performs for different family of copulas. Finally, we briefly outline related challenges and opportunities for future research.

  6. On the prior probabilities for two-stage Bayesian estimates

    International Nuclear Information System (INIS)

    Kohut, P.

    1992-01-01

    The method of Bayesian inference is reexamined for its applicability and for the required underlying assumptions in obtaining and using prior probability estimates. Two different approaches are suggested to determine the first-stage priors in the two-stage Bayesian analysis which avoid certain assumptions required for other techniques. In the first scheme, the prior is obtained through a true frequency based distribution generated at selected intervals utilizing actual sampling of the failure rate distributions. The population variability distribution is generated as the weighed average of the frequency distributions. The second method is based on a non-parametric Bayesian approach using the Maximum Entropy Principle. Specific features such as integral properties or selected parameters of prior distributions may be obtained with minimal assumptions. It is indicated how various quantiles may also be generated with a least square technique

  7. Exact nonparametric confidence bands for the survivor function.

    Science.gov (United States)

    Matthews, David

    2013-10-12

    A method to produce exact simultaneous confidence bands for the empirical cumulative distribution function that was first described by Owen, and subsequently corrected by Jager and Wellner, is the starting point for deriving exact nonparametric confidence bands for the survivor function of any positive random variable. We invert a nonparametric likelihood test of uniformity, constructed from the Kaplan-Meier estimator of the survivor function, to obtain simultaneous lower and upper bands for the function of interest with specified global confidence level. The method involves calculating a null distribution and associated critical value for each observed sample configuration. However, Noe recursions and the Van Wijngaarden-Decker-Brent root-finding algorithm provide the necessary tools for efficient computation of these exact bounds. Various aspects of the effect of right censoring on these exact bands are investigated, using as illustrations two observational studies of survival experience among non-Hodgkin's lymphoma patients and a much larger group of subjects with advanced lung cancer enrolled in trials within the North Central Cancer Treatment Group. Monte Carlo simulations confirm the merits of the proposed method of deriving simultaneous interval estimates of the survivor function across the entire range of the observed sample. This research was supported by the Natural Sciences and Engineering Research Council (NSERC) of Canada. It was begun while the author was visiting the Department of Statistics, University of Auckland, and completed during a subsequent sojourn at the Medical Research Council Biostatistics Unit in Cambridge. The support of both institutions, in addition to that of NSERC and the University of Waterloo, is greatly appreciated.

  8. Smooth semi-nonparametric (SNP) estimation of the cumulative incidence function.

    Science.gov (United States)

    Duc, Anh Nguyen; Wolbers, Marcel

    2017-08-15

    This paper presents a novel approach to estimation of the cumulative incidence function in the presence of competing risks. The underlying statistical model is specified via a mixture factorization of the joint distribution of the event type and the time to the event. The time to event distributions conditional on the event type are modeled using smooth semi-nonparametric densities. One strength of this approach is that it can handle arbitrary censoring and truncation while relying on mild parametric assumptions. A stepwise forward algorithm for model estimation and adaptive selection of smooth semi-nonparametric polynomial degrees is presented, implemented in the statistical software R, evaluated in a sequence of simulation studies, and applied to data from a clinical trial in cryptococcal meningitis. The simulations demonstrate that the proposed method frequently outperforms both parametric and nonparametric alternatives. They also support the use of 'ad hoc' asymptotic inference to derive confidence intervals. An extension to regression modeling is also presented, and its potential and challenges are discussed. © 2017 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd. © 2017 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.

  9. On Parametric (and Non-Parametric Variation

    Directory of Open Access Journals (Sweden)

    Neil Smith

    2009-11-01

    Full Text Available This article raises the issue of the correct characterization of ‘Parametric Variation’ in syntax and phonology. After specifying their theoretical commitments, the authors outline the relevant parts of the Principles–and–Parameters framework, and draw a three-way distinction among Universal Principles, Parameters, and Accidents. The core of the contribution then consists of an attempt to provide identity criteria for parametric, as opposed to non-parametric, variation. Parametric choices must be antecedently known, and it is suggested that they must also satisfy seven individually necessary and jointly sufficient criteria. These are that they be cognitively represented, systematic, dependent on the input, deterministic, discrete, mutually exclusive, and irreversible.

  10. Nonparametric estimation of location and scale parameters

    KAUST Repository

    Potgieter, C.J.

    2012-12-01

    Two random variables X and Y belong to the same location-scale family if there are constants μ and σ such that Y and μ+σX have the same distribution. In this paper we consider non-parametric estimation of the parameters μ and σ under minimal assumptions regarding the form of the distribution functions of X and Y. We discuss an approach to the estimation problem that is based on asymptotic likelihood considerations. Our results enable us to provide a methodology that can be implemented easily and which yields estimators that are often near optimal when compared to fully parametric methods. We evaluate the performance of the estimators in a series of Monte Carlo simulations. © 2012 Elsevier B.V. All rights reserved.

  11. Discriminative Bayesian Dictionary Learning for Classification.

    Science.gov (United States)

    Akhtar, Naveed; Shafait, Faisal; Mian, Ajmal

    2016-12-01

    We propose a Bayesian approach to learn discriminative dictionaries for sparse representation of data. The proposed approach infers probability distributions over the atoms of a discriminative dictionary using a finite approximation of Beta Process. It also computes sets of Bernoulli distributions that associate class labels to the learned dictionary atoms. This association signifies the selection probabilities of the dictionary atoms in the expansion of class-specific data. Furthermore, the non-parametric character of the proposed approach allows it to infer the correct size of the dictionary. We exploit the aforementioned Bernoulli distributions in separately learning a linear classifier. The classifier uses the same hierarchical Bayesian model as the dictionary, which we present along the analytical inference solution for Gibbs sampling. For classification, a test instance is first sparsely encoded over the learned dictionary and the codes are fed to the classifier. We performed experiments for face and action recognition; and object and scene-category classification using five public datasets and compared the results with state-of-the-art discriminative sparse representation approaches. Experiments show that the proposed Bayesian approach consistently outperforms the existing approaches.

  12. On Wasserstein Two-Sample Testing and Related Families of Nonparametric Tests

    Directory of Open Access Journals (Sweden)

    Aaditya Ramdas

    2017-01-01

    Full Text Available Nonparametric two-sample or homogeneity testing is a decision theoretic problem that involves identifying differences between two random variables without making parametric assumptions about their underlying distributions. The literature is old and rich, with a wide variety of statistics having being designed and analyzed, both for the unidimensional and the multivariate setting. Inthisshortsurvey,wefocusonteststatisticsthatinvolvetheWassersteindistance. Usingan entropic smoothing of the Wasserstein distance, we connect these to very different tests including multivariate methods involving energy statistics and kernel based maximum mean discrepancy and univariate methods like the Kolmogorov–Smirnov test, probability or quantile (PP/QQ plots and receiver operating characteristic or ordinal dominance (ROC/ODC curves. Some observations are implicit in the literature, while others seem to have not been noticed thus far. Given nonparametric two-sample testing’s classical and continued importance, we aim to provide useful connections for theorists and practitioners familiar with one subset of methods but not others.

  13. Bayesian emulation for optimization in multi-step portfolio decisions

    OpenAIRE

    Irie, Kaoru; West, Mike

    2016-01-01

    We discuss the Bayesian emulation approach to computational solution of multi-step portfolio studies in financial time series. "Bayesian emulation for decisions" involves mapping the technical structure of a decision analysis problem to that of Bayesian inference in a purely synthetic "emulating" statistical model. This provides access to standard posterior analytic, simulation and optimization methods that yield indirect solutions of the decision problem. We develop this in time series portf...

  14. Bayesian estimation of dose rate effectiveness

    International Nuclear Information System (INIS)

    Arnish, J.J.; Groer, P.G.

    2000-01-01

    A Bayesian statistical method was used to quantify the effectiveness of high dose rate 137 Cs gamma radiation at inducing fatal mammary tumours and increasing the overall mortality rate in BALB/c female mice. The Bayesian approach considers both the temporal and dose dependence of radiation carcinogenesis and total mortality. This paper provides the first direct estimation of dose rate effectiveness using Bayesian statistics. This statistical approach provides a quantitative description of the uncertainty of the factor characterising the dose rate in terms of a probability density function. The results show that a fixed dose from 137 Cs gamma radiation delivered at a high dose rate is more effective at inducing fatal mammary tumours and increasing the overall mortality rate in BALB/c female mice than the same dose delivered at a low dose rate. (author)

  15. Correct Bayesian and frequentist intervals are similar

    International Nuclear Information System (INIS)

    Atwood, C.L.

    1986-01-01

    This paper argues that Bayesians and frequentists will normally reach numerically similar conclusions, when dealing with vague data or sparse data. It is shown that both statistical methodologies can deal reasonably with vague data. With sparse data, in many important practical cases Bayesian interval estimates and frequentist confidence intervals are approximately equal, although with discrete data the frequentist intervals are somewhat longer. This is not to say that the two methodologies are equally easy to use: The construction of a frequentist confidence interval may require new theoretical development. Bayesians methods typically require numerical integration, perhaps over many variables. Also, Bayesian can easily fall into the trap of over-optimism about their amount of prior knowledge. But in cases where both intervals are found correctly, the two intervals are usually not very different. (orig.)

  16. A NONPARAMETRIC HYPOTHESIS TEST VIA THE BOOTSTRAP RESAMPLING

    OpenAIRE

    Temel, Tugrul T.

    2001-01-01

    This paper adapts an already existing nonparametric hypothesis test to the bootstrap framework. The test utilizes the nonparametric kernel regression method to estimate a measure of distance between the models stated under the null hypothesis. The bootstraped version of the test allows to approximate errors involved in the asymptotic hypothesis test. The paper also develops a Mathematica Code for the test algorithm.

  17. Simple nonparametric checks for model data fit in CAT

    NARCIS (Netherlands)

    Meijer, R.R.

    2005-01-01

    In this paper, the usefulness of several nonparametric checks is discussed in a computerized adaptive testing (CAT) context. Although there is no tradition of nonparametric scalability in CAT, it can be argued that scalability checks can be useful to investigate, for example, the quality of item

  18. Bayesian inference on proportional elections.

    Directory of Open Access Journals (Sweden)

    Gabriel Hideki Vatanabe Brunello

    Full Text Available Polls for majoritarian voting systems usually show estimates of the percentage of votes for each candidate. However, proportional vote systems do not necessarily guarantee the candidate with the most percentage of votes will be elected. Thus, traditional methods used in majoritarian elections cannot be applied on proportional elections. In this context, the purpose of this paper was to perform a Bayesian inference on proportional elections considering the Brazilian system of seats distribution. More specifically, a methodology to answer the probability that a given party will have representation on the chamber of deputies was developed. Inferences were made on a Bayesian scenario using the Monte Carlo simulation technique, and the developed methodology was applied on data from the Brazilian elections for Members of the Legislative Assembly and Federal Chamber of Deputies in 2010. A performance rate was also presented to evaluate the efficiency of the methodology. Calculations and simulations were carried out using the free R statistical software.

  19. Uncertainty Estimation of Shear-wave Velocity Structure from Bayesian Inversion of Microtremor Array Dispersion Data

    Science.gov (United States)

    Dosso, S. E.; Molnar, S.; Cassidy, J.

    2010-12-01

    Bayesian inversion of microtremor array dispersion data is applied, with evaluation of data errors and model parameterization, to produce the most-probable shear-wave velocity (VS) profile together with quantitative uncertainty estimates. Generally, the most important property characterizing earthquake site response is the subsurface VS structure. The microtremor array method determines phase velocity dispersion of Rayleigh surface waves from multi-instrument recordings of urban noise. Inversion of dispersion curves for VS structure is a non-unique and nonlinear problem such that meaningful evaluation of confidence intervals is required. Quantitative uncertainty estimation requires not only a nonlinear inversion approach that samples models proportional to their probability, but also rigorous estimation of the data error statistics and an appropriate model parameterization. A Bayesian formulation represents the solution of the inverse problem in terms of the posterior probability density (PPD) of the geophysical model parameters. Markov-chain Monte Carlo methods are used with an efficient implementation of Metropolis-Hastings sampling to provide an unbiased sample from the PPD to compute parameter uncertainties and inter-relationships. Nonparametric estimation of a data error covariance matrix from residual analysis is applied with rigorous a posteriori statistical tests to validate the covariance estimate and the assumption of a Gaussian error distribution. The most appropriate model parameterization is determined using the Bayesian information criterion (BIC), which provides the simplest model consistent with the resolving power of the data. Parameter uncertainties are found to be under-estimated when data error correlations are neglected and when compressional-wave velocity and/or density (nuisance) parameters are fixed in the inversion. Bayesian inversion of microtremor array data is applied at two sites in British Columbia, the area of highest seismic risk in

  20. On the use of permutation in and the performance of a class of nonparametric methods to detect differential gene expression.

    Science.gov (United States)

    Pan, Wei

    2003-07-22

    Recently a class of nonparametric statistical methods, including the empirical Bayes (EB) method, the significance analysis of microarray (SAM) method and the mixture model method (MMM), have been proposed to detect differential gene expression for replicated microarray experiments conducted under two conditions. All the methods depend on constructing a test statistic Z and a so-called null statistic z. The null statistic z is used to provide some reference distribution for Z such that statistical inference can be accomplished. A common way of constructing z is to apply Z to randomly permuted data. Here we point our that the distribution of z may not approximate the null distribution of Z well, leading to possibly too conservative inference. This observation may apply to other permutation-based nonparametric methods. We propose a new method of constructing a null statistic that aims to estimate the null distribution of a test statistic directly. Using simulated data and real data, we assess and compare the performance of the existing method and our new method when applied in EB, SAM and MMM. Some interesting findings on operating characteristics of EB, SAM and MMM are also reported. Finally, by combining the idea of SAM and MMM, we outline a simple nonparametric method based on the direct use of a test statistic and a null statistic.

  1. Using Alien Coins to Test Whether Simple Inference Is Bayesian

    Science.gov (United States)

    Cassey, Peter; Hawkins, Guy E.; Donkin, Chris; Brown, Scott D.

    2016-01-01

    Reasoning and inference are well-studied aspects of basic cognition that have been explained as statistically optimal Bayesian inference. Using a simplified experimental design, we conducted quantitative comparisons between Bayesian inference and human inference at the level of individuals. In 3 experiments, with more than 13,000 participants, we…

  2. Nonparametric bayesian reward segmentation for skill discovery using inverse reinforcement learning

    CSIR Research Space (South Africa)

    Ranchod, P

    2015-10-01

    Full Text Available We present a method for segmenting a set of unstructured demonstration trajectories to discover reusable skills using inverse reinforcement learning (IRL). Each skill is characterised by a latent reward function which the demonstrator is assumed...

  3. Bayesian Nonparametric Regression Analysis of Data with Random Effects Covariates from Longitudinal Measurements

    KAUST Repository

    Ryu, Duchwan; Li, Erning; Mallick, Bani K.

    2010-01-01

    " approach and the regression calibration, via simulations and by an application that investigates the relationship between obesity in adulthood and childhood growth curves. © 2010, The International Biometric Society.

  4. Doing bayesian data analysis a tutorial with R and BUGS

    CERN Document Server

    Kruschke, John K

    2011-01-01

    There is an explosion of interest in Bayesian statistics, primarily because recently created computational methods have finally made Bayesian analysis obtainable to a wide audience. Doing Bayesian Data Analysis, A Tutorial Introduction with R and BUGS provides an accessible approach to Bayesian data analysis, as material is explained clearly with concrete examples. The book begins with the basics, including essential concepts of probability and random sampling, and gradually progresses to advanced hierarchical modeling methods for realistic data. The text delivers comprehensive coverage of all

  5. Nonparametric methods in actigraphy: An update

    Directory of Open Access Journals (Sweden)

    Bruno S.B. Gonçalves

    2014-09-01

    Full Text Available Circadian rhythmicity in humans has been well studied using actigraphy, a method of measuring gross motor movement. As actigraphic technology continues to evolve, it is important for data analysis to keep pace with new variables and features. Our objective is to study the behavior of two variables, interdaily stability and intradaily variability, to describe rest activity rhythm. Simulated data and actigraphy data of humans, rats, and marmosets were used in this study. We modified the method of calculation for IV and IS by modifying the time intervals of analysis. For each variable, we calculated the average value (IVm and ISm results for each time interval. Simulated data showed that (1 synchronization analysis depends on sample size, and (2 fragmentation is independent of the amplitude of the generated noise. We were able to obtain a significant difference in the fragmentation patterns of stroke patients using an IVm variable, while the variable IV60 was not identified. Rhythmic synchronization of activity and rest was significantly higher in young than adults with Parkinson׳s when using the ISM variable; however, this difference was not seen using IS60. We propose an updated format to calculate rhythmic fragmentation, including two additional optional variables. These alternative methods of nonparametric analysis aim to more precisely detect sleep–wake cycle fragmentation and synchronization.

  6. An introduction to using Bayesian linear regression with clinical data.

    Science.gov (United States)

    Baldwin, Scott A; Larson, Michael J

    2017-11-01

    Statistical training psychology focuses on frequentist methods. Bayesian methods are an alternative to standard frequentist methods. This article provides researchers with an introduction to fundamental ideas in Bayesian modeling. We use data from an electroencephalogram (EEG) and anxiety study to illustrate Bayesian models. Specifically, the models examine the relationship between error-related negativity (ERN), a particular event-related potential, and trait anxiety. Methodological topics covered include: how to set up a regression model in a Bayesian framework, specifying priors, examining convergence of the model, visualizing and interpreting posterior distributions, interval estimates, expected and predicted values, and model comparison tools. We also discuss situations where Bayesian methods can outperform frequentist methods as well has how to specify more complicated regression models. Finally, we conclude with recommendations about reporting guidelines for those using Bayesian methods in their own research. We provide data and R code for replicating our analyses. Copyright © 2017 Elsevier Ltd. All rights reserved.

  7. Bayesian Plackett-Luce Mixture Models for Partially Ranked Data.

    Science.gov (United States)

    Mollica, Cristina; Tardella, Luca

    2017-06-01

    The elicitation of an ordinal judgment on multiple alternatives is often required in many psychological and behavioral experiments to investigate preference/choice orientation of a specific population. The Plackett-Luce model is one of the most popular and frequently applied parametric distributions to analyze rankings of a finite set of items. The present work introduces a Bayesian finite mixture of Plackett-Luce models to account for unobserved sample heterogeneity of partially ranked data. We describe an efficient way to incorporate the latent group structure in the data augmentation approach and the derivation of existing maximum likelihood procedures as special instances of the proposed Bayesian method. Inference can be conducted with the combination of the Expectation-Maximization algorithm for maximum a posteriori estimation and the Gibbs sampling iterative procedure. We additionally investigate several Bayesian criteria for selecting the optimal mixture configuration and describe diagnostic tools for assessing the fitness of ranking distributions conditionally and unconditionally on the number of ranked items. The utility of the novel Bayesian parametric Plackett-Luce mixture for characterizing sample heterogeneity is illustrated with several applications to simulated and real preference ranked data. We compare our method with the frequentist approach and a Bayesian nonparametric mixture model both assuming the Plackett-Luce model as a mixture component. Our analysis on real datasets reveals the importance of an accurate diagnostic check for an appropriate in-depth understanding of the heterogenous nature of the partial ranking data.

  8. Bayesian Mediation Analysis

    Science.gov (United States)

    Yuan, Ying; MacKinnon, David P.

    2009-01-01

    In this article, we propose Bayesian analysis of mediation effects. Compared with conventional frequentist mediation analysis, the Bayesian approach has several advantages. First, it allows researchers to incorporate prior information into the mediation analysis, thus potentially improving the efficiency of estimates. Second, under the Bayesian…

  9. Probability Machines: Consistent Probability Estimation Using Nonparametric Learning Machines

    Science.gov (United States)

    Malley, J. D.; Kruppa, J.; Dasgupta, A.; Malley, K. G.; Ziegler, A.

    2011-01-01

    Summary Background Most machine learning approaches only provide a classification for binary responses. However, probabilities are required for risk estimation using individual patient characteristics. It has been shown recently that every statistical learning machine known to be consistent for a nonparametric regression problem is a probability machine that is provably consistent for this estimation problem. Objectives The aim of this paper is to show how random forests and nearest neighbors can be used for consistent estimation of individual probabilities. Methods Two random forest algorithms and two nearest neighbor algorithms are described in detail for estimation of individual probabilities. We discuss the consistency of random forests, nearest neighbors and other learning machines in detail. We conduct a simulation study to illustrate the validity of the methods. We exemplify the algorithms by analyzing two well-known data sets on the diagnosis of appendicitis and the diagnosis of diabetes in Pima Indians. Results Simulations demonstrate the validity of the method. With the real data application, we show the accuracy and practicality of this approach. We provide sample code from R packages in which the probability estimation is already available. This means that all calculations can be performed using existing software. Conclusions Random forest algorithms as well as nearest neighbor approaches are valid machine learning methods for estimating individual probabilities for binary responses. Freely available implementations are available in R and may be used for applications. PMID:21915433

  10. Nonparametric adaptive age replacement with a one-cycle criterion

    International Nuclear Information System (INIS)

    Coolen-Schrijner, P.; Coolen, F.P.A.

    2007-01-01

    Age replacement of technical units has received much attention in the reliability literature over the last four decades. Mostly, the failure time distribution for the units is assumed to be known, and minimal costs per unit of time is used as optimality criterion, where renewal reward theory simplifies the mathematics involved but requires the assumption that the same process and replacement strategy continues over a very large ('infinite') period of time. Recently, there has been increasing attention to adaptive strategies for age replacement, taking into account the information from the process. Although renewal reward theory can still be used to provide an intuitively and mathematically attractive optimality criterion, it is more logical to use minimal costs per unit of time over a single cycle as optimality criterion for adaptive age replacement. In this paper, we first show that in the classical age replacement setting, with known failure time distribution with increasing hazard rate, the one-cycle criterion leads to earlier replacement than the renewal reward criterion. Thereafter, we present adaptive age replacement with a one-cycle criterion within the nonparametric predictive inferential framework. We study the performance of this approach via simulations, which are also used for comparisons with the use of the renewal reward criterion within the same statistical framework

  11. Advances in statistics

    Science.gov (United States)

    Howard Stauffer; Nadav Nur

    2005-01-01

    The papers included in the Advances in Statistics section of the Partners in Flight (PIF) 2002 Proceedings represent a small sample of statistical topics of current importance to Partners In Flight research scientists: hierarchical modeling, estimation of detection probabilities, and Bayesian applications. Sauer et al. (this volume) examines a hierarchical model...

  12. Bayesian Probability Theory

    Science.gov (United States)

    von der Linden, Wolfgang; Dose, Volker; von Toussaint, Udo

    2014-06-01

    Preface; Part I. Introduction: 1. The meaning of probability; 2. Basic definitions; 3. Bayesian inference; 4. Combinatrics; 5. Random walks; 6. Limit theorems; 7. Continuous distributions; 8. The central limit theorem; 9. Poisson processes and waiting times; Part II. Assigning Probabilities: 10. Transformation invariance; 11. Maximum entropy; 12. Qualified maximum entropy; 13. Global smoothness; Part III. Parameter Estimation: 14. Bayesian parameter estimation; 15. Frequentist parameter estimation; 16. The Cramer-Rao inequality; Part IV. Testing Hypotheses: 17. The Bayesian way; 18. The frequentist way; 19. Sampling distributions; 20. Bayesian vs frequentist hypothesis tests; Part V. Real World Applications: 21. Regression; 22. Inconsistent data; 23. Unrecognized signal contributions; 24. Change point problems; 25. Function estimation; 26. Integral equations; 27. Model selection; 28. Bayesian experimental design; Part VI. Probabilistic Numerical Techniques: 29. Numerical integration; 30. Monte Carlo methods; 31. Nested sampling; Appendixes; References; Index.

  13. A systematic review of Bayesian articles in psychology : The last 25 years

    OpenAIRE

    van de Schoot, Rens; Winter, Sonja; Ryan, Oisín; Zondervan - Zwijnenburg, Mariëlle; Depaoli, Sarah

    2017-01-01

    Although the statistical tools most often used by researchers in the field of psychology over the last 25 years are based on frequentist statistics, it is often claimed that the alternative Bayesian approach to statistics is gaining in popularity. In the current article, we investigated this claim by performing the very first systematic review of Bayesian psychological articles published between 1990 and 2015 (n = 1,579). We aim to provide a thorough presentation of the role Bayesian statisti...

  14. Weak Disposability in Nonparametric Production Analysis with Undesirable Outputs

    NARCIS (Netherlands)

    Kuosmanen, T.K.

    2005-01-01

    Environmental Economics and Natural Resources Group at Wageningen University in The Netherlands Weak disposability of outputs means that firms can abate harmful emissions by decreasing the activity level. Modeling weak disposability in nonparametric production analysis has caused some confusion.

  15. Multi-sample nonparametric treatments comparison in medical ...

    African Journals Online (AJOL)

    Multi-sample nonparametric treatments comparison in medical follow-up study with unequal observation processes through simulation and bladder tumour case study. P. L. Tan, N.A. Ibrahim, M.B. Adam, J. Arasan ...

  16. A nonparametric mixture model for cure rate estimation.

    Science.gov (United States)

    Peng, Y; Dear, K B

    2000-03-01

    Nonparametric methods have attracted less attention than their parametric counterparts for cure rate analysis. In this paper, we study a general nonparametric mixture model. The proportional hazards assumption is employed in modeling the effect of covariates on the failure time of patients who are not cured. The EM algorithm, the marginal likelihood approach, and multiple imputations are employed to estimate parameters of interest in the model. This model extends models and improves estimation methods proposed by other researchers. It also extends Cox's proportional hazards regression model by allowing a proportion of event-free patients and investigating covariate effects on that proportion. The model and its estimation method are investigated by simulations. An application to breast cancer data, including comparisons with previous analyses using a parametric model and an existing nonparametric model by other researchers, confirms the conclusions from the parametric model but not those from the existing nonparametric model.

  17. A Bayesian Approach to Interactive Retrieval

    Science.gov (United States)

    Tague, Jean M.

    1973-01-01

    A probabilistic model for interactive retrieval is presented. Bayesian statistical decision theory principles are applied: use of prior and sample information about the relationship of document descriptions to query relevance; maximization of expected value of a utility function, to the problem of optimally restructuring search strategies in an…

  18. Statistical inference an integrated Bayesianlikelihood approach

    CERN Document Server

    Aitkin, Murray

    2010-01-01

    Filling a gap in current Bayesian theory, Statistical Inference: An Integrated Bayesian/Likelihood Approach presents a unified Bayesian treatment of parameter inference and model comparisons that can be used with simple diffuse prior specifications. This novel approach provides new solutions to difficult model comparison problems and offers direct Bayesian counterparts of frequentist t-tests and other standard statistical methods for hypothesis testing.After an overview of the competing theories of statistical inference, the book introduces the Bayes/likelihood approach used throughout. It pre

  19. THE GROWTH POINTS OF STATISTICAL METHODS

    OpenAIRE

    Orlov A. I.

    2014-01-01

    On the basis of a new paradigm of applied mathematical statistics, data analysis and economic-mathematical methods are identified; we have also discussed five topical areas in which modern applied statistics is developing as well as the other statistical methods, i.e. five "growth points" – nonparametric statistics, robustness, computer-statistical methods, statistics of interval data, statistics of non-numeric data

  20. An elementary introduction to Bayesian computing using WinBUGS.

    Science.gov (United States)

    Fryback, D G; Stout, N K; Rosenberg, M A

    2001-01-01

    Bayesian statistics provides effective techniques for analyzing data and translating the results to inform decision making. This paper provides an elementary tutorial overview of the WinBUGS software for performing Bayesian statistical analysis. Background information on the computational methods used by the software is provided. Two examples drawn from the field of medical decision making are presented to illustrate the features and functionality of the software.

  1. Machine learning a Bayesian and optimization perspective

    CERN Document Server

    Theodoridis, Sergios

    2015-01-01

    This tutorial text gives a unifying perspective on machine learning by covering both probabilistic and deterministic approaches, which rely on optimization techniques, as well as Bayesian inference, which is based on a hierarchy of probabilistic models. The book presents the major machine learning methods as they have been developed in different disciplines, such as statistics, statistical and adaptive signal processing and computer science. Focusing on the physical reasoning behind the mathematics, all the various methods and techniques are explained in depth, supported by examples and problems, giving an invaluable resource to the student and researcher for understanding and applying machine learning concepts. The book builds carefully from the basic classical methods to the most recent trends, with chapters written to be as self-contained as possible, making the text suitable for different courses: pattern recognition, statistical/adaptive signal processing, statistical/Bayesian learning, as well as shor...

  2. All of statistics a concise course in statistical inference

    CERN Document Server

    Wasserman, Larry

    2004-01-01

    This book is for people who want to learn probability and statistics quickly It brings together many of the main ideas in modern statistics in one place The book is suitable for students and researchers in statistics, computer science, data mining and machine learning This book covers a much wider range of topics than a typical introductory text on mathematical statistics It includes modern topics like nonparametric curve estimation, bootstrapping and classification, topics that are usually relegated to follow-up courses The reader is assumed to know calculus and a little linear algebra No previous knowledge of probability and statistics is required The text can be used at the advanced undergraduate and graduate level Larry Wasserman is Professor of Statistics at Carnegie Mellon University He is also a member of the Center for Automated Learning and Discovery in the School of Computer Science His research areas include nonparametric inference, asymptotic theory, causality, and applications to astrophysics, bi...

  3. Bayesian parameter estimation in probabilistic risk assessment

    International Nuclear Information System (INIS)

    Siu, Nathan O.; Kelly, Dana L.

    1998-01-01

    Bayesian statistical methods are widely used in probabilistic risk assessment (PRA) because of their ability to provide useful estimates of model parameters when data are sparse and because the subjective probability framework, from which these methods are derived, is a natural framework to address the decision problems motivating PRA. This paper presents a tutorial on Bayesian parameter estimation especially relevant to PRA. It summarizes the philosophy behind these methods, approaches for constructing likelihood functions and prior distributions, some simple but realistic examples, and a variety of cautions and lessons regarding practical applications. References are also provided for more in-depth coverage of various topics

  4. On the Choice of Difference Sequence in a Unified Framework for Variance Estimation in Nonparametric Regression

    KAUST Repository

    Dai, Wenlin; Tong, Tiejun; Zhu, Lixing

    2017-01-01

    Difference-based methods do not require estimating the mean function in nonparametric regression and are therefore popular in practice. In this paper, we propose a unified framework for variance estimation that combines the linear regression method with the higher-order difference estimators systematically. The unified framework has greatly enriched the existing literature on variance estimation that includes most existing estimators as special cases. More importantly, the unified framework has also provided a smart way to solve the challenging difference sequence selection problem that remains a long-standing controversial issue in nonparametric regression for several decades. Using both theory and simulations, we recommend to use the ordinary difference sequence in the unified framework, no matter if the sample size is small or if the signal-to-noise ratio is large. Finally, to cater for the demands of the application, we have developed a unified R package, named VarED, that integrates the existing difference-based estimators and the unified estimators in nonparametric regression and have made it freely available in the R statistical program http://cran.r-project.org/web/packages/.

  5. On the Choice of Difference Sequence in a Unified Framework for Variance Estimation in Nonparametric Regression

    KAUST Repository

    Dai, Wenlin

    2017-09-01

    Difference-based methods do not require estimating the mean function in nonparametric regression and are therefore popular in practice. In this paper, we propose a unified framework for variance estimation that combines the linear regression method with the higher-order difference estimators systematically. The unified framework has greatly enriched the existing literature on variance estimation that includes most existing estimators as special cases. More importantly, the unified framework has also provided a smart way to solve the challenging difference sequence selection problem that remains a long-standing controversial issue in nonparametric regression for several decades. Using both theory and simulations, we recommend to use the ordinary difference sequence in the unified framework, no matter if the sample size is small or if the signal-to-noise ratio is large. Finally, to cater for the demands of the application, we have developed a unified R package, named VarED, that integrates the existing difference-based estimators and the unified estimators in nonparametric regression and have made it freely available in the R statistical program http://cran.r-project.org/web/packages/.

  6. Bayesian data analysis for newcomers.

    Science.gov (United States)

    Kruschke, John K; Liddell, Torrin M

    2018-02-01

    This article explains the foundational concepts of Bayesian data analysis using virtually no mathematical notation. Bayesian ideas already match your intuitions from everyday reasoning and from traditional data analysis. Simple examples of Bayesian data analysis are presented that illustrate how the information delivered by a Bayesian analysis can be directly interpreted. Bayesian approaches to null-value assessment are discussed. The article clarifies misconceptions about Bayesian methods that newcomers might have acquired elsewhere. We discuss prior distributions and explain how they are not a liability but an important asset. We discuss the relation of Bayesian data analysis to Bayesian models of mind, and we briefly discuss what methodological problems Bayesian data analysis is not meant to solve. After you have read this article, you should have a clear sense of how Bayesian data analysis works and the sort of information it delivers, and why that information is so intuitive and useful for drawing conclusions from data.

  7. Inverse problems in the Bayesian framework

    International Nuclear Information System (INIS)

    Calvetti, Daniela; Somersalo, Erkki; Kaipio, Jari P

    2014-01-01

    The history of Bayesian methods dates back to the original works of Reverend Thomas Bayes and Pierre-Simon Laplace: the former laid down some of the basic principles on inverse probability in his classic article ‘An essay towards solving a problem in the doctrine of chances’ that was read posthumously in the Royal Society in 1763. Laplace, on the other hand, in his ‘Memoirs on inverse probability’ of 1774 developed the idea of updating beliefs and wrote down the celebrated Bayes’ formula in the form we know today. Although not identified yet as a framework for investigating inverse problems, Laplace used the formalism very much in the spirit it is used today in the context of inverse problems, e.g., in his study of the distribution of comets. With the evolution of computational tools, Bayesian methods have become increasingly popular in all fields of human knowledge in which conclusions need to be drawn based on incomplete and noisy data. Needless to say, inverse problems, almost by definition, fall into this category. Systematic work for developing a Bayesian inverse problem framework can arguably be traced back to the 1980s, (the original first edition being published by Elsevier in 1987), although articles on Bayesian methodology applied to inverse problems, in particular in geophysics, had appeared much earlier. Today, as testified by the articles in this special issue, the Bayesian methodology as a framework for considering inverse problems has gained a lot of popularity, and it has integrated very successfully with many traditional inverse problems ideas and techniques, providing novel ways to interpret and implement traditional procedures in numerical analysis, computational statistics, signal analysis and data assimilation. The range of applications where the Bayesian framework has been fundamental goes from geophysics, engineering and imaging to astronomy, life sciences and economy, and continues to grow. There is no question that Bayesian

  8. BAYESIAN ESTIMATION OF THERMONUCLEAR REACTION RATES

    Energy Technology Data Exchange (ETDEWEB)

    Iliadis, C.; Anderson, K. S. [Department of Physics and Astronomy, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599-3255 (United States); Coc, A. [Centre de Sciences Nucléaires et de Sciences de la Matière (CSNSM), CNRS/IN2P3, Univ. Paris-Sud, Université Paris–Saclay, Bâtiment 104, F-91405 Orsay Campus (France); Timmes, F. X.; Starrfield, S., E-mail: iliadis@unc.edu [School of Earth and Space Exploration, Arizona State University, Tempe, AZ 85287-1504 (United States)

    2016-11-01

    The problem of estimating non-resonant astrophysical S -factors and thermonuclear reaction rates, based on measured nuclear cross sections, is of major interest for nuclear energy generation, neutrino physics, and element synthesis. Many different methods have been applied to this problem in the past, almost all of them based on traditional statistics. Bayesian methods, on the other hand, are now in widespread use in the physical sciences. In astronomy, for example, Bayesian statistics is applied to the observation of extrasolar planets, gravitational waves, and Type Ia supernovae. However, nuclear physics, in particular, has been slow to adopt Bayesian methods. We present astrophysical S -factors and reaction rates based on Bayesian statistics. We develop a framework that incorporates robust parameter estimation, systematic effects, and non-Gaussian uncertainties in a consistent manner. The method is applied to the reactions d(p, γ ){sup 3}He, {sup 3}He({sup 3}He,2p){sup 4}He, and {sup 3}He( α , γ ){sup 7}Be, important for deuterium burning, solar neutrinos, and Big Bang nucleosynthesis.

  9. Noncausal Bayesian Vector Autoregression

    DEFF Research Database (Denmark)

    Lanne, Markku; Luoto, Jani

    We propose a Bayesian inferential procedure for the noncausal vector autoregressive (VAR) model that is capable of capturing nonlinearities and incorporating effects of missing variables. In particular, we devise a fast and reliable posterior simulator that yields the predictive distribution...

  10. Bayesian psychometric scaling

    NARCIS (Netherlands)

    Fox, Gerardus J.A.; van den Berg, Stéphanie Martine; Veldkamp, Bernard P.; Irwing, P.; Booth, T.; Hughes, D.

    2015-01-01

    In educational and psychological studies, psychometric methods are involved in the measurement of constructs, and in constructing and validating measurement instruments. Assessment results are typically used to measure student proficiency levels and test characteristics. Recently, Bayesian item

  11. Modelación de episodios críticos de contaminación por material particulado (PM10 en Santiago de Chile: Comparación de la eficiencia predictiva de los modelos paramétricos y no paramétricos Modeling critical episodes of air pollution by PM10 in Santiago, Chile: Comparison of the predictive efficiency of parametric and non-parametric statistical models

    Directory of Open Access Journals (Sweden)

    Sergio A. Alvarado

    2010-12-01

    Full Text Available Objetivo: Evaluar la eficiencia predictiva de modelos estadísticos paramétricos y no paramétricos para predecir episodios críticos de contaminación por material particulado PM10 del día siguiente, que superen en Santiago de Chile la norma de calidad diaria. Una predicción adecuada de tales episodios permite a la autoridad decretar medidas restrictivas que aminoren la gravedad del episodio, y consecuentemente proteger la salud de la comunidad. Método: Se trabajó con las concentraciones de material particulado PM10 registradas en una estación asociada a la red de monitorización de la calidad del aire MACAM-2, considerando 152 observaciones diarias de 14 variables, y con información meteorológica registrada durante los años 2001 a 2004. Se ajustaron modelos estadísticos paramétricos Gamma usando el paquete estadístico STATA v11, y no paramétricos usando una demo del software estadístico MARS v 2.0 distribuida por Salford-Systems. Resultados: Ambos métodos de modelación presentan una alta correlación entre los valores observados y los predichos. Los modelos Gamma presentan mejores aciertos que MARS para las concentraciones de PM10 con valores Objective: To evaluate the predictive efficiency of two statistical models (one parametric and the other non-parametric to predict critical episodes of air pollution exceeding daily air quality standards in Santiago, Chile by using the next day PM10 maximum 24h value. Accurate prediction of such episodes would allow restrictive measures to be applied by health authorities to reduce their seriousness and protect the community´s health. Methods: We used the PM10 concentrations registered by a station of the Air Quality Monitoring Network (152 daily observations of 14 variables and meteorological information gathered from 2001 to 2004. To construct predictive models, we fitted a parametric Gamma model using STATA v11 software and a non-parametric MARS model by using a demo version of Salford

  12. A Bayesian encourages dropout

    OpenAIRE

    Maeda, Shin-ichi

    2014-01-01

    Dropout is one of the key techniques to prevent the learning from overfitting. It is explained that dropout works as a kind of modified L2 regularization. Here, we shed light on the dropout from Bayesian standpoint. Bayesian interpretation enables us to optimize the dropout rate, which is beneficial for learning of weight parameters and prediction after learning. The experiment result also encourages the optimization of the dropout.

  13. Statistical inference a short course

    CERN Document Server

    Panik, Michael J

    2012-01-01

    A concise, easily accessible introduction to descriptive and inferential techniques Statistical Inference: A Short Course offers a concise presentation of the essentials of basic statistics for readers seeking to acquire a working knowledge of statistical concepts, measures, and procedures. The author conducts tests on the assumption of randomness and normality, provides nonparametric methods when parametric approaches might not work. The book also explores how to determine a confidence interval for a population median while also providing coverage of ratio estimation, randomness, and causal

  14. The use of statistical procedures for comparison of the production of eggs considering different treatments and lines of quails

    Directory of Open Access Journals (Sweden)

    Robson Marcelo Rossi

    2012-04-01

    Full Text Available To make inferences concerning a population, parametric methods are traditionally used for sample data analysis assuming that these come from ordinary populations, which not always happens. Most of the times such assumption leads to inadequate conclusions and decisions, especially when the data are distributed asymmetrically. It was aimed on this work to adjust some plausible models for this type of data, aiming to compare eggs productions in quails of different lines through frequentist and Bayesian methods. In the analyzed data it could be verified that the parametric and non-parametric frequentist procedures were, in general, equally conclusive. The Bayesian procedure was the only one which detected difference between the yellow and red lines, regarding the preconized energy diet, and between the blue and yellow for the second eclosion group. By the frequentist methods differences between the blue and red lines were not found; however, given a Poisson distribution for the eggs production, through multiple comparisons between the posterior mean under the Bayesian focus, differences were found between all lines. For being more sensitive, besides flexible, such procedure detected differences not observed by the traditional methods between the yellow and red lines, in the higher energy diet; between the blue and the red one in the second eclosion group and in general. It was concluded that quails of the yellow line showed bigger eggs production, regardless of the eclosion group and diet. Such results raise the discussion on the appropriate use of methods and procedures for statistical data analysis.

  15. Statistical modelling of networked human-automation performance using working memory capacity.

    Science.gov (United States)

    Ahmed, Nisar; de Visser, Ewart; Shaw, Tyler; Mohamed-Ameen, Amira; Campbell, Mark; Parasuraman, Raja

    2014-01-01

    This study examines the challenging problem of modelling the interaction between individual attentional limitations and decision-making performance in networked human-automation system tasks. Analysis of real experimental data from a task involving networked supervision of multiple unmanned aerial vehicles by human participants shows that both task load and network message quality affect performance, but that these effects are modulated by individual differences in working memory (WM) capacity. These insights were used to assess three statistical approaches for modelling and making predictions with real experimental networked supervisory performance data: classical linear regression, non-parametric Gaussian processes and probabilistic Bayesian networks. It is shown that each of these approaches can help designers of networked human-automated systems cope with various uncertainties in order to accommodate future users by linking expected operating conditions and performance from real experimental data to observable cognitive traits like WM capacity. Practitioner Summary: Working memory (WM) capacity helps account for inter-individual variability in operator performance in networked unmanned aerial vehicle supervisory tasks. This is useful for reliable performance prediction near experimental conditions via linear models; robust statistical prediction beyond experimental conditions via Gaussian process models and probabilistic inference about unknown task conditions/WM capacities via Bayesian network models.

  16. When mechanism matters: Bayesian forecasting using models of ecological diffusion

    Science.gov (United States)

    Hefley, Trevor J.; Hooten, Mevin B.; Russell, Robin E.; Walsh, Daniel P.; Powell, James A.

    2017-01-01

    Ecological diffusion is a theory that can be used to understand and forecast spatio-temporal processes such as dispersal, invasion, and the spread of disease. Hierarchical Bayesian modelling provides a framework to make statistical inference and probabilistic forecasts, using mechanistic ecological models. To illustrate, we show how hierarchical Bayesian models of ecological diffusion can be implemented for large data sets that are distributed densely across space and time. The hierarchical Bayesian approach is used to understand and forecast the growth and geographic spread in the prevalence of chronic wasting disease in white-tailed deer (Odocoileus virginianus). We compare statistical inference and forecasts from our hierarchical Bayesian model to phenomenological regression-based methods that are commonly used to analyse spatial occurrence data. The mechanistic statistical model based on ecological diffusion led to important ecological insights, obviated a commonly ignored type of collinearity, and was the most accurate method for forecasting.

  17. Bayesian networks in educational assessment

    CERN Document Server

    Almond, Russell G; Steinberg, Linda S; Yan, Duanli; Williamson, David M

    2015-01-01

    Bayesian inference networks, a synthesis of statistics and expert systems, have advanced reasoning under uncertainty in medicine, business, and social sciences. This innovative volume is the first comprehensive treatment exploring how they can be applied to design and analyze innovative educational assessments. Part I develops Bayes nets’ foundations in assessment, statistics, and graph theory, and works through the real-time updating algorithm. Part II addresses parametric forms for use with assessment, model-checking techniques, and estimation with the EM algorithm and Markov chain Monte Carlo (MCMC). A unique feature is the volume’s grounding in Evidence-Centered Design (ECD) framework for assessment design. This “design forward” approach enables designers to take full advantage of Bayes nets’ modularity and ability to model complex evidentiary relationships that arise from performance in interactive, technology-rich assessments such as simulations. Part III describes ECD, situates Bayes nets as ...

  18. A robust nonparametric method for quantifying undetected extinctions.

    Science.gov (United States)

    Chisholm, Ryan A; Giam, Xingli; Sadanandan, Keren R; Fung, Tak; Rheindt, Frank E

    2016-06-01

    How many species have gone extinct in modern times before being described by science? To answer this question, and thereby get a full assessment of humanity's impact on biodiversity, statistical methods that quantify undetected extinctions are required. Such methods have been developed recently, but they are limited by their reliance on parametric assumptions; specifically, they assume the pools of extant and undetected species decay exponentially, whereas real detection rates vary temporally with survey effort and real extinction rates vary with the waxing and waning of threatening processes. We devised a new, nonparametric method for estimating undetected extinctions. As inputs, the method requires only the first and last date at which each species in an ensemble was recorded. As outputs, the method provides estimates of the proportion of species that have gone extinct, detected, or undetected and, in the special case where the number of undetected extant species in the present day is assumed close to zero, of the absolute number of undetected extinct species. The main assumption of the method is that the per-species extinction rate is independent of whether a species has been detected or not. We applied the method to the resident native bird fauna of Singapore. Of 195 recorded species, 58 (29.7%) have gone extinct in the last 200 years. Our method projected that an additional 9.6 species (95% CI 3.4, 19.8) have gone extinct without first being recorded, implying a true extinction rate of 33.0% (95% CI 31.0%, 36.2%). We provide R code for implementing our method. Because our method does not depend on strong assumptions, we expect it to be broadly useful for quantifying undetected extinctions. © 2016 Society for Conservation Biology.

  19. Bayesian Modeling of MPSS Data: Gene Expression Analysis of Bovine Salmonella Infection

    KAUST Repository

    Dhavala, Soma S.

    2010-09-01

    Massively Parallel Signature Sequencing (MPSS) is a high-throughput, counting-based technology available for gene expression profiling. It produces output that is similar to Serial Analysis of Gene Expression and is ideal for building complex relational databases for gene expression. Our goal is to compare the in vivo global gene expression profiles of tissues infected with different strains of Salmonella obtained using the MPSS technology. In this article, we develop an exact ANOVA type model for this count data using a zero-inflatedPoisson distribution, different from existing methods that assume continuous densities. We adopt two Bayesian hierarchical models-one parametric and the other semiparametric with a Dirichlet process prior that has the ability to "borrow strength" across related signatures, where a signature is a specific arrangement of the nucleotides, usually 16-21 base pairs long. We utilize the discreteness of Dirichlet process prior to cluster signatures that exhibit similar differential expression profiles. Tests for differential expression are carried out using nonparametric approaches, while controlling the false discovery rate. We identify several differentially expressed genes that have important biological significance and conclude with a summary of the biological discoveries. This article has supplementary materials online. © 2010 American Statistical Association.

  20. Bayesian Modeling of MPSS Data: Gene Expression Analysis of Bovine Salmonella Infection

    KAUST Repository

    Dhavala, Soma S.; Datta, Sujay; Mallick, Bani K.; Carroll, Raymond J.; Khare, Sangeeta; Lawhon, Sara D.; Adams, L. Garry

    2010-01-01

    Massively Parallel Signature Sequencing (MPSS) is a high-throughput, counting-based technology available for gene expression profiling. It produces output that is similar to Serial Analysis of Gene Expression and is ideal for building complex relational databases for gene expression. Our goal is to compare the in vivo global gene expression profiles of tissues infected with different strains of Salmonella obtained using the MPSS technology. In this article, we develop an exact ANOVA type model for this count data using a zero-inflatedPoisson distribution, different from existing methods that assume continuous densities. We adopt two Bayesian hierarchical models-one parametric and the other semiparametric with a Dirichlet process prior that has the ability to "borrow strength" across related signatures, where a signature is a specific arrangement of the nucleotides, usually 16-21 base pairs long. We utilize the discreteness of Dirichlet process prior to cluster signatures that exhibit similar differential expression profiles. Tests for differential expression are carried out using nonparametric approaches, while controlling the false discovery rate. We identify several differentially expressed genes that have important biological significance and conclude with a summary of the biological discoveries. This article has supplementary materials online. © 2010 American Statistical Association.

  1. The image recognition based on neural network and Bayesian decision

    Science.gov (United States)

    Wang, Chugege

    2018-04-01

    The artificial neural network began in 1940, which is an important part of artificial intelligence. At present, it has become a hot topic in the fields of neuroscience, computer science, brain science, mathematics, and psychology. Thomas Bayes firstly reported the Bayesian theory in 1763. After the development in the twentieth century, it has been widespread in all areas of statistics. In recent years, due to the solution of the problem of high-dimensional integral calculation, Bayesian Statistics has been improved theoretically, which solved many problems that cannot be solved by classical statistics and is also applied to the interdisciplinary fields. In this paper, the related concepts and principles of the artificial neural network are introduced. It also summarizes the basic content and principle of Bayesian Statistics, and combines the artificial neural network technology and Bayesian decision theory and implement them in all aspects of image recognition, such as enhanced face detection method based on neural network and Bayesian decision, as well as the image classification based on the Bayesian decision. It can be seen that the combination of artificial intelligence and statistical algorithms has always been the hot research topic.

  2. Bayesian estimation of the discrete coefficient of determination.

    Science.gov (United States)

    Chen, Ting; Braga-Neto, Ulisses M

    2016-12-01

    The discrete coefficient of determination (CoD) measures the nonlinear interaction between discrete predictor and target variables and has had far-reaching applications in Genomic Signal Processing. Previous work has addressed the inference of the discrete CoD using classical parametric and nonparametric approaches. In this paper, we introduce a Bayesian framework for the inference of the discrete CoD. We derive analytically the optimal minimum mean-square error (MMSE) CoD estimator, as well as a CoD estimator based on the Optimal Bayesian Predictor (OBP). For the latter estimator, exact expressions for its bias, variance, and root-mean-square (RMS) are given. The accuracy of both Bayesian CoD estimators with non-informative and informative priors, under fixed or random parameters, is studied via analytical and numerical approaches. We also demonstrate the application of the proposed Bayesian approach in the inference of gene regulatory networks, using gene-expression data from a previously published study on metastatic melanoma.

  3. Nonparametric method for failures detection and localization in the actuating subsystem of aircraft control system

    Science.gov (United States)

    Karpenko, S. S.; Zybin, E. Yu; Kosyanchuk, V. V.

    2018-02-01

    In this paper we design a nonparametric method for failures detection and localization in the aircraft control system that uses the measurements of the control signals and the aircraft states only. It doesn’t require a priori information of the aircraft model parameters, training or statistical calculations, and is based on algebraic solvability conditions for the aircraft model identification problem. This makes it possible to significantly increase the efficiency of detection and localization problem solution by completely eliminating errors, associated with aircraft model uncertainties.

  4. A nonparametric approach to calculate critical micelle concentrations: the local polynomial regression method

    Energy Technology Data Exchange (ETDEWEB)

    Lopez Fontan, J.L.; Costa, J.; Ruso, J.M.; Prieto, G. [Dept. of Applied Physics, Univ. of Santiago de Compostela, Santiago de Compostela (Spain); Sarmiento, F. [Dept. of Mathematics, Faculty of Informatics, Univ. of A Coruna, A Coruna (Spain)

    2004-02-01

    The application of a statistical method, the local polynomial regression method, (LPRM), based on a nonparametric estimation of the regression function to determine the critical micelle concentration (cmc) is presented. The method is extremely flexible because it does not impose any parametric model on the subjacent structure of the data but rather allows the data to speak for themselves. Good concordance of cmc values with those obtained by other methods was found for systems in which the variation of a measured physical property with concentration showed an abrupt change. When this variation was slow, discrepancies between the values obtained by LPRM and others methods were found. (orig.)

  5. Likelihood devices in spatial statistics

    NARCIS (Netherlands)

    Zwet, E.W. van

    1999-01-01

    One of the main themes of this thesis is the application to spatial data of modern semi- and nonparametric methods. Another, closely related theme is maximum likelihood estimation from spatial data. Maximum likelihood estimation is not common practice in spatial statistics. The method of moments

  6. Comparing parametric and nonparametric regression methods for panel data

    DEFF Research Database (Denmark)

    Czekaj, Tomasz Gerard; Henningsen, Arne

    We investigate and compare the suitability of parametric and non-parametric stochastic regression methods for analysing production technologies and the optimal firm size. Our theoretical analysis shows that the most commonly used functional forms in empirical production analysis, Cobb......-Douglas and Translog, are unsuitable for analysing the optimal firm size. We show that the Translog functional form implies an implausible linear relationship between the (logarithmic) firm size and the elasticity of scale, where the slope is artificially related to the substitutability between the inputs....... The practical applicability of the parametric and non-parametric regression methods is scrutinised and compared by an empirical example: we analyse the production technology and investigate the optimal size of Polish crop farms based on a firm-level balanced panel data set. A nonparametric specification test...

  7. Bayesian phylogeography finds its roots.

    Directory of Open Access Journals (Sweden)

    Philippe Lemey

    2009-09-01

    Full Text Available As a key factor in endemic and epidemic dynamics, the geographical distribution of viruses has been frequently interpreted in the light of their genetic histories. Unfortunately, inference of historical dispersal or migration patterns of viruses has mainly been restricted to model-free heuristic approaches that provide little insight into the temporal setting of the spatial dynamics. The introduction of probabilistic models of evolution, however, offers unique opportunities to engage in this statistical endeavor. Here we introduce a Bayesian framework for inference, visualization and hypothesis testing of phylogeographic history. By implementing character mapping in a Bayesian software that samples time-scaled phylogenies, we enable the reconstruction of timed viral dispersal patterns while accommodating phylogenetic uncertainty. Standard Markov model inference is extended with a stochastic search variable selection procedure that identifies the parsimonious descriptions of the diffusion process. In addition, we propose priors that can incorporate geographical sampling distributions or characterize alternative hypotheses about the spatial dynamics. To visualize the spatial and temporal information, we summarize inferences using virtual globe software. We describe how Bayesian phylogeography compares with previous parsimony analysis in the investigation of the influenza A H5N1 origin and H5N1 epidemiological linkage among sampling localities. Analysis of rabies in West African dog populations reveals how virus diffusion may enable endemic maintenance through continuous epidemic cycles. From these analyses, we conclude that our phylogeographic framework will make an important asset in molecular epidemiology that can be easily generalized to infer biogeogeography from genetic data for many organisms.

  8. Bayesian approach and application to operation safety

    International Nuclear Information System (INIS)

    Procaccia, H.; Suhner, M.Ch.

    2003-01-01

    The management of industrial risks requires the development of statistical and probabilistic analyses which use all the available convenient information in order to compensate the insufficient experience feedback in a domain where accidents and incidents remain too scarce to perform a classical statistical frequency analysis. The Bayesian decision approach is well adapted to this problem because it integrates both the expertise and the experience feedback. The domain of knowledge is widen, the forecasting study becomes possible and the decisions-remedial actions are strengthen thanks to risk-cost-benefit optimization analyzes. This book presents the bases of the Bayesian approach and its concrete applications in various industrial domains. After a mathematical presentation of the industrial operation safety concepts and of the Bayesian approach principles, this book treats of some of the problems that can be solved thanks to this approach: softwares reliability, controls linked with the equipments warranty, dynamical updating of databases, expertise modeling and weighting, Bayesian optimization in the domains of maintenance, quality control, tests and design of new equipments. A synthesis of the mathematical formulae used in this approach is given in conclusion. (J.S.)

  9. Bayesian networks with examples in R

    CERN Document Server

    Scutari, Marco

    2014-01-01

    Introduction. The Discrete Case: Multinomial Bayesian Networks. The Continuous Case: Gaussian Bayesian Networks. More Complex Cases. Theory and Algorithms for Bayesian Networks. Real-World Applications of Bayesian Networks. Appendices. Bibliography.

  10. Extending the linear model with R generalized linear, mixed effects and nonparametric regression models

    CERN Document Server

    Faraway, Julian J

    2005-01-01

    Linear models are central to the practice of statistics and form the foundation of a vast range of statistical methodologies. Julian J. Faraway''s critically acclaimed Linear Models with R examined regression and analysis of variance, demonstrated the different methods available, and showed in which situations each one applies. Following in those footsteps, Extending the Linear Model with R surveys the techniques that grow from the regression model, presenting three extensions to that framework: generalized linear models (GLMs), mixed effect models, and nonparametric regression models. The author''s treatment is thoroughly modern and covers topics that include GLM diagnostics, generalized linear mixed models, trees, and even the use of neural networks in statistics. To demonstrate the interplay of theory and practice, throughout the book the author weaves the use of the R software environment to analyze the data of real examples, providing all of the R commands necessary to reproduce the analyses. All of the ...

  11. Bayesian inference for psychology. Part II: Example applications with JASP.

    Science.gov (United States)

    Wagenmakers, Eric-Jan; Love, Jonathon; Marsman, Maarten; Jamil, Tahira; Ly, Alexander; Verhagen, Josine; Selker, Ravi; Gronau, Quentin F; Dropmann, Damian; Boutin, Bruno; Meerhoff, Frans; Knight, Patrick; Raj, Akash; van Kesteren, Erik-Jan; van Doorn, Johnny; Šmíra, Martin; Epskamp, Sacha; Etz, Alexander; Matzke, Dora; de Jong, Tim; van den Bergh, Don; Sarafoglou, Alexandra; Steingroever, Helen; Derks, Koen; Rouder, Jeffrey N; Morey, Richard D

    2018-02-01

    Bayesian hypothesis testing presents an attractive alternative to p value hypothesis testing. Part I of this series outlined several advantages of Bayesian hypothesis testing, including the ability to quantify evidence and the ability to monitor and update this evidence as data come in, without the need to know the intention with which the data were collected. Despite these and other practical advantages, Bayesian hypothesis tests are still reported relatively rarely. An important impediment to the widespread adoption of Bayesian tests is arguably the lack of user-friendly software for the run-of-the-mill statistical problems that confront psychologists for the analysis of almost every experiment: the t-test, ANOVA, correlation, regression, and contingency tables. In Part II of this series we introduce JASP ( http://www.jasp-stats.org ), an open-source, cross-platform, user-friendly graphical software package that allows users to carry out Bayesian hypothesis tests for standard statistical problems. JASP is based in part on the Bayesian analyses implemented in Morey and Rouder's BayesFactor package for R. Armed with JASP, the practical advantages of Bayesian hypothesis testing are only a mouse click away.

  12. Editorial: Bayesian benefits for child psychology and psychiatry researchers.

    Science.gov (United States)

    Oldehinkel, Albertine J

    2016-09-01

    For many scientists, performing statistical tests has become an almost automated routine. However, p-values are frequently used and interpreted incorrectly; and even when used appropriately, p-values tend to provide answers that do not match researchers' questions and hypotheses well. Bayesian statistics present an elegant and often more suitable alternative. The Bayesian approach has rarely been applied in child psychology and psychiatry research so far, but the development of user-friendly software packages and tutorials has placed it well within reach now. Because Bayesian analyses require a more refined definition of hypothesized probabilities of possible outcomes than the classical approach, going Bayesian may offer the additional benefit of sparkling the development and refinement of theoretical models in our field. © 2016 Association for Child and Adolescent Mental Health.

  13. Bayesian policy reuse

    CSIR Research Space (South Africa)

    Rosman, Benjamin

    2016-02-01

    Full Text Available Keywords Policy Reuse · Reinforcement Learning · Online Learning · Online Bandits · Transfer Learning · Bayesian Optimisation · Bayesian Decision Theory. 1 Introduction As robots and software agents are becoming more ubiquitous in many applications.... The agent has access to a library of policies (pi1, pi2 and pi3), and has previously experienced a set of task instances (τ1, τ2, τ3, τ4), as well as samples of the utilities of the library policies on these instances (the black dots indicate the means...

  14. The nonparametric bootstrap for the current status model

    NARCIS (Netherlands)

    Groeneboom, P.; Hendrickx, K.

    2017-01-01

    It has been proved that direct bootstrapping of the nonparametric maximum likelihood estimator (MLE) of the distribution function in the current status model leads to inconsistent confidence intervals. We show that bootstrapping of functionals of the MLE can however be used to produce valid

  15. Non-Parametric Analysis of Rating Transition and Default Data

    DEFF Research Database (Denmark)

    Fledelius, Peter; Lando, David; Perch Nielsen, Jens

    2004-01-01

    We demonstrate the use of non-parametric intensity estimation - including construction of pointwise confidence sets - for analyzing rating transition data. We find that transition intensities away from the class studied here for illustration strongly depend on the direction of the previous move b...

  16. Nonparametric modeling of dynamic functional connectivity in fmri data

    DEFF Research Database (Denmark)

    Nielsen, Søren Føns Vind; Madsen, Kristoffer H.; Røge, Rasmus

    2015-01-01

    dynamic changes. The existing approaches modeling dynamic connectivity have primarily been based on time-windowing the data and k-means clustering. We propose a nonparametric generative model for dynamic FC in fMRI that does not rely on specifying window lengths and number of dynamic states. Rooted...

  17. Surface Estimation, Variable Selection, and the Nonparametric Oracle Property.

    Science.gov (United States)

    Storlie, Curtis B; Bondell, Howard D; Reich, Brian J; Zhang, Hao Helen

    2011-04-01

    Variable selection for multivariate nonparametric regression is an important, yet challenging, problem due, in part, to the infinite dimensionality of the function space. An ideal selection procedure should be automatic, stable, easy to use, and have desirable asymptotic properties. In particular, we define a selection procedure to be nonparametric oracle (np-oracle) if it consistently selects the correct subset of predictors and at the same time estimates the smooth surface at the optimal nonparametric rate, as the sample size goes to infinity. In this paper, we propose a model selection procedure for nonparametric models, and explore the conditions under which the new method enjoys the aforementioned properties. Developed in the framework of smoothing spline ANOVA, our estimator is obtained via solving a regularization problem with a novel adaptive penalty on the sum of functional component norms. Theoretical properties of the new estimator are established. Additionally, numerous simulated and real examples further demonstrate that the new approach substantially outperforms other existing methods in the finite sample setting.

  18. On the robust nonparametric regression estimation for a functional regressor

    OpenAIRE

    Azzedine , Nadjia; Laksaci , Ali; Ould-Saïd , Elias

    2009-01-01

    On the robust nonparametric regression estimation for a functional regressor correspondance: Corresponding author. (Ould-Said, Elias) (Azzedine, Nadjia) (Laksaci, Ali) (Ould-Said, Elias) Departement de Mathematiques--> , Univ. Djillali Liabes--> , BP 89--> , 22000 Sidi Bel Abbes--> - ALGERIA (Azzedine, Nadjia) Departement de Mathema...

  19. A general approach to posterior contraction in nonparametric inverse problems

    NARCIS (Netherlands)

    Knapik, Bartek; Salomond, Jean Bernard

    In this paper, we propose a general method to derive an upper bound for the contraction rate of the posterior distribution for nonparametric inverse problems. We present a general theorem that allows us to derive contraction rates for the parameter of interest from contraction rates of the related

  20. Non-parametric analysis of production efficiency of poultry egg ...

    African Journals Online (AJOL)

    Non-parametric analysis of production efficiency of poultry egg farmers in Delta ... analysis of factors affecting the output of poultry farmers showed that stock ... should be put in place for farmers to learn the best farm practices carried out on the ...