WorldWideScience

Sample records for optimal statistical inference

  1. Statistical inference

    CERN Document Server

    Rohatgi, Vijay K

    2003-01-01

    Unified treatment of probability and statistics examines and analyzes the relationship between the two fields, exploring inferential issues. Numerous problems, examples, and diagrams--some with solutions--plus clear-cut, highlighted summaries of results. Advanced undergraduate to graduate level. Contents: 1. Introduction. 2. Probability Model. 3. Probability Distributions. 4. Introduction to Statistical Inference. 5. More on Mathematical Expectation. 6. Some Discrete Models. 7. Some Continuous Models. 8. Functions of Random Variables and Random Vectors. 9. Large-Sample Theory. 10. General Meth

  2. Bayesian statistical inference

    Directory of Open Access Journals (Sweden)

    Bruno De Finetti

    2017-04-01

This work was translated into English and published in the volume: Bruno De Finetti, Induction and Probability, Biblioteca di Statistica, eds. P. Monari, D. Cocchi, Clueb, Bologna, 1993. Bayesian Statistical Inference is one of the last fundamental philosophical papers in which we can find De Finetti's essential approach to statistical inference.

  3. Geometric statistical inference

    International Nuclear Information System (INIS)

    Periwal, Vipul

    1999-01-01

A reparametrization-covariant formulation of the inverse problem of probability is explicitly solved for finite sample sizes. The inferred distribution is explicitly continuous for finite sample size. A geometric solution of the statistical inference problem in higher dimensions is outlined.

  4. Probability and Statistical Inference

    OpenAIRE

    Prosper, Harrison B.

    2006-01-01

    These lectures introduce key concepts in probability and statistical inference at a level suitable for graduate students in particle physics. Our goal is to paint as vivid a picture as possible of the concepts covered.

  5. On quantum statistical inference

    NARCIS (Netherlands)

    Barndorff-Nielsen, O.E.; Gill, R.D.; Jupp, P.E.

    2003-01-01

    Interest in problems of statistical inference connected to measurements of quantum systems has recently increased substantially, in step with dramatic new developments in experimental techniques for studying small quantum systems. Furthermore, developments in the theory of quantum measurements have

  6. Introductory statistical inference

    CERN Document Server

    Mukhopadhyay, Nitis

    2014-01-01

This gracefully organized text reveals the rigorous theory of probability and statistical inference in the style of a tutorial, using worked examples, exercises, figures, tables, and computer simulations to develop and illustrate concepts. Drills and boxed summaries emphasize and reinforce important ideas and special techniques. Beginning with a review of the basic concepts and methods in probability theory, moments, and moment generating functions, the author moves to more intricate topics. Introductory Statistical Inference studies multivariate random variables, exponential families of dist

  7. Nonparametric statistical inference

    CERN Document Server

    Gibbons, Jean Dickinson

    2010-01-01

Overall, this remains a very fine book suitable for a graduate-level course in nonparametric statistics. I recommend it for all people interested in learning the basic ideas of nonparametric statistical inference. (Eugenia Stoimenova, Journal of Applied Statistics, June 2012) … one of the best books available for a graduate (or advanced undergraduate) text for a theory course on nonparametric statistics. … a very well-written and organized book on nonparametric statistics, especially useful and recommended for teachers and graduate students. (Biometrics, 67, September 2011) This excellently presente

  8. Statistical inference for financial engineering

    CERN Document Server

    Taniguchi, Masanobu; Ogata, Hiroaki; Taniai, Hiroyuki

    2014-01-01

    This monograph provides the fundamentals of statistical inference for financial engineering and covers some selected methods suitable for analyzing financial time series data. In order to describe the actual financial data, various stochastic processes, e.g. non-Gaussian linear processes, non-linear processes, long-memory processes, locally stationary processes etc. are introduced and their optimal estimation is considered as well. This book also includes several statistical approaches, e.g., discriminant analysis, the empirical likelihood method, control variate method, quantile regression, realized volatility etc., which have been recently developed and are considered to be powerful tools for analyzing the financial data, establishing a new bridge between time series and financial engineering. This book is well suited as a professional reference book on finance, statistics and statistical financial engineering. Readers are expected to have an undergraduate-level knowledge of statistics.

  9. Statistical inference an integrated approach

    CERN Document Server

    Migon, Helio S; Louzada, Francisco

    2014-01-01

Introduction: Information; The concept of probability; Assessing subjective probabilities; An example; Linear algebra and probability; Notation; Outline of the book. Elements of Inference: Common statistical models; Likelihood-based functions; Bayes theorem; Exchangeability; Sufficiency and exponential family; Parameter elimination. Prior Distribution: Entirely subjective specification; Specification through functional forms; Conjugacy with the exponential family; Non-informative priors; Hierarchical priors. Estimation: Introduction to decision theory; Bayesian point estimation; Classical point estimation; Empirical Bayes estimation; Comparison of estimators; Interval estimation; Estimation in the Normal model. Approximating Methods: The general problem of inference; Optimization techniques; Asymptotic theory; Other analytical approximations; Numerical integration methods; Simulation methods. Hypothesis Testing: Introduction; Classical hypothesis testing; Bayesian hypothesis testing; Hypothesis testing and confidence intervals; Asymptotic tests. Prediction...

  10. Statistical inference via fiducial methods

    OpenAIRE

    Salomé, Diemer

    1998-01-01

In this thesis, attention is restricted to inductive reasoning using a mathematical probability model. A statistical procedure prescribes, for every theoretically possible set of data, the inference about the unknown of interest. ... See: Summary

  11. On quantum statistical inference

    NARCIS (Netherlands)

    Barndorff-Nielsen, O.E.; Gill, R.D.; Jupp, P.E.

    2001-01-01

    Recent developments in the mathematical foundations of quantum mechanics have brought the theory closer to that of classical probability and statistics. On the other hand, the unique character of quantum physics sets many of the questions addressed apart from those met classically in stochastics.

  12. Statistical theory and inference

    CERN Document Server

    Olive, David J

    2014-01-01

This text is for a one-semester graduate course in statistical theory and covers minimal and complete sufficient statistics, maximum likelihood estimators, the method of moments, bias and mean square error, uniform minimum variance estimators and the Cramer-Rao lower bound, an introduction to large sample theory, likelihood ratio tests, uniformly most powerful tests and the Neyman-Pearson Lemma. A major goal of this text is to make these topics much more accessible to students by using the theory of exponential families. Exponential families, indicator functions and the support of the distribution are used throughout the text to simplify the theory. More than 50 "brand name" distributions are used to illustrate the theory with many examples of exponential families, maximum likelihood estimators and uniformly minimum variance unbiased estimators. There are many homework problems with over 30 pages of solutions.

  13. On quantum statistical inference

    DEFF Research Database (Denmark)

    Barndorff-Nielsen, Ole Eiler; Gill, Richard D.; Jupp, Peter E.

Recent developments in the mathematical foundations of quantum mechanics have brought the theory closer to that of classical probability and statistics. On the other hand, the unique character of quantum physics sets many of the questions addressed apart from those met classically in stochastics. Furthermore, concurrent advances in experimental techniques and in the theory of quantum computation have led to a strong interest in questions of quantum information, in particular in the sense of the amount of information about unknown parameters in given observational data or accessible through various...

  14. Nonparametric statistical inference

    CERN Document Server

    Gibbons, Jean Dickinson

    2014-01-01

    Thoroughly revised and reorganized, the fourth edition presents in-depth coverage of the theory and methods of the most widely used nonparametric procedures in statistical analysis and offers example applications appropriate for all areas of the social, behavioral, and life sciences. The book presents new material on the quantiles, the calculation of exact and simulated power, multiple comparisons, additional goodness-of-fit tests, methods of analysis of count data, and modern computer applications using MINITAB, SAS, and STATXACT. It includes tabular guides for simplified applications of tests and finding P values and confidence interval estimates.

  15. Statistical Physics, Optimization, Inference, and Message-Passing Algorithms : Lecture Notes of the Les Houches School of Physics : Special Issue, October 2013

    CERN Document Server

    Ricci-Tersenghi, Federico; Zdeborova, Lenka; Zecchina, Riccardo; Tramel, Eric W; Cugliandolo, Leticia F

    2015-01-01

    This book contains a collection of the presentations that were given in October 2013 at the Les Houches Autumn School on statistical physics, optimization, inference, and message-passing algorithms. In the last decade, there has been increasing convergence of interest and methods between theoretical physics and fields as diverse as probability, machine learning, optimization, and inference problems. In particular, much theoretical and applied work in statistical physics and computer science has relied on the use of message-passing algorithms and their connection to the statistical physics of glasses and spin glasses. For example, both the replica and cavity methods have led to recent advances in compressed sensing, sparse estimation, and random constraint satisfaction, to name a few. This book’s detailed pedagogical lectures on statistical inference, computational complexity, the replica and cavity methods, and belief propagation are aimed particularly at PhD students, post-docs, and young researchers desir...

  16. Statistical inference a short course

    CERN Document Server

    Panik, Michael J

    2012-01-01

A concise, easily accessible introduction to descriptive and inferential techniques. Statistical Inference: A Short Course offers a concise presentation of the essentials of basic statistics for readers seeking to acquire a working knowledge of statistical concepts, measures, and procedures. The author conducts tests on the assumptions of randomness and normality, and provides nonparametric methods for when parametric approaches might not work. The book also explores how to determine a confidence interval for a population median while also providing coverage of ratio estimation, randomness, and causal

  17. Statistical inference and Aristotle's Rhetoric.

    Science.gov (United States)

    Macdonald, Ranald R

    2004-11-01

    Formal logic operates in a closed system where all the information relevant to any conclusion is present, whereas this is not the case when one reasons about events and states of the world. Pollard and Richardson drew attention to the fact that the reasoning behind statistical tests does not lead to logically justifiable conclusions. In this paper statistical inferences are defended not by logic but by the standards of everyday reasoning. Aristotle invented formal logic, but argued that people mostly get at the truth with the aid of enthymemes--incomplete syllogisms which include arguing from examples, analogies and signs. It is proposed that statistical tests work in the same way--in that they are based on examples, invoke the analogy of a model and use the size of the effect under test as a sign that the chance hypothesis is unlikely. Of existing theories of statistical inference only a weak version of Fisher's takes this into account. Aristotle anticipated Fisher by producing an argument of the form that there were too many cases in which an outcome went in a particular direction for that direction to be plausibly attributed to chance. We can therefore conclude that Aristotle would have approved of statistical inference and there is a good reason for calling this form of statistical inference classical.

  18. Statistical learning and selective inference.

    Science.gov (United States)

    Taylor, Jonathan; Tibshirani, Robert J

    2015-06-23

    We describe the problem of "selective inference." This addresses the following challenge: Having mined a set of data to find potential associations, how do we properly assess the strength of these associations? The fact that we have "cherry-picked"--searched for the strongest associations--means that we must set a higher bar for declaring significant the associations that we see. This challenge becomes more important in the era of big data and complex statistical modeling. The cherry tree (dataset) can be very large and the tools for cherry picking (statistical learning methods) are now very sophisticated. We describe some recent new developments in selective inference and illustrate their use in forward stepwise regression, the lasso, and principal components analysis.

  19. On Quantum Statistical Inference, II

    OpenAIRE

    Barndorff-Nielsen, O. E.; Gill, R. D.; Jupp, P. E.

    2003-01-01

    Interest in problems of statistical inference connected to measurements of quantum systems has recently increased substantially, in step with dramatic new developments in experimental techniques for studying small quantum systems. Furthermore, theoretical developments in the theory of quantum measurements have brought the basic mathematical framework for the probability calculations much closer to that of classical probability theory. The present paper reviews this field and proposes and inte...

  20. Statistical inference on residual life

    CERN Document Server

    Jeong, Jong-Hyeon

    2014-01-01

    This is a monograph on the concept of residual life, which is an alternative summary measure of time-to-event data, or survival data. The mean residual life has been used for many years under the name of life expectancy, so it is a natural concept for summarizing survival or reliability data. It is also more interpretable than the popular hazard function, especially for communications between patients and physicians regarding the efficacy of a new drug in the medical field. This book reviews existing statistical methods to infer the residual life distribution. The review and comparison includes existing inference methods for mean and median, or quantile, residual life analysis through medical data examples. The concept of the residual life is also extended to competing risks analysis. The targeted audience includes biostatisticians, graduate students, and PhD (bio)statisticians. Knowledge in survival analysis at an introductory graduate level is advisable prior to reading this book.

  1. Subjective randomness as statistical inference.

    Science.gov (United States)

    Griffiths, Thomas L; Daniels, Dylan; Austerweil, Joseph L; Tenenbaum, Joshua B

    2018-06-01

    Some events seem more random than others. For example, when tossing a coin, a sequence of eight heads in a row does not seem very random. Where do these intuitions about randomness come from? We argue that subjective randomness can be understood as the result of a statistical inference assessing the evidence that an event provides for having been produced by a random generating process. We show how this account provides a link to previous work relating randomness to algorithmic complexity, in which random events are those that cannot be described by short computer programs. Algorithmic complexity is both incomputable and too general to capture the regularities that people can recognize, but viewing randomness as statistical inference provides two paths to addressing these problems: considering regularities generated by simpler computing machines, and restricting the set of probability distributions that characterize regularity. Building on previous work exploring these different routes to a more restricted notion of randomness, we define strong quantitative models of human randomness judgments that apply not just to binary sequences - which have been the focus of much of the previous work on subjective randomness - but also to binary matrices and spatial clustering. Copyright © 2018 Elsevier Inc. All rights reserved.
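
    A hedged sketch of the kind of quantity this account suggests (the exact definition used by the authors may differ): subjective randomness of an outcome x can be scored as a log-likelihood ratio comparing a random generating process against a restricted family of regular (structured) processes.

```latex
% Illustrative only: a log-likelihood-ratio measure of subjective randomness,
% in the spirit of the statistical-inference account described above.
\[
  \mathrm{randomness}(x) \;=\; \log \frac{P(x \mid \text{random})}{P(x \mid \text{regular})},
  \qquad
  P(x \mid \text{regular}) \;=\; \sum_{h \in \mathcal{H}} P(x \mid h)\, P(h),
\]
% where H is a restricted set of regular generating hypotheses, e.g. simpler
% computing machines or a constrained family of probability distributions.
```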

  2. Optimization methods for logical inference

    CERN Document Server

    Chandru, Vijay

    2011-01-01

Merging logic and mathematics in deductive inference: an innovative, cutting-edge approach. Optimization methods for logical inference? Absolutely, say Vijay Chandru and John Hooker, two major contributors to this rapidly expanding field. And even though "solving logical inference problems with optimization methods may seem a bit like eating sauerkraut with chopsticks . . . it is the mathematical structure of a problem that determines whether an optimization model can help solve it, not the context in which the problem occurs." Presenting powerful, proven optimization techniques for logic in

  3. Statistical Inference at Work: Statistical Process Control as an Example

    Science.gov (United States)

    Bakker, Arthur; Kent, Phillip; Derry, Jan; Noss, Richard; Hoyles, Celia

    2008-01-01

    To characterise statistical inference in the workplace this paper compares a prototypical type of statistical inference at work, statistical process control (SPC), with a type of statistical inference that is better known in educational settings, hypothesis testing. Although there are some similarities between the reasoning structure involved in…

  4. Statistical inference for stochastic processes

    National Research Council Canada - National Science Library

    Basawa, Ishwar V; Prakasa Rao, B. L. S

    1980-01-01

    The aim of this monograph is to attempt to reduce the gap between theory and applications in the area of stochastic modelling, by directing the interest of future researchers to the inference aspects...

  5. Statistical inference based on divergence measures

    CERN Document Server

    Pardo, Leandro

    2005-01-01

The idea of using functionals of Information Theory, such as entropies or divergences, in statistical inference is not new. However, in spite of the fact that divergence statistics have become a very good alternative to the classical likelihood ratio test and the Pearson-type statistic in discrete models, many statisticians remain unaware of this powerful approach. Statistical Inference Based on Divergence Measures explores classical problems of statistical inference, such as estimation and hypothesis testing, on the basis of measures of entropy and divergence. The first two chapters form an overview, from a statistical perspective, of the most important measures of entropy and divergence and study their properties. The author then examines the statistical analysis of discrete multivariate data with emphasis on problems in contingency tables and loglinear models using phi-divergence test statistics as well as minimum phi-divergence estimators. The final chapter looks at testing in general populations, prese...

  6. Statistical inference an integrated Bayesian/likelihood approach

    CERN Document Server

    Aitkin, Murray

    2010-01-01

    Filling a gap in current Bayesian theory, Statistical Inference: An Integrated Bayesian/Likelihood Approach presents a unified Bayesian treatment of parameter inference and model comparisons that can be used with simple diffuse prior specifications. This novel approach provides new solutions to difficult model comparison problems and offers direct Bayesian counterparts of frequentist t-tests and other standard statistical methods for hypothesis testing.After an overview of the competing theories of statistical inference, the book introduces the Bayes/likelihood approach used throughout. It pre

  7. Nonparametric predictive inference in statistical process control

    NARCIS (Netherlands)

    Arts, G.R.J.; Coolen, F.P.A.; Laan, van der P.

    2000-01-01

New methods for statistical process control are presented, where the inferences have a nonparametric predictive nature. We consider several problems in process control in terms of uncertainties about future observable random quantities, and we develop inferences for these random quantities based on

  8. Bayesian Inference in Statistical Analysis

    CERN Document Server

    Box, George E P

    2011-01-01

    The Wiley Classics Library consists of selected books that have become recognized classics in their respective fields. With these new unabridged and inexpensive editions, Wiley hopes to extend the life of these important works by making them available to future generations of mathematicians and scientists. Currently available in the Series: T. W. Anderson The Statistical Analysis of Time Series T. S. Arthanari & Yadolah Dodge Mathematical Programming in Statistics Emil Artin Geometric Algebra Norman T. J. Bailey The Elements of Stochastic Processes with Applications to the Natural Sciences Rob

  9. Statistical inference for template aging

    Science.gov (United States)

    Schuckers, Michael E.

    2006-04-01

A change in classification error rates for a biometric device is often referred to as template aging. Here we offer two methods for determining whether the effect of time is statistically significant. The first of these is the use of a generalized linear model to determine if these error rates change linearly over time. This approach generalizes previous work assessing the impact of covariates using generalized linear models. The second approach uses likelihood ratio test methodology. The focus here is on statistical methods for estimation, not the underlying cause of the change in error rates over time. These methodologies are applied to data from the National Institute of Standards and Technology Biometric Score Set Release 1. The results of these applications are discussed.
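
    As a hedged illustration of the first approach (the abstract does not give the exact model, so the binomial-logit form, variable names, and data below are assumptions, not the authors' specification), a generalized linear model of error counts against elapsed time can be fit roughly as follows:

```python
# Hypothetical sketch: test whether a biometric error rate drifts over time
# by fitting a binomial GLM (logit link) of error counts vs. elapsed time.
# All data values below are made up for illustration.
import numpy as np
import statsmodels.api as sm

months = np.array([0, 3, 6, 9, 12, 15, 18])             # time since enrolment
trials = np.array([500, 480, 510, 495, 505, 490, 500])  # comparisons attempted
errors = np.array([12, 14, 13, 17, 19, 22, 25])          # errors observed

X = sm.add_constant(months)                      # intercept + linear time term
y = np.column_stack([errors, trials - errors])   # (successes, failures) pairs

model = sm.GLM(y, X, family=sm.families.Binomial())
result = model.fit()
print(result.summary())                          # Wald test on the time coefficient
print("p-value for time effect:", result.pvalues[1])
```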

  10. Order statistics & inference estimation methods

    CERN Document Server

    Balakrishnan, N

    1991-01-01

The literature on order statistics and inference is quite extensive and covers a large number of fields, but most of it is dispersed throughout numerous publications. This volume is the consolidation of the most important results and places an emphasis on estimation. Both theoretical and computational procedures are presented to meet the needs of researchers, professionals, and students. The methods of estimation discussed are well-illustrated with numerous practical examples from both the physical and life sciences, including sociology, psychology, and electrical and chemical engineering. A co

  11. Ignorability in Statistical and Probabilistic Inference

    DEFF Research Database (Denmark)

    Jaeger, Manfred

    2005-01-01

    When dealing with incomplete data in statistical learning, or incomplete observations in probabilistic inference, one needs to distinguish the fact that a certain event is observed from the fact that the observed event has happened. Since the modeling and computational complexities entailed...

  12. Inference and the Introductory Statistics Course

    Science.gov (United States)

    Pfannkuch, Maxine; Regan, Matt; Wild, Chris; Budgett, Stephanie; Forbes, Sharleen; Harraway, John; Parsonage, Ross

    2011-01-01

    This article sets out some of the rationale and arguments for making major changes to the teaching and learning of statistical inference in introductory courses at our universities by changing from a norm-based, mathematical approach to more conceptually accessible computer-based approaches. The core problem of the inferential argument with its…

  13. Thermodynamics of statistical inference by cells.

    Science.gov (United States)

    Lang, Alex H; Fisher, Charles K; Mora, Thierry; Mehta, Pankaj

    2014-10-03

    The deep connection between thermodynamics, computation, and information is now well established both theoretically and experimentally. Here, we extend these ideas to show that thermodynamics also places fundamental constraints on statistical estimation and learning. To do so, we investigate the constraints placed by (nonequilibrium) thermodynamics on the ability of biochemical signaling networks to estimate the concentration of an external signal. We show that accuracy is limited by energy consumption, suggesting that there are fundamental thermodynamic constraints on statistical inference.

  14. Pointwise probability reinforcements for robust statistical inference.

    Science.gov (United States)

    Frénay, Benoît; Verleysen, Michel

    2014-02-01

Statistical inference using machine learning techniques may be difficult with small datasets because of abnormally frequent data (AFDs). AFDs are observations that are much more frequent in the training sample than they should be, with respect to their theoretical probability, and include e.g. outliers. Estimates of parameters tend to be biased towards models which support such data. This paper proposes to introduce pointwise probability reinforcements (PPRs): the probability of each observation is reinforced by a PPR and a regularisation allows controlling the amount of reinforcement which compensates for AFDs. The proposed solution is very generic, since it can be used to robustify any statistical inference method which can be formulated as a likelihood maximisation. Experiments show that PPRs can be easily used to tackle regression, classification and projection: models are freed from the influence of outliers. Moreover, outliers can be filtered manually since an abnormality degree is obtained for each observation. Copyright © 2013 Elsevier Ltd. All rights reserved.

  15. Statistical inference from imperfect photon detection

    International Nuclear Information System (INIS)

    Audenaert, Koenraad M R; Scheel, Stefan

    2009-01-01

    We consider the statistical properties of photon detection with imperfect detectors that exhibit dark counts and less than unit efficiency, in the context of tomographic reconstruction. In this context, the detectors are used to implement certain positive operator-valued measures (POVMs) that would allow us to reconstruct the quantum state or quantum process under consideration. Here we look at the intermediate step of inferring outcome probabilities from measured outcome frequencies, and show how this inference can be performed in a statistically sound way in the presence of detector imperfections. Merging outcome probabilities for different sets of POVMs into a consistent quantum state picture has been treated elsewhere (Audenaert and Scheel 2009 New J. Phys. 11 023028). Single-photon pulsed measurements as well as continuous wave measurements are covered.

  16. Initiating statistical maintenance optimization

    International Nuclear Information System (INIS)

    Doyle, E. Kevin; Tuomi, Vesa; Rowley, Ian

    2007-01-01

Since the 1980s, maintenance optimization has been centered around various formulations of Reliability Centered Maintenance (RCM). Several such optimization techniques have been implemented at the Bruce Nuclear Station. Further cost refinement of the Station preventive maintenance strategy includes evaluation of statistical optimization techniques. A review of successful pilot efforts in this direction is provided, as well as initial work with graphical analysis. The present situation regarding data sourcing, the principal impediment to the use of stochastic methods in previous years, is discussed. The use of Crow/AMSAA (Army Materiel Systems Analysis Activity) plots is demonstrated from the point of view of justifying expenditures in optimization efforts. (author)

  17. Optimal inference with suboptimal models: Addiction and active Bayesian inference

    Science.gov (United States)

    Schwartenbeck, Philipp; FitzGerald, Thomas H.B.; Mathys, Christoph; Dolan, Ray; Wurst, Friedrich; Kronbichler, Martin; Friston, Karl

    2015-01-01

    When casting behaviour as active (Bayesian) inference, optimal inference is defined with respect to an agent’s beliefs – based on its generative model of the world. This contrasts with normative accounts of choice behaviour, in which optimal actions are considered in relation to the true structure of the environment – as opposed to the agent’s beliefs about worldly states (or the task). This distinction shifts an understanding of suboptimal or pathological behaviour away from aberrant inference as such, to understanding the prior beliefs of a subject that cause them to behave less ‘optimally’ than our prior beliefs suggest they should behave. Put simply, suboptimal or pathological behaviour does not speak against understanding behaviour in terms of (Bayes optimal) inference, but rather calls for a more refined understanding of the subject’s generative model upon which their (optimal) Bayesian inference is based. Here, we discuss this fundamental distinction and its implications for understanding optimality, bounded rationality and pathological (choice) behaviour. We illustrate our argument using addictive choice behaviour in a recently described ‘limited offer’ task. Our simulations of pathological choices and addictive behaviour also generate some clear hypotheses, which we hope to pursue in ongoing empirical work. PMID:25561321

  18. All of statistics a concise course in statistical inference

    CERN Document Server

    Wasserman, Larry

    2004-01-01

This book is for people who want to learn probability and statistics quickly. It brings together many of the main ideas in modern statistics in one place. The book is suitable for students and researchers in statistics, computer science, data mining and machine learning. This book covers a much wider range of topics than a typical introductory text on mathematical statistics. It includes modern topics like nonparametric curve estimation, bootstrapping and classification, topics that are usually relegated to follow-up courses. The reader is assumed to know calculus and a little linear algebra. No previous knowledge of probability and statistics is required. The text can be used at the advanced undergraduate and graduate level. Larry Wasserman is Professor of Statistics at Carnegie Mellon University. He is also a member of the Center for Automated Learning and Discovery in the School of Computer Science. His research areas include nonparametric inference, asymptotic theory, causality, and applications to astrophysics, bi...

  19. Statistical Inference for Data Adaptive Target Parameters.

    Science.gov (United States)

    Hubbard, Alan E; Kherad-Pajouh, Sara; van der Laan, Mark J

    2016-05-01

    Consider one observes n i.i.d. copies of a random variable with a probability distribution that is known to be an element of a particular statistical model. In order to define our statistical target we partition the sample in V equal size sub-samples, and use this partitioning to define V splits in an estimation sample (one of the V subsamples) and corresponding complementary parameter-generating sample. For each of the V parameter-generating samples, we apply an algorithm that maps the sample to a statistical target parameter. We define our sample-split data adaptive statistical target parameter as the average of these V-sample specific target parameters. We present an estimator (and corresponding central limit theorem) of this type of data adaptive target parameter. This general methodology for generating data adaptive target parameters is demonstrated with a number of practical examples that highlight new opportunities for statistical learning from data. This new framework provides a rigorous statistical methodology for both exploratory and confirmatory analysis within the same data. Given that more research is becoming "data-driven", the theory developed within this paper provides a new impetus for a greater involvement of statistical inference into problems that are being increasingly addressed by clever, yet ad hoc pattern finding methods. To suggest such potential, and to verify the predictions of the theory, extensive simulation studies, along with a data analysis based on adaptively determined intervention rules are shown and give insight into how to structure such an approach. The results show that the data adaptive target parameter approach provides a general framework and resulting methodology for data-driven science.

  20. Parametric statistical inference basic theory and modern approaches

    CERN Document Server

    Zacks, Shelemyahu; Tsokos, C P

    1981-01-01

    Parametric Statistical Inference: Basic Theory and Modern Approaches presents the developments and modern trends in statistical inference to students who do not have advanced mathematical and statistical preparation. The topics discussed in the book are basic and common to many fields of statistical inference and thus serve as a jumping board for in-depth study. The book is organized into eight chapters. Chapter 1 provides an overview of how the theory of statistical inference is presented in subsequent chapters. Chapter 2 briefly discusses statistical distributions and their properties. Chapt

  1. Stan: A Probabilistic Programming Language for Bayesian Inference and Optimization

    Science.gov (United States)

    Gelman, Andrew; Lee, Daniel; Guo, Jiqiang

    2015-01-01

    Stan is a free and open-source C++ program that performs Bayesian inference or optimization for arbitrary user-specified models and can be called from the command line, R, Python, Matlab, or Julia and has great promise for fitting large and complex statistical models in many areas of application. We discuss Stan from users' and developers'…

  2. Reasoning about Informal Statistical Inference: One Statistician's View

    Science.gov (United States)

    Rossman, Allan J.

    2008-01-01

    This paper identifies key concepts and issues associated with the reasoning of informal statistical inference. I focus on key ideas of inference that I think all students should learn, including at secondary level as well as tertiary. I argue that a fundamental component of inference is to go beyond the data at hand, and I propose that statistical…

  3. International Conference on Trends and Perspectives in Linear Statistical Inference

    CERN Document Server

    Rosen, Dietrich

    2018-01-01

This volume features selected contributions on a variety of topics related to linear statistical inference. The peer-reviewed papers from the International Conference on Trends and Perspectives in Linear Statistical Inference (LinStat 2016) held in Istanbul, Turkey, 22-25 August 2016, cover topics in both theoretical and applied statistics, such as linear models, high-dimensional statistics, computational statistics, the design of experiments, and multivariate analysis. The book is intended for statisticians, Ph.D. students, and professionals who are interested in statistical inference.

  4. Statistical Inference on the Canadian Middle Class

    Directory of Open Access Journals (Sweden)

    Russell Davidson

    2018-03-01

Conventional wisdom says that the middle classes in many developed countries have recently suffered losses, in terms of both the share of the total population belonging to the middle class, and also their share in total income. Here, distribution-free methods are developed for inference on these shares, by deriving expressions for the asymptotic variances of the sample estimates and for the covariance between them. Asymptotic inference can be undertaken based on asymptotic normality. Bootstrap inference can be expected to be more reliable, and appropriate bootstrap procedures are proposed. As an illustration, samples of individual earnings drawn from Canadian census data are used to test various hypotheses about the middle-class shares, and confidence intervals for them are computed. It is found that, for the earlier censuses, sample sizes are large enough for asymptotic and bootstrap inference to be almost identical, but that, in the twenty-first century, the bootstrap fails on account of a strange phenomenon whereby many presumably different incomes in the data are rounded to one and the same value. Another difference between the centuries is the appearance of heavy right-hand tails in the income distributions of both men and women.

  5. Model averaging, optimal inference and habit formation

    Directory of Open Access Journals (Sweden)

    Thomas H B FitzGerald

    2014-06-01

Postulating that the brain performs approximate Bayesian inference generates principled and empirically testable models of neuronal function – the subject of much current interest in neuroscience and related disciplines. Current formulations address inference and learning under some assumed and particular model. In reality, organisms are often faced with an additional challenge – that of determining which model or models of their environment are the best for guiding behaviour. Bayesian model averaging – which says that an agent should weight the predictions of different models according to their evidence – provides a principled way to solve this problem. Importantly, because model evidence is determined by both the accuracy and complexity of the model, optimal inference requires that these be traded off against one another. This means an agent’s behaviour should show an equivalent balance. We hypothesise that Bayesian model averaging plays an important role in cognition, given that it is both optimal and realisable within a plausible neuronal architecture. We outline model averaging and how it might be implemented, and then explore a number of implications for brain and behaviour. In particular, we propose that model averaging can explain a number of apparently suboptimal phenomena within the framework of approximate (bounded) Bayesian inference, focussing particularly upon the relationship between goal-directed and habitual behaviour.
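
    The core identity behind the model averaging referred to here is standard; in symbols (notation mine, not taken from the paper), predictions are weighted by posterior model probabilities, which depend on the model evidence:

```latex
% Standard Bayesian model averaging: predictions weighted by model evidence.
\[
  p(y \mid \mathcal{D}) \;=\; \sum_{m} p(y \mid m, \mathcal{D})\, p(m \mid \mathcal{D}),
  \qquad
  p(m \mid \mathcal{D}) \;=\; \frac{p(\mathcal{D} \mid m)\, p(m)}{\sum_{m'} p(\mathcal{D} \mid m')\, p(m')},
\]
% where the evidence p(D | m) automatically penalises model complexity, which is
% the accuracy-complexity trade-off mentioned in the abstract.
```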

  6. Statistical Inference and Patterns of Inequality in the Global North

    Science.gov (United States)

    Moran, Timothy Patrick

    2006-01-01

    Cross-national inequality trends have historically been a crucial field of inquiry across the social sciences, and new methodological techniques of statistical inference have recently improved the ability to analyze these trends over time. This paper applies Monte Carlo, bootstrap inference methods to the income surveys of the Luxembourg Income…

  7. Combining statistical inference and decisions in ecology

    Science.gov (United States)

    Williams, Perry J.; Hooten, Mevin B.

    2016-01-01

    Statistical decision theory (SDT) is a sub-field of decision theory that formally incorporates statistical investigation into a decision-theoretic framework to account for uncertainties in a decision problem. SDT provides a unifying analysis of three types of information: statistical results from a data set, knowledge of the consequences of potential choices (i.e., loss), and prior beliefs about a system. SDT links the theoretical development of a large body of statistical methods including point estimation, hypothesis testing, and confidence interval estimation. The theory and application of SDT have mainly been developed and published in the fields of mathematics, statistics, operations research, and other decision sciences, but have had limited exposure in ecology. Thus, we provide an introduction to SDT for ecologists and describe its utility for linking the conventionally separate tasks of statistical investigation and decision making in a single framework. We describe the basic framework of both Bayesian and frequentist SDT, its traditional use in statistics, and discuss its application to decision problems that occur in ecology. We demonstrate SDT with two types of decisions: Bayesian point estimation, and an applied management problem of selecting a prescribed fire rotation for managing a grassland bird species. Central to SDT, and decision theory in general, are loss functions. Thus, we also provide basic guidance and references for constructing loss functions for an SDT problem.
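
    To make the role of the loss function concrete (these are standard decision-theoretic results, not details taken from the paper): the Bayes point estimator minimises posterior expected loss, and familiar estimators arise from familiar losses.

```latex
% Bayes point estimation: choose the action minimising posterior expected loss.
\[
  \hat{\theta}_{\mathrm{Bayes}} \;=\; \arg\min_{a}\; \mathbb{E}\!\left[\, L(\theta, a) \mid \text{data} \,\right].
\]
% Squared-error loss (theta - a)^2 gives the posterior mean;
% absolute-error loss gives the posterior median;
% 0-1 loss gives the posterior mode (MAP estimate).
```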

  8. Bayesian Information Criterion as an Alternative way of Statistical Inference

    Directory of Open Access Journals (Sweden)

    Nadejda Yu. Gubanova

    2012-05-01

The article treats the Bayesian information criterion (BIC) as an alternative to traditional methods of statistical inference based on NHST. A comparison of ANOVA and BIC results for a psychological experiment is discussed.
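
    For reference, the criterion in question (standard definition, not quoted from the article itself) is

```latex
% Bayesian information criterion for a model with k free parameters,
% maximised likelihood L-hat, and sample size n; lower BIC is preferred.
\[
  \mathrm{BIC} \;=\; k \ln n \;-\; 2 \ln \hat{L}.
\]
% Differences in BIC between two models approximate twice the log Bayes factor,
% which is what makes BIC usable as an alternative to NHST-style comparisons.
```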

  9. Statistical inferences for bearings life using sudden death test

    Directory of Open Access Journals (Sweden)

    Morariu Cristin-Olimpiu

    2017-01-01

In this paper we propose a calculation method for estimating reliability indicators and complete statistical inference for the three-parameter Weibull distribution of bearing life. Using experimental values for the durability of bearings tested on stands in sudden death tests involves a number of particularities in estimation by the maximum likelihood method and in carrying out the statistical inference. The paper details these features and also provides an example calculation.

  10. Statistical causal inferences and their applications in public health research

    CERN Document Server

    Wu, Pan; Chen, Ding-Geng

    2016-01-01

    This book compiles and presents new developments in statistical causal inference. The accompanying data and computer programs are publicly available so readers may replicate the model development and data analysis presented in each chapter. In this way, methodology is taught so that readers may implement it directly. The book brings together experts engaged in causal inference research to present and discuss recent issues in causal inference methodological development. This is also a timely look at causal inference applied to scenarios that range from clinical trials to mediation and public health research more broadly. In an academic setting, this book will serve as a reference and guide to a course in causal inference at the graduate level (Master's or Doctorate). It is particularly relevant for students pursuing degrees in Statistics, Biostatistics and Computational Biology. Researchers and data analysts in public health and biomedical research will also find this book to be an important reference.

  11. Statistical inference based on latent ability estimates

    NARCIS (Netherlands)

    Hoijtink, H.J.A.; Boomsma, A.

The quality of approximations to first and second order moments (e.g., statistics like means, variances, regression coefficients) based on latent ability estimates is being discussed. The ability estimates are obtained using either the Rasch or the two-parameter logistic model. Straightforward use

  12. Nonparametric predictive inference in statistical process control

    NARCIS (Netherlands)

    Arts, G.R.J.; Coolen, F.P.A.; Laan, van der P.

    2004-01-01

    Statistical process control (SPC) is used to decide when to stop a process as confidence in the quality of the next item(s) is low. Information to specify a parametric model is not always available, and as SPC is of a predictive nature, we present a control chart developed using nonparametric

  13. QInfer: Statistical inference software for quantum applications

    Directory of Open Access Journals (Sweden)

    Christopher Granade

    2017-04-01

Characterizing quantum systems through experimental data is critical to applications as diverse as metrology and quantum computing. Analyzing this experimental data in a robust and reproducible manner is made challenging, however, by the lack of readily-available software for performing principled statistical analysis. We improve the robustness and reproducibility of characterization by introducing an open-source library, QInfer, to address this need. Our library makes it easy to analyze data from tomography, randomized benchmarking, and Hamiltonian learning experiments either in post-processing, or online as data is acquired. QInfer also provides functionality for predicting the performance of proposed experimental protocols from simulated runs. By delivering easy-to-use characterization tools based on principled statistical analysis, QInfer helps address many outstanding challenges facing quantum technology.
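
    As a rough illustration of the kind of sequential Bayesian (particle-filter) updating that QInfer automates, here is a plain-NumPy sketch for estimating a single precession frequency from two-outcome measurements. This is not the QInfer API; the model, prior, and experiment schedule are assumptions made for the sketch.

```python
# Sketch of sequential Monte Carlo (particle) Bayesian estimation of a precession
# frequency omega from binary measurement outcomes, the kind of task QInfer automates.
# Not the QInfer API; model, prior, and experiment design are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
true_omega = 0.7
n_particles = 2000

particles = rng.uniform(0.0, 1.0, n_particles)      # prior: omega ~ Uniform(0, 1)
weights = np.full(n_particles, 1.0 / n_particles)

def likelihood(outcome, omega, t):
    """P(outcome | omega, t) for a simple precession model: P(1) = sin^2(omega * t / 2)."""
    p1 = np.sin(omega * t / 2.0) ** 2
    return p1 if outcome == 1 else 1.0 - p1

for k in range(50):
    t = (9.0 / 8.0) ** k                             # exponentially growing evolution times
    p1_true = np.sin(true_omega * t / 2.0) ** 2
    outcome = int(rng.random() < p1_true)            # simulated measurement

    weights *= likelihood(outcome, particles, t)     # Bayes update of particle weights
    weights /= weights.sum()

    if 1.0 / np.sum(weights ** 2) < n_particles / 2:  # resample when effective size is low
        idx = rng.choice(n_particles, n_particles, p=weights)
        particles = particles[idx]
        weights = np.full(n_particles, 1.0 / n_particles)

print("posterior mean estimate of omega:", np.sum(weights * particles))
```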

  14. Statistical inference for the lifetime performance index based on generalised order statistics from exponential distribution

    Science.gov (United States)

    Vali Ahmadi, Mohammad; Doostparast, Mahdi; Ahmadi, Jafar

    2015-04-01

In manufacturing industries, the lifetime of an item is usually characterised by a random variable X and considered to be satisfactory if X exceeds a given lower lifetime limit L. The probability of a satisfactory item is then ηL := P(X ≥ L), called the conforming rate. In industrial companies, however, the lifetime performance index, proposed by Montgomery and denoted by CL, is widely used as a process capability index instead of the conforming rate. Assuming a parametric model for the random variable X, we show that there is a connection between the conforming rate and the lifetime performance index. Consequently, the statistical inferences about ηL and CL are equivalent. Hence, we restrict ourselves to statistical inference for CL based on generalised order statistics, which contains several ordered data models such as usual order statistics, progressively Type-II censored data and records. Various point and interval estimators for the parameter CL are obtained and optimal critical regions for the hypothesis testing problems concerning CL are proposed. Finally, two real data-sets on the lifetimes of insulating fluid and ball bearings, due to Nelson (1982) and Caroni (2002), respectively, and a simulated sample are analysed.
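
    To illustrate the connection claimed between ηL and CL, here is the exponential special case (an assumption of this sketch; the paper treats a general parametric model). With an exponential lifetime, the mean and standard deviation coincide, so Montgomery's index and the conforming rate are monotone transforms of one another:

```latex
% Exponential lifetime X ~ Exp(lambda): mu = sigma = 1/lambda, so
\[
  C_L \;=\; \frac{\mu - L}{\sigma} \;=\; 1 - \lambda L,
  \qquad
  \eta_L \;=\; P(X \ge L) \;=\; e^{-\lambda L} \;=\; e^{\,C_L - 1},
\]
% hence inference about eta_L and inference about C_L are equivalent in this model,
% consistent with the equivalence the abstract states for parametric families.
```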

  15. Models for probability and statistical inference theory and applications

    CERN Document Server

    Stapleton, James H

    2007-01-01

This concise, yet thorough, book is enhanced with simulations and graphs to build the intuition of readers. Models for Probability and Statistical Inference was written over a five-year period and serves as a comprehensive treatment of the fundamentals of probability and statistical inference. With detailed theoretical coverage found throughout the book, readers acquire the fundamentals needed to advance to more specialized topics, such as sampling, linear models, design of experiments, statistical computing, survival analysis, and bootstrapping. Ideal as a textbook for a two-semester sequence on probability and statistical inference, early chapters provide coverage on probability and include discussions of: discrete models and random variables; discrete distributions including binomial, hypergeometric, geometric, and Poisson; continuous, normal, gamma, and conditional distributions; and limit theory. Since limit theory is usually the most difficult topic for readers to master, the author thoroughly discusses mo...

  16. Data-driven inference for the spatial scan statistic

    Directory of Open Access Journals (Sweden)

    Duczmal Luiz H

    2011-08-01

Background: Kulldorff's spatial scan statistic for aggregated area maps searches for clusters of cases without specifying their size (number of areas) or geographic location in advance. Their statistical significance is tested while adjusting for the multiple testing inherent in such a procedure. However, as is shown in this work, this adjustment is not done in an even manner for all possible cluster sizes. Results: A modification is proposed to the usual inference test of the spatial scan statistic, incorporating additional information about the size of the most likely cluster found. A new interpretation of the results of the spatial scan statistic is done, posing a modified inference question: what is the probability that the null hypothesis is rejected for the original observed cases map with a most likely cluster of size k, taking into account only those most likely clusters of size k found under the null hypothesis for comparison? This question is especially important when the p-value computed by the usual inference process is near the alpha significance level, regarding the correctness of the decision based on this inference. Conclusions: A practical procedure is provided to make more accurate inferences about the most likely cluster found by the spatial scan statistic.

  17. Data-driven inference for the spatial scan statistic.

    Science.gov (United States)

    Almeida, Alexandre C L; Duarte, Anderson R; Duczmal, Luiz H; Oliveira, Fernando L P; Takahashi, Ricardo H C

    2011-08-02

Kulldorff's spatial scan statistic for aggregated area maps searches for clusters of cases without specifying their size (number of areas) or geographic location in advance. Their statistical significance is tested while adjusting for the multiple testing inherent in such a procedure. However, as is shown in this work, this adjustment is not done in an even manner for all possible cluster sizes. A modification is proposed to the usual inference test of the spatial scan statistic, incorporating additional information about the size of the most likely cluster found. A new interpretation of the results of the spatial scan statistic is done, posing a modified inference question: what is the probability that the null hypothesis is rejected for the original observed cases map with a most likely cluster of size k, taking into account only those most likely clusters of size k found under null hypothesis for comparison? This question is especially important when the p-value computed by the usual inference process is near the alpha significance level, regarding the correctness of the decision based on this inference. A practical procedure is provided to make more accurate inferences about the most likely cluster found by the spatial scan statistic.
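
    A minimal sketch of the size-conditioned inference described in the two records above, assuming one already has, for each Monte Carlo replicate under the null, the value of the scan statistic and the size of its most likely cluster (the scan itself is not implemented here, and the function name is my own):

```python
# Sketch of a size-conditioned ("data-driven") p-value for the spatial scan statistic:
# compare the observed statistic only against null replicates whose most likely
# cluster has the same size k as the observed most likely cluster.
import numpy as np

def conditional_p_value(observed_stat, observed_size, null_stats, null_sizes):
    """p-value of the observed scan statistic, conditioning on cluster size k."""
    null_stats = np.asarray(null_stats)
    null_sizes = np.asarray(null_sizes)
    same_k = null_sizes == observed_size          # replicates with clusters of size k
    if not same_k.any():
        return np.nan                             # no comparable replicates available
    exceed = np.sum(null_stats[same_k] >= observed_stat)
    return (exceed + 1) / (same_k.sum() + 1)      # add-one Monte Carlo correction

# Toy usage with fabricated null-simulation output:
rng = np.random.default_rng(2)
null_stats = rng.gumbel(loc=8.0, scale=1.0, size=9999)
null_sizes = rng.integers(1, 20, size=9999)
print(conditional_p_value(observed_stat=11.2, observed_size=5,
                          null_stats=null_stats, null_sizes=null_sizes))
```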

  18. Inferring Demographic History Using Two-Locus Statistics.

    Science.gov (United States)

    Ragsdale, Aaron P; Gutenkunst, Ryan N

    2017-06-01

Population demographic history may be learned from contemporary genetic variation data. Methods based on aggregating the statistics of many single loci into an allele frequency spectrum (AFS) have proven powerful, but such methods ignore potentially informative patterns of linkage disequilibrium (LD) between neighboring loci. To leverage such patterns, we developed a composite-likelihood framework for inferring demographic history from aggregated statistics of pairs of loci. Using this framework, we show that two-locus statistics are more sensitive to demographic history than single-locus statistics such as the AFS. In particular, two-locus statistics escape the notorious confounding of depth and duration of a bottleneck, and they provide a means to estimate effective population size based on the recombination rather than mutation rate. We applied our approach to a Zambian population of Drosophila melanogaster. Notably, using both single- and two-locus statistics, we inferred a substantially lower ancestral effective population size than previous works and did not infer a bottleneck history. Together, our results demonstrate the broad potential for two-locus statistics to enable powerful population genetic inference. Copyright © 2017 by the Genetics Society of America.

  19. Powerful Statistical Inference for Nested Data Using Sufficient Summary Statistics

    Science.gov (United States)

    Dowding, Irene; Haufe, Stefan

    2018-01-01

    Hierarchically-organized data arise naturally in many psychology and neuroscience studies. As the standard assumption of independent and identically distributed samples does not hold for such data, two important problems are to accurately estimate group-level effect sizes, and to obtain powerful statistical tests against group-level null hypotheses. A common approach is to summarize subject-level data by a single quantity per subject, which is often the mean or the difference between class means, and treat these as samples in a group-level t-test. This “naive” approach is, however, suboptimal in terms of statistical power, as it ignores information about the intra-subject variance. To address this issue, we review several approaches to deal with nested data, with a focus on methods that are easy to implement. With what we call the sufficient-summary-statistic approach, we highlight a computationally efficient technique that can improve statistical power by taking into account within-subject variances, and we provide step-by-step instructions on how to apply this approach to a number of frequently-used measures of effect size. The properties of the reviewed approaches and the potential benefits over a group-level t-test are quantitatively assessed on simulated data and demonstrated on EEG data from a simulated-driving experiment. PMID:29615885
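
    A hedged sketch of the general idea, weighting each subject by the precision of their subject-level estimate; this is a generic inverse-variance scheme in the spirit of the approach described, not necessarily the exact statistic derived in the paper:

```python
# Generic sketch: group-level test of a subject-level effect that weights each subject
# by the inverse of their within-subject variance, instead of the "naive" unweighted
# t-test on per-subject summary values.
import numpy as np
from scipy import stats

def weighted_group_z(effects, variances):
    """Inverse-variance-weighted group effect and a z-test against zero."""
    effects = np.asarray(effects, dtype=float)
    w = 1.0 / np.asarray(variances, dtype=float)   # precision weights
    pooled = np.sum(w * effects) / np.sum(w)       # weighted group-level effect
    se = np.sqrt(1.0 / np.sum(w))                  # its standard error
    z = pooled / se
    p = 2 * stats.norm.sf(abs(z))                  # two-sided p-value
    return pooled, z, p

# Toy usage: per-subject effect estimates and their estimated within-subject variances
effects   = [0.8, 1.1, 0.3, 0.9, 1.4, 0.2]
variances = [0.10, 0.05, 0.40, 0.08, 0.06, 0.50]
print(weighted_group_z(effects, variances))
```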

  20. Fisher information and statistical inference for phase-type distributions

    DEFF Research Database (Denmark)

    Bladt, Mogens; Esparza, Luz Judith R; Nielsen, Bo Friis

    2011-01-01

    This paper is concerned with statistical inference for both continuous and discrete phase-type distributions. We consider maximum likelihood estimation, where traditionally the expectation-maximization (EM) algorithm has been employed. Certain numerical aspects of this method are revised and we...

  1. Statistical inference for a class of multivariate negative binomial distributions

    DEFF Research Database (Denmark)

    Rubak, Ege Holger; Møller, Jesper; McCullagh, Peter

    This paper considers statistical inference procedures for a class of models for positively correlated count variables called α-permanental random fields, and which can be viewed as a family of multivariate negative binomial distributions. Their appealing probabilistic properties have earlier been...

  2. Practical Statistics for LHC Physicists: Bayesian Inference (3/3)

    CERN Multimedia

    CERN. Geneva

    2015-01-01

    These lectures cover those principles and practices of statistics that are most relevant for work at the LHC. The first lecture discusses the basic ideas of descriptive statistics, probability and likelihood. The second lecture covers the key ideas in the frequentist approach, including confidence limits, profile likelihoods, p-values, and hypothesis testing. The third lecture covers inference in the Bayesian approach. Throughout, real-world examples will be used to illustrate the practical application of the ideas. No previous knowledge is assumed.

  3. Practical Statistics for LHC Physicists: Frequentist Inference (2/3)

    CERN Multimedia

    CERN. Geneva

    2015-01-01

    These lectures cover those principles and practices of statistics that are most relevant for work at the LHC. The first lecture discusses the basic ideas of descriptive statistics, probability and likelihood. The second lecture covers the key ideas in the frequentist approach, including confidence limits, profile likelihoods, p-values, and hypothesis testing. The third lecture covers inference in the Bayesian approach. Throughout, real-world examples will be used to illustrate the practical application of the ideas. No previous knowledge is assumed.

  4. Statistical Inference for a Class of Multivariate Negative Binomial Distributions

    DEFF Research Database (Denmark)

    Rubak, Ege H.; Møller, Jesper; McCullagh, Peter

This paper considers statistical inference procedures for a class of models for positively correlated count variables called α-permanental random fields, and which can be viewed as a family of multivariate negative binomial distributions. Their appealing probabilistic properties have earlier been studied in the literature, while this is the first statistical paper on α-permanental random fields. The focus is on maximum likelihood estimation, maximum quasi-likelihood estimation and on maximum composite likelihood estimation based on uni- and bivariate distributions. Furthermore, new results...

  5. Statistical detection of EEG synchrony using empirical bayesian inference.

    Directory of Open Access Journals (Sweden)

    Archana K Singh

Full Text Available There is growing interest in understanding how the brain utilizes synchronized oscillatory activity to integrate information across functionally connected regions. Computing phase-locking values (PLV) between EEG signals is a popular method for quantifying such synchronizations and elucidating their role in cognitive tasks. However, high-dimensionality in PLV data incurs a serious multiple testing problem. Standard multiple testing methods in neuroimaging research (e.g., false discovery rate, FDR) suffer severe loss of power, because they fail to exploit complex dependence structure between hypotheses that vary in spectral, temporal and spatial dimension. Previously, we showed that a hierarchical FDR and optimal discovery procedures could be effectively applied for PLV analysis to provide better power than FDR. In this article, we revisit the multiple comparison problem from a new Empirical Bayes perspective and propose the application of the local FDR method (locFDR; Efron, 2001) for PLV synchrony analysis to compute FDR as a posterior probability that an observed statistic belongs to a null hypothesis. We demonstrate the application of Efron's Empirical Bayes approach for PLV synchrony analysis for the first time. We use simulations to validate the specificity and sensitivity of locFDR and a real EEG dataset from a visual search study for experimental validation. We also compare locFDR with hierarchical FDR and optimal discovery procedures in both simulation and experimental analyses. Our simulation results showed that the locFDR can effectively control false positives without compromising on the power of PLV synchrony inference. Our results from the application of locFDR to experimental data detected more significant discoveries than our previously proposed methods, whereas the standard FDR method failed to detect any significant discoveries.

  6. Statistical detection of EEG synchrony using empirical bayesian inference.

    Science.gov (United States)

    Singh, Archana K; Asoh, Hideki; Takeda, Yuji; Phillips, Steven

    2015-01-01

There is growing interest in understanding how the brain utilizes synchronized oscillatory activity to integrate information across functionally connected regions. Computing phase-locking values (PLV) between EEG signals is a popular method for quantifying such synchronizations and elucidating their role in cognitive tasks. However, high-dimensionality in PLV data incurs a serious multiple testing problem. Standard multiple testing methods in neuroimaging research (e.g., false discovery rate, FDR) suffer severe loss of power, because they fail to exploit complex dependence structure between hypotheses that vary in spectral, temporal and spatial dimension. Previously, we showed that a hierarchical FDR and optimal discovery procedures could be effectively applied for PLV analysis to provide better power than FDR. In this article, we revisit the multiple comparison problem from a new Empirical Bayes perspective and propose the application of the local FDR method (locFDR; Efron, 2001) for PLV synchrony analysis to compute FDR as a posterior probability that an observed statistic belongs to a null hypothesis. We demonstrate the application of Efron's Empirical Bayes approach for PLV synchrony analysis for the first time. We use simulations to validate the specificity and sensitivity of locFDR and a real EEG dataset from a visual search study for experimental validation. We also compare locFDR with hierarchical FDR and optimal discovery procedures in both simulation and experimental analyses. Our simulation results showed that the locFDR can effectively control false positives without compromising on the power of PLV synchrony inference. Our results from the application of locFDR to experimental data detected more significant discoveries than our previously proposed methods, whereas the standard FDR method failed to detect any significant discoveries.
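
    The two-groups logic behind locFDR is compact enough to sketch. The snippet below is a bare-bones illustration on simulated z-values (hypothetical PLV statistics transformed to z-scores), assuming a theoretical N(0, 1) null and a null proportion of 1; Efron's actual locfdr procedure additionally estimates an empirical null and the null proportion from the data.

```python
# A bare-bones sketch of a two-groups local FDR computation on z-values.
# Assumptions: theoretical N(0, 1) null density and pi0 = 1 (conservative).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# 95% null z-scores, 5% signal shifted to the right (simulated data).
z = np.concatenate([rng.normal(0, 1, 1900), rng.normal(3, 1, 100)])

# Mixture density f(z) via a kernel density estimate; null density f0(z).
f_hat = stats.gaussian_kde(z)(z)
f0 = stats.norm.pdf(z)

loc_fdr = np.clip(f0 / f_hat, 0, 1)   # approximate posterior P(null | z)

discoveries = loc_fdr < 0.2           # Efron's conventional reporting threshold
print(f"{discoveries.sum()} discoveries at locFDR < 0.2")
```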

  7. Targeted estimation of nuisance parameters to obtain valid statistical inference.

    Science.gov (United States)

    van der Laan, Mark J

    2014-01-01

    In order to obtain concrete results, we focus on estimation of the treatment specific mean, controlling for all measured baseline covariates, based on observing independent and identically distributed copies of a random variable consisting of baseline covariates, a subsequently assigned binary treatment, and a final outcome. The statistical model only assumes possible restrictions on the conditional distribution of treatment, given the covariates, the so-called propensity score. Estimators of the treatment specific mean involve estimation of the propensity score and/or estimation of the conditional mean of the outcome, given the treatment and covariates. In order to make these estimators asymptotically unbiased at any data distribution in the statistical model, it is essential to use data-adaptive estimators of these nuisance parameters such as ensemble learning, and specifically super-learning. Because such estimators involve optimal trade-off of bias and variance w.r.t. the infinite dimensional nuisance parameter itself, they result in a sub-optimal bias/variance trade-off for the resulting real-valued estimator of the estimand. We demonstrate that additional targeting of the estimators of these nuisance parameters guarantees that this bias for the estimand is second order and thereby allows us to prove theorems that establish asymptotic linearity of the estimator of the treatment specific mean under regularity conditions. These insights result in novel targeted minimum loss-based estimators (TMLEs) that use ensemble learning with additional targeted bias reduction to construct estimators of the nuisance parameters. In particular, we construct collaborative TMLEs (C-TMLEs) with known influence curve allowing for statistical inference, even though these C-TMLEs involve variable selection for the propensity score based on a criterion that measures how effective the resulting fit of the propensity score is in removing bias for the estimand. As a particular special
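
    As a rough illustration of the estimation problem described above (not the targeted C-TMLE machinery itself), the sketch below computes a doubly robust (AIPW) estimate of the treatment-specific mean E[Y(1)] with an influence-curve-based confidence interval, using plain parametric fits for the two nuisance parameters on simulated data.

```python
# A simplified sketch of doubly robust estimation of E[Y(1)] with an
# influence-curve-based confidence interval. The paper above goes much
# further (targeted nuisance estimation, super-learning, C-TMLE); here both
# nuisance models are ordinary parametric fits and the data are simulated.
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(2)
n = 2000

# Simulated observational data: covariates W, binary treatment A, outcome Y.
W = rng.normal(size=(n, 3))
p_true = 1 / (1 + np.exp(-(0.4 * W[:, 0] - 0.3 * W[:, 1])))
A = rng.binomial(1, p_true)
Y = 1.0 + 0.5 * A + W @ np.array([0.3, -0.2, 0.1]) + rng.normal(size=n)

# Nuisance estimates: outcome regression Qbar(1, W) and propensity score g(W).
Qbar1 = LinearRegression().fit(W[A == 1], Y[A == 1]).predict(W)
g = LogisticRegression().fit(W, A).predict_proba(W)[:, 1]

# AIPW estimator and its estimated influence curve.
psi = np.mean(A / g * (Y - Qbar1) + Qbar1)
ic = A / g * (Y - Qbar1) + Qbar1 - psi
se = ic.std(ddof=1) / np.sqrt(n)
print(f"E[Y(1)] estimate: {psi:.3f} +/- {1.96 * se:.3f}  (true value is 1.5)")
```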

  8. Network inference via adaptive optimal design

    Directory of Open Access Journals (Sweden)

    Stigter Johannes D

    2012-09-01

Full Text Available Abstract Background Current research in network reverse engineering for genetic or metabolic networks very often does not include a proper experimental and/or input design. In this paper we address this issue in more detail and suggest a method that includes an iterative design of experiments based on the most recent data that become available. The presented approach allows a reliable reconstruction of the network and addresses an important issue, i.e., the analysis and the propagation of uncertainties as they exist in both the data and in our own knowledge. These two types of uncertainties have their immediate ramifications for the uncertainties in the parameter estimates and, hence, are taken into account from the very beginning of our experimental design. Findings The method is demonstrated for two small networks that include a genetic network for mRNA synthesis and degradation and an oscillatory network describing a molecular network underlying adenosine 3’-5’ cyclic monophosphate (cAMP) as observed in populations of Dictyostelium cells. In both cases a substantial reduction in parameter uncertainty was observed. Extension to larger scale networks is possible but needs a more rigorous parameter estimation algorithm that includes sparsity as a constraint in the optimization procedure. Conclusion We conclude that a careful experiment design very often (but not always) pays off in terms of reliability in the inferred network topology. For large scale networks a better parameter estimation algorithm is required that includes sparsity as an additional constraint. These algorithms are available in the literature and can also be used in an adaptive optimal design setting as demonstrated in this paper.
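
    A stripped-down version of the adaptive loop, greedy D-optimal selection of the next measurement followed by re-estimation, can be written for a toy one-gene degradation model y(t) = x0 * exp(-k*t). The model, noise level, and candidate time grid below are all illustrative assumptions, not the networks analysed in the paper.

```python
# A minimal sketch of adaptive D-optimal design for a toy degradation model.
# Each iteration picks the sampling time that maximizes the determinant of
# the accumulated Fisher information at the current estimate, then refits.
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(3)
true_x0, true_k, sigma = 10.0, 0.8, 0.3

def model(t, x0, k):
    return x0 * np.exp(-k * t)

def jacobian_row(t, x0, k):
    # Gradient of the model with respect to (x0, k).
    return np.array([np.exp(-k * t), -x0 * t * np.exp(-k * t)])

candidate_times = np.linspace(0.1, 8.0, 80)
times = [0.1, 4.0]                                   # two seed measurements
ys = list(model(np.array(times), true_x0, true_k) + rng.normal(0, sigma, size=2))
est = np.array([5.0, 0.5])                           # rough initial guess

for _ in range(8):
    # Accumulated Fisher information at the current estimate.
    J = np.array([jacobian_row(t, *est) for t in times])
    fim = J.T @ J / sigma**2

    # Greedy D-optimal choice of the next time point.
    def det_gain(t):
        j = jacobian_row(t, *est)
        return np.linalg.det(fim + np.outer(j, j) / sigma**2)

    t_next = max(candidate_times, key=det_gain)
    times.append(t_next)
    ys.append(model(t_next, true_x0, true_k) + rng.normal(0, sigma))

    # Re-estimate the parameters from all data collected so far.
    est, _ = curve_fit(model, np.array(times), np.array(ys), p0=est)

print("final estimates (x0, k):", np.round(est, 3), "true:", (true_x0, true_k))
```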

  9. Optimal causal inference: estimating stored information and approximating causal architecture.

    Science.gov (United States)

    Still, Susanne; Crutchfield, James P; Ellison, Christopher J

    2010-09-01

    We introduce an approach to inferring the causal architecture of stochastic dynamical systems that extends rate-distortion theory to use causal shielding--a natural principle of learning. We study two distinct cases of causal inference: optimal causal filtering and optimal causal estimation. Filtering corresponds to the ideal case in which the probability distribution of measurement sequences is known, giving a principled method to approximate a system's causal structure at a desired level of representation. We show that in the limit in which a model-complexity constraint is relaxed, filtering finds the exact causal architecture of a stochastic dynamical system, known as the causal-state partition. From this, one can estimate the amount of historical information the process stores. More generally, causal filtering finds a graded model-complexity hierarchy of approximations to the causal architecture. Abrupt changes in the hierarchy, as a function of approximation, capture distinct scales of structural organization. For nonideal cases with finite data, we show how the correct number of the underlying causal states can be found by optimal causal estimation. A previously derived model-complexity control term allows us to correct for the effect of statistical fluctuations in probability estimates and thereby avoid overfitting.

  10. Statistical inference for noisy nonlinear ecological dynamic systems.

    Science.gov (United States)

    Wood, Simon N

    2010-08-26

    Chaotic ecological dynamic systems defy conventional statistical analysis. Systems with near-chaotic dynamics are little better. Such systems are almost invariably driven by endogenous dynamic processes plus demographic and environmental process noise, and are only observable with error. Their sensitivity to history means that minute changes in the driving noise realization, or the system parameters, will cause drastic changes in the system trajectory. This sensitivity is inherited and amplified by the joint probability density of the observable data and the process noise, rendering it useless as the basis for obtaining measures of statistical fit. Because the joint density is the basis for the fit measures used by all conventional statistical methods, this is a major theoretical shortcoming. The inability to make well-founded statistical inferences about biological dynamic models in the chaotic and near-chaotic regimes, other than on an ad hoc basis, leaves dynamic theory without the methods of quantitative validation that are essential tools in the rest of biological science. Here I show that this impasse can be resolved in a simple and general manner, using a method that requires only the ability to simulate the observed data on a system from the dynamic model about which inferences are required. The raw data series are reduced to phase-insensitive summary statistics, quantifying local dynamic structure and the distribution of observations. Simulation is used to obtain the mean and the covariance matrix of the statistics, given model parameters, allowing the construction of a 'synthetic likelihood' that assesses model fit. This likelihood can be explored using a straightforward Markov chain Monte Carlo sampler, but one further post-processing step returns pure likelihood-based inference. I apply the method to establish the dynamic nature of the fluctuations in Nicholson's classic blowfly experiments.
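
    The synthetic-likelihood recipe itself is short: reduce each series to a few phase-insensitive summaries, estimate their mean and covariance by simulation at a candidate parameter, and score the observed summaries under a multivariate normal. The sketch below does this for a noisy Ricker map with a deliberately small summary set (Wood's analysis uses a much richer one).

```python
# A compact sketch of the synthetic-likelihood idea for a noisy Ricker map
# with Poisson observation error. The summary statistics here are a small,
# illustrative set, not the ones used in the original paper.
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(4)

def simulate_ricker(log_r, sigma_e=0.3, phi=10.0, n=100):
    N, y = 1.0, np.empty(n)
    for t in range(n):
        N = np.exp(log_r) * N * np.exp(-N + rng.normal(0, sigma_e))
        y[t] = rng.poisson(phi * N)
    return y

def summaries(y):
    # Phase-insensitive summaries: mean, sd, lag-1 autocorrelation, zero count.
    ac1 = np.corrcoef(y[:-1], y[1:])[0, 1]
    return np.array([y.mean(), y.std(), ac1, np.mean(y == 0)])

def synthetic_loglik(log_r, s_obs, n_rep=200):
    S = np.array([summaries(simulate_ricker(log_r)) for _ in range(n_rep)])
    mu, cov = S.mean(axis=0), np.cov(S, rowvar=False)
    return multivariate_normal(mu, cov, allow_singular=True).logpdf(s_obs)

y_obs = simulate_ricker(log_r=3.8)          # "observed" series
s_obs = summaries(y_obs)
for log_r in (3.0, 3.8, 4.5):
    print(f"log r = {log_r:.1f}: synthetic log-likelihood = "
          f"{synthetic_loglik(log_r, s_obs):.1f}")
```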

  11. High-dimensional statistical inference: From vector to matrix

    Science.gov (United States)

    Zhang, Anru

Statistical inference for sparse signals or low-rank matrices in high-dimensional settings is of significant interest in a range of contemporary applications. It has attracted significant recent attention in many fields including statistics, applied mathematics and electrical engineering. In this thesis, we consider several problems including sparse signal recovery (compressed sensing under restricted isometry) and low-rank matrix recovery (matrix recovery via rank-one projections and structured matrix completion). The first part of the thesis discusses compressed sensing and affine rank minimization in both noiseless and noisy cases and establishes sharp restricted isometry conditions for sparse signal and low-rank matrix recovery. The analysis relies on a key technical tool which represents points in a polytope by convex combinations of sparse vectors. The technique is elementary yet leads to sharp results. It is shown that, in compressed sensing, for any ε > 0, the conditions δ_k^A < 1/3 + ε, δ_k^A + θ_{k,k}^A < 1 + ε, or δ_{tk}^A < √((t − 1)/t) + ε are not sufficient to guarantee the exact recovery of all k-sparse signals for large k. A similar result also holds for matrix recovery. In addition, the conditions δ_k^A < 1/3, δ_k^A + θ_{k,k}^A < 1, δ_{tk}^A < √((t − 1)/t) and δ_r^M < 1/3, δ_r^M + θ_{r,r}^M < 1, δ_{tr}^M < √((t − 1)/t) are also shown to be sufficient, respectively, for stable recovery of approximately sparse signals and low-rank matrices in the noisy case. For the second part of the thesis, we introduce a rank-one projection model for low-rank matrix recovery and propose a constrained nuclear norm minimization method for stable recovery of low-rank matrices in the noisy case. The procedure is adaptive to the rank and robust against small perturbations. Both upper and lower bounds for the estimation accuracy under the Frobenius norm loss are obtained. The proposed estimator is shown to be rate-optimal under certain conditions. The
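
    For readers who want to see the recovery problem that these restricted-isometry conditions certify, the small example below solves the noiseless basis-pursuit program min ||x||_1 subject to Ax = b as a linear program; the Gaussian sensing matrix and problem sizes are arbitrary illustrative choices.

```python
# A small illustration of exact sparse recovery by l1 minimization (basis
# pursuit). The LP formulation splits x into its positive and negative parts.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(5)
n, p, k = 40, 100, 5                      # measurements, dimension, sparsity

A = rng.normal(size=(n, p)) / np.sqrt(n)  # Gaussian sensing matrix
x_true = np.zeros(p)
x_true[rng.choice(p, k, replace=False)] = rng.normal(size=k)
b = A @ x_true

# min ||x||_1 s.t. Ax = b  <=>  min 1'(u + v) s.t. A(u - v) = b, u, v >= 0.
c = np.ones(2 * p)
A_eq = np.hstack([A, -A])
res = linprog(c, A_eq=A_eq, b_eq=b, bounds=(0, None), method="highs")
x_hat = res.x[:p] - res.x[p:]

print("recovery error:", np.linalg.norm(x_hat - x_true))
```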

  12. Statistical Inference for Porous Materials using Persistent Homology.

    Energy Technology Data Exchange (ETDEWEB)

    Moon, Chul [Univ. of Georgia, Athens, GA (United States); Heath, Jason E. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Mitchell, Scott A. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2017-12-01

We propose a porous materials analysis pipeline using persistent homology. We first compute persistent homology of binarized 3D images of sampled material subvolumes. For each image we compute sets of homology intervals, which are represented as summary graphics called persistence diagrams. We convert persistence diagrams into image vectors in order to analyze the similarity of the homology of the material images using the mature tools for image analysis. Each image is treated as a vector and we compute its principal components to extract features. We fit a statistical model using the loadings of principal components to estimate material porosity, permeability, anisotropy, and tortuosity. We also propose an adaptive version of the structural similarity index (SSIM), a similarity metric for images, as a measure to determine the statistical representative elementary volumes (sREV) for persistent homology. Thus we provide a capability for making a statistical inference of the fluid flow and transport properties of porous materials based on their geometry and connectivity.

  13. A model independent safeguard against background mismodeling for statistical inference

    Energy Technology Data Exchange (ETDEWEB)

    Priel, Nadav; Landsman, Hagar; Manfredini, Alessandro; Budnik, Ranny [Department of Particle Physics and Astrophysics, Weizmann Institute of Science, Herzl St. 234, Rehovot (Israel); Rauch, Ludwig, E-mail: nadav.priel@weizmann.ac.il, E-mail: rauch@mpi-hd.mpg.de, E-mail: hagar.landsman@weizmann.ac.il, E-mail: alessandro.manfredini@weizmann.ac.il, E-mail: ran.budnik@weizmann.ac.il [Teilchen- und Astroteilchenphysik, Max-Planck-Institut für Kernphysik, Saupfercheckweg 1, 69117 Heidelberg (Germany)

    2017-05-01

    We propose a safeguard procedure for statistical inference that provides universal protection against mismodeling of the background. The method quantifies and incorporates the signal-like residuals of the background model into the likelihood function, using information available in a calibration dataset. This prevents possible false discovery claims that may arise through unknown mismodeling, and corrects the bias in limit setting created by overestimated or underestimated background. We demonstrate how the method removes the bias created by an incomplete background model using three realistic case studies.

  14. Statistical inference to advance network models in epidemiology.

    Science.gov (United States)

    Welch, David; Bansal, Shweta; Hunter, David R

    2011-03-01

    Contact networks are playing an increasingly important role in the study of epidemiology. Most of the existing work in this area has focused on considering the effect of underlying network structure on epidemic dynamics by using tools from probability theory and computer simulation. This work has provided much insight on the role that heterogeneity in host contact patterns plays on infectious disease dynamics. Despite the important understanding afforded by the probability and simulation paradigm, this approach does not directly address important questions about the structure of contact networks such as what is the best network model for a particular mode of disease transmission, how parameter values of a given model should be estimated, or how precisely the data allow us to estimate these parameter values. We argue that these questions are best answered within a statistical framework and discuss the role of statistical inference in estimating contact networks from epidemiological data. Copyright © 2011 Elsevier B.V. All rights reserved.

  15. Statistical inference involving binomial and negative binomial parameters.

    Science.gov (United States)

    García-Pérez, Miguel A; Núñez-Antón, Vicente

    2009-05-01

    Statistical inference about two binomial parameters implies that they are both estimated by binomial sampling. There are occasions in which one aims at testing the equality of two binomial parameters before and after the occurrence of the first success along a sequence of Bernoulli trials. In these cases, the binomial parameter before the first success is estimated by negative binomial sampling whereas that after the first success is estimated by binomial sampling, and both estimates are related. This paper derives statistical tools to test two hypotheses, namely, that both binomial parameters equal some specified value and that both parameters are equal though unknown. Simulation studies are used to show that in small samples both tests are accurate in keeping the nominal Type-I error rates, and also to determine sample size requirements to detect large, medium, and small effects with adequate power. Additional simulations also show that the tests are sufficiently robust to certain violations of their assumptions.

  16. Statistical Models for Inferring Vegetation Composition from Fossil Pollen

    Science.gov (United States)

    Paciorek, C.; McLachlan, J. S.; Shang, Z.

    2011-12-01

Fossil pollen provide information about vegetation composition that can be used to help understand how vegetation has changed over the past. However, these data have not traditionally been analyzed in a way that allows for statistical inference about spatio-temporal patterns and trends. We build a Bayesian hierarchical model called STEPPS (Spatio-Temporal Empirical Prediction from Pollen in Sediments) that predicts forest composition in southern New England, USA, over the last two millennia based on fossil pollen. The critical relationships between abundances of tree taxa in the pollen record and abundances in actual vegetation are estimated using modern (Forest Inventory Analysis) data and (witness tree) data from colonial records. This gives us two time points at which both pollen and direct vegetation data are available. Based on these relationships, and incorporating our uncertainty about them, we predict forest composition using fossil pollen. We estimate the spatial distribution and relative abundances of tree species and draw inference about how these patterns have changed over time. Finally, we describe ongoing work to extend the modeling to the upper Midwest of the U.S., including an approach to infer tree density and thereby estimate the prairie-forest boundary in Minnesota and Wisconsin. This work is part of the PalEON project, which brings together a team of ecosystem modelers, paleoecologists, and statisticians with the goal of reconstructing vegetation responses to climate during the last two millennia in the northeastern and midwestern United States. The estimates from the statistical modeling will be used to assess and calibrate ecosystem models that are used to project ecological changes in response to global change.

  17. The Heuristic Value of p in Inductive Statistical Inference

    Directory of Open Access Journals (Sweden)

    Joachim I. Krueger

    2017-06-01

Full Text Available Many statistical methods yield the probability of the observed data – or data more extreme – under the assumption that a particular hypothesis is true. This probability is commonly known as ‘the’ p-value. (Null Hypothesis) Significance Testing ([NH]ST) is the most prominent of these methods. The p-value has been subjected to much speculation, analysis, and criticism. We explore how well the p-value predicts what researchers presumably seek: the probability of the hypothesis being true given the evidence, and the probability of reproducing significant results. We also explore the effect of sample size on inferential accuracy, bias, and error. In a series of simulation experiments, we find that the p-value performs quite well as a heuristic cue in inductive inference, although there are identifiable limits to its usefulness. We conclude that despite its general usefulness, the p-value cannot bear the full burden of inductive inference; it is but one of several heuristic cues available to the data analyst. Depending on the inferential challenge at hand, investigators may supplement their reports with effect size estimates, Bayes factors, or other suitable statistics, to communicate what they think the data say.

  18. The Heuristic Value of p in Inductive Statistical Inference.

    Science.gov (United States)

    Krueger, Joachim I; Heck, Patrick R

    2017-01-01

Many statistical methods yield the probability of the observed data - or data more extreme - under the assumption that a particular hypothesis is true. This probability is commonly known as 'the' p-value. (Null Hypothesis) Significance Testing ([NH]ST) is the most prominent of these methods. The p-value has been subjected to much speculation, analysis, and criticism. We explore how well the p-value predicts what researchers presumably seek: the probability of the hypothesis being true given the evidence, and the probability of reproducing significant results. We also explore the effect of sample size on inferential accuracy, bias, and error. In a series of simulation experiments, we find that the p-value performs quite well as a heuristic cue in inductive inference, although there are identifiable limits to its usefulness. We conclude that despite its general usefulness, the p-value cannot bear the full burden of inductive inference; it is but one of several heuristic cues available to the data analyst. Depending on the inferential challenge at hand, investigators may supplement their reports with effect size estimates, Bayes factors, or other suitable statistics, to communicate what they think the data say.
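
    A miniature version of this kind of simulation experiment is shown below: it tracks how often the studied effect is actually real among results reaching p < .05, as a function of the prior probability that a real effect is present. The per-group sample size, true effect size, and priors are illustrative assumptions, not the settings used by the authors.

```python
# A toy simulation in the spirit of the article above: the positive predictive
# value of a significant result depends strongly on the base rate of real
# effects. All numbers are illustrative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
n_per_group, d_true, n_studies = 30, 0.5, 200_000

for prior_real in (0.1, 0.5, 0.9):
    effect_real = rng.random(n_studies) < prior_real
    d = np.where(effect_real, d_true, 0.0)
    # Two-sample z approximation to the observed standardized effect.
    d_hat = rng.normal(d, np.sqrt(2 / n_per_group))
    z = d_hat / np.sqrt(2 / n_per_group)
    p = 2 * stats.norm.sf(np.abs(z))
    sig = p < 0.05
    ppv = np.mean(effect_real[sig])
    print(f"P(effect real) = {prior_real:.1f}:  P(real | p < .05) = {ppv:.2f}")
```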

  19. Statistics for nuclear engineers and scientists. Part 1. Basic statistical inference

    Energy Technology Data Exchange (ETDEWEB)

    Beggs, W.J.

    1981-02-01

    This report is intended for the use of engineers and scientists working in the nuclear industry, especially at the Bettis Atomic Power Laboratory. It serves as the basis for several Bettis in-house statistics courses. The objectives of the report are to introduce the reader to the language and concepts of statistics and to provide a basic set of techniques to apply to problems of the collection and analysis of data. Part 1 covers subjects of basic inference. The subjects include: descriptive statistics; probability; simple inference for normally distributed populations, and for non-normal populations as well; comparison of two populations; the analysis of variance; quality control procedures; and linear regression analysis.

  20. Massive optimal data compression and density estimation for scalable, likelihood-free inference in cosmology

    Science.gov (United States)

    Alsing, Justin; Wandelt, Benjamin; Feeney, Stephen

    2018-03-01

Many statistical models in cosmology can be simulated forwards but have intractable likelihood functions. Likelihood-free inference methods allow us to perform Bayesian inference from these models using only forward simulations, free from any likelihood assumptions or approximations. Likelihood-free inference generically involves simulating mock data and comparing to the observed data; this comparison in data-space suffers from the curse of dimensionality and requires compression of the data to a small number of summary statistics to be tractable. In this paper we use massive asymptotically-optimal data compression to reduce the dimensionality of the data-space to just one number per parameter, providing a natural and optimal framework for summary statistic choice for likelihood-free inference. Secondly, we present the first cosmological application of Density Estimation Likelihood-Free Inference (DELFI), which learns a parameterized model for joint distribution of data and parameters, yielding both the parameter posterior and the model evidence. This approach is conceptually simple, requires less tuning than traditional Approximate Bayesian Computation approaches to likelihood-free inference and can give high-fidelity posteriors from orders of magnitude fewer forward simulations. As an additional bonus, it enables parameter inference and Bayesian model comparison simultaneously. We demonstrate Density Estimation Likelihood-Free Inference with massive data compression on an analysis of the joint light-curve analysis supernova data, as a simple validation case study. We show that high-fidelity posterior inference is possible for full-scale cosmological data analyses with as few as ~10^4 simulations, with substantial scope for further improvement, demonstrating the scalability of likelihood-free inference to large and complex cosmological datasets.
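
    The flavour of the two ingredients, compressing the data to one number per parameter and then doing inference from simulations alone, can be conveyed with a deliberately simple example: score compression for a Gaussian mean followed by rejection ABC (rather than the density-estimation step used in DELFI). All numbers below are made up.

```python
# A toy version of compression plus likelihood-free inference: compress the
# data to the score of a Gaussian approximation at a fiducial parameter (one
# number per parameter), then run simple rejection ABC on that summary.
import numpy as np

rng = np.random.default_rng(7)

# "Observed" data: 500 noisy measurements with unknown mean mu (true mu = 1.2).
sigma, n = 2.0, 500
d_obs = rng.normal(1.2, sigma, n)

def compress(d, mu_fid=0.0):
    # Score of the Gaussian log-likelihood at a fiducial mu: asymptotically
    # optimal compression for this one-parameter toy model.
    return np.sum(d - mu_fid) / sigma**2

t_obs = compress(d_obs)

# Rejection ABC on the compressed statistic, with a crude tolerance.
n_sims, eps = 50_000, 5.0
mu_prior = rng.uniform(-5, 5, n_sims)
t_sim = np.array([compress(rng.normal(mu, sigma, n)) for mu in mu_prior])
accepted = mu_prior[np.abs(t_sim - t_obs) < eps]

print(f"posterior mean ~ {accepted.mean():.2f}, 68% interval ~ "
      f"[{np.percentile(accepted, 16):.2f}, {np.percentile(accepted, 84):.2f}]")
```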

  1. Terminal-Dependent Statistical Inference for the FBSDEs Models

    Directory of Open Access Journals (Sweden)

    Yunquan Song

    2014-01-01

Full Text Available The original stochastic differential equations (OSDEs) and forward-backward stochastic differential equations (FBSDEs) are often used to model complex dynamic processes that arise in financial, ecological, and many other areas. The main difference between OSDEs and FBSDEs is that the latter is designed to depend on a terminal condition, which is a key factor in some financial and ecological circumstances. It is interesting but challenging to estimate FBSDE parameters from noisy data and the terminal condition. However, to the best of our knowledge, the terminal-dependent statistical inference for such a model has not been explored in the existing literature. We propose a nonparametric terminal control variables estimation method to address this problem. The reason why we use the terminal control variables is that the newly proposed inference procedures inherit the terminal-dependent characteristic. Through this newly proposed method, the estimators of the functional coefficients of the FBSDEs model are obtained. The asymptotic properties of the estimators are also discussed. Simulation studies show that the proposed method gives satisfying estimates for the FBSDE parameters from noisy data and the terminal condition. A simulation is performed to test the feasibility of our method.

  2. Statistical inference for imperfect maintenance models with missing data

    International Nuclear Information System (INIS)

    Dijoux, Yann; Fouladirad, Mitra; Nguyen, Dinh Tuan

    2016-01-01

    The paper considers complex industrial systems with incomplete maintenance history. A corrective maintenance is performed after the occurrence of a failure and its efficiency is assumed to be imperfect. In maintenance analysis, the databases are not necessarily complete. Specifically, the observations are assumed to be window-censored. This situation arises relatively frequently after the purchase of a second-hand unit or in the absence of maintenance record during the burn-in phase. The joint assessment of the wear-out of the system and the maintenance efficiency is investigated under missing data. A review along with extensions of statistical inference procedures from an observation window are proposed in the case of perfect and minimal repair using the renewal and Poisson theories, respectively. Virtual age models are employed to model imperfect repair. In this framework, new estimation procedures are developed. In particular, maximum likelihood estimation methods are derived for the most classical virtual age models. The benefits of the new estimation procedures are highlighted by numerical simulations and an application to a real data set. - Highlights: • New estimation procedures for window-censored observations and imperfect repair. • Extensions of inference methods for perfect and minimal repair with missing data. • Overview of maximum likelihood method with complete and incomplete observations. • Benefits of the new procedures highlighted by simulation studies and real application.
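
    For orientation, the classical special case of minimal repair with completely observed failure times already has closed-form maximum likelihood estimators; the sketch below simulates and fits a power-law nonhomogeneous Poisson process observed on a window (0, T]. The imperfect-repair virtual age models and window-censored histories treated in the paper require the more elaborate procedures it develops.

```python
# A minimal sketch of MLE for a minimal-repair model: a power-law NHPP with
# intensity lambda(t) = (beta/eta) * (t/eta)**(beta - 1), complete data on (0, T].
# The true parameters and observation window below are illustrative.
import numpy as np

rng = np.random.default_rng(8)
beta_true, eta_true, T = 2.2, 5.0, 50.0

# Simulate failure times: given N(T), they are order statistics of draws with
# CDF Lambda(t) / Lambda(T), inverted in closed form for the power law.
n_exp = (T / eta_true) ** beta_true                 # expected number of failures
n = rng.poisson(n_exp)
u = np.sort(rng.random(n))
t = eta_true * (u * n_exp) ** (1 / beta_true)

# Closed-form MLEs for the power-law process observed on (0, T].
beta_hat = n / np.sum(np.log(T / t))
eta_hat = T / n ** (1 / beta_hat)

print(f"beta: true {beta_true}, MLE {beta_hat:.2f}; eta: true {eta_true}, MLE {eta_hat:.2f}")
```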

  3. Statistical inference using weak chaos and infinite memory

    International Nuclear Information System (INIS)

    Welling, Max; Chen Yutian

    2010-01-01

    We describe a class of deterministic weakly chaotic dynamical systems with infinite memory. These 'herding systems' combine learning and inference into one algorithm, where moments or data-items are converted directly into an arbitrarily long sequence of pseudo-samples. This sequence has infinite range correlations and as such is highly structured. We show that its information content, as measured by sub-extensive entropy, can grow as fast as K log T, which is faster than the usual 1/2 K log T for exchangeable sequences generated by random posterior sampling from a Bayesian model. In one dimension we prove that herding sequences are equivalent to Sturmian sequences which have complexity exactly log(T + 1). More generally, we advocate the application of the rich theoretical framework around nonlinear dynamical systems, chaos theory and fractal geometry to statistical learning.
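
    The herding update for a single moment is only a few lines: a deterministic weight accumulates the discrepancy between the target moment and the pseudo-samples emitted so far, and each pseudo-sample is chosen greedily. The target mean below is an arbitrary illustrative value.

```python
# A tiny sketch of herding a single moment: a deterministic, weakly chaotic
# update that converts a target mean into an infinite, highly structured
# sequence of pseudo-samples whose running average converges at rate O(1/T).
import numpy as np

mu = 0.3                      # target moment: desired mean of a binary sequence
w, samples = 0.0, []
for t in range(1000):
    s = 1 if w > 0 else 0     # greedy choice of the next pseudo-sample
    samples.append(s)
    w += mu - s               # weight update keeps the accumulated error bounded

samples = np.array(samples)
print("empirical mean after 1000 steps:", samples.mean())   # close to 0.3
```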

  4. Statistical inference using weak chaos and infinite memory

    Energy Technology Data Exchange (ETDEWEB)

    Welling, Max; Chen Yutian, E-mail: welling@ics.uci.ed, E-mail: yutian.chen@uci.ed [Donald Bren School of Information and Computer Science, University of California Irvine CA 92697-3425 (United States)

    2010-06-01

    We describe a class of deterministic weakly chaotic dynamical systems with infinite memory. These 'herding systems' combine learning and inference into one algorithm, where moments or data-items are converted directly into an arbitrarily long sequence of pseudo-samples. This sequence has infinite range correlations and as such is highly structured. We show that its information content, as measured by sub-extensive entropy, can grow as fast as K log T, which is faster than the usual 1/2 K log T for exchangeable sequences generated by random posterior sampling from a Bayesian model. In one dimension we prove that herding sequences are equivalent to Sturmian sequences which have complexity exactly log(T + 1). More generally, we advocate the application of the rich theoretical framework around nonlinear dynamical systems, chaos theory and fractal geometry to statistical learning.

  5. Multiple Illuminant Colour Estimation via Statistical Inference on Factor Graphs.

    Science.gov (United States)

    Mutimbu, Lawrence; Robles-Kelly, Antonio

    2016-08-31

This paper presents a method to recover a spatially varying illuminant colour estimate from scenes lit by multiple light sources. Starting with the image formation process, we formulate the illuminant recovery problem in a statistically data-driven setting. To do this, we use a factor graph defined across the scale space of the input image. In the graph, we utilise a set of illuminant prototypes computed using a data-driven approach. As a result, our method delivers a pixelwise illuminant colour estimate that is devoid of libraries or user input. The use of a factor graph also allows for the illuminant estimates to be recovered making use of a maximum a posteriori (MAP) inference process. Moreover, we compute the probability marginals by performing a Delaunay triangulation on our factor graph. We illustrate the utility of our method for pixelwise illuminant colour recovery on widely available datasets and compare against a number of alternatives. We also show sample colour correction results on real-world images.

  6. Recent Advances in System Reliability Signatures, Multi-state Systems and Statistical Inference

    CERN Document Server

    Frenkel, Ilia

    2012-01-01

    Recent Advances in System Reliability discusses developments in modern reliability theory such as signatures, multi-state systems and statistical inference. It describes the latest achievements in these fields, and covers the application of these achievements to reliability engineering practice. The chapters cover a wide range of new theoretical subjects and have been written by leading experts in reliability theory and its applications.  The topics include: concepts and different definitions of signatures (D-spectra),  their  properties and applications  to  reliability of coherent systems and network-type structures; Lz-transform of Markov stochastic process and its application to multi-state system reliability analysis; methods for cost-reliability and cost-availability analysis of multi-state systems; optimal replacement and protection strategy; and statistical inference. Recent Advances in System Reliability presents many examples to illustrate the theoretical results. Real world multi-state systems...

  7. Inference

    DEFF Research Database (Denmark)

    Møller, Jesper

    2010-01-01

Chapter 9: This contribution concerns statistical inference for parametric models used in stochastic geometry and based on quick and simple simulation-free procedures as well as more comprehensive methods based on a maximum likelihood or Bayesian approach combined with Markov chain Monte Carlo (MCMC) techniques. Due to space limitations the focus is on spatial point processes.

  8. Multivariate Statistical Inference of Lightning Occurrence, and Using Lightning Observations

    Science.gov (United States)

    Boccippio, Dennis

    2004-01-01

Two classes of multivariate statistical inference using TRMM Lightning Imaging Sensor, Precipitation Radar, and Microwave Imager observations are studied, using nonlinear classification neural networks as inferential tools. The very large and globally representative data sample provided by TRMM allows both training and validation (without overfitting) of neural networks with many degrees of freedom. In the first study, the flashing/non-flashing condition of storm complexes is diagnosed using radar, passive microwave and/or environmental observations as neural network inputs. The diagnostic skill of these simple lightning/no-lightning classifiers can be quite high, over land (above 80% Probability of Detection; below 20% False Alarm Rate). In the second, passive microwave and lightning observations are used to diagnose radar reflectivity vertical structure. A priori diagnosis of hydrometeor vertical structure is highly important for improved rainfall retrieval from either orbital radars (e.g., the future Global Precipitation Mission "mothership") or radiometers (e.g., operational SSM/I and future Global Precipitation Mission passive microwave constellation platforms); here we explore the incremental benefit to such diagnosis provided by lightning observations.

  9. Optimal Design and Related Areas in Optimization and Statistics

    CERN Document Server

    Pronzato, Luc

    2009-01-01

    This edited volume, dedicated to Henry P. Wynn, reflects his broad range of research interests, focusing in particular on the applications of optimal design theory in optimization and statistics. It covers algorithms for constructing optimal experimental designs, general gradient-type algorithms for convex optimization, majorization and stochastic ordering, algebraic statistics, Bayesian networks and nonlinear regression. Written by leading specialists in the field, each chapter contains a survey of the existing literature along with substantial new material. This work will appeal to both the

  10. Inference

    DEFF Research Database (Denmark)

    Møller, Jesper

(This text, written by Jesper Møller, Aalborg University, is submitted for the collection ‘Stochastic Geometry: Highlights, Interactions and New Perspectives', edited by Wilfrid S. Kendall and Ilya Molchanov, to be published by Clarendon Press, Oxford, and planned to appear as Section 4.1 with the title ‘Inference'.) This contribution concerns statistical inference for parametric models used in stochastic geometry and based on quick and simple simulation-free procedures as well as more comprehensive methods using Markov chain Monte Carlo (MCMC) simulations. Due to space limitations the focus...

  11. Distinguishing between statistical significance and practical/clinical meaningfulness using statistical inference.

    Science.gov (United States)

    Wilkinson, Michael

    2014-03-01

    Decisions about support for predictions of theories in light of data are made using statistical inference. The dominant approach in sport and exercise science is the Neyman-Pearson (N-P) significance-testing approach. When applied correctly it provides a reliable procedure for making dichotomous decisions for accepting or rejecting zero-effect null hypotheses with known and controlled long-run error rates. Type I and type II error rates must be specified in advance and the latter controlled by conducting an a priori sample size calculation. The N-P approach does not provide the probability of hypotheses or indicate the strength of support for hypotheses in light of data, yet many scientists believe it does. Outcomes of analyses allow conclusions only about the existence of non-zero effects, and provide no information about the likely size of true effects or their practical/clinical value. Bayesian inference can show how much support data provide for different hypotheses, and how personal convictions should be altered in light of data, but the approach is complicated by formulating probability distributions about prior subjective estimates of population effects. A pragmatic solution is magnitude-based inference, which allows scientists to estimate the true magnitude of population effects and how likely they are to exceed an effect magnitude of practical/clinical importance, thereby integrating elements of subjective Bayesian-style thinking. While this approach is gaining acceptance, progress might be hastened if scientists appreciate the shortcomings of traditional N-P null hypothesis significance testing.
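
    The magnitude-based calculation described above amounts to evaluating normal tail areas around the observed effect relative to a smallest worthwhile change; a minimal sketch (with made-up effect, standard error, and threshold) is given below.

```python
# A short sketch of magnitude-based reasoning: given an observed effect and
# its standard error, report approximate chances that the true effect is
# beneficial, trivial, or harmful relative to a smallest worthwhile change.
# All numbers are illustrative assumptions.
from scipy import stats

effect, se = 1.8, 1.1          # observed effect and standard error
swc = 0.5                      # smallest worthwhile change (practical threshold)

p_beneficial = stats.norm.sf(swc, loc=effect, scale=se)
p_harmful = stats.norm.cdf(-swc, loc=effect, scale=se)
p_trivial = 1 - p_beneficial - p_harmful

print(f"beneficial: {p_beneficial:.0%}, trivial: {p_trivial:.0%}, harmful: {p_harmful:.0%}")
```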

  12. Some challenges with statistical inference in adaptive designs.

    Science.gov (United States)

    Hung, H M James; Wang, Sue-Jane; Yang, Peiling

    2014-01-01

    Adaptive designs have generated a great deal of attention to clinical trial communities. The literature contains many statistical methods to deal with added statistical uncertainties concerning the adaptations. Increasingly encountered in regulatory applications are adaptive statistical information designs that allow modification of sample size or related statistical information and adaptive selection designs that allow selection of doses or patient populations during the course of a clinical trial. For adaptive statistical information designs, a few statistical testing methods are mathematically equivalent, as a number of articles have stipulated, but arguably there are large differences in their practical ramifications. We pinpoint some undesirable features of these methods in this work. For adaptive selection designs, the selection based on biomarker data for testing the correlated clinical endpoints may increase statistical uncertainty in terms of type I error probability, and most importantly the increased statistical uncertainty may be impossible to assess.
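
    One standard device for the adaptive statistical information designs mentioned above is the weighted inverse-normal combination of independent stage-wise p-values, which preserves the type I error rate even when the second-stage sample size is changed at the interim analysis. A minimal sketch (with illustrative weights and p-values) follows.

```python
# A compact sketch of the weighted inverse-normal combination test used in
# adaptive two-stage designs. The weights and p-values below are illustrative.
import numpy as np
from scipy import stats

def inverse_normal_combination(p1, p2, w1=np.sqrt(0.5), w2=np.sqrt(0.5)):
    """Combine independent one-sided stage-wise p-values (requires w1**2 + w2**2 = 1)."""
    z = w1 * stats.norm.isf(p1) + w2 * stats.norm.isf(p2)
    return stats.norm.sf(z)   # combined one-sided p-value

# Example: interim p-value 0.09; the sample size is then increased and the
# second stage (analysed separately) yields p-value 0.02.
p_combined = inverse_normal_combination(0.09, 0.02)
print(f"combined p-value: {p_combined:.4f}")   # compare against alpha = 0.025
```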

  13. Parametric statistical inference for discretely observed diffusion processes

    DEFF Research Database (Denmark)

    Pedersen, Asger Roer

Part 1: Theoretical results. Part 2: Statistical applications of Gaussian diffusion processes in freshwater ecology.

  14. The statistical-inference approach to generalized thermodynamics

    International Nuclear Information System (INIS)

    Lavenda, B.H.; Scherer, C.

    1987-01-01

Limit theorems, such as the central-limit theorem and the weak law of large numbers, are applicable to statistical thermodynamics for sufficiently large sample sizes of independent and identically distributed observations performed on extensive thermodynamic (chance) variables. The estimation of the intensive thermodynamic quantities is a problem in parametric statistical estimation. The normal approximation to the Gibbs' distribution is justified by the analysis of large deviations. Statistical thermodynamics is generalized to include the statistical estimation of variance as well as mean values.

  15. Statistical Inference on Memory Structure of Processes and Its Applications to Information Theory

    Science.gov (United States)

    2016-05-12

Final Report: Statistical Inference on Memory Structure of Processes and Its Applications to Information Theory (U.S. Army Research Office; reporting period 15-May-2014 to 14-Feb-2015; distribution unlimited). Keywords: mathematical statistics; time series; Markov chains; random... Abstract fragment: Three areas...

  16. STATISTICAL RELATIONAL LEARNING AND SCRIPT INDUCTION FOR TEXTUAL INFERENCE

    Science.gov (United States)

    2017-12-01

    compensate for parser errors. We replace deterministic conjunction by an average combiner, which encodes causal independence. Our framework was the...sentence similarity (STS) and sentence paraphrasing, but not Textual Entailment, where deeper inferences are required. As the formula for conjunction ...When combined, our algorithm learns to rely on systems that not just agree on an output but also the provenance of this output in conjunction with the

  17. Optimal state discrimination using particle statistics

    International Nuclear Information System (INIS)

    Bose, S.; Ekert, A.; Omar, Y.; Paunkovic, N.; Vedral, V.

    2003-01-01

    We present an application of particle statistics to the problem of optimal ambiguous discrimination of quantum states. The states to be discriminated are encoded in the internal degrees of freedom of identical particles, and we use the bunching and antibunching of the external degrees of freedom to discriminate between various internal states. We show that we can achieve the optimal single-shot discrimination probability using only the effects of particle statistics. We discuss interesting applications of our method to detecting entanglement and purifying mixed states. Our scheme can easily be implemented with the current technology

  18. Information Geometry, Inference Methods and Chaotic Energy Levels Statistics

    OpenAIRE

    Cafaro, Carlo

    2008-01-01

    In this Letter, we propose a novel information-geometric characterization of chaotic (integrable) energy level statistics of a quantum antiferromagnetic Ising spin chain in a tilted (transverse) external magnetic field. Finally, we conjecture our results might find some potential physical applications in quantum energy level statistics.

  19. The Role of the Sampling Distribution in Understanding Statistical Inference

    Science.gov (United States)

    Lipson, Kay

    2003-01-01

    Many statistics educators believe that few students develop the level of conceptual understanding essential for them to apply correctly the statistical techniques at their disposal and to interpret their outcomes appropriately. It is also commonly believed that the sampling distribution plays an important role in developing this understanding.…

  20. Statistical models for optimizing mineral exploration

    International Nuclear Information System (INIS)

    Wignall, T.K.; DeGeoffroy, J.

    1987-01-01

    The primary purpose of mineral exploration is to discover ore deposits. The emphasis of this volume is on the mathematical and computational aspects of optimizing mineral exploration. The seven chapters that make up the main body of the book are devoted to the description and application of various types of computerized geomathematical models. These chapters include: (1) the optimal selection of ore deposit types and regions of search, as well as prospecting selected areas, (2) designing airborne and ground field programs for the optimal coverage of prospecting areas, and (3) delineating and evaluating exploration targets within prospecting areas by means of statistical modeling. Many of these statistical programs are innovative and are designed to be useful for mineral exploration modeling. Examples of geomathematical models are applied to exploring for six main types of base and precious metal deposits, as well as other mineral resources (such as bauxite and uranium)

  1. Simulation and Statistical Inference of Stochastic Reaction Networks with Applications to Epidemic Models

    KAUST Repository

    Moraes, Alvaro

    2015-01-01

Epidemics have shaped, sometimes more than wars and natural disasters, demographic aspects of human populations around the world, their health habits and their economies. Ebola and the Middle East Respiratory Syndrome (MERS) are clear and current examples of potential hazards at planetary scale. During the spread of an epidemic disease, there are phenomena, like the sudden extinction of the epidemic, that cannot be captured by deterministic models. As a consequence, stochastic models have been proposed during the last decades. A typical forward problem in the stochastic setting could be the approximation of the expected number of infected individuals found in one month from now. On the other hand, a typical inverse problem could be, given a discretely observed set of epidemiological data, to infer the transmission rate of the epidemic or its basic reproduction number. Markovian epidemic models are stochastic models belonging to a wide class of pure jump processes known as Stochastic Reaction Networks (SRNs), that are intended to describe the time evolution of interacting particle systems where one particle interacts with the others through a finite set of reaction channels. SRNs have been mainly developed to model biochemical reactions but they also have applications in neural networks, virus kinetics, and dynamics of social networks, among others. This PhD thesis is focused on novel fast simulation algorithms and statistical inference methods for SRNs. Our novel Multi-level Monte Carlo (MLMC) hybrid simulation algorithms provide accurate estimates of expected values of a given observable of SRNs at a prescribed final time. They are designed to control the global approximation error up to a user-selected accuracy and up to a certain confidence level, and with near optimal computational work. We also present novel dual-weighted residual expansions for fast estimation of weak and strong errors arising from the MLMC methodology. Regarding the statistical inference

  2. An efficient forward–reverse expectation-maximization algorithm for statistical inference in stochastic reaction networks

    KAUST Repository

    Bayer, Christian; Moraes, Alvaro; Tempone, Raul; Vilanova, Pedro

    2016-01-01

    then employ this SRN bridge-generation technique to the statistical inference problem of approximating reaction propensities based on discretely observed data. To this end, we introduce a two-phase iterative inference method in which, during phase I, we solve

  3. An Efficient Forward-Reverse EM Algorithm for Statistical Inference in Stochastic Reaction Networks

    KAUST Repository

    Bayer, Christian; Moraes, Alvaro; Tempone, Raul; Vilanova, Pedro

    2016-01-01

    In this work [1], we present an extension of the forward-reverse algorithm by Bayer and Schoenmakers [2] to the context of stochastic reaction networks (SRNs). We then apply this bridge-generation technique to the statistical inference problem

  4. Evaluating the Use of Random Distribution Theory to Introduce Statistical Inference Concepts to Business Students

    Science.gov (United States)

    Larwin, Karen H.; Larwin, David A.

    2011-01-01

    Bootstrapping methods and random distribution methods are increasingly recommended as better approaches for teaching students about statistical inference in introductory-level statistics courses. The authors examined the effect of teaching undergraduate business statistics students using random distribution and bootstrapping simulations. It is the…

  5. Introduction to statistical inference and its applications with R

    CERN Document Server

    Trosset, Michael W

    2009-01-01

Table of contents (excerpt): Experiments (Examples; Randomization; The Importance of Probability; Games of Chance); Mathematical Preliminaries (Sets; Counting; Functions; Limits); Probability (Interpretations of Probability; Axioms of Probability; Finite Sample Spaces; Conditional Probability; Random Variables; Case Study: Padrolling in Milton Murayama's All I asking for is my body); Discrete Random Variables (Basic Concepts; Examples; Expectation; Binomial Distributions); Continuous Random Variables (A Motivating Example; Basic Concepts; Elementary Examples; Normal Distributions; Normal Sampling Distributions); Quantifying Population Attributes (Symmetry; Quantiles; The Method of Least Squares); Data (The Plug-In Principle; Plug-In Estimates of Mean and Variance; Plug-In Estimates of Quantiles; Kernel Density Estimates; Case Study: Are Forearm Lengths Normally Distributed?; Transformations); Lots of Data (Averaging Decreases Variation; The Weak Law of Large Numbers; The Central Limit Theorem); Inference (A Motivating Example; Point Estimation; Heuristics of Hypothesis Testing; Testing Hypotheses about...)

  6. TARGETED SEQUENTIAL DESIGN FOR TARGETED LEARNING INFERENCE OF THE OPTIMAL TREATMENT RULE AND ITS MEAN REWARD.

    Science.gov (United States)

    Chambaz, Antoine; Zheng, Wenjing; van der Laan, Mark J

    2017-01-01

This article studies the targeted sequential inference of an optimal treatment rule (TR) and its mean reward in the non-exceptional case, i.e., assuming that there is no stratum of the baseline covariates where treatment is neither beneficial nor harmful, and under a companion margin assumption. Our pivotal estimator, whose definition hinges on the targeted minimum loss estimation (TMLE) principle, actually infers the mean reward under the current estimate of the optimal TR. This data-adaptive statistical parameter is worthy of interest on its own. Our main result is a central limit theorem which enables the construction of confidence intervals on both mean rewards under the current estimate of the optimal TR and under the optimal TR itself. The asymptotic variance of the estimator takes the form of the variance of an efficient influence curve at a limiting distribution, allowing us to discuss the efficiency of inference. As a by-product, we also derive confidence intervals on two cumulated pseudo-regrets, a key notion in the study of bandit problems. A simulation study illustrates the procedure. One of the cornerstones of the theoretical study is a new maximal inequality for martingales with respect to the uniform entropy integral.

  7. Optimizing refiner operation with statistical modelling

    Energy Technology Data Exchange (ETDEWEB)

    Broderick, G [Noranda Research Centre, Pointe Claire, PQ (Canada)

    1997-02-01

    The impact of refining conditions on the energy efficiency of the process and on the handsheet quality of a chemi-mechanical pulp was studied as part of a series of pilot scale refining trials. Statistical models of refiner performance were constructed from these results and non-linear optimization of process conditions were conducted. Optimization results indicated that increasing the ratio of specific energy applied in the first stage led to a reduction of some 15 per cent in the total energy requirement. The strategy can also be used to obtain significant increases in pulp quality for a given energy input. 20 refs., 6 tabs.

  8. An efficient forward–reverse expectation-maximization algorithm for statistical inference in stochastic reaction networks

    KAUST Repository

    Bayer, Christian

    2016-02-20

In this work, we present an extension of the forward–reverse representation introduced by Bayer and Schoenmakers (Annals of Applied Probability, 24(5):1994–2032, 2014) to the context of stochastic reaction networks (SRNs). We apply this stochastic representation to the computation of efficient approximations of expected values of functionals of SRN bridges, that is, SRNs conditional on their values in the extremes of given time intervals. We then employ this SRN bridge-generation technique to the statistical inference problem of approximating reaction propensities based on discretely observed data. To this end, we introduce a two-phase iterative inference method in which, during phase I, we solve a set of deterministic optimization problems where the SRNs are replaced by their reaction-rate ordinary differential equations approximation; then, during phase II, we apply the Monte Carlo version of the expectation-maximization algorithm to the phase I output. By selecting a set of overdispersed seeds as initial points in phase I, the output of parallel runs from our two-phase method is a cluster of approximate maximum likelihood estimates. Our results are supported by numerical examples.
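
    A toy version of phase I can be written for the simplest possible SRN, a pure degradation reaction X -> 0 with propensity c*X: simulate the jump process exactly, then estimate c by least squares against the reaction-rate ODE solution x(t) = x0 * exp(-c*t). The rate constant, initial copy number, and observation grid below are illustrative assumptions; phase II (the Monte Carlo EM refinement with SRN bridges) is not reproduced here.

```python
# A toy illustration of "phase I" of the two-phase method above: replace the
# SRN by its reaction-rate ODE and fit the propensity constant by least
# squares from discretely observed counts of a pure death process X -> 0.
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(9)
c_true, x0 = 0.5, 200
obs_times = np.arange(0.5, 6.0, 0.5)

def gillespie_death(c, x0, obs_times):
    # Exact stochastic simulation, recorded at the observation times.
    t, x, out = 0.0, x0, []
    for t_obs in obs_times:
        while x > 0:
            dt = rng.exponential(1.0 / (c * x))
            if t + dt > t_obs:
                break
            t, x = t + dt, x - 1
        out.append(x)
        t = t_obs
    return np.array(out, dtype=float)

x_obs = gillespie_death(c_true, x0, obs_times)

# Phase I: deterministic ODE approximation x(t) = x0 * exp(-c t).
ode_solution = lambda t, c: x0 * np.exp(-c * t)
c_hat, _ = curve_fit(ode_solution, obs_times, x_obs, p0=[0.1])
print(f"true c = {c_true}, phase-I estimate = {c_hat[0]:.3f}")
```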

  9. An efficient forward-reverse expectation-maximization algorithm for statistical inference in stochastic reaction networks

    KAUST Repository

    Vilanova, Pedro

    2016-01-07

In this work, we present an extension of the forward-reverse representation introduced in 'Simulation of forward-reverse stochastic representations for conditional diffusions', a 2014 paper by Bayer and Schoenmakers, to the context of stochastic reaction networks (SRNs). We apply this stochastic representation to the computation of efficient approximations of expected values of functionals of SRN bridges, i.e., SRNs conditional on their values in the extremes of given time intervals. We then employ this SRN bridge-generation technique to the statistical inference problem of approximating reaction propensities based on discretely observed data. To this end, we introduce a two-phase iterative inference method in which, during phase I, we solve a set of deterministic optimization problems where the SRNs are replaced by their reaction-rate ordinary differential equations approximation; then, during phase II, we apply the Monte Carlo version of the Expectation-Maximization algorithm to the phase I output. By selecting a set of over-dispersed seeds as initial points in phase I, the output of parallel runs from our two-phase method is a cluster of approximate maximum likelihood estimates. Our results are supported by numerical examples.

  10. Using statistical inference for decision making in best estimate analyses

    International Nuclear Information System (INIS)

    Sermer, P.; Weaver, K.; Hoppe, F.; Olive, C.; Quach, D.

    2008-01-01

    For broad classes of safety analysis problems, one needs to make decisions when faced with randomly varying quantities which are also subject to errors. The means for doing this involves a statistical approach which takes into account the nature of the physical problems, and the statistical constraints they impose. We describe the methodology for doing this which has been developed at Nuclear Safety Solutions, and we draw some comparisons to other methods which are commonly used in Canada and internationally. Our methodology has the advantages of being robust and accurate and compares favourably to other best estimate methods. (author)

  11. Assessment of statistical education in Indonesia: Preliminary results and initiation to simulation-based inference

    Science.gov (United States)

    Saputra, K. V. I.; Cahyadi, L.; Sembiring, U. A.

    2018-01-01

    In this paper, we assess our traditional elementary statistics education and introduce elementary statistics with simulation-based inference. To assess our statistics class, we adapt the well-known CAOS (Comprehensive Assessment of Outcomes in Statistics) test, which serves as an external measure of students' basic statistical literacy and is generally accepted as such. We also introduce a new teaching method for the elementary statistics class: in contrast to the traditional course, hypothesis testing is taught through a simulation-based inference method. The literature has shown that this new teaching method works very well in increasing students' understanding of statistics.

  12. Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction

    Science.gov (United States)

    Imbens, Guido W.; Rubin, Donald B.

    2015-01-01

    Most questions in social and biomedical sciences are causal in nature: what would happen to individuals, or to groups, if part of their environment were changed? In this groundbreaking text, two world-renowned experts present statistical methods for studying such questions. This book starts with the notion of potential outcomes, each corresponding…

  13. Probably not future prediction using probability and statistical inference

    CERN Document Server

    Dworsky, Lawrence N

    2008-01-01

    An engaging, entertaining, and informative introduction to probability and prediction in our everyday lives Although Probably Not deals with probability and statistics, it is not heavily mathematical and is not filled with complex derivations, proofs, and theoretical problem sets. This book unveils the world of statistics through questions such as what is known based upon the information at hand and what can be expected to happen. While learning essential concepts including "the confidence factor" and "random walks," readers will be entertained and intrigued as they move from chapter to chapter. Moreover, the author provides a foundation of basic principles to guide decision making in almost all facets of life including playing games, developing winning business strategies, and managing personal finances. Much of the book is organized around easy-to-follow examples that address common, everyday issues such as: How travel time is affected by congestion, driving speed, and traffic lights Why different gambling ...

  14. Bayesian Nonparametric Statistical Inference for Shock Models and Wear Processes.

    Science.gov (United States)

    1979-12-01

    also note that the results in Section 2 do not depend on the support of F.) This shock model has been studied by Esary, Marshall and Proschan (1973) ... Barlow and Proschan (1975), among others. The analogue of the shock model in risk and actuarial analysis has been given by Bühlmann (1970, Chapter 2) ... Mathematical Statistics, Vol. 4, pp. 894-906. Billingsley, P. (1968), Convergence of Probability Measures, John Wiley, New York. Bühlmann, H. (1970 ...

  15. Statistical inference for extended or shortened phase II studies based on Simon's two-stage designs.

    Science.gov (United States)

    Zhao, Junjun; Yu, Menggang; Feng, Xi-Ping

    2015-06-07

    Simon's two-stage designs are popular choices for conducting phase II clinical trials, especially in oncology, to reduce the number of patients placed on ineffective experimental therapies. Recently, Koyama and Chen (2008) discussed how to conduct proper inference for such studies, having found that inference procedures used with Simon's designs almost always ignore the actual sampling plan. In particular, they proposed an inference method for studies whose actual second-stage sample sizes differ from the planned ones. We consider an alternative inference method based on the likelihood ratio. In particular, we order the permissible sample paths under Simon's two-stage designs using their corresponding conditional likelihood. In this way, we can calculate p-values using the common definition: the probability of obtaining a test statistic value at least as extreme as that observed under the null hypothesis. In addition to providing inference for a couple of scenarios where Koyama and Chen's method can be difficult to apply, the estimate based on our method appears to have certain advantages in terms of inference properties in many numerical simulations: it generally leads to smaller biases and narrower confidence intervals while maintaining similar coverage. We also illustrate the two methods in a real data setting. Inference procedures used with Simon's designs almost always ignore the actual sampling plan. Reported p-values, point estimates and confidence intervals for the response rate are not usually adjusted for the design's adaptiveness. Proper statistical inference procedures should be used.
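
    To make the path-ordering idea concrete, the Python sketch below enumerates all permissible sample paths of a Simon two-stage design and computes a one-sided p-value by ordering the paths with a likelihood-ratio statistic. The design parameters, the value of p0, and this particular extremeness ordering are illustrative assumptions, a simplified stand-in for the conditional-likelihood ordering developed in the paper.

      # Illustrative exact p-value for a Simon two-stage design, ordering the
      # permissible sample paths by a one-sided likelihood-ratio statistic.
      # A simplified stand-in for the paper's conditional-likelihood ordering.
      from scipy.stats import binom

      def sample_paths(r1, n1, n2):
          stopped = [(x1, None) for x1 in range(r1 + 1)]              # stage 1 only
          continued = [(x1, x2) for x1 in range(r1 + 1, n1 + 1)
                                for x2 in range(n2 + 1)]              # both stages
          return stopped + continued

      def null_prob(path, p0, n1, n2):
          x1, x2 = path
          pr = binom.pmf(x1, n1, p0)
          return pr if x2 is None else pr * binom.pmf(x2, n2, p0)

      def loglik(path, p, n1, n2):
          x1, x2 = path
          ll = binom.logpmf(x1, n1, p)
          return ll if x2 is None else ll + binom.logpmf(x2, n2, p)

      def lr_stat(path, p0, n1, n2):
          x1, x2 = path
          x, n = (x1, n1) if x2 is None else (x1 + x2, n1 + n2)
          phat = x / n
          lr = loglik(path, phat, n1, n2) - loglik(path, p0, n1, n2)
          return lr if phat >= p0 else -lr    # one-sided: large = evidence for p > p0

      def p_value(observed, p0, r1, n1, n2):
          t_obs = lr_stat(observed, p0, n1, n2)
          return sum(null_prob(q, p0, n1, n2)
                     for q in sample_paths(r1, n1, n2)
                     if lr_stat(q, p0, n1, n2) >= t_obs)

      # Example: design r1=1, n1=9, n2=8; 2 stage-1 and 5 stage-2 responses observed.
      print(p_value((2, 5), p0=0.2, r1=1, n1=9, n2=8))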

  16. Statistical inference of level densities from resolved resonance parameters

    International Nuclear Information System (INIS)

    Froehner, F.H.

    1983-08-01

    Level densities are most directly obtained by counting the resonances observed in the resolved resonance range. In practice, however, weak levels are invariably missed in the measurements, so that one has to estimate their number and add it to the raw count. The main categories of missing-level estimators are discussed in the present review, viz. (I) ladder methods, including those based on the theory of Hamiltonian matrix ensembles (Dyson-Mehta statistics); (II) methods based on comparison with artificial cross section curves (Monte Carlo simulation, Garrison's autocorrelation method); (III) methods exploiting the observed neutron width distribution by means of Bayesian or more approximate procedures such as maximum-likelihood, least-squares or moment methods, with various recipes for the treatment of detection thresholds and resolution effects. The language of mathematical statistics is employed to clarify the basis of, and the relationship between, the various techniques. Recent progress in the treatment of resolution effects, detection thresholds and p-wave admixture is described. (orig.) [de]

  17. Entropy based statistical inference for methane emissions released from wetland

    Czech Academy of Sciences Publication Activity Database

    Sabolová, R.; Sečkárová, Vladimíra; Dušek, Jiří; Stehlík, M.

    2015-01-01

    Roč. 141, č. 1 (2015), s. 125-133 ISSN 0169-7439 R&D Projects: GA ČR GA13-13502S; GA ČR(CZ) GAP504/11/1151; GA MŠk(CZ) ED1.1.00/02.0073 Grant - others:GA ČR(CZ) GA201/12/0083; GA UK(CZ) SVV 2014-260105 Institutional support: RVO:67985556 ; RVO:67179843 Keywords : chaos * entropy * Kullback-Leibler divergence * Pareto distribution * saddlepoint approximation * wetland ecosystem Subject RIV: BB - Applied Statistics, Operational Research; EH - Ecology, Behaviour (UEK-B) Impact factor: 2.217, year: 2015 http://library.utia.cas.cz/separaty/2014/AS/seckarova-0438651.pdf

  18. GWIS: Genome-Wide Inferred Statistics for Functions of Multiple Phenotypes

    NARCIS (Netherlands)

    Nieuwboer, H.A.; Pool, R.; Dolan, C.V.; Boomsma, D.I.; Nivard, M.G.

    2016-01-01

    Here we present a method of genome-wide inferred study (GWIS) that provides an approximation of genome-wide association study (GWAS) summary statistics for a variable that is a function of phenotypes for which GWAS summary statistics, phenotypic means, and covariances are available. A GWIS can be

  19. Statistical physics of hard optimization problems

    International Nuclear Information System (INIS)

    Zdeborova, L.

    2009-01-01

    Optimization is fundamental in many areas of science, from computer science and information theory to engineering and statistical physics, as well as to biology or social sciences. It typically involves a large number of variables and a cost function depending on these variables. Optimization problems in the non-deterministic polynomial (NP)-complete class are particularly difficult; it is believed that the number of operations required to minimize the cost function is, in the most difficult cases, exponential in the system size. However, even in an NP-complete problem the practically arising instances might, in fact, be easy to solve. The principal question we address in this article is: How to recognize if an NP-complete constraint satisfaction problem is typically hard, and what are the main reasons for this? We adopt approaches from the statistical physics of disordered systems, in particular the cavity method developed originally to describe glassy systems. We describe new properties of the space of solutions in two of the most studied constraint satisfaction problems - random satisfiability and random graph coloring. We suggest a relation between the existence of the so-called frozen variables and the algorithmic hardness of a problem. Based on these insights, we introduce a new class of problems which we named "locked" constraint satisfaction, where the statistical description is easily solvable, but from the algorithmic point of view they are even more challenging than the canonical satisfiability.

  20. Statistical physics of hard optimization problems

    International Nuclear Information System (INIS)

    Zdeborova, L.

    2009-01-01

    Optimization is fundamental in many areas of science, from computer science and information theory to engineering and statistical physics, as well as to biology or social sciences. It typically involves a large number of variables and a cost function depending on these variables. Optimization problems in the non-deterministic polynomial-complete class are particularly difficult; it is believed that the number of operations required to minimize the cost function is, in the most difficult cases, exponential in the system size. However, even in a non-deterministic polynomial-complete problem the practically arising instances might, in fact, be easy to solve. The principal question we address in this article is: How to recognize if a non-deterministic polynomial-complete constraint satisfaction problem is typically hard, and what are the main reasons for this? We adopt approaches from the statistical physics of disordered systems, in particular the cavity method developed originally to describe glassy systems. We describe new properties of the space of solutions in two of the most studied constraint satisfaction problems - random satisfiability and random graph coloring. We suggest a relation between the existence of the so-called frozen variables and the algorithmic hardness of a problem. Based on these insights, we introduce a new class of problems which we named 'locked' constraint satisfaction, where the statistical description is easily solvable, but from the algorithmic point of view they are even more challenging than the canonical satisfiability (Authors)

  1. Statistical physics of hard optimization problems

    Science.gov (United States)

    Zdeborová, Lenka

    2009-06-01

    Optimization is fundamental in many areas of science, from computer science and information theory to engineering and statistical physics, as well as to biology or social sciences. It typically involves a large number of variables and a cost function depending on these variables. Optimization problems in the non-deterministic polynomial (NP)-complete class are particularly difficult; it is believed that the number of operations required to minimize the cost function is, in the most difficult cases, exponential in the system size. However, even in an NP-complete problem the practically arising instances might, in fact, be easy to solve. The principal question we address in this article is: How to recognize if an NP-complete constraint satisfaction problem is typically hard, and what are the main reasons for this? We adopt approaches from the statistical physics of disordered systems, in particular the cavity method developed originally to describe glassy systems. We describe new properties of the space of solutions in two of the most studied constraint satisfaction problems - random satisfiability and random graph coloring. We suggest a relation between the existence of the so-called frozen variables and the algorithmic hardness of a problem. Based on these insights, we introduce a new class of problems which we named "locked" constraint satisfaction, where the statistical description is easily solvable, but from the algorithmic point of view they are even more challenging than the canonical satisfiability.

  2. Critical examination of logical formulations in quantum theory. Statistical inference and Hilbertian distance between quantum states

    International Nuclear Information System (INIS)

    Hadjisawas, Nicolas.

    1982-01-01

    After a critical study of the logical formulations of quantum mechanics by Jauch and Piron, classical and quantum versions of statistical inference are studied. To this end, the significance of the Jaynes and Kullback principles (maximum-likelihood and least-squares principles) is brought out by the theorems established. For the inference problem in quantum mechanics, a 'distance' between states is defined. This concept is used to solve the quantum equivalent of the classical problem studied by Kullback. The 'projection postulate' proposition is subsequently deduced. [fr]

  3. Robust inference from multiple test statistics via permutations: a better alternative to the single test statistic approach for randomized trials.

    Science.gov (United States)

    Ganju, Jitendra; Yu, Xinxin; Ma, Guoguang Julie

    2013-01-01

    Formal inference in randomized clinical trials is based on controlling the type I error rate associated with a single pre-specified statistic. The deficiency of using just one method of analysis is that it depends on assumptions that may not be met. For robust inference, we propose pre-specifying multiple test statistics and relying on the minimum p-value for testing the null hypothesis of no treatment effect. The null hypothesis associated with the various test statistics is that the treatment groups are indistinguishable. The critical value for hypothesis testing comes from permutation distributions. Rejection of the null hypothesis when the smallest p-value is less than the critical value controls the type I error rate at its designated value. Even if one of the candidate test statistics has low power, the adverse effect on the power of the minimum p-value statistic is modest. Its use is illustrated with examples. We conclude that it is better to rely on the minimum p-value rather than a single statistic, particularly when that single statistic is the logrank test, because of the cost and complexity of many survival trials. Copyright © 2013 John Wiley & Sons, Ltd.
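
    A compact sketch of the minimum p-value idea, under illustrative assumptions (a two-arm comparison, two pre-specified statistics, namely a difference in means and a rank sum, and synthetic data): each statistic is converted to a permutation p-value, and the smallest p-value is then referred to its own permutation distribution built from the same set of permutations.

      # Sketch of the minimum p-value approach: pre-specify several test
      # statistics, convert each to a permutation p-value, and refer the
      # smallest p-value to its own permutation distribution.
      import numpy as np
      from scipy.stats import rankdata

      rng = np.random.default_rng(1)
      treat = rng.normal(0.5, 1.0, size=30)       # illustrative data
      ctrl = rng.normal(0.0, 1.0, size=30)
      y = np.concatenate([treat, ctrl])
      n_t = len(treat)

      def test_stats(y, idx_t):
          in_t = np.zeros(len(y), dtype=bool)
          in_t[idx_t] = True
          diff_mean = y[in_t].mean() - y[~in_t].mean()
          rank_sum = rankdata(y)[in_t].sum()       # Wilcoxon-type statistic
          return np.array([diff_mean, rank_sum])

      B = 2000
      obs = test_stats(y, np.arange(n_t))
      perm = np.array([test_stats(y, rng.permutation(len(y))[:n_t]) for _ in range(B)])

      # Per-statistic permutation p-values (larger statistic = more evidence).
      p_obs = (perm >= obs).mean(axis=0)
      # Permutation distribution of the minimum p-value (same permutations reused).
      p_perm = np.array([(perm >= perm[b]).mean(axis=0) for b in range(B)])
      min_p_null = p_perm.min(axis=1)
      p_final = (min_p_null <= p_obs.min()).mean()
      print("per-statistic p-values:", p_obs, "min-p adjusted p-value:", p_final)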

  4. Statistical inference for discrete-time samples from affine stochastic delay differential equations

    DEFF Research Database (Denmark)

    Küchler, Uwe; Sørensen, Michael

    2013-01-01

    Statistical inference for discrete time observations of an affine stochastic delay differential equation is considered. The main focus is on maximum pseudo-likelihood estimators, which are easy to calculate in practice. A more general class of prediction-based estimating functions is investigated...

  5. Statistical inference and visualization in scale-space for spatially dependent images

    KAUST Repository

    Vaughan, Amy; Jun, Mikyoung; Park, Cheolwoo

    2012-01-01

    SiZer (SIgnificant ZERo crossing of the derivatives) is a graphical scale-space visualization tool that allows for statistical inferences. In this paper we develop a spatial SiZer for finding significant features and conducting goodness-of-fit tests

  6. Statistical inference for remote sensing-based estimates of net deforestation

    Science.gov (United States)

    Ronald E. McRoberts; Brian F. Walters

    2012-01-01

    Statistical inference requires expression of an estimate in probabilistic terms, usually in the form of a confidence interval. An approach to constructing confidence intervals for remote sensing-based estimates of net deforestation is illustrated. The approach is based on post-classification methods using two independent forest/non-forest classifications because...

  7. Large-Scale Optimization for Bayesian Inference in Complex Systems

    Energy Technology Data Exchange (ETDEWEB)

    Willcox, Karen [MIT; Marzouk, Youssef [MIT

    2013-11-12

    The SAGUARO (Scalable Algorithms for Groundwater Uncertainty Analysis and Robust Optimization) Project focused on the development of scalable numerical algorithms for large-scale Bayesian inversion in complex systems that capitalize on advances in large-scale simulation-based optimization and inversion methods. The project was a collaborative effort among MIT, the University of Texas at Austin, Georgia Institute of Technology, and Sandia National Laboratories. The research was directed in three complementary areas: efficient approximations of the Hessian operator, reductions in complexity of forward simulations via stochastic spectral approximations and model reduction, and employing large-scale optimization concepts to accelerate sampling. The MIT-Sandia component of the SAGUARO Project addressed the intractability of conventional sampling methods for large-scale statistical inverse problems by devising reduced-order models that are faithful to the full-order model over a wide range of parameter values; sampling then employs the reduced model rather than the full model, resulting in very large computational savings. Results indicate little effect on the computed posterior distribution. On the other hand, in the Texas-Georgia Tech component of the project, we retain the full-order model, but exploit inverse problem structure (adjoint-based gradients and partial Hessian information of the parameter-to-observation map) to implicitly extract lower dimensional information on the posterior distribution; this greatly speeds up sampling methods, so that fewer sampling points are needed. We can think of these two approaches as "reduce then sample" and "sample then reduce." In fact, these two approaches are complementary, and can be used in conjunction with each other. Moreover, they both exploit deterministic inverse problem structure, in the form of adjoint-based gradient and Hessian information of the underlying parameter-to-observation map, to

  8. Principles for statistical inference on big spatio-temporal data from climate models

    KAUST Repository

    Castruccio, Stefano

    2018-02-24

    The vast increase in size of modern spatio-temporal datasets has prompted statisticians working in environmental applications to develop new and efficient methodologies that are still able to achieve inference for nontrivial models within an affordable time. Climate model outputs push the limits of inference for Gaussian processes, as their size can easily be larger than 10 billion data points. Drawing from our experience in a set of previous work, we provide three principles for the statistical analysis of such large datasets that leverage recent methodological and computational advances. These principles emphasize the need of embedding distributed and parallel computing in the inferential process.

  9. Principles for statistical inference on big spatio-temporal data from climate models

    KAUST Repository

    Castruccio, Stefano; Genton, Marc G.

    2018-01-01

    The vast increase in size of modern spatio-temporal datasets has prompted statisticians working in environmental applications to develop new and efficient methodologies that are still able to achieve inference for nontrivial models within an affordable time. Climate model outputs push the limits of inference for Gaussian processes, as their size can easily be larger than 10 billion data points. Drawing from our experience in a set of previous work, we provide three principles for the statistical analysis of such large datasets that leverage recent methodological and computational advances. These principles emphasize the need of embedding distributed and parallel computing in the inferential process.

  10. An inferentialist perspective on the coordination of actions and reasons involved in making a statistical inference

    Science.gov (United States)

    Bakker, Arthur; Ben-Zvi, Dani; Makar, Katie

    2017-12-01

    To understand how statistical and other types of reasoning are coordinated with actions to reduce uncertainty, we conducted a case study in vocational education that involved statistical hypothesis testing. We analyzed an intern's research project in a hospital laboratory in which reducing uncertainties was crucial to make a valid statistical inference. In his project, the intern, Sam, investigated whether patients' blood could be sent through pneumatic post without influencing the measurement of particular blood components. We asked, in the process of making a statistical inference, how are reasons and actions coordinated to reduce uncertainty? For the analysis, we used the semantic theory of inferentialism, specifically, the concept of webs of reasons and actions—complexes of interconnected reasons for facts and actions; these reasons include premises and conclusions, inferential relations, implications, motives for action, and utility of tools for specific purposes in a particular context. Analysis of interviews with Sam, his supervisor and teacher as well as video data of Sam in the classroom showed that many of Sam's actions aimed to reduce variability, rule out errors, and thus reduce uncertainties so as to arrive at a valid inference. Interestingly, the decisive factor was not the outcome of a t test but of the reference change value, a clinical chemical measure of analytic and biological variability. With insights from this case study, we expect that students can be better supported in connecting statistics with context and in dealing with uncertainty.

  11. Statistical inference and visualization in scale-space for spatially dependent images

    KAUST Repository

    Vaughan, Amy

    2012-03-01

    SiZer (SIgnificant ZERo crossing of the derivatives) is a graphical scale-space visualization tool that allows for statistical inferences. In this paper we develop a spatial SiZer for finding significant features and conducting goodness-of-fit tests for spatially dependent images. The spatial SiZer utilizes a family of kernel estimates of the image and provides not only exploratory data analysis but also statistical inference with spatial correlation taken into account. It is also capable of comparing the observed image with a specific null model being tested by adjusting the statistical inference using an assumed covariance structure. Pixel locations having statistically significant differences between the image and a given null model are highlighted by arrows. The spatial SiZer is compared with the existing independent SiZer via the analysis of simulated data with and without signal on both planar and spherical domains. We apply the spatial SiZer method to the decadal temperature change over some regions of the Earth. © 2011 The Korean Statistical Society.
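
    The basic SiZer map (the independent-errors version, not the spatial extension described in this record) can be sketched in a few lines of Python. The data, the Gaussian kernel, and the two-standard-error rule below are illustrative assumptions: at every grid location and bandwidth a local-linear slope is estimated, and its sign is reported only where it differs significantly from zero.

      # Minimal 1-D SiZer-style significance map (independent-errors version):
      # local-linear derivative estimates over a grid of locations and
      # bandwidths, flagged where |slope| > 2 * SE. Illustrative sketch only.
      import numpy as np

      rng = np.random.default_rng(6)
      x = np.sort(rng.uniform(0, 1, 300))
      y = np.sin(4 * np.pi * x) + rng.normal(0, 0.5, size=x.size)

      grid = np.linspace(0.05, 0.95, 60)
      bandwidths = np.geomspace(0.02, 0.3, 12)
      sigma = np.std(np.diff(y)) / np.sqrt(2)          # rough noise level

      def local_slope(x0, h):
          w = np.exp(-0.5 * ((x - x0) / h) ** 2)       # Gaussian kernel weights
          xc = x - np.average(x, weights=w)
          sxx = np.sum(w * xc ** 2)
          slope = np.sum(w * xc * y) / sxx
          se = sigma * np.sqrt(np.sum(w ** 2 * xc ** 2)) / sxx
          return slope, se

      sizer = np.zeros((len(bandwidths), len(grid)), dtype=int)
      for i, h in enumerate(bandwidths):
          for j, x0 in enumerate(grid):
              slope, se = local_slope(x0, h)
              sizer[i, j] = 0 if abs(slope) < 2 * se else int(np.sign(slope))
      for row in sizer:                                # crude text rendering of the map
          print("".join({-1: "-", 0: ".", 1: "+"}[int(v)] for v in row))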

  12. Designs and Methods for Association Studies and Population Size Inference in Statistical Genetics

    DEFF Research Database (Denmark)

    Waltoft, Berit Lindum

    method provides a simple goodness of t test by comparing the observed SFS with the expected SFS under a given model of population size changes. By the use of Monte Carlo estimation the expected time between coalescent events can be estimated and the expected SFS can thereby be evaluated. Using......). The OR is interpreted as the eect of an exposure on the probability of being diseased at the end of follow-up, while the interpretation of the IRR is the eect of an exposure on the probability of becoming diseased. Through a simulation study, the OR from a classical case-control study is shown to be an inconsistent...... the classical chi-square statistics we are able to infer single parameter models. Multiple parameter models, e.g. multiple epochs, are harder to identify. By introducing the inference of population size back in time as an inverse problem, the second procedure applies the theory of smoothing splines to infer...

  13. Statistical inference of the generation probability of T-cell receptors from sequence repertoires.

    Science.gov (United States)

    Murugan, Anand; Mora, Thierry; Walczak, Aleksandra M; Callan, Curtis G

    2012-10-02

    Stochastic rearrangement of germline V-, D-, and J-genes to create variable coding sequence for certain cell surface receptors is at the origin of immune system diversity. This process, known as "VDJ recombination", is implemented via a series of stochastic molecular events involving gene choices and random nucleotide insertions between, and deletions from, genes. We use large sequence repertoires of the variable CDR3 region of human CD4+ T-cell receptor beta chains to infer the statistical properties of these basic biochemical events. Because any given CDR3 sequence can be produced in multiple ways, the probability distribution of hidden recombination events cannot be inferred directly from the observed sequences; we therefore develop a maximum likelihood inference method to achieve this end. To separate the properties of the molecular rearrangement mechanism from the effects of selection, we focus on nonproductive CDR3 sequences in T-cell DNA. We infer the joint distribution of the various generative events that occur when a new T-cell receptor gene is created. We find a rich picture of correlation (and absence thereof), providing insight into the molecular mechanisms involved. The generative event statistics are consistent between individuals, suggesting a universal biochemical process. Our probabilistic model predicts the generation probability of any specific CDR3 sequence by the primitive recombination process, allowing us to quantify the potential diversity of the T-cell repertoire and to understand why some sequences are shared between individuals. We argue that the use of formal statistical inference methods, of the kind presented in this paper, will be essential for quantitative understanding of the generation and evolution of diversity in the adaptive immune system.

  14. Statistical comparison of a hybrid approach with approximate and exact inference models for Fusion 2+

    Science.gov (United States)

    Lee, K. David; Wiesenfeld, Eric; Gelfand, Andrew

    2007-04-01

    One of the greatest challenges in modern combat is maintaining a high level of timely Situational Awareness (SA). In many situations, computational complexity and accuracy considerations make the development and deployment of real-time, high-level inference tools very difficult. An innovative hybrid framework that combines Bayesian inference, in the form of Bayesian networks, and possibility theory, in the form of fuzzy logic systems, has recently been introduced to provide a rigorous framework for high-level inference. In previous research, the theoretical basis and benefits of the hybrid approach have been developed. However, lacking is a concrete experimental comparison of the hybrid framework with traditional fusion methods to demonstrate and quantify this benefit. The goal of this research, therefore, is to provide a statistical comparison of the accuracy and performance of the hybrid approach with pure Bayesian and fuzzy systems and with an inexact Bayesian system approximated using particle filtering. To accomplish this task, domain-specific models will be developed under these different theoretical approaches and then evaluated, via Monte Carlo simulation, in comparison to situational ground truth to measure accuracy and fidelity. Following this, a rigorous statistical analysis of the performance results will be performed, to quantify the benefit of hybrid inference over other fusion tools.

  15. Inference on network statistics by restricting to the network space: applications to sexual history data.

    Science.gov (United States)

    Goyal, Ravi; De Gruttola, Victor

    2018-01-30

    Analysis of sexual history data intended to describe sexual networks presents many challenges arising from the fact that most surveys collect information on only a very small fraction of the population of interest. In addition, partners are rarely identified and responses are subject to reporting biases. Typically, each network statistic of interest, such as the mean number of sexual partners for men or women, is estimated independently of other network statistics. There is, however, a complex relationship among network statistics, and knowledge of these relationships can aid in addressing the concerns mentioned earlier. We develop a novel method that constrains a posterior predictive distribution of a collection of network statistics in order to leverage the relationships among network statistics in making inference about network properties of interest. The method ensures that inference on network properties is compatible with an actual network. Through extensive simulation studies, we also demonstrate that use of this method can improve estimates in settings where there is uncertainty that arises both from sampling and from systematic reporting bias, compared with currently available approaches to estimation. To illustrate the method, we apply it to estimate network statistics using data from the Chicago Health and Social Life Survey. Copyright © 2017 John Wiley & Sons, Ltd.

  16. Statistical inference, the bootstrap, and neural-network modeling with application to foreign exchange rates.

    Science.gov (United States)

    White, H; Racine, J

    2001-01-01

    We propose tests for individual and joint irrelevance of network inputs. Such tests can be used to determine whether an input or group of inputs "belong" in a particular model, thus permitting valid statistical inference based on estimated feedforward neural-network models. The approaches employ well-known statistical resampling techniques. We conduct a small Monte Carlo experiment showing that our tests have reasonable level and power behavior, and we apply our methods to examine whether there are predictable regularities in foreign exchange rates. We find that exchange rates do appear to contain information that is exploitable for enhanced point prediction, but the nature of the predictive relations evolves through time.

  17. Statistical inferences under the Null hypothesis: Common mistakes and pitfalls in neuroimaging studies.

    Directory of Open Access Journals (Sweden)

    Jean-Michel eHupé

    2015-02-01

    Full Text Available Published studies using functional and structural MRI include many errors in the way data are analyzed and conclusions are reported. This was observed while working on a comprehensive review of the neural bases of synesthesia, but these errors are probably endemic to neuroimaging studies. All of the studies reviewed had based their conclusions on Null Hypothesis Significance Tests (NHST). NHST have been criticized since their inception because they are more appropriate for making decisions related to a null hypothesis (as in manufacturing) than for making inferences about behavioral and neuronal processes. Here I focus on a few key problems of NHST related to brain imaging techniques, and explain why or when we should not rely on significance tests. I also observed that, often, the ill-posed logic of NHST was not even correctly applied, and I describe what I identified as common mistakes or at least problematic practices in published papers, in light of what could be considered the very basics of statistical inference. MRI statistics also involve much more complex issues than standard statistical inference. Analysis pipelines vary a lot between studies, even for those using the same software, and there is no consensus on which pipeline is best. I propose a synthetic view of the logic behind the possible methodological choices, and warn against the usage and interpretation of two statistical methods popular in brain imaging studies, the false discovery rate (FDR) procedure and permutation tests. I suggest that current models for the analysis of brain imaging data suffer from serious limitations and call for a revision taking into account the new statistics (confidence intervals) logic.

  18. An Intelligent Inference System for Robot Hand Optimal Grasp Preshaping

    Directory of Open Access Journals (Sweden)

    Cabbar Veysel Baysal

    2010-11-01

    Full Text Available This paper presents a novel Intelligent Inference System (IIS) for the determination of an optimum preshape for multifingered robot hand grasping of a given object under a manipulation task. The IIS is formed as a hybrid agent architecture, by the synthesis of object properties, manipulation task characteristics, grasp-space partitioning, low-level kinematic analysis, evaluation of contact wrench patterns via fuzzy approximate reasoning, and an ANN structure for incremental learning. The IIS is implemented in software with a robot hand simulation.

  19. A normative inference approach for optimal sample sizes in decisions from experience

    Science.gov (United States)

    Ostwald, Dirk; Starke, Ludger; Hertwig, Ralph

    2015-01-01

    “Decisions from experience” (DFE) refers to a body of work that emerged in research on behavioral decision making over the last decade. One of the major experimental paradigms employed to study experience-based choice is the “sampling paradigm,” which serves as a model of decision making under limited knowledge about the statistical structure of the world. In this paradigm respondents are presented with two payoff distributions, which, in contrast to standard approaches in behavioral economics, are specified not in terms of explicit outcome-probability information, but by the opportunity to sample outcomes from each distribution without economic consequences. Participants are encouraged to explore the distributions until they feel confident enough to decide which distribution they would prefer to draw from in a final trial involving real monetary payoffs. One commonly employed measure to characterize the behavior of participants in the sampling paradigm is the sample size, that is, the number of outcome draws which participants choose to obtain from each distribution prior to terminating sampling. A natural question that arises in this context concerns the “optimal” sample size, which could be used as a normative benchmark to evaluate human sampling behavior in DFE. In this theoretical study, we relate the DFE sampling paradigm to the classical statistical decision theoretic literature and, under a probabilistic inference assumption, evaluate optimal sample sizes for DFE. In our treatment we go beyond analytically established results by showing how the classical statistical decision theoretic framework can be used to derive optimal sample sizes under arbitrary, but numerically evaluable, constraints. Finally, we critically evaluate the value of deriving optimal sample sizes under this framework as testable predictions for the experimental study of sampling behavior in DFE. PMID:26441720
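
    The flavour of such a normative calculation can be conveyed with a toy Python sketch under strong simplifying assumptions (two Bernoulli payoff distributions, independent Beta(1,1) priors, equal sample sizes per option, and a fixed cost per draw): for each candidate sample size, the expected payoff of sampling and then choosing the option with the higher posterior mean is estimated by Monte Carlo, and the net-payoff-maximizing sample size is reported. None of the numerical values are taken from the paper.

      # Toy calculation of an "optimal" sample size in the sampling paradigm:
      # two Bernoulli options with Beta(1,1) priors, n draws from each option,
      # a per-draw cost, and a final choice of the higher posterior mean.
      import numpy as np

      rng = np.random.default_rng(2)
      cost_per_draw = 0.005
      candidate_n = range(1, 41)
      n_sim = 4000

      def expected_net_payoff(n):
          payoff = 0.0
          for _ in range(n_sim):
              p = rng.uniform(size=2)             # true rates drawn from the prior
              x = rng.binomial(n, p)              # n draws from each option
              post_mean = (x + 1) / (n + 2)       # Beta(1,1) posterior means
              choice = int(post_mean[1] > post_mean[0])
              payoff += p[choice] - 2 * n * cost_per_draw
          return payoff / n_sim

      values = [expected_net_payoff(n) for n in candidate_n]
      best_n = list(candidate_n)[int(np.argmax(values))]
      print("approximately optimal sample size per option:", best_n)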

  20. STATISTICAL OPTIMIZATION OF PROCESS VARIABLES FOR ...

    African Journals Online (AJOL)

    2012-11-03

    The osmotic dehydration process was optimized for water loss and solutes gain. ... basis) with safe moisture content for storage (10% wet basis) [3]. Due to ... sucrose, glucose, fructose, corn syrup and sodium chloride have ...

  1. Inference of missing data and chemical model parameters using experimental statistics

    Science.gov (United States)

    Casey, Tiernan; Najm, Habib

    2017-11-01

    A method for determining the joint parameter density of Arrhenius rate expressions through the inference of missing experimental data is presented. This approach proposes noisy hypothetical data sets from target experiments and accepts those which agree with the reported statistics, in the form of nominal parameter values and their associated uncertainties. The data exploration procedure is formalized using Bayesian inference, employing maximum entropy and approximate Bayesian computation methods to arrive at a joint density on data and parameters. The method is demonstrated in the context of reactions in the H2-O2 system for predictive modeling of combustion systems of interest. Work supported by the US DOE BES CSGB. Sandia National Labs is a multimission lab managed and operated by Nat. Technology and Eng'g Solutions of Sandia, LLC., a wholly owned subsidiary of Honeywell Intl, for the US DOE NCSA under contract DE-NA-0003525.

  2. Optimal Design of Shock Tube Experiments for Parameter Inference

    KAUST Repository

    Bisetti, Fabrizio; Knio, Omar

    2014-01-01

    We develop a Bayesian framework for the optimal experimental design of the shock tube experiments which are being carried out at the KAUST Clean Combustion Research Center. The unknown parameters are the pre-exponential parameters and the activation

  3. An Efficient Forward-Reverse EM Algorithm for Statistical Inference in Stochastic Reaction Networks

    KAUST Repository

    Bayer, Christian

    2016-01-06

    In this work [1], we present an extension of the forward-reverse algorithm by Bayer and Schoenmakers [2] to the context of stochastic reaction networks (SRNs). We then apply this bridge-generation technique to the statistical inference problem of approximating the reaction coefficients based on discretely observed data. To this end, we introduce an efficient two-phase algorithm in which the first phase is deterministic and it is intended to provide a starting point for the second phase which is the Monte Carlo EM Algorithm.

  4. Developing a statistically powerful measure for quartet tree inference using phylogenetic identities and Markov invariants.

    Science.gov (United States)

    Sumner, Jeremy G; Taylor, Amelia; Holland, Barbara R; Jarvis, Peter D

    2017-12-01

    Recently there has been renewed interest in phylogenetic inference methods based on phylogenetic invariants, alongside the related Markov invariants. Broadly speaking, both these approaches give rise to polynomial functions of sequence site patterns that, in expectation value, either vanish for particular evolutionary trees (in the case of phylogenetic invariants) or have well understood transformation properties (in the case of Markov invariants). While both approaches have been valued for their intrinsic mathematical interest, it is not clear how they relate to each other, and to what extent they can be used as practical tools for inference of phylogenetic trees. In this paper, by focusing on the special case of binary sequence data and quartets of taxa, we are able to view these two different polynomial-based approaches within a common framework. To motivate the discussion, we present three desirable statistical properties that we argue any invariant-based phylogenetic method should satisfy: (1) sensible behaviour under reordering of input sequences; (2) stability as the taxa evolve independently according to a Markov process; and (3) explicit dependence on the assumption of a continuous-time process. Motivated by these statistical properties, we develop and explore several new phylogenetic inference methods. In particular, we develop a statistically bias-corrected version of the Markov invariants approach which satisfies all three properties. We also extend previous work by showing that the phylogenetic invariants can be implemented in such a way as to satisfy property (3). A simulation study shows that, in comparison to other methods, our new proposed approach based on bias-corrected Markov invariants is extremely powerful for phylogenetic inference. The binary case is of particular theoretical interest as, in this case only, the Markov invariants can be expressed as linear combinations of the phylogenetic invariants. A wider implication of this is that, for

  5. Statistical inference of the nuclear accidents occurrence number for the next decade

    International Nuclear Information System (INIS)

    Felizia, E.R.

    1987-01-01

    This paper aims to give an answer, using classical statistical and Bayesian inference techniques, regarding the characteristic common to the Harrisburg and Chernobyl nuclear accidents: in both reactors, core melt occurred. For the latter techniques, the most recent developments were applied, based on decision theory under uncertainty, among them the principle of maximum entropy. In addition, as preliminary information on the occurrence frequency of core-melt accidents, the German risk analysis results were used. The estimates predict, for the next decade, an average of one to two core-melt accidents and a low probability for the 'no accident' event in the same period. (Author)

  6. Robust optimization based upon statistical theory.

    Science.gov (United States)

    Sobotta, B; Söhn, M; Alber, M

    2010-08-01

    Organ movement is still the biggest challenge in cancer treatment despite advances in online imaging. Due to the resulting geometric uncertainties, the delivered dose cannot be predicted precisely at treatment planning time. Consequently, all associated dose metrics (e.g., EUD and maxDose) are random variables with a patient-specific probability distribution. The method that the authors propose makes these distributions the basis of the optimization and evaluation process. The authors start from a model of motion derived from patient-specific imaging. On a multitude of geometry instances sampled from this model, a dose metric is evaluated. The resulting pdf of this dose metric is termed outcome distribution. The approach optimizes the shape of the outcome distribution based on its mean and variance. This is in contrast to the conventional optimization of a nominal value (e.g., PTV EUD) computed on a single geometry instance. The mean and variance allow for an estimate of the expected treatment outcome along with the residual uncertainty. Besides being applicable to the target, the proposed method also seamlessly includes the organs at risk (OARs). The likelihood that a given value of a metric is reached in the treatment is predicted quantitatively. This information reveals potential hazards that may occur during the course of the treatment, thus helping the expert to find the right balance between the risk of insufficient normal tissue sparing and the risk of insufficient tumor control. By feeding this information to the optimizer, outcome distributions can be obtained where the probability of exceeding a given OAR maximum and that of falling short of a given target goal can be minimized simultaneously. The method is applicable to any source of residual motion uncertainty in treatment delivery. Any model that quantifies organ movement and deformation in terms of probability distributions can be used as basis for the algorithm. Thus, it can generate dose

  7. Application of maximum entropy to statistical inference for inversion of data from a single track segment.

    Science.gov (United States)

    Stotts, Steven A; Koch, Robert A

    2017-08-01

    In this paper an approach is presented to estimate the constraint required to apply maximum entropy (ME) for statistical inference with underwater acoustic data from a single track segment. Previous algorithms for estimating the ME constraint require multiple source track segments to determine the constraint. The approach is relevant for addressing model mismatch effects, i.e., inaccuracies in parameter values determined from inversions because the propagation model does not account for all acoustic processes that contribute to the measured data. One effect of model mismatch is that the lowest cost inversion solution may be well outside a relatively well-known parameter value's uncertainty interval (prior), e.g., source speed from track reconstruction or towed source levels. The approach requires, for some particular parameter value, the ME constraint to produce an inferred uncertainty interval that encompasses the prior. Motivating this approach is the hypothesis that the proposed constraint determination procedure would produce a posterior probability density that accounts for the effect of model mismatch on inferred values of other inversion parameters for which the priors might be quite broad. Applications to both measured and simulated data are presented for model mismatch that produces minimum cost solutions either inside or outside some priors.
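
    A schematic of the constraint-selection idea, under toy assumptions (a one-dimensional cost over candidate source speeds, a Gibbs-type posterior proportional to exp(-cost/T), and an illustrative prior interval; none of the numbers come from the paper): the effective temperature implied by the maximum-entropy constraint is raised until the inferred credible interval for the well-known parameter encompasses its prior.

      # Schematic of choosing the maximum-entropy constraint: raise the
      # effective temperature T of a Gibbs-type posterior exp(-cost/T) until
      # the credible interval for a well-known parameter covers its prior.
      import numpy as np

      speeds = np.linspace(2.0, 4.0, 401)          # candidate source speeds (m/s)
      cost = (speeds - 3.4) ** 2                    # toy mismatch cost; minimum at 3.4
      prior_interval = (2.9, 3.1)                   # well-known prior, excludes the minimum

      def credible_interval(T, mass=0.95):
          w = np.exp(-(cost - cost.min()) / T)
          w /= w.sum()
          cdf = np.cumsum(w)
          lo = speeds[np.searchsorted(cdf, (1 - mass) / 2)]
          hi = speeds[np.searchsorted(cdf, 1 - (1 - mass) / 2)]
          return lo, hi

      T = 1e-3
      while True:
          lo, hi = credible_interval(T)
          if lo <= prior_interval[0] and hi >= prior_interval[1]:
              break
          T *= 1.25                                 # widen the posterior gradually
      print("selected temperature:", T, "credible interval:", (lo, hi))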

  8. Optimal inverse magnetorheological damper modeling using shuffled frog-leaping algorithm–based adaptive neuro-fuzzy inference system approach

    Directory of Open Access Journals (Sweden)

    Xiufang Lin

    2016-08-01

    Full Text Available Magnetorheological dampers have become prominent semi-active control devices for vibration mitigation of structures which are subjected to severe loads. However, the damping force cannot be controlled directly due to the inherent nonlinear characteristics of the magnetorheological dampers. Therefore, for fully exploiting the capabilities of the magnetorheological dampers, one of the challenging aspects is to develop an accurate inverse model which can appropriately predict the input voltage to control the damping force. In this article, a hybrid modeling strategy combining the shuffled frog-leaping algorithm and an adaptive-network-based fuzzy inference system is proposed to model the inverse dynamic characteristics of the magnetorheological dampers for improving the modeling accuracy. The shuffled frog-leaping algorithm is employed to optimize the premise parameters of the adaptive-network-based fuzzy inference system while the consequent parameters are tuned by a least-squares estimation method; the combination is referred to here as the shuffled frog-leaping algorithm-based adaptive-network-based fuzzy inference system approach. To evaluate the effectiveness of the proposed approach, the inverse modeling results based on the shuffled frog-leaping algorithm-based adaptive-network-based fuzzy inference system approach are compared with those based on the adaptive-network-based fuzzy inference system and genetic algorithm-based adaptive-network-based fuzzy inference system approaches. An analysis of variance test is carried out to statistically compare the performance of the proposed methods, and the results demonstrate that the shuffled frog-leaping algorithm-based adaptive-network-based fuzzy inference system strategy outperforms the other two methods in terms of modeling (training) accuracy and checking accuracy.

  9. Truth, possibility and probability new logical foundations of probability and statistical inference

    CERN Document Server

    Chuaqui, R

    1991-01-01

    Anyone involved in the philosophy of science is naturally drawn into the study of the foundations of probability. Different interpretations of probability, based on competing philosophical ideas, lead to different statistical techniques, and frequently to mutually contradictory consequences. This unique book presents a new interpretation of probability, rooted in the traditional interpretation that was current in the 17th and 18th centuries. Mathematical models are constructed based on this interpretation, and statistical inference and decision theory are applied, including some examples in artificial intelligence, solving the main foundational problems. Nonstandard analysis is extensively developed for the construction of the models and in some of the proofs. Many nonstandard theorems are proved, some of them new, in particular, a representation theorem that asserts that any stochastic process can be approximated by a process defined over a space with equiprobable outcomes.

  10. Statistical inference with quantum measurements: methodologies for nitrogen vacancy centers in diamond

    Science.gov (United States)

    Hincks, Ian; Granade, Christopher; Cory, David G.

    2018-01-01

    The analysis of photon count data from the standard nitrogen vacancy (NV) measurement process is treated as a statistical inference problem. This has applications toward gaining better and more rigorous error bars for tasks such as parameter estimation (e.g. magnetometry), tomography, and randomized benchmarking. We start by providing a summary of the standard phenomenological model of the NV optical process in terms of Lindblad jump operators. This model is used to derive random variables describing emitted photons during measurement, to which finite visibility, dark counts, and imperfect state preparation are added. NV spin-state measurement is then stated as an abstract statistical inference problem consisting of an underlying biased coin obstructed by three Poisson rates. Relevant frequentist and Bayesian estimators are provided, discussed, and quantitatively compared. We show numerically that the risk of the maximum likelihood estimator is well approximated by the Cramér-Rao bound, for which we provide a simple formula. Of the estimators, we in particular promote the Bayes estimator, owing to its slightly better risk performance, and straightforward error propagation into more complex experiments. This is illustrated on experimental data, where quantum Hamiltonian learning is performed and cross-validated in a fully Bayesian setting, and compared to a more traditional weighted least squares fit.
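
    The abstract problem of a biased coin obstructed by Poisson rates can be illustrated numerically; the rates, shot count, and flat prior in the sketch below are assumptions for the illustration, not the paper's calibration. Per-shot photon counts are modeled as a two-component Poisson mixture, and the likelihood over the coin bias is evaluated on a grid, yielding both the maximum-likelihood and the Bayes (posterior-mean) estimate.

      # Biased coin obstructed by Poisson rates: per-shot counts follow the
      # mixture a*Poisson(bright) + (1-a)*Poisson(dark); infer the bias a on
      # a grid. Rates and data are illustrative, not an NV calibration.
      import numpy as np
      from scipy.stats import poisson

      rng = np.random.default_rng(3)
      bright, dark = 0.05, 0.02            # mean counts per shot, incl. background
      true_alpha, shots = 0.7, 200000
      is_bright = rng.uniform(size=shots) < true_alpha
      counts = rng.poisson(np.where(is_bright, bright, dark))

      hist = np.bincount(counts)                       # aggregate identical counts
      ks = np.arange(len(hist))[:, None]
      alpha = np.linspace(0.0, 1.0, 501)
      mix = alpha * poisson.pmf(ks, bright) + (1 - alpha) * poisson.pmf(ks, dark)
      loglik = hist @ np.log(mix)

      post = np.exp(loglik - loglik.max())             # flat prior on alpha
      post /= post.sum()
      print("MLE:", alpha[np.argmax(loglik)],
            "Bayes posterior mean:", float(np.sum(alpha * post)))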

  11. Large scale statistical inference of signaling pathways from RNAi and microarray data

    Directory of Open Access Journals (Sweden)

    Poustka Annemarie

    2007-10-01

    Full Text Available Abstract Background The advent of RNA interference techniques enables the selective silencing of biologically interesting genes in an efficient way. In combination with DNA microarray technology this enables researchers to gain insights into signaling pathways by observing downstream effects of individual knock-downs on gene expression. These secondary effects can be used to computationally reverse engineer features of the upstream signaling pathway. Results In this paper we address this challenging problem by extending previous work by Markowetz et al., who proposed a statistical framework to score network hypotheses in a Bayesian manner. Our extensions go in three directions: First, we introduce a way to omit the data discretization step needed in the original framework via a calculation based on p-values instead. Second, we show how prior assumptions on the network structure can be incorporated into the scoring scheme using regularization techniques. Third and most important, we propose methods to scale up the original approach, which is limited to around 5 genes, to large-scale networks. Conclusion Comparisons of these methods on artificial data are conducted. Our proposed module network is employed to infer the signaling network between 13 genes in the ER-α pathway in human MCF-7 breast cancer cells. Using a bootstrapping approach this reconstruction can be found with good statistical stability. The code for the module network inference method is available in the latest version of the R-package nem, which can be obtained from the Bioconductor homepage.

  12. A statistical method for lung tumor segmentation uncertainty in PET images based on user inference.

    Science.gov (United States)

    Zheng, Chaojie; Wang, Xiuying; Feng, Dagan

    2015-01-01

    PET has been widely accepted as an effective imaging modality for lung tumor diagnosis and treatment. However, standard criteria for delineating the tumor boundary from PET have yet to be developed, largely due to the relatively low quality of PET images, uncertain tumor boundary definition, and the variety of tumor characteristics. In this paper, we propose a statistical solution to segmentation uncertainty on the basis of user inference. We first define the uncertainty segmentation band on the basis of a segmentation probability map constructed from the Random Walks (RW) algorithm; then, based on the extracted features of the user inference, we use Principal Component Analysis (PCA) to formulate the statistical model for labeling the uncertainty band. We validated our method on 10 lung PET-CT phantom studies from the public RIDER collections [1] and 16 clinical PET studies where tumors were manually delineated by two experienced radiologists. The methods were validated using the Dice similarity coefficient (DSC) to measure the spatial volume overlap. Our method achieved an average DSC of 0.878 ± 0.078 on phantom studies and 0.835 ± 0.039 on clinical studies.

  13. Statistical inference of seabed sound-speed structure in the Gulf of Oman Basin.

    Science.gov (United States)

    Sagers, Jason D; Knobles, David P

    2014-06-01

    Addressed is the statistical inference of the sound-speed depth profile of a thick soft seabed from broadband sound propagation data recorded in the Gulf of Oman Basin in 1977. The acoustic data are in the form of time series signals recorded on a sparse vertical line array and generated by explosive sources deployed along a 280 km track. The acoustic data offer a unique opportunity to study a deep-water bottom-limited thickly sedimented environment because of the large number of time series measurements, very low seabed attenuation, and auxiliary measurements. A maximum entropy method is employed to obtain a conditional posterior probability distribution (PPD) for the sound-speed ratio and the near-surface sound-speed gradient. The multiple data samples allow for a determination of the average error constraint value required to uniquely specify the PPD for each data sample. Two complicating features of the statistical inference study are addressed: (1) the need to develop an error function that can both utilize the measured multipath arrival structure and mitigate the effects of data errors and (2) the effect of small bathymetric slopes on the structure of the bottom interacting arrivals.

  14. Statistical inference approach to structural reconstruction of complex networks from binary time series

    Science.gov (United States)

    Ma, Chuang; Chen, Han-Shuang; Lai, Ying-Cheng; Zhang, Hai-Feng

    2018-02-01

    Complex networks hosting binary-state dynamics arise in a variety of contexts. In spite of previous works, to fully reconstruct the network structure from observed binary data remains challenging. We articulate a statistical inference based approach to this problem. In particular, exploiting the expectation-maximization (EM) algorithm, we develop a method to ascertain the neighbors of any node in the network based solely on binary data, thereby recovering the full topology of the network. A key ingredient of our method is the maximum-likelihood estimation of the probabilities associated with actual or nonexistent links, and we show that the EM algorithm can distinguish the two kinds of probability values without any ambiguity, insofar as the length of the available binary time series is reasonably long. Our method does not require any a priori knowledge of the detailed dynamical processes, is parameter-free, and is capable of accurate reconstruction even in the presence of noise. We demonstrate the method using combinations of distinct types of binary dynamical processes and network topologies, and provide a physical understanding of the underlying reconstruction mechanism. Our statistical inference based reconstruction method contributes an additional piece to the rapidly expanding "toolbox" of data based reverse engineering of complex networked systems.
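
    The role played by the EM step here (separating the probability values of actual links from those of nonexistent links) can be illustrated with a deliberately simplified Python sketch: binary time series are generated on a known random network, a pairwise co-activation score is computed for every candidate link, and a two-component Gaussian mixture fitted by EM separates 'link' from 'non-link' pairs. The dynamics, the score, and the mixture model are illustrative simplifications, not the reconstruction method of the paper.

      # Simplified illustration of the EM idea: separate "link" from
      # "non-link" pairs by fitting a two-component 1-D Gaussian mixture to
      # pairwise co-activation scores from binary time series. This is a
      # deliberately reduced stand-in for the paper's likelihood model.
      import numpy as np

      rng = np.random.default_rng(4)
      n, T = 20, 5000
      adj = np.triu((rng.uniform(size=(n, n)) < 0.10).astype(int), 1)
      adj = adj + adj.T                                     # ground-truth network
      x = np.zeros((T, n), dtype=int)
      x[0] = rng.integers(0, 2, size=n)
      for t in range(1, T):                                 # toy binary dynamics
          drive = adj @ x[t - 1]
          x[t] = rng.uniform(size=n) < (0.05 + 0.5 * (drive > 0))

      s = np.corrcoef(x.T)[np.triu_indices(n, 1)]           # co-activation scores
      mu = np.array([s.min(), s.max()])
      sig = np.array([s.std(), s.std()])
      w = np.array([0.5, 0.5])
      for _ in range(200):                                  # EM iterations
          dens = w * np.exp(-0.5 * ((s[:, None] - mu) / sig) ** 2) / sig
          resp = dens / np.maximum(dens.sum(axis=1, keepdims=True), 1e-300)  # E-step
          w = resp.mean(axis=0)                                              # M-step
          mu = (resp * s[:, None]).sum(axis=0) / resp.sum(axis=0)
          sig = np.maximum(np.sqrt((resp * (s[:, None] - mu) ** 2).sum(axis=0)
                                   / resp.sum(axis=0)), 1e-3)
      links = resp[:, np.argmax(mu)] > 0.5                  # pairs assigned to high mode
      truth = adj[np.triu_indices(n, 1)].astype(bool)
      print("correctly classified pairs:", float((links == truth).mean()))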

  15. BIG-DATA and the Challenges for Statistical Inference and Economics Teaching and Learning

    Directory of Open Access Journals (Sweden)

    J.L. Peñaloza Figueroa

    2017-04-01

    Full Text Available The increasing automation in data collection, in both structured and unstructured formats, together with the development of reading, concatenation and comparison algorithms and the growing analytical skills that characterize the era of Big Data, cannot be considered merely a technological achievement; it is also an organizational, methodological and analytical challenge for knowledge, one that is necessary to generate opportunities and added value. In fact, exploiting the potential of Big Data extends to all fields of community activity; and given its ability to extract behaviour patterns, we are interested in the challenges it poses for teaching and learning, particularly in the fields of statistical inference and economic theory. Big Data can improve the understanding of concepts, models and techniques used in both statistical inference and economic theory, and it can also generate reliable and robust short- and long-term predictions. These facts have led to a demand for analytical capabilities, which in turn encourages teachers and students to demand access to the massive information produced by individuals, companies, and public and private organizations in their transactions and inter-relationships. Mass data (Big Data) is changing the way people access, understand and organize knowledge, which in turn is causing a shift in the approach to statistics and economics teaching, treating them as genuine ways of thinking rather than just operational and technical disciplines. Hence, the question is how teachers can use automated collection and analytical skills to their advantage when teaching statistics and economics, and whether this will lead to a change in what is taught and how it is taught.

  16. Optimal Design of Shock Tube Experiments for Parameter Inference

    KAUST Repository

    Bisetti, Fabrizio

    2014-01-06

    We develop a Bayesian framework for the optimal experimental design of the shock tube experiments being carried out at the KAUST Clean Combustion Research Center. The unknown parameters are the pre-exponential parameters and the activation energies in the reaction rate expressions. The control parameters are the initial mixture composition and the temperature. The approach is based on first building a polynomial-based surrogate model for the observables relevant to the shock tube experiments. Based on these surrogates, a novel MAP-based approach is used to estimate the expected information gain of the proposed experiments and to select the experimental set-ups yielding the optimal expected information gains. The validity of the approach is tested using synthetic data generated by sampling the PC surrogate. We finally outline a methodology for validation using actual laboratory experiments and for extending the experimental design methodology to cases where the control parameters are noisy.
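
    The record above uses a MAP-based estimator of expected information gain over a polynomial chaos (PC) surrogate. As a loosely related illustration only, the sketch below estimates the expected information gain of a design with a generic nested Monte Carlo estimator, a toy surrogate and Gaussian measurement noise; the surrogate, prior ranges and noise level are all invented for the example and are not the paper's models.

      import numpy as np

      rng = np.random.default_rng(0)

      def surrogate(theta, d):
          """Toy stand-in for a PC surrogate of the observable (not the KAUST model)."""
          return np.exp(-theta[..., 0] * d) + theta[..., 1] * d

      def expected_information_gain(d, n_outer=500, n_inner=500, noise_sd=0.05):
          """Nested Monte Carlo estimate of EIG(d) = E_{theta,y}[log p(y|theta,d) - log p(y|d)]."""
          prior = lambda n: rng.uniform([0.5, 0.0], [1.5, 1.0], size=(n, 2))
          theta_out = prior(n_outer)
          y = surrogate(theta_out, d) + noise_sd * rng.standard_normal(n_outer)
          log_lik = -0.5 * ((y - surrogate(theta_out, d)) / noise_sd) ** 2   # up to a constant
          theta_in = prior(n_inner)
          # evidence p(y|d) approximated by averaging the likelihood over fresh prior draws
          diff = (y[:, None] - surrogate(theta_in, d)[None, :]) / noise_sd
          log_evid = np.log(np.mean(np.exp(-0.5 * diff ** 2), axis=1))       # same constant omitted
          return np.mean(log_lik - log_evid)

      for d in (0.5, 1.0, 2.0):
          print(d, round(expected_information_gain(d), 3))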

  17. Software defined network inference with evolutionary optimal observation matrices

    OpenAIRE

    Malboubi, M; Gong, Y; Yang, Z; Wang, X; Chuah, CN; Sharma, P

    2017-01-01

    © 2017 Elsevier B.V. A key requirement for network management is the accurate and reliable monitoring of relevant network characteristics. In today's large-scale networks, this is a challenging task due to the scarcity of network measurement resources and the hard constraints that this imposes. This paper proposes a new framework, called SNIPER, which leverages the flexibility provided by Software-Defined Networking (SDN) to design the optimal observation or measurement matrix that can lead t...

  18. Statistical fluctuations of an ocean surface inferred from shoes and ships

    Science.gov (United States)

    Lerche, Ian; Maubeuge, Frédéric

    1995-12-01

    This paper shows that it is possible to roughly estimate some ocean properties using simple time-dependent statistical models of ocean fluctuations. Based on a real incident, the loss of a container of Nike shoes from a vessel in the North Pacific Ocean, a statistical model was tested on data sets consisting of the Nike shoes found by beachcombers a few months later. This statistical treatment of the shoes' motion allows one to infer velocity trends of the Pacific Ocean, together with their fluctuation strengths. The idea is to suppose that there is a mean bulk flow speed that can depend on location on the ocean surface and on time. The fluctuations of the surface flow speed are then treated as statistically random. The distribution of shoes is described in space and time using Markov probability processes related to the mean and fluctuating ocean properties. The aim of the exercise is to provide some of the properties of the Pacific Ocean that are otherwise calculated using a sophisticated numerical model, OSCURS, for which numerous data are needed. Relevant quantities are sharply estimated, which can be useful to (1) constrain output results from OSCURS computations, and (2) elucidate the behavior patterns of ocean flow characteristics on long time scales.

  19. Confidence intervals permit, but don't guarantee, better inference than statistical significance testing

    Directory of Open Access Journals (Sweden)

    Melissa Coulson

    2010-07-01

    Full Text Available A statistically significant result and a non-significant result may differ little, although significance status may tempt an interpretation of difference. Two studies are reported that compared interpretation of such results presented using null hypothesis significance testing (NHST) or confidence intervals (CIs). Authors of articles published in psychology, behavioural neuroscience, and medical journals were asked, via email, to interpret two fictitious studies that found similar results, one statistically significant and the other non-significant. Responses from 330 authors varied greatly, but interpretation was generally poor, whether results were presented as CIs or using NHST. However, when interpreting CIs, respondents who mentioned NHST were 60% likely to conclude, unjustifiably, that the two results conflicted, whereas those who interpreted CIs without reference to NHST were 95% likely to conclude, justifiably, that the two results were consistent. Findings were generally similar for all three disciplines. An email survey of academic psychologists confirmed that CIs elicit better interpretations if NHST is not invoked. Improved statistical inference can result from encouragement of meta-analytic thinking and use of CIs but, for full benefit, such highly desirable statistical reform also requires that researchers interpret CIs without recourse to NHST.

  20. Statistical optimization of cultural conditions by response surface ...

    African Journals Online (AJOL)

    STORAGESEVER

    2009-08-04

    Aug 4, 2009 ... Full Length Research Paper. Statistical optimization of cultural conditions by response surface methodology for phenol degradation by a novel ... Phenol is a hydrocarbon compound that is highly toxic, ... Microorganism.

  1. On statistical inference in time series analysis of the evolution of road safety.

    Science.gov (United States)

    Commandeur, Jacques J F; Bijleveld, Frits D; Bergel-Hayat, Ruth; Antoniou, Constantinos; Yannis, George; Papadimitriou, Eleonora

    2013-11-01

    Data collected for building a road safety observatory usually include observations made sequentially through time. Examples of such data, called time series data, include the annual (or monthly) number of road traffic accidents, traffic fatalities or vehicle kilometers driven in a country, as well as the corresponding values of safety performance indicators (e.g., data on speeding, seat belt use, alcohol use, etc.). Some commonly used statistical techniques imply assumptions that are often violated by the special properties of time series data, namely serial dependency among disturbances associated with the observations. The first objective of this paper is to demonstrate the impact of such violations on the applicability of standard methods of statistical inference, which leads to an under- or overestimation of the standard error and consequently may produce erroneous inferences. Moreover, having established the adverse consequences of ignoring serial dependency issues, the paper aims to describe rigorous statistical techniques used to overcome them. In particular, appropriate time series analysis techniques of varying complexity are employed to describe the development over time, relating accident occurrences to explanatory factors such as exposure measures or safety performance indicators, and forecasting the development into the near future. Traditional regression models (whether linear, generalized linear or nonlinear) are shown not to naturally capture the inherent dependencies in time series data. Dedicated time series analysis techniques, such as the ARMA-type and DRAG approaches, are discussed next, followed by structural time series models, which are a subclass of state space methods. The paper concludes with general recommendations and practice guidelines for the use of time series models in road safety research. Copyright © 2012 Elsevier Ltd. All rights reserved.
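
    To make the serial-dependency point above concrete, the hedged sketch below simulates a toy trend-plus-AR(1) series and compares the trend standard error from ordinary regression (which ignores the dependency) with that from an ARMA-type model fitted with statsmodels. The simulated data and model orders are illustrative and have nothing to do with the road-safety data discussed in the record.

      import numpy as np
      import pandas as pd
      import statsmodels.api as sm
      from statsmodels.tsa.arima.model import ARIMA

      rng = np.random.default_rng(0)
      n = 120                                      # e.g. ten years of monthly counts (toy data)
      t = np.arange(n, dtype=float)
      eps = np.zeros(n)
      for i in range(1, n):                        # AR(1) disturbances, phi = 0.7
          eps[i] = 0.7 * eps[i - 1] + rng.standard_normal()
      y = pd.Series(100 - 0.3 * t + 5 * eps, name="accidents")
      X = pd.DataFrame({"t": t})

      # ordinary regression: the trend standard error ignores serial dependency in the disturbances
      ols = sm.OLS(y, sm.add_constant(X)).fit()
      print("OLS trend    ", round(ols.params["t"], 3), " s.e.", round(ols.bse["t"], 4))

      # ARMA-type alternative: the same regression with AR(1) errors
      arma = ARIMA(y, exog=X, order=(1, 0, 0)).fit()
      print("AR(1) errors ", round(arma.params["t"], 3), " s.e.", round(arma.bse["t"], 4))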

  2. Optimizing Groundwater Monitoring Networks Using Integrated Statistical and Geostatistical Approaches

    Directory of Open Access Journals (Sweden)

    Jay Krishna Thakur

    2015-08-01

    Full Text Available The aim of this work is to investigate new approaches using methods based on statistics and geo-statistics for spatio-temporal optimization of groundwater monitoring networks. The formulated and integrated methods were tested with the groundwater quality data set of Bitterfeld/Wolfen, Germany. Spatially, the monitoring network was optimized using geo-statistical methods. Temporal optimization of the monitoring network was carried out using Sen’s method (1968). For geostatistical network optimization, a geostatistical spatio-temporal algorithm was used to identify redundant wells in 2- and 2.5-D Quaternary and Tertiary aquifers. Influences of interpolation block width, dimension, contaminant association, groundwater flow direction and aquifer homogeneity on statistical and geostatistical methods for monitoring network optimization were analysed. The integrated approach shows 37% and 28% redundancy in the monitoring network in the Quaternary and Tertiary aquifers, respectively. The geostatistical method also recommends 41 and 22 new monitoring wells in the Quaternary and Tertiary aquifers, respectively. In temporal optimization, an overall optimized sampling interval was recommended in terms of the lower quartile (238 days), median (317 days) and upper quartile (401 days) in the research area of Bitterfeld/Wolfen. The demonstrated methods for improving a groundwater monitoring network can be used in real monitoring network optimization with due consideration given to the influencing factors.

  3. Survey design, statistical analysis, and basis for statistical inferences in coastal habitat injury assessment: Exxon Valdez oil spill

    International Nuclear Information System (INIS)

    McDonald, L.L.; Erickson, W.P.; Strickland, M.D.

    1995-01-01

    The objective of the Coastal Habitat Injury Assessment study was to document and quantify injury to biota of the shallow subtidal, intertidal, and supratidal zones throughout the shoreline affected by oil or cleanup activity associated with the Exxon Valdez oil spill. The results of these studies were to be used to support the Trustee's Type B Natural Resource Damage Assessment under the Comprehensive Environmental Response, Compensation, and Liability Act of 1980 (CERCLA). A probability based stratified random sample of shoreline segments was selected with probability proportional to size from each of 15 strata (5 habitat types crossed with 3 levels of potential oil impact) based on those data available in July, 1989. Three study regions were used: Prince William Sound, Cook Inlet/Kenai Peninsula, and Kodiak/Alaska Peninsula. A Geographic Information System was utilized to combine oiling and habitat data and to select the probability sample of study sites. Quasi-experiments were conducted where randomly selected oiled sites were compared to matched reference sites. Two levels of statistical inferences, philosophical bases, and limitations are discussed and illustrated with example data from the resulting studies. 25 refs., 4 figs., 1 tab

  4. Challenges and Approaches to Statistical Design and Inference in High Dimensional Investigations

    Science.gov (United States)

    Garrett, Karen A.; Allison, David B.

    2015-01-01

    Summary Advances in modern technologies have facilitated high-dimensional experiments (HDEs) that generate tremendous amounts of genomic, proteomic, and other “omic” data. HDEs involving whole-genome sequences and polymorphisms, expression levels of genes, protein abundance measurements, and combinations thereof have become a vanguard for new analytic approaches to the analysis of HDE data. Such situations demand creative approaches to the processes of statistical inference, estimation, prediction, classification, and study design. The novel and challenging biological questions asked from HDE data have resulted in many specialized analytic techniques being developed. This chapter discusses some of the unique statistical challenges facing investigators studying high-dimensional biology, and describes some approaches being developed by statistical scientists. We have included some focus on the increasing interest in questions involving testing multiple propositions simultaneously, appropriate inferential indicators for the types of questions biologists are interested in, and the need for replication of results across independent studies, investigators, and settings. A key consideration inherent throughout is the challenge in providing methods that a statistician judges to be sound and a biologist finds informative. PMID:19588106

  5. Challenges and approaches to statistical design and inference in high-dimensional investigations.

    Science.gov (United States)

    Gadbury, Gary L; Garrett, Karen A; Allison, David B

    2009-01-01

    Advances in modern technologies have facilitated high-dimensional experiments (HDEs) that generate tremendous amounts of genomic, proteomic, and other "omic" data. HDEs involving whole-genome sequences and polymorphisms, expression levels of genes, protein abundance measurements, and combinations thereof have become a vanguard for new analytic approaches to the analysis of HDE data. Such situations demand creative approaches to the processes of statistical inference, estimation, prediction, classification, and study design. The novel and challenging biological questions asked from HDE data have resulted in many specialized analytic techniques being developed. This chapter discusses some of the unique statistical challenges facing investigators studying high-dimensional biology and describes some approaches being developed by statistical scientists. We have included some focus on the increasing interest in questions involving testing multiple propositions simultaneously, appropriate inferential indicators for the types of questions biologists are interested in, and the need for replication of results across independent studies, investigators, and settings. A key consideration inherent throughout is the challenge in providing methods that a statistician judges to be sound and a biologist finds informative.

  6. Statistical Inference on Stochastic Dominance Efficiency. Do Omitted Risk Factors Explain the Size and Book-to-Market Effects?

    NARCIS (Netherlands)

    G.T. Post (Thierry)

    2003-01-01

    This paper discusses statistical inference on the second-order stochastic dominance (SSD) efficiency of a given portfolio relative to all portfolios formed from a set of assets. We derive the asymptotic sampling distribution of the Post test statistic for SSD efficiency. Unfortunately, a

  7. Difference-of-Convex optimization for variational kl-corrected inference in dirichlet process mixtures

    DEFF Research Database (Denmark)

    Bonnevie, Rasmus; Schmidt, Mikkel Nørgaard; Mørup, Morten

    2017-01-01

    Variational methods for approximate inference in Bayesian models optimise a lower bound on the marginal likelihood, but the optimization problem often suffers from being nonconvex and high-dimensional. This can be alleviated by working in a collapsed domain where a part of the parameter space...

  8. Statistical distributions of optimal global alignment scores of random protein sequences

    Directory of Open Access Journals (Sweden)

    Tang Jiaowei

    2005-10-01

    Full Text Available Abstract Background The inference of homology from statistically significant sequence similarity is a central issue in sequence alignments. So far the statistical distribution function underlying the optimal global alignments has not been completely determined. Results In this study, random and real but unrelated sequences prepared in six different ways were selected as reference datasets to obtain their respective statistical distributions of global alignment scores. All alignments were carried out with the Needleman-Wunsch algorithm and optimal scores were fitted to the Gumbel, normal and gamma distributions respectively. The three-parameter gamma distribution performs the best as the theoretical distribution function of global alignment scores, as it agrees perfectly well with the distribution of alignment scores. The normal distribution also agrees well with the score distribution frequencies when the shape parameter of the gamma distribution is sufficiently large, for this is the scenario when the normal distribution can be viewed as an approximation of the gamma distribution. Conclusion We have shown that the optimal global alignment scores of random protein sequences fit the three-parameter gamma distribution function. This would be useful for the inference of homology between sequences whose relationship is unknown, through the evaluation of gamma distribution significance between sequences.
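
    As a minimal illustration of the kind of distribution fitting described above, the sketch below fits the three-parameter gamma, the normal and the Gumbel distributions to a set of scores with scipy and compares log-likelihoods and Kolmogorov-Smirnov statistics. The synthetic scores merely stand in for Needleman-Wunsch alignment scores; they are not data from the study.

      import numpy as np
      from scipy import stats

      # synthetic stand-in for optimal global alignment scores of random sequence pairs
      scores = stats.gamma.rvs(a=6.0, loc=40.0, scale=3.0, size=5000, random_state=0)

      candidates = {
          "gamma (3-parameter)": stats.gamma,
          "normal": stats.norm,
          "gumbel": stats.gumbel_r,
      }
      for name, dist in candidates.items():
          params = dist.fit(scores)                                # MLE of all free parameters
          loglik = np.sum(dist.logpdf(scores, *params))
          ks = stats.kstest(scores, dist.cdf, args=params).statistic
          print(f"{name:20s} log-likelihood={loglik:10.1f}  KS={ks:.4f}")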

  9. A Statistical Approach to Optimizing Concrete Mixture Design

    OpenAIRE

    Ahmad, Shamsad; Alghamdi, Saeid A.

    2014-01-01

    A step-by-step statistical approach is proposed to obtain optimum proportioning of concrete mixtures using the data obtained through a statistically planned experimental program. The utility of the proposed approach for optimizing the design of concrete mixture is illustrated considering a typical case in which trial mixtures were considered according to a full factorial experiment design involving three factors and their three levels (3^3). A total of 27 concrete mixtures with three replicate...

  10. AD Model Builder: using automatic differentiation for statistical inference of highly parameterized complex nonlinear models

    DEFF Research Database (Denmark)

    Fournier, David A.; Skaug, Hans J.; Ancheta, Johnoel

    2011-01-01

    Many criteria for statistical parameter estimation, such as maximum likelihood, are formulated as a nonlinear optimization problem. Automatic Differentiation Model Builder (ADMB) is a programming framework based on automatic differentiation, aimed at highly nonlinear models with a large number ... of such a feature is the generic implementation of Laplace approximation of high-dimensional integrals for use in latent variable models. We also review the literature in which ADMB has been used, and discuss future development of ADMB as an open source project. Overall, the main advantages of ADMB are flexibility...

  11. Statistical inference for the additive hazards model under outcome-dependent sampling.

    Science.gov (United States)

    Yu, Jichang; Liu, Yanyan; Sandler, Dale P; Zhou, Haibo

    2015-09-01

    Cost-effective study design and proper inference procedures for data from such designs are always of particular interest to study investigators. In this article, we propose a biased sampling scheme, an outcome-dependent sampling (ODS) design, for survival data with right censoring under the additive hazards model. We develop a weighted pseudo-score estimator for the regression parameters under the proposed design and derive the asymptotic properties of the proposed estimator. We also provide some suggestions for using the proposed method by evaluating its relative efficiency against the simple random sampling design, and we derive the optimal allocation of the subsamples for the proposed design. Simulation studies show that the proposed ODS design is more powerful than other existing designs and that the proposed estimator is more efficient than other estimators. We apply our method to a cancer study conducted at NIEHS, the Cancer Incidence and Mortality of Uranium Miners Study, to examine the risk of cancer associated with radon exposure.

  12. Racing to learn: statistical inference and learning in a single spiking neuron with adaptive kernels.

    Science.gov (United States)

    Afshar, Saeed; George, Libin; Tapson, Jonathan; van Schaik, André; Hamilton, Tara J

    2014-01-01

    This paper describes the Synapto-dendritic Kernel Adapting Neuron (SKAN), a simple spiking neuron model that performs statistical inference and unsupervised learning of spatiotemporal spike patterns. SKAN is the first proposed neuron model to investigate the effects of dynamic synapto-dendritic kernels and demonstrate their computational power even at the single neuron scale. The rule-set defining the neuron is simple: there are no complex mathematical operations such as normalization, exponentiation or even multiplication. The functionalities of SKAN emerge from the real-time interaction of simple additive and binary processes. Like a biological neuron, SKAN is robust to signal and parameter noise, and can utilize both in its operations. At the network scale neurons are locked in a race with each other with the fastest neuron to spike effectively "hiding" its learnt pattern from its neighbors. The robustness to noise, high speed, and simple building blocks not only make SKAN an interesting neuron model in computational neuroscience, but also make it ideal for implementation in digital and analog neuromorphic systems which is demonstrated through an implementation in a Field Programmable Gate Array (FPGA). Matlab, Python, and Verilog implementations of SKAN are available at: http://www.uws.edu.au/bioelectronics_neuroscience/bens/reproducible_research.

  13. Hippocampal Structure Predicts Statistical Learning and Associative Inference Abilities during Development.

    Science.gov (United States)

    Schlichting, Margaret L; Guarino, Katharine F; Schapiro, Anna C; Turk-Browne, Nicholas B; Preston, Alison R

    2017-01-01

    Despite the importance of learning and remembering across the lifespan, little is known about how the episodic memory system develops to support the extraction of associative structure from the environment. Here, we relate individual differences in volumes along the hippocampal long axis to performance on statistical learning and associative inference tasks-both of which require encoding associations that span multiple episodes-in a developmental sample ranging from ages 6 to 30 years. Relating age to volume, we found dissociable patterns across the hippocampal long axis, with opposite nonlinear volume changes in the head and body. These structural differences were paralleled by performance gains across the age range on both tasks, suggesting improvements in the cross-episode binding ability from childhood to adulthood. Controlling for age, we also found that smaller hippocampal heads were associated with superior behavioral performance on both tasks, consistent with this region's hypothesized role in forming generalized codes spanning events. Collectively, these results highlight the importance of examining hippocampal development as a function of position along the hippocampal axis and suggest that the hippocampal head is particularly important in encoding associative structure across development.

  14. A probabilistic framework for microarray data analysis: fundamental probability models and statistical inference.

    Science.gov (United States)

    Ogunnaike, Babatunde A; Gelmi, Claudio A; Edwards, Jeremy S

    2010-05-21

    Gene expression studies generate large quantities of data with the defining characteristic that the number of genes (whose expression profiles are to be determined) exceeds the number of available replicates by several orders of magnitude. Standard spot-by-spot analysis still seeks to extract useful information for each gene on the basis of the number of available replicates, and thus plays to the weakness of microarrays. On the other hand, because of the data volume, treating the entire data set as an ensemble and developing theoretical distributions for these ensembles provides a framework that plays instead to the strength of microarrays. We present theoretical results showing that, under reasonable assumptions, the distribution of microarray intensities follows the Gamma model, with the biological interpretations of the model parameters emerging naturally. We subsequently establish that, for each microarray data set, the fractional intensities can be represented as a mixture of Beta densities, and we develop a procedure for using these results to draw statistical inferences regarding differential gene expression. We illustrate the results with experimental data from gene expression studies on Deinococcus radiodurans following DNA damage using cDNA microarrays. Copyright (c) 2010 Elsevier Ltd. All rights reserved.

  15. Maximum entropy approach to statistical inference for an ocean acoustic waveguide.

    Science.gov (United States)

    Knobles, D P; Sagers, J D; Koch, R A

    2012-02-01

    A conditional probability distribution suitable for estimating the statistical properties of ocean seabed parameter values inferred from acoustic measurements is derived from a maximum entropy principle. The specification of the expectation value for an error function constrains the maximization of an entropy functional. This constraint determines the sensitivity factor (β) to the error function of the resulting probability distribution, which is a canonical form that provides a conservative estimate of the uncertainty of the parameter values. From the conditional distribution, marginal distributions for individual parameters can be determined from integration over the other parameters. The approach is an alternative to obtaining the posterior probability distribution without an intermediary determination of the likelihood function followed by an application of Bayes' rule. In this paper the expectation value that specifies the constraint is determined from the values of the error function for the model solutions obtained from a sparse number of data samples. The method is applied to ocean acoustic measurements taken on the New Jersey continental shelf. The marginal probability distribution for the values of the sound speed ratio at the surface of the seabed and the source levels of a towed source are examined for different geoacoustic model representations. © 2012 Acoustical Society of America
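
    The canonical form described above can be illustrated on a toy two-parameter grid: the sketch below builds p(m) proportional to exp(-beta * E(m)) from a placeholder error function, normalizes it, and sums over each parameter to obtain marginal distributions. The parameter ranges, error function and beta value are assumptions for illustration, not the values inferred from the New Jersey shelf data.

      import numpy as np

      # grid over two seabed parameters (illustrative names and ranges)
      sound_speed_ratio = np.linspace(0.98, 1.06, 81)
      gradient = np.linspace(0.0, 2.0, 61)
      R, G = np.meshgrid(sound_speed_ratio, gradient, indexing="ij")

      def error_function(r, g):
          """Placeholder data-model misfit; in practice this compares modelled and measured arrivals."""
          return (r - 1.02) ** 2 / 0.02 ** 2 + (g - 0.9) ** 2 / 0.5 ** 2

      beta = 1.0                                   # sensitivity factor set by the error constraint
      E = error_function(R, G)
      post = np.exp(-beta * (E - E.min()))         # canonical (maximum entropy) distribution
      post /= post.sum()

      # marginal distributions for each parameter, obtained by summing over the other
      marg_ratio = post.sum(axis=1)
      marg_grad = post.sum(axis=0)
      print("MAP sound-speed ratio:", sound_speed_ratio[np.argmax(marg_ratio)])
      print("MAP near-surface gradient:", gradient[np.argmax(marg_grad)])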

  16. Statistical inference and comparison of stochastic models for the hydraulic conductivity at the Finnsjoen-site

    International Nuclear Information System (INIS)

    Norman, S.

    1992-04-01

    The aim of this study was to find a good, or even the best, stochastic model for the hydraulic conductivity field at the Finnsjoe site. The conductivity fields in question are regularized, that is, upscaled. The reason for performing regularization of measurement data is primarily the need for long correlation scales, which are required in order to model reasonably large domains that can be used to describe regional groundwater flow accurately. A theory of regularization is discussed in this report. In order to find the best model, jackknifing is employed to compare different stochastic models, and the theory for this method is described. In doing so we also take a look at linear predictor theory, so-called kriging, and include a general discussion of stochastic functions and intrinsic random functions. The statistical inference methods for finding the models are also described, in particular regression, iterative generalized regression (IGLSE) and non-parametric variogram estimators. A large number of results are presented for a regularization scale of 36 metres. (30 refs.) (au)

  17. Population-based statistical inference for temporal sequence of somatic mutations in cancer genomes.

    Science.gov (United States)

    Rhee, Je-Keun; Kim, Tae-Min

    2018-04-20

    It is well recognized that accumulation of somatic mutations in cancer genomes plays a role in carcinogenesis; however, the temporal sequence and evolutionary relationship of somatic mutations remain largely unknown. In this study, we built a population-based statistical framework to infer the temporal sequence of acquisition of somatic mutations. Using the model, we analyzed the mutation profiles of 1954 tumor specimens across eight tumor types. As a result, we identified tumor type-specific directed networks composed of 2-15 cancer-related genes (nodes) and their mutational orders (edges). The most common ancestors identified in pairwise comparison of somatic mutations were TP53 mutations in breast, head/neck, and lung cancers. The known relationship of KRAS to TP53 mutations in colorectal cancers was identified, as well as potential ancestors of TP53 mutation such as NOTCH1, EGFR, and PTEN mutations in head/neck, lung and endometrial cancers, respectively. We also identified apoptosis-related genes enriched with ancestor mutations in lung cancers and a relationship between APC hotspot mutations and TP53 mutations in colorectal cancers. While evolutionary analysis of cancers has focused on clonal versus subclonal mutations identified in individual genomes, our analysis aims to further discriminate ancestor versus descendant mutations in population-scale mutation profiles that may help select cancer drivers with clinical relevance.

  18. A review of statistical modelling and inference for electrical capacitance tomography

    International Nuclear Information System (INIS)

    Watzenig, D; Fox, C

    2009-01-01

    Bayesian inference applied to electrical capacitance tomography, or other inverse problems, provides a framework for quantified model fitting. Estimation of unknown quantities of interest is based on the posterior distribution over the unknown permittivity and unobserved data, conditioned on measured data. Key components in this framework are a prior model requiring a parametrization of the permittivity and a normalizable prior density, the likelihood function that follows from a decomposition of measurements into deterministic and random parts, and numerical simulation of noise-free measurements. Uncertainty in recovered permittivities arises from measurement noise, measurement sensitivities, model inaccuracy, discretization error and a priori uncertainty; each of these sources may be accounted for and in some cases taken advantage of. Estimates or properties of the permittivity can be calculated as summary statistics over the posterior distribution using Markov chain Monte Carlo sampling. Several modified Metropolis–Hastings algorithms are available to speed up this computationally expensive step. The bias in estimates that is induced by the representation of unknowns may be avoided by design of a prior density. The differing purpose of applications means that there is no single 'Bayesian' analysis. Further, differing solutions will use different modelling choices, perhaps influenced by the need for computational efficiency. We solve a reference problem of recovering the unknown shape of a constant permittivity inclusion in an otherwise uniform background. Statistics calculated in the reference problem give accurate estimates of inclusion area, and other properties, when using measured data. The alternatives available for structuring inferential solutions in other applications are clarified by contrasting them against the choice we made in our reference solution. (topical review)
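
    The summary-statistics-over-samples step mentioned above can be sketched with a plain random-walk Metropolis-Hastings sampler. The code below targets a toy two-parameter log-posterior standing in for the permittivity parameters; it does not implement the ECT forward model or the modified Metropolis-Hastings variants discussed in the review, and all names are illustrative.

      import numpy as np

      def log_post(theta):
          """Toy unnormalized log-posterior over two parameters (not the ECT posterior)."""
          return -0.5 * ((theta[0] - 1.0) ** 2 / 0.2 ** 2 + (theta[1] + 0.5) ** 2 / 0.5 ** 2)

      def metropolis_hastings(log_post, theta0, n_samples=20000, step=0.15, seed=0):
          rng = np.random.default_rng(seed)
          theta = np.asarray(theta0, dtype=float)
          lp = log_post(theta)
          chain = np.empty((n_samples, theta.size))
          accepted = 0
          for i in range(n_samples):
              prop = theta + step * rng.standard_normal(theta.size)   # symmetric random-walk proposal
              lp_prop = log_post(prop)
              if np.log(rng.uniform()) < lp_prop - lp:                # accept/reject step
                  theta, lp = prop, lp_prop
                  accepted += 1
              chain[i] = theta
          return chain, accepted / n_samples

      chain, acc_rate = metropolis_hastings(log_post, theta0=[0.0, 0.0])
      burn = chain[5000:]                                             # discard burn-in
      print("acceptance rate:", round(acc_rate, 2))
      print("posterior means:", np.round(burn.mean(axis=0), 3))       # summary statistics over samples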

  19. Optimal allocation of testing resources for statistical simulations

    Science.gov (United States)

    Quintana, Carolina; Millwater, Harry R.; Singh, Gulshan; Golden, Patrick

    2015-07-01

    Statistical estimates from simulation involve uncertainty caused by the variability in the input random variables due to limited data. Allocating resources to obtain more experimental data of the input variables to better characterize their probability distributions can reduce the variance of statistical estimates. The methodology proposed determines the optimal number of additional experiments required to minimize the variance of the output moments given single or multiple constraints. The method uses multivariate t-distribution and Wishart distribution to generate realizations of the population mean and covariance of the input variables, respectively, given an amount of available data. This method handles independent and correlated random variables. A particle swarm method is used for the optimization. The optimal number of additional experiments per variable depends on the number and variance of the initial data, the influence of the variable in the output function and the cost of each additional experiment. The methodology is demonstrated using a fretting fatigue example.
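
    A hedged sketch of the sampling idea described above: scipy can draw covariance realizations from a Wishart distribution and mean realizations from a multivariate t distribution given a small data set. The hyperparameter choices below are illustrative and are not the paper's exact parameterization.

      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(0)
      data = rng.normal([10.0, 0.3], [2.0, 0.05], size=(15, 2))   # limited initial experiments (toy data)
      n, p = data.shape
      xbar = data.mean(axis=0)
      S = np.cov(data, rowvar=False)

      # Illustrative choice (not the paper's exact parameterization):
      # covariance realizations from a Wishart scaled by the sample covariance,
      # mean realizations from a multivariate t centred at the sample mean.
      cov_draws = stats.wishart(df=n - 1, scale=S / (n - 1)).rvs(size=1000, random_state=1)
      mean_draws = stats.multivariate_t(loc=xbar, shape=S / n, df=n - 1).rvs(size=1000, random_state=2)

      # the spread of these realizations reflects the uncertainty due to limited data
      print("spread of mean realizations:", np.round(mean_draws.std(axis=0), 3))
      print("spread of first variance component:", np.round(cov_draws[:, 0, 0].std(), 3))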

  20. Statistical Optimality in Multipartite Ranking and Ordinal Regression.

    Science.gov (United States)

    Uematsu, Kazuki; Lee, Yoonkyung

    2015-05-01

    Statistical optimality in multipartite ranking is investigated as an extension of bipartite ranking. We consider the optimality of ranking algorithms through minimization of the theoretical risk which combines pairwise ranking errors of ordinal categories with differential ranking costs. The extension shows that for a certain class of convex loss functions including exponential loss, the optimal ranking function can be represented as a ratio of weighted conditional probability of upper categories to lower categories, where the weights are given by the misranking costs. This result also bridges traditional ranking methods such as proportional odds model in statistics with various ranking algorithms in machine learning. Further, the analysis of multipartite ranking with different costs provides a new perspective on non-smooth list-wise ranking measures such as the discounted cumulative gain and preference learning. We illustrate our findings with simulation study and real data analysis.

  1. Feature network models for proximity data : statistical inference, model selection, network representations and links with related models

    NARCIS (Netherlands)

    Frank, Laurence Emmanuelle

    2006-01-01

    Feature Network Models (FNM) are graphical structures that represent proximity data in a discrete space with the use of features. A statistical inference theory is introduced, based on the additivity properties of networks and the linear regression framework. Considering features as predictor

  2. Final Report, DOE Early Career Award: Predictive modeling of complex physical systems: new tools for statistical inference, uncertainty quantification, and experimental design

    Energy Technology Data Exchange (ETDEWEB)

    Marzouk, Youssef [Massachusetts Inst. of Technology (MIT), Cambridge, MA (United States)

    2016-08-31

    Predictive simulation of complex physical systems increasingly rests on the interplay of experimental observations with computational models. Key inputs, parameters, or structural aspects of models may be incomplete or unknown, and must be developed from indirect and limited observations. At the same time, quantified uncertainties are needed to qualify computational predictions in support of design and decision-making. In this context, Bayesian statistics provides a foundation for inference from noisy and limited data, but at prohibitive computational expense. This project intends to make rigorous predictive modeling *feasible* in complex physical systems, via accelerated and scalable tools for uncertainty quantification, Bayesian inference, and experimental design. Specific objectives are as follows: 1. Develop adaptive posterior approximations and dimensionality reduction approaches for Bayesian inference in high-dimensional nonlinear systems. 2. Extend accelerated Bayesian methodologies to large-scale sequential data assimilation, fully treating nonlinear models and non-Gaussian state and parameter distributions. 3. Devise efficient surrogate-based methods for Bayesian model selection and the learning of model structure. 4. Develop scalable simulation/optimization approaches to nonlinear Bayesian experimental design, for both parameter inference and model selection. 5. Demonstrate these inferential tools on chemical kinetic models in reacting flow, constructing and refining thermochemical and electrochemical models from limited data. Demonstrate Bayesian filtering on canonical stochastic PDEs and in the dynamic estimation of inhomogeneous subsurface properties and flow fields.

  3. An application of an optimal statistic for characterizing relative orientations

    Science.gov (United States)

    Jow, Dylan L.; Hill, Ryley; Scott, Douglas; Soler, J. D.; Martin, P. G.; Devlin, M. J.; Fissel, L. M.; Poidevin, F.

    2018-02-01

    We present the projected Rayleigh statistic (PRS), a modification of the classic Rayleigh statistic, as a test for non-uniform relative orientation between two pseudo-vector fields. In the application here, this gives an effective way of investigating whether polarization pseudo-vectors (spin-2 quantities) are preferentially parallel or perpendicular to filaments in the interstellar medium. There are also other potential applications in astrophysics, e.g. when comparing small-scale orientations with larger-scale shear patterns. We compare the efficiency of the PRS against histogram binning methods that have previously been used for characterizing the relative orientations of gas column density structures with the magnetic field projected on the plane of the sky. We examine data for the Vela C molecular cloud, where the column density is inferred from Herschel submillimetre observations, and the magnetic field from observations by the Balloon-borne Large-Aperture Submillimetre Telescope in the 250-, 350- and 500-μm wavelength bands. We find that the PRS has greater statistical power than approaches that bin the relative orientation angles, as it makes more efficient use of the information contained in the data. In particular, the use of the PRS to test for preferential alignment results in a higher statistical significance, in each of the four Vela C regions, with the greatest increase being by a factor of 1.3 in the South-Nest region in the 250-μm band.

  4. Fast optimization of statistical potentials for structurally constrained phylogenetic models

    Directory of Open Access Journals (Sweden)

    Rodrigue Nicolas

    2009-09-01

    Full Text Available Abstract Background Statistical approaches for protein design are relevant in the field of molecular evolutionary studies. In recent years, new, so-called structurally constrained (SC) models of protein-coding sequence evolution have been proposed, which use statistical potentials to assess sequence-structure compatibility. In a previous work, we defined a statistical framework for optimizing knowledge-based potentials especially suited to SC models. Our method used the maximum likelihood principle and provided what we call the joint potentials. However, the method required numerical estimations by the use of computationally heavy Markov Chain Monte Carlo sampling algorithms. Results Here, we develop an alternative optimization procedure, based on a leave-one-out argument coupled to fast gradient descent algorithms. We assess that the leave-one-out potential yields very similar results to the joint approach developed previously, both in terms of the resulting potential parameters, and by Bayes factor evaluation in a phylogenetic context. On the other hand, the leave-one-out approach results in a considerable computational benefit (up to a 1,000-fold decrease in computational time) for the optimization procedure. Conclusion Due to its computational speed, the optimization method we propose offers an attractive alternative for the design and empirical evaluation of alternative forms of potentials, using large data sets and high-dimensional parameterizations.

  5. Quantifying secondary pest outbreaks in cotton and their monetary cost with causal-inference statistics.

    Science.gov (United States)

    Gross, Kevin; Rosenheim, Jay A

    2011-10-01

    Secondary pest outbreaks occur when the use of a pesticide to reduce densities of an unwanted target pest species triggers subsequent outbreaks of other pest species. Although secondary pest outbreaks are thought to be familiar in agriculture, their rigorous documentation is made difficult by the challenges of performing randomized experiments at suitable scales. Here, we quantify the frequency and monetary cost of secondary pest outbreaks elicited by early-season applications of broad-spectrum insecticides to control the plant bug Lygus spp. (primarily L. hesperus) in cotton grown in the San Joaquin Valley, California, USA. We do so by analyzing pest-control management practices for 969 cotton fields spanning nine years and 11 private ranches. Our analysis uses statistical methods to draw formal causal inferences from nonexperimental data that have become popular in public health and economics, but that are not yet widely known in ecology or agriculture. We find that, in fields that received an early-season broad-spectrum insecticide treatment for Lygus, 20.2% +/- 4.4% (mean +/- SE) of late-season pesticide costs were attributable to secondary pest outbreaks elicited by the early-season insecticide application for Lygus. In 2010 U.S. dollars, this equates to an additional $6.00 +/- $1.30 (mean +/- SE) per acre in management costs. To the extent that secondary pest outbreaks may be driven by eliminating pests' natural enemies, these figures place a lower bound on the monetary value of ecosystem services provided by native communities of arthropod predators and parasitoids in this agricultural system.

  6. OASIS is Automated Statistical Inference for Segmentation, with applications to multiple sclerosis lesion segmentation in MRI.

    Science.gov (United States)

    Sweeney, Elizabeth M; Shinohara, Russell T; Shiee, Navid; Mateen, Farrah J; Chudgar, Avni A; Cuzzocreo, Jennifer L; Calabresi, Peter A; Pham, Dzung L; Reich, Daniel S; Crainiceanu, Ciprian M

    2013-01-01

    Magnetic resonance imaging (MRI) can be used to detect lesions in the brains of multiple sclerosis (MS) patients and is essential for diagnosing the disease and monitoring its progression. In practice, lesion load is often quantified by either manual or semi-automated segmentation of MRI, which is time-consuming, costly, and associated with large inter- and intra-observer variability. We propose OASIS is Automated Statistical Inference for Segmentation (OASIS), an automated statistical method for segmenting MS lesions in MRI studies. We use logistic regression models incorporating multiple MRI modalities to estimate voxel-level probabilities of lesion presence. Intensity-normalized T1-weighted, T2-weighted, fluid-attenuated inversion recovery and proton density volumes from 131 MRI studies (98 MS subjects, 33 healthy subjects) with manual lesion segmentations were used to train and validate our model. Within this set, OASIS detected lesions with a partial area under the receiver operating characteristic curve for clinically relevant false positive rates of 1% and below of 0.59% (95% CI; [0.50%, 0.67%]) at the voxel level. An experienced MS neuroradiologist compared these segmentations to those produced by LesionTOADS, an image segmentation software that provides segmentation of both lesions and normal brain structures. For lesions, OASIS out-performed LesionTOADS in 74% (95% CI: [65%, 82%]) of cases for the 98 MS subjects. To further validate the method, we applied OASIS to 169 MRI studies acquired at a separate center. The neuroradiologist again compared the OASIS segmentations to those from LesionTOADS. For lesions, OASIS ranked higher than LesionTOADS in 77% (95% CI: [71%, 83%]) of cases. For a randomly selected subset of 50 of these studies, one additional radiologist and one neurologist also scored the images. Within this set, the neuroradiologist ranked OASIS higher than LesionTOADS in 76% (95% CI: [64%, 88%]) of cases, the neurologist 66% (95% CI: [52%, 78

  7. An assessment of machine and statistical learning approaches to inferring networks of protein-protein interactions

    Directory of Open Access Journals (Sweden)

    Browne Fiona

    2006-12-01

    Full Text Available Protein-protein interactions (PPI) play a key role in many biological systems. Over the past few years, an explosion in the availability of functional biological data obtained from high-throughput technologies to infer PPI has been observed. However, results obtained from such experiments show high rates of false positive and false negative predictions as well as systematic predictive bias. Recent research has revealed that several machine and statistical learning methods applied to integrate relatively weak, diverse sources of large-scale functional data may provide improved predictive accuracy and coverage of PPI. In this paper we describe the effects of applying different computational, integrative methods to predict PPI in Saccharomyces cerevisiae. We investigated the predictive ability of combining different sets of relatively strong and weak predictive datasets. We analysed several genomic datasets ranging from mRNA co-expression to marginal essentiality. Moreover, we expanded an existing multi-source dataset from S. cerevisiae by constructing a new set of putative interactions extracted from Gene Ontology (GO)-driven annotations in the Saccharomyces Genome Database. Different classification techniques were evaluated: Simple Naive Bayesian (SNB), Multilayer Perceptron (MLP) and K-Nearest Neighbors (KNN). Relatively simple classification methods (i.e., less computationally intensive and mathematically complex), such as SNB, have proven to be proficient at predicting PPI. SNB produced the “highest” predictive quality, obtaining an area under the Receiver Operating Characteristic (ROC) curve (AUC) value of 0.99. The lowest AUC value of 0.90 was obtained by the KNN classifier. This assessment also demonstrates the strong predictive power of GO-driven models, which offered predictive performance above 0.90 using the different machine learning and statistical techniques. As the predictive power of single-source datasets became weaker, MLP and SNB performed

  8. Inferring species richness and turnover by statistical multiresolution texture analysis of satellite imagery.

    Directory of Open Access Journals (Sweden)

    Matteo Convertino

    richness, or [Formula: see text] diversity, based on the Shannon entropy of pixel intensity. To test our approach, we specifically use the green band of Landsat images for a water conservation area in the Florida Everglades. We validate our predictions against data on species occurrences over a twenty-eight-year period for both wet and dry seasons. Our method correctly predicts 73% of species richness. For species turnover, the prediction based on the newly proposed KL divergence is nearly 100% accurate. This represents a significant improvement over the more conventional Shannon entropy difference, which provides 85% accuracy. Furthermore, we find that changes in soil and water patterns, as measured by fluctuations of the Shannon entropy for the red and blue bands respectively, are positively correlated with changes in vegetation. The fluctuations are smaller in the wet season when compared to the dry season. CONCLUSIONS/SIGNIFICANCE: Texture-based statistical multiresolution image analysis is a promising method for quantifying interseasonal differences and, consequently, the degree to which vegetation, soil, and water patterns vary. The proposed automated method for quantifying species richness and turnover can also provide analysis at higher spatial and temporal resolution than is currently obtainable from expensive monitoring campaigns, thus enabling more prompt, more cost-effective inference and decision-making support regarding anomalous variations in biodiversity. Additionally, a matrix-based visualization of the statistical multiresolution analysis is presented to facilitate both insight and quick recognition of anomalous data.
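
    The two texture summaries the record relies on, the Shannon entropy of a band's pixel-intensity histogram (a richness proxy) and the KL divergence between two seasons' histograms (a turnover proxy), can be sketched as follows; the random arrays merely stand in for Landsat green-band images and are not Everglades data.

      import numpy as np

      def intensity_histogram(band, bins=64):
          """Normalized histogram of pixel intensities for one image band."""
          hist, _ = np.histogram(band, bins=bins, range=(0.0, 1.0))
          p = hist.astype(float) + 1e-12            # avoid zero bins in the logarithms
          return p / p.sum()

      def shannon_entropy(p):
          return -np.sum(p * np.log2(p))

      def kl_divergence(p, q):
          return np.sum(p * np.log2(p / q))

      rng = np.random.default_rng(0)
      green_wet = rng.beta(2.0, 5.0, size=(512, 512))   # stand-in for the wet-season green band
      green_dry = rng.beta(2.5, 4.0, size=(512, 512))   # stand-in for the dry-season green band

      p_wet, p_dry = intensity_histogram(green_wet), intensity_histogram(green_dry)
      print("entropy (wet):", round(shannon_entropy(p_wet), 3))      # richness proxy
      print("entropy (dry):", round(shannon_entropy(p_dry), 3))
      print("KL(wet || dry):", round(kl_divergence(p_wet, p_dry), 4))  # turnover proxy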

  9. Statistical physics of hard combinatorial optimization: Vertex cover problem

    Science.gov (United States)

    Zhao, Jin-Hua; Zhou, Hai-Jun

    2014-07-01

    Typical-case computation complexity is a research topic at the boundary of computer science, applied mathematics, and statistical physics. In the last twenty years, the replica-symmetry-breaking mean field theory of spin glasses and the associated message-passing algorithms have greatly deepened our understanding of typical-case computation complexity. In this paper, we use the vertex cover problem, a basic nondeterministic-polynomial (NP)-complete combinatorial optimization problem of wide application, as an example to introduce the statistical physics methods and algorithms. We do not go into the technical details but emphasize mainly the intuitive physical meanings of the message-passing equations. An unfamiliar reader should be able to understand, to a large extent, the physics behind the mean-field approaches and to adapt the mean-field methods to solving other optimization problems.

  10. Statistical Sensitive Data Protection and Inference Prevention with Decision Tree Methods

    National Research Council Canada - National Science Library

    Chang, LiWu

    2003-01-01

    .... We consider inference as correct classification and approach it with decision tree methods. As in our previous work, sensitive data are viewed as classes of those test data and non-sensitive data are the rest attribute values...

  11. Heuristic versus statistical physics approach to optimization problems

    International Nuclear Information System (INIS)

    Jedrzejek, C.; Cieplinski, L.

    1995-01-01

    Optimization is a crucial ingredient of many calculation schemes in science and engineering. In this paper we assess several classes of methods: heuristic algorithms; methods directly relying on statistical physics, such as the mean-field method and simulated annealing; and Hopfield-type neural networks and genetic algorithms, which are partly related to statistical physics. We perform the analysis for three types of problems: (1) the Travelling Salesman Problem, (2) vector quantization, and (3) the traffic control problem in a multistage interconnection network. In general, heuristic algorithms perform better (except for genetic algorithms) and run much faster, but they have to be tailored to each problem. The key to improving performance could be to include heuristic features in general-purpose statistical physics methods. (author)

  12. Beyond P Values and Hypothesis Testing: Using the Minimum Bayes Factor to Teach Statistical Inference in Undergraduate Introductory Statistics Courses

    Science.gov (United States)

    Page, Robert; Satake, Eiki

    2017-01-01

    While interest in Bayesian statistics has been growing in statistics education, the treatment of the topic is still inadequate in both textbooks and the classroom. Because so many fields of study lead to careers that involve a decision-making process requiring an understanding of Bayesian methods, it is becoming increasingly clear that Bayesian…
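
    One commonly taught version of the minimum Bayes factor is Goodman's bound exp(-z^2/2) for a two-sided test of a normal mean; whether this is the exact form used in the article above is not stated in the record, so the sketch below is offered only as an illustration of converting a p-value into such a bound.

      from math import exp
      from scipy.stats import norm

      def minimum_bayes_factor(p_value):
          """Goodman's bound: MBF = exp(-z^2 / 2), with z the score for a two-sided p-value.
          Smaller values indicate stronger maximal evidence against the null hypothesis."""
          z = norm.isf(p_value / 2.0)          # |z| corresponding to a two-sided test
          return exp(-0.5 * z * z)

      for p in (0.05, 0.01, 0.001):
          print(f"p = {p:<6} -> minimum Bayes factor = {minimum_bayes_factor(p):.4f}")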

  13. Optimization of Indoor Thermal Comfort Parameters with the Adaptive Network-Based Fuzzy Inference System and Particle Swarm Optimization Algorithm

    Directory of Open Access Journals (Sweden)

    Jing Li

    2017-01-01

    Full Text Available The goal of this study is to improve thermal comfort and indoor air quality with the adaptive network-based fuzzy inference system (ANFIS) model and improved particle swarm optimization (PSO) algorithm. A method to optimize air conditioning parameters and installation distance is proposed. The methodology is demonstrated through a prototype case, which corresponds to a typical laboratory in colleges and universities. A laboratory model is established, and simulated flow field information is obtained with the CFD software. Subsequently, the ANFIS model is employed instead of the CFD model to predict indoor flow parameters, and the CFD database is utilized to train ANN input-output “metamodels” for the subsequent optimization. With the improved PSO algorithm and the stratified sequence method, the objective functions are optimized. The functions comprise PMV, PPD, and mean age of air. The optimal installation distance is determined with the hemisphere model. Results show that most of the staff obtain a satisfactory degree of thermal comfort and that the proposed method can significantly reduce the cost of building an experimental device. The proposed methodology can be used to determine appropriate air supply parameters and air conditioner installation position for a pleasant and healthy indoor environment.

  14. Using Alien Coins to Test Whether Simple Inference Is Bayesian

    Science.gov (United States)

    Cassey, Peter; Hawkins, Guy E.; Donkin, Chris; Brown, Scott D.

    2016-01-01

    Reasoning and inference are well-studied aspects of basic cognition that have been explained as statistically optimal Bayesian inference. Using a simplified experimental design, we conducted quantitative comparisons between Bayesian inference and human inference at the level of individuals. In 3 experiments, with more than 13,000 participants, we…
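
    The Bayesian benchmark for simple coin-bias inference is the conjugate Beta-Binomial update. The sketch below works through a toy version of that computation; the prior and the observed flips are illustrative assumptions, not the stimuli used in the experiments described above.

      import numpy as np
      from scipy import stats

      # prior belief about the coin's probability of heads (uniform here)
      a, b = 1.0, 1.0

      # observe 7 heads in 10 flips (illustrative data)
      heads, flips = 7, 10
      post = stats.beta(a + heads, b + flips - heads)   # conjugate Beta posterior

      print("posterior mean P(heads):", round(post.mean(), 3))
      print("95% credible interval:", np.round(post.interval(0.95), 3))
      # posterior probability that the coin is biased towards heads
      print("P(theta > 0.5 | data):", round(1 - post.cdf(0.5), 3))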

  15. A statistical approach to optimizing concrete mixture design.

    Science.gov (United States)

    Ahmad, Shamsad; Alghamdi, Saeid A

    2014-01-01

    A step-by-step statistical approach is proposed to obtain optimum proportioning of concrete mixtures using the data obtained through a statistically planned experimental program. The utility of the proposed approach for optimizing the design of concrete mixture is illustrated considering a typical case in which trial mixtures were considered according to a full factorial experiment design involving three factors and their three levels (3^3). A total of 27 concrete mixtures with three replicates (81 specimens) were considered by varying the levels of key factors affecting compressive strength of concrete, namely, water/cementitious materials ratio (0.38, 0.43, and 0.48), cementitious materials content (350, 375, and 400 kg/m^3), and fine/total aggregate ratio (0.35, 0.40, and 0.45). The experimental data were utilized to carry out analysis of variance (ANOVA) and to develop a polynomial regression model for compressive strength in terms of the three design factors considered in this study. The developed statistical model was used to show how optimization of concrete mixtures can be carried out with different possible options.
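
    A hedged sketch of the workflow described above: enumerate the 3^3 full factorial design from the quoted factor levels, then fit a quadratic polynomial regression for compressive strength by least squares and evaluate it over the grid. The strength values below are simulated placeholders, not the study's measurements, and the simple coefficient model is an assumption for illustration.

      import itertools
      import numpy as np

      # factor levels quoted in the abstract
      w_cm  = [0.38, 0.43, 0.48]        # water / cementitious materials ratio
      cm    = [350, 375, 400]           # cementitious materials content (kg/m^3)
      fa_ta = [0.35, 0.40, 0.45]        # fine / total aggregate ratio

      design = np.array(list(itertools.product(w_cm, cm, fa_ta)))   # 27 mixtures (3^3)

      # placeholder responses standing in for measured compressive strengths (MPa)
      rng = np.random.default_rng(0)
      x1, x2, x3 = design.T
      strength = 90 - 80 * x1 + 0.02 * x2 + 10 * x3 + rng.normal(0, 1.5, len(design))

      # quadratic polynomial regression model in the three factors
      X = np.column_stack([np.ones(len(design)), x1, x2, x3,
                           x1 ** 2, x2 ** 2, x3 ** 2, x1 * x2, x1 * x3, x2 * x3])
      coef, *_ = np.linalg.lstsq(X, strength, rcond=None)

      # evaluate the fitted model over the design grid and pick the best mixture
      pred = X @ coef
      best = design[np.argmax(pred)]
      print("predicted optimum mixture (w/cm, cm content, fa/ta):", best)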

  16. A Statistical Approach to Optimizing Concrete Mixture Design

    Directory of Open Access Journals (Sweden)

    Shamsad Ahmad

    2014-01-01

    Full Text Available A step-by-step statistical approach is proposed to obtain optimum proportioning of concrete mixtures using the data obtained through a statistically planned experimental program. The utility of the proposed approach for optimizing the design of concrete mixture is illustrated considering a typical case in which trial mixtures were considered according to a full factorial experiment design involving three factors and their three levels (3^3). A total of 27 concrete mixtures with three replicates (81 specimens) were considered by varying the levels of key factors affecting compressive strength of concrete, namely, water/cementitious materials ratio (0.38, 0.43, and 0.48), cementitious materials content (350, 375, and 400 kg/m^3), and fine/total aggregate ratio (0.35, 0.40, and 0.45). The experimental data were utilized to carry out analysis of variance (ANOVA) and to develop a polynomial regression model for compressive strength in terms of the three design factors considered in this study. The developed statistical model was used to show how optimization of concrete mixtures can be carried out with different possible options.

  17. Finite-sample instrumental variables Inference using an Asymptotically Pivotal Statistic

    NARCIS (Netherlands)

    Bekker, P.; Kleibergen, F.R.

    2001-01-01

    The paper considers the K-statistic, Kleibergen’s (2000) adaptation of the Anderson-Rubin (AR) statistic in instrumental variables regression. Compared to the AR-statistic this K-statistic shows improved asymptotic efficiency in terms of degrees of freedom in overidentified models and yet it shares, ...

  18. Finite-sample instrumental variables inference using an asymptotically pivotal statistic

    NARCIS (Netherlands)

    Bekker, Paul A.; Kleibergen, Frank

    2001-01-01

    The paper considers the K-statistic, Kleibergen’s (2000) adaptation of the Anderson-Rubin (AR) statistic in instrumental variables regression. Compared to the AR-statistic this K-statistic shows improved asymptotic efficiency in terms of degrees of freedom in overidentified models and yet it shares, ...

  19. Frequentist and Bayesian inference for Gaussian-log-Gaussian wavelet trees and statistical signal processing applications

    DEFF Research Database (Denmark)

    Jacobsen, Christian Robert Dahl; Møller, Jesper

    2017-01-01

    We introduce new estimation methods for a subclass of the Gaussian scale mixture models for wavelet trees by Wainwright, Simoncelli and Willsky that rely on modern results for composite likelihoods and approximate Bayesian inference. Our methodology is illustrated for denoising and edge detection...

  20. A neuro-fuzzy inference system tuned by particle swarm optimization algorithm for sensor monitoring

    Energy Technology Data Exchange (ETDEWEB)

    Oliveira, Mauro Vitor de [Instituto de Engenharia Nuclear (IEN), Rio de Janeiro, RJ (Brazil). Div. de Instrumentacao e Confiabilidade Humana]. E-mail: mvitor@ien.gov.br; Schirru, Roberto [Universidade Federal, Rio de Janeiro, RJ (Brazil). Coordenacao dos Programas de Pos-graduacao de Engenharia. Lab. de Monitoracao de Processos

    2005-07-01

    A neuro-fuzzy inference system (ANFIS) tuned by a particle swarm optimization (PSO) algorithm has been developed to monitor a relevant sensor in a nuclear plant using the information from other sensors. The antecedent parameters of the ANFIS that estimates the relevant sensor signal are optimized by a PSO algorithm, and the consequent parameters are obtained with a least-squares algorithm. The proposed sensor-monitoring algorithm was demonstrated through the estimation of the nuclear power value in a pressurized water reactor, using six other correlated signals as inputs to the ANFIS. The results are compared to those of two similar ANFIS models whose antecedent parameters were trained with, respectively, a gradient descent (GD) algorithm and a genetic algorithm (GA). (author)

  1. A neuro-fuzzy inference system tuned by particle swarm optimization algorithm for sensor monitoring

    International Nuclear Information System (INIS)

    Oliveira, Mauro Vitor de; Schirru, Roberto

    2005-01-01

    A neuro-fuzzy inference system (ANFIS) tuned by a particle swarm optimization (PSO) algorithm has been developed to monitor a relevant sensor in a nuclear plant using the information from other sensors. The antecedent parameters of the ANFIS that estimates the relevant sensor signal are optimized by a PSO algorithm, and the consequent parameters are obtained with a least-squares algorithm. The proposed sensor-monitoring algorithm was demonstrated through the estimation of the nuclear power value in a pressurized water reactor, using six other correlated signals as inputs to the ANFIS. The results are compared to those of two similar ANFIS models whose antecedent parameters were trained with, respectively, a gradient descent (GD) algorithm and a genetic algorithm (GA). (author)

  2. Rigorous force field optimization principles based on statistical distance minimization

    Energy Technology Data Exchange (ETDEWEB)

    Vlcek, Lukas, E-mail: vlcekl1@ornl.gov [Chemical Sciences Division, Geochemistry & Interfacial Sciences Group, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831-6110 (United States); Joint Institute for Computational Sciences, University of Tennessee, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831-6173 (United States); Chialvo, Ariel A. [Chemical Sciences Division, Geochemistry & Interfacial Sciences Group, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831-6110 (United States)

    2015-10-14

    We use the concept of statistical distance to define a measure of distinguishability between a pair of statistical mechanical systems, i.e., a model and its target, and show that its minimization leads to general convergence of the model’s static measurable properties to those of the target. We exploit this feature to define a rigorous basis for the development of accurate and robust effective molecular force fields that are inherently compatible with coarse-grained experimental data. The new model optimization principles and their efficient implementation are illustrated through selected examples, whose outcome demonstrates the higher robustness and predictive accuracy of the approach compared to other currently used methods, such as force matching and relative entropy minimization. We also discuss relations between the newly developed principles and established thermodynamic concepts, which include the Gibbs-Bogoliubov inequality and the thermodynamic length.

  3. Variability aware compact model characterization for statistical circuit design optimization

    Science.gov (United States)

    Qiao, Ying; Qian, Kun; Spanos, Costas J.

    2012-03-01

    Variability modeling at the compact transistor model level can enable statistically optimized designs in view of limitations imposed by the fabrication technology. In this work we propose an efficient variability-aware compact model characterization methodology based on the linear propagation of variance. Hierarchical spatial variability patterns of selected compact model parameters are directly calculated from transistor array test structures. This methodology has been implemented and tested using transistor I-V measurements and the EKV-EPFL compact model. Calculation results compare well to full-wafer direct model parameter extractions. Further studies are done on the proper selection of both compact model parameters and electrical measurement metrics used in the method.
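
    The key ingredient named above, linear propagation of variance, amounts to sigma_y^2 = J Sigma_p J^T, where J is the Jacobian of the electrical metric with respect to the compact-model parameters and Sigma_p their covariance. The snippet below is a generic numeric illustration under invented assumptions: a toy square-law drain-current expression stands in for the EKV compact model, and the parameter covariance matrix is made up.

      import numpy as np

      def drain_current(p, vgs=1.0):
          # Hypothetical square-law device model standing in for a compact model.
          vth, k = p
          return 0.5 * k * (vgs - vth) ** 2

      def numerical_jacobian(f, p, eps=1e-6):
          p = np.asarray(p, dtype=float)
          grad = np.zeros_like(p)
          for i in range(len(p)):
              dp = np.zeros_like(p)
              dp[i] = eps
              grad[i] = (f(p + dp) - f(p - dp)) / (2 * eps)
          return grad

      p0 = np.array([0.4, 2e-4])            # nominal threshold voltage [V] and gain factor
      cov_p = np.array([[4e-4, 1e-8],       # assumed parameter covariance from test structures
                        [1e-8, 1e-10]])

      J = numerical_jacobian(drain_current, p0)
      var_id = J @ cov_p @ J                # first-order (linear) propagation of variance
      print("nominal Id:", drain_current(p0), "propagated sigma:", np.sqrt(var_id))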

  4. Finite-sample instrumental variables inference using an asymptotically pivotal statistic

    NARCIS (Netherlands)

    Bekker, P; Kleibergen, F

    2003-01-01

    We consider the K-statistic, Kleibergen's (2002, Econometrica 70, 1781-1803) adaptation of the Anderson-Rubin (AR) statistic in instrumental variables regression. Whereas Kleibergen (2002) especially analyzes the asymptotic behavior of the statistic, we focus on finite-sample properties in a ...

  5. Causal inference as an emerging statistical approach in neurology: an example for epilepsy in the elderly

    Directory of Open Access Journals (Sweden)

    Moura LMVR

    2016-12-01

    Full Text Available Lidia MVR Moura, M Brandon Westover, David Kwasnik, Andrew J Cole, John Hsu (Massachusetts General Hospital, Department of Neurology, Epilepsy Service; Harvard Medical School; Massachusetts General Hospital, Mongan Institute; Harvard Medical School, Department of Medicine; Harvard Medical School, Department of Health Care Policy; all Boston, MA, USA). Abstract: The elderly population faces an increasing number of cases of chronic neurological conditions, such as epilepsy and Alzheimer’s disease. Because the elderly with epilepsy are commonly excluded from randomized controlled clinical trials, there are few rigorous studies to guide clinical practice. When the elderly are eligible for trials, they either rarely participate or frequently have poor adherence to therapy, thus limiting both generalizability and validity. In contrast, large observational data sets are increasingly available, but are susceptible to bias when using common analytic approaches. Recent developments in causal inference-analytic approaches also introduce the possibility of emulating randomized controlled trials to yield valid estimates. We provide a practical example of the application of the principles of causal inference to a large observational data set of patients with epilepsy. This review also provides a framework for comparative-effectiveness research in chronic neurological conditions. Keywords: epilepsy, epidemiology, neurostatistics, causal inference

  6. Statistical Methods for Population Genetic Inference Based on Low-Depth Sequencing Data from Modern and Ancient DNA

    DEFF Research Database (Denmark)

    Korneliussen, Thorfinn Sand

    Due to recent advances in DNA sequencing technology, genomic data are being generated at an unprecedented rate and we are gaining access to entire genomes at the population level. The technology does, however, not give direct access to the genetic variation, and the many levels of preprocessing that are required before inferences can be made from the data introduce multiple levels of uncertainty, especially for low-depth data. Therefore, methods that take the inherent uncertainty into account are needed to make robust inferences in the downstream analysis of such data. This poses a problem for a range of key summary statistics within population genetics where existing methods are based on the assumption that the true genotypes are known. Motivated by this I present: 1) a new method for the estimation of relatedness between pairs of individuals, 2) a new method for estimating...

  7. Statistical and optimal learning with applications in business analytics

    Science.gov (United States)

    Han, Bin

    Statistical learning is widely used in business analytics to discover structure or exploit patterns from historical data, and build models that capture relationships between an outcome of interest and a set of variables. Optimal learning, on the other hand, solves the operational side of the problem, by iterating between decision making and data acquisition/learning. All too often the two problems go hand-in-hand, exhibiting a feedback loop between statistics and optimization. We apply this statistical/optimal learning concept in the context of a fundraising marketing campaign problem arising in many non-profit organizations. Many such organizations use direct-mail marketing to cultivate one-time donors and convert them into recurring contributors. Cultivated donors generate much more revenue than new donors, but also lapse with time, making it important to steadily draw in new cultivations. The direct-mail budget is limited, but better-designed mailings can improve success rates without increasing costs. We first apply statistical learning to analyze the effectiveness of several design approaches used in practice, based on a massive dataset covering 8.6 million direct-mail communications with donors to the American Red Cross during 2009-2011. We find evidence that mailed appeals are more effective when they emphasize disaster preparedness and training efforts over post-disaster cleanup. Including small cards that affirm donors' identity as Red Cross supporters is an effective strategy, while including gift items such as address labels is not. Finally, very recent acquisitions are more likely to respond to appeals that ask them to contribute an amount similar to their most recent donation, but this approach has an adverse effect on donors with a longer history. We show via simulation that a simple design strategy based on these insights has potential to improve success rates from 5.4% to 8.1%. Given these findings, however, when a new scenario arises, new data need to ...

  8. The Stream Flow Prediction Model Using Fuzzy Inference System and Particle Swarm Optimization

    Directory of Open Access Journals (Sweden)

    Mahmoud Mohammad RezapourTabari

    2013-03-01

    Full Text Available The aim of this study is the spatial prediction of runoff using data from hydrometric and meteorological stations. Research shows that there is usually a systematic relationship between the meteorological and hydrometric data of the upstream basin and the runoff rates at the basin outlet. So, if the rules underlying the historical data recorded at the stations can be extracted, the runoff amount can easily be predicted from the measured data. Accordingly, among the available tools, fuzzy theory (with its flexibility in developing fuzzy rules) can turn the knowledge contained in the observed data into real-time parameter prediction. In this research, a fuzzy inference system was therefore used to estimate runoff rates at stations located downstream on the Taleghan river using upstream rain gauge and hydrometric stations. Because of inappropriate values assigned to the membership functions, the fuzzy system model alone cannot provide correct predictions. In this study, a combination of an intelligence-based optimization algorithm and fuzzy theory was developed to accelerate and improve the modelling. With the proposed model, optimum values for each membership function related to the dependent and independent variables were extracted, and on this basis the runoff rates in the downstream rivers were predicted. The results of this study show the high accuracy of the proposed model compared with a plain fuzzy inference system. Based on the proposed model, the rate of runoff can also be estimated more accurately for future conditions.

  9. Pre-service primary school teachers’ knowledge of informal statistical inference

    NARCIS (Netherlands)

    de Vetten, Arjen; Schoonenboom, Judith; Keijzer, Ronald; van Oers, Bert

    2018-01-01

    The ability to reason inferentially is increasingly important in today’s society. It is hypothesized here that engaging primary school students in informal statistical reasoning (ISI), defined as making generalizations without the use of formal statistical tests, will help them acquire the

  10. ddClone: joint statistical inference of clonal populations from single cell and bulk tumour sequencing data.

    Science.gov (United States)

    Salehi, Sohrab; Steif, Adi; Roth, Andrew; Aparicio, Samuel; Bouchard-Côté, Alexandre; Shah, Sohrab P

    2017-03-01

    Next-generation sequencing (NGS) of bulk tumour tissue can identify constituent cell populations in cancers and measure their abundance. This requires computational deconvolution of allelic counts from somatic mutations, which may be incapable of fully resolving the underlying population structure. Single cell sequencing (SCS) is a more direct method, although its replacement of NGS is impeded by technical noise and sampling limitations. We propose ddClone, which analytically integrates NGS and SCS data, leveraging their complementary attributes through joint statistical inference. We show on real and simulated datasets that ddClone produces more accurate results than can be achieved by either method alone.

  11. Optimizing DNA assembly based on statistical language modelling.

    Science.gov (United States)

    Fang, Gang; Zhang, Shemin; Dong, Yafei

    2017-12-15

    By successively assembling genetic parts such as BioBrick according to grammatical models, complex genetic constructs composed of dozens of functional blocks can be built. However, every category of genetic parts usually includes a few to many candidate parts. With an increasing quantity of genetic parts, the process of assembling more than a few sets of these parts can be expensive, time consuming and error prone. At the last step of assembly it is somewhat difficult to decide which part should be selected. Based on a statistical language model, which is a probability distribution P(S) over strings S that attempts to reflect how frequently a string S occurs as a sentence, the most commonly used parts are selected. Then, a dynamic programming algorithm was designed to find the solution of maximum probability. The algorithm optimizes the results of a genetic design based on a grammatical model and finds an optimal solution. In this way, redundant operations can be reduced and the time and cost required for conducting biological experiments can be minimized. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
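
    To make the selection step concrete, here is a hedged sketch of a Viterbi-style dynamic program that picks one part per category so as to maximize the joint log-probability under a bigram-with-backoff language model. The categories, part names and probabilities are invented for illustration and are not taken from the paper.

      import math

      # Hypothetical categories in assembly order, each with candidate parts and
      # unigram usage frequencies estimated from a corpus of published constructs.
      categories = [
          ("promoter",   {"J23100": 0.6, "J23106": 0.4}),
          ("rbs",        {"B0034": 0.7, "B0032": 0.3}),
          ("cds",        {"GFP": 0.5, "RFP": 0.5}),
          ("terminator", {"B0015": 0.8, "B1002": 0.2}),
      ]

      # Hypothetical bigram scores P(next part | previous part); unseen pairs
      # back off to the unigram frequency of the next part.
      bigram = {("J23100", "B0034"): 0.8, ("B0034", "GFP"): 0.9, ("GFP", "B0015"): 0.85}

      def score(prev, part, unigram_p):
          return math.log(bigram.get((prev, part), unigram_p))

      def best_assembly(categories):
          # best[part] = (best log-probability of a chain ending in part, that chain).
          best = {p: (math.log(f), [p]) for p, f in categories[0][1].items()}
          for _, parts in categories[1:]:
              new_best = {}
              for part, freq in parts.items():
                  new_best[part] = max((lp + score(prev, part, freq), path + [part])
                                       for prev, (lp, path) in best.items())
              best = new_best
          return max(best.values())

      logp, chain = best_assembly(categories)
      print("selected parts:", chain, "log-probability:", round(logp, 3))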

  12. Statistical process control using optimized neural networks: a case study.

    Science.gov (United States)

    Addeh, Jalil; Ebrahimzadeh, Ata; Azarbad, Milad; Ranaee, Vahid

    2014-09-01

    The most common statistical process control (SPC) tools employed for monitoring process changes are control charts. A control chart demonstrates that the process has altered by generating an out-of-control signal. This study investigates the design of an accurate system for control chart pattern (CCP) recognition in two aspects. First, an efficient system is introduced that includes two main modules: a feature extraction module and a classifier module. In the feature extraction module, a proper set of shape features and statistical features is proposed as the efficient characteristics of the patterns. In the classifier module, several neural networks, such as the multilayer perceptron, probabilistic neural network and radial basis function network, are investigated. Based on an experimental study, the best classifier is chosen in order to recognize the CCPs. Second, a hybrid heuristic recognition system is introduced based on the cuckoo optimization algorithm (COA) to improve the generalization performance of the classifier. The simulation results show that the proposed algorithm has high recognition accuracy. Copyright © 2013 ISA. Published by Elsevier Ltd. All rights reserved.
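
    A minimal sketch of the two modules described above (feature extraction plus a neural-network classifier), under simplifying assumptions: the control chart pattern generators and the small feature set are invented, scikit-learn's MLPClassifier stands in for the networks compared in the study, and the cuckoo-optimization tuning step is omitted.

      import numpy as np
      from sklearn.neural_network import MLPClassifier
      from sklearn.model_selection import train_test_split

      rng = np.random.default_rng(0)
      N = 60  # observations per control chart window

      def make_chart(kind):
          noise = rng.normal(0, 1, N)
          t = np.arange(N)
          if kind == "normal":
              return noise
          if kind == "uptrend":
              return noise + 0.05 * t
          if kind == "upshift":
              return noise + np.where(t > N // 2, 2.0, 0.0)
          return noise + 1.5 * np.sin(2 * np.pi * t / 12)       # "cyclic"

      def features(x):
          t = np.arange(N)
          slope = np.polyfit(t, x, 1)[0]                        # shape feature: trend
          half_shift = x[: N // 2].mean() - x[N // 2:].mean()   # shape feature: shift
          cyclic = np.abs(np.fft.rfft(x)[1:6]).max()            # low-frequency energy
          return [x.mean(), x.std(), slope, half_shift, cyclic]

      kinds = ["normal", "uptrend", "upshift", "cyclic"]
      X = [features(make_chart(k)) for k in kinds for _ in range(300)]
      y = [label for label, _ in enumerate(kinds) for _ in range(300)]

      X_tr, X_te, y_tr, y_te = train_test_split(np.array(X), np.array(y), random_state=0)
      clf = MLPClassifier(hidden_layer_sizes=(20,), max_iter=2000, random_state=0).fit(X_tr, y_tr)
      print("held-out recognition accuracy:", clf.score(X_te, y_te))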

  13. View discovery in OLAP databases through statistical combinatorial optimization

    Energy Technology Data Exchange (ETDEWEB)

    Hengartner, Nick W [Los Alamos National Laboratory; Burke, John [PNNL; Critchlow, Terence [PNNL; Joslyn, Cliff [PNNL; Hogan, Emilie [PNNL

    2009-01-01

    OnLine Analytical Processing (OLAP) is a relational database technology providing users with rapid access to summary, aggregated views of a single large database, and is widely recognized for knowledge representation and discovery in high-dimensional relational databases. OLAP technologies provide intuitive and graphical access to the massively complex set of possible summary views available in large relational (SQL) structured data repositories. The capability of OLAP database software systems to handle data complexity comes at a high price for analysts, presenting them with a combinatorially vast space of views of a relational database. We respond to the need to deploy technologies sufficient to allow users to guide themselves to areas of local structure by casting the space of 'views' of an OLAP database as a combinatorial object of all projections and subsets, and 'view discovery' as a search process over that lattice. We equip the view lattice with statistical information theoretical measures sufficient to support a combinatorial optimization process. We outline 'hop-chaining' as a particular view discovery algorithm over this object, wherein users are guided across a permutation of the dimensions by searching for successive two-dimensional views, pushing seen dimensions into an increasingly large background filter in a 'spiraling' search process. We illustrate this work in the context of data cubes recording summary statistics for radiation portal monitors at US ports.

  14. View Discovery in OLAP Databases through Statistical Combinatorial Optimization

    Energy Technology Data Exchange (ETDEWEB)

    Joslyn, Cliff A.; Burke, Edward J.; Critchlow, Terence J.

    2009-05-01

    The capability of OLAP database software systems to handle data complexity comes at a high price for analysts, presenting them with a combinatorially vast space of views of a relational database. We respond to the need to deploy technologies sufficient to allow users to guide themselves to areas of local structure by casting the space of 'views' of an OLAP database as a combinatorial object of all projections and subsets, and 'view discovery' as a search process over that lattice. We equip the view lattice with statistical information theoretical measures sufficient to support a combinatorial optimization process. We outline 'hop-chaining' as a particular view discovery algorithm over this object, wherein users are guided across a permutation of the dimensions by searching for successive two-dimensional views, pushing seen dimensions into an increasingly large background filter in a 'spiraling' search process. We illustrate this work in the context of data cubes recording summary statistics for radiation portal monitors at US ports.

  15. Powerful Inference With the D-Statistic on Low-Coverage Whole-Genome Data

    DEFF Research Database (Denmark)

    Soraggi, Samuele; Wiuf, Carsten; Albrechtsen, Anders

    2018-01-01

    The detection of ancient gene flow between human populations is an important issue in population genetics. A common tool for detecting ancient admixture events is the D-statistic. The D-statistic is based on the hypothesis of a genetic relationship that involves four populations, whose correctness is assessed by evaluating specific coincidences of alleles between the groups. When working with high-throughput sequencing data, calling genotypes accurately is not always possible; therefore the D-statistic currently samples a single base from the reads of one individual per population. This implies ignoring much of the information in the data, an issue especially striking in the case of ancient genomes. We provide a significant improvement to overcome the problems of the D-statistic by considering all reads from multiple individuals in each population. We also apply type-specific error correction...

  16. Penultimate modeling of spatial extremes: statistical inference for max-infinitely divisible processes

    KAUST Repository

    Huser, Raphaë l; Opitz, Thomas; Thibaud, Emeric

    2018-01-01

    Extreme-value theory for stochastic processes has motivated the statistical use of max-stable models for spatial extremes. However, fitting such asymptotic models to maxima observed over finite blocks is problematic when the asymptotic stability

  17. Powerful Inference with the D-Statistic on Low-Coverage Whole-Genome Data.

    Science.gov (United States)

    Soraggi, Samuele; Wiuf, Carsten; Albrechtsen, Anders

    2018-02-02

    The detection of ancient gene flow between human populations is an important issue in population genetics. A common tool for detecting ancient admixture events is the D-statistic. The D-statistic is based on the hypothesis of a genetic relationship that involves four populations, whose correctness is assessed by evaluating specific coincidences of alleles between the groups. When working with high-throughput sequencing data, calling genotypes accurately is not always possible; therefore, the D-statistic currently samples a single base from the reads of one individual per population. This implies ignoring much of the information in the data, an issue especially striking in the case of ancient genomes. We provide a significant improvement to overcome the problems of the D-statistic by considering all reads from multiple individuals in each population. We also apply type-specific error correction to combat the problems of sequencing errors, and show a way to correct for introgression from an external population that is not part of the supposed genetic relationship, and how this leads to an estimate of the admixture rate. We prove that the D-statistic is approximated by a standard normal distribution. Furthermore, we show that our method outperforms the traditional D-statistic in detecting admixtures. The power gain is most pronounced for low and medium sequencing depth (1-10×), and performances are as good as with perfectly called genotypes at a sequencing depth of 2×. We show the reliability of error correction in scenarios with simulated errors and ancient data, and correct for introgression in known scenarios to estimate the admixture rates. Copyright © 2018 Soraggi et al.
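
    For orientation, the classical single-base form of the D-statistic that this work extends can be computed from per-site derived-allele frequencies as D = sum(ABBA - BABA) / sum(ABBA + BABA), with a block jackknife giving the standard error and Z-score. The sketch below implements that baseline on simulated frequencies; it is not the read-based, multi-individual estimator developed in the paper, and the admixture signal is invented.

      import numpy as np

      def d_statistic(p1, p2, p3, p4, block_size=500):
          """p1..p4: derived-allele frequencies per site for H1, H2, H3 and the outgroup."""
          abba = (1 - p1) * p2 * p3 * (1 - p4)
          baba = p1 * (1 - p2) * p3 * (1 - p4)
          num, den = abba - baba, abba + baba
          D = num.sum() / den.sum()
          # Delete-one-block jackknife over contiguous blocks of sites.
          n_blocks = len(p1) // block_size
          pseudo = []
          for b in range(n_blocks):
              keep = np.ones(len(p1), dtype=bool)
              keep[b * block_size:(b + 1) * block_size] = False
              pseudo.append(num[keep].sum() / den[keep].sum())
          pseudo = np.array(pseudo)
          se = np.sqrt((n_blocks - 1) / n_blocks * ((pseudo - pseudo.mean()) ** 2).sum())
          return D, D / se

      # Toy frequencies: H2 shares some drift with H3, mimicking an admixture signal.
      rng = np.random.default_rng(0)
      n = 20000
      p3 = rng.beta(0.5, 0.5, n)
      p1 = rng.beta(0.5, 0.5, n)
      p2 = np.clip(0.8 * rng.beta(0.5, 0.5, n) + 0.2 * p3, 0, 1)
      p4 = np.zeros(n)                       # outgroup fixed for the ancestral allele
      D, Z = d_statistic(p1, p2, p3, p4)
      print(f"D = {D:.3f}, Z = {Z:.2f}")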

  18. A generalization of voxel-wise procedures for highdimensional statistical inference using ridge regression

    DEFF Research Database (Denmark)

    Sjöstrand, Karl; Cardenas, Valerie A.; Larsen, Rasmus

    2008-01-01

    Whole-brain morphometry denotes a group of methods with the aim of relating clinical and cognitive measurements to regions of the brain. Typically, such methods require the statistical analysis of a data set with many variables (voxels and exogenous variables) paired with few observations (subjects). We use ridge regression to address this issue, allowing for a gradual introduction of correlation information into the model. We make the connections between ridge regression and voxel-wise procedures explicit and discuss relations to other statistical methods. Results are given on an in-vivo data set of deformation...
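
    A hedged sketch of the core idea (ridge-penalized regression when voxels far outnumber subjects): the snippet below computes the closed-form ridge solution in its kernel form, which is cheap when n << p, on simulated deformation data, and shows how the penalty controls how much of the voxel correlation structure is used. The data, dimensions and lambda values are illustrative assumptions, not the paper's in-vivo set.

      import numpy as np

      rng = np.random.default_rng(0)
      n_subjects, n_voxels = 40, 5000

      # Simulated voxel-wise deformation values and a clinical score driven by a small region.
      X = rng.normal(size=(n_subjects, n_voxels))
      true_w = np.zeros(n_voxels)
      true_w[100:120] = 0.5
      y = X @ true_w + rng.normal(scale=1.0, size=n_subjects)

      def ridge_fit(X, y, lam):
          # Closed-form ridge solution via the n x n kernel form (cheap when n << p):
          # w = X^T (X X^T + lam I)^{-1} y
          G = X @ X.T + lam * np.eye(X.shape[0])
          return X.T @ np.linalg.solve(G, y)

      for lam in (1e-2, 1.0, 1e2):
          w = ridge_fit(X, y, lam)
          corr = np.corrcoef(w, true_w)[0, 1]
          print(f"lambda={lam:g}  corr(estimated map, true map)={corr:.3f}")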

  19. Learning Curves and Bootstrap Estimates for Inference with Gaussian Processes: A Statistical Mechanics Study

    DEFF Research Database (Denmark)

    Malzahn, Dorthe; Opper, Manfred

    2003-01-01

    We employ the replica method of statistical physics to study the average case performance of learning systems. The new feature of our theory is that general distributions of data can be treated, which enables applications to real data. For a class of Bayesian prediction models which are based on Gaussian processes, we discuss bootstrap estimates for learning curves...

  20. Reflections on fourteen cryptic issues concerning the nature of statistical inference

    NARCIS (Netherlands)

    Kardaun, O.J.W.F.; Salomé, D.; Schaafsma, W; Steerneman, A.G.M.; Willems, J.C; Cox, D.R.

    The present paper provides the original formulation and a joint response of a group of statistically trained scientists to fourteen cryptic issues for discussion, which were handed out to the public by Professor Dr. D.R. Cox after his Bernoulli Lecture 1997 at Groningen University.

  1. Constrained statistical inference : sample-size tables for ANOVA and regression

    NARCIS (Netherlands)

    Vanbrabant, Leonard; Van De Schoot, Rens; Rosseel, Yves

    2015-01-01

    Researchers in the social and behavioral sciences often have clear expectations about the order/direction of the parameters in their statistical model. For example, a researcher might expect that regression coefficient β1 is larger than β2 and β3. The corresponding hypothesis is H: β1 > {β2, β3} and

  2. Inference of Functionally-Relevant N-acetyltransferase Residues Based on Statistical Correlations.

    Science.gov (United States)

    Neuwald, Andrew F; Altschul, Stephen F

    2016-12-01

    Over evolutionary time, members of a superfamily of homologous proteins sharing a common structural core diverge into subgroups filling various functional niches. At the sequence level, such divergence appears as correlations that arise from residue patterns distinct to each subgroup. Such a superfamily may be viewed as a population of sequences corresponding to a complex, high-dimensional probability distribution. Here we model this distribution as hierarchical interrelated hidden Markov models (hiHMMs), which describe these sequence correlations implicitly. By characterizing such correlations one may hope to obtain information regarding functionally-relevant properties that have thus far evaded detection. To do so, we infer a hiHMM distribution from sequence data using Bayes' theorem and Markov chain Monte Carlo (MCMC) sampling, which is widely recognized as the most effective approach for characterizing a complex, high dimensional distribution. Other routines then map correlated residue patterns to available structures with a view to hypothesis generation. When applied to N-acetyltransferases, this reveals sequence and structural features indicative of functionally important, yet generally unknown biochemical properties. Even for sets of proteins for which nothing is known beyond unannotated sequences and structures, this can lead to helpful insights. We describe, for example, a putative coenzyme-A-induced-fit substrate binding mechanism mediated by arginine residue switching between salt bridge and π-π stacking interactions. A suite of programs implementing this approach is available (psed.igs.umaryland.edu).

  3. Inference of Functionally-Relevant N-acetyltransferase Residues Based on Statistical Correlations.

    Directory of Open Access Journals (Sweden)

    Andrew F Neuwald

    2016-12-01

    Full Text Available Over evolutionary time, members of a superfamily of homologous proteins sharing a common structural core diverge into subgroups filling various functional niches. At the sequence level, such divergence appears as correlations that arise from residue patterns distinct to each subgroup. Such a superfamily may be viewed as a population of sequences corresponding to a complex, high-dimensional probability distribution. Here we model this distribution as hierarchical interrelated hidden Markov models (hiHMMs), which describe these sequence correlations implicitly. By characterizing such correlations one may hope to obtain information regarding functionally-relevant properties that have thus far evaded detection. To do so, we infer a hiHMM distribution from sequence data using Bayes' theorem and Markov chain Monte Carlo (MCMC) sampling, which is widely recognized as the most effective approach for characterizing a complex, high dimensional distribution. Other routines then map correlated residue patterns to available structures with a view to hypothesis generation. When applied to N-acetyltransferases, this reveals sequence and structural features indicative of functionally important, yet generally unknown biochemical properties. Even for sets of proteins for which nothing is known beyond unannotated sequences and structures, this can lead to helpful insights. We describe, for example, a putative coenzyme-A-induced-fit substrate binding mechanism mediated by arginine residue switching between salt bridge and π-π stacking interactions. A suite of programs implementing this approach is available (psed.igs.umaryland.edu).

  4. Statistical inference for classification of RRIM clone series using near IR reflectance properties

    Science.gov (United States)

    Ismail, Faridatul Aima; Madzhi, Nina Korlina; Hashim, Hadzli; Abdullah, Noor Ezan; Khairuzzaman, Noor Aishah; Azmi, Azrie Faris Mohd; Sampian, Ahmad Faiz Mohd; Harun, Muhammad Hafiz

    2015-08-01

    RRIM clones are a rubber breeding series produced by RRIM (Rubber Research Institute of Malaysia) through its "rubber breeding program" to improve latex yield and produce clones attractive to farmers. The objective of this work is to analyse measurements made by an optical sensing device on latex from selected clone series. The device transmits NIR light, and its reflectance is converted into a voltage. The reflectance index values obtained via the voltage were analyzed with statistical techniques in order to find out whether the clones can be discriminated. From the statistical results using error plots and a one-way ANOVA test, there is overwhelming evidence of discrimination between the RRIM 2002, RRIM 2007 and RRIM 3001 clone series with p value = 0.000. RRIM 2008 cannot be discriminated from RRIM 2014; however, both of these groups are distinct from the other clones.

  5. Exploring the Connection Between Sampling Problems in Bayesian Inference and Statistical Mechanics

    Science.gov (United States)

    Pohorille, Andrew

    2006-01-01

    The Bayesian and statistical mechanical communities often share the same objective in their work - estimating and integrating probability distribution functions (pdfs) describing stochastic systems, models or processes. Frequently, these pdfs are complex functions of random variables exhibiting multiple, well separated local minima. Conventional strategies for sampling such pdfs are inefficient, sometimes leading to an apparent non-ergodic behavior. Several recently developed techniques for handling this problem have been successfully applied in statistical mechanics. In the multicanonical and Wang-Landau Monte Carlo (MC) methods, the correct pdfs are recovered from uniform sampling of the parameter space by iteratively establishing proper weighting factors connecting these distributions. Trivial generalizations allow for sampling from any chosen pdf. The closely related transition matrix method relies on estimating transition probabilities between different states. All these methods proved to generate estimates of pdfs with high statistical accuracy. In another MC technique, parallel tempering, several random walks, each corresponding to a different value of a parameter (e.g. "temperature"), are generated and occasionally exchanged using the Metropolis criterion. This method can be considered as a statistically correct version of simulated annealing. An alternative approach is to represent the set of independent variables as a Hamiltonian system. Considerable progress has been made in understanding how to ensure that the system obeys the equipartition theorem or, equivalently, that coupling between the variables is correctly described. Then a host of techniques developed for dynamical systems can be used. Among them, probably the most powerful is the Adaptive Biasing Force method, in which thermodynamic integration and biased sampling are combined to yield very efficient estimates of pdfs. The third class of methods deals with transitions between states described
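
    One of the techniques surveyed above, parallel tempering, is compact enough to sketch directly: several Metropolis walkers run at different temperatures on a bimodal target and occasionally swap states, so the cold chain can cross between well-separated modes. The target density and the temperature ladder below are invented for illustration.

      import numpy as np

      rng = np.random.default_rng(0)

      def log_target(x):
          # Bimodal density with well-separated modes (hard for plain Metropolis).
          return np.logaddexp(-0.5 * ((x + 4) / 0.5) ** 2, -0.5 * ((x - 4) / 0.5) ** 2)

      temps = [1.0, 3.0, 10.0, 30.0]          # temperature ladder
      x = np.zeros(len(temps))                 # one walker per temperature
      cold_samples = []

      for step in range(20000):
          # Metropolis update within each tempered distribution pi(x)**(1/T).
          for i, T in enumerate(temps):
              prop = x[i] + rng.normal(scale=1.0)
              if np.log(rng.random()) < (log_target(prop) - log_target(x[i])) / T:
                  x[i] = prop
          # Occasionally attempt a swap between neighbouring temperatures.
          if step % 10 == 0:
              i = rng.integers(len(temps) - 1)
              delta = (1 / temps[i] - 1 / temps[i + 1]) * (log_target(x[i + 1]) - log_target(x[i]))
              if np.log(rng.random()) < delta:
                  x[i], x[i + 1] = x[i + 1], x[i]
          cold_samples.append(x[0])            # keep only the T = 1 chain

      cold = np.array(cold_samples[5000:])
      print("fraction of cold-chain samples in each mode:",
            np.mean(cold < 0), np.mean(cold > 0))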

  6. Supervised variational model with statistical inference and its application in medical image segmentation.

    Science.gov (United States)

    Li, Changyang; Wang, Xiuying; Eberl, Stefan; Fulham, Michael; Yin, Yong; Dagan Feng, David

    2015-01-01

    Automated and general medical image segmentation can be challenging because the foreground and the background may have complicated and overlapping density distributions in medical imaging. Conventional region-based level set algorithms often assume piecewise constant or piecewise smooth for segments, which are implausible for general medical image segmentation. Furthermore, low contrast and noise make identification of the boundaries between foreground and background difficult for edge-based level set algorithms. Thus, to address these problems, we suggest a supervised variational level set segmentation model to harness the statistical region energy functional with a weighted probability approximation. Our approach models the region density distributions by using the mixture-of-mixtures Gaussian model to better approximate real intensity distributions and distinguish statistical intensity differences between foreground and background. The region-based statistical model in our algorithm can intuitively provide better performance on noisy images. We constructed a weighted probability map on graphs to incorporate spatial indications from user input with a contextual constraint based on the minimization of contextual graphs energy functional. We measured the performance of our approach on ten noisy synthetic images and 58 medical datasets with heterogeneous intensities and ill-defined boundaries and compared our technique to the Chan-Vese region-based level set model, the geodesic active contour model with distance regularization, and the random walker model. Our method consistently achieved the highest Dice similarity coefficient when compared to the other methods.

  7. The relation between statistical power and inference in fMRI.

    Directory of Open Access Journals (Sweden)

    Henk R Cremers

    Full Text Available Statistically underpowered studies can result in experimental failure even when all other experimental considerations have been addressed impeccably. In fMRI the combination of a large number of dependent variables, a relatively small number of observations (subjects), and a need to correct for multiple comparisons can decrease statistical power dramatically. This problem has been clearly addressed yet remains controversial, especially with regard to the expected effect sizes in fMRI, and especially for between-subjects effects such as group comparisons and brain-behavior correlations. We aimed to clarify the power problem by considering and contrasting two simulated scenarios of such possible brain-behavior correlations: weak diffuse effects and strong localized effects. Sampling from these scenarios shows that, particularly in the weak diffuse scenario, common sample sizes (n = 20-30) display extremely low statistical power, poorly represent the actual effects in the full sample, and show large variation on subsequent replications. Empirical data from the Human Connectome Project resemble the weak diffuse scenario much more than the localized strong scenario, which underscores the extent of the power problem for many studies. Possible solutions to the power problem include increasing the sample size, using less stringent thresholds, or focusing on a region-of-interest. However, these approaches are not always feasible and some have major drawbacks. The most prominent solutions that may help address the power problem include model-based (multivariate) prediction methods and meta-analyses with related synthesis-oriented approaches.
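
    A back-of-the-envelope version of this kind of simulation: estimate the power to detect one true brain-behavior correlation of r = 0.2 among 100 tested regions with Bonferroni correction, at several sample sizes. All numbers are illustrative assumptions rather than the paper's scenarios.

      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(0)

      def power_sim(n_subjects, true_r=0.2, n_tests=100, alpha=0.05, n_sims=2000):
          """Power to detect one true correlation among n_tests regions (Bonferroni)."""
          alpha_corr = alpha / n_tests
          hits = 0
          for _ in range(n_sims):
              behaviour = rng.normal(size=n_subjects)
              region = true_r * behaviour + np.sqrt(1 - true_r ** 2) * rng.normal(size=n_subjects)
              r = np.corrcoef(behaviour, region)[0, 1]
              t = r * np.sqrt((n_subjects - 2) / (1 - r ** 2))   # t-statistic for a correlation
              p = 2 * stats.t.sf(abs(t), df=n_subjects - 2)
              hits += p < alpha_corr
          return hits / n_sims

      for n in (20, 30, 100, 500):
          print(f"n={n:4d}  estimated power ~ {power_sim(n):.2f}")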

  8. Statistically defining optimal conditions of coagulation time of skim milk

    International Nuclear Information System (INIS)

    Celebi, M.; Ozdemir, Z.O.; Eroglu, E.; Guney, I

    2014-01-01

    Milk consists largely of water together with a variety of proteins. Kappa-casein, one of these milk proteins, can be coagulated by the Mucor miehei rennet enzyme, an aspartic protease which cleaves the 105 (phenylalanine)-106 (methionine) peptide bond. It is commonly used for clotting milk proteins in cheese production in the dairy industry. The aim of this study was to measure the milk clotting times of skim milk using Mucor miehei rennet and to determine the optimal conditions for milk clotting time by mathematical modelling. In this research, milk clotting times of skim milk were measured at different pHs (3.0, 4.0, 5.0, 6.0, 7.0, 8.0) and temperatures (20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75 degree C). A statistical approach was used to define the best pH and temperature for the milk clotting time of skim milk. Milk clotting activity increased at acidic pHs and high temperatures. (author)

  9. Statistical inference on censored data for targeted clinical trials under enrichment design.

    Science.gov (United States)

    Chen, Chen-Fang; Lin, Jr-Rung; Liu, Jen-Pei

    2013-01-01

    For the traditional clinical trials, inclusion and exclusion criteria are usually based on some clinical endpoints; the genetic or genomic variability of the trial participants is not totally utilized in the criteria. After completion of the human genome project, the disease targets at the molecular level can be identified and can be utilized for the treatment of diseases. However, the accuracy of diagnostic devices for identification of such molecular targets is usually not perfect. Some of the patients enrolled in targeted clinical trials with a positive result for the molecular target might not have the specific molecular targets. As a result, the treatment effect may be underestimated in the patient population truly with the molecular target. To resolve this issue, under the exponential distribution, we develop inferential procedures for the treatment effects of the targeted drug based on the censored endpoints in the patients truly with the molecular targets. Under an enrichment design, we propose using the expectation-maximization algorithm in conjunction with the bootstrap technique to incorporate the inaccuracy of the diagnostic device for detection of the molecular targets on the inference of the treatment effects. A simulation study was conducted to empirically investigate the performance of the proposed methods. Simulation results demonstrate that under the exponential distribution, the proposed estimator is nearly unbiased with adequate precision, and the confidence interval can provide adequate coverage probability. In addition, the proposed testing procedure can adequately control the size with sufficient power. On the other hand, when the proportional hazard assumption is violated, additional simulation studies show that the type I error rate is not controlled at the nominal level and is an increasing function of the positive predictive value. A numerical example illustrates the proposed procedures. Copyright © 2013 John Wiley & Sons, Ltd.

  10. Optimal structural inference of signaling pathways from unordered and overlapping gene sets.

    Science.gov (United States)

    Acharya, Lipi R; Judeh, Thair; Wang, Guangdi; Zhu, Dongxiao

    2012-02-15

    A plethora of bioinformatics analysis has led to the discovery of numerous gene sets, which can be interpreted as discrete measurements emitted from latent signaling pathways. Their potential to infer signaling pathway structures, however, has not been sufficiently exploited. Existing methods accommodating discrete data do not explicitly consider signal cascading mechanisms that characterize a signaling pathway. Novel computational methods are thus needed to fully utilize gene sets and broaden the scope from focusing only on pairwise interactions to the more general cascading events in the inference of signaling pathway structures. We propose a gene set based simulated annealing (SA) algorithm for the reconstruction of signaling pathway structures. A signaling pathway structure is a directed graph containing up to a few hundred nodes and many overlapping signal cascades, where each cascade represents a chain of molecular interactions from the cell surface to the nucleus. Gene sets in our context refer to discrete sets of genes participating in signal cascades, the basic building blocks of a signaling pathway, with no prior information about gene orderings in the cascades. From a compendium of gene sets related to a pathway, SA aims to search for signal cascades that characterize the optimal signaling pathway structure. In the search process, the extent of overlap among signal cascades is used to measure the optimality of a structure. Throughout, we treat gene sets as random samples from a first-order Markov chain model. We evaluated the performance of SA in three case studies. In the first study conducted on 83 KEGG pathways, SA demonstrated a significantly better performance than Bayesian network methods. Since both SA and Bayesian network methods accommodate discrete data, use a 'search and score' network learning strategy and output a directed network, they can be compared in terms of performance and computational time. In the second study, we compared SA and

  11. Statistical analysis of real ILI (In-Line Inspection) data: implications, inferences and lessons learned

    Energy Technology Data Exchange (ETDEWEB)

    Timashev, Svyatoslav A.; Bushinskaya, Anna V. [Russian Academy of Sciences, Ekaterinburg (Russian Federation). Ural Branch. Sciences and Engineering Center ' Reliability and Safety of Large Systems and Machines'

    2009-07-01

    The paper discusses current possibilities and drawbacks of in-line inspection (ILI) in sizing defects in oil and gas pipelines. A methodology based on analysis of variances (ANOVA) is presented that extracts maximum possible information from the ILI measurements of defects and subsequent verification results. This full statistical analysis (FSA) methodology was extensively tested by using the Monte Carlo simulation method. It was then applied to analyze the content of sections 7, 9 and appendix E of the API 1163 RP Standard. (author)

  12. Effects of statistical models and items difficulties on making trait-level inferences: A simulation study

    Directory of Open Access Journals (Sweden)

    Nelson Hauck Filho

    2014-12-01

    Full Text Available Researchers dealing with the task of estimating locations of individuals on continuous latent variables may rely on several statistical models described in the literature. However, weighing costs and benefits of using one specific model over alternative models depends on empirical information that is not always clearly available. Therefore, the aim of this simulation study was to compare the performance of seven popular statistical models in providing adequate latent trait estimates in conditions of item difficulties targeted at the sample mean or at the tails of the latent trait distribution. Results suggested an overall tendency of models to provide more accurate estimates of true latent scores when using items targeted at the sample mean of the latent trait distribution. Rating Scale Model, Graded Response Model, and Weighted Least Squares Mean- and Variance-adjusted Confirmatory Factor Analysis yielded the most reliable latent trait estimates, even when applied to items inadequate for the sample distribution of the latent variable. These findings have important implications concerning some popular methodological practices in Psychology and related areas.

  13. Statistical pixelwise inference models for planar data analysis: an application to gamma-camera uniformity monitoring

    Energy Technology Data Exchange (ETDEWEB)

    Kalemis, A [Joint Department of Physics, Institute of Cancer Research and Royal Marsden NHS Foundation Trust, Downs Road, Sutton, Surrey SM2 5PT (United Kingdom); Bailey, D L [Department of Nuclear Medicine, Royal North Shore Hospital, St Leonards, NSW 2065 (Australia); Flower, M A [Joint Department of Physics, Institute of Cancer Research and Royal Marsden NHS Foundation Trust, Downs Road, Sutton, Surrey SM2 5PT (United Kingdom); Lord, S K [Joint Department of Physics, Institute of Cancer Research and Royal Marsden NHS Foundation Trust, Downs Road, Sutton, Surrey SM2 5PT (United Kingdom); Ott, R J [Joint Department of Physics, Institute of Cancer Research and Royal Marsden NHS Foundation Trust, Downs Road, Sutton, Surrey SM2 5PT (United Kingdom)

    2004-07-21

    In this paper two tests based on statistical models are presented and used to assess, quantify and provide positional information of the existence of bias and/or variations between planar images acquired at different times but under similar conditions. In the first test a linear regression model is fitted to the data in a pixelwise fashion, using three mathematical operators. In the second test a comparison using z-scoring is used based on the assumption that Poisson statistics are valid. For both tests the underlying assumptions are as simple and few as possible. The results are presented as parametric maps of either the three operators or the z-score. The z-score maps can then be thresholded to show the parts of the images which demonstrate change. Three different thresholding methods (naive, adaptive and multiple) are presented: together they cover almost all the needs for separating the signal from the background in the z-score maps. Where the expected size of the signal is known or can be estimated, a spatial correction technique (referred to as the reef correction) can be applied. These tests were applied to flood images used for the quality control of gamma camera uniformity. Simulated data were used to check the validity of the methods. Real data were acquired from four different cameras from two different institutions using a variety of acquisition parameters. The regression model found the bias in all five simulated cases and it also found patterns of unstable regions in real data where visual inspection of the flood images did not show any problems. In comparison the z-map revealed the differences in the simulated images from as low as 1.8 standard deviations from the mean, corresponding to a differential uniformity of 2.2% over the central field of view. In all cases studied, the reef correction increased significantly the sensitivity of the method and in most cases the specificity as well. The two proposed tests can be used either separately or in
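
    A minimal sketch of the second (z-scoring) test under the stated Poisson assumption: compare a reference flood with a later flood, form the pixelwise z-map z = (a - b)/sqrt(a + b), and apply a naive global threshold. The simulated floods, the 10% regional sensitivity drop and the threshold are invented, and neither the reef correction nor the adaptive/multiple thresholding schemes are included.

      import numpy as np

      rng = np.random.default_rng(0)
      shape = (64, 64)
      mean_counts = 2000.0

      # Reference flood and a later flood with a 10% sensitivity drop in a small region.
      reference = rng.poisson(mean_counts, size=shape).astype(float)
      sensitivity = np.ones(shape)
      sensitivity[30:38, 30:38] = 0.90
      current = rng.poisson(mean_counts * sensitivity, size=shape).astype(float)

      # Pixelwise z-score for the difference of two independent Poisson counts.
      # (In practice the two floods are first normalised to equal total counts;
      # here both share the same nominal count rate.)
      z = (current - reference) / np.sqrt(current + reference)

      threshold = 3.0                          # naive global threshold
      flagged = np.abs(z) > threshold
      print("pixels flagged in total:", int(flagged.sum()))
      print("flagged inside the simulated defect:", int(flagged[30:38, 30:38].sum()))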

  14. Statistical pixelwise inference models for planar data analysis: an application to gamma-camera uniformity monitoring

    International Nuclear Information System (INIS)

    Kalemis, A; Bailey, D L; Flower, M A; Lord, S K; Ott, R J

    2004-01-01

    In this paper two tests based on statistical models are presented and used to assess, quantify and provide positional information of the existence of bias and/or variations between planar images acquired at different times but under similar conditions. In the first test a linear regression model is fitted to the data in a pixelwise fashion, using three mathematical operators. In the second test a comparison using z-scoring is used based on the assumption that Poisson statistics are valid. For both tests the underlying assumptions are as simple and few as possible. The results are presented as parametric maps of either the three operators or the z-score. The z-score maps can then be thresholded to show the parts of the images which demonstrate change. Three different thresholding methods (naive, adaptive and multiple) are presented: together they cover almost all the needs for separating the signal from the background in the z-score maps. Where the expected size of the signal is known or can be estimated, a spatial correction technique (referred to as the reef correction) can be applied. These tests were applied to flood images used for the quality control of gamma camera uniformity. Simulated data were used to check the validity of the methods. Real data were acquired from four different cameras from two different institutions using a variety of acquisition parameters. The regression model found the bias in all five simulated cases and it also found patterns of unstable regions in real data where visual inspection of the flood images did not show any problems. In comparison the z-map revealed the differences in the simulated images from as low as 1.8 standard deviations from the mean, corresponding to a differential uniformity of 2.2% over the central field of view. In all cases studied, the reef correction increased significantly the sensitivity of the method and in most cases the specificity as well. The two proposed tests can be used either separately or in

  15. Simulations and cosmological inference: A statistical model for power spectra means and covariances

    International Nuclear Information System (INIS)

    Schneider, Michael D.; Knox, Lloyd; Habib, Salman; Heitmann, Katrin; Higdon, David; Nakhleh, Charles

    2008-01-01

    We describe an approximate statistical model for the sample variance distribution of the nonlinear matter power spectrum that can be calibrated from limited numbers of simulations. Our model retains the common assumption of a multivariate normal distribution for the power spectrum band powers but takes full account of the (parameter-dependent) power spectrum covariance. The model is calibrated using an extension of the framework in Habib et al. (2007) to train Gaussian processes for the power spectrum mean and covariance given a set of simulation runs over a hypercube in parameter space. We demonstrate the performance of this machinery by estimating the parameters of a power-law model for the power spectrum. Within this framework, our calibrated sample variance distribution is robust to errors in the estimated covariance and shows rapid convergence of the posterior parameter constraints with the number of training simulations.
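
    In the same spirit as the emulation framework described here, the sketch below trains one Gaussian-process regressor per band power over a small parameter hypercube and predicts the spectrum at a new parameter point. The power-law "simulations", the parameter ranges and the kernel are toy assumptions, and the paper's additional emulation of the covariance is not attempted.

      import numpy as np
      from sklearn.gaussian_process import GaussianProcessRegressor
      from sklearn.gaussian_process.kernels import RBF, ConstantKernel

      rng = np.random.default_rng(0)

      # Toy "simulations": log band powers from a power-law spectrum P(k) = A * k**n.
      k = np.logspace(-2, 0, 10)
      def band_powers(A, n):
          return np.log(A * k ** n)

      theta = rng.uniform([0.5, -2.0], [2.0, -0.5], size=(40, 2))   # training designs (A, n)
      Y = np.array([band_powers(A, n) for A, n in theta])

      # One GP emulator per band power, trained over the parameter hypercube.
      kernel = ConstantKernel(1.0) * RBF(length_scale=[0.5, 0.5])
      emulators = [GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(theta, Y[:, j])
                   for j in range(len(k))]

      test = np.array([[1.3, -1.2]])
      pred = np.array([gp.predict(test)[0] for gp in emulators])
      print("emulated log band powers:", pred.round(3))
      print("true log band powers:    ", band_powers(1.3, -1.2).round(3))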

  16. Statistical inference methods for two crossing survival curves: a comparison of methods.

    Science.gov (United States)

    Li, Huimin; Han, Dong; Hou, Yawen; Chen, Huilin; Chen, Zheng

    2015-01-01

    A common problem that is encountered in medical applications is the overall homogeneity of survival distributions when two survival curves cross each other. A survey demonstrated that under this condition, which was an obvious violation of the assumption of proportional hazard rates, the log-rank test was still used in 70% of studies. Several statistical methods have been proposed to solve this problem. However, in many applications, it is difficult to specify the types of survival differences and choose an appropriate method prior to analysis. Thus, we conducted an extensive series of Monte Carlo simulations to investigate the power and type I error rate of these procedures under various patterns of crossing survival curves with different censoring rates and distribution parameters. Our objective was to evaluate the strengths and weaknesses of tests in different situations and for various censoring rates and to recommend an appropriate test that will not fail for a wide range of applications. Simulation studies demonstrated that adaptive Neyman's smooth tests and the two-stage procedure offer higher power and greater stability than other methods when the survival distributions cross at early, middle or late times. Even for proportional hazards, both methods maintain acceptable power compared with the log-rank test. In terms of the type I error rate, Renyi and Cramér-von Mises tests are relatively conservative, whereas the statistics of the Lin-Xu test exhibit apparent inflation as the censoring rate increases. Other tests produce results close to the nominal 0.05 level. In conclusion, adaptive Neyman's smooth tests and the two-stage procedure are found to be the most stable and feasible approaches for a variety of situations and censoring rates. Therefore, they are applicable to a wider spectrum of alternatives compared with other tests.

  17. Inferring epidemiological dynamics of infectious diseases using Tajima's D statistic on nucleotide sequences of pathogens.

    Science.gov (United States)

    Kim, Kiyeon; Omori, Ryosuke; Ito, Kimihito

    2017-12-01

    The estimation of the basic reproduction number is essential to understand epidemic dynamics, and time series data of infected individuals are usually used for the estimation. However, such data are not always available. Methods to estimate the basic reproduction number using genealogy constructed from nucleotide sequences of pathogens have been proposed so far. Here, we propose a new method to estimate epidemiological parameters of outbreaks using the time series change of Tajima's D statistic on the nucleotide sequences of pathogens. To relate the time evolution of Tajima's D to the number of infected individuals, we constructed a parsimonious mathematical model describing both the transmission process of pathogens among hosts and the evolutionary process of the pathogens. As a case study we applied this method to the field data of nucleotide sequences of pandemic influenza A (H1N1) 2009 viruses collected in Argentina. The Tajima's D-based method estimated the basic reproduction number to be 1.55 with 95% highest posterior density (HPD) between 1.31 and 2.05, and the date of epidemic peak to be 10th July with 95% HPD between 22nd June and 9th August. The estimated basic reproduction number was consistent with estimation by birth-death skyline plot and estimation using the time series of the number of infected individuals. These results suggested that Tajima's D statistic on nucleotide sequences of pathogens could be useful to estimate epidemiological parameters of outbreaks. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.
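
    For reference, the statistic itself can be computed from an alignment with the standard Tajima (1989) normalization, as in the hedged sketch below; this is only the textbook D on a toy alignment, not the transmission-plus-evolution model that the paper fits to its time series.

      import numpy as np
      from itertools import combinations

      def tajimas_d(sequences):
          """Tajima's D for a list of equal-length haplotype strings."""
          n = len(sequences)
          # Number of segregating sites and mean pairwise differences.
          S = sum(len(set(col)) > 1 for col in zip(*sequences))
          pi = np.mean([sum(a != b for a, b in zip(s1, s2))
                        for s1, s2 in combinations(sequences, 2)])
          if S == 0:
              return 0.0
          a1 = sum(1.0 / i for i in range(1, n))
          a2 = sum(1.0 / i ** 2 for i in range(1, n))
          b1 = (n + 1) / (3.0 * (n - 1))
          b2 = 2.0 * (n ** 2 + n + 3) / (9.0 * n * (n - 1))
          c1 = b1 - 1.0 / a1
          c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
          e1, e2 = c1 / a1, c2 / (a1 ** 2 + a2)
          return (pi - S / a1) / np.sqrt(e1 * S + e2 * S * (S - 1))

      toy_alignment = ["AAGTC", "AAGTT", "ACGTC", "AAGCC"]
      print("Tajima's D:", round(tajimas_d(toy_alignment), 3))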

  18. Penultimate modeling of spatial extremes: statistical inference for max-infinitely divisible processes

    KAUST Repository

    Huser, Raphaël

    2018-01-09

    Extreme-value theory for stochastic processes has motivated the statistical use of max-stable models for spatial extremes. However, fitting such asymptotic models to maxima observed over finite blocks is problematic when the asymptotic stability of the dependence does not prevail in finite samples. This issue is particularly serious when data are asymptotically independent, such that the dependence strength weakens and eventually vanishes as events become more extreme. We here aim to provide flexible sub-asymptotic models for spatially indexed block maxima, which more realistically account for discrepancies between data and asymptotic theory. We develop models pertaining to the wider class of max-infinitely divisible processes, extending the class of max-stable processes while retaining dependence properties that are natural for maxima: max-id models are positively associated, and they yield a self-consistent family of models for block maxima defined over any time unit. We propose two parametric construction principles for max-id models, emphasizing a point process-based generalized spectral representation, that allows for asymptotic independence while keeping the max-stable extremal-t model as a special case. Parameter estimation is efficiently performed by pairwise likelihood, and we illustrate our new modeling framework with an application to Dutch wind gust maxima calculated over different time units.

  19. Bayesian inference – a way to combine statistical data and semantic analysis meaningfully

    Directory of Open Access Journals (Sweden)

    Eila Lindfors

    2011-11-01

    Full Text Available This article focuses on presenting the possibilities of Bayesian modelling (Finite Mixture Modelling) in the semantic analysis of statistically modelled data. The probability of a hypothesis in relation to the data available is an important question in inductive reasoning. Bayesian modelling allows the researcher to use many models at a time and provides tools to evaluate the goodness of different models. The researcher should always be aware that there is no such thing as the exact probability of an exact event. This is the reason for using probabilistic models. Each model presents a different perspective on the phenomenon in focus, and the researcher has to choose the most probable model with a view to previous research and the knowledge available. The idea of Bayesian modelling is illustrated here by presenting two different sets of data, one from craft science research (n=167) and the other (n=63) from educational research (Lindfors, 2007, 2002). The principles of how to build models and how to combine different profiles are described in the light of the research mentioned. Bayesian modelling is an analysis based on calculating probabilities in relation to a specific set of quantitative data. It is a tool for handling data and interpreting it semantically. The reliability of the analysis arises from an argumentation of which model can be selected from the model space as the basis for an interpretation, and on which arguments. Keywords: method, sloyd, Bayesian modelling, student teachers. URN:NBN:no-29959
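    As a generic illustration of finite mixture modelling with Bayesian regularization (not the specific modelling tool used in the article), the sketch below fits a Bayesian Gaussian mixture to hypothetical two-variable respondent scores with scikit-learn; the Dirichlet prior shrinks the weights of superfluous profiles towards zero, so the effective number of profiles can be read off the fitted weights.

      import numpy as np
      from sklearn.mixture import BayesianGaussianMixture

      # Hypothetical two-variable questionnaire scores for 167 respondents.
      rng = np.random.default_rng(1)
      data = np.vstack([rng.normal([2.0, 3.0], 0.5, size=(90, 2)),
                        rng.normal([4.0, 1.5], 0.6, size=(77, 2))])

      # Fit a mixture with an upper bound on the number of profiles; the Dirichlet
      # prior drives the weights of unneeded components towards zero.
      model = BayesianGaussianMixture(n_components=5, weight_concentration_prior=0.1,
                                      random_state=0).fit(data)
      print(np.round(model.weights_, 3))     # effective number of profiles
      print(model.predict(data[:10]))        # profile membership of the first respondents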

  20. Optimization of analytical parameters for inferring relationships among Escherichia coli isolates from repetitive-element PCR by maximizing correspondence with multilocus sequence typing data.

    Science.gov (United States)

    Goldberg, Tony L; Gillespie, Thomas R; Singer, Randall S

    2006-09-01

    Repetitive-element PCR (rep-PCR) is a method for genotyping bacteria based on the selective amplification of repetitive genetic elements dispersed throughout bacterial chromosomes. The method has great potential for large-scale epidemiological studies because of its speed and simplicity; however, objective guidelines for inferring relationships among bacterial isolates from rep-PCR data are lacking. We used multilocus sequence typing (MLST) as a "gold standard" to optimize the analytical parameters for inferring relationships among Escherichia coli isolates from rep-PCR data. We chose 12 isolates from a large database to represent a wide range of pairwise genetic distances, based on the initial evaluation of their rep-PCR fingerprints. We conducted MLST with these same isolates and systematically varied the analytical parameters to maximize the correspondence between the relationships inferred from rep-PCR and those inferred from MLST. Methods that compared the shapes of densitometric profiles ("curve-based" methods) yielded consistently higher correspondence values between data types than did methods that calculated indices of similarity based on shared and different bands (maximum correspondences of 84.5% and 80.3%, respectively). Curve-based methods were also markedly more robust in accommodating variations in user-specified analytical parameter values than were "band-sharing coefficient" methods, and they enhanced the reproducibility of rep-PCR. Phylogenetic analyses of rep-PCR data yielded trees with high topological correspondence to trees based on MLST and high statistical support for major clades. These results indicate that rep-PCR yields accurate information for inferring relationships among E. coli isolates and that accuracy can be enhanced with the use of analytical methods that consider the shapes of densitometric profiles.
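    The contrast between the two families of similarity measures can be sketched directly; the snippet below compares a curve-based similarity (Pearson correlation of whole densitometric profiles) with a band-sharing (Dice) coefficient on two hypothetical rep-PCR lanes. It illustrates only the idea, not the software settings optimized in the study.

      import numpy as np

      def curve_similarity(profile_a, profile_b):
          """'Curve-based' similarity: Pearson correlation of whole densitometric profiles."""
          return np.corrcoef(profile_a, profile_b)[0, 1]

      def band_sharing_similarity(bands_a, bands_b):
          """'Band-sharing' (Dice) similarity from sets of scored band positions."""
          shared = len(bands_a & bands_b)
          return 2.0 * shared / (len(bands_a) + len(bands_b))

      # Hypothetical rep-PCR lanes: a densitometric trace per isolate plus its band calls.
      x = np.linspace(0, 1, 500)
      lane1 = np.exp(-((x - 0.30) / 0.02) ** 2) + np.exp(-((x - 0.70) / 0.02) ** 2)
      lane2 = np.exp(-((x - 0.31) / 0.02) ** 2) + np.exp(-((x - 0.72) / 0.02) ** 2)
      print(curve_similarity(lane1, lane2))                          # robust to small shifts
      print(band_sharing_similarity({0.30, 0.70}, {0.31, 0.72}))     # misses slightly shifted bands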

  1. Constrained statistical inference: sample-size tables for ANOVA and regression

    Directory of Open Access Journals (Sweden)

    Leonard eVanbrabant

    2015-01-01

    Full Text Available Researchers in the social and behavioral sciences often have clear expectations about the order/direction of the parameters in their statistical model. For example, a researcher might expect that regression coefficient beta1 is larger than beta2 and beta3. The corresponding hypothesis is H: beta1 > {beta2, beta3} and this is known as an (order) constrained hypothesis. A major advantage of testing such a hypothesis is that power can be gained, and hence a smaller sample size is needed. This article discusses this gain in sample-size reduction when an increasing number of constraints is included in the hypothesis. The main goal is to present sample-size tables for constrained hypotheses. A sample-size table contains the necessary sample size at a prespecified power (say, 0.80) for an increasing number of constraints. To obtain sample-size tables, two Monte Carlo simulations were performed, one for ANOVA and one for multiple regression. Three results are salient. First, in an ANOVA the needed sample size decreases by 30% to 50% when complete ordering of the parameters is taken into account. Second, small deviations from the imposed order have only a minor impact on the power. Third, at the maximum number of constraints, the linear regression results are comparable with the ANOVA results. However, in the case of fewer constraints, ordering the parameters (e.g., beta1 > beta2) results in a higher power than assigning a positive or a negative sign to the parameters (e.g., beta1 > 0).
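    A minimal Monte Carlo sketch of the power gain from imposing an order constraint is given below: a one-sided (order-constrained) test of mu1 > mu2 is compared with the usual two-sided test in a two-group design. The effect size, sample sizes and number of simulations are hypothetical and do not reproduce the tables in the article.

      import numpy as np
      from scipy import stats

      def power(n, delta=0.4, n_sim=5000, alpha=0.05, one_sided=True, seed=0):
          """Monte Carlo power for detecting mu1 - mu2 = delta with a t-test."""
          rng = np.random.default_rng(seed)
          alternative = "greater" if one_sided else "two-sided"
          hits = 0
          for _ in range(n_sim):
              g1 = rng.normal(delta, 1.0, n)
              g2 = rng.normal(0.0, 1.0, n)
              hits += stats.ttest_ind(g1, g2, alternative=alternative).pvalue < alpha
          return hits / n_sim

      for n in (30, 50, 80):
          print(n, power(n, one_sided=True), power(n, one_sided=False))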

  2. Image-Data Compression Using Edge-Optimizing Algorithm for WFA Inference.

    Science.gov (United States)

    Culik, Karel II; Kari, Jarkko

    1994-01-01

    Presents an inference algorithm that produces a weighted finite automaton (WFA) representing, in particular, the grayness functions of graytone images. Image-data compression based on the new inference algorithm produces a WFA with a relatively small number of edges. Image-data compression results alone and in combination with wavelets are discussed.…

  3. Improving statistical inference on pathogen densities estimated by quantitative molecular methods: malaria gametocytaemia as a case study.

    Science.gov (United States)

    Walker, Martin; Basáñez, María-Gloria; Ouédraogo, André Lin; Hermsen, Cornelus; Bousema, Teun; Churcher, Thomas S

    2015-01-16

    Quantitative molecular methods (QMMs) such as quantitative real-time polymerase chain reaction (q-PCR), reverse-transcriptase PCR (qRT-PCR) and quantitative nucleic acid sequence-based amplification (QT-NASBA) are increasingly used to estimate pathogen density in a variety of clinical and epidemiological contexts. These methods are often classified as semi-quantitative, yet estimates of reliability or sensitivity are seldom reported. Here, a statistical framework is developed for assessing the reliability (uncertainty) of pathogen densities estimated using QMMs and the associated diagnostic sensitivity. The method is illustrated with quantification of Plasmodium falciparum gametocytaemia by QT-NASBA. The reliability of pathogen (e.g. gametocyte) densities, and the accompanying diagnostic sensitivity, estimated by two contrasting statistical calibration techniques, are compared; a traditional method and a mixed model Bayesian approach. The latter accounts for statistical dependence of QMM assays run under identical laboratory protocols and permits structural modelling of experimental measurements, allowing precision to vary with pathogen density. Traditional calibration cannot account for inter-assay variability arising from imperfect QMMs and generates estimates of pathogen density that have poor reliability, are variable among assays and inaccurately reflect diagnostic sensitivity. The Bayesian mixed model approach assimilates information from replica QMM assays, improving reliability and inter-assay homogeneity, providing an accurate appraisal of quantitative and diagnostic performance. Bayesian mixed model statistical calibration supersedes traditional techniques in the context of QMM-derived estimates of pathogen density, offering the potential to improve substantially the depth and quality of clinical and epidemiological inference for a wide variety of pathogens.

  4. Comparison between statistical and optimization methods in accessing unmixing of spectrally similar materials

    CSIR Research Space (South Africa)

    Debba, Pravesh

    2010-11-01

    Full Text Available This paper reports the results from ordinary least squares and ridge regression as statistical methods, which are compared with numerical optimization methods such as the stochastic method for global optimization, simulated annealing, particle swarm...

  5. Distributional Inference

    NARCIS (Netherlands)

    Kroese, A.H.; van der Meulen, E.A.; Poortema, Klaas; Schaafsma, W.

    1995-01-01

    The making of statistical inferences in distributional form is conceptually complicated because the epistemic 'probabilities' assigned are mixtures of fact and fiction. In this respect they are essentially different from 'physical' or 'frequency-theoretic' probabilities. The distributional form is

  6. Optimal Decision Rules in Repeated Games Where Players Infer an Opponent’s Mind via Simplified Belief Calculation

    Directory of Open Access Journals (Sweden)

    Mitsuhiro Nakamura

    2016-07-01

    Full Text Available In strategic situations, humans infer the state of mind of others, e.g., emotions or intentions, adapting their behavior appropriately. Nonetheless, evolutionary studies of cooperation typically focus only on reaction norms, e.g., tit for tat, whereby individuals make their next decisions by only considering the observed outcome rather than focusing on their opponent’s state of mind. In this paper, we analyze repeated two-player games in which players explicitly infer their opponent’s unobservable state of mind. Using Markov decision processes, we investigate optimal decision rules and their performance in cooperation. The state-of-mind inference requires Bayesian belief calculations, which is computationally intensive. We therefore study two models in which players simplify these belief calculations. In Model 1, players adopt a heuristic to approximately infer their opponent’s state of mind, whereas in Model 2, players use information regarding their opponent’s previous state of mind, obtained from external evidence, e.g., emotional signals. We show that players in both models reach almost optimal behavior through commitment-like decision rules by which players are committed to selecting the same action regardless of their opponent’s behavior. These commitment-like decision rules can enhance or reduce cooperation depending on the opponent’s strategy.

  7. Cluster-level statistical inference in fMRI datasets: The unexpected behavior of random fields in high dimensions.

    Science.gov (United States)

    Bansal, Ravi; Peterson, Bradley S

    2018-06-01

    Identifying regional effects of interest in MRI datasets usually entails testing a priori hypotheses across many thousands of brain voxels, requiring control of false positive findings across these multiple hypothesis tests. Recent studies have suggested that parametric statistical methods may have incorrectly modeled functional MRI data, thereby leading to higher false positive rates than their nominal rates. Nonparametric methods for statistical inference when conducting multiple statistical tests, in contrast, are thought to produce false positives at the nominal rate, which has thus led to the suggestion that previously reported studies should reanalyze their fMRI data using nonparametric tools. To understand better why parametric methods may yield excessive false positives, we assessed their performance when applied both to simulated datasets of 1D, 2D, and 3D Gaussian Random Fields (GRFs) and to 710 real-world, resting-state fMRI datasets. We showed that both the simulated 2D and 3D GRFs and the real-world data contain a small percentage (<6%) of very large clusters (on average 60 times larger than the average cluster size), which were not present in 1D GRFs. These unexpectedly large clusters were deemed statistically significant using parametric methods, leading to empirical familywise error rates (FWERs) as high as 65%: the high empirical FWERs were not a consequence of parametric methods failing to model spatial smoothness accurately, but rather of these very large clusters that are inherently present in smooth, high-dimensional random fields. In fact, when discounting these very large clusters, the empirical FWER for parametric methods was 3.24%. Furthermore, even an empirical FWER of 65% would yield on average less than one of those very large clusters in each brain-wide analysis. Nonparametric methods, in contrast, estimated distributions from those large clusters, and therefore, by construction rejected the large clusters as false positives at the nominal rate.
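    The cluster-size phenomenon described above can be reproduced qualitatively in a few lines: the sketch below smooths white noise into 2D Gaussian random fields, thresholds them, and tabulates suprathreshold cluster sizes, occasionally turning up clusters far larger than the average. Field size, smoothing width and threshold are arbitrary choices, not those used in the study.

      import numpy as np
      from scipy.ndimage import gaussian_filter, label

      def cluster_sizes(shape=(256, 256), smooth_sigma=4.0, z_thresh=2.3, seed=0):
          """Sizes of suprathreshold clusters in one smoothed 2D Gaussian random field."""
          rng = np.random.default_rng(seed)
          field = gaussian_filter(rng.standard_normal(shape), smooth_sigma)
          field /= field.std()                      # re-standardize after smoothing
          labeled, n_clusters = label(field > z_thresh)
          return np.bincount(labeled.ravel())[1:]   # drop the background label 0

      sizes = np.concatenate([cluster_sizes(seed=s) for s in range(50)])
      print("mean cluster size:", sizes.mean())
      print("largest / mean   :", sizes.max() / sizes.mean())   # the occasional very large cluster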

  8. Optimization of inspection and replacement period by using Bayesian statistics

    International Nuclear Information System (INIS)

    Kasai, Masao; Watanabe, Yasushi; Kusakari, Yoshiyuki; Notoya, Junichi

    2006-01-01

    This study describes formulations to optimize the time interval of inspections and/or replacements of equipment/parts, taking into account the probability density functions (PDF) for failure rates and for the parameters of failure distribution functions (FDF), and it evaluates the time intervals optimized with these formulations by comparing them with those obtained using only representative values of the failure rates and FDF parameters instead of the full PDFs. The PDFs are obtained with a Bayesian method and the representative values with a likelihood estimation method. However, no significant difference was observed between the two sets of optimized results in our preliminary calculations. (author)

  9. Kinetic Analysis of Dynamic Positron Emission Tomography Data using Open-Source Image Processing and Statistical Inference Tools.

    Science.gov (United States)

    Hawe, David; Hernández Fernández, Francisco R; O'Suilleabháin, Liam; Huang, Jian; Wolsztynski, Eric; O'Sullivan, Finbarr

    2012-05-01

    In dynamic mode, positron emission tomography (PET) can be used to track the evolution of injected radio-labelled molecules in living tissue. This is a powerful diagnostic imaging technique that provides a unique opportunity to probe the status of healthy and pathological tissue by examining how it processes substrates. The spatial aspect of PET is well established in the computational statistics literature. This article focuses on its temporal aspect. The interpretation of PET time-course data is complicated because the measured signal is a combination of vascular delivery and tissue retention effects. If the arterial time-course is known, the tissue time-course can typically be expressed in terms of a linear convolution between the arterial time-course and the tissue residue. In statistical terms, the residue function is essentially a survival function - a familiar life-time data construct. Kinetic analysis of PET data is concerned with estimation of the residue and associated functionals such as flow, flux, volume of distribution and transit time summaries. This review emphasises a nonparametric approach to the estimation of the residue based on a piecewise linear form. Rapid implementation of this by quadratic programming is described. The approach provides a reference for statistical assessment of widely used one- and two-compartmental model forms. We illustrate the method with data from two of the most well-established PET radiotracers, (15)O-H(2)O and (18)F-fluorodeoxyglucose, used for assessment of blood perfusion and glucose metabolism respectively. The presentation illustrates the use of two open-source tools, AMIDE and R, for PET scan manipulation and model inference.
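    The convolution structure of the tissue time-course is easy to state in code. The sketch below uses a hypothetical arterial input function and a simple one-compartment exponential residue, rather than the nonparametric piecewise-linear residue and quadratic-programming fit described in the article.

      import numpy as np

      dt = 1.0                                   # seconds per time frame
      t = np.arange(0, 600, dt)

      # Hypothetical arterial input function (gamma-variate shaped bolus).
      ca = (t / 30.0) ** 2 * np.exp(-t / 30.0)

      # One-compartment residue: fraction of delivered tracer still resident at time t.
      K1, k2 = 0.1, 0.01
      residue = np.exp(-k2 * t)

      # Tissue time-course = K1 * (arterial input convolved with the residue).
      tissue = K1 * np.convolve(ca, residue)[: len(t)] * dt

      print(tissue[::60])                        # one value per minute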

  10. Statistical Inference on Optimal Points to Evaluate Multi-State Classification Systems

    Science.gov (United States)

    2014-09-18

    A multi-state classification system maps an event set, E = (ε1, ε2, ..., εk), to k distinct elements of a label set, L = (l1, l2, ..., lk); these partitions may be referred to as classes. The assignment is based on a set of features, F = (f1, f2, ..., fm), which are used to assign the different elements of E to their respective labels in L.

  11. A Simulation Approach to Statistical Estimation of Multiperiod Optimal Portfolios

    Directory of Open Access Journals (Sweden)

    Hiroshi Shiraishi

    2012-01-01

    Full Text Available This paper discusses a simulation-based method for solving discrete-time multiperiod portfolio choice problems under an AR(1) process. The method is applicable even if the distributions of the return processes are unknown. We first generate simulation sample paths of the random returns by using an AR bootstrap. Then, for each sample path and each investment time, we obtain an optimal portfolio estimator, which optimizes a constant relative risk aversion (CRRA) utility function. When an investor considers an optimal investment strategy with portfolio rebalancing, it is convenient to introduce a value function. The most important difference between single-period portfolio choice problems and multiperiod ones is that the value function is time dependent. Our method takes care of the time dependency by using bootstrapped sample paths. Numerical studies are provided to examine the validity of our method. The results show the necessity of accounting for the time dependency of the value function.
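    The two ingredients named above, an AR(1) residual bootstrap for return paths and CRRA expected-utility maximization, can be sketched as follows for a single rebalancing decision; the full time-dependent value-function recursion is omitted, and the return series, risk-free rate and risk aversion below are hypothetical.

      import numpy as np

      rng = np.random.default_rng(0)

      # Hypothetical monthly return history (white noise standing in for real data).
      returns = rng.normal(0.005, 0.02, 500)

      # Crude AR(1) fit r_t = c + phi * r_{t-1} + e_t (lag-1 autocorrelation as estimator).
      phi = np.corrcoef(returns[:-1], returns[1:])[0, 1]
      c = returns.mean() * (1 - phi)
      resid = returns[1:] - c - phi * returns[:-1]

      def bootstrap_paths(n_paths=2000, horizon=12):
          """AR(1) bootstrap: keep fitted coefficients, resample residuals with replacement."""
          paths = np.empty((n_paths, horizon))
          r = np.full(n_paths, returns[-1])
          for h in range(horizon):
              r = c + phi * r + rng.choice(resid, size=n_paths)
              paths[:, h] = r
          return paths

      def crra_utility(wealth, gamma=3.0):
          return wealth ** (1 - gamma) / (1 - gamma)

      paths = bootstrap_paths()
      rf = 0.001                                   # hypothetical per-period risk-free rate
      weights = np.linspace(0.0, 1.0, 21)          # candidate fractions held in the risky asset
      expected_u = [crra_utility(np.prod(1 + rf + w * (paths - rf), axis=1)).mean()
                    for w in weights]
      print("estimated optimal risky weight:", weights[int(np.argmax(expected_u))])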

  12. Genetic interaction motif finding by expectation maximization – a novel statistical model for inferring gene modules from synthetic lethality

    Directory of Open Access Journals (Sweden)

    Ye Ping

    2005-12-01

    Full Text Available Abstract Background Synthetic lethality experiments identify pairs of genes with complementary function. More direct functional associations (for example, a greater probability of membership in a single protein complex) may be inferred between genes that share synthetic lethal interaction partners than between genes that are directly synthetic lethal. Probabilistic algorithms that identify gene modules based on motif discovery are highly appropriate for the analysis of synthetic lethal genetic interaction data and have great potential in integrative analysis of heterogeneous datasets. Results We have developed Genetic Interaction Motif Finding (GIMF), an algorithm for unsupervised motif discovery from synthetic lethal interaction data. Interaction motifs are characterized by position weight matrices and optimized through expectation maximization. Given a seed gene, GIMF performs a nonlinear transform on the input genetic interaction data and automatically assigns genes to the motif or non-motif category. We demonstrate the capacity to extract known and novel pathways for Saccharomyces cerevisiae (budding yeast). Annotations suggested for several uncharacterized genes are supported by recent experimental evidence. GIMF is efficient in computation, requires no training and automatically down-weights promiscuous genes with high degrees. Conclusion GIMF effectively identifies pathways from synthetic lethality data with several unique features. It is mostly suitable for building gene modules around seed genes. Optimal choice of one single model parameter allows construction of gene networks with different levels of confidence. Because the impact of hub genes is down-weighted, the generic probabilistic framework of GIMF may be used to group other types of biological entities such as proteins based on stochastic motifs. Analysis of the strongest motifs discovered by the algorithm indicates that synthetic lethal interactions are depleted between genes within a motif, suggesting that synthetic

  13. Statistical optimization of process parameters for the production of ...

    African Journals Online (AJOL)

    In this study, optimization of process parameters such as moisture content, incubation temperature and initial pH (fixed) for the improvement of citric acid production from oil palm empty fruit bunches through solid state bioconversion was carried out using traditional one-factor-at-a-time (OFAT) method and response surface ...

  14. Statistical optimization of xylanase production by Aspergillus niger ...

    African Journals Online (AJOL)

    2008-03-04

    Reducing the cost of enzyme production through optimization is one of the main factors determining the economics of a process, and this motivates the statistical optimization of xylanase production by Aspergillus niger.

  15. Statistical analysis and optimization of copper biosorption capability ...

    African Journals Online (AJOL)

    These three variables were further adopted by the three-level Box– Behnken design to correlate between the three variables and to estimate the optimal conditions required for Cu2+ biosorption. Although the maximum value for Cu2+ biosorption was 85%, the calculated optimum percentage was 72%, which was a 2.57-fold ...

  16. Statistical optimization of substrate, carbon and nitrogen source by ...

    African Journals Online (AJOL)

    2009-11-16

    A total of 20 shake-flask media (50 mL in 250 mL Erlenmeyer flasks) were used for the statistical optimization of the substrate, carbon and nitrogen sources for the production of pectinases, which are used in extraction, in chocolate and tea fermentation and in vegetable waste processing.

  17. A Robust Statistics Approach to Minimum Variance Portfolio Optimization

    Science.gov (United States)

    Yang, Liusha; Couillet, Romain; McKay, Matthew R.

    2015-12-01

    We study the design of portfolios under a minimum risk criterion. The performance of the optimized portfolio relies on the accuracy of the estimated covariance matrix of the portfolio asset returns. For large portfolios, the number of available market returns is often of similar order to the number of assets, so that the sample covariance matrix performs poorly as a covariance estimator. Additionally, financial market data often contain outliers which, if not correctly handled, may further corrupt the covariance estimation. We address these shortcomings by studying the performance of a hybrid covariance matrix estimator based on Tyler's robust M-estimator and on Ledoit-Wolf's shrinkage estimator while assuming samples with heavy-tailed distribution. Employing recent results from random matrix theory, we develop a consistent estimator of (a scaled version of) the realized portfolio risk, which is minimized by optimizing online the shrinkage intensity. Our portfolio optimization method is shown via simulations to outperform existing methods both for synthetic and real market data.
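    As a baseline for the kind of estimator the paper improves upon, the sketch below computes global minimum-variance weights from a Ledoit-Wolf shrinkage covariance estimate using scikit-learn; it does not implement the hybrid Tyler/Ledoit-Wolf estimator or the online tuning of the shrinkage intensity developed in the paper, and the return data are simulated.

      import numpy as np
      from sklearn.covariance import LedoitWolf

      rng = np.random.default_rng(0)
      n_days, n_assets = 250, 100                  # sample size comparable to the dimension
      returns = rng.multivariate_normal(np.zeros(n_assets),
                                        0.0004 * np.eye(n_assets), size=n_days)

      sigma = LedoitWolf().fit(returns).covariance_   # shrinkage covariance estimate

      # Global minimum-variance portfolio: w = Sigma^{-1} 1 / (1' Sigma^{-1} 1).
      ones = np.ones(n_assets)
      w = np.linalg.solve(sigma, ones)
      w /= w @ ones

      print("weights sum to:", w.sum())
      print("estimated portfolio risk:", np.sqrt(w @ sigma @ w))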

  18. Optimal Inference for Instrumental Variables Regression with non-Gaussian Errors

    DEFF Research Database (Denmark)

    Cattaneo, Matias D.; Crump, Richard K.; Jansson, Michael

    This paper is concerned with inference on the coefficient on the endogenous regressor in a linear instrumental variables model with a single endogenous regressor, nonrandom exogenous regressors and instruments, and i.i.d. errors whose distribution is unknown. It is shown that under mild smoothness...

  19. Multivariate Statistical Process Optimization in the Industrial Production of Enzymes

    DEFF Research Database (Denmark)

    Klimkiewicz, Anna

    of product yield. The potential of NIR technology to monitor the activity of the enzyme has been the subject of a feasibility study presented in PAPER I. It included (a) evaluation of which of the two real-time NIR flow cell configurations is the preferred arrangement for monitoring of the retentate stream downstream. ... Strategies for the organization of these datasets, with varying numbers of timestamps, into data structures fit for latent variable (LV) modeling have been compared. The ultimate aim of the data mining steps is the construction of statistical ‘soft models’ which capture the principal or latent behavior...

  20. Improvement of characteristic statistic algorithm and its application on equilibrium cycle reloading optimization

    International Nuclear Information System (INIS)

    Hu, Y.; Liu, Z.; Shi, X.; Wang, B.

    2006-01-01

    A brief introduction to the characteristic statistic algorithm (CSA), a new global optimization algorithm for solving the problem of PWR in-core fuel management optimization, is given in the paper. CSA is modified by the adoption of a back-propagation neural network and fast local adjustment. The modified CSA is then applied to PWR equilibrium cycle reloading optimization, and the corresponding optimization code CSA-DYW is developed. CSA-DYW is used to optimize the 18-month equilibrium reloading cycle of the Daya Bay nuclear plant Unit 1 reactor. The results show that CSA-DYW has high efficiency and good global performance on PWR equilibrium cycle reloading optimization. (authors)

  1. Final Report: Large-Scale Optimization for Bayesian Inference in Complex Systems

    Energy Technology Data Exchange (ETDEWEB)

    Ghattas, Omar [The University of Texas at Austin

    2013-10-15

    The SAGUARO (Scalable Algorithms for Groundwater Uncertainty Analysis and Robust Optimization) Project focuses on the development of scalable numerical algorithms for large-scale Bayesian inversion in complex systems that capitalize on advances in large-scale simulation-based optimization and inversion methods. Our research is directed in three complementary areas: efficient approximations of the Hessian operator, reductions in complexity of forward simulations via stochastic spectral approximations and model reduction, and employing large-scale optimization concepts to accelerate sampling. Our efforts are integrated in the context of a challenging testbed problem that considers subsurface reacting flow and transport. The MIT component of the SAGUARO Project addresses the intractability of conventional sampling methods for large-scale statistical inverse problems by devising reduced-order models that are faithful to the full-order model over a wide range of parameter values; sampling then employs the reduced model rather than the full model, resulting in very large computational savings. Results indicate little effect on the computed posterior distribution. On the other hand, in the Texas-Georgia Tech component of the project, we retain the full-order model, but exploit inverse problem structure (adjoint-based gradients and partial Hessian information of the parameter-to-observation map) to implicitly extract lower dimensional information on the posterior distribution; this greatly speeds up sampling methods, so that fewer sampling points are needed. We can think of these two approaches as "reduce then sample" and "sample then reduce." In fact, these two approaches are complementary, and can be used in conjunction with each other. Moreover, they both exploit deterministic inverse problem structure, in the form of adjoint-based gradient and Hessian information of the underlying parameter-to-observation map, to achieve their speedups.

  2. Estimating the Optimal Dosage of Sodium Valproate in Idiopathic Generalized Epilepsy with Adaptive Neuro-Fuzzy Inference System

    Directory of Open Access Journals (Sweden)

    Somayyeh Lotfi Noghabi

    2012-07-01

    Full Text Available Introduction: Epilepsy is a clinical syndrome in which seizures have a tendency to recur. Sodium valproate is the most effective drug in the treatment of all types of generalized seizures. Finding the optimal dosage (the lowest effective dose) of sodium valproate is a real challenge for all neurologists. In this study, a new approach based on an Adaptive Neuro-Fuzzy Inference System (ANFIS) was presented for estimating the optimal dosage of sodium valproate in IGE (Idiopathic Generalized Epilepsy) patients. Methods: 40 patients with Idiopathic Generalized Epilepsy, who were referred to the neurology department of Mashhad University of Medical Sciences between the years 2006-2011, were included in this study. The Adaptive Neuro-Fuzzy Inference System (ANFIS) constructs a Fuzzy Inference System (FIS) whose membership function parameters are tuned (adjusted) using either a back-propagation algorithm alone, or in combination with a least-squares type of method (hybrid algorithm). In this study, we used the hybrid method for adjusting the parameters. Results: The R-square of the proposed system was 59.8% and the Pearson correlation coefficient was significant (P < 0.05). Although the accuracy of the model was not high, it was good enough to be applied for treating IGE patients with sodium valproate. Discussion: This paper presented a new application of ANFIS for estimating the optimal dosage of sodium valproate in IGE patients. Fuzzy set theory plays an important role in dealing with uncertainty when making decisions in medical applications. Collectively, it seems that ANFIS has a high capacity to be applied in medical sciences, especially neurology.

  3. Statistical Optimization of Sustained Release Venlafaxine HCI Wax Matrix Tablet.

    Science.gov (United States)

    Bhalekar, M R; Madgulkar, A R; Sheladiya, D D; Kshirsagar, S J; Wable, N D; Desale, S S

    2008-01-01

    The purpose of this research was to prepare a sustained release drug delivery system of venlafaxine hydrochloride by using a wax matrix system. The effects of bees wax and carnauba wax on the drug release profile were investigated. A 3² full factorial design was applied to systematically optimize the drug release profile. The amounts of carnauba wax (X1) and bees wax (X2) were selected as independent variables, and release after 12 h and the time required for 50% drug release (t50) were selected as dependent variables. A mathematical model was generated for each response parameter. Both waxes retarded release after 12 h and increased t50, but bees wax showed a significant influence. The drug release pattern for all the formulation combinations was found to approach the Peppas kinetic model. A suitable combination of the two waxes provided a fairly well regulated release profile. The response surfaces and contour plots for each response parameter are presented for further interpretation of the results. The optimum formulations were chosen and their predicted results were found to be in close agreement with the experimental findings.

  4. Phase Transitions in Combinatorial Optimization Problems Basics, Algorithms and Statistical Mechanics

    CERN Document Server

    Hartmann, Alexander K

    2005-01-01

    A concise, comprehensive introduction to the topic of statistical physics of combinatorial optimization, bringing together theoretical concepts and algorithms from computer science with analytical methods from physics. The result bridges the gap between statistical physics and combinatorial optimization, investigating problems taken from theoretical computing, such as the vertex-cover problem, with the concepts and methods of theoretical physics. The authors cover rapid developments and analytical methods that are both extremely complex and spread by word-of-mouth, providing all the necessary

  5. Characteristic statistic algorithm (CSA) for in-core loading pattern optimization

    International Nuclear Information System (INIS)

    Liu Zhihong; Hu Yongming; Shi Gong

    2007-01-01

    To solve the problem of PWR in-core loading pattern optimization, a more suitable global optimization algorithm, i.e., Characteristic statistic algorithm (CSA), is used. The searching process of this algorithm and how to apply it to this problem are presented. Loading pattern optimization code SCYCLE is developed. Two different problems on real PWR models are calculated and the results are compared with other algorithms. It is shown that SCYCLE has high efficiency and good global performance on this problem. (authors)

  6. Sensitivity analysis and optimization of system dynamics models : Regression analysis and statistical design of experiments

    NARCIS (Netherlands)

    Kleijnen, J.P.C.

    1995-01-01

    This tutorial discusses what-if analysis and optimization of System Dynamics models. These problems are solved, using the statistical techniques of regression analysis and design of experiments (DOE). These issues are illustrated by applying the statistical techniques to a System Dynamics model for

  7. Is the P-Value Really Dead? Assessing Inference Learning Outcomes for Social Science Students in an Introductory Statistics Course

    Science.gov (United States)

    Lane-Getaz, Sharon

    2017-01-01

    In reaction to misuses and misinterpretations of p-values and confidence intervals, a social science journal editor banned p-values from its pages. This study aimed to show that education could address misuse and abuse. This study examines inference-related learning outcomes for social science students in an introductory course supplemented with…

  8. [Confidence interval or p-value--similarities and differences between two important methods of statistical inference of quantitative studies].

    Science.gov (United States)

    Harari, Gil

    2014-01-01

    Statistical significance, expressed as the p-value, and the CI (Confidence Interval) are common statistical measures and are essential for the statistical analysis of studies in medicine and the life sciences. These measures provide complementary information about statistical probability and support conclusions regarding the clinical significance of study findings. This article describes the two methodologies, compares them, assesses their suitability for the different needs of study results analysis, and explains situations in which each method should be used.

  9. Intelligent Modeling Combining Adaptive Neuro Fuzzy Inference System and Genetic Algorithm for Optimizing Welding Process Parameters

    Science.gov (United States)

    Gowtham, K. N.; Vasudevan, M.; Maduraimuthu, V.; Jayakumar, T.

    2011-04-01

    Modified 9Cr-1Mo ferritic steel is used as a structural material for steam generator components of power plants. Generally, tungsten inert gas (TIG) welding is preferred for welding of these steels in which the depth of penetration achievable during autogenous welding is limited. Therefore, activated flux TIG (A-TIG) welding, a novel welding technique, has been developed in-house to increase the depth of penetration. In modified 9Cr-1Mo steel joints produced by the A-TIG welding process, weld bead width, depth of penetration, and heat-affected zone (HAZ) width play an important role in determining the mechanical properties as well as the performance of the weld joints during service. To obtain the desired weld bead geometry and HAZ width, it becomes important to set the welding process parameters. In this work, adaptative neuro fuzzy inference system is used to develop independent models correlating the welding process parameters like current, voltage, and torch speed with weld bead shape parameters like depth of penetration, bead width, and HAZ width. Then a genetic algorithm is employed to determine the optimum A-TIG welding process parameters to obtain the desired weld bead shape parameters and HAZ width.

  10. Optimal Experimental Design of Borehole Locations for Bayesian Inference of Past Ice Sheet Surface Temperatures

    Science.gov (United States)

    Davis, A. D.; Huan, X.; Heimbach, P.; Marzouk, Y.

    2017-12-01

    Borehole data are essential for calibrating ice sheet models. However, field expeditions for acquiring borehole data are often time-consuming, expensive, and dangerous. It is thus essential to plan the best sampling locations that maximize the value of data while minimizing costs and risks. We present an uncertainty quantification (UQ) workflow based on a rigorous probabilistic framework to achieve these objectives. First, we employ an optimal experimental design (OED) procedure to compute borehole locations that yield the highest expected information gain. We take into account practical considerations of location accessibility (e.g., proximity to research sites, terrain, and ice velocity may affect the feasibility of drilling) and robustness (e.g., real-time constraints such as weather may force researchers to drill at sub-optimal locations near those originally planned) by incorporating a penalty reflecting accessibility as well as sensitivity to deviations from the optimal locations. Next, we extract vertical temperature profiles from these boreholes and formulate a Bayesian inverse problem to reconstruct past surface temperatures. Using a model of temperature advection/diffusion, the top boundary condition (corresponding to surface temperatures) is calibrated via efficient Markov chain Monte Carlo (MCMC). The overall procedure can then be iterated to choose new optimal borehole locations for the next expeditions. Through this work, we demonstrate powerful UQ methods for designing experiments, calibrating models, making predictions, and assessing sensitivity, all performed under uncertainty. We develop a theoretical framework as well as practical software within an intuitive workflow, and illustrate their usefulness for combining data and models for environmental and climate research.
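    The expected-information-gain criterion used for choosing borehole locations can be illustrated with a toy, one-parameter Gaussian surrogate and a nested Monte Carlo estimator; the forward model, prior, noise level and candidate locations below are hypothetical stand-ins and are not part of the authors' workflow.

      import numpy as np
      from scipy.special import logsumexp

      rng = np.random.default_rng(0)

      def forward(theta, design):
          """Hypothetical surrogate: the measured response depends on the chosen location."""
          return np.sin(design) * theta

      def expected_information_gain(design, n_outer=400, n_inner=400, noise_sd=0.1):
          """Nested Monte Carlo estimate of EIG(design) for a N(0, 1) prior on theta."""
          theta_out = rng.standard_normal(n_outer)
          y = forward(theta_out, design) + noise_sd * rng.standard_normal(n_outer)
          theta_in = rng.standard_normal(n_inner)
          # Log-likelihoods (up to a constant that cancels in the EIG difference).
          log_lik = -0.5 * ((y - forward(theta_out, design)) / noise_sd) ** 2
          log_lik_inner = -0.5 * ((y[:, None] - forward(theta_in[None, :], design)) / noise_sd) ** 2
          log_evidence = logsumexp(log_lik_inner, axis=1) - np.log(n_inner)
          return np.mean(log_lik - log_evidence)

      candidates = np.linspace(0.1, 3.0, 15)       # hypothetical candidate borehole locations
      eig = [expected_information_gain(d) for d in candidates]
      print("most informative candidate:", candidates[int(np.argmax(eig))])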

  11. Statistical identifiability and convergence evaluation for nonlinear pharmacokinetic models with particle swarm optimization.

    Science.gov (United States)

    Kim, Seongho; Li, Lang

    2014-02-01

    The statistical identifiability of nonlinear pharmacokinetic (PK) models with the Michaelis-Menten (MM) kinetic equation is considered using a global optimization approach, namely particle swarm optimization (PSO). If a model is statistically non-identifiable, the conventional derivative-based estimation approach is often terminated early without converging, due to the singularity. To circumvent this difficulty, we develop a derivative-free global optimization algorithm by combining PSO with a derivative-free local optimization algorithm to improve the rate of convergence of PSO. We further propose an efficient approach to not only checking the convergence of estimation but also detecting the identifiability of nonlinear PK models. PK simulation studies demonstrate that the convergence and identifiability of the PK model can be detected efficiently through the proposed approach. The proposed approach is then applied to clinical PK data along with a two-compartmental model.
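    A minimal particle swarm optimizer, fitting the Michaelis-Menten equation to hypothetical concentration-rate data, conveys the global-search idea referred to above; it omits the derivative-free local search and the convergence/identifiability diagnostics that the paper adds on top of PSO.

      import numpy as np

      rng = np.random.default_rng(0)

      # Hypothetical Michaelis-Menten data: v = Vmax * C / (Km + C) + noise.
      conc = np.array([0.5, 1, 2, 5, 10, 20, 40.0])
      rate = 8.0 * conc / (3.0 + conc) + rng.normal(0, 0.2, conc.size)

      def sse(params):
          vmax, km = params
          return np.sum((rate - vmax * conc / (km + conc)) ** 2)

      def pso(objective, bounds, n_particles=30, n_iter=200, w=0.7, c1=1.5, c2=1.5):
          """Minimal particle swarm optimization within box bounds."""
          lo, hi = np.array(bounds).T
          x = rng.uniform(lo, hi, size=(n_particles, len(bounds)))
          v = np.zeros_like(x)
          pbest, pbest_val = x.copy(), np.array([objective(p) for p in x])
          gbest = pbest[pbest_val.argmin()].copy()
          for _ in range(n_iter):
              r1, r2 = rng.random(x.shape), rng.random(x.shape)
              v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
              x = np.clip(x + v, lo, hi)
              vals = np.array([objective(p) for p in x])
              improved = vals < pbest_val
              pbest[improved], pbest_val[improved] = x[improved], vals[improved]
              gbest = pbest[pbest_val.argmin()].copy()
          return gbest, pbest_val.min()

      print(pso(sse, bounds=[(0.1, 50.0), (0.01, 50.0)]))   # estimates of (Vmax, Km)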

  12. Three-dimensional reconstruction of statistically optimal unit cells of polydisperse particulate composites from microtomography

    International Nuclear Information System (INIS)

    Lee, H.; Brandyberry, M.; Tudor, A.; Matous, K.

    2009-01-01

    In this paper, we present a systematic approach for characterization and reconstruction of statistically optimal representative unit cells of polydisperse particulate composites. Microtomography is used to gather rich three-dimensional data of a packed glass bead system. First-, second-, and third-order probability functions are used to characterize the morphology of the material, and the parallel augmented simulated annealing algorithm is employed for reconstruction of the statistically equivalent medium. Both the fully resolved probability spectrum and the geometrically exact particle shapes are considered in this study, rendering the optimization problem multidimensional with a highly complex objective function. A ten-phase particulate composite composed of packed glass beads in a cylindrical specimen is investigated, and a unit cell is reconstructed on massively parallel computers. Further, rigorous error analysis of the statistical descriptors (probability functions) is presented and a detailed comparison between statistics of the voxel-derived pack and the representative cell is made.

  13. IZI: INFERRING THE GAS PHASE METALLICITY (Z) AND IONIZATION PARAMETER (q) OF IONIZED NEBULAE USING BAYESIAN STATISTICS

    International Nuclear Information System (INIS)

    Blanc, Guillermo A.; Kewley, Lisa; Vogt, Frédéric P. A.; Dopita, Michael A.

    2015-01-01

    We present a new method for inferring the metallicity (Z) and ionization parameter (q) of H II regions and star-forming galaxies using strong nebular emission lines (SELs). We use Bayesian inference to derive the joint and marginalized posterior probability density functions for Z and q given a set of observed line fluxes and an input photoionization model. Our approach allows the use of arbitrary sets of SELs and the inclusion of flux upper limits. The method provides a self-consistent way of determining the physical conditions of ionized nebulae that is not tied to the arbitrary choice of a particular SEL diagnostic and uses all the available information. Unlike theoretically calibrated SEL diagnostics, the method is flexible and not tied to a particular photoionization model. We describe our algorithm, validate it against other methods, and present a tool that implements it called IZI. Using a sample of nearby extragalactic H II regions, we assess the performance of commonly used SEL abundance diagnostics. We also use a sample of 22 local H II regions having both direct and recombination line (RL) oxygen abundance measurements in the literature to study discrepancies in the abundance scale between different methods. We find that oxygen abundances derived through Bayesian inference using currently available photoionization models in the literature can be in good (∼30%) agreement with RL abundances, although some models perform significantly better than others. We also confirm that abundances measured using the direct method are typically ∼0.2 dex lower than both RL and photoionization-model-based abundances
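    The generic machinery described above, a gridded posterior over (Z, q) given observed line fluxes, can be sketched with a toy stand-in for the photoionization-model grid; the predicted line-ratio surfaces, measurements and errors below are invented for illustration and bear no relation to the model grids used by IZI.

      import numpy as np

      # Hypothetical stand-in for a photoionization-model grid: predicted line ratios
      # as smooth functions of metallicity Z (12+log(O/H)) and ionization parameter log q.
      Z_grid = np.linspace(7.5, 9.3, 90)
      q_grid = np.linspace(6.5, 8.5, 80)
      ZZ, QQ = np.meshgrid(Z_grid, q_grid, indexing="ij")

      def model_fluxes(Z, q):
          """Toy predictions for two strong-line ratios (purely illustrative shapes)."""
          return np.stack([0.5 * (Z - 8.0) + 0.1 * (q - 7.5),
                           -0.8 * (Z - 8.5) + 0.4 * (q - 7.5)])

      observed = np.array([0.15, 0.30])        # hypothetical measured log line ratios
      errors = np.array([0.05, 0.08])

      # Gaussian log-likelihood on the grid, flat prior, normalized joint posterior.
      pred = model_fluxes(ZZ, QQ)
      loglike = -0.5 * np.sum(((observed[:, None, None] - pred) / errors[:, None, None]) ** 2, axis=0)
      post = np.exp(loglike - loglike.max())
      post /= post.sum()

      # Marginalized posteriors and their means.
      print("Z estimate:", np.sum(Z_grid * post.sum(axis=1)))
      print("log q estimate:", np.sum(q_grid * post.sum(axis=0)))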

  15. Portfolio optimization problem with nonidentical variances of asset returns using statistical mechanical informatics

    Science.gov (United States)

    Shinzato, Takashi

    2016-12-01

    The portfolio optimization problem in which the variances of the return rates of assets are not identical is analyzed in this paper using the methodology of statistical mechanical informatics, specifically, replica analysis. We defined two characteristic quantities of an optimal portfolio, namely, minimal investment risk and investment concentration, in order to solve the portfolio optimization problem and analytically determined their asymptotical behaviors using replica analysis. Numerical experiments were also performed, and a comparison between the results of our simulation and those obtained via replica analysis validated our proposed method.

  16. Perceptual inference.

    Science.gov (United States)

    Aggelopoulos, Nikolaos C

    2015-08-01

    Perceptual inference refers to the ability to infer sensory stimuli from predictions that result from internal neural representations built through prior experience. Methods of Bayesian statistical inference and decision theory model cognition adequately by using error sensing either in guiding action or in "generative" models that predict the sensory information. In this framework, perception can be seen as a process qualitatively distinct from sensation, a process of information evaluation using previously acquired and stored representations (memories) that is guided by sensory feedback. The stored representations can be utilised as internal models of sensory stimuli enabling long term associations, for example in operant conditioning. Evidence for perceptual inference is contributed by such phenomena as the cortical co-localisation of object perception with object memory, the response invariance in the responses of some neurons to variations in the stimulus, as well as from situations in which perception can be dissociated from sensation. In the context of perceptual inference, sensory areas of the cerebral cortex that have been facilitated by a priming signal may be regarded as comparators in a closed feedback loop, similar to the better known motor reflexes in the sensorimotor system. The adult cerebral cortex can be regarded as similar to a servomechanism, in using sensory feedback to correct internal models, producing predictions of the outside world on the basis of past experience. Copyright © 2015 Elsevier Ltd. All rights reserved.

  17. Simultaneous learning and filtering without delusions: a Bayes-optimal combination of Predictive Inference and Adaptive Filtering.

    Science.gov (United States)

    Kneissler, Jan; Drugowitsch, Jan; Friston, Karl; Butz, Martin V

    2015-01-01

    Predictive coding appears to be one of the fundamental working principles of brain processing. Amongst other aspects, brains often predict the sensory consequences of their own actions. Predictive coding resembles Kalman filtering, where incoming sensory information is filtered to produce prediction errors for subsequent adaptation and learning. However, to generate prediction errors given motor commands, a suitable temporal forward model is required to generate predictions. While in engineering applications, it is usually assumed that this forward model is known, the brain has to learn it. When filtering sensory input and learning from the residual signal in parallel, a fundamental problem arises: the system can enter a delusional loop when filtering the sensory information using an overly trusted forward model. In this case, learning stalls before accurate convergence because uncertainty about the forward model is not properly accommodated. We present a Bayes-optimal solution to this generic and pernicious problem for the case of linear forward models, which we call Predictive Inference and Adaptive Filtering (PIAF). PIAF filters incoming sensory information and learns the forward model simultaneously. We show that PIAF is formally related to Kalman filtering and to the Recursive Least Squares linear approximation method, but combines these procedures in a Bayes optimal fashion. Numerical evaluations confirm that the delusional loop is precluded and that the learning of the forward model is more than 10-times faster when compared to a naive combination of Kalman filtering and Recursive Least Squares.

  19. The influence of design characteristics on statistical inference in nonlinear estimation: A simulation study based on survival data and hazard modeling

    DEFF Research Database (Denmark)

    Andersen, J.S.; Bedaux, J.J.M.; Kooijman, S.A.L.M.

    2000-01-01

    This paper describes the influence of design characteristics on the statistical inference for an ecotoxicological hazard-based model using simulated survival data. The design characteristics of interest are the number and spacing of observations (counts) in time, the number and spacing of exposure...... concentrations (within c(min) and c(max)), and the initial number of individuals at time 0 in each concentration. A comparison of the coverage probabilities for confidence limits arising from the profile-likelihood approach and the Wald-based approach is carried out. The Wald-based approach is very sensitive...

  20. Statistical optimization of thermo-alkali stable xylanase production from Bacillus tequilensis strain ARMATI

    Directory of Open Access Journals (Sweden)

    Ameer Khusro

    2016-07-01

    Conclusions: The cellulase-free xylanase showed alkali-tolerant and thermo-stable properties that make it potentially applicable at industrial scale. This statistical approach made a major contribution to enzyme production from the isolate by optimizing the independent factors, and it represents a first report of the enhanced production of thermo-alkali stable cellulase-free xylanase from B. tequilensis.

  1. Damping layout optimization for ship's cabin noise reduction based on statistical energy analysis

    Directory of Open Access Journals (Sweden)

    WU Weiguo

    2017-08-01

    Full Text Available An optimization analysis study concerning the damping control of ship's cabin noise was carried out in order to improve the noise reduction effect and reduce the weight of damping. Based on the Statistical Energy Analysis (SEA) method, a theoretical deduction and numerical analysis of the first-order sensitivity of the A-weighted sound pressure level with respect to the damping loss factor of each subsystem were carried out. On this basis, a mathematical optimization model was proposed and an optimization program developed. Next, the secondary development of the VA One software was implemented through the use of MATLAB, and the cabin noise damping control layout optimization system was established. Finally, the optimization model of the ship was constructed and numerical experiments of damping control optimization were conducted. The damping installation region was divided into five parts with different damping thicknesses. The total weight of damping was set as the objective function and the A-weighted sound pressure level of the target cabin was set as a constraint condition. The best damping thicknesses were obtained through the optimization program, and the total damping weight was reduced by 60.4%. The results show that the damping noise reduction effect per unit weight is significantly improved through the optimization method. This research successfully solves the installation position and thickness selection problems in the acoustic design of damping control, providing a reliable analysis method and guidance for the design.

  2. Daily Average Wind Power Interval Forecasts Based on an Optimal Adaptive-Network-Based Fuzzy Inference System and Singular Spectrum Analysis

    Directory of Open Access Journals (Sweden)

    Zhongrong Zhang

    2016-01-01

    Full Text Available Wind energy has increasingly played a vital role in mitigating conventional resource shortages. Nevertheless, the stochastic nature of wind poses a great challenge when attempting to find an accurate forecasting model for wind power. Therefore, precise wind power forecasts are of primary importance for solving operational, planning and economic problems in the growing wind power scenario. Previous research has focused on deterministic forecasts of wind power values, but less attention has been paid to providing information about the uncertainty of wind energy. Based on an optimal Adaptive-Network-Based Fuzzy Inference System (ANFIS) and Singular Spectrum Analysis (SSA), this paper develops a hybrid uncertainty forecasting model, IFASF (Interval Forecast-ANFIS-SSA-Firefly Algorithm), to obtain the upper and lower bounds of daily average wind power, which is beneficial for the practical operation of both the grid company and independent power producers. To strengthen the practical ability of this developed model, this paper presents a comparison between IFASF and other benchmarks, which provides a general reference for statistical or artificially intelligent interval forecast methods. The comparison results show that the developed model outperforms eight benchmarks and has satisfactory forecasting effectiveness in three different wind farms with two time horizons.

  3. Statistics

    CERN Document Server

    Hayslett, H T

    1991-01-01

    Statistics covers the basic principles of Statistics. The book starts by tackling the importance and the two kinds of statistics; the presentation of sample data; the definition, illustration and explanation of several measures of location; and the measures of variation. The text then discusses elementary probability, the normal distribution and the normal approximation to the binomial. Testing of statistical hypotheses and tests of hypotheses about the theoretical proportion of successes in a binomial population and about the theoretical mean of a normal population are explained. The text the

  4. Statistical analysis and decoding of neural activity in the rodent geniculate ganglion using a metric-based inference system.

    Directory of Open Access Journals (Sweden)

    Wei Wu

    Full Text Available We analyzed the spike discharge patterns of two types of neurons in the rodent peripheral gustatory system, Na specialists (NS) and acid generalists (AG), in response to lingual stimulation with NaCl, acetic acid, and mixtures of the two stimuli. Previous computational investigations found that both spike rate and spike timing contribute to taste quality coding. These studies used commonly accepted computational methods, but they do not provide a consistent statistical evaluation of spike trains. In this paper, we adopted a new computational framework that treated each spike train as an individual data point for computing summary statistics such as mean and variance in the spike train space. We found that these statistical summaries properly characterized the firing patterns (e.g., template and variability) and quantified the differences between NS and AG neurons. The same framework was also used to assess the discrimination performance of NS and AG neurons and to remove spontaneous background activity or "noise" from the spike train responses. The results indicated that the new metric system provided the desired decoding performance and that noise removal improved stimulus classification accuracy, especially for neurons with high spontaneous rates. In summary, this new method naturally conducts statistical analysis and neural decoding under one consistent framework, and the results demonstrated that individual peripheral gustatory neurons generate a unique and reliable firing pattern during sensory stimulation and that this pattern can be reliably decoded.
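    One widely used spike-train metric of the kind that underlies such metric-space analyses is the Victor-Purpura distance; the sketch below implements it directly. It is a generic example, not the specific metric framework of the paper, and the spike times from the two trials are hypothetical.

      import numpy as np

      def victor_purpura(train_a, train_b, q=1.0):
          """Victor-Purpura distance: cost 1 to add/delete a spike, q*|dt| to move one."""
          na, nb = len(train_a), len(train_b)
          d = np.zeros((na + 1, nb + 1))
          d[:, 0] = np.arange(na + 1)          # delete all spikes of train_a
          d[0, :] = np.arange(nb + 1)          # insert all spikes of train_b
          for i in range(1, na + 1):
              for j in range(1, nb + 1):
                  shift = q * abs(train_a[i - 1] - train_b[j - 1])
                  d[i, j] = min(d[i - 1, j] + 1, d[i, j - 1] + 1, d[i - 1, j - 1] + shift)
          return d[na, nb]

      # Hypothetical spike times (s) recorded on two stimulus trials.
      trial_1 = [0.10, 0.35, 0.60, 0.82]
      trial_2 = [0.12, 0.38, 0.85]
      print(victor_purpura(trial_1, trial_2, q=10.0))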

  5. The Bayesian statistical decision theory applied to the optimization of generating set maintenance

    International Nuclear Information System (INIS)

    Procaccia, H.; Cordier, R.; Muller, S.

    1994-11-01

    The difficulty in RCM methodology is the allocation of a new preventive maintenance periodicity for a piece of equipment when a critical failure has been identified: until now this allocation has been based on engineering judgment, and a full cycle of operating experience is needed before it can be validated. Statistical decision theory could be a more rational alternative for the optimization of preventive maintenance periodicity. This methodology has been applied to inspection and maintenance optimization of cylinders of diesel generator engines of 900 MW nuclear plants, and has shown that the previous preventive maintenance periodicity can be extended. (authors). 8 refs., 5 figs
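
    The report does not state its cost model; as a rough sketch of how a preventive maintenance periodicity can be optimized numerically, the snippet below minimizes the classical age-replacement expected cost rate under an assumed Weibull lifetime. All parameter values are illustrative, not taken from the study; a Bayesian version would place a prior on the Weibull parameters and average this cost rate over the posterior.

    ```python
    import numpy as np

    def expected_cost_rate(T, beta, eta, c_prev, c_fail, n=2000):
        """Age-replacement policy: expected cost per unit time when a component is
        preventively replaced at age T or correctively on failure, whichever comes
        first. Lifetime ~ Weibull(shape=beta, scale=eta)."""
        t = np.linspace(0.0, T, n)
        R = np.exp(-(t / eta) ** beta)                              # survival function
        p_fail = 1.0 - R[-1]                                        # P(failure before T)
        mean_cycle = np.sum(0.5 * (R[1:] + R[:-1]) * np.diff(t))    # E[cycle length]
        return (c_prev * (1.0 - p_fail) + c_fail * p_fail) / mean_cycle

    # Illustrative parameters only: wear-out failures, corrective action ten times
    # as costly as a planned intervention.
    Ts = np.linspace(200.0, 8000.0, 400)
    costs = [expected_cost_rate(T, beta=2.5, eta=5000.0, c_prev=1.0, c_fail=10.0) for T in Ts]
    T_opt = Ts[int(np.argmin(costs))]
    print(f"cost-optimal preventive interval ~ {T_opt:.0f} hours")
    ```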

  6. An Integrated Simulation, Inference and Optimization Approach for Groundwater Remediation with Two-stage Health-Risk Assessment

    Directory of Open Access Journals (Sweden)

    Aili Yang

    2018-05-01

    Full Text Available In this study, an integrated simulation, inference and optimization approach with two-stage health risk assessment (i.e., ISIO-THRA) is developed for supporting groundwater remediation for a petroleum-contaminated site in western Canada. Both environmental standards and health risk are considered as the constraints in the ISIO-THRA model. The health risk includes two parts: (1) the health risk during the remediation process and (2) the health risk in the natural attenuation period after remediation. In the ISIO-THRA framework, the relationship between contaminant concentrations and time is expressed through first-order decay models. The results demonstrate that: (1) stricter environmental standards and health risk would require larger pumping rates for the same remediation duration; (2) higher health risk may happen in the period of the remediation process; (3) for the same environmental standard and acceptable health-risk level, the remediation techniques that take the shortest time would be chosen. ISIO-THRA can help to systematically analyze interaction among contaminant transport, remediation duration, and environmental and health concerns, and further provide useful supportive information for decision makers.
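
    The record only states that contaminant concentration versus time follows first-order decay; in its generic form (the site-specific rate constants are not given) this is:

    ```latex
    % Generic first-order decay of contaminant concentration with time;
    % C_0 and the rate constant k are site- and technique-specific.
    C(t) = C_0 \, e^{-k t}, \qquad t_{1/2} = \frac{\ln 2}{k}
    ```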

  7. Parameters optimization defined by statistical analysis for cysteine-dextran radiolabeling with technetium tricarbonyl core.

    Science.gov (United States)

    Núñez, Eutimio Gustavo Fernández; Faintuch, Bluma Linkowski; Teodoro, Rodrigo; Wiecek, Danielle Pereira; da Silva, Natanael Gomes; Papadopoulos, Minas; Pelecanou, Maria; Pirmettis, Ioannis; de Oliveira Filho, Renato Santos; Duatti, Adriano; Pasqualini, Roberto

    2011-04-01

    The objective of this study was the development of a statistical approach for radiolabeling optimization of cysteine-dextran conjugates with Tc-99m tricarbonyl core. This strategy has been applied to the labeling of 2-propylene-S-cysteine-dextran in the attempt to prepare a new class of tracers for sentinel lymph node detection, and can be extended to other radiopharmaceuticals for different targets. The statistical routine was based on three-level factorial design. Best labeling conditions were achieved. The specific activity reached was 5 MBq/μg. Crown Copyright © 2011. Published by Elsevier Ltd. All rights reserved.

  8. Parameters optimization defined by statistical analysis for cysteine-dextran radiolabeling with technetium tricarbonyl core

    Energy Technology Data Exchange (ETDEWEB)

    Fernandez Nunez, Eutimio Gustavo, E-mail: eutimiocu@yahoo.co [Radiopharmacy Center, Institute of Energetic and Nuclear Research, Sao Paulo, SP 05508-000 (Brazil); Linkowski Faintuch, Bluma; Teodoro, Rodrigo; Pereira Wiecek, Danielle; Gomes da Silva, Natanael [Radiopharmacy Center, Institute of Energetic and Nuclear Research, Sao Paulo, SP 05508-000 (Brazil); Papadopoulos, Minas [Institute of Radioisotopes, Radiodiagnostic Products, National Center for Scientific Research ' Demokritos' , Athens (Greece); Pelecanou, Maria [Institute of Biology, National Center for Scientific Research ' Demokritos' , Athens (Greece); Pirmettis, Ioannis [Institute of Radioisotopes, Radiodiagnostic Products, National Center for Scientific Research ' Demokritos' , Athens (Greece); Santos Oliveira Filho, Renato de [Faculty of Medicine, Federal University of Sao Paulo, SP (Brazil); Duatti, Adriano [Department of Radiological Sciences, University of Ferrara, Ferrara (Italy); Pasqualini, Roberto [CIS Bio International, Gif sur Yvette (France)

    2011-04-15

    The objective of this study was the development of a statistical approach for radiolabeling optimization of cysteine-dextran conjugates with Tc-99m tricarbonyl core. This strategy has been applied to the labeling of 2-propylene-S-cysteine-dextran in the attempt to prepare a new class of tracers for sentinel lymph node detection, and can be extended to other radiopharmaceuticals for different targets. The statistical routine was based on three-level factorial design. Best labeling conditions were achieved. The specific activity reached was 5 MBq/{mu}g.

  9. Bayesian versus frequentist statistical inference for investigating a one-off cancer cluster reported to a health department

    Directory of Open Access Journals (Sweden)

    Wills Rachael A

    2009-05-01

    Full Text Available Abstract Background The problem of silent multiple comparisons is one of the most difficult statistical problems faced by scientists. It is a particular problem for investigating a one-off cancer cluster reported to a health department because any one of hundreds, or possibly thousands, of neighbourhoods, schools, or workplaces could have reported a cluster, which could have been for any one of several types of cancer or any one of several time periods. Methods This paper contrasts the frequentist approach with a Bayesian approach for dealing with silent multiple comparisons in the context of a one-off cluster reported to a health department. Two published cluster investigations were re-analysed using the Dunn-Sidak method to adjust frequentist p-values and confidence intervals for silent multiple comparisons. Bayesian methods were based on the Gamma distribution. Results Bayesian analysis with non-informative priors produced results similar to the frequentist analysis, and suggested that both clusters represented a statistical excess. In the frequentist framework, the statistical significance of both clusters was extremely sensitive to the number of silent multiple comparisons, which can only ever be a subjective "guesstimate". The Bayesian approach is also subjective: whether there is an apparent statistical excess depends on the specified prior. Conclusion In cluster investigations, the frequentist approach is just as subjective as the Bayesian approach, but the Bayesian approach is less ambitious in that it treats the analysis as a synthesis of data and personal judgements (possibly poor ones), rather than objective reality. Bayesian analysis is (arguably) a useful tool to support complicated decision-making, because it makes the uncertainty associated with silent multiple comparisons explicit.
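
    As a hedged sketch of the two approaches contrasted above (not the published analyses themselves), the snippet below applies a Dunn-Sidak-style adjustment to an exact Poisson p-value for an assumed number of silent comparisons, and computes the conjugate Gamma-Poisson posterior for the relative risk; the counts, the number of comparisons and the prior are all illustrative.

    ```python
    from scipy import stats

    observed, expected = 12, 5.2          # illustrative cluster counts, not the papers' data

    # Frequentist: exact Poisson test, then a Dunn-Sidak adjustment for m silent
    # comparisons. m can only ever be a "guesstimate", which is the sensitivity
    # discussed in the abstract.
    p_raw = stats.poisson.sf(observed - 1, expected)      # P(X >= observed | expected)
    for m in (1, 100, 1000, 10000):
        p_adj = 1.0 - (1.0 - p_raw) ** m
        print(f"m = {m:6d}  adjusted p = {p_adj:.4f}")

    # Bayesian: Gamma(a, b) prior on the relative risk theta, Poisson likelihood
    # -> Gamma(a + observed, b + expected) posterior by conjugacy.
    a, b = 1.0, 1.0                                        # weakly informative prior
    posterior = stats.gamma(a + observed, scale=1.0 / (b + expected))
    print("posterior P(theta > 1) =", posterior.sf(1.0))
    ```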

  10. Insight From the Statistics of Nothing: Estimating Limits of Change Detection Using Inferred No-Change Areas in DEM Difference Maps and Application to Landslide Hazard Studies

    Science.gov (United States)

    Haneberg, W. C.

    2017-12-01

    Remote characterization of new landslides or areas of ongoing movement using differences in high resolution digital elevation models (DEMs) created through time, for example before and after major rains or earthquakes, is an attractive proposition. In the case of large catastrophic landslides, changes may be apparent enough that simple subtraction suffices. In other cases, statistical noise can obscure landslide signatures and place practical limits on detection. In ideal cases on land, GPS surveys of representative areas at the time of DEM creation can quantify the inherent errors. In less-than-ideal terrestrial cases and virtually all submarine cases, it may be impractical or impossible to independently estimate the DEM errors. Examining DEM difference statistics for areas reasonably inferred to have no change, however, can provide insight into the limits of detectability. Data from inferred no-change areas of airborne LiDAR DEM difference maps of the 2014 Oso, Washington landslide and landslide-prone colluvium slopes along the Ohio River valley in northern Kentucky, show that DEM difference maps can have non-zero mean and slope dependent error components consistent with published studies of DEM errors. Statistical thresholds derived from DEM difference error and slope data can help to distinguish between DEM differences that are likely real—and which may indicate landsliding—from those that are likely spurious or irrelevant. This presentation describes and compares two different approaches, one based upon a heuristic assumption about the proportion of the study area likely covered by new landslides and another based upon the amount of change necessary to ensure difference at a specified level of probability.
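
    The abstract does not give the thresholding rule in closed form; a minimal sketch of the second approach described (the change needed to declare a difference at a specified probability, using errors estimated from inferred no-change areas) might look like the following, assuming approximately normal errors. A slope-dependent term could be added by fitting the standard deviation against slope in bins; the sample values below are invented.

    ```python
    import numpy as np
    from scipy import stats

    def min_detectable_change(dz_nochange, prob=0.95):
        """Level-of-detection threshold from DEM differences over inferred no-change areas.
        Returns the systematic offset (mean) and the two-sided threshold on |dz - mean|."""
        dz = np.asarray(dz_nochange, dtype=float)
        bias = dz.mean()                        # non-zero mean = systematic DEM offset
        sigma = dz.std(ddof=1)                  # random error of the difference surface
        z = stats.norm.ppf(0.5 + prob / 2.0)    # e.g. 1.96 for 95 %
        return bias, z * sigma

    # Illustrative no-change sample (metres); real use would mask out mapped landslides.
    rng = np.random.default_rng(0)
    bias, lod = min_detectable_change(rng.normal(0.05, 0.12, size=5000))
    ```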

  11. Phase Transitions in Combinatorial Optimization Problems: Basics, Algorithms and Statistical Mechanics

    Science.gov (United States)

    Hartmann, Alexander K.; Weigt, Martin

    2005-10-01

    A concise, comprehensive introduction to the topic of statistical physics of combinatorial optimization, bringing together theoretical concepts and algorithms from computer science with analytical methods from physics. The result bridges the gap between statistical physics and combinatorial optimization, investigating problems taken from theoretical computing, such as the vertex-cover problem, with the concepts and methods of theoretical physics. The authors cover rapid developments and analytical methods that are both extremely complex and spread by word-of-mouth, providing all the necessary basics in required detail. Throughout, the algorithms are shown with examples and calculations, while the proofs are given in a way suitable for graduate students, post-docs, and researchers. Ideal for newcomers to this young, multidisciplinary field.

  12. Statistics

    Science.gov (United States)

    Links to sources of cancer-related statistics, including the Surveillance, Epidemiology and End Results (SEER) Program, SEER-Medicare datasets, cancer survivor prevalence data, and the Cancer Trends Progress Report.

  13. On Difference of Convex Optimization to Visualize Statistical Data and Dissimilarities

    DEFF Research Database (Denmark)

    Carrizosa, Emilio; Guerrero, Vanesa; Morales, Dolores Romero

    2016-01-01

    In this talk we address the problem of visualizing, in a bounded region, a set of individuals to which a dissimilarity measure and a statistical value are attached. This problem, which extends standard Multidimensional Scaling Analysis, is written as a global optimization problem whose objective is the difference of two convex functions (DC). Suitable DC decompositions allow us to use the DCA algorithm in a very efficient way. Our algorithmic approach is used to visualize two real-world datasets.
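
    To make the DC/DCA idea concrete (the talk's MDS-type objective is not reproduced here), here is a hedged toy example: minimizing f(x) = x^2 - |x|, a difference of the convex functions g(x) = x^2 and h(x) = |x|. DCA linearizes h at the current iterate and solves the remaining convex subproblem exactly.

    ```python
    import numpy as np

    def dca(x0, iters=25):
        """DCA for the toy DC program f(x) = g(x) - h(x), with g(x) = x**2, h(x) = |x|."""
        x = x0
        for _ in range(iters):
            y = np.sign(x)          # a subgradient of h at x (0 if x == 0)
            x = y / 2.0             # exact minimizer of the convex model x**2 - y*x
        return x

    print(dca(0.3))   # -> 0.5, a minimizer of x**2 - |x| (f = -0.25); dca(-0.3) -> -0.5
    ```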

  14. Optimization Model for Uncertain Statistics Based on an Analytic Hierarchy Process

    Directory of Open Access Journals (Sweden)

    Yongchao Hou

    2014-01-01

    Full Text Available Uncertain statistics is a methodology for collecting and interpreting an expert's experimental data using uncertainty theory. In order to estimate uncertainty distributions, an optimization model based on the analytic hierarchy process (AHP) and an interpolation method is proposed in this paper. In addition, the least squares principle is presented for estimating uncertainty distributions with a known functional form. Finally, the effectiveness of the method is illustrated by an example.

  15. Statistical Optimization of Tannase Production by Penicillium sp. EZ-ZH390 in Submerged Fermentation

    OpenAIRE

    Zohreh Hamidi-Esfahani; Mohammad Ali Sahari; Mohammad Hossein Azizi

    2015-01-01

    Tannase has several important applications in food, feed, chemical and pharmaceutical industries. In the present study, production of tannase by mutant strain, Penicillium sp. EZ-ZH390, was optimized in submerged fermentation utilizing two statistical approaches. At first step, a one factor at a time design was employed to screen the preferable nutriments (carbon and nitrogen sources of the medium) to produce tannase. Screening of the carbon source resulted in the production of 10.74 U/mL of ...

  16. Enhanced Bio-Ethanol Production from Industrial Potato Waste by Statistical Medium Optimization

    OpenAIRE

    Izmirlioglu, Gulten; Demirci, Ali

    2015-01-01

    Industrial wastes are of great interest as a substrate in production of value-added products to reduce cost, while managing the waste economically and environmentally. Bio-ethanol production from industrial wastes has gained attention because of its abundance, availability, and rich carbon and nitrogen content. In this study, industrial potato waste was used as a carbon source and a medium was optimized for ethanol production by using statistical designs. The effect of various medium componen...

  17. An Optimization Principle for Deriving Nonequilibrium Statistical Models of Hamiltonian Dynamics

    Science.gov (United States)

    Turkington, Bruce

    2013-08-01

    A general method for deriving closed reduced models of Hamiltonian dynamical systems is developed using techniques from optimization and statistical estimation. Given a vector of resolved variables, selected to describe the macroscopic state of the system, a family of quasi-equilibrium probability densities on phase space corresponding to the resolved variables is employed as a statistical model, and the evolution of the mean resolved vector is estimated by optimizing over paths of these densities. Specifically, a cost function is constructed to quantify the lack-of-fit to the microscopic dynamics of any feasible path of densities from the statistical model; it is an ensemble-averaged, weighted, squared-norm of the residual that results from submitting the path of densities to the Liouville equation. The path that minimizes the time integral of the cost function determines the best-fit evolution of the mean resolved vector. The closed reduced equations satisfied by the optimal path are derived by Hamilton-Jacobi theory. When expressed in terms of the macroscopic variables, these equations have the generic structure of governing equations for nonequilibrium thermodynamics. In particular, the value function for the optimization principle coincides with the dissipation potential that defines the relation between thermodynamic forces and fluxes. The adjustable closure parameters in the best-fit reduced equations depend explicitly on the arbitrary weights that enter into the lack-of-fit cost function. Two particular model reductions are outlined to illustrate the general method. In each example the set of weights in the optimization principle contracts into a single effective closure parameter.
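
    Schematically, and only as a hedged sketch (bracket and weighting conventions vary, and the precise functional is defined in the paper, not here), the lack-of-fit cost over a path of quasi-equilibrium densities rho_t is a weighted, ensemble-averaged squared norm of the Liouville residual:

    ```latex
    % Schematic form only; W is the weighting operator, {.,.} the Poisson bracket,
    % and <.,.>_{rho_t} an ensemble average with respect to rho_t.
    \mathcal{C}[\rho] = \int_0^T \tfrac{1}{2}\,
    \big\langle\, W R_t ,\; W R_t \,\big\rangle_{\rho_t}\, \mathrm{d}t ,
    \qquad
    R_t = \partial_t \rho_t + \{\rho_t, H\}
    ```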

  18. Sb2Te3 and Its Superlattices: Optimization by Statistical Design.

    Science.gov (United States)

    Behera, Jitendra K; Zhou, Xilin; Ranjan, Alok; Simpson, Robert E

    2018-05-02

    The objective of this work is to demonstrate the usefulness of fractional factorial design for optimizing the crystal quality of chalcogenide van der Waals (vdW) crystals. We statistically analyze the growth parameters of highly c-axis-oriented Sb2Te3 crystals and Sb2Te3-GeTe phase-change vdW heterostructured superlattices. The statistical significance of the growth parameters of temperature, pressure, power, buffer materials, and buffer layer thickness was found by fractional factorial design and response surface analysis. Temperature, pressure, power, and their second-order interactions are the major factors that significantly influence the quality of the crystals. Additionally, using tungsten rather than molybdenum as a buffer layer significantly enhances the crystal quality. Fractional factorial design minimizes the number of experiments that are necessary to find the optimal growth conditions, resulting in an order of magnitude improvement in the crystal quality. We highlight that statistical design of experiment methods, which are more commonly used in product design, should be considered more broadly by those designing and optimizing materials.

  19. Optimal choice of word length when comparing two Markov sequences using a χ²-statistic.

    Science.gov (United States)

    Bai, Xin; Tang, Kujin; Ren, Jie; Waterman, Michael; Sun, Fengzhu

    2017-10-03

    Alignment-free sequence comparison using counts of word patterns (grams, k-tuples) has become an active research topic due to the large amount of sequence data from the new sequencing technologies. Genome sequences are frequently modelled by Markov chains and the likelihood ratio test or the corresponding approximate χ²-statistic has been suggested to compare two sequences. However, it is not known how to best choose the word length k in such studies. We develop an optimal strategy to choose k by maximizing the statistical power of detecting differences between two sequences. Let the orders of the Markov chains for the two sequences be r1 and r2, respectively. We show through both simulations and theoretical studies that the optimal k = max(r1, r2) + 1 for both long sequences and next generation sequencing (NGS) read data. The orders of the Markov chains may be unknown and several methods have been developed to estimate the orders of Markov chains based on both long sequences and NGS reads. We study the power loss of the statistics when the estimated orders are used. It is shown that the power loss is minimal for some of the estimators of the orders of Markov chains. Our studies provide guidelines on choosing the optimal word length for the comparison of Markov sequences.
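
    The paper's statistic comes from a Markov-chain likelihood ratio; the hedged sketch below only illustrates the mechanics of counting word patterns and comparing two count profiles with a generic chi-square contingency test, together with the paper's rule of thumb k = max(r1, r2) + 1 (the example sequences are invented).

    ```python
    from collections import Counter
    from scipy.stats import chi2_contingency

    def kmer_counts(seq, k):
        return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

    def chi2_word_compare(seq1, seq2, k):
        """Generic two-sample chi-square on k-mer count profiles (illustrative;
        not the Markov-model likelihood-ratio statistic derived in the paper)."""
        c1, c2 = kmer_counts(seq1, k), kmer_counts(seq2, k)
        words = sorted(set(c1) | set(c2))
        table = [[c1.get(w, 0) for w in words],
                 [c2.get(w, 0) for w in words]]
        chi2, p, dof, _ = chi2_contingency(table)
        return chi2, p

    # Rule of thumb from the paper: k = max(r1, r2) + 1, with r1, r2 the Markov orders.
    r1, r2 = 1, 2
    k = max(r1, r2) + 1
    print(chi2_word_compare("ACGTACGTTGCA" * 50, "ACGGTCAGTCAA" * 50, k))
    ```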

  20. Statistical optimization of lovastatin production by Omphalotus olearius (DC.) singer in submerged fermentation.

    Science.gov (United States)

    Atlı, Burcu; Yamaç, Mustafa; Yıldız, Zeki; Isikhuemhen, Omoanghe S

    2016-01-01

    In this study, culture conditions were optimized to improve lovastatin production by Omphalotus olearius, isolate OBCC 2002, using statistical experimental designs. The Plackett-Burman design was used to select important variables affecting lovastatin production. Accordingly, glucose, peptone, and agitation speed were determined as the variables that have influence on lovastatin production. In a further experiment, these variables were optimized with a Box-Behnken design and applied in a submerged process; this resulted in 12.51 mg/L lovastatin production on a medium containing glucose (10 g/L), peptone (5 g/L), thiamine (1 mg/L), and NaCl (0.4 g/L) under static conditions. This level of lovastatin production is eight times higher than that produced under unoptimized media and growth conditions by Omphalotus olearius. To the best of our knowledge, this is the first attempt to optimize submerged fermentation process for lovastatin production by Omphalotus olearius.
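
    Neither design matrix is given in the record; as a hedged illustration of the screening step, the standard 12-run Plackett-Burman design for up to 11 two-level factors (cyclic shifts of a generator row plus an all-minus run) can be constructed as follows. The assignment of columns to factors such as glucose, peptone and agitation speed is up to the experimenter.

    ```python
    import numpy as np

    # Standard 12-run Plackett-Burman generator row (coded levels +1 / -1)
    generator = np.array([+1, +1, -1, +1, +1, +1, -1, -1, -1, +1, -1])

    rows = [np.roll(generator, i) for i in range(11)]        # 11 cyclic shifts
    design = np.vstack(rows + [-np.ones(11, dtype=int)])     # plus the all -1 run

    assert design.shape == (12, 11)
    # Orthogonality check: every pair of columns is balanced (zero dot product)
    assert np.all(design.T @ design == 12 * np.eye(11))
    ```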

  1. Application of Statistical Analysis for the Optimization of Mycelia and Polysaccharide Production by Tremella aurantialba

    Directory of Open Access Journals (Sweden)

    Zhicai Zhang

    2007-01-01

    Full Text Available Statistical analyses were applied to optimize the medium composition for mycelial growth and polysaccharide production by Tremella aurantialba in shake flask cultures. Firstly, four significant factors (xylan, peptone, wheat bran and corn powder) affecting mycelial growth and polysaccharide yield (p≤0.05) were identified using a one-factor-at-a-time design. Subsequently, in order to study the mutual interactions between variables, the effects of these factors were further investigated using a four-factor, three-level orthogonal test design, and the optimal composition was (in g/L): xylan 40, peptone 10, wheat bran 20, corn powder 20, KH2PO4 1.2 and MgSO4·7H2O 0.6. Finally, the maximum mycelium yield and polysaccharide production in a 50-litre stirred-tank bioreactor reached 36.8 and 3.01 g/L, respectively, under the optimized medium.

  2. Statistical media and process optimization for biotransformation of rice bran to vanillin using Pediococcus acidilactici.

    Science.gov (United States)

    Kaur, Baljinder; Chakraborty, Debkumar

    2013-11-01

    An isolate of P. acidilactici capable of producing vanillin from rice bran was obtained from a milk product. Response Surface Methodology (RSM) was employed for statistical media and process optimization for the production of biovanillin. Statistical medium optimization was done in two steps involving a Plackett-Burman design and a central composite design. The RSM-optimized vanillin production medium consisted of 15% (w/v) rice bran, 0.5% (w/v) peptone, 0.1% (w/v) ammonium nitrate, 0.005% (w/v) ferulic acid, 0.005% (w/v) magnesium sulphate, and 0.1% (v/v) tween-80, at pH 5.6 and a temperature of 37 degrees C under shaking at 180 rpm. A vanillin yield of 1.269 g/L was obtained within 24 h of incubation in the optimized culture medium. This is the first report of such a high vanillin yield obtained during biotransformation of ferulic acid to vanillin using a Pediococcal isolate.

  3. Optimizing the maximum reported cluster size in the spatial scan statistic for ordinal data.

    Science.gov (United States)

    Kim, Sehwi; Jung, Inkyung

    2017-01-01

    The spatial scan statistic is an important tool for spatial cluster detection. There have been numerous studies on scanning window shapes. However, little research has been done on the maximum scanning window size or maximum reported cluster size. Recently, Han et al. proposed to use the Gini coefficient to optimize the maximum reported cluster size. However, the method has been developed and evaluated only for the Poisson model. We adopt the Gini coefficient to be applicable to the spatial scan statistic for ordinal data to determine the optimal maximum reported cluster size. Through a simulation study and application to a real data example, we evaluate the performance of the proposed approach. With some sophisticated modification, the Gini coefficient can be effectively employed for the ordinal model. The Gini coefficient most often picked the optimal maximum reported cluster sizes that were the same as or smaller than the true cluster sizes with very high accuracy. It seems that we can obtain a more refined collection of clusters by using the Gini coefficient. The Gini coefficient developed specifically for the ordinal model can be useful for optimizing the maximum reported cluster size for ordinal data and helpful for properly and informatively discovering cluster patterns.
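
    The ordinal-model details are not spelled out in the abstract; as a hedged sketch of the underlying idea, the Gini coefficient of the case counts of the reported clusters can be computed for each candidate maximum reported cluster size and the size that maximizes it selected (all numbers below are invented).

    ```python
    import numpy as np

    def gini(x):
        """Gini coefficient of a non-negative vector (e.g. cases per reported cluster)."""
        x = np.sort(np.asarray(x, dtype=float))
        n = x.size
        ranks = np.arange(1, n + 1)
        return 2.0 * np.sum(ranks * x) / (n * np.sum(x)) - (n + 1.0) / n

    # Hypothetical: cases in the clusters reported under different maximum-size settings
    reported = {0.50: [180, 60, 22],
                0.25: [95, 60, 40, 22, 15],
                0.10: [50, 45, 40, 22, 15, 12]}
    best = max(reported, key=lambda s: gini(reported[s]))
    print("chosen maximum reported cluster size (fraction of population at risk):", best)
    ```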

  4. Statistics

    International Nuclear Information System (INIS)

    2005-01-01

    For the years 2004 and 2005 the figures shown in the tables of Energy Review are partly preliminary. The annual statistics published in Energy Review are presented in more detail in a publication called Energy Statistics that comes out yearly. Energy Statistics also includes historical time-series over a longer period of time (see e.g. Energy Statistics, Statistics Finland, Helsinki 2004). The applied energy units and conversion coefficients are shown in the back cover of the Review. Explanatory notes to the statistical tables can be found after tables and figures. The figures present: Changes in GDP, energy consumption and electricity consumption, Carbon dioxide emissions from fossil fuel use, Coal consumption, Consumption of natural gas, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices in heat production, Fuel prices in electricity production, Price of electricity by type of consumer, Average monthly spot prices at the Nord Pool power exchange, Total energy consumption by source and CO2 emissions, Supplies and total consumption of electricity (GWh), Energy imports by country of origin in January-June 2003, Energy exports by recipient country in January-June 2003, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Price of natural gas by type of consumer, Price of electricity by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes, precautionary stock fees and oil pollution fees

  5. Statistics

    International Nuclear Information System (INIS)

    2001-01-01

    For the year 2000, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy Review appear in more detail in the publication Energiatilastot - Energy Statistics issued annually, which also includes historical time series over a longer period (see e.g. Energiatilastot 1999, Statistics Finland, Helsinki 2000, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after tables and figures. The figures present: Changes in the volume of GNP and energy consumption, Changes in the volume of GNP and electricity, Coal consumption, Natural gas consumption, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices for heat production, Fuel prices for electricity production, Carbon dioxide emissions from the use of fossil fuels, Total energy consumption by source and CO2 emissions, Electricity supply, Energy imports by country of origin in 2000, Energy exports by recipient country in 2000, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Average electricity price by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes and precautionary stock fees on oil products

  6. Statistics

    International Nuclear Information System (INIS)

    2000-01-01

    For the years 1999 and 2000, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy Review appear in more detail in the publication Energiatilastot - Energy Statistics issued annually, which also includes historical time series over a longer period (see e.g., Energiatilastot 1998, Statistics Finland, Helsinki 1999, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after tables and figures. The figures present: Changes in the volume of GNP and energy consumption, Changes in the volume of GNP and electricity, Coal consumption, Natural gas consumption, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices for heat production, Fuel prices for electricity production, Carbon dioxide emissions, Total energy consumption by source and CO2 emissions, Electricity supply, Energy imports by country of origin in January-March 2000, Energy exports by recipient country in January-March 2000, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Average electricity price by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes and precautionary stock fees on oil products

  7. Statistics

    International Nuclear Information System (INIS)

    1999-01-01

    For the years 1998 and 1999, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy Review appear in more detail in the publication Energiatilastot - Energy Statistics issued annually, which also includes historical time series over a longer period (see e.g. Energiatilastot 1998, Statistics Finland, Helsinki 1999, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after tables and figures. The figures present: Changes in the volume of GNP and energy consumption, Changes in the volume of GNP and electricity, Coal consumption, Natural gas consumption, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices for heat production, Fuel prices for electricity production, Carbon dioxide emissions, Total energy consumption by source and CO2 emissions, Electricity supply, Energy imports by country of origin in January-June 1999, Energy exports by recipient country in January-June 1999, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Average electricity price by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes and precautionary stock fees on oil products

  8. Improving alignment in Tract-based spatial statistics: evaluation and optimization of image registration.

    Science.gov (United States)

    de Groot, Marius; Vernooij, Meike W; Klein, Stefan; Ikram, M Arfan; Vos, Frans M; Smith, Stephen M; Niessen, Wiro J; Andersson, Jesper L R

    2013-08-01

    Anatomical alignment in neuroimaging studies is of such importance that considerable effort is put into improving the registration used to establish spatial correspondence. Tract-based spatial statistics (TBSS) is a popular method for comparing diffusion characteristics across subjects. TBSS establishes spatial correspondence using a combination of nonlinear registration and a "skeleton projection" that may break topological consistency of the transformed brain images. We therefore investigated feasibility of replacing the two-stage registration-projection procedure in TBSS with a single, regularized, high-dimensional registration. To optimize registration parameters and to evaluate registration performance in diffusion MRI, we designed an evaluation framework that uses native space probabilistic tractography for 23 white matter tracts, and quantifies tract similarity across subjects in standard space. We optimized parameters for two registration algorithms on two diffusion datasets of different quality. We investigated reproducibility of the evaluation framework, and of the optimized registration algorithms. Next, we compared registration performance of the regularized registration methods and TBSS. Finally, feasibility and effect of incorporating the improved registration in TBSS were evaluated in an example study. The evaluation framework was highly reproducible for both algorithms (R² = 0.993; 0.931). The optimal registration parameters depended on the quality of the dataset in a graded and predictable manner. At optimal parameters, both algorithms outperformed the registration of TBSS, showing feasibility of adopting such approaches in TBSS. This was further confirmed in the example experiment. Copyright © 2013 Elsevier Inc. All rights reserved.

  9. Artificial Intelligence versus Statistical Modeling and Optimization of Cholesterol Oxidase Production by using Streptomyces Sp.

    Science.gov (United States)

    Pathak, Lakshmi; Singh, Vineeta; Niwas, Ram; Osama, Khwaja; Khan, Saif; Haque, Shafiul; Tripathi, C K M; Mishra, B N

    2015-01-01

    Cholesterol oxidase (COD) is a bi-functional FAD-containing oxidoreductase which catalyzes the oxidation of cholesterol into 4-cholesten-3-one. The wider biological functions and clinical applications of COD have urged the screening, isolation and characterization of newer microbes from diverse habitats as a source of COD and optimization and over-production of COD for various uses. The practicability of statistical/ artificial intelligence techniques, such as response surface methodology (RSM), artificial neural network (ANN) and genetic algorithm (GA) have been tested to optimize the medium composition for the production of COD from novel strain Streptomyces sp. NCIM 5500. All experiments were performed according to the five factor central composite design (CCD) and the generated data was analysed using RSM and ANN. GA was employed to optimize the models generated by RSM and ANN. Based upon the predicted COD concentration, the model developed with ANN was found to be superior to the model developed with RSM. The RSM-GA approach predicted maximum of 6.283 U/mL COD production, whereas the ANN-GA approach predicted a maximum of 9.93 U/mL COD concentration. The optimum concentrations of the medium variables predicted through ANN-GA approach were: 1.431 g/50 mL soybean, 1.389 g/50 mL maltose, 0.029 g/50 mL MgSO4, 0.45 g/50 mL NaCl and 2.235 ml/50 mL glycerol. The experimental COD concentration was concurrent with the GA predicted yield and led to 9.75 U/mL COD production, which was nearly two times higher than the yield (4.2 U/mL) obtained with the un-optimized medium. This is the very first time we are reporting the statistical versus artificial intelligence based modeling and optimization of COD production by Streptomyces sp. NCIM 5500.

  10. Dynamic statistical optimization of GNSS radio occultation bending angles: advanced algorithm and performance analysis

    Science.gov (United States)

    Li, Y.; Kirchengast, G.; Scherllin-Pirscher, B.; Norman, R.; Yuan, Y. B.; Fritzer, J.; Schwaerz, M.; Zhang, K.

    2015-08-01

    We introduce a new dynamic statistical optimization algorithm to initialize ionosphere-corrected bending angles of Global Navigation Satellite System (GNSS)-based radio occultation (RO) measurements. The new algorithm estimates background and observation error covariance matrices with geographically varying uncertainty profiles and realistic global-mean correlation matrices. The error covariance matrices estimated by the new approach are more accurate and realistic than in simplified existing approaches and can therefore be used in statistical optimization to provide optimal bending angle profiles for high-altitude initialization of the subsequent Abel transform retrieval of refractivity. The new algorithm is evaluated against the existing Wegener Center Occultation Processing System version 5.6 (OPSv5.6) algorithm, using simulated data on two test days from January and July 2008 and real observed CHAllenging Minisatellite Payload (CHAMP) and Constellation Observing System for Meteorology, Ionosphere, and Climate (COSMIC) measurements from the complete months of January and July 2008. The following is achieved for the new method's performance compared to OPSv5.6: (1) significant reduction of random errors (standard deviations) of optimized bending angles down to about half of their size or more; (2) reduction of the systematic differences in optimized bending angles for simulated MetOp data; (3) improved retrieval of refractivity and temperature profiles; and (4) realistically estimated global-mean correlation matrices and realistic uncertainty fields for the background and observations. Overall the results indicate high suitability for employing the new dynamic approach in the processing of long-term RO data into a reference climate record, leading to well-characterized and high-quality atmospheric profiles over the entire stratosphere.

  11. Estimating Unbiased Land Cover Change Areas In The Colombian Amazon Using Landsat Time Series And Statistical Inference Methods

    Science.gov (United States)

    Arevalo, P. A.; Olofsson, P.; Woodcock, C. E.

    2017-12-01

    Unbiased estimation of the areas of conversion between land categories ("activity data") and their uncertainty is crucial for providing more robust calculations of carbon emissions to the atmosphere, as well as their removals. This is particularly important for the REDD+ mechanism of UNFCCC where an economic compensation is tied to the magnitude and direction of such fluxes. Dense time series of Landsat data and statistical protocols are becoming an integral part of forest monitoring efforts, but there are relatively few studies in the tropics focused on using these methods to advance operational MRV systems (Monitoring, Reporting and Verification). We present the results of a prototype methodology for continuous monitoring and unbiased estimation of activity data that is compliant with the IPCC Approach 3 for representation of land. We used a break detection algorithm (Continuous Change Detection and Classification, CCDC) to fit pixel-level temporal segments to time series of Landsat data in the Colombian Amazon. The segments were classified using a Random Forest classifier to obtain annual maps of land categories between 2001 and 2016. Using these maps, a biannual stratified sampling approach was implemented and unbiased stratified estimators constructed to calculate area estimates with confidence intervals for each of the stable and change classes. Our results provide evidence of a decrease in primary forest as a result of conversion to pastures, as well as increase in secondary forest as pastures are abandoned and the forest allowed to regenerate. Estimating areas of other land transitions proved challenging because of their very small mapped areas compared to stable classes like forest, which corresponds to almost 90% of the study area. Implications on remote sensing data processing, sample allocation and uncertainty reduction are also discussed.
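
    The abstract does not reproduce the estimator; the sketch below implements the stratified area estimator and standard error commonly used with this kind of design (e.g. Olofsson et al., 2014), assuming the strata are the mapped classes; the confusion matrix and areas are invented for illustration and are not the study's results.

    ```python
    import numpy as np

    def stratified_area(W, n, total_area):
        """Stratified estimator of reference-class areas from an accuracy-assessment sample.
        W          : mapped proportion of each stratum (length q, sums to 1)
        n          : q x q array of sample counts; rows = map stratum, cols = reference class
        total_area : total area of the region (e.g. in hectares)"""
        W = np.asarray(W, dtype=float)
        n = np.asarray(n, dtype=float)
        n_i = n.sum(axis=1, keepdims=True)
        p_hat = W[:, None] * n / n_i                 # estimated area proportions p_ij
        p_k = p_hat.sum(axis=0)                      # reference-class proportions
        var = np.sum(W[:, None] ** 2 * (n / n_i) * (1 - n / n_i) / (n_i - 1), axis=0)
        return total_area * p_k, 1.96 * total_area * np.sqrt(var)   # estimate, 95 % CI half-width

    # Invented example: strata = (stable forest, stable pasture, forest-to-pasture change)
    W = [0.88, 0.10, 0.02]
    n = [[278, 10, 12],
         [  5, 90,  5],
         [  9,  6, 85]]
    area, ci = stratified_area(W, n, total_area=1_000_000.0)
    ```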

  12. Robust optimization of the output voltage of nanogenerators by statistical design of experiments

    KAUST Repository

    Song, Jinhui; Xie, Huizhi; Wu, Wenzhuo; Roshan Joseph, V.; Jeff Wu, C. F.; Wang, Zhong Lin

    2010-01-01

    Nanogenerators were first demonstrated by deflecting aligned ZnO nanowires using a conductive atomic force microscopy (AFM) tip. The output of a nanogenerator is affected by three parameters: tip normal force, tip scanning speed, and tip abrasion. In this work, systematic experimental studies have been carried out to examine the combined effects of these three parameters on the output, using statistical design of experiments. A statistical model has been built to analyze the data and predict the optimal parameter settings. For an AFM tip of cone angle 70° coated with Pt, and ZnO nanowires with a diameter of 50 nm and lengths of 600 nm to 1 μm, the optimized parameters for the nanogenerator were found to be a normal force of 137 nN and scanning speed of 40 μm/s, rather than the conventional settings of 120 nN for the normal force and 30 μm/s for the scanning speed. A nanogenerator with the optimized settings has three times the average output voltage of one with the conventional settings. © 2010 Tsinghua University Press and Springer-Verlag Berlin Heidelberg.

  13. Robust optimization of the output voltage of nanogenerators by statistical design of experiments

    KAUST Repository

    Song, Jinhui

    2010-09-01

    Nanogenerators were first demonstrated by deflecting aligned ZnO nanowires using a conductive atomic force microscopy (AFM) tip. The output of a nanogenerator is affected by three parameters: tip normal force, tip scanning speed, and tip abrasion. In this work, systematic experimental studies have been carried out to examine the combined effects of these three parameters on the output, using statistical design of experiments. A statistical model has been built to analyze the data and predict the optimal parameter settings. For an AFM tip of cone angle 70° coated with Pt, and ZnO nanowires with a diameter of 50 nm and lengths of 600 nm to 1 μm, the optimized parameters for the nanogenerator were found to be a normal force of 137 nN and scanning speed of 40 μm/s, rather than the conventional settings of 120 nN for the normal force and 30 μm/s for the scanning speed. A nanogenerator with the optimized settings has three times the average output voltage of one with the conventional settings. © 2010 Tsinghua University Press and Springer-Verlag Berlin Heidelberg.

  14. Statistics

    International Nuclear Information System (INIS)

    2003-01-01

    For the year 2002, part of the figures shown in the tables of the Energy Review are preliminary. The annual statistics of the Energy Review also include historical time-series over a longer period (see e.g. Energiatilastot 2001, Statistics Finland, Helsinki 2002). The applied energy units and conversion coefficients are shown in the inside back cover of the Review. Explanatory notes to the statistical tables can be found after tables and figures. The figures present: Changes in GDP, energy consumption and electricity consumption, Carbon dioxide emissions from fossil fuel use, Coal consumption, Consumption of natural gas, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices in heat production, Fuel prices in electricity production, Price of electricity by type of consumer, Average monthly spot prices at the Nord Pool power exchange, Total energy consumption by source and CO2 emissions, Supply and total consumption of electricity (GWh), Energy imports by country of origin in January-June 2003, Energy exports by recipient country in January-June 2003, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Price of natural gas by type of consumer, Price of electricity by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Excise taxes, precautionary stock fees and oil pollution fees on energy products

  15. Statistics

    International Nuclear Information System (INIS)

    2004-01-01

    For the years 2003 and 2004, the figures shown in the tables of the Energy Review are partly preliminary. The annual statistics of the Energy Review also include historical time-series over a longer period (see e.g. Energiatilastot, Statistics Finland, Helsinki 2003, ISSN 0785-3165). The applied energy units and conversion coefficients are shown in the inside back cover of the Review. Explanatory notes to the statistical tables can be found after tables and figures. The figures present: Changes in GDP, energy consumption and electricity consumption, Carbon dioxide emissions from fossil fuel use, Coal consumption, Consumption of natural gas, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices in heat production, Fuel prices in electricity production, Price of electricity by type of consumer, Average monthly spot prices at the Nord Pool power exchange, Total energy consumption by source and CO2 emissions, Supplies and total consumption of electricity (GWh), Energy imports by country of origin in January-March 2004, Energy exports by recipient country in January-March 2004, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Price of natural gas by type of consumer, Price of electricity by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Excise taxes, precautionary stock fees and oil pollution fees

  16. Statistics

    International Nuclear Information System (INIS)

    2000-01-01

    For the years 1999 and 2000, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy Review also include historical time series over a longer period (see e.g., Energiatilastot 1999, Statistics Finland, Helsinki 2000, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after tables and figures. The figures present: Changes in the volume of GNP and energy consumption, Changes in the volume of GNP and electricity, Coal consumption, Natural gas consumption, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices for heat production, Fuel prices for electricity production, Carbon dioxide emissions, Total energy consumption by source and CO2 emissions, Electricity supply, Energy imports by country of origin in January-June 2000, Energy exports by recipient country in January-June 2000, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Average electricity price by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes and precautionary stock fees on oil products

  17. Forward and backward inference in spatial cognition.

    Directory of Open Access Journals (Sweden)

    Will D Penny

    Full Text Available This paper shows that the various computations underlying spatial cognition can be implemented using statistical inference in a single probabilistic model. Inference is implemented using a common set of 'lower-level' computations involving forward and backward inference over time. For example, to estimate where you are in a known environment, forward inference is used to optimally combine location estimates from path integration with those from sensory input. To decide which way to turn to reach a goal, forward inference is used to compute the likelihood of reaching that goal under each option. To work out which environment you are in, forward inference is used to compute the likelihood of sensory observations under the different hypotheses. For reaching sensory goals that require a chaining together of decisions, forward inference can be used to compute a state trajectory that will lead to that goal, and backward inference to refine the route and estimate control signals that produce the required trajectory. We propose that these computations are reflected in recent findings of pattern replay in the mammalian brain. Specifically, that theta sequences reflect decision making, theta flickering reflects model selection, and remote replay reflects route and motor planning. We also propose a mapping of the above computational processes onto lateral and medial entorhinal cortex and hippocampus.
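
    The paper's probabilistic model is not reproduced here; the hedged sketch below shows the generic discrete forward (filtering) and backward recursions that the "lower-level" computations described above correspond to, for an arbitrary state-transition matrix and sensory likelihood.

    ```python
    import numpy as np

    def forward_step(belief, transition, likelihood):
        """Forward inference: predict with the state-transition (e.g. path-integration)
        model, then weight by the likelihood of the current sensory input."""
        predicted = transition.T @ belief
        posterior = predicted * likelihood
        return posterior / posterior.sum()

    def backward_step(beta, transition, likelihood):
        """Backward inference: propagate evidence about future observations backwards,
        as used when refining a route towards a sensory goal."""
        return transition @ (likelihood * beta)

    # Tiny 3-state example (states could be locations in a known environment)
    T = np.array([[0.8, 0.2, 0.0],
                  [0.1, 0.8, 0.1],
                  [0.0, 0.2, 0.8]])
    belief = np.array([1/3, 1/3, 1/3])
    belief = forward_step(belief, T, likelihood=np.array([0.1, 0.7, 0.2]))
    ```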

  18. Statistical modeling/optimization and process intensification of microwave-assisted acidified oil esterification

    International Nuclear Information System (INIS)

    Ma, Lingling; Lv, Enmin; Du, Lixiong; Lu, Jie; Ding, Jincheng

    2016-01-01

    Highlights: • Microwave irradiation was employed for the esterification of acidified oil. • Optimization and modeling of the process were performed by RSM and ANN. • Both models have reliable prediction abilities, but the ANN was superior to the RSM. • Membrane vapor permeation and in-situ dehydration were used to shift the equilibrium. • The two dehydration approaches improved the FFA conversion rate by approximately 20.0%.
    Abstract: The esterification of acidified oil with ethanol under microwave radiation was modeled and optimized using response surface methodology (RSM) and an artificial neural network (ANN). The impacts of the mass ratio of ethanol to acidified oil, catalyst loading, microwave power and reaction time are evaluated by a Box-Behnken design (BBD) of the RSM and a multi-layer perceptron (MLP) ANN. RSM combined with BBD gives the optimal conditions as a catalyst loading of 5.85 g, a mass ratio of ethanol to acidified oil of 0.35 (20.0 g acidified oil), a microwave power of 328 W and a reaction time of 98.0 min, with a free fatty acid (FFA) conversion of 78.57%. Both models fit the experimental data well; however, the ANN exhibits better prediction accuracy than the RSM based on the statistical analyses. Furthermore, membrane vapor permeation and in-situ molecular sieve dehydration were investigated to enhance the esterification under the optimized conditions.

  19. The duration of uncertain times: audiovisual information about intervals is integrated in a statistically optimal fashion.

    Directory of Open Access Journals (Sweden)

    Jess Hartcher-O'Brien

    Full Text Available Often multisensory information is integrated in a statistically optimal fashion where each sensory source is weighted according to its precision. This integration scheme is statistically optimal because it theoretically results in unbiased perceptual estimates with the highest precision possible. There is a current lack of consensus about how the nervous system processes multiple sensory cues to elapsed time. In order to shed light upon this, we adopt a computational approach to pinpoint the integration strategy underlying duration estimation of audio/visual stimuli. One of the assumptions of our computational approach is that the multisensory signals redundantly specify the same stimulus property. Our results clearly show that despite claims to the contrary, perceived duration is the result of an optimal weighting process, similar to that adopted for estimates of space. That is, participants weight the audio and visual information to arrive at the most precise, single duration estimate possible. The work also disentangles how different integration strategies - i.e. considering the time of onset/offset of signals - might alter the final estimate. As such we provide the first concrete evidence of an optimal integration strategy in human duration estimates.
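
    In the standard formulation of statistically optimal (maximum-likelihood) integration referred to above, the audio and visual duration estimates are combined with weights inversely proportional to their variances, and the combined estimate is never less precise than the better single cue:

    ```latex
    \hat{d}_{AV} = w_A \hat{d}_A + w_V \hat{d}_V, \qquad
    w_A = \frac{1/\sigma_A^{2}}{1/\sigma_A^{2} + 1/\sigma_V^{2}}, \quad w_V = 1 - w_A, \qquad
    \sigma_{AV}^{2} = \left( \frac{1}{\sigma_A^{2}} + \frac{1}{\sigma_V^{2}} \right)^{-1}
    \le \min\!\left(\sigma_A^{2}, \sigma_V^{2}\right)
    ```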

  20. Statistical and optimization methods to expedite neural network training for transient identification

    International Nuclear Information System (INIS)

    Reifman, J.; Vitela, E.J.; Lee, J.C.

    1993-01-01

    Two complementary methods, statistical feature selection and nonlinear optimization through conjugate gradients, are used to expedite feedforward neural network training. Statistical feature selection techniques in the form of linear correlation coefficients and information-theoretic entropy are used to eliminate redundant and non-informative plant parameters to reduce the size of the network. The method of conjugate gradients is used to accelerate the network training convergence and to systematically calculate the learning and momentum constants at each iteration. The proposed techniques are compared with the backpropagation algorithm using the entire set of plant parameters in the training of neural networks to identify transients simulated with the Midland Nuclear Power Plant Unit 2 simulator. By using 25% of the plant parameters and the conjugate gradients, a 30-fold reduction in CPU time was obtained without degrading the diagnostic ability of the network
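
    The paper's exact update rules are not reproduced here; as a hedged sketch, a Fletcher-Reeves nonlinear conjugate-gradient step with a backtracking line search is shown below, where the line-search step plays the role of the learning constant and the beta coefficient that of the momentum constant. The objective f stands in for the network's training error; the quadratic toy problem at the end is an assumption for illustration.

    ```python
    import numpy as np

    def conjugate_gradient(f, grad, w, iters=200):
        """Fletcher-Reeves nonlinear CG with a backtracking (Armijo) line search."""
        g = grad(w)
        d = -g
        for _ in range(iters):
            if g @ d >= 0:                       # safeguard: restart with steepest descent
                d = -g
            step, fw = 1.0, f(w)
            while f(w + step * d) > fw + 1e-4 * step * (g @ d) and step > 1e-12:
                step *= 0.5                      # backtrack until sufficient decrease
            w = w + step * d
            g_new = grad(w)
            if g_new @ g_new < 1e-18:            # converged
                return w
            beta = (g_new @ g_new) / (g @ g)     # Fletcher-Reeves "momentum" coefficient
            d = -g_new + beta * d
            g = g_new
        return w

    # Quadratic toy problem standing in for a network's error surface
    A = np.array([[3.0, 0.5], [0.5, 1.0]])
    w_opt = conjugate_gradient(lambda w: 0.5 * w @ A @ w, lambda w: A @ w, np.array([2.0, -3.0]))
    ```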

  1. Supersonic acoustic intensity with statistically optimized near-field acoustic holography

    DEFF Research Database (Denmark)

    Fernandez Grande, Efren; Jacobsen, Finn

    2011-01-01

    The concept of supersonic acoustic intensity was introduced some years ago for estimating the fraction of the flow of energy radiated by a source that propagates to the far field. It differs from the usual (active) intensity by excluding the near-field energy resulting from evanescent waves...... to the information provided by the near-field acoustic holography technique. This study proposes a version of the supersonic acoustic intensity applied to statistically optimized near-field acoustic holography (SONAH). The theory, numerical results and an experimental study are presented. The possibility of using...

  2. Cleaving of TOPAS and PMMA microstructured polymer optical fibers: Core-shift and statistical quality optimization

    DEFF Research Database (Denmark)

    Stefani, Alessio; Nielsen, Kristian; Rasmussen, Henrik K.

    2012-01-01

    We fabricated an electronically controlled polymer optical fiber cleaver, which uses a razor-blade guillotine and provides independent control of fiber temperature, blade temperature, and cleaving speed. To determine the optimum cleaving conditions of microstructured polymer optical fibers (mPOFs) with hexagonal hole structures we developed a program for cleaving quality optimization, which reads in a microscope image of the fiber end-facet and determines the core-shift and the statistics of the hole diameter, hole-to-hole pitch, hole ellipticity, and direction of major ellipse axis. For 125μm in diameter...

  3. Optimization of critical medium components for higher phycocyanin ...

    African Journals Online (AJOL)

    STORAGESEVER

    2009-09-01

    Sep 1, 2009 ... culture medium was screened and optimized using the statistical experimental designs of Plackett- .... statistical inference for exploring the functional relation- ..... from the F-test with a very low probability value (0.0015).

  4. Optimal statistic for detecting gravitational wave signals from binary inspirals with LISA

    CERN Document Server

    Rogan, A

    2004-01-01

    A binary compact object early in its inspiral phase will be picked up by LISA through its nearly monochromatic gravitational radiation. But even this innocuous-appearing candidate poses interesting detection challenges. The data that will be scanned for such sources will be a set of three functions of LISA's twelve data streams obtained through time-delay interferometry, which is necessary to cancel the noise contributions from laser-frequency fluctuations and optical-bench motions to these data streams. We call these three functions pseudo-detectors. The sensitivity of any pseudo-detector to a given sky position is a function of LISA's orbital position. Moreover, at a given point in LISA's orbit, each pseudo-detector has a different sensitivity to the same sky position. In this work, we obtain the optimal statistic for detecting gravitational wave signals, such as from compact binaries early in their inspiral stage, in LISA data. We also present how the sensitivity of LISA, defined by this optimal statistic, vari...

  5. Statistical mechanical analysis of linear programming relaxation for combinatorial optimization problems.

    Science.gov (United States)

    Takabe, Satoshi; Hukushima, Koji

    2016-05-01

    Typical behavior of the linear programming (LP) problem is studied as a relaxation of the minimum vertex cover (min-VC), a type of integer programming (IP) problem. A lattice-gas model on the Erdös-Rényi random graphs of α-uniform hyperedges is proposed to express both the LP and IP problems of the min-VC in the common statistical mechanical model with a one-parameter family. Statistical mechanical analyses reveal for α=2 that the LP optimal solution is typically equal to that given by the IP below the critical average degree c=e in the thermodynamic limit. The critical threshold for good accuracy of the relaxation extends the mathematical result c=1 and coincides with the replica symmetry-breaking threshold of the IP. The LP relaxation for the minimum hitting sets with α≥3, minimum vertex covers on α-uniform random graphs, is also studied. Analytic and numerical results strongly suggest that the LP relaxation fails to estimate optimal values above the critical average degree c=e/(α-1) where the replica symmetry is broken.
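
    To make the LP relaxation concrete, here is a hedged sketch using scipy's linprog (the graph is arbitrary): minimize the sum of x_i subject to x_u + x_v >= 1 on every edge and 0 <= x <= 1. On a triangle the LP optimum is 3/2 (all x = 1/2) while the integer optimum is 2, a small instance of the LP-IP gap whose typical behavior on random graphs is analyzed above.

    ```python
    import numpy as np
    from scipy.optimize import linprog

    def lp_vertex_cover(n_vertices, edges):
        """LP relaxation of minimum vertex cover on a simple undirected graph."""
        c = np.ones(n_vertices)                          # minimize sum of x_i
        A_ub = np.zeros((len(edges), n_vertices))
        for row, (u, v) in enumerate(edges):             # x_u + x_v >= 1  <=>  -x_u - x_v <= -1
            A_ub[row, u] = A_ub[row, v] = -1.0
        b_ub = -np.ones(len(edges))
        res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                      bounds=[(0.0, 1.0)] * n_vertices, method="highs")
        return res.x, res.fun

    x, lp_value = lp_vertex_cover(3, [(0, 1), (1, 2), (0, 2)])   # triangle: LP = 1.5, IP = 2
    ```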

  6. Statistical mechanical analysis of linear programming relaxation for combinatorial optimization problems

    Science.gov (United States)

    Takabe, Satoshi; Hukushima, Koji

    2016-05-01

    Typical behavior of the linear programming (LP) problem is studied as a relaxation of the minimum vertex cover (min-VC), a type of integer programming (IP) problem. A lattice-gas model on the Erdös-Rényi random graphs of α-uniform hyperedges is proposed to express both the LP and IP problems of the min-VC in the common statistical mechanical model with a one-parameter family. Statistical mechanical analyses reveal for α = 2 that the LP optimal solution is typically equal to that given by the IP below the critical average degree c = e in the thermodynamic limit. The critical threshold for good accuracy of the relaxation extends the mathematical result c = 1 and coincides with the replica symmetry-breaking threshold of the IP. The LP relaxation for the minimum hitting sets with α ≥ 3, minimum vertex covers on α-uniform random graphs, is also studied. Analytic and numerical results strongly suggest that the LP relaxation fails to estimate optimal values above the critical average degree c = e/(α - 1) where the replica symmetry is broken.

  7. Efficient Coding and Statistically Optimal Weighting of Covariance among Acoustic Attributes in Novel Sounds

    Science.gov (United States)

    Stilp, Christian E.; Kluender, Keith R.

    2012-01-01

    To the extent that sensorineural systems are efficient, redundancy should be extracted to optimize transmission of information, but perceptual evidence for this has been limited. Stilp and colleagues recently reported efficient coding of robust correlation (r = .97) among complex acoustic attributes (attack/decay, spectral shape) in novel sounds. Discrimination of sounds orthogonal to the correlation was initially inferior but later comparable to that of sounds obeying the correlation. These effects were attenuated for less-correlated stimuli (r = .54) for reasons that are unclear. Here, statistical properties of correlation among acoustic attributes essential for perceptual organization are investigated. Overall, simple strength of the principal correlation is inadequate to predict listener performance. Initial superiority of discrimination for statistically consistent sound pairs was relatively insensitive to decreased physical acoustic/psychoacoustic range of evidence supporting the correlation, and to more frequent presentations of the same orthogonal test pairs. However, increased range supporting an orthogonal dimension has substantial effects upon perceptual organization. Connectionist simulations and Eigenvalues from closed-form calculations of principal components analysis (PCA) reveal that perceptual organization is near-optimally weighted to shared versus unshared covariance in experienced sound distributions. Implications of reduced perceptual dimensionality for speech perception and plausible neural substrates are discussed. PMID:22292057
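
    The covariance-weighting result can be illustrated numerically: for two attributes correlated at r, the eigenvalues of their correlation matrix split the variance into a shared (principal) component of roughly 1 + r and an unshared (orthogonal) component of roughly 1 - r. The sketch below uses synthetic samples, not the study's stimuli.

        # Minimal sketch: eigenvalues of a two-attribute correlation quantify
        # shared versus unshared covariance (synthetic data for illustration).
        import numpy as np

        rng = np.random.default_rng(1)
        r = 0.97                                     # assumed attribute correlation
        cov = np.array([[1.0, r], [r, 1.0]])
        X = rng.multivariate_normal([0.0, 0.0], cov, size=2000)   # e.g. attack/decay, spectral shape

        eigvals, _ = np.linalg.eigh(np.corrcoef(X.T))   # closed-form PCA on a 2x2 correlation matrix
        shared, unshared = eigvals.max(), eigvals.min() # approximately (1 + r) and (1 - r)
        print("shared vs unshared variance:", shared, unshared)
        print("proportion of variance on the consistent dimension:", shared / eigvals.sum())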

  8. High-throughput optimization by statistical designs: example with rat liver slices cryopreservation.

    Science.gov (United States)

    Martin, H; Bournique, B; Blanchi, B; Lerche-Langrand, C

    2003-08-01

    The purpose of this study was to optimize cryopreservation conditions of rat liver slices in a high-throughput format, with focus on reproducibility. A statistical design of 32 experiments was performed and intracellular lactate dehydrogenase (LDHi) activity and antipyrine (AP) metabolism were evaluated as biomarkers. At freezing, modified University of Wisconsin solution was better than Williams'E medium, and pure dimethyl sulfoxide was better than a cryoprotectant mixture. The best cryoprotectant concentrations were 10% for LDHi and 20% for AP metabolism. Fetal calf serum could be used at 50 or 80%, and incubation of slices with the cryoprotectant could last 10 or 20 min. At thawing, 42 degrees C was better than 22 degrees C. After thawing, 1h was better than 3h of preculture. Cryopreservation increased the interslice variability of the biomarkers. After cryopreservation, LDHi and AP metabolism levels were up to 84 and 80% of fresh values. However, these high levels were not reproducibly achieved. Two factors involved in the day-to-day variability of LDHi were identified: the incubation time with the cryoprotectant and the preculture time. In conclusion, the statistical design was very efficient to quickly determine optimized conditions by simultaneously measuring the role of numerous factors. The cryopreservation procedure developed appears suitable for qualitative metabolic profiling studies.

  9. Practical Bayesian Inference

    Science.gov (United States)

    Bailer-Jones, Coryn A. L.

    2017-04-01

    Preface; 1. Probability basics; 2. Estimation and uncertainty; 3. Statistical models and inference; 4. Linear models, least squares, and maximum likelihood; 5. Parameter estimation: single parameter; 6. Parameter estimation: multiple parameters; 7. Approximating distributions; 8. Monte Carlo methods for inference; 9. Parameter estimation: Markov chain Monte Carlo; 10. Frequentist hypothesis testing; 11. Model comparison; 12. Dealing with more complicated problems; References; Index.

  10. Artificial Intelligence versus Statistical Modeling and Optimization of Cholesterol Oxidase Production by using Streptomyces Sp.

    Directory of Open Access Journals (Sweden)

    Lakshmi Pathak

    Full Text Available Cholesterol oxidase (COD) is a bi-functional FAD-containing oxidoreductase which catalyzes the oxidation of cholesterol into 4-cholesten-3-one. The wider biological functions and clinical applications of COD have urged the screening, isolation and characterization of newer microbes from diverse habitats as a source of COD and the optimization and over-production of COD for various uses. The practicability of statistical/artificial intelligence techniques, such as response surface methodology (RSM), artificial neural network (ANN) and genetic algorithm (GA), has been tested to optimize the medium composition for the production of COD from the novel strain Streptomyces sp. NCIM 5500. All experiments were performed according to a five-factor central composite design (CCD) and the generated data were analysed using RSM and ANN. GA was employed to optimize the models generated by RSM and ANN. Based upon the predicted COD concentration, the model developed with ANN was found to be superior to the model developed with RSM. The RSM-GA approach predicted a maximum of 6.283 U/mL COD production, whereas the ANN-GA approach predicted a maximum of 9.93 U/mL COD concentration. The optimum concentrations of the medium variables predicted through the ANN-GA approach were: 1.431 g/50 mL soybean, 1.389 g/50 mL maltose, 0.029 g/50 mL MgSO4, 0.45 g/50 mL NaCl and 2.235 ml/50 mL glycerol. The experimental COD concentration was concurrent with the GA-predicted yield and led to 9.75 U/mL COD production, which was nearly two times higher than the yield (4.2 U/mL) obtained with the un-optimized medium. This is the very first time that statistical versus artificial intelligence based modeling and optimization of COD production by Streptomyces sp. NCIM 5500 has been reported.

  11. Statistical optimization of fermentative hydrogen production from xylose by newly isolated Enterobacter sp. CN1

    Energy Technology Data Exchange (ETDEWEB)

    Long, Chuannan; Cui, Jingjing; Liu, Zuotao; Liu, Yuntao; Hu, Zhong [Department of Biology, Shantou University, Shantou 515063 (China); Long, Minnan [The School of Energy Research, Xiamen University, Xiamen 361005 (China)

    2010-07-15

    Statistical experimental designs were applied for the optimization of medium constituents for hydrogen production from xylose by newly isolated Enterobacter sp. CN1. Using a Plackett-Burman design, xylose, FeSO₄ and peptone were identified as significant variables which highly influenced hydrogen production. The path of steepest ascent was undertaken to approach the optimal region of the three significant factors. These variables were subsequently optimized using a Box-Behnken design of response surface methodology (RSM). The optimum conditions were found to be xylose 16.15 g/L, FeSO₄ 250.17 mg/L and peptone 2.54 g/L. Hydrogen production under these optimum conditions was 1149.9 ± 65 ml H₂/L medium. Under different carbon source conditions, the cumulative hydrogen volumes were 1217 ml H₂/L for xylose medium, 1102 ml H₂/L for glucose medium and 977 ml H₂/L for sucrose medium; the maximum hydrogen yields were 2.0 ± 0.05 mol H₂/mol xylose and 0.64 mol H₂/mol glucose. Fermentative hydrogen production from xylose by Enterobacter sp. CN1 was thus superior to that from glucose and sucrose. (author)
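
    The Box-Behnken/RSM step reported here amounts to fitting a second-order polynomial to coded factor levels and solving for the stationary point of the fitted surface. The sketch below illustrates that generic calculation; the responses are hypothetical numbers, not the paper's measurements.

        # Minimal sketch: fit a quadratic response surface to a coded three-factor
        # Box-Behnken design and locate its stationary point (hypothetical data).
        import numpy as np

        # Coded levels (-1, 0, +1) of, e.g., xylose, FeSO4, peptone; y: assumed H2 yields.
        X = np.array([[-1,-1,0],[1,-1,0],[-1,1,0],[1,1,0],[-1,0,-1],[1,0,-1],[-1,0,1],[1,0,1],
                      [0,-1,-1],[0,1,-1],[0,-1,1],[0,1,1],[0,0,0],[0,0,0],[0,0,0]], float)
        y = np.array([800, 950, 870, 1020, 780, 940, 860, 1010, 820, 900, 880, 960,
                      1140, 1150, 1145], float)

        def design_matrix(X):
            x1, x2, x3 = X.T
            return np.column_stack([np.ones(len(X)), x1, x2, x3,
                                    x1*x2, x1*x3, x2*x3, x1**2, x2**2, x3**2])

        beta, *_ = np.linalg.lstsq(design_matrix(X), y, rcond=None)

        # Stationary point of the fitted quadratic: solve grad(y) = 0, i.e. b + B x = 0.
        b = beta[1:4]
        B = np.array([[2*beta[7], beta[4],   beta[5]],
                      [beta[4],   2*beta[8], beta[6]],
                      [beta[5],   beta[6],   2*beta[9]]])
        x_opt = np.linalg.solve(B, -b)
        print("stationary point in coded units:", x_opt)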

  12. Optimization of phototrophic hydrogen production by Rhodopseudomonas palustris PBUM001 via statistical experimental design

    Energy Technology Data Exchange (ETDEWEB)

    Jamil, Zadariana [Department of Civil Engineering, Faculty of Engineering, University of Malaya (Malaysia); Faculty of Civil Engineering, Technology University of MARA (Malaysia); Mohamad Annuar, Mohamad Suffian; Vikineswary, S. [Institute of Biological Sciences, University of Malaya (Malaysia); Ibrahim, Shaliza [Department of Civil Engineering, Faculty of Engineering, University of Malaya (Malaysia)

    2009-09-15

    Phototrophic hydrogen production by an indigenous purple non-sulfur bacterium, Rhodopseudomonas palustris PBUM001, from palm oil mill effluent (POME) was optimized using response surface methodology (RSM). The process parameters studied included inoculum size (% v/v), POME concentration (% v/v), light intensity (klux), agitation (rpm) and pH. The experimental data on cumulative hydrogen production and COD reduction were fitted to a quadratic polynomial model using response surface regression analysis. The path to optimal process conditions was determined by analyzing response surface three-dimensional surface plots and contour plots. Statistical analysis of experimental data collected following a Box-Behnken design showed that 100% (v/v) POME concentration, 10% (v/v) inoculum size, light intensity at 4.0 klux, agitation rate at 250 rpm and pH of 6 were the best conditions. The maximum predicted cumulative hydrogen production and COD reduction obtained under these conditions were 1.05 ml H₂/ml POME and 31.71%, respectively. Subsequent verification experiments at the optimal process values gave a maximum cumulative hydrogen yield of 0.66 ± 0.07 ml H₂/ml POME and a COD reduction of 30.54 ± 9.85%. (author)

  13. Statistical Optimization of Conditions for Decolorization of Synthetic Dyes by Cordyceps militaris MTCC 3936 Using RSM

    Directory of Open Access Journals (Sweden)

    Baljinder Kaur

    2015-01-01

    Full Text Available In the present study, the biobleaching potential of the white rot fungus Cordyceps militaris MTCC3936 was investigated. For preliminary screening, the decolorization properties of C. militaris were comparatively studied using whole cells in agar-based and liquid culture systems. Preliminary investigation in liquid culture systems revealed 100% decolorization achieved within 3 days of incubation for reactive yellow 18, 6 days for reactive red 31, 7 days for reactive black 8, and 11 days for reactive green 19 and reactive red 74. RSM was further used to study the effect of three independent variables, namely pH, incubation time, and dye concentration, on the decolorization properties of cell-free supernatant of C. militaris. RSM-based statistical analysis revealed that dye decolorization by cell-free supernatants of C. militaris is more efficient than the whole-cell-based system. The optimized conditions for decolorization of synthetic dyes were identified as a dye concentration of 300 ppm, an incubation time of 48 h, and an optimal pH value of 5.5, except for reactive red 31 (for which the model was nonsignificant). The maximum dye decolorizations achieved under optimized conditions for reactive yellow 18, reactive green 19, reactive red 74, and reactive black 8 were 73.07, 65.36, 55.37, and 68.59%, respectively.

  14. Statistical inference for Cox processes

    DEFF Research Database (Denmark)

    Møller, Jesper; Waagepetersen, Rasmus Plenge

    2002-01-01

    Research has generated a number of advances in methods for spatial cluster modelling in recent years, particularly in the area of Bayesian cluster modelling. Along with these advances has come an explosion of interest in the potential applications of this work, especially in epidemiology and genome...... research.   In one integrated volume, this book reviews the state-of-the-art in spatial clustering and spatial cluster modelling, bringing together research and applications previously scattered throughout the literature. It begins with an overview of the field, then presents a series of chapters...... that illuminate the nature and purpose of cluster modelling within different application areas, including astrophysics, epidemiology, ecology, and imaging. The focus then shifts to methods, with discussions on point and object process modelling, perfect sampling of cluster processes, partitioning in space...

  15. Plausible values in statistical inference

    NARCIS (Netherlands)

    Marsman, M.

    2014-01-01

    In Chapter 2 it is shown that the marginal distribution of plausible values is a consistent estimator of the true latent variable distribution, and, furthermore, that convergence is monotone in an embedding in which the number of items tends to infinity. This result is used to clarify some of the

  16. Optimal statistical damage detection and classification in an experimental wind turbine blade using minimum instrumentation

    Science.gov (United States)

    Hoell, Simon; Omenzetter, Piotr

    2017-04-01

    The increasing demand for carbon-neutral energy in a challenging economic environment is a driving factor for erecting ever larger wind turbines in harsh environments using novel wind turbine blade (WTB) designs characterized by high flexibility and lower buckling capacity. To counteract the resulting increase in operation and maintenance costs, efficient structural health monitoring systems can be employed to prevent dramatic failures and to schedule maintenance actions according to the true structural state. This paper presents a novel methodology for classifying structural damage using vibrational responses from a single sensor. The method is based on statistical classification using Bayes' theorem and an advanced statistic, which allows the performance to be controlled by varying the number of samples that represent the current state. This is done for multivariate damage-sensitive features (DSFs) defined as partial autocorrelation coefficients (PACCs) estimated from vibrational responses and principal component analysis scores from PACCs. Additionally, optimal DSFs are composed not only for damage classification but also for damage detection based on binary statistical hypothesis testing, where feature selections are found with a fast forward procedure. The method is applied to laboratory experiments with a small-scale WTB with wind-like excitation and non-destructive damage scenarios. The obtained results demonstrate the advantages of the proposed procedure and are promising for future applications of vibration-based structural health monitoring of WTBs.
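
    The feature-extraction step described here, partial autocorrelation coefficients compressed by principal component analysis, can be sketched as follows on a simulated vibration response; the AR(2) signal, lag count and sample counts are stand-ins rather than the experimental blade data.

        # Minimal sketch: PACC damage-sensitive features from simulated responses,
        # compressed with PCA (illustration only, not the blade experiment).
        import numpy as np
        from statsmodels.tsa.stattools import pacf

        rng = np.random.default_rng(2)

        def pacc_features(signal, n_lags=20):
            return pacf(signal, nlags=n_lags)[1:]      # drop lag 0, which is always 1

        def response(n=4096, a1=1.6, a2=-0.9):
            # AR(2)-like resonance driven by noise, standing in for a vibration record.
            x, e = np.zeros(n), rng.standard_normal(n)
            for t in range(2, n):
                x[t] = a1 * x[t-1] + a2 * x[t-2] + e[t]
            return x

        features = np.array([pacc_features(response()) for _ in range(30)])

        centred = features - features.mean(axis=0)     # PCA scores via an SVD
        _, s, Vt = np.linalg.svd(centred, full_matrices=False)
        scores = centred @ Vt.T
        print("variance explained by the first 3 PCs:", (s[:3]**2).sum() / (s**2).sum())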

  17. A Unified Statistical Rain-Attenuation Model for Communication Link Fade Predictions and Optimal Stochastic Fade Control Design Using a Location-Dependent Rain-Statistic Database

    Science.gov (United States)

    Manning, Robert M.

    1990-01-01

    A static and dynamic rain-attenuation model is presented which describes the statistics of attenuation on an arbitrarily specified satellite link for any location for which there are long-term rainfall statistics. The model may be used in the design of optimal stochastic control algorithms to mitigate the effects of attenuation and maintain link reliability. A rain-statistics database is compiled, which makes it possible to apply the model to any location in the continental U.S. with a resolution of 0.5 degrees in latitude and longitude. The model predictions are compared with experimental observations, showing good agreement.

  18. Application of Bayesian statistical decision theory to the optimization of generating set maintenance

    International Nuclear Information System (INIS)

    Procaccia, H.; Cordier, R.; Muller, S.

    1994-07-01

    Statistical decision theory could be an alternative for the optimization of preventive maintenance periodicity. In effect, this theory concerns the situation in which a decision maker has to make a choice between a set of reasonable decisions, and where the loss associated with a given decision depends on a probabilistic risk, called the state of nature. In the case of maintenance optimization, the decisions to be analyzed are the different periodicities proposed by the experts, given the observed feedback experience; the states of nature are the associated failure probabilities; and the losses are the expectations of the induced cost of maintenance and of the consequences of failures. As failure probabilities concern rare events, at the ultimate state of RCM analysis (failure of a sub-component), and as the expected foreseeable behaviour of equipment has to be evaluated by experts, a Bayesian approach is successfully used to compute the states of nature. In Bayesian decision theory, a prior distribution for failure probabilities is modeled from expert knowledge and is combined with the sparse stochastic information provided by feedback experience, giving a posterior distribution of failure probabilities. The optimized decision is the decision that minimizes the expected loss over the posterior distribution. This methodology has been applied to inspection and maintenance optimization of the cylinders of diesel generator engines of 900 MW nuclear plants. In these plants, auxiliary electric power is supplied by 2 redundant diesel generators which are tested every 2 weeks for about 1 hour. Until now, during the yearly refueling of each plant, one endoscopic inspection of the diesel cylinders is performed, and every 5 operating years all cylinders are replaced. RCM has shown that cylinder failures could be critical. So Bayesian decision theory has been applied, taking into account expert opinions and the possibility of aging when the maintenance periodicity is extended. (authors). 8 refs., 5 figs., 1 tab
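
    The calculation described, combining an expert prior with sparse feedback data and choosing the periodicity that minimises the posterior expected loss, can be sketched as follows. Every number below (prior parameters, costs, candidate periodicities, risk model) is a hypothetical placeholder and is not taken from the diesel generator study.

        # Minimal sketch: Bayesian decision between candidate maintenance periodicities.
        import numpy as np
        from scipy import stats

        a_prior, b_prior = 2.0, 200.0               # expert Beta prior: failures are rare
        failures, cycles = 1, 150                   # sparse operating feedback
        posterior = stats.beta(a_prior + failures, b_prior + cycles - failures)

        candidates = {1: 10.0, 2: 5.0, 4: 2.5}      # periodicity k -> maintenance cost per cycle
        failure_cost = 1000.0                       # consequence cost of a failure (assumed)

        def expected_loss(k, maint_cost, n_mc=100_000):
            p = posterior.rvs(n_mc, random_state=0)
            risk = 1.0 - (1.0 - p) ** k             # crude model: risk grows with the interval
            return maint_cost + failure_cost * risk.mean()

        losses = {k: expected_loss(k, c) for k, c in candidates.items()}
        print(losses, "-> optimal periodicity:", min(losses, key=losses.get))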

  19. Using polarimetric radar observations and probabilistic inference to develop the Bayesian Observationally-constrained Statistical-physical Scheme (BOSS), a novel microphysical parameterization framework

    Science.gov (United States)

    van Lier-Walqui, M.; Morrison, H.; Kumjian, M. R.; Prat, O. P.

    2016-12-01

    Microphysical parameterization schemes have reached an impressive level of sophistication: numerous prognostic hydrometeor categories, and either size-resolved (bin) particle size distributions, or multiple prognostic moments of the size distribution. Yet, uncertainty in model representation of microphysical processes and the effects of microphysics on numerical simulation of weather has not shown an improvement commensurate with the advanced sophistication of these schemes. We posit that this may be caused by unconstrained assumptions of these schemes, such as ad-hoc parameter value choices and structural uncertainties (e.g. choice of a particular form for the size distribution). We present work on development and observational constraint of a novel microphysical parameterization approach, the Bayesian Observationally-constrained Statistical-physical Scheme (BOSS), which seeks to address these sources of uncertainty. Our framework avoids unnecessary a priori assumptions, and instead relies on observations to provide probabilistic constraint of the scheme structure and sensitivities to environmental and microphysical conditions. We harness the rich microphysical information content of polarimetric radar observations to develop and constrain BOSS within a Bayesian inference framework using a Markov Chain Monte Carlo sampler (see Kumjian et al., this meeting, for details on development of an associated polarimetric forward operator). Our work shows how knowledge of microphysical processes is provided by polarimetric radar observations of diverse weather conditions, and which processes remain highly uncertain, even after considering observations.

  1. Monte Carlo Bayesian Inference on a Statistical Model of Sub-Gridcolumn Moisture Variability Using High-Resolution Cloud Observations. Part 1: Method

    Science.gov (United States)

    Norris, Peter M.; Da Silva, Arlindo M.

    2016-01-01

    A method is presented to constrain a statistical model of sub-gridcolumn moisture variability using high-resolution satellite cloud data. The method can be used for large-scale model parameter estimation or cloud data assimilation. The gridcolumn model includes assumed probability density function (PDF) intra-layer horizontal variability and a copula-based inter-layer correlation model. The observables used in the current study are Moderate Resolution Imaging Spectroradiometer (MODIS) cloud-top pressure, brightness temperature and cloud optical thickness, but the method should be extensible to direct cloudy radiance assimilation for a small number of channels. The algorithm is a form of Bayesian inference with a Markov chain Monte Carlo (MCMC) approach to characterizing the posterior distribution. This approach is especially useful in cases where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach is not gradient-based and allows jumps into regions of non-zero cloud probability. The current study uses a skewed-triangle distribution for layer moisture. The article also includes a discussion of the Metropolis and multiple-try Metropolis versions of MCMC.
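
    For orientation, a minimal random-walk Metropolis sketch for a generic one-dimensional log-posterior is given below; the target density, step size and chain length are arbitrary illustrations, not the sub-gridcolumn moisture model itself.

        # Minimal sketch: random-walk Metropolis sampling of a generic 1-D posterior.
        import numpy as np

        def log_post(x):
            # Hypothetical skewed, unnormalised target standing in for a layer-moisture posterior.
            return -0.5 * x**2 + np.log1p(np.exp(2.0 * x))

        def metropolis(log_post, x0=0.0, n=20_000, step=1.0, seed=0):
            rng = np.random.default_rng(seed)
            chain = np.empty(n)
            x, lp = x0, log_post(x0)
            for i in range(n):
                prop = x + step * rng.standard_normal()
                lp_prop = log_post(prop)
                if np.log(rng.random()) < lp_prop - lp:   # accept with probability min(1, ratio)
                    x, lp = prop, lp_prop
                chain[i] = x                              # on rejection the current state is repeated
            return chain

        chain = metropolis(log_post)
        print("posterior mean and sd estimates:", chain[5000:].mean(), chain[5000:].std())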

  2. Inferring the origin of rare fruit distillates from compositional data using multivariate statistical analyses and the identification of new flavour constituents.

    Science.gov (United States)

    Mihajilov-Krstev, Tatjana M; Denić, Marija S; Zlatković, Bojan K; Stankov-Jovanović, Vesna P; Mitić, Violeta D; Stojanović, Gordana S; Radulović, Niko S

    2015-04-01

    In Serbia, delicatessen fruit alcoholic drinks are produced from autochthonous fruit-bearing species such as cornelian cherry, blackberry, elderberry, wild strawberry, European wild apple, European blueberry and blackthorn fruits. There are no chemical data on many of these and herein we analysed volatile minor constituents of these rare fruit distillates. Our second goal was to determine possible chemical markers of these distillates through a statistical/multivariate treatment of the herein obtained and previously reported data. Detailed chemical analyses revealed a complex volatile profile of all studied fruit distillates with 371 identified compounds. A number of constituents were recognised as marker compounds for a particular distillate. Moreover, 33 of them represent newly detected flavour constituents in alcoholic beverages or, in general, in foodstuffs. With the aid of multivariate analyses, these volatile profiles were successfully exploited to infer the origin of raw materials used in the production of these spirits. It was also shown that all fruit distillates possessed weak antimicrobial properties. It seems that the aroma of these highly esteemed wild-fruit spirits depends on the subtle balance of various minor volatile compounds, whereby some of them are specific to a certain type of fruit distillate and enable their mutual distinction. © 2014 Society of Chemical Industry.

  3. IMAGINE: Interstellar MAGnetic field INference Engine

    Science.gov (United States)

    Steininger, Theo

    2018-03-01

    IMAGINE (Interstellar MAGnetic field INference Engine) performs inference on generic parametric models of the Galaxy. The modular open source framework uses highly optimized tools and technology such as the MultiNest sampler (ascl:1109.006) and the information field theory framework NIFTy (ascl:1302.013) to create an instance of the Milky Way based on a set of parameters for physical observables, using Bayesian statistics to judge the mismatch between measured data and model prediction. The flexibility of the IMAGINE framework allows for simple refitting for newly available data sets and makes state-of-the-art Bayesian methods easily accessible particularly for random components of the Galactic magnetic field.

  4. Hybrid artificial intelligence approach based on neural fuzzy inference model and metaheuristic optimization for flood susceptibility modeling in a high-frequency tropical cyclone area using GIS

    Science.gov (United States)

    Tien Bui, Dieu; Pradhan, Biswajeet; Nampak, Haleh; Bui, Quang-Thanh; Tran, Quynh-An; Nguyen, Quoc-Phi

    2016-09-01

    This paper proposes a new artificial intelligence approach based on neural fuzzy inference system and metaheuristic optimization for flood susceptibility modeling, namely MONF. In the new approach, the neural fuzzy inference system was used to create an initial flood susceptibility model and then the model was optimized using two metaheuristic algorithms, Evolutionary Genetic and Particle Swarm Optimization. A high-frequency tropical cyclone area of the Tuong Duong district in Central Vietnam was used as a case study. First, a GIS database for the study area was constructed. The database that includes 76 historical flood inundated areas and ten flood influencing factors was used to develop and validate the proposed model. Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Receiver Operating Characteristic (ROC) curve, and area under the ROC curve (AUC) were used to assess the model performance and its prediction capability. Experimental results showed that the proposed model has high performance on both the training (RMSE = 0.306, MAE = 0.094, AUC = 0.962) and validation dataset (RMSE = 0.362, MAE = 0.130, AUC = 0.911). The usability of the proposed model was evaluated by comparing with those obtained from state-of-the art benchmark soft computing techniques such as J48 Decision Tree, Random Forest, Multi-layer Perceptron Neural Network, Support Vector Machine, and Adaptive Neuro Fuzzy Inference System. The results show that the proposed MONF model outperforms the above benchmark models; we conclude that the MONF model is a new alternative tool that should be used in flood susceptibility mapping. The result in this study is useful for planners and decision makers for sustainable management of flood-prone areas.
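
    The three validation metrics quoted above (RMSE, MAE and AUC) are standard quantities and can be computed, for instance, with scikit-learn as in the sketch below; the labels and scores are made-up illustrations, not the study's data.

        # Minimal sketch: RMSE, MAE and AUC for a set of hypothetical flood predictions.
        import numpy as np
        from sklearn.metrics import mean_squared_error, mean_absolute_error, roc_auc_score

        y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 0])      # observed flood / no-flood
        y_score = np.array([0.91, 0.20, 0.75, 0.66, 0.35, 0.41, 0.83, 0.12, 0.58, 0.49])

        rmse = np.sqrt(mean_squared_error(y_true, y_score))
        mae = mean_absolute_error(y_true, y_score)
        auc = roc_auc_score(y_true, y_score)
        print(f"RMSE={rmse:.3f}  MAE={mae:.3f}  AUC={auc:.3f}")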

  5. Statistical-QoS Guaranteed Energy Efficiency Optimization for Energy Harvesting Wireless Sensor Networks.

    Science.gov (United States)

    Gao, Ya; Cheng, Wenchi; Zhang, Hailin

    2017-08-23

    Energy harvesting, which offers a never-ending energy supply, has emerged as a prominent technology to prolong the lifetime and reduce costs for the battery-powered wireless sensor networks. However, how to improve the energy efficiency while guaranteeing the quality of service (QoS) for energy harvesting based wireless sensor networks is still an open problem. In this paper, we develop statistical delay-bounded QoS-driven power control policies to maximize the effective energy efficiency (EEE), which is defined as the spectrum efficiency under given specified QoS constraints per unit harvested energy, for energy harvesting based wireless sensor networks. For the battery-infinite wireless sensor networks, our developed QoS-driven power control policy converges to the Energy harvesting Water Filling (E-WF) scheme and the Energy harvesting Channel Inversion (E-CI) scheme under the very loose and stringent QoS constraints, respectively. For the battery-finite wireless sensor networks, our developed QoS-driven power control policy becomes the Truncated energy harvesting Water Filling (T-WF) scheme and the Truncated energy harvesting Channel Inversion (T-CI) scheme under the very loose and stringent QoS constraints, respectively. Furthermore, we evaluate the outage probabilities to theoretically analyze the performance of our developed QoS-driven power control policies. The obtained numerical results validate our analysis and show that our developed optimal power control policies can optimize the EEE over energy harvesting based wireless sensor networks.

  6. Statistical optimization of harvesting Chlorella vulgaris using a novel bio-source, Strychnos potatorum

    Directory of Open Access Journals (Sweden)

    Sirajunnisa Abdul Razack

    2015-09-01

    Full Text Available The present study was aimed at harvesting the microalga Chlorella vulgaris by bioflocculation using seed powder of the clearing nut, Strychnos potatorum. The research was essentially the first step toward yielding a large biomass for utilising the cells in biodiesel production. Optimization of the parameters influencing bioflocculation was carried out statistically using RSM. The optimized conditions were 100 mg L−1 bioflocculant concentration, 35 °C temperature, 150 rpm agitation speed and 30 min incubation time, and resulted in a maximum efficiency of 99.68%. Through a cell viability test using Trypan blue stain, it was found that cells were completely intact when treated with the bioflocculant, but destroyed when exposed to the chemical flocculant alum. Overall, the study showed that S. potatorum could potentially be a bioflocculant of microalgal cells and a promising substitute for expensive and hazardous chemical flocculants. Moreover, this bioflocculant demonstrated its utility for harvesting microalgal cells economically, effectively and in an eco-friendly way.

  7. Optimization of Ficus deltoidea Using Ultrasound-Assisted Extraction by Box-Behnken Statistical Design

    Directory of Open Access Journals (Sweden)

    L. J. Ong

    2016-09-01

    Full Text Available In this study, the effect of extraction parameters (ethanol concentration, sonication time, and solvent-to-sample ratio) on Ficus deltoidea leaves was investigated using ultrasound-assisted extraction by response surface methodology (RSM). The total phenolic content (TPC) of F. deltoidea extracts was determined using the Folin-Ciocalteu method and expressed in gallic acid equivalents (GAE) per g. A Box-Behnken statistical design (BBD) was the tool used to find the optimal conditions for maximum TPC. In addition, the extraction yield was measured and expressed as a percentage. The optimized TPC attained was 455.78 mg GAE/g at 64% ethanol concentration, 10 minutes sonication time, and 20 mL/g solvent-to-sample ratio, whereas the greatest extraction yield was 33% with an ethanol concentration of 70%, a sonication time of 40 minutes, and a solvent-to-material ratio of 40 mL/g. The coefficient of determination, R2, for TPC indicates that 99.5% of the variability in the response could be explained by the ANOVA model, and the predicted R2 of 0.9681 is in reasonable agreement with the adjusted R2 of 0.9890. The present study shows that ethanol-water as solvent, a short time of 10 minutes, and an adequate solvent-to-sample ratio (20 mL/g) are the best conditions for extraction.

  8. Application of Bayesian statistical decision theory for a maintenance optimization problem

    International Nuclear Information System (INIS)

    Procaccia, H.; Cordier, R.; Muller, S.

    1997-01-01

    Reliability-centered maintenance (RCM) is a rational approach that can be used to identify the equipment of facilities that may turn out to be critical with respect to safety, to availability, or to maintenance costs. It is for these critical pieces of equipment alone that a corrective (one waits for a failure) or preventive (the type and frequency are specified) maintenance policy is established. But this approach has limitations: - when there is little operating feedback and it concerns rare events affecting a piece of equipment judged critical on a priori grounds (how is it possible, in this case, to decide whether or not it is critical, since there is a conflict between the gravity of the potential failure and its frequency?); - when the aim is to propose an optimal maintenance frequency for a critical piece of equipment - changing the maintenance frequency hitherto applied may cause a significant drift in the observed reliability of the equipment, an aspect not generally taken into account in the RCM approach. In these two situations, expert judgments can be combined with the available operating feedback (Bayesian approach) and the combination of the risk of failure and its economic consequences taken into account (statistical decision theory) to achieve a true optimization of maintenance policy choices. This paper presents an application to the maintenance of diesel generator components

  9. Statistical Analysis of Solar PV Power Frequency Spectrum for Optimal Employment of Building Loads

    Energy Technology Data Exchange (ETDEWEB)

    Olama, Mohammed M [ORNL; Sharma, Isha [ORNL; Kuruganti, Teja [ORNL; Fugate, David L [ORNL

    2017-01-01

    In this paper, a statistical analysis of the frequency spectrum of solar photovoltaic (PV) power output is conducted. This analysis quantifies the frequency content that can be used for purposes such as developing optimal employment of building loads and distributed energy resources. One year of solar PV power output data was collected and analyzed using one-second resolution to find ideal bounds and levels for the different frequency components. The annual, seasonal, and monthly statistics of the PV frequency content are computed and illustrated in boxplot format. To examine the compatibility of building loads for PV consumption, a spectral analysis of building loads such as Heating, Ventilation and Air-Conditioning (HVAC) units and water heaters was performed. This defined the bandwidth over which these devices can operate. Results show that nearly all of the PV output (about 98%) is contained within frequencies lower than 1 mHz (equivalent to ~15 min), which is compatible for consumption with local building loads such as HVAC units and water heaters. Medium frequencies in the range of ~15 min to ~1 min are likely to be suitable for consumption by fan equipment of variable air volume HVAC systems that have time constants in the range of few seconds to few minutes. This study indicates that most of the PV generation can be consumed by building loads with the help of proper control strategies, thereby reducing impact on the grid and the size of storage systems.
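
    The frequency-content calculation underlying these statistics can be sketched with an FFT of a one-second-resolution series; the synthetic diurnal-plus-noise signal below is only a stand-in for the measured PV output.

        # Minimal sketch: fraction of PV variability below 1 mHz from a 1-s resolution series.
        import numpy as np

        rng = np.random.default_rng(3)
        t = np.arange(24 * 3600)                                                 # one day, 1-s steps
        diurnal = np.clip(np.sin(2 * np.pi * (t / 86400.0 - 0.25)), 0, None)     # slow solar envelope
        clouds = 0.05 * rng.standard_normal(t.size)                              # fast fluctuations
        pv = diurnal + clouds

        spec = np.abs(np.fft.rfft(pv - pv.mean())) ** 2
        freqs = np.fft.rfftfreq(pv.size, d=1.0)                                  # sampling interval 1 s
        low = spec[freqs < 1e-3].sum() / spec.sum()
        print(f"fraction of variance below 1 mHz (periods of ~17 min and longer): {low:.3f}")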

  10. Validation of statistical assessment method for the optimization of the inspection need for nuclear steam generators

    International Nuclear Information System (INIS)

    Wallin, K.; Voskamp, R.; Schmibauer, J.; Ostermeyer, H.; Nagel, G.

    2011-01-01

    The cost of steam generator inspections in nuclear power plants is high. A new quantitative assessment methodology for the accumulation of flaws due to stochastic causes such as fretting has been developed for cases where limited inspection data are available. Additionally, a new quantitative assessment methodology for the accumulation of environment-related flaws, caused e.g. by corrosion in steam generator tubes, has been developed. The method that combines deterministic information regarding flaw initiation and growth with stochastic elements connected to environmental aspects requires only knowledge of the experimental flaw accumulation history. The method, combining both flaw types, provides a complete description of the flaw accumulation, and it has several possible uses. It can be used to evaluate the total life expectancy of the steam generator, and simple statistically defined plugging criteria can be established based on flaw behaviour. In this way the inspection interval and inspection coverage can be optimized with respect to allowable flaws, and the method can recognize flaw-type subsets requiring more frequent inspection intervals. The method can also be used to develop statistically realistic safety factors accounting for uncertainties in inspection flaw sizing and detection. The statistical assessment method has been shown to be robust and insensitive to different assessments of plugged tubes. Because the procedure is re-calibrated after each inspection, it reacts effectively to possible changes in the steam generator environment. Validation of the assessment method is provided for real steam generators, both in the case of stochastic damage and of environment-related flaws. (authors)

  11. Artificial intelligence versus statistical modeling and optimization of continuous bead milling process for bacterial cell lysis

    Directory of Open Access Journals (Sweden)

    Shafiul Haque

    2016-11-01

    Full Text Available For a commercially viable recombinant intracellular protein production process, efficient cell lysis and protein release is a major bottleneck. The recovery of a recombinant protein, cholesterol oxidase (COD), was studied in a continuous bead milling process. A full factorial Response Surface Model (RSM) design was employed and compared to Artificial Neural Networks coupled with a Genetic Algorithm (ANN-GA). Significant process variables, cell slurry feed rate (A), bead load (B), cell load (C) and run time (D), were investigated and optimized for maximizing COD recovery. RSM predicted an optimum feed rate of 310.73 mL/h, bead loading of 79.9% (v/v), cell loading OD600 nm of 74, and run time of 29.9 min with a recovery of ~3.2 g/L. ANN coupled with GA predicted a maximum COD recovery of ~3.5 g/L at an optimum feed rate of 258.08 mL/h, bead loading of 80% (v/v), cell loading (OD600 nm) of 73.99, and run time of 32 min. An overall 3.7-fold increase in productivity is obtained when compared to a batch process. Optimization and comparison of statistical versus artificial intelligence techniques in a continuous bead milling process has been attempted for the very first time in our study. We were able to successfully represent the complex non-linear multivariable dependence of enzyme recovery on bead milling parameters. The quadratic second-order response functions are not flexible enough to represent such complex non-linear dependence. ANNs, being summation functions of multiple layers, are capable of representing the complex non-linear dependence of variables, in this case enzyme recovery as a function of bead milling parameters. Since GA can even optimize discontinuous functions, the present study provides a perfect example of using machine learning (ANN) in combination with evolutionary optimization (GA) for representing undefined biological functions, which is the case for common industrial processes involving biological moieties.

  12. A statistical inference approach for the retrieval of the atmospheric ozone profile from simulated satellite measurements of solar backscattered ultraviolet radiation

    Science.gov (United States)

    Bonavito, N. L.; Gordon, C. L.; Inguva, R.; Serafino, G. N.; Barnes, R. A.

    1994-01-01

    NASA's Mission to Planet Earth (MTPE) will address important interdisciplinary and environmental issues such as global warming, ozone depletion, deforestation, acid rain, and the like with its long term satellite observations of the Earth and with its comprehensive Data and Information System. Extensive sets of satellite observations supporting MTPE will be provided by the Earth Observing System (EOS), while more specific process related observations will be provided by smaller Earth Probes. MTPE will use data from ground and airborne scientific investigations to supplement and validate the global observations obtained from satellite imagery, while the EOS satellites will support interdisciplinary research and model development. This is important for understanding the processes that control the global environment and for improving the prediction of events. In this paper we illustrate the potential for powerful artificial intelligence (AI) techniques when used in the analysis of the formidable problems that exist in the NASA Earth Science programs and of those to be encountered in the future MTPE and EOS programs. These techniques, based on the logical and probabilistic reasoning aspects of plausible inference, strongly emphasize the synergetic relation between data and information. As such, they are ideally suited for the analysis of the massive data streams to be provided by both MTPE and EOS. To demonstrate this, we address both the satellite imagery and model enhancement issues for the problem of ozone profile retrieval through a method based on plausible scientific inferencing. Since in the retrieval problem, the atmospheric ozone profile that is consistent with a given set of measured radiances may not be unique, an optimum statistical method is used to estimate a 'best' profile solution from the radiances and from additional a priori information.

  13. Statistically Optimized Inversion Algorithm for Enhanced Retrieval of Aerosol Properties from Spectral Multi-Angle Polarimetric Satellite Observations

    Science.gov (United States)

    Dubovik, O; Herman, M.; Holdak, A.; Lapyonok, T.; Taure, D.; Deuze, J. L.; Ducos, F.; Sinyuk, A.

    2011-01-01

    The proposed development is an attempt to enhance aerosol retrieval by emphasizing statistical optimization in inversion of advanced satellite observations. This optimization concept improves retrieval accuracy relying on the knowledge of measurement error distribution. Efficient application of such optimization requires pronounced data redundancy (excess of the measurements number over number of unknowns) that is not common in satellite observations. The POLDER imager on board the PARASOL microsatellite registers spectral polarimetric characteristics of the reflected atmospheric radiation at up to 16 viewing directions over each observed pixel. The completeness of such observations is notably higher than for most currently operating passive satellite aerosol sensors. This provides an opportunity for profound utilization of statistical optimization principles in satellite data inversion. The proposed retrieval scheme is designed as statistically optimized multi-variable fitting of all available angular observations obtained by the POLDER sensor in the window spectral channels where absorption by gas is minimal. The total number of such observations by PARASOL always exceeds a hundred over each pixel and the statistical optimization concept promises to be efficient even if the algorithm retrieves several tens of aerosol parameters. Based on this idea, the proposed algorithm uses a large number of unknowns and is aimed at retrieval of extended set of parameters affecting measured radiation.
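
    The statistically optimized fitting idea, weighting the misfit by the measurement error covariance and adding a priori constraints so that a large number of noisy angular observations can support many unknowns, reduces in the linear Gaussian case to the sketch below; the toy Jacobian, covariances and dimensions are assumptions, not the operational POLDER/PARASOL retrieval code.

        # Minimal sketch: one linear step of statistically optimized (MAP) fitting.
        import numpy as np

        rng = np.random.default_rng(4)
        m, n = 120, 12                          # many angular/spectral measurements, fewer unknowns
        K = rng.standard_normal((m, n))         # Jacobian of a linearised forward model (toy)
        x_true = rng.standard_normal(n)
        y = K @ x_true + 0.05 * rng.standard_normal(m)

        S_eps = 0.05**2 * np.eye(m)             # measurement error covariance (assumed)
        x_a, S_a = np.zeros(n), 1.0**2 * np.eye(n)   # a priori state and covariance (assumed)

        # Minimise (y - Kx)' S_eps^-1 (y - Kx) + (x - x_a)' S_a^-1 (x - x_a).
        W, A = np.linalg.inv(S_eps), np.linalg.inv(S_a)
        x_hat = np.linalg.solve(K.T @ W @ K + A, K.T @ W @ y + A @ x_a)
        print("rms retrieval error:", np.sqrt(np.mean((x_hat - x_true) ** 2)))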

  14. Optimal Adaptive Statistical Iterative Reconstruction Percentage in Dual-energy Monochromatic CT Portal Venography.

    Science.gov (United States)

    Zhao, Liqin; Winklhofer, Sebastian; Yang, Zhenghan; Wang, Keyang; He, Wen

    2016-03-01

    The aim of this article was to study the influence of different adaptive statistical iterative reconstruction (ASIR) percentages on the image quality of dual-energy computed tomography (DECT) portal venography in portal hypertension patients. DECT scans of 40 patients with cirrhosis (mean age, 56 years) at the portal venous phase were retrospectively analyzed. Monochromatic images at 60 and 70 keV were reconstructed with four ASIR percentages: 0%, 30%, 50%, and 70%. Computed tomography (CT) numbers of the portal veins (PVs), liver parenchyma, and subcutaneous fat tissue in the abdomen were measured. The standard deviation from the region of interest of the liver parenchyma was interpreted as the objective image noise (IN). The contrast-noise ratio (CNR) between PV and liver parenchyma was calculated. The diagnostic acceptability (DA) and sharpness of PV margins were obtained using a 5-point score. The IN, CNR, DA, and sharpness of PV were compared among the eight groups with different keV + ASIR level combinations. The IN, CNR, DA, and sharpness of PV of different keV + ASIR groups were all statistically different (P ASIR and 70 keV + 0% ASIR (filtered back-projection [FBP]) combination, respectively, whereas the largest and smallest objective IN were obtained in the 60 keV + 0% ASIR (FBP) and 70 keV + 70% combination. The highest DA and sharpness values of PV were obtained at 50% ASIR for 60 keV. An optimal ASIR percentage (50%) combined with an appropriate monochromatic energy level (60 keV) provides the highest DA in portal venography imaging, whereas for the higher monochromatic energy (70 keV) images, 30% ASIR provides the highest image quality, with less IN than 60 keV with 50% ASIR. Copyright © 2015 The Association of University Radiologists. Published by Elsevier Inc. All rights reserved.

  15. Statistical design of personalized medicine interventions: The Clarification of Optimal Anticoagulation through Genetics (COAG) trial

    Directory of Open Access Journals (Sweden)

    Gage Brian F

    2010-11-01

    Full Text Available Background: There is currently much interest in pharmacogenetics: determining variation in genes that regulate drug effects, with a particular emphasis on improving drug safety and efficacy. The ability to determine such variation motivates the application of personalized drug therapies that utilize a patient's genetic makeup to determine a safe and effective drug at the correct dose. To ascertain whether a genotype-guided drug therapy improves patient care, a personalized medicine intervention may be evaluated within the framework of a randomized controlled trial. The statistical design of this type of personalized medicine intervention requires special considerations: the distribution of relevant allelic variants in the study population; and whether the pharmacogenetic intervention is equally effective across subpopulations defined by allelic variants. Methods: The statistical design of the Clarification of Optimal Anticoagulation through Genetics (COAG) trial serves as an illustrative example of a personalized medicine intervention that uses each subject's genotype information. The COAG trial is a multicenter, double-blind, randomized clinical trial that will compare two approaches to initiation of warfarin therapy: genotype-guided dosing, the initiation of warfarin therapy based on algorithms using clinical information and genotypes for polymorphisms in CYP2C9 and VKORC1; and clinical-guided dosing, the initiation of warfarin therapy based on algorithms using only clinical information. Results: We determine an absolute minimum detectable difference of 5.49% based on an assumed 60% population prevalence of zero or multiple genetic variants in either CYP2C9 or VKORC1 and an assumed 15% relative effectiveness of genotype-guided warfarin initiation for those with zero or multiple genetic variants. Thus we calculate a sample size of 1238 to achieve a power level of 80% for the primary outcome. We show that reasonable departures from these
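
    For orientation only, the normal-approximation sample-size formula behind statements of this kind is sketched below. The outcome standard deviation is an assumed placeholder, and the sketch is not intended to reproduce the trial's figure of 1238, which also reflects the subpopulation considerations described above.

        # Minimal sketch: two-arm sample size for a difference in means (illustrative numbers).
        from scipy import stats

        alpha, power = 0.05, 0.80
        delta = 5.49                 # minimum detectable difference (percentage points)
        sigma = 30.0                 # assumed common standard deviation of the outcome

        z_a = stats.norm.ppf(1 - alpha / 2)
        z_b = stats.norm.ppf(power)
        n_per_arm = 2 * ((z_a + z_b) * sigma / delta) ** 2
        print("approximate total sample size:", 2 * round(n_per_arm))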

  16. Human Inferences about Sequences: A Minimal Transition Probability Model.

    Directory of Open Access Journals (Sweden)

    Florent Meyniel

    2016-12-01

    Full Text Available The brain constantly infers the causes of the inputs it receives and uses these inferences to generate statistical expectations about future observations. Experimental evidence for these expectations and their violations includes explicit reports, sequential effects on reaction times, and mismatch or surprise signals recorded in electrophysiology and functional MRI. Here, we explore the hypothesis that the brain acts as a near-optimal inference device that constantly attempts to infer the time-varying matrix of transition probabilities between the stimuli it receives, even when those stimuli are in fact fully unpredictable. This parsimonious Bayesian model, with a single free parameter, accounts for a broad range of findings on surprise signals, sequential effects and the perception of randomness. Notably, it explains the pervasive asymmetry between repetitions and alternations encountered in those studies. Our analysis suggests that a neural machinery for inferring transition probabilities lies at the core of human sequence knowledge.
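
    A minimal sketch of the kind of model described, Bayesian tracking of transition probabilities with a single forgetting parameter, is given below; the stimulus sequence, priors and leak value are illustrative assumptions rather than the paper's fitted model.

        # Minimal sketch: leaky Bayesian estimation of transition probabilities
        # from a binary sequence, with per-observation surprise.
        import numpy as np

        rng = np.random.default_rng(5)
        seq = rng.integers(0, 2, size=500)          # fully unpredictable stimuli (0/1)
        omega = 0.02                                # assumed forgetting (leak) rate

        counts = np.ones((2, 2))                    # Beta(1,1)-like pseudo-counts: counts[prev, next]
        surprise = []
        for prev, nxt in zip(seq[:-1], seq[1:]):
            p = counts[prev, 1] / counts[prev].sum()        # predictive probability of a "1"
            surprise.append(-np.log2(p if nxt == 1 else 1.0 - p))
            counts *= (1.0 - omega)                         # exponential forgetting of old evidence
            counts[prev, nxt] += 1.0

        print("mean surprise (bits):", np.mean(surprise))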

  17. Inference as Prediction

    Science.gov (United States)

    Watson, Jane

    2007-01-01

    Inference, or decision making, is seen in curriculum documents as the final step in a statistical investigation. For a formal statistical enquiry this may be associated with sophisticated tests involving probability distributions. For young students without the mathematical background to perform such tests, it is still possible to draw informal…

  18. Acid hydrolysis of corn stover using hydrochloric acid: Kinetic modeling and statistical optimization

    Directory of Open Access Journals (Sweden)

    Sun Yong

    2014-01-01

    Full Text Available The hydrolysis of corn stover using hydrochloric acid was studied. The kinetic parameters of the mathematical models for predicting the yields of xylose, glucose, furfural and acetic acid were obtained, and the corresponding xylose generation activation energy of 100 kJ/mol was determined. The characterization of corn stover with different techniques during hydrolysis indicated an effective removal of xylan and a slight alteration of the structures of cellulose and lignin. A 2³ five-level central composite design (CCD) was used to develop a statistical model for the optimization of process variables, including acid concentration, pretreatment temperature and time. The optimum conditions determined by this model were found to be 108ºC for 80 minutes with an acid concentration of 5.8%. Under these conditions, the maximized yields were as follows: xylose 19.93 g/L, glucose 1.2 g/L, furfural 1.5 g/L and acetic acid 1.3 g/L. The validation of the model indicates good agreement between the experimental results and the predicted values.

  19. Automatic generation of 3D statistical shape models with optimal landmark distributions.

    Science.gov (United States)

    Heimann, T; Wolf, I; Meinzer, H-P

    2007-01-01

    To point out the problem of non-uniform landmark placement in statistical shape modeling, to present an improved method for generating landmarks in the 3D case and to propose an unbiased evaluation metric to determine model quality. Our approach minimizes a cost function based on the minimum description length (MDL) of the shape model to optimize landmark correspondences over the training set. In addition to the standard technique, we employ an extended remeshing method to change the landmark distribution without losing correspondences, thus ensuring a uniform distribution over all training samples. To break the dependency of the established evaluation measures generalization and specificity from the landmark distribution, we change the internal metric from landmark distance to volumetric overlap. Redistributing landmarks to an equally spaced distribution during the model construction phase improves the quality of the resulting models significantly if the shapes feature prominent bulges or other complex geometry. The distribution of landmarks on the training shapes is -- beyond the correspondence issue -- a crucial point in model construction.

  20. Statistical Optimization of Tannase Production by Penicillium sp. EZ-ZH390 in Submerged Fermentation

    Directory of Open Access Journals (Sweden)

    Zohreh Hamidi-Esfahani

    2015-06-01

    Full Text Available Tannase has several important applications in the food, feed, chemical and pharmaceutical industries. In the present study, production of tannase by the mutant strain Penicillium sp. EZ-ZH390 was optimized in submerged fermentation utilizing two statistical approaches. In the first step, a one-factor-at-a-time design was employed to screen the preferable nutrients (carbon and nitrogen sources) of the medium for tannase production. Screening of the carbon source resulted in the production of 10.74 U/mL of tannase in 72 h in the presence of 14% raspberry leaf powder. A 1.99-fold increase in tannase production was achieved upon further screening of the nitrogen source (in the presence of 1.2% ammonium nitrate). Then the culture condition variables were studied by response surface methodology using a central composite design. The results showed that a temperature of 30°C, a rotation rate of 85 rpm and a fermentation time of 24 h led to increased tannase production. Under these conditions, tannase activity reached 21.77 U/mL, and tannase productivity (0.26 U/mL·h) was at least 3.55 times higher than values reported in the literature. The present study showed that, under the optimum conditions, Penicillium sp. EZ-ZH390 is an excellent strain for use in the efficient production of tannase.

  1. Statistical optimization for tannase production from Aspergillus niger under submerged fermentation.

    Science.gov (United States)

    Sharma, S; Agarwal, L; Saxena, R K

    2007-06-01

    Statistically based experimental design was employed for the optimization of fermentation conditions for maximum production of the enzyme tannase from Aspergillus niger. A central composite rotatable design (CCRD), falling under response surface methodology (RSM), was used. Based on the results of the 'one-at-a-time' approach in submerged fermentation, the most influential factors for tannase production from A. niger were the concentrations of tannic acid and sodium nitrate, the agitation rate and the incubation period. Hence, to achieve the maximum yield of tannase, the interaction of these factors was studied at the optimum production pH of 5.0 by RSM. The optimum parameter values obtained through RSM were 5% tannic acid, 0.8% sodium nitrate, pH 5.0, 5 × 10⁷ spores/50 mL inoculum density, 150 rpm agitation and an incubation period of 48 h, which resulted in production of 19.7 U mL⁻¹ of the enzyme. This activity was almost double that obtained by the 'one-at-a-time' approach (9.8 U mL⁻¹).

  2. Diameter optimization of VLS-synthesized ZnO nanowires, using statistical design of experiment

    International Nuclear Information System (INIS)

    Shafiei, Sepideh; Nourbakhsh, Amirhasan; Ganjipour, Bahram; Zahedifar, Mostafa; Vakili-Nezhaad, Gholamreza

    2007-01-01

    The possibility of diameter optimization of ZnO nanowires by using statistical design of experiments (DoE) is investigated. In this study, nanowires were synthesized using a vapor-liquid-solid (VLS) growth method in a horizontal reactor. The effects of six synthesis parameters (synthesis time, synthesis temperature, thickness of gold layer, distance between ZnO holder and substrate, mass of ZnO and Ar flow rate) on the average diameter of a ZnO nanowire were examined using fractional factorial design (FFD) coupled with response surface methodology (RSM). Using a 2^(6-3) resolution III FFD, the main effects of the thickness of the gold layer, synthesis temperature and synthesis time were concluded to be the key factors influencing the diameter. Then a Box-Behnken design (BBD) was exploited to create a response surface from the main factors. The total number of required runs for the DoE process is 25: 8 runs for FFD parameter screening and 17 runs for the response surface obtained by the BBD. Three extra runs are done to confirm the predicted results.
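
    The screening design mentioned here, a 2^(6-3) resolution III fractional factorial, assigns six factors to only eight runs. The sketch below builds such a design with the standard generators D = AB, E = AC and F = BC; the paper does not state its exact generator choice, so these, and the run order, are assumptions.

        # Minimal sketch: the eight runs of a 2^(6-3) resolution III fractional factorial design.
        import itertools
        import numpy as np

        base = np.array(list(itertools.product([-1, 1], repeat=3)))   # full 2^3 design in A, B, C
        A, B, C = base.T
        design = np.column_stack([A, B, C, A * B, A * C, B * C])      # generated factors D, E, F

        factors = ["time", "temperature", "Au thickness", "distance", "ZnO mass", "Ar flow"]
        print("  ".join(f"{name:>12s}" for name in factors))
        for run in design:
            print("  ".join(f"{int(level):>12d}" for level in run))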

  3. Morphology optimization of CCVD-synthesized multiwall carbon nanotubes, using statistical design of experiments

    International Nuclear Information System (INIS)

    Nourbakhsh, Amirhasan; Ganjipour, Bahram; Zahedifar, Mostafa; Arzi, Ezatollah

    2007-01-01

    The possibility of optimization of morphological features of multiwall carbon nanotubes (MWCNTs) using statistical design of experiments (DoE) is investigated. In this study, MWCNTs were synthesized using a catalytic chemical vapour deposition (CCVD) method in a horizontal reactor using acetylene as the carbon source. The effects of six synthesis parameters (synthesis time, synthesis temperature, catalyst mass, reduction time, acetylene flow rate and hydrogen flow rate) on the average diameter and mean rectilinear length (MRL) of the carbon nanotubes were examined using fractional factorial design (FFD) coupled with response surface methodology (RSM). Using a 2^(6-3) resolution III FFD, the main effects of reaction temperature, hydrogen flow rate and chemical reduction time were concluded to be the key factors influencing the diameter and MRL of the MWCNTs; then a Box-Behnken design (BBD) was exploited to create a response surface from the main factors. The total number of required runs is 26: 8 runs are for FFD parameter screening, 17 runs are for the response surface obtained by the BBD, and the final run is used to confirm the predicted results.

  4. Statistical optimization of microencapsulation process for coating of magnesium particles with Viton polymer

    Energy Technology Data Exchange (ETDEWEB)

    Pourmortazavi, Seied Mahdi, E-mail: pourmortazavi@yahoo.com [Faculty of Material and Manufacturing Technologies, Malek Ashtar University of Technology, P.O. Box 16765-3454, Tehran (Iran, Islamic Republic of); Babaee, Saeed; Ashtiani, Fatemeh Shamsi [Faculty of Chemistry & Chemical Engineering, Malek Ashtar University of Technology, Tehran (Iran, Islamic Republic of)

    2015-09-15

    Graphical abstract: - Highlights: • The surface of magnesium particles was modified with Viton via a solvent/non-solvent method. • FT-IR, SEM, EDX, elemental mapping, and TG/DSC techniques were employed to characterize the coated particles. • The coating process factors were optimized by Taguchi robust design. • The importance of the coating conditions for the resistance of coated magnesium against oxidation was studied. - Abstract: The surface of magnesium particles was modified by coating with Viton, an energetic polymer, using the solvent/non-solvent technique. The Taguchi robust method was utilized as a statistical experiment design to evaluate the role of the coating process parameters. The coated magnesium particles were characterized by various techniques, i.e., Fourier transform infrared (FT-IR) spectroscopy, scanning electron microscopy (SEM), energy-dispersive X-ray spectroscopy (EDX), thermogravimetry (TG), and differential scanning calorimetry (DSC). The results showed that coating the magnesium powder with Viton leads to a higher resistance of the metal against oxidation in air. Meanwhile, tuning of the coating process parameters (i.e., percentage of Viton, flow rate of non-solvent addition, and type of solvent) influences the resistance of the metal particles against thermal oxidation. Coating the magnesium particles yields Viton-coated particles with higher thermal stability (632 °C) in comparison with the pure magnesium powder, which commences oxidation in air at the lower temperature of 260 °C.
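
    As a hedged illustration of the Taguchi robust design workflow described above (not the authors' array or measurements), the snippet below assigns three coating factors to a standard L9(3^4) orthogonal array and computes larger-is-better signal-to-noise ratios from hypothetical replicate responses.

      import numpy as np

      # Standard L9(3^4) orthogonal array (levels coded 1..3); only the first three
      # columns are used here, one per coating factor.
      L9 = np.array([
          [1, 1, 1, 1], [1, 2, 2, 2], [1, 3, 3, 3],
          [2, 1, 2, 3], [2, 2, 3, 1], [2, 3, 1, 2],
          [3, 1, 3, 2], [3, 2, 1, 3], [3, 3, 2, 1],
      ])

      # Hypothetical replicate measurements of a "larger is better" response
      # (e.g. oxidation onset temperature) for each of the nine runs.
      rng = np.random.default_rng(1)
      y = rng.normal(loc=550, scale=30, size=(9, 3))

      # Taguchi signal-to-noise ratio, larger-is-better form.
      sn = -10 * np.log10(np.mean(1.0 / y ** 2, axis=1))

      factors = ["% Viton", "non-solvent flow rate", "solvent type"]
      for j, name in enumerate(factors):
          means = [sn[L9[:, j] == level].mean() for level in (1, 2, 3)]
          print(name, "mean S/N per level:", np.round(means, 2))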

  5. Logical inference and evaluation

    International Nuclear Information System (INIS)

    Perey, F.G.

    1981-01-01

    Most methodologies of evaluation currently used are based upon the theory of statistical inference. It is generally perceived that this theory is not capable of dealing satisfactorily with what are called systematic errors. Theories of logical inference should be capable of treating all of the information available, including that not involving frequency data. A theory of logical inference is presented as an extension of deductive logic via the concept of plausibility and the application of group theory. Some conclusions, based upon the application of this theory to evaluation of data, are also given

  6. Strain improvement and statistical optimization as a combined strategy for improving fructosyltransferase production by Aureobasidium pullulans NAC8

    Directory of Open Access Journals (Sweden)

    Adedeji Nelson Ademakinwa

    2017-12-01

    A relatively low FTase-producing strain of Aureobasidium pullulans NAC8 was enhanced for optimum production using a two-pronged approach involving mutagenesis and statistical optimization. The improved mutant strain also had remarkable biotechnological properties that make it a suitable alternative to the wild-type.

  7. Using Artificial Intelligence to Retrieve the Optimal Parameters and Structures of Adaptive Network-Based Fuzzy Inference System for Typhoon Precipitation Forecast Modeling

    Directory of Open Access Journals (Sweden)

    Chien-Lin Huang

    2015-01-01

    This study aims to construct a typhoon precipitation forecast model providing forecasts one to six hours in advance, using optimal model parameters and structures retrieved from a combination of the adaptive network-based fuzzy inference system (ANFIS) and artificial intelligence. To enhance the accuracy of the precipitation forecast, two structures were used to establish the precipitation forecast model for a specific lead time: a single-model structure and a dual-model hybrid structure in which the forecast models for higher and lower precipitation were integrated. In order to rapidly, automatically, and accurately retrieve the optimal parameters and structures of the ANFIS-based precipitation forecast model, a tabu search was applied to identify the adjacent radius in subtractive clustering when constructing the ANFIS structure. A coupled structure was also employed to establish a precipitation forecast model across short and long lead times in order to improve the accuracy of long-term precipitation forecasts. The study area is the Shimen Reservoir, and the analyzed period is from 2001 to 2009. Results showed that the optimal initial ANFIS parameters selected by the tabu search, combined with the dual-model hybrid method and the coupled structure, provided favorable computational efficiency and highly reliable predictions in typhoon precipitation forecasting over short to long lead times.

  8. A Statistical Framework for Microbial Source Attribution: Measuring Uncertainty in Host Transmission Events Inferred from Genetic Data (Part 2 of a 2 Part Report)

    Energy Technology Data Exchange (ETDEWEB)

    Allen, J; Velsko, S

    2009-11-16

    This report explores the question of whether meaningful conclusions can be drawn regarding the transmission relationship between two microbial samples on the basis of differences observed between the two samples' respective genomes. Unlike similar forensic applications using human DNA, the rapid rate of microbial genome evolution combined with the dynamics of infectious disease requires a shift in thinking about what it means for two samples to 'match' in support of a forensic hypothesis. Previous outbreaks of SARS-CoV, FMDV and HIV were examined to investigate how microbial sequence data can be used to draw inferences that link two infected individuals by direct transmission. The results are counterintuitive with respect to human DNA forensic applications in that some genetic change, rather than exact matching, improves confidence in inferring direct transmission links; however, too much genetic change poses challenges, which can weaken confidence in inferred links. High rates of infection coupled with the relatively weak selective pressure observed in the SARS-CoV and FMDV data lead to fairly low confidence for direct transmission links. Confidence values for forensic hypotheses increased when testing for the possibility that samples are separated by at most a few intermediate hosts. Moreover, the observed outbreak conditions support the potential to provide high confidence values for hypotheses that exclude direct transmission links. Transmission inferences are based on the total number of observed or inferred genetic changes separating two sequences rather than uniquely weighing the importance of any one genetic mismatch. Thus, inferences are surprisingly robust in the presence of sequencing errors provided the error rates are randomly distributed across all samples in the reference outbreak database and the novel sequence samples in question. When the number of observed nucleotide mutations is limited due to characteristics of the

  9. Selecting statistical models and variable combinations for optimal classification using otolith microchemistry.

    Science.gov (United States)

    Mercier, Lény; Darnaude, Audrey M; Bruguier, Olivier; Vasconcelos, Rita P; Cabral, Henrique N; Costa, Maria J; Lara, Monica; Jones, David L; Mouillot, David

    2011-06-01

    Reliable assessment of fish origin is of critical importance for exploited species, since nursery areas must be identified and protected to maintain recruitment to the adult stock. During the last two decades, otolith chemical signatures (or "fingerprints") have been increasingly used as tools to discriminate between coastal habitats. However, correct assessment of fish origin from otolith fingerprints depends on various environmental and methodological parameters, including the choice of the statistical method used to assign fish of unknown origin. Among the available methods of classification, Linear Discriminant Analysis (LDA) is the most frequently used, although it assumes data are multivariate normal with homogeneous within-group dispersions, conditions that are not always met by otolith chemical data, even after transformation. Other less constrained classification methods are available, but there is a current lack of comparative analysis in applications to otolith microchemistry. Here, we assessed stock identification accuracy for four classification methods (LDA, Quadratic Discriminant Analysis [QDA], Random Forests [RF], and Artificial Neural Networks [ANN]), through the use of three distinct data sets. In each case, all possible combinations of chemical elements were examined to identify the elements to be used for optimal accuracy in fish assignment to their actual origin. Our study shows that accuracy differs according to the model and the number of elements considered. Best combinations did not include all the elements measured, and it was not possible to define an ad hoc multielement combination for accurate site discrimination. Among all the models tested, RF and ANN performed best, especially for complex data sets (e.g., with numerous fish species and/or chemical elements involved). However, for these data, RF was less time-consuming and more interpretable than ANN, and far more efficient and less demanding in terms of assumptions than LDA or QDA.
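
    A minimal sketch of the kind of model-and-element-combination comparison described above, using scikit-learn with randomly generated stand-in data; the element names, sample sizes and labels are placeholders, not the study's data sets.

      import itertools
      import numpy as np
      from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                                 QuadraticDiscriminantAnalysis)
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.neural_network import MLPClassifier
      from sklearn.model_selection import cross_val_score

      # Hypothetical otolith data: rows = fish, columns = element ratios, y = site label.
      rng = np.random.default_rng(0)
      elements = ["Sr", "Ba", "Mn", "Mg"]
      X = rng.normal(size=(120, len(elements)))
      y = rng.integers(0, 3, size=120)

      models = {
          "LDA": LinearDiscriminantAnalysis(),
          "QDA": QuadraticDiscriminantAnalysis(),
          "RF": RandomForestClassifier(n_estimators=200, random_state=0),
          "ANN": MLPClassifier(hidden_layer_sizes=(20,), max_iter=2000, random_state=0),
      }

      # For every classifier, try every element combination and keep the best
      # cross-validated accuracy (what the study does on real data sets).
      for name, model in models.items():
          best_acc, best_subset = 0.0, None
          for r in range(1, len(elements) + 1):
              for subset in itertools.combinations(range(len(elements)), r):
                  acc = cross_val_score(model, X[:, list(subset)], y, cv=5).mean()
                  if acc > best_acc:
                      best_acc, best_subset = acc, [elements[i] for i in subset]
          print(name, "best CV accuracy %.2f with" % best_acc, best_subset)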

  10. Establishment, maintenance and application of failure statistics as a basis for availability optimization

    International Nuclear Information System (INIS)

    Poll, H.

    1989-01-01

    The purpose of failure statistics is to obtain hints on weak points due to operation and design. The present failure statistics of Rheinisch-Westfaelisches Elektrizitaetswerk (RWE) are based on events reducing the availability of power station units. If damage or trouble occurs with a unit, data are recorded in order to calculate the unavailability and to describe the occurrence, the extent, and the removal of the damage. Following a survey of the most important data, a short explanation is given of the updating of the failure statistics, and some problems of this task are mentioned. Finally, some examples are given of how failure statistics can be used for analyses. (orig.)

  11. Entropic Inference

    Science.gov (United States)

    Caticha, Ariel

    2011-03-01

    In this tutorial we review the essential arguments behind entropic inference. We focus on the epistemological notion of information and its relation to the Bayesian beliefs of rational agents. The problem of updating from a prior to a posterior probability distribution is tackled through an eliminative induction process that singles out the logarithmic relative entropy as the unique tool for inference. The resulting method of Maximum relative Entropy (ME) includes as special cases both MaxEnt and Bayes' rule, and therefore unifies the two themes of these workshops, the Maximum Entropy and the Bayesian methods, into a single general inference scheme.
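
    A small numerical illustration of the relative-entropy updating idea reviewed above, for a discrete prior and a single expectation constraint; the prior, constraint function and target value are arbitrary choices, not taken from the tutorial.

      import numpy as np
      from scipy.optimize import brentq

      # Prior beliefs q over discrete states and a constraint on the expectation of f.
      q = np.array([0.25, 0.25, 0.25, 0.25])
      f = np.array([0.0, 1.0, 2.0, 3.0])
      F = 2.0                                   # required expected value of f

      # Entropic update: the distribution closest to q in relative entropy subject to
      # E_p[f] = F has the exponential form p_i proportional to q_i * exp(lam * f_i).
      def posterior(lam):
          w = q * np.exp(lam * f)
          return w / w.sum()

      lam = brentq(lambda l: posterior(l) @ f - F, -50, 50)   # solve the constraint
      p = posterior(lam)
      rel_entropy = np.sum(p * np.log(p / q))                 # logarithmic relative entropy
      print("posterior:", np.round(p, 3),
            " E[f]=%.3f  lambda=%.3f  S[p|q]=%.3f" % (p @ f, lam, rel_entropy))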

  12. Variational inference & deep learning : A new synthesis

    NARCIS (Netherlands)

    Kingma, D.P.

    2017-01-01

    In this thesis, Variational Inference and Deep Learning: A New Synthesis, we propose novel solutions to the problems of variational (Bayesian) inference, generative modeling, representation learning, semi-supervised learning, and stochastic optimization.

  13. Variational inference & deep learning: A new synthesis

    OpenAIRE

    Kingma, D.P.

    2017-01-01

    In this thesis, Variational Inference and Deep Learning: A New Synthesis, we propose novel solutions to the problems of variational (Bayesian) inference, generative modeling, representation learning, semi-supervised learning, and stochastic optimization.

  14. Parameter optimization in biased decoy-state quantum key distribution with both source errors and statistical fluctuations

    Science.gov (United States)

    Zhu, Jian-Rong; Li, Jian; Zhang, Chun-Mei; Wang, Qin

    2017-10-01

    The decoy-state method has been widely used in commercial quantum key distribution (QKD) systems. In view of practical decoy-state QKD with both source errors and statistical fluctuations, we propose a universal model for full parameter optimization in biased decoy-state QKD with phase-randomized sources. Besides, we adopt this model to carry out simulations of two widely used sources: the weak coherent source (WCS) and the heralded single-photon source (HSPS). Results show that full parameter optimization can significantly improve not only the secure transmission distance but also the final key generation rate. Moreover, when source errors and statistical fluctuations are taken into account, the performance of decoy-state QKD using an HSPS suffers less than that of decoy-state QKD using a WCS.

  15. Prolonged release matrix tablet of pyridostigmine bromide: formulation and optimization using statistical methods.

    Science.gov (United States)

    Bolourchian, Noushin; Rangchian, Maryam; Foroutan, Seyed Mohsen

    2012-07-01

    The aim of this study was to design and optimize a prolonged release matrix formulation of pyridostigmine bromide, an effective drug in myasthenia gravis and poisoning with nerve gas, using hydrophilic and hydrophobic polymers via a D-optimal experimental design. HPMC and carnauba wax as retarding agents, as well as tricalcium phosphate, were used in the matrix formulation and considered as independent variables. Tablets were prepared by a wet granulation technique, and the percentages of drug released at 1 (Y1), 4 (Y2) and 8 (Y3) hours were considered as dependent variables (responses) in this investigation. These experimental responses were best fitted by cubic, cubic and linear models, respectively. The optimal formulation obtained in this study, consisting of 12.8% HPMC, 24.4% carnauba wax and 26.7% tricalcium phosphate, had a suitable prolonged release behavior that followed the Higuchi model, in which observed and predicted values were very close. The study revealed that the D-optimal design could facilitate the optimization of a prolonged release matrix tablet containing pyridostigmine bromide. Accelerated stability studies confirmed that the optimized formulation remains unchanged after exposure to stability conditions for six months.
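
    To illustrate the D-optimal idea in general terms, the snippet below greedily selects runs from a grid of candidate points so as to maximize the determinant of the information matrix for a quadratic model in three factors; this is a generic sketch, not the algorithm, model or candidate set used in the study, and the factor interpretation is only a label.

      import itertools
      import numpy as np

      # Candidate points: a 5 x 5 x 5 grid in coded units for three formulation factors.
      levels = np.linspace(-1, 1, 5)
      candidates = np.array(list(itertools.product(levels, repeat=3)))

      def model_row(x):
          a, b, c = x
          return np.array([1, a, b, c, a*b, a*c, b*c, a*a, b*b, c*c])  # quadratic model

      F = np.array([model_row(x) for x in candidates])

      # Greedy (sequential) D-optimal selection: repeatedly add the candidate that
      # maximizes log det(X'X + small ridge) of the growing design.
      def greedy_d_optimal(F, n_runs, ridge=1e-6):
          chosen = []
          for _ in range(n_runs):
              best_gain, best_i = -np.inf, None
              for i in range(len(F)):
                  if i in chosen:
                      continue
                  X = F[chosen + [i]]
                  _, logdet = np.linalg.slogdet(X.T @ X + ridge * np.eye(F.shape[1]))
                  if logdet > best_gain:
                      best_gain, best_i = logdet, i
              chosen.append(best_i)
          return chosen

      idx = greedy_d_optimal(F, n_runs=14)
      print(candidates[idx])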

  16. Entropic Inference

    OpenAIRE

    Caticha, Ariel

    2010-01-01

    In this tutorial we review the essential arguments behind entropic inference. We focus on the epistemological notion of information and its relation to the Bayesian beliefs of rational agents. The problem of updating from a prior to a posterior probability distribution is tackled through an eliminative induction process that singles out the logarithmic relative entropy as the unique tool for inference. The resulting method of Maximum relative Entropy (ME) includes as special cases both MaxEn...

  17. Improving alignment in Tract-based spatial statistics: evaluation and optimization of image registration

    NARCIS (Netherlands)

    de Groot, Marius; Vernooij, Meike W.; Klein, Stefan; Ikram, M. Arfan; Vos, Frans M.; Smith, Stephen M.; Niessen, Wiro J.; Andersson, Jesper L. R.

    2013-01-01

    Anatomical alignment in neuroimaging studies is of such importance that considerable effort is put into improving the registration used to establish spatial correspondence. Tract-based spatial statistics (TBSS) is a popular method for comparing diffusion characteristics across subjects. TBSS

  18. The Parallel C++ Statistical Library ‘QUESO’: Quantification of Uncertainty for Estimation, Simulation and Optimization

    KAUST Repository

    Prudencio, Ernesto E.; Schulz, Karl W.

    2012-01-01

    QUESO is a collection of statistical algorithms and programming constructs supporting research into the uncertainty quantification (UQ) of models and their predictions. It has been designed with three objectives: it should (a) be sufficiently

  19. Improving alignment in Tract-based spatial statistics : Evaluation and optimization of image registration

    NARCIS (Netherlands)

    De Groot, M.; Vernooij, M.W.; Klein, S.; Arfan Ikram, M.; Vos, F.M.; Smith, S.M.; Niessen, W.J.; Andersson, J.L.R.

    2013-01-01

    Anatomical alignment in neuroimaging studies is of such importance that considerable effort is put into improving the registration used to establish spatial correspondence. Tract-based spatial statistics (TBSS) is a popular method for comparing diffusion characteristics across subjects. TBSS

  20. Development and optimization of fast dissolving oro-dispersible films of granisetron HCl using Box–Behnken statistical design

    Directory of Open Access Journals (Sweden)

    Hema Chaudhary

    2013-12-01

    The aim was to develop and optimize fast dissolving oro-dispersible films of granisetron hydrochloride (GH) by a two-factor, three-level Box–Behnken design; the two independent variables, X1 (polymer) and X2 (plasticizer), were selected on the basis of preliminary studies carried out before the experimental design was implemented. A second-order polynomial equation was used to construct contour plots for the prediction of the responses of the dependent variables, namely drug release (Y1), disintegration time (Y2) and tensile strength (Y3). Response surface plots were drawn, and the statistical validity of the polynomials was established to find the composition of the optimized formulation, which was evaluated using a Franz-type diffusion cell. The design establishes the role of the derived polynomial equations and contour plots in predicting the values of the dependent variables for preparation and optimization.

  1. Optimization of Aspergillus niger nutritional conditions using statistical experimental methods for bio-recovery of manganese from pyrolusite

    International Nuclear Information System (INIS)

    Mujeeb-ur-Rahman; Yasinzai, M.M.; Tareen, R.B.; Iqbal, A.; Gul, S.; Odhano, E.A.

    2011-01-01

    The nutritional requirements of Aspergillus niger PCSIR-06 for bio-recovery of manganese from pyrolusite ore were optimized. A Box-Behnken design and response surface methodology were used for the design of experiments and statistical analysis of the results. This procedure limited the number of actual experiments to 54 for studying the possible interactions between six nutrients. The optimum concentrations of the nutrients were sucrose 148.5 g/L, KH2PO4 0.50 g/L, NH4NO3 0.33 g/L, MgSO4 0.41 g/L, Zn 23.76 mg/L and Fe 0.18 mg/L for Aspergillus niger to achieve maximum bio-recovery of manganese (82.47 ± 5.67%). The verification run confirmed the predicted optimized concentrations of all six ingredients for maximum bioleaching of manganese and successfully confirmed the use of the Box-Behnken experimental design for maximum bio-recovery. Results also revealed that small and less time-consuming experimental designs can be efficient for the optimization of bio-recovery processes. (author)

  2. Transdermal agomelatine microemulsion gel: pyramidal screening, statistical optimization and in vivo bioavailability.

    Science.gov (United States)

    Said, Mayada; Elsayed, Ibrahim; Aboelwafa, Ahmed A; Elshafeey, Ahmed H

    2017-11-01

    Agomelatine is a new antidepressant with very low oral bioavailability (less than 5%) because it is subject to an extensive hepatic first-pass effect. This study aimed to deliver agomelatine by the transdermal route through the formulation and optimization of a microemulsion gel. Pyramidal screening was performed to select the most suitable ingredient combinations, and the Design-Expert software was then utilized to optimize the microemulsion formulations. The independent variables of the employed mixture design were the percentages of Capryol 90 as the oily phase (X1), Cremophor RH40 and Transcutol HP in a ratio of 1:2 as the surfactant/cosurfactant mixture 'Smix' (X2) and water (X3). The dependent variables were globule size, optical clarity, the cumulative amounts permeated after 1 and 24 h (Q1 and Q24) and the enhancement ratio (ER). The optimized formula was composed of 5% oil, 45% Smix and 50% water. The optimized microemulsion formula was converted into a carbopol-based gel to improve its retention on the skin. It enhanced drug permeation through rat skin with an enhancement ratio of 37.30 when compared to the drug hydrogel. The optimum ME gel formula was found to have significantly higher Cmax, AUC0-24h and AUC0-∞ than the reference agomelatine hydrogel and oral solution. This reveals the potential of the optimized microemulsion gel formula to augment the transdermal bioavailability of agomelatine.

  3. Global optimization based on noisy evaluations: An empirical study of two statistical approaches

    International Nuclear Information System (INIS)

    Vazquez, Emmanuel; Villemonteix, Julien; Sidorkiewicz, Maryan; Walter, Eric

    2008-01-01

    The optimization of the output of complex computer codes often has to be achieved with a small budget of evaluations. Algorithms dedicated to such problems have been developed and compared, such as the Expected Improvement algorithm (EI) or the Informational Approach to Global Optimization (IAGO). However, the influence of noisy evaluation results on the outcome of these comparisons has often been neglected, despite its frequent appearance in industrial problems. In this paper, empirical convergence rates for EI and IAGO are compared when an additive noise corrupts the result of an evaluation. IAGO appears more efficient than EI and various modifications of EI designed to deal with noisy evaluations. Keywords: global optimization; computer simulations; kriging; Gaussian process; noisy evaluations.
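
    For reference, the Expected Improvement criterion compared in this record can be written down in a few lines. The sketch below assumes a minimization setting and that a Gaussian-process posterior mean and standard deviation are already available at the candidate points; handling noisy observations, e.g. replacing the best observed value by the best posterior mean, is one of the EI variants discussed above. The numeric values are placeholders.

      import numpy as np
      from scipy.stats import norm

      # Expected Improvement (EI) for minimization, given a GP posterior mean mu and
      # standard deviation sigma at candidate points, and the incumbent value f_best.
      def expected_improvement(mu, sigma, f_best):
          sigma = np.maximum(sigma, 1e-12)        # avoid division by zero
          z = (f_best - mu) / sigma
          return (f_best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

      mu = np.array([0.20, -0.10, 0.05])          # placeholder posterior means
      sigma = np.array([0.30, 0.05, 0.40])        # placeholder posterior std deviations
      print(expected_improvement(mu, sigma, f_best=0.0))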

  4. Using complete measurement statistics for optimal device-independent randomness evaluation

    International Nuclear Information System (INIS)

    Nieto-Silleras, O; Pironio, S; Silman, J

    2014-01-01

    The majority of recent works investigating the link between non-locality and randomness, e.g. in the context of device-independent cryptography, do so with respect to some specific Bell inequality, usually the CHSH inequality. However, the joint probabilities characterizing the measurement outcomes of a Bell test are richer than just the degree of violation of a single Bell inequality. In this work we show how to take this extra information into account in a systematic manner in order to optimally evaluate the randomness that can be certified from non-local correlations. We further show that taking into account the complete set of outcome probabilities is equivalent to optimizing over all possible Bell inequalities, thereby allowing us to determine the optimal Bell inequality for certifying the maximal amount of randomness from a given set of non-local correlations. (paper)

  5. Reconstructing a B-cell clonal lineage. I. Statistical inference of unobserved ancestors [v1; ref status: indexed, http://f1000r.es/z6

    Directory of Open Access Journals (Sweden)

    Thomas B Kepler

    2013-04-01

    One of the key phenomena in the adaptive immune response to infection and immunization is affinity maturation, during which antibody genes are mutated and selected, typically resulting in a substantial increase in binding affinity to the eliciting antigen. Advances in technology on several fronts have made it possible to clone large numbers of heavy-chain light-chain pairs from individual B cells and thereby identify whole sets of clonally related antibodies. These collections could provide the information necessary to reconstruct their own history - the sequence of changes introduced into the lineage during the development of the clone - and to study affinity maturation in detail. But the success of such a program depends entirely on accurately inferring the founding ancestor and the other unobserved intermediates. Given a set of clonally related immunoglobulin V-region genes, the method described here allows one to compute the posterior distribution over their possible ancestors, thereby giving a thorough accounting of the uncertainty inherent in the reconstruction. I demonstrate the application of this method on heavy-chain and light-chain clones, assess the reliability of the inference, and discuss the sources of uncertainty.

  6. Ethanol production from banana peels using statistically optimized simultaneous saccharification and fermentation process.

    Science.gov (United States)

    Oberoi, Harinder Singh; Vadlani, Praveen V; Saida, Lavudi; Bansal, Sunil; Hughes, Joshua D

    2011-07-01

    Dried and ground banana peel biomass (BP) after hydrothermal sterilization pretreatment was used for ethanol production using simultaneous saccharification and fermentation (SSF). Central composite design (CCD) was used to optimize the concentrations of cellulase and pectinase, temperature and time for ethanol production from BP using SSF. Analysis of variance showed a high coefficient of determination (R²) value of 0.92 for ethanol production. On the basis of model graphs and numerical optimization, the validation was done in a laboratory batch fermenter with cellulase, pectinase, temperature and time of nine cellulase filter paper units/gram cellulose (FPU/g-cellulose), 72 international units/gram pectin (IU/g-pectin), 37 °C and 15 h, respectively. The experiment using the optimized parameters in the batch fermenter not only resulted in a higher ethanol concentration than the one predicted by the model equation, but also saved fermentation time. This study demonstrated that both hydrothermal pretreatment and SSF could be successfully carried out in a single vessel, and the use of optimized process parameters helped achieve significant ethanol productivity, indicating commercial potential for the process. To the best of our knowledge, an ethanol concentration and ethanol productivity of 28.2 g/l and 2.3 g/l/h, respectively, from banana peels have not been reported to date.

  7. Statistical surrogate model based sampling criterion for stochastic global optimization of problems with constraints

    Energy Technology Data Exchange (ETDEWEB)

    Cho, Su Gil; Jang, Jun Yong; Kim, Ji Hoon; Lee, Tae Hee [Hanyang University, Seoul (Korea, Republic of); Lee, Min Uk [Romax Technology Ltd., Seoul (Korea, Republic of); Choi, Jong Su; Hong, Sup [Korea Research Institute of Ships and Ocean Engineering, Daejeon (Korea, Republic of)

    2015-04-15

    Sequential surrogate model-based global optimization algorithms, such as super-EGO, have been developed to increase the efficiency of commonly used global optimization techniques as well as to ensure the accuracy of the optimization. However, earlier studies have drawbacks because there are three phases in the optimization loop and empirical parameters. We propose a united sampling criterion to simplify the algorithm and to achieve the global optimum of problems with constraints without any empirical parameters. It is able to select points located in the feasible region with high model uncertainty as well as points along the boundary of the constraint at the lowest objective value. The mean squared error determines which of the infill sampling criterion and the boundary sampling criterion is more dominant. Also, the method guarantees the accuracy of the surrogate model because the sample points are not located within extremely small regions as in super-EGO. The performance of the proposed method, such as the solvability of a problem, convergence properties, and efficiency, is validated through nonlinear numerical examples with disconnected feasible regions.

  8. Experimental Verification of Statistically Optimized Parameters for Low-Pressure Cold Spray Coating of Titanium

    Directory of Open Access Journals (Sweden)

    Damilola Isaac Adebiyi

    2016-06-01

    The cold spray coating process involves many process parameters, which make the process very complex and highly sensitive to small changes in these parameters. This results in a small operational window for the parameters. Consequently, mathematical optimization of the process parameters is key, not only to achieving deposition but also to improving the coating quality. This study focuses on the mathematical identification and experimental justification of the optimum process parameters for cold spray coating of a titanium alloy with silicon carbide (SiC). The continuity, momentum and energy equations governing the flow through the low-pressure cold spray nozzle were solved by introducing a constitutive equation to close the system. This was used to calculate the critical velocity for the deposition of SiC. In order to determine the input temperature that yields the calculated velocity, the distributions of velocity, temperature, and pressure in the cold spray nozzle were analyzed, and the exit values were predicted using the meshing tool of Solidworks. Coatings fabricated using the optimized parameters and some non-optimized parameters are compared. The coating produced with the CFD-optimized parameters yielded lower porosity and higher hardness.

  9. Design and statistical optimization of glipizide loaded lipospheres using response surface methodology.

    Science.gov (United States)

    Shivakumar, Hagalavadi Nanjappa; Patel, Pragnesh Bharat; Desai, Bapusaheb Gangadhar; Ashok, Purnima; Arulmozhi, Sinnathambi

    2007-09-01

    A 3² factorial design was employed to produce glipizide lipospheres by the emulsification phase separation technique using paraffin wax and stearic acid as retardants. The effects of critical formulation variables, namely the level of paraffin wax (X1) and the proportion of stearic acid in the wax (X2), on the geometric mean diameter (dg), percent encapsulation efficiency (% EE), release at the end of 12 h (rel12) and time taken for 50% of drug release (t50) were evaluated using the F-test. Mathematical models containing only the significant terms were generated for each response parameter using multiple linear regression analysis (MLRA) and analysis of variance (ANOVA). Both formulation variables studied exerted a significant influence (p < 0.05) on the responses. Numerical optimization using the desirability approach was employed to develop an optimized formulation by setting constraints on the dependent and independent variables. The experimental values of dg, % EE, rel12 and t50 for the optimized formulation were found to be 57.54 ± 1.38 μm, 86.28 ± 1.32%, 77.23 ± 2.78% and 5.60 ± 0.32 h, respectively, which were in close agreement with those predicted by the mathematical models. The drug release from the lipospheres followed first-order kinetics and was characterized by the Higuchi diffusion model. The optimized liposphere formulation developed was found to produce sustained anti-diabetic activity following oral administration in rats.
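
    A hedged sketch of the desirability-based numerical optimization mentioned above: the fitted response models, desirability limits and factor interpretations below are invented for illustration and are not the study's models.

      import numpy as np

      # Derringer-type individual desirability for a "larger is better" response...
      def d_max(y, low, target, s=1.0):
          return np.clip((y - low) / (target - low), 0.0, 1.0) ** s

      # ...and for a "smaller is better" response.
      def d_min(y, target, high, s=1.0):
          return np.clip((high - y) / (high - target), 0.0, 1.0) ** s

      # Hypothetical fitted models in coded units x1 (wax level) and x2 (stearic acid
      # proportion): encapsulation efficiency (maximize) and 12 h release (keep low);
      # the coefficients are purely illustrative.
      def ee(x1, x2):     return 80 + 5*x1 - 3*x2 - 2*x1*x2
      def rel12(x1, x2):  return 75 - 8*x1 + 4*x2 + 3*x1**2

      # Grid search for the settings maximizing the overall desirability
      # (geometric mean of the individual desirabilities).
      grid = np.linspace(-1, 1, 41)
      best = max(((np.sqrt(d_max(ee(a, b), 70, 90) * d_min(rel12(a, b), 60, 85)), a, b)
                  for a in grid for b in grid), key=lambda t: t[0])
      print("max overall desirability %.3f at x1=%.2f, x2=%.2f" % best)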

  10. Statistically optimized biotransformation protocol for continuous production of L-DOPA using Mucuna monosperma callus culture.

    Science.gov (United States)

    Inamdar, Shrirang Appasaheb; Surwase, Shripad Nagnath; Jadhav, Shekhar Bhagwan; Bapat, Vishwas Anant; Jadhav, Jyoti Prafull

    2013-01-01

    L-DOPA (3,4-dihydroxyphenyl-L-alanine), a modified amino acid, is an extensively used drug for the treatment of Parkinson's disease. In the present study, optimization of the nutritional parameters influencing L-DOPA production from Mucuna monosperma callus was attempted using response surface methodology (RSM). Optimization of the four factors was carried out using a Box-Behnken design. The optimized levels of the factors predicted by the model were tyrosine 0.894 g l⁻¹, pH 4.99, ascorbic acid 31.62 mg l⁻¹ and copper sulphate 23.92 mg l⁻¹, which resulted in a highest L-DOPA yield of 0.309 g l⁻¹. The optimization of the medium using RSM resulted in a 3.45-fold increase in the yield of L-DOPA. The ANOVA analysis showed a significant R² value (0.9912), model F-value (112.465) and probability (0.0001), with an insignificant lack of fit. The optimized medium was used in a laboratory-scale column reactor for continuous production of L-DOPA. The uninterrupted-flow column exhibited a maximum L-DOPA production rate of 200 mg l⁻¹ h⁻¹, which is one of the highest values ever reported using a plant as the biotransformation source. L-DOPA production was confirmed by HPTLC and HPLC analysis. This study demonstrates the synthesis of L-DOPA from Mucuna monosperma callus in a laboratory-scale column reactor.
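
    The Box-Behnken design used above can be constructed generically; the sketch below builds the coded design for four factors. The pairwise construction shown matches the catalogued designs for three to five factors, and the number of centre points is a free choice, not necessarily the one used in the study.

      import itertools
      import numpy as np

      # Box-Behnken design (coded units) for k factors: every pair of factors is run at
      # the four (+/-1, +/-1) combinations while all other factors are held at 0,
      # plus replicated centre points.
      def box_behnken(k, n_center=3):
          runs = []
          for i, j in itertools.combinations(range(k), 2):
              for a, b in itertools.product([-1, 1], repeat=2):
                  row = [0] * k
                  row[i], row[j] = a, b
                  runs.append(row)
          runs += [[0] * k] * n_center
          return np.array(runs, dtype=float)

      # Four factors, labelled after the abstract: tyrosine, pH, ascorbic acid, CuSO4.
      design = box_behnken(4)
      print(design.shape)        # (27, 4): 24 edge runs + 3 centre points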

  11. Optimization of counting time using count statistics on a diffraction beamline

    Energy Technology Data Exchange (ETDEWEB)

    Marais, D., E-mail: Deon.Marais@necsa.co.za [Research and Development Division, South African Nuclear Energy Corporation (Necsa) SOC Limited, PO Box 582, Pretoria 0001 (South Africa); School of Mechanical and Nuclear Engineering, North-West University, Potchefstroom 2520 (South Africa); Venter, A.M., E-mail: Andrew.Venter@necsa.co.za [Research and Development Division, South African Nuclear Energy Corporation (Necsa) SOC Limited, PO Box 582, Pretoria 0001 (South Africa); Faculty of Agriculture Science and Technology, North-West University, Mahikeng 2790 (South Africa); Markgraaff, J., E-mail: Johan.Markgraaff@nwu.ac.za [School of Mechanical and Nuclear Engineering, North-West University, Potchefstroom 2520 (South Africa)

    2016-05-11

    The feasibility of an alternative data acquisition strategy to improve the efficiency of beam time usage with neutron strain scanner instruments is demonstrated. Performing strain measurements against set statistical criteria, rather than for a fixed time, not only leads to substantially reduced sample investigation time but also renders data of similar quality throughout.
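
    The count-to-a-statistical-criterion idea can be illustrated with Poisson counting statistics, where the relative uncertainty of N accumulated counts is 1/sqrt(N). The count rate and the 1% target below are arbitrary illustrative values, not the instrument's.

      import numpy as np

      # Count until a preset statistical criterion is met, rather than for a fixed time.
      rng = np.random.default_rng(0)
      target_rel_unc = 0.01            # stop at 1% relative (counting) uncertainty
      count_rate = 250.0               # hypothetical detector count rate, counts per second

      counts, seconds = 0, 0
      while counts == 0 or 1.0 / np.sqrt(counts) > target_rel_unc:
          counts += rng.poisson(count_rate)   # accumulate one more second of data
          seconds += 1

      print("stopped after %d s with %d counts (%.2f %% relative uncertainty)"
            % (seconds, counts, 100 / np.sqrt(counts)))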

  12. Statistical optimization for tannase production by Aspergillus tubingensis in solid-state fermentation using tea stalks

    Directory of Open Access Journals (Sweden)

    Anfeng Xiao

    2015-05-01

    Conclusions: The result of the statistical approach was 2.09 times higher than that of the basal medium (40.22 U/gds). The results were fitted to a second-order polynomial model with a correlation coefficient (R²) of 0.9340, which implied adequate credibility of the model.

  13. Behavioral inferences from the statistical distribution of commercial catch: patterns of targeting in the landings of the Dutch beam trawler fleet

    NARCIS (Netherlands)

    Gillis, D.M.; Rijnsdorp, A.D.; Poos, J.J.

    2008-01-01

    The objective identification of targeting behavior in multispecies fisheries is critical to the development and evaluation of management measures. Here, we illustrate how the statistical distribution of commercial catches can provide information on species preference that is consistent with economic

  14. Statistical Optimization of Medium Compositions for High Cell Mass and Exopolysaccharide Production by Lactobacillus plantarum ATCC 8014

    Directory of Open Access Journals (Sweden)

    Nor Zalina Othman

    2018-03-01

    Background and Objective: Lactobacillus plantarum ATCC 8014 is known as a good producer of water-soluble exopolysaccharide. Therefore, the aim of this study was to optimize the medium composition concurrently for high cell mass and exopolysaccharide production by Lactobacillus plantarum ATCC 8014. Since both are useful for food and pharmaceutical applications, and most studies typically focus on only one outcome, the optimization was carried out using molasses as a cheaper carbon source. Material and Methods: The main medium components known to have a significant effect on cell mass and EPS production were selected as variables and statistically optimized based on a Box-Behnken design at the shake-flask level. The optimal medium for cell mass and exopolysaccharide production was composed of (in g l⁻¹): molasses, 40; yeast extract, 16.8; phosphate, 2.72; sodium acetate, 3.98. The model was found to be significant and was subsequently validated through growth kinetics studies in un-optimized and optimized medium in shake-flask cultivation. Results and Conclusion: The maximum cell mass and exopolysaccharide in the new optimized medium were 4.40 g l⁻¹ and 4.37 g l⁻¹, respectively, after 44 h of cultivation. As a result, cell mass and exopolysaccharide production increased up to 4.5 and 16.5 times, respectively, and a maximal exopolysaccharide yield of 1.19 per gram of cells was obtained when molasses was used as the carbon source. In conclusion, molasses has the potential to be a cheap carbon source for the cultivation of Lactobacillus plantarum ATCC 8014 concurrently for high cell mass and exopolysaccharide production. Conflict of interest: The authors declare no conflict of interest.

  15. On principles of inductive inference

    OpenAIRE

    Kostecki, Ryszard Paweł

    2011-01-01

    We propose an intersubjective epistemic approach to foundations of probability theory and statistical inference, based on relative entropy and category theory, and aimed to bypass the mathematical and conceptual problems of existing foundational approaches.

  16. Application of statistical experimental methodology to optimize bioremediation of n-alkanes in aquatic environment

    International Nuclear Information System (INIS)

    Zahed, Mohammad Ali; Aziz, Hamidi Abdul; Mohajeri, Leila; Mohajeri, Soraya; Kutty, Shamsul Rahman Mohamed; Isa, Mohamed Hasnain

    2010-01-01

    Response surface methodology (RSM) was employed to optimize nitrogen and phosphorus concentrations for the removal of n-alkanes from crude oil contaminated seawater samples in batch reactors. Erlenmeyer flasks were used as bioreactors, each containing 250 mL of dispersed crude oil contaminated seawater, indigenous acclimatized microorganisms and different amounts of nitrogen and phosphorus based on a central composite design (CCD). Samples were extracted and analyzed according to US-EPA protocols using a gas chromatograph. During 28 days of bioremediation, a maximum of 95% total aliphatic hydrocarbon removal was observed. The obtained model F-value of 267.73 and probability F < 0.0001 implied that the model was significant. Numerical condition optimization via a quadratic model predicted 98% n-alkane removal for a 20-day laboratory bioremediation trial using nitrogen and phosphorus concentrations of 13.62 and 1.39 mg/L, respectively. In actual experiments, 95% removal was observed under these conditions.

  17. Optimization and technological development strategies of an antimicrobial extract from Achyrocline alata assisted by statistical design.

    Directory of Open Access Journals (Sweden)

    Daniel P Demarque

    Achyrocline alata, known as Jateí-ka-há, is traditionally used to treat several health problems, including inflammations and infections. This study aimed to optimize an extract active against Streptococcus mutans, the main bacterium that causes caries. The extract was developed using accelerated solvent extraction and chemometric calculations. Factorial design and response surface methodologies were used to determine the most important variables, such as active compound selectivity. The standardized extraction recovered 99% of the four main compounds, gnaphaliin, helipyrone, obtusifolin and lepidissipyrone, which represent 44% of the extract. The optimized extract of A. alata has a MIC of 62.5 μg/mL against S. mutans and could be used in mouth care products.

  18. Optimal sample preparation for nanoparticle metrology (statistical size measurements) using atomic force microscopy

    International Nuclear Information System (INIS)

    Hoo, Christopher M.; Doan, Trang; Starostin, Natasha; West, Paul E.; Mecartney, Martha L.

    2010-01-01

    Optimal deposition procedures are determined for nanoparticle size characterization by atomic force microscopy (AFM). Accurate nanoparticle size distribution analysis with AFM requires non-agglomerated nanoparticles on a flat substrate. The deposition of polystyrene (100 nm), silica (300 and 100 nm), gold (100 nm), and CdSe quantum dot (2-5 nm) nanoparticles by spin coating was optimized for size distribution measurements by AFM. Factors influencing deposition include spin speed, concentration, solvent, and pH. A comparison of spin coating, static evaporation, and a new fluid cell deposition method for depositing nanoparticles is also made. The fluid cell allows for a more uniform and higher density deposition of nanoparticles on a substrate at laminar flow rates, making nanoparticle size analysis via AFM more efficient and also offering the potential for nanoparticle analysis in liquid environments.

  19. Optimal Mass Transport for Statistical Estimation, Image Analysis, Information Geometry, and Control

    Science.gov (United States)

    2017-01-10

    The report describes advances in formulating and solving optimal transport problems on discrete spaces (networks) while ensuring robustness of the transportation plan. Related publications listed in the report include "Metric Uncertainty for Spectral Estimation based on Nevanlinna-Pick Interpolation" (with J. Karlsson), Intern. Symp. on the Math. Theory of Networks and Systems, Melbourne, 2012, and "Geometric tools for the estimation of structured covariances" (with L. Ning, X. Jiang), Intern. Symposium on the Math. Theory ...

  20. Optimization of cardiovascular stent against restenosis: factorial design-based statistical analysis of polymer coating conditions.

    Directory of Open Access Journals (Sweden)

    Gayathri Acharya

    The objective of this study was to optimize the physicodynamic conditions of a polymeric system as a coating substrate for drug-eluting stents against restenosis. As nitric oxide (NO) has multifunctional activities, such as regulating blood flow and pressure and influencing thrombus formation, continuous and spatiotemporal delivery of NO loaded in polymer-based nanoparticles could be a viable option to reduce and prevent restenosis. To identify the most suitable carrier for S-nitrosoglutathione (GSNO), an NO prodrug, stents were coated with various polymers, such as poly(lactic-co-glycolic acid) (PLGA), polyethylene glycol (PEG) and polycaprolactone (PCL), using a solvent evaporation technique. A full factorial design was used to evaluate the effects of the formulation variables in the polymer-based stent coatings on the GSNO release rate and weight loss rate. The least-squares regression model was used for data analysis in the optimization process. The polymer-coated stents were further assessed with differential scanning calorimetry (DSC), Fourier transform infrared spectroscopy (FTIR), scanning electron microscopy (SEM) imaging and platelet adhesion studies. Stents coated with the PCL matrix displayed more sustained and controlled drug release profiles than those coated with PLGA and PEG. Stents coated with the PCL matrix also showed the lowest platelet adhesion rate. Subsequently, stents coated with the PCL matrix were subjected to further optimization for improvement of surface morphology and enhancement of the drug release duration. The results of this study demonstrated that a PCL matrix containing GSNO is a promising system for stent surface coating against restenosis.

  1. Box-Behnken statistical design to optimize thermal performance of energy storage systems

    Science.gov (United States)

    Jalalian, Iman Joz; Mohammadiun, Mohammad; Moqadam, Hamid Hashemi; Mohammadiun, Hamid

    2018-05-01

    Latent heat thermal storage (LHTS) is a technology that can help to reduce energy consumption in cooling applications, where the cold is stored in phase change materials (PCMs). In the present study a comprehensive theoretical and experimental investigation is performed on an LHTS system containing RT25 as the phase change material (PCM). Process optimization of the experimental conditions (inlet air temperature and velocity and number of slabs) was carried out by means of a Box-Behnken design (BBD) within response surface methodology (RSM). Two parameters (cooling time and COP value) were chosen as the responses. Both responses were significantly influenced by the combined effect of inlet air temperature with velocity and number of slabs. Simultaneous optimization was performed on the basis of the desirability function to determine the optimal conditions for the cooling time and COP value. The maximum cooling time (186 min) and COP value (6.04) were found at the optimum process conditions, i.e. an inlet temperature of 32.5, an air velocity of 1.98 and 7 slabs.

  2. Box-Behnken statistical design to optimize thermal performance of energy storage systems

    Science.gov (United States)

    Jalalian, Iman Joz; Mohammadiun, Mohammad; Moqadam, Hamid Hashemi; Mohammadiun, Hamid

    2017-11-01

    Latent heat thermal storage (LHTS) is a technology that can help to reduce energy consumption in cooling applications, where the cold is stored in phase change materials (PCMs). In the present study a comprehensive theoretical and experimental investigation is performed on an LHTS system containing RT25 as the phase change material (PCM). Process optimization of the experimental conditions (inlet air temperature and velocity and number of slabs) was carried out by means of a Box-Behnken design (BBD) within response surface methodology (RSM). Two parameters (cooling time and COP value) were chosen as the responses. Both responses were significantly influenced by the combined effect of inlet air temperature with velocity and number of slabs. Simultaneous optimization was performed on the basis of the desirability function to determine the optimal conditions for the cooling time and COP value. The maximum cooling time (186 min) and COP value (6.04) were found at the optimum process conditions, i.e. an inlet temperature of 32.5, an air velocity of 1.98 and 7 slabs.

  3. Sampling, Probability Models and Statistical Reasoning: Statistical Inference

    Indian Academy of Sciences (India)

    Mohan Delampady and V R Padmawar. General Article, Resonance – Journal of Science Education, Volume 1, Issue 5, May 1996, pp. 49-58.

  4. Risk management and statistical multivariate analysis approach for design and optimization of satranidazole nanoparticles.

    Science.gov (United States)

    Dhat, Shalaka; Pund, Swati; Kokare, Chandrakant; Sharma, Pankaj; Shrivastava, Birendra

    2017-01-01

    The rapidly evolving technical and regulatory landscape of pharmaceutical product development necessitates risk management with the application of multivariate analysis using Process Analytical Technology (PAT) and Quality by Design (QbD). The poorly soluble, high-dose drug satranidazole was optimally nanoprecipitated (SAT-NP) employing the principles of Formulation by Design (FbD). The potential risk factors influencing the critical quality attributes (CQAs) of SAT-NP were identified using an Ishikawa diagram. A Plackett-Burman screening design was adopted to screen the eight critical formulation and process parameters influencing the mean particle size, zeta potential and dissolution efficiency at 30 min in pH 7.4 dissolution medium. Pareto charts (individual and cumulative) revealed the three most critical factors influencing the CQAs of SAT-NP, viz. the aqueous stabilizer (polyvinyl alcohol), the release modifier (Eudragit® S 100) and the volume of the aqueous phase. The levels of these three critical formulation attributes were optimized by FbD within the established design space to minimize the mean particle size and polydispersity index and to maximize the encapsulation efficiency of SAT-NP. Lenth's and Bayesian analyses, along with mathematical modeling of the results, allowed identification and quantification of the critical formulation attributes significantly active on the selected CQAs. The optimized SAT-NP exhibited a mean particle size of 216 nm, a polydispersity index of 0.250, a zeta potential of -3.75 mV and an encapsulation efficiency of 78.3%. The product was lyophilized using mannitol to form a readily redispersible powder. X-ray diffraction analysis confirmed the conversion of crystalline SAT to the amorphous form. In vitro release of SAT-NP in gradually pH-changing media showed limited release at acidic pH followed by substantial release (over 95%) in pH 7.4 within the next 3 h, indicative of burst release after a lag time. This investigation demonstrated the effective application of risk management and QbD tools in developing site-specific release SAT-NP by nanoprecipitation.
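
    A generic sketch of two-level screening followed by Pareto ranking of main effects, in the spirit of the Plackett-Burman/FbD workflow described above; the design generators, the factor names beyond the three reported as critical, and the responses are illustrative assumptions, not the study's data.

      import itertools
      import numpy as np

      # 16-run two-level screening design for 8 factors: full factorial in the first
      # four columns plus generated columns (one possible choice of generators).
      base = np.array(list(itertools.product([-1, 1], repeat=4)))
      A, B, C, D = base.T
      X = np.column_stack([A, B, C, D, A*B*C, A*B*D, A*C*D, B*C*D])
      factors = ["PVA (stabilizer)", "Eudragit S100", "aq. phase volume",
                 "factor 4", "factor 5", "factor 6", "factor 7", "factor 8"]

      # Hypothetical responses (e.g. mean particle size) for the 16 screening runs.
      rng = np.random.default_rng(2)
      y = 220 + 25*X[:, 0] - 15*X[:, 1] + 8*X[:, 2] + rng.normal(0, 5, 16)

      # Main effect of each factor = (mean response at +1) - (mean response at -1);
      # a Pareto chart simply ranks the absolute effects.
      effects = (X.T @ y) * 2 / len(y)
      for name, eff in sorted(zip(factors, effects), key=lambda t: -abs(t[1])):
          print("%-18s effect = %+7.2f" % (name, eff))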

  5. Statistically optimized fast dissolving microneedle transdermal patch of meloxicam: A patient friendly approach to manage arthritis.

    Science.gov (United States)

    Amodwala, Sejal; Kumar, Praveen; Thakkar, Hetal P

    2017-06-15

    The long-term administration of meloxicam for the management of arthritis, a chronic disorder, results in gastrointestinal disturbances leading to poor patient compliance. Considering the favorable molecular weight, therapeutic dose, biological half-life and log P value of meloxicam for transdermal delivery, a fast dissolving microneedle patch, with the ability to breach the stratum corneum and efficiently deliver the cargo to deeper skin layers, was developed. The microneedle patch of low molecular weight polyvinyl alcohol and polyvinylpyrrolidone was prepared using polydimethylsiloxane micromolds. The ratio of polyvinyl alcohol to polyvinylpyrrolidone and the solid content of the matrix solution were optimized to achieve maximum needle strength. The optimized batch was extensively evaluated by in vitro dissolution, drug release, stability, ex vivo skin permeation/deposition, histopathology and in vivo pharmacodynamic studies. The patch containing a 9:1 polyvinyl alcohol to polyvinylpyrrolidone ratio with 50% solid content showed the maximum axial needle fracture force (0.9 N), suitable for penetrating the skin. The optimized batch was found to be fast dissolving and released almost 100% of the drug in 60 min following dissolution-controlled kinetics. The formulation showed significant drug deposition within the skin (63.37%) and an improved transdermal flux (1.60 μg/cm²/h) with a 2.58-fold enhancement in permeation as compared to the plain drug solution. The formulation showed comparable anti-inflammatory activity in rats when compared to the existing approved marketed oral tablet. Histopathology and stability evaluations demonstrated acceptable safety and shelf-life of the developed formulation. The successful verification of the safety, efficacy and stability of the microneedle patch advocates the suitability of the formulation for transdermal use.

  6. Formulation and optimization of chronomodulated press-coated tablet of carvedilol by Box–Behnken statistical design

    Directory of Open Access Journals (Sweden)

    Satwara RS

    2012-08-01

    Rohan S Satwara, Parul K Patel; Department of Pharmaceutics, Babaria Institute of Pharmacy, Vadodara, Gujarat, India. Objective: The primary objective of the present investigation was to formulate and optimize chronomodulated press-coated tablets to deliver the antihypertensive carvedilol in an effective quantity predawn, when a blood pressure spike is typically observed in most hypertensive patients. Experimental work: Preformulation studies and drug-excipient compatibility studies were carried out for carvedilol and the excipients. Core tablets (6 mm) containing carvedilol and 10-mm press-coated tablets were prepared by direct compression. The Box–Behnken experimental design was applied to these press-coated tablets (formulations F1–F15) with differing concentrations of rate-controlling polymers. Hydroxypropyl methyl cellulose K4M, ethyl cellulose, and K-carrageenan were used as rate-controlling polymers in the outer layer. These tablets were subjected to various precompression and postcompression tests. The optimized batch was derived both statistically (using the desirability function) and graphically (using Design Expert® 8; Stat-Ease Inc.). Tablets formulated using the optimized formula were then evaluated for lag time and in vitro dissolution. Results and discussion: The results of the preformulation studies were satisfactory. No interaction was observed between carvedilol and the excipients by ultraviolet, Fourier transform infrared spectroscopy, and dynamic light scattering analyses. The results of the precompression and postcompression studies were within limits. The varying lag time and percent cumulative carvedilol release after 8 h were optimized to obtain a formulation that offered a release profile with a 6 h lag time, followed by complete carvedilol release after 8 h. The results showed no significant bias between the predicted response and the actual response for the optimized formula. Conclusion: Bedtime dosing of chronomodulated press-coated tablets may offer a

  7. Comparative study for different statistical models to optimize cutting parameters of CNC end milling machines

    International Nuclear Information System (INIS)

    El-Berry, A.; El-Berry, A.; Al-Bossly, A.

    2010-01-01

    In machining operations, the quality of the surface finish is an important requirement for many workpieces. Thus, it is very important to optimize the cutting parameters to control the required manufacturing quality. The surface roughness parameter (Ra) of mechanical parts depends on the cutting parameters during the turning process. In the development of predictive models, the cutting parameters feed, cutting speed and depth of cut are considered as model variables. For this purpose, this study focuses on comparing various machining experiments performed on a CNC vertical machining center with aluminum 6061 workpieces. Multiple regression models are used to predict the surface roughness in the different experiments.
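
    A minimal example of the kind of multiple regression model described above, fitted by least squares to hypothetical cutting-parameter data; the numbers are placeholders, not the study's measurements.

      import numpy as np

      # Hypothetical data: spindle speed (rpm), feed (mm/rev), depth of cut (mm), Ra (um).
      data = np.array([
          [1000, 0.10, 0.5, 1.32],
          [1000, 0.20, 1.0, 2.10],
          [1500, 0.10, 1.0, 1.05],
          [1500, 0.20, 0.5, 1.84],
          [2000, 0.10, 0.5, 0.78],
          [2000, 0.20, 1.0, 1.65],
          [2500, 0.10, 1.0, 0.66],
          [2500, 0.20, 0.5, 1.41],
      ])
      X_raw, Ra = data[:, :3], data[:, 3]

      # First-order multiple regression model: Ra = b0 + b1*speed + b2*feed + b3*depth.
      X = np.column_stack([np.ones(len(X_raw)), X_raw])
      coef, *_ = np.linalg.lstsq(X, Ra, rcond=None)
      pred = X @ coef
      r2 = 1 - np.sum((Ra - pred) ** 2) / np.sum((Ra - Ra.mean()) ** 2)
      print("coefficients:", np.round(coef, 4), " R^2 = %.3f" % r2)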

  8. Design and statistical optimization of an effervescent floating drug delivery system of theophylline using response surface methodology

    Directory of Open Access Journals (Sweden)

    Srikanth Meka Venkata

    2016-03-01

    Full Text Available The aim of this research was to formulate effervescent floating drug delivery systems of theophylline using different release retarding polymers such as ethyl cellulose, Eudragit® L100, xanthan gum and polyethylene oxide (PEO N12K. Sodium bicarbonate was used as a gas generating agent. Direct compression was used to formulate floating tablets and the tablets were evaluated for their physicochemical and dissolution characteristics. PEO based formulations produced better drug release properties than other formulations. Hence, it was further optimized by central composite design. Further subjects of research were the effect of formulation variables on floating lag time and the percentage of drug released at the seventh hour (D7h. The optimum quantities of PEO and sodium bicarbonate, which had the highest desirability close to 1.0, were chosen as the statistically optimized formulation. No interaction was found between theophylline and PEO by Fourier Transformation Infrared spectroscopy (FTIR and Differential Scanning Calorimetry (DSC studies.

  9. Colon specific CODES based Piroxicam tablet for colon targeting: statistical optimization, in vivo roentgenography and stability assessment.

    Science.gov (United States)

    Singh, Amit Kumar; Pathak, Kamla

    2015-03-01

    This study aimed to statistically optimize a CODES™-based piroxicam (PXM) tablet for colon targeting. A 3² full factorial design was used for the preparation of the core tablet, which was subsequently coated to obtain the CODES™-based tablet. The experimental design for the core tablets comprised two independent variables, the amount of lactulose and the amount of PEG 6000, each at three different levels, and the dependent variable was the %CDR at 12 h. The core tablets were evaluated by pharmacopoeial and non-pharmacopoeial tests and coated with optimized levels of Eudragit E100 followed by HPMC K15 and finally Eudragit S100. The in vitro drug release study of F1-F9 was carried out by a changeover media method (0.1 N HCl buffer, pH 1.2, phosphate buffer, pH 7.4 and phosphate buffer, pH 6.8 with the enzyme β-galactosidase, 120 IU) to select the optimized formulation F9, which was subjected to in vivo roentgenography. The roentgenography study corroborated the in vitro performance, thus providing proof of concept. The experimental design was validated by an extra check-point formulation, and diffuse reflectance spectroscopy revealed the absence of any interaction between the drug and the formulation excipients. The shelf life of F9 was deduced to be 12 months. Conclusively, colon-targeted CODES™ technology based PXM tablets were successfully optimized and their potential for colon targeting was validated by roentgenography.

  10. The Parallel C++ Statistical Library ‘QUESO’: Quantification of Uncertainty for Estimation, Simulation and Optimization

    KAUST Repository

    Prudencio, Ernesto E.

    2012-01-01

    QUESO is a collection of statistical algorithms and programming constructs supporting research into the uncertainty quantification (UQ) of models and their predictions. It has been designed with three objectives: it should (a) be sufficiently abstract in order to handle a large spectrum of models, (b) be algorithmically extensible, allowing an easy insertion of new and improved algorithms, and (c) take advantage of parallel computing, in order to handle realistic models. Such objectives demand a combination of an object-oriented design with robust software engineering practices. QUESO is written in C++, uses MPI, and leverages libraries already available to the scientific community. We describe some UQ concepts, present QUESO, and list planned enhancements.

  11. Optimal systematics of single-humped fission barriers for statistical calculations

    International Nuclear Information System (INIS)

    Mashnik, S.G.

    1993-01-01

    A systematic comparison of the existing phenomenological approaches and models for describing single-humped fast-computing fission barriers is given. The experimental data on the excitation energy dependence of the fissility of compound nuclei are analyzed in the framework of the statistical approach, using different models for the fission barriers, shell and pairing corrections and the level-density parameter, in order to identify their reliability and region of applicability for Monte Carlo calculations of evaporative cascades. The energy dependence of fission cross-sections for reactions induced by intermediate energy protons has been analyzed in the framework of the cascade-exciton model. 53 refs., 15 figs., 3 tabs

  12. Optimizing Statistical Character Recognition Using Evolutionary Strategies to Recognize Aircraft Tail Numbers

    Directory of Open Access Journals (Sweden)

    Antonio Berlanga

    2004-07-01

    Full Text Available The design of statistical classification systems for optical character recognition (OCR) is a cumbersome task. This paper proposes a method using evolutionary strategies (ES) to evolve and upgrade the set of parameters of an OCR system. The OCR system is applied to identify the tail numbers of aircraft moving around the airport. The proposed approach is discussed and some results are obtained using a benchmark data set. This research demonstrates the successful application of ES to a difficult, noisy, real-world problem.

  13. Optimal design of sampling and mapping schemes in the radiometric exploration of Chipilapa, El Salvador (Geo-statistics)

    International Nuclear Information System (INIS)

    Balcazar G, M.; Flores R, J.H.

    1992-01-01

    As part of the radiometric surface exploration carried out in the Chipilapa geothermal field, El Salvador, the geo-statistical parameters were considered starting from the variogram calculated from the field data. The maximum correlation distance of the radon samples in the different observation directions (N-S, E-W, NW-SE, NE-SW) was 121 m, which sets the spacing of the monitoring grid for future surveys in the same area. From this, an optimization (minimum cost) of the field-sample spacing was derived by means of geo-statistical techniques, without losing the ability to detect the anomaly. (Author)

  14. Statistical optimization for alkali pretreatment conditions of narrow-leaf cattail by response surface methodology

    Directory of Open Access Journals (Sweden)

    Arrisa Ruangmee

    2013-08-01

    Full Text Available Response surface methodology with central composite design was applied to optimize alkali pretreatment of narrow-leaf cattail (Typha angustifolia). Joint effects of three independent variables, NaOH concentration (1-5%), temperature (60-100 ºC), and reaction time (30-150 min), were investigated to evaluate the increase in and the improvement of the cellulosic components contained in the raw material after pretreatment. The combined optimum condition based on the cellulosic content obtained from this study is: a concentration of 5% NaOH, a reaction time of 120 min, and a temperature of 100 ºC. This result has been analyzed employing ANOVA with a second order polynomial equation. The model was found to be significant and was able to predict accurately the response of strength at less than 5% error. Under this combined optimal condition, the desirable cellulosic content in the sample increased from 38.5 to 68.3%, while the unfavorable hemicellulosic content decreased from 37.6 to 7.3%.
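
    A minimal sketch of how a second-order (quadratic) response-surface model of the kind used above can be fitted to central-composite-design data; the coded design points and the response values below are invented for illustration only:

      import numpy as np

      def quadratic_terms(X):
          """Second-order model terms for three coded factors:
          intercept, linear, squared and two-factor interaction terms."""
          x1, x2, x3 = X[:, 0], X[:, 1], X[:, 2]
          return np.column_stack([np.ones(len(X)), x1, x2, x3,
                                  x1**2, x2**2, x3**2,
                                  x1*x2, x1*x3, x2*x3])

      a = 1.68  # axial distance of the central composite design (assumed)
      X = np.array([[-1, -1, -1], [1, -1, -1], [-1, 1, -1], [1, 1, -1],
                    [-1, -1, 1], [1, -1, 1], [-1, 1, 1], [1, 1, 1],
                    [a, 0, 0], [-a, 0, 0], [0, a, 0], [0, -a, 0],
                    [0, 0, a], [0, 0, -a], [0, 0, 0], [0, 0, 0]], dtype=float)
      y = np.array([45, 52, 50, 60, 48, 58, 55, 68,
                    62, 44, 59, 47, 61, 49, 57, 56], dtype=float)  # illustrative responses

      beta, *_ = np.linalg.lstsq(quadratic_terms(X), y, rcond=None)
      print("fitted second-order coefficients:", np.round(beta, 3))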

  15. Modeling landslide susceptibility in data-scarce environments using optimized data mining and statistical methods

    Science.gov (United States)

    Lee, Jung-Hyun; Sameen, Maher Ibrahim; Pradhan, Biswajeet; Park, Hyuck-Jin

    2018-02-01

    This study evaluated the generalizability of five models in order to select a suitable approach for landslide susceptibility modeling in data-scarce environments. In total, 418 landslide inventories and 18 landslide conditioning factors were analyzed. Multicollinearity and factor optimization were investigated before data modeling, and two experiments were then conducted. In each experiment, five susceptibility maps were produced based on support vector machine (SVM), random forest (RF), weight-of-evidence (WoE), ridge regression (Rid_R), and robust regression (RR) models. The highest accuracy (AUC = 0.85) was achieved with the SVM model when either the full or the limited landslide inventory was used. Furthermore, the RF and WoE models were severely affected when fewer landslide samples were used for training. The other models were affected only slightly when the training samples were limited.

  16. Statistical investigations for an optimal evaluation of data in international safeguards

    International Nuclear Information System (INIS)

    Beedgen, R.

    1981-09-01

    In international safeguards of nuclear material, material accountancy is an essential principle of the IAEA. The material balance is closed with the operator's data, which are verified by the inspector by means of independent measurements on a random sampling basis. The inspector's results have a probabilistic character because of measurement uncertainties and sampling. A diverter has, in principle, two possibilities: diversion without data falsification, playing with the measurement uncertainties, or diversion with data falsification such that the material balance appears to be correct. The strategies of the inspector are closing the material balance and independently verifying the operator's data. The question answered is which test procedure leads, under certain assumptions, to the highest detection probability when the false alarm probability is fixed. The possibility of an optimal strategy of the diverter is taken into account. The results are partly illustrated with examples. (orig./HP) [de]

  17. Robust video watermarking via optimization algorithm for quantization of pseudo-random semi-global statistics

    Science.gov (United States)

    Kucukgoz, Mehmet; Harmanci, Oztan; Mihcak, Mehmet K.; Venkatesan, Ramarathnam

    2005-03-01

    In this paper, we propose a novel semi-blind video watermarking scheme, where we use pseudo-random robust semi-global features of video in the three dimensional wavelet transform domain. We design the watermark sequence via solving an optimization problem, such that the features of the mark-embedded video are the quantized versions of the features of the original video. The exact realizations of the algorithmic parameters are chosen pseudo-randomly via a secure pseudo-random number generator, whose seed is the secret key, that is known (resp. unknown) by the embedder and the receiver (resp. by the public). We experimentally show the robustness of our algorithm against several attacks, such as conventional signal processing modifications and adversarial estimation attacks.

  18. Statistical optimization of synthesis procedure and characterization of europium (III) molybdate nano-plates

    Energy Technology Data Exchange (ETDEWEB)

    Pourmortazavi, Seied Mahdi [Malek Ashtar University of Technology, Faculty of Material and Manufacturing Technologies, P. O. Box 16765-3454, Tehran (Iran, Islamic Republic of); Rahimi-Nasrabadi, Mehdi [Imam Hossein University, Nano Science Center, Tehran (Iran, Islamic Republic of); Fazli, Yousef [Islamic Azad University, Department of Chemistry, Faculty of Science, Arak Branch, Arak (Iran, Islamic Republic of); Mohammad-Zadeh, Mohammad [Sabzevar University of Medical Sciences, Department of Physiology and Pharmacology, School of Medicine, Sabzevar (Iran, Islamic Republic of)

    2015-06-15

    Europium (III) molybdate nano-plates were synthesized in this work via a chemical precipitation route involving the addition of europium (III) ion solution to an aqueous solution of molybdate reagent. The effects of some reaction variables, such as the concentrations of europium and molybdate ions, the flow rate of the europium reagent, and the reactor temperature, on the diameter of the synthesized europium (III) molybdate nano-plates were experimentally investigated by orthogonal array design. The results showed that the size of the europium (III) molybdate nano-plates can be optimized by adjusting the concentrations of europium (III) and molybdate ions, as well as the reaction temperature. Europium (III) molybdate nano-plates prepared under the optimum conditions were characterized by X-ray powder diffraction, scanning electron microscopy, and Fourier transform infrared spectroscopy. (orig.)

  19. Stochastic optimal control as non-equilibrium statistical mechanics: calculus of variations over density and current

    Science.gov (United States)

    Chernyak, Vladimir Y.; Chertkov, Michael; Bierkens, Joris; Kappen, Hilbert J.

    2014-01-01

    In stochastic optimal control (SOC) one minimizes the average cost-to-go, which consists of the cost-of-control (amount of effort), the cost-of-space (where one wants the system to be) and the target cost (where one wants the system to arrive), for a system participating in forced and controlled Langevin dynamics. We extend the SOC problem by introducing an additional cost-of-dynamics, characterized by a vector potential. We propose a derivation of the generalized gauge-invariant Hamilton-Jacobi-Bellman equation as a variation over density and current, suggest a hydrodynamic interpretation, and discuss examples, e.g., ergodic control of a particle within a circle, illustrating non-equilibrium space-time complexity.
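
    A conventional way to write the cost-to-go described above, sketched in generic notation (the symbols R, V and Φ are placeholders, not the paper's own notation):

      % Sketch of a standard SOC cost-to-go with quadratic control cost,
      % state cost V and terminal (target) cost \Phi; generic notation only.
      J(x,t) = \mathbb{E}\!\left[ \int_t^T \Big( \tfrac{1}{2}\, u_s^{\top} R\, u_s
               + V(x_s, s) \Big)\, \mathrm{d}s + \Phi(x_T) \,\middle|\, x_t = x \right]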

  20. Statistical Evaluation and Optimization of Factors Affecting the Leaching Performance of Copper Flotation Waste

    Directory of Open Access Journals (Sweden)

    Semra Çoruh

    2012-01-01

    Full Text Available Copper flotation waste is an industrial by-product material produced from the process of manufacturing copper. The main concern with respect to landfilling of copper flotation waste is the release of elements (e.g., salts and heavy metals) when in contact with water, that is, leaching. Copper flotation waste generally contains a significant amount of Cu together with trace elements of other toxic metals, such as Zn, Co, and Pb. The release of heavy metals into the environment has resulted in a number of environmental problems. The aim of this study is to investigate the leaching characteristics of copper flotation waste by use of the Box-Behnken experimental design approach. In order to obtain the optimized condition of leachability, a second-order model was examined. The best leaching conditions achieved were as follows: pH = 9, stirring time = 5 min, and temperature = 41.5°C.

  1. Statistical evaluation and optimization of factors affecting the leaching performance of copper flotation waste.

    Science.gov (United States)

    Coruh, Semra; Elevli, Sermin; Geyikçi, Feza

    2012-01-01

    Copper flotation waste is an industrial by-product material produced from the process of manufacturing copper. The main concern with respect to landfilling of copper flotation waste is the release of elements (e.g., salts and heavy metals) when in contact with water, that is, leaching. Copper flotation waste generally contains a significant amount of Cu together with trace elements of other toxic metals, such as Zn, Co, and Pb. The release of heavy metals into the environment has resulted in a number of environmental problems. The aim of this study is to investigate the leaching characteristics of copper flotation waste by use of the Box-Behnken experimental design approach. In order to obtain the optimized condition of leachability, a second-order model was examined. The best leaching conditions achieved were as follows: pH = 9, stirring time = 5 min, and temperature = 41.5 °C.

  2. Statistical optimization for lipase production from solid waste of vegetable oil industry.

    Science.gov (United States)

    Sahoo, Rajesh Kumar; Kumar, Mohit; Mohanty, Swati; Sawyer, Matthew; Rahman, Pattanathu K S M; Sukla, Lala Behari; Subudhi, Enketeswara

    2018-04-21

    The production of biofuel using a thermostable bacterial lipase from hot-spring bacteria grown on olive oil cake, a low-cost agricultural residue, is reported in the present paper. Using a lipase enzyme from Bacillus licheniformis, a 66.5% yield of methyl esters was obtained. Optimum parameters were determined, with maximum production of lipase at pH 8.2, a temperature of 50.8°C, a moisture content of 55.7%, and a biosurfactant content of 1.693 mg. The contour plots and 3D response surfaces depict the significant interaction of pH and moisture content with biosurfactant during lipase production. Chromatographic analysis of the lipase-catalyzed transesterification product from kitchen waste oil under optimized conditions showed that the methyl esters generated were methyl palmitate, methyl stearate, methyl oleate, and methyl linoleate.

  3. Treatment of automotive industry oily wastewater by electrocoagulation: statistical optimization of the operational parameters.

    Science.gov (United States)

    GilPavas, Edison; Molina-Tirado, Kevin; Gómez-García, Miguel Angel

    2009-01-01

    An electrocoagulation process was used for the treatment of oily wastewater generated by an automotive industry in Medellín (Colombia). An electrochemical cell consisting of four parallel electrodes (Fe and Al) in bipolar configuration was implemented. A multifactorial experimental design was used for evaluating the influence of several parameters, including the type and arrangement of electrodes, pH, and current density. Oil and grease removal was defined as the response variable for the statistical analysis. Additionally, the BOD(5), COD, and TOC were monitored during the treatment process. According to the results, at the optimum parameter values (current density = 4.3 mA/cm(2), distance between electrodes = 1.5 cm, Fe as anode, and pH = 12) it was possible to reach ca. 95% oil and grease removal, with COD removal and mineralization of 87.4% and 70.6%, respectively. A final biodegradability (BOD(5)/COD) of 0.54 was reached.

  4. Beginning statistics with data analysis

    CERN Document Server

    Mosteller, Frederick; Rourke, Robert EK

    2013-01-01

    This introduction to the world of statistics covers exploratory data analysis, methods for collecting data, formal statistical inference, and techniques of regression and analysis of variance. 1983 edition.

  5. Screening of Ganoderma strains with high polysaccharides and ganoderic acid contents and optimization of the fermentation medium by statistical methods.

    Science.gov (United States)

    Wei, Zhen-hua; Duan, Ying-yi; Qian, Yong-qing; Guo, Xiao-feng; Li, Yan-jun; Jin, Shi-he; Zhou, Zhong-Xin; Shan, Sheng-yan; Wang, Chun-ru; Chen, Xue-Jiao; Zheng, Yuguo; Zhong, Jian-Jiang

    2014-09-01

    Polysaccharides and ganoderic acids (GAs) are the major bioactive constituents of Ganoderma species. However, the commercialization of their production has been limited by low yields in the submerged culture of Ganoderma, despite improvements made in recent years. In this work, twelve Ganoderma strains were screened for efficient production of polysaccharides and GAs, and Ganoderma lucidum 5.26 (GL 5.26), which had never been reported in a fermentation process, was found to be the most efficient among the tested strains. The fermentation medium was then optimized for GL 5.26 by statistical methods. Firstly, glucose and yeast extract were found to be the optimum carbon source and nitrogen source according to single-factor tests. Ferric sulfate was found to have a significant effect on GL 5.26 biomass production according to the results of a Plackett-Burman design. The concentrations of glucose, yeast extract and ferric sulfate were further optimized by response surface methodology. The optimum medium composition was 55 g/L of glucose, 14 g/L of yeast extract and 0.3 g/L of ferric sulfate, with the other medium components unchanged. The optimized medium was tested in a 10-L bioreactor, and the production of biomass, IPS, total GAs and GA-T was enhanced by 85, 27, 49 and 93 %, respectively, compared to the initial medium. The fermentation process was scaled up to a 300-L bioreactor; it showed good IPS (3.6 g/L) and GAs (670 mg/L) production. The biomass was 23.9 g/L in the 300-L bioreactor, which was the highest biomass production at pilot scale. According to this study, the strain GL 5.26 showed good fermentation properties with the optimized medium. It might be a candidate industrial strain after further process optimization and scale-up study.

  6. Estimation of fundamental kinetic parameters of polyhydroxybutyrate fermentation process of Azohydromonas australica using statistical approach of media optimization.

    Science.gov (United States)

    Gahlawat, Geeta; Srivastava, Ashok K

    2012-11-01

    Polyhydroxybutyrate (PHB) is a biodegradable and biocompatible thermoplastic with many interesting applications in medicine, food packaging, and tissue engineering materials. The present study deals with the enhanced production of PHB by Azohydromonas australica using sucrose and the estimation of fundamental kinetic parameters of the PHB fermentation process. Preliminary culture growth inhibition studies were followed by statistical optimization of the medium recipe using response surface methodology to increase PHB production. Later on, batch cultivation in a 7-L bioreactor was attempted using the optimum concentrations of medium components (process variables) obtained from the statistical design to identify the batch growth and product kinetics parameters of the PHB fermentation. A. australica exhibited a maximum biomass and PHB concentration of 8.71 and 6.24 g/L, respectively, in the bioreactor, with an overall PHB production rate of 0.75 g/h. Bioreactor cultivation studies demonstrated that the specific biomass and PHB yields on sucrose were 0.37 and 0.29 g/g, respectively. The kinetic parameters obtained in the present investigation will be used in the development of a batch kinetic mathematical model for PHB production, which will serve as a launching pad for further process optimization studies, e.g., the design of several bioreactor cultivation strategies to further enhance biopolymer production.

  7. Design and performance characteristics of solar adsorption refrigeration system using parabolic trough collector: Experimental and statistical optimization technique

    International Nuclear Information System (INIS)

    Abu-Hamdeh, Nidal H.; Alnefaie, Khaled A.; Almitani, Khalid H.

    2013-01-01

    Highlights: • The success of using olive waste/methanol as an adsorbent/adsorbate pair. • The experimental gross cycle coefficient of performance obtained was COPa = 0.75. • Optimization showed that expanding the adsorbent mass within a certain range increases the COP. • The statistical optimization led to an optimum tank volume between 0.2 and 0.3 m³. • Increasing the collector area within a certain range increased the COP. - Abstract: The current work demonstrates a developed model of a solar adsorption refrigeration system with specific requirements and specifications. The scheme can be employed as a refrigerator and cooler unit suitable for remote areas. The unit runs through a parabolic trough solar collector (PTC) and uses olive waste as adsorbent with methanol as adsorbate. Cooling production, COP (coefficient of performance) and COPa (cycle gross coefficient of performance) were used to assess the system performance. The system's optimum design parameters in this study were arrived at through statistical and experimental methods. The lowest temperature attained in the refrigerated space was 4 °C, while the corresponding ambient temperature was 27 °C. The temperature started to decrease steadily at 20:30, when the actual cooling started, and reached 4 °C at 01:30 the next day, after which it rose again. The highest COPa obtained was 0.75

  8. Optimism in the face of uncertainty supported by a statistically-designed multi-armed bandit algorithm.

    Science.gov (United States)

    Kamiura, Moto; Sano, Kohei

    2017-10-01

    The principle of optimism in the face of uncertainty is known as a heuristic in sequential decision-making problems. The Overtaking method based on this principle is an effective algorithm for solving multi-armed bandit problems. In a previous study it was defined by a set of heuristic patterns of the formulation. The objective of the present paper is to redefine the value functions of the Overtaking method and to unify their formulation. The unified Overtaking method is associated with upper bounds of confidence intervals of expected rewards in the statistical sense. The unification of the formulation enhances the universality of the Overtaking method. Consequently, we newly obtain the Overtaking method for exponentially distributed rewards, analyze it numerically, and show that it outperforms the UCB algorithm on average. The present study suggests that the principle of optimism in the face of uncertainty should be regarded as the statistics-based consequence of the law of large numbers for the sample mean of rewards and of the estimation of upper bounds of expected rewards, rather than as a heuristic, in the context of multi-armed bandit problems. Copyright © 2017 Elsevier B.V. All rights reserved.
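
    For reference, a minimal sketch of the classical UCB1 baseline that the abstract compares against (not the Overtaking method itself); the Bernoulli arms and the horizon are invented for illustration:

      import math
      import random

      def ucb1(pull, n_arms, horizon):
          """Minimal UCB1 loop for a stochastic multi-armed bandit; pull(arm) must
          return a reward in [0, 1]."""
          counts = [0] * n_arms
          sums = [0.0] * n_arms
          for t in range(1, horizon + 1):
              if t <= n_arms:                      # play each arm once first
                  arm = t - 1
              else:                                # optimism: sample mean + confidence bonus
                  arm = max(range(n_arms),
                            key=lambda a: sums[a] / counts[a]
                            + math.sqrt(2.0 * math.log(t) / counts[a]))
              reward = pull(arm)
              counts[arm] += 1
              sums[arm] += reward
          return counts, [s / c for s, c in zip(sums, counts)]

      # Illustrative Bernoulli arms with unknown means
      means = [0.2, 0.5, 0.7]
      counts, estimates = ucb1(lambda a: 1.0 if random.random() < means[a] else 0.0,
                               n_arms=3, horizon=2000)
      print("pull counts:", counts)
      print("estimated means:", [round(m, 3) for m in estimates])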

  9. Statistically designed enzymatic hydrolysis of an icariin/β-cyclodextrin inclusion complex optimized for production of icaritin

    Directory of Open Access Journals (Sweden)

    Xin Jin

    2012-02-01

    Full Text Available This study focuses on the preparation and enzymatic hydrolysis of an icariin/β-cyclodextrin inclusion complex to efficiently generate icaritin. The physical characteristics of the inclusion complex were evaluated by differential scanning calorimetry (DSC). Enzymatic hydrolysis was optimized for the conversion of the icariin/β-cyclodextrin complex to icaritin by Box–Behnken statistical design. The inclusion complex formulation increased the solubility of icariin approximately 17-fold, from 29.2 to 513.5 μg/mL at 60 °C. The optimum conditions were predicted by the Box–Behnken statistical design as follows: 60 °C, pH 7.0, an enzyme/substrate ratio of 1:1.1 and a reaction time of 7 h. Under the optimal conditions the conversion of icariin was 97.91% and the reaction time was decreased by 68% compared with that without β-CD inclusion. Product analysis by melting point, ESI-MS, UV, IR, 1H NMR and 13C NMR confirmed the authenticity of icaritin with a purity of 99.3% and a yield of 473 mg of icaritin from 1.1 g icariin.

  10. Statistically optimal estimation of Greenland Ice Sheet mass variations from GRACE monthly solutions using an improved mascon approach

    Science.gov (United States)

    Ran, J.; Ditmar, P.; Klees, R.; Farahani, H. H.

    2018-03-01

    We present an improved mascon approach to transform monthly spherical harmonic solutions based on GRACE satellite data into mass anomaly estimates in Greenland. The GRACE-based spherical harmonic coefficients are used to synthesize gravity anomalies at satellite altitude, which are then inverted into mass anomalies per mascon. The limited spectral content of the gravity anomalies is properly accounted for by applying a low-pass filter as part of the inversion procedure to make the functional model spectrally consistent with the data. The full error covariance matrices of the monthly GRACE solutions are properly propagated using the law of covariance propagation. Using numerical experiments, we demonstrate the importance of a proper data weighting and of the spectral consistency between functional model and data. The developed methodology is applied to process real GRACE level-2 data (CSR RL05). The obtained mass anomaly estimates are integrated over five drainage systems, as well as over entire Greenland. We find that the statistically optimal data weighting reduces random noise by 35-69%, depending on the drainage system. The obtained mass anomaly time-series are de-trended to eliminate the contribution of ice discharge and are compared with de-trended surface mass balance (SMB) time-series computed with the Regional Atmospheric Climate Model (RACMO 2.3). We show that when using a statistically optimal data weighting in GRACE data processing, the discrepancies between GRACE-based estimates of SMB and modelled SMB are reduced by 24-47%.
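
    A minimal sketch of the statistically optimal (full-covariance) weighting that such an inversion relies on, i.e. generalized least squares with propagation of the full error covariance matrix; the design matrix, covariance and data below are synthetic stand-ins, not GRACE products or the paper's functional model:

      import numpy as np

      def weighted_least_squares(A, y, C):
          """Estimate x minimizing (y - Ax)^T C^{-1} (y - Ax), i.e. data weighted with
          the full error covariance C, and propagate the covariance of the estimate."""
          Cinv = np.linalg.inv(C)
          N = A.T @ Cinv @ A                 # normal matrix
          x_hat = np.linalg.solve(N, A.T @ Cinv @ y)
          cov_x = np.linalg.inv(N)           # law of covariance propagation
          return x_hat, cov_x

      # Tiny synthetic example: 2 mascon parameters, 5 observations with correlated noise
      rng = np.random.default_rng(0)
      A = rng.normal(size=(5, 2))
      C = 0.1 * np.eye(5) + 0.02 * np.ones((5, 5))   # illustrative full covariance
      x_true = np.array([1.0, -2.0])
      y = A @ x_true + rng.multivariate_normal(np.zeros(5), C)

      x_hat, cov_x = weighted_least_squares(A, y, C)
      print("estimate:", x_hat)
      print("formal standard deviations:", np.sqrt(np.diag(cov_x)))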

  11. Lipid-polymer hybrid nanoparticles: Development & statistical optimization of norfloxacin for topical drug delivery system

    Directory of Open Access Journals (Sweden)

    Vivek Dave

    2017-12-01

    Full Text Available Poly(lactic acid) is a biodegradable, biocompatible, and non-toxic polymer, widely used in many pharmaceutical preparations such as controlled release formulations, parenteral preparations, surgical applications, and tissue engineering. In this study, we prepared lipid-polymer hybrid nanoparticles for topical and site-targeted delivery of Norfloxacin by the emulsification solvent evaporation (ESE) method. The design of experiments (DOE) was done using software to optimize the result, and a surface plot was then generated to compare with the practical results. The surface morphology, particle size, zeta potential and composition of the lipid-polymer hybrid nanoparticles were characterized by SEM, TEM, AFM, and FTIR. The thermal behavior of the lipid-polymer hybrid nanoparticles was characterized by DSC and TGA. The prepared lipid-polymer hybrid nanoparticles of Norfloxacin exhibited an average particle size from 178.6 ± 3.7 nm to 220.8 ± 2.3 nm, and showed a very narrow distribution with a polydispersity index ranging from 0.206 ± 0.36 to 0.383 ± 0.66. The surface charge on the lipid-polymer hybrid nanoparticles was confirmed by zeta potential, with values from +23.4 ± 1.5 mV to +41.5 ± 3.4 mV. An antimicrobial study was performed against Staphylococcus aureus and Pseudomonas aeruginosa, and the lipid-polymer hybrid nanoparticles showed potential activity against both organisms. Lipid-polymer hybrid nanoparticles of Norfloxacin showed a cumulative drug release of 89.72% in 24 h. A stability study of the optimized formulation showed that the suitable condition for storage of the lipid-polymer hybrid nanoparticles was 4 ± 2 °C/60 ± 5% RH. These results illustrate the high potential of Norfloxacin lipid-polymer hybrid nanoparticles for use as topical antibiotic drug carriers.

  12. Optimization of biodiesel production from Thevetia peruviana seed oil by adaptive neuro-fuzzy inference system coupled with genetic algorithm and response surface methodology

    International Nuclear Information System (INIS)

    Ogaga Ighose, Benjamin; Adeleke, Ibrahim A.; Damos, Mueuji; Adeola Junaid, Hamidat; Ernest Okpalaeke, Kelechi; Betiku, Eriola

    2017-01-01

    Highlights: • Oil was extracted from Thevetia peruviana seeds and converted to FAME. • The FFA of the oil was first reduced to <1% by an esterification process. • The conversion of the esterified oil to FAME was modeled using ANFIS and RSM. • The models developed by ANFIS and RSM for the transesterification process had R² ≈ 1. • GA and RSM gave maximum FAME yields of 99.8 wt.% and 98.8 wt.%, respectively. - Abstract: This work focused on the application of adaptive neuro-fuzzy inference system (ANFIS) and response surface methodology (RSM) as predictive tools for production of fatty acid methyl esters (FAME) from yellow oleander (Thevetia peruviana) seed oil. A two-step transesterification method was adopted. In the first step, the high free fatty acid (FFA) content of the oil was reduced to <1% by treating it with ferric sulfate in the presence of methanol. In the second step, the pretreated oil was converted to FAME by reacting it with methanol using sodium methoxide as catalyst. To model the second step, central composite design was employed to study the effect of catalyst loading (1–2 wt.%), methanol/oil molar ratio (6:1–12:1) and time (20–60 min) on the T. peruviana methyl esters (TPME) yield. The reduction of the FFA of the oil to 0.65 ± 0.05 wt.% was realized using ferric sulfate of 3 wt.%, a methanol/FFA molar ratio of 9:1 and a reaction time of 40 min. The model developed for the transesterification process by ANFIS (coefficient of determination, R² = 0.9999, standard error of prediction, SEP = 0.07 and mean absolute percentage deviation, MAPD = 0.05%) was significantly better than that of RSM (R² = 0.9670, SEP = 1.55 and MAPD = 0.84%) in terms of accuracy of the predicted TPME yield. For maximum TPME yield, the transesterification process input variables were optimized using genetic algorithm (GA) coupled with the ANFIS model and RSM optimization tool. TPME yield of 99.8 wt.% could be obtained with the combination of 0.79 w/v catalyst

  13. Evaluation of HDPE and LDPE degradation by fungus, implemented by statistical optimization

    Science.gov (United States)

    Ojha, Nupur; Pradhan, Neha; Singh, Surjit; Barla, Anil; Shrivastava, Anamika; Khatua, Pradip; Rai, Vivek; Bose, Sutapa

    2017-01-01

    Plastic in any form is a nuisance to the well-being of the environment. The ‘pestilence’ caused by it is mainly due to its non-degradable nature. With the industrial boom and the population explosion, the usage of plastic products has increased, and this steady increase has accelerated the pollution. Several attempts have been made to curb the problem at large by resorting to both chemical and biological methods. Chemical methods have only resulted in furthering the pollution by releasing toxic gases into the atmosphere, whereas biological methods have been found to be eco-friendly; however, they are not cost-effective. This paves the way for the current study, where fungal isolates have been used to degrade polyethylene sheets (HDPE, LDPE). Two potential fungal strains, namely Penicillium oxalicum NS4 (KU559906) and Penicillium chrysogenum NS10 (KU559907), were isolated and identified as having plastic-degrading abilities. Further, the growth medium for the strains was optimized with the help of RSM. The plastic sheets were subjected to treatment with the microbial cultures for 90 days. The extent of degradation was analyzed by FE-SEM, AFM and FTIR. Morphological changes in the plastic sheets were determined.

  14. Statistical Approach for Optimization of Physiochemical Requirements on Alkaline Protease Production from Bacillus licheniformis NCIM 2042

    Directory of Open Access Journals (Sweden)

    Biswanath Bhunia

    2012-01-01

    Full Text Available The optimization of physiochemical parameters for alkaline protease production using Bacillus licheniformis NCIM 2042 was carried out by Plackett-Burman design and response surface methodology (RSM). The model was validated experimentally and the maximum protease production was found to be 315.28 U under the optimum culture conditions. The protease was purified using the ammonium sulphate (60%) precipitation technique. HPLC analysis of the dialyzed sample showed a retention time of 1.84 min with 73.5% purity. This enzyme retained more than 92% of its initial activity after preincubation for 30 min at 37 °C in the presence of 25% v/v DMSO, methanol, ethanol, ACN, 2-propanol, benzene, toluene, and hexane. In addition, the partially purified enzyme showed remarkable stability for 60 min at room temperature in the presence of detergents (Tween-80 and Triton X-100), surfactant (SDS), bleaching agents (sodium perborate and hydrogen peroxide), and anti-redeposition agents (Na2CMC, Na2CO3). The purified enzyme containing 10% w/v PEG 4000 showed better thermal, surfactant, and local detergent stability.

  15. Statistical modelling and optimization of hydrolysis of urea to generate ammonia for flue gas conditioning

    International Nuclear Information System (INIS)

    Mahalik, K.; Sahu, J.N.; Patwardhan, Anand V.; Meikap, B.C.

    2010-01-01

    The present study is concerned with a technique for producing a relatively small quantity of ammonia which can be used safely in a coal-fired thermal power plant to improve the efficiency of the electrostatic precipitator by removing the suspended particulate material, mostly fly ash, from the flue gas. In this work, hydrolysis of urea was conducted in a batch reactor at atmospheric pressure to study the effect of different reaction variables, such as reaction temperature, initial concentration and stirring speed, on the conversion, using design-of-experiments software. A 2³ full factorial central composite design (CCD) was employed and a quadratic model equation was developed. The study reveals that conversion increases exponentially with an increase in temperature, stirring speed and feed concentration; the stirring speed has the greatest effect on the conversion, with concentration and temperature exerting the least and a moderate effect, respectively. The values of equilibrium conversion obtained through the developed models agree well with their experimental counterparts, with a satisfactory correlation coefficient of 93%. The developed quadratic model was optimized using quadratic programming to maximize the conversion of urea within the experimental range studied. The optimum conditions were found to be a temperature of 130 °C, a feed concentration of 4.16 mol/L and a stirring speed of 400 rpm, with a corresponding conversion of 63.242%.
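
    A minimal sketch of maximizing a fitted quadratic (second-order) model over the experimental ranges with a bounded optimizer; the coefficients and bounds below are illustrative stand-ins, not the model fitted in the study:

      import numpy as np
      from scipy.optimize import minimize

      def conversion(v):
          """Hypothetical fitted second-order model: conversion (%) as a function of
          temperature (degC), feed concentration (mol/L) and stirring speed (rpm)."""
          T, c, s = v
          return (-150.0 + 2.0 * T + 20.0 * c + 0.25 * s
                  - 0.008 * T**2 - 2.5 * c**2 - 0.0003 * s**2 + 0.01 * T * c)

      bounds = [(90, 130), (1, 5), (100, 400)]        # assumed experimental ranges
      x0 = np.array([110.0, 3.0, 250.0])
      res = minimize(lambda v: -conversion(v), x0, bounds=bounds)
      print("optimum settings (T, conc, rpm):", np.round(res.x, 2))
      print("predicted maximum conversion: %.1f %%" % conversion(res.x))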

  16. Sustained release biodegradable solid lipid microparticles: Formulation, evaluation and statistical optimization by response surface methodology

    Directory of Open Access Journals (Sweden)

    Hanif Muhammad

    2017-12-01

    Full Text Available For preparing nebivolol loaded solid lipid microparticles (SLMs) by the solvent evaporation microencapsulation process from carnauba wax and glyceryl monostearate, central composite design was used to study the impact of independent variables on yield (Y1), entrapment efficiency (Y2) and drug release (Y3). SLMs having a 10-40 μm size range, with good rheological behavior and spherical smooth surfaces, were produced. Fourier transform infrared spectroscopy, differential scanning calorimetry and X-ray diffractometry pointed to compatibility between formulation components and the zeta-potential study confirmed better stability due to the presence of negative charge (-20 to -40 mV). The obtained outcomes for Y1 (29-86 %), Y2 (45-83 %) and Y3 (49-86 %) were analyzed by polynomial equations and the suggested quadratic models were validated. Nebivolol release from SLMs at pH 1.2 and 6.8 was significantly (p 0.85 value (Korsmeyer-Peppas) suggested slow erosion along with diffusion. The optimized SLMs have the potential to improve nebivolol oral bioavailability.

  17. Sustained release biodegradable solid lipid microparticles: Formulation, evaluation and statistical optimization by response surface methodology.

    Science.gov (United States)

    Hanif, Muhammad; Khan, Hafeez Ullah; Afzal, Samina; Mahmood, Asif; Maheen, Safirah; Afzal, Khurram; Iqbal, Nabila; Andleeb, Mehwish; Abbas, Nazar

    2017-12-20

    For preparing nebivolol loaded solid lipid microparticles (SLMs) by the solvent evaporation microencapsulation process from carnauba wax and glyceryl monostearate, central composite design was used to study the impact of independent variables on yield (Y1), entrapment efficiency (Y2) and drug release (Y3). SLMs having a 10-40 μm size range, with good rheological behavior and spherical smooth surfaces, were produced. Fourier transform infrared spectroscopy, differential scanning calorimetry and X-ray diffractometry pointed to compatibility between formulation components and the zeta-potential study confirmed better stability due to the presence of negative charge (-20 to -40 mV). The obtained outcomes for Y1 (29-86 %), Y2 (45-83 %) and Y3 (49-86 %) were analyzed by polynomial equations and the suggested quadratic models were validated. Nebivolol release from SLMs at pH 1.2 and 6.8 was significantly (p 0.85 value (Korsmeyer-Peppas) suggested slow erosion along with diffusion. The optimized SLMs have the potential to improve nebivolol oral bioavailability.

  18. Evaluation of HDPE and LDPE degradation by fungus, implemented by statistical optimization.

    Science.gov (United States)

    Ojha, Nupur; Pradhan, Neha; Singh, Surjit; Barla, Anil; Shrivastava, Anamika; Khatua, Pradip; Rai, Vivek; Bose, Sutapa

    2017-01-04

    Plastic in any form is a nuisance to the well-being of the environment. The 'pestilence' caused by it is mainly due to its non-degradable nature. With the industrial boom and the population explosion, the usage of plastic products has increased, and this steady increase has accelerated the pollution. Several attempts have been made to curb the problem at large by resorting to both chemical and biological methods. Chemical methods have only resulted in furthering the pollution by releasing toxic gases into the atmosphere, whereas biological methods have been found to be eco-friendly; however, they are not cost-effective. This paves the way for the current study, where fungal isolates have been used to degrade polyethylene sheets (HDPE, LDPE). Two potential fungal strains, namely Penicillium oxalicum NS4 (KU559906) and Penicillium chrysogenum NS10 (KU559907), were isolated and identified as having plastic-degrading abilities. Further, the growth medium for the strains was optimized with the help of RSM. The plastic sheets were subjected to treatment with the microbial cultures for 90 days. The extent of degradation was analyzed by FE-SEM, AFM and FTIR. Morphological changes in the plastic sheets were determined.

  19. Statistical optimization of process parameters for the production of tannase by Aspergillus flavus under submerged fermentation.

    Science.gov (United States)

    Mohan, S K; Viruthagiri, T; Arunkumar, C

    2014-04-01

    Production of tannase by Aspergillus flavus (MTCC 3783) using tamarind seed powder as substrate was studied in submerged fermentation. A Plackett-Burman design was applied for the screening of 12 medium nutrients. From the results, the significant nutrients were identified as tannic acid, magnesium sulfate, ferrous sulfate and ammonium sulfate. Further, the optimization of process parameters was carried out using response surface methodology (RSM). RSM was applied for designing the experiments to evaluate the interactive effects through a full 31 factorial design. The optimum conditions were tannic acid concentration, 3.22 %; fermentation period, 96 h; temperature, 35.1 °C; and pH 5.4. The high value of the regression coefficient (R² = 0.9638) indicates an excellent fit of the experimental data by the second-order polynomial regression model. The RSM revealed that a maximum tannase production of 139.3 U/ml was obtained at the optimum conditions.

  20. Statistical optimization of gold recovery from difficult leachable sulphide minerals using bacteria

    Energy Technology Data Exchange (ETDEWEB)

    Ahmed, Hussin A.M. [King Abdulaziz Univ., Jeddah (Saudi Arabia). Mining Engineering Dept.; El-Midany, Ayman A. [King Saud Univ., Riyadh (Saudi Arabia)

    2012-07-01

    Some refractory gold ores are among the most difficult ores to process, due to the fine dissemination and interlocking of the gold grains with the associated sulphide minerals. This makes it impossible to recover the precious metals from sulphide matrices by direct cyanide leaching, even at high consumption of cyanide solution. Research on this problem is extensive. Applications of bacteria show that some types of bacteria have a great effect on sulphide bio-oxidation and consequently facilitate the leaching process. In this paper, leaching of a Saudi gold ore from the Alhura area containing sulphides is studied, applying bacteria before cyanidation to recover gold from such ores. The process is investigated using stirred-reactor bio-leaching rather than heap bio-leaching. Using statistical analysis, the main affecting variables under the studied conditions were identified. The design results indicated that the dose of bacteria, the retention time and the K2SO4 nutrition are the most significant parameters. The higher the bacterial dose and the bacterial nutrition, the better the concentrate grade. Results show that the method is technically effective for gold recovery. A gold concentrate containing > 100 g/t gold was obtained from an ore containing < 2 g/t gold at the optimum conditions, i.e., 10 ml bacterial dose, 6 days retention time, and 6.5 kg/t K2SO4 as bacteria nutrition. (orig.)

  1. Enhanced Bioethanol Production from Potato Peel Waste Via Consolidated Bioprocessing with Statistically Optimized Medium.

    Science.gov (United States)

    Hossain, Tahmina; Miah, Abdul Bathen; Mahmud, Siraje Arif; Mahin, Abdullah-Al-

    2018-04-12

    In this study, an extensive screening was undertaken to isolate amylolytic microorganisms capable of producing bioethanol from starchy biomass through Consolidated Bioprocessing (CBP). A total of 28 amylolytic microorganisms were isolated, from which 5 isolates were selected based on high α-amylase and glucoamylase activities and identified as Candida wangnamkhiaoensis, Hyphopichia pseudoburtonii (2 isolates), Wickerhamia sp., and Streptomyces drozdowiczii based on 26S rDNA and 16S rDNA sequencing. Wickerhamia sp. showed the highest ethanol production (30.4 g/L) with a fermentation yield of 0.3 g ethanol/g starch. Then, a low-cost starchy waste, potato peel waste (PPW), was used as a carbon source to produce ethanol with Wickerhamia sp. Finally, in order to obtain maximum ethanol production from PPW, a fermentation medium was statistically designed. The effect of various medium ingredients was evaluated initially by a Plackett-Burman design (PBD), where malt extract, tryptone, and KH2PO4 showed a significantly positive effect (p value < 0.05). Using Response Surface Modeling (RSM), 40 g/L (dry basis) PPW and 25 g/L malt extract were found optimum and yielded 21.7 g/L ethanol. This study strongly suggests Wickerhamia sp. as a promising candidate for bioethanol production from starchy biomass, in particular PPW, through CBP.

  2. Statistical Optimization of Synthesis of Manganese Carbonates Nanoparticles by Precipitation Methods

    International Nuclear Information System (INIS)

    Javidan, A.; Rahimi-Nasrabadi, M.; Davoudi, A.A.

    2011-01-01

    In this study, an orthogonal array design (OAD), OA9, was employed as a statistical experimental method for the controllable, simple and fast synthesis of manganese carbonate nanoparticles. Ultrafine manganese carbonate nanoparticles were synthesized by a precipitation method involving the addition of a manganese ion solution to the carbonate reagent. The effects of the reaction conditions, for example, manganese and carbonate concentrations, flow rate of reagent addition and temperature, on the diameter of the synthesized manganese carbonate nanoparticles were investigated. The effects of these factors on the width of the manganese carbonate nanoparticles were quantitatively evaluated by analysis of variance (ANOVA). The results showed that manganese carbonate nanoparticles can be synthesized by controlling the manganese concentration, flow rate and temperature. Finally, the optimum conditions for the synthesis of manganese carbonate nanoparticles by this simple and fast method were proposed. The results of ANOVA showed that manganese ion and carbonate reagent concentrations of 0.001 mol/L, a flow rate of 2.5 mL/min for the addition of the manganese reagent to the carbonate solution and a temperature of 0 °C are the optimum conditions for producing manganese carbonate nanoparticles with a width of 75 ± 25 nm. (author)

  3. KiDS-450: cosmological constraints from weak lensing peak statistics - I. Inference from analytical prediction of high signal-to-noise ratio convergence peaks

    Science.gov (United States)

    Shan, HuanYuan; Liu, Xiangkun; Hildebrandt, Hendrik; Pan, Chuzhong; Martinet, Nicolas; Fan, Zuhui; Schneider, Peter; Asgari, Marika; Harnois-Déraps, Joachim; Hoekstra, Henk; Wright, Angus; Dietrich, Jörg P.; Erben, Thomas; Getman, Fedor; Grado, Aniello; Heymans, Catherine; Klaes, Dominik; Kuijken, Konrad; Merten, Julian; Puddu, Emanuella; Radovich, Mario; Wang, Qiao

    2018-02-01

    This paper is the first of a series of papers constraining cosmological parameters with weak lensing peak statistics using ~450 deg² of imaging data from the Kilo Degree Survey (KiDS-450). We measure high signal-to-noise ratio (SNR: ν) weak lensing convergence peaks in the range 3 < ν < 5, and employ theoretical models to derive expected values. These models are validated using a suite of simulations. We take into account two major systematic effects, the boost factor and the effect of baryons on the mass-concentration relation of dark matter haloes. In addition, we investigate the impacts of other potential astrophysical systematics, including the projection effects of large-scale structures, intrinsic galaxy alignments, as well as residual measurement uncertainties in the shear and redshift calibration. Assuming a flat Λ cold dark matter model, we find constraints of S_8 = σ_8(Ω_m/0.3)^0.5 = 0.746^{+0.046}_{-0.107} along the degeneracy direction of the cosmic shear analysis and Σ_8 = σ_8(Ω_m/0.3)^0.38 = 0.696^{+0.048}_{-0.050} along the derived degeneracy direction of our high-SNR peak statistics. The difference between the power indices of S_8 and Σ_8 indicates that combining cosmic shear with peak statistics has the potential to break the degeneracy between σ_8 and Ω_m. Our results are consistent with the cosmic shear tomographic correlation analysis of the same data set and ~2σ lower than the Planck 2016 results.

  4. Optimization of silver nanoparticles biosynthesis mediated by Aspergillus niger NRC1731 through application of statistical methods: enhancement and characterization.

    Science.gov (United States)

    Elsayed, Maysa A; Othman, Abdelmageed M; Hassan, Mohamed M; Elshafei, Ali M

    2018-03-01

    The aim of the current study was to optimize fungal-mediated biosynthesis of silver nanoparticles (AgNPs) via the application of a central composite design (CCD) response surface, in order to develop an effective, eco-friendly and inexpensive green process. Nanosilver biosynthesis using the Aspergillus niger NRC1731 cell-free filtrate (CFF) was studied by considering the main parameters affecting AgNP green synthesis and their interaction effects. The statistical optimization models showed that using 59.37% of CFF in a reaction containing 1.82 mM silver nitrate for 34 h at pH 7.0 is the optimum setting for AgNP biosynthesis. The obtained AgNPs were characterized by means of electron microscopy, UV/visible spectrophotometry, energy dispersive X-ray analysis and infrared spectroscopy, which revealed an almost spherical shape with diameters of 3-20 nm. The produced AgNPs exhibited considerable antimicrobial activity against Bacillus mycoides and Escherichia coli, in addition to Candida albicans.

  5. Application of Statistical Design to the Optimization of Culture Medium for Prodigiosin Production by Serratia marcescens SWML08

    Directory of Open Access Journals (Sweden)

    Venil, C. K.

    2009-01-01

    Full Text Available A combination of Plackett-Burman design (PBD) and Box-Behnken design (BBD) was applied for the optimization of different factors for prodigiosin production by Serratia marcescens SWML08. Among 11 factors, the incubation temperature and the supplementation of (NH4)2PO4 and trace salts into the culture medium were selected due to their significant positive effect on prodigiosin yield. Box-Behnken design, a response surface methodology, was used for further optimization of these selected factors for better prodigiosin output. Data were analyzed stepwise and a second-order polynomial model was established to identify the relationship between the prodigiosin output and the selected factors. The optimized media formulation had an incubation temperature of 30 °C, (NH4)2PO4 at 6 g/L and trace salts at 0.6 g/L. The maximum experimental response for prodigiosin production was 1397.96 mg/L whereas the predicted value was 1394.26 mg/L. The high correlation between the predicted and observed values indicated the validity of the statistical design.

  6. Statistical optimization of exopolysaccharide production by Lactobacillus plantarum NTMI05 and NTMI20.

    Science.gov (United States)

    Imran, Mohamed Yousuff Mohamed; Reehana, Nazar; Jayaraj, K Arumugam; Ahamed, Abdul Azees Parveez; Dhanasekaran, Dharmadurai; Thajuddin, Nooruddin; Alharbi, Naiyf S; Muralitharan, Gangatharan

    2016-12-01

    In this study, 27 strains of lactic acid bacteria (LAB) were isolated and identified from different milk sources. All the isolates were biochemically characterized and screened for their ability to produce exopolysaccharides (EPS), among which two isolates, namely Lactobacillus plantarum NTMI05 (197 mg/L) and Lactobacillus plantarum NTMI20 (187 mg/L), showed higher EPS production. Both isolates were molecularly characterized and tested for their probiotic properties. The chemical composition of the EPS from L. plantarum NTMI05 and NTMI20 revealed the presence of 95.45% and 92.35% carbohydrates, 14±0.1 and 11±0.15 mg/L lactic acid, and 10.5±0.2 and 9±0.1 mg/mL of reducing sugar, respectively. HPLC analysis showed galactose at a retention time of 2.29. The maximum EPS yield was optimized for the media components glucose (20 g/L), yeast extract (25 g/L) and ammonium sulphate (2 g/L) using central composite design and response surface methodology (RSM). Under optimum conditions the predicted maximum EPS production was 0.891 g/L and 0.797 g/L, while the actual experimental values were 0.956 g/L and 0.827 g/L for L. plantarum NTMI05 and NTMI20, respectively. The antioxidant capacity was also evaluated by DPPH and reducing power assays, proving the potential of these organisms for the food and dairy industries. Copyright © 2016 Elsevier B.V. All rights reserved.

  7. Optimization of statistical methods for HpGe gamma-ray spectrometer used in wide count rate ranges

    Energy Technology Data Exchange (ETDEWEB)

    Gervino, G., E-mail: gervino@to.infn.it [UNITO - Università di Torino, Dipartimento di Fisica, Turin (Italy); INFN - Istituto Nazionale di Fisica Nucleare, Sez. Torino, Turin (Italy); Mana, G. [INRIM - Istituto Nazionale di Ricerca Metrologica, Turin (Italy); Palmisano, C. [UNITO - Università di Torino, Dipartimento di Fisica, Turin (Italy); INRIM - Istituto Nazionale di Ricerca Metrologica, Turin (Italy)

    2016-07-11

    The need to perform γ-ray measurements with HpGe detectors is common to many fields such as nuclear physics, radiochemistry, nuclear medicine and neutron activation analysis. HpGe detectors are chosen in situations where isotope identification is needed because of their excellent resolution. Our challenge is to obtain the “best” spectroscopy data possible in every measurement situation, where “best” is a combination of statistical quality (number of counts) and spectral quality (peak width and position) over a wide range of counting rates. In this framework, we applied Bayesian methods and Ellipsoidal Nested Sampling (a multidimensional integration technique) to study the most likely distribution for the shape of HpGe spectra. In treating these experiments, the prior information suggests modelling the likelihood function as a product of Poisson distributions. We present the efforts that have been made to optimize the statistical methods for HpGe detector outputs, with the aim of evaluating, to a better order of precision, the detector efficiency, the absolute measured activity and the spectral background. Reaching a more precise knowledge of the statistical and systematic uncertainties of the measured physical observables is the final goal of this research project.
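
    A minimal sketch of the channel-by-channel Poisson likelihood mentioned above, evaluated for a toy spectrum (a Gaussian peak on a flat background); the model shape and parameter values are illustrative assumptions, not the detector model used in the paper:

      import numpy as np
      from scipy.special import gammaln

      def log_likelihood(counts, model_rate):
          """Log of a product of Poisson distributions, one per spectrum channel:
          sum_i [ n_i*log(mu_i) - mu_i - log(n_i!) ]."""
          n = np.asarray(counts, dtype=float)
          mu = np.asarray(model_rate, dtype=float)
          return np.sum(n * np.log(mu) - mu - gammaln(n + 1.0))

      channels = np.arange(100)

      def model(amplitude, centre, width, background):
          # Gaussian peak on a flat background (toy spectral model)
          return background + amplitude * np.exp(-0.5 * ((channels - centre) / width) ** 2)

      rng = np.random.default_rng(1)
      data = rng.poisson(model(50.0, 40.0, 3.0, 5.0))
      print("logL at true parameters:", log_likelihood(data, model(50.0, 40.0, 3.0, 5.0)))
      print("logL at shifted centre :", log_likelihood(data, model(50.0, 45.0, 3.0, 5.0)))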

  8. Optimal day-ahead wind-thermal unit commitment considering statistical and predicted features of wind speeds

    International Nuclear Information System (INIS)

    Sun, Yanan; Dong, Jizhe; Ding, Lijuan

    2017-01-01

    Highlights: • A day-ahead wind-thermal unit commitment model is presented. • A wind speed transfer matrix is formed to depict the sequential wind features. • A spinning reserve setting considering wind power accuracy and variation is proposed. • A verification study is performed to check the correctness of the program. - Abstract: The increasing penetration of intermittent wind power affects the secure operation of power systems and leads to a requirement for robust and economic generation scheduling. This paper presents an optimal day-ahead wind-thermal generation scheduling method that considers the statistical and predicted features of wind speeds. In this method, a statistical analysis of the historical wind data, which represents the local wind regime, is first implemented. Then, according to the statistical results and the predicted wind power, the spinning reserve requirements for the scheduling period are calculated. Based on the calculated spinning reserve requirements, the wind-thermal generation scheduling is finally conducted. To validate the program, a verification study is performed on a test system. Numerical studies are then conducted to demonstrate the effectiveness of the proposed method.
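
    A minimal sketch of one common way to form a wind speed transfer (transition) matrix from a historical series, by binning speeds and counting hour-to-hour transitions; the bin edges and the synthetic series are assumptions, and the paper's own construction may differ:

      import numpy as np

      def wind_speed_transfer_matrix(speeds, bin_edges):
          """First-order transition matrix between wind-speed bins estimated from an
          hourly series; row i holds the empirical probabilities of moving from bin i
          to each bin in the next hour."""
          states = np.digitize(speeds, bin_edges)          # bin index for every hour
          n_bins = len(bin_edges) + 1
          counts = np.zeros((n_bins, n_bins))
          for s, s_next in zip(states[:-1], states[1:]):
              counts[s, s_next] += 1
          row_sums = counts.sum(axis=1, keepdims=True)
          return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)

      # Illustrative synthetic series; a real study would use measured historical data.
      rng = np.random.default_rng(2)
      speeds = np.clip(np.cumsum(rng.normal(0.0, 0.8, 500)) * 0.2 + 8.0, 0.0, None)
      P = wind_speed_transfer_matrix(speeds, bin_edges=[4.0, 8.0, 12.0])
      print(np.round(P, 2))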

  9. KiDS-450: cosmological constraints from weak-lensing peak statistics - II: Inference from shear peaks using N-body simulations

    Science.gov (United States)

    Martinet, Nicolas; Schneider, Peter; Hildebrandt, Hendrik; Shan, HuanYuan; Asgari, Marika; Dietrich, Jörg P.; Harnois-Déraps, Joachim; Erben, Thomas; Grado, Aniello; Heymans, Catherine; Hoekstra, Henk; Klaes, Dominik; Kuijken, Konrad; Merten, Julian; Nakajima, Reiko

    2018-02-01

    We study the statistics of peaks in a weak-lensing reconstructed mass map of the first 450 deg² of the Kilo Degree Survey (KiDS-450). The map is computed with aperture masses directly applied to the shear field with an NFW-like compensated filter. We compare the peak statistics in the observations with that of simulations for various cosmologies to constrain the cosmological parameter S_8 = σ_8 √(Ω_m/0.3), which probes the (Ω_m, σ_8) plane perpendicularly to its main degeneracy. We estimate S_8 = 0.750 ± 0.059, using peaks in the signal-to-noise range 0 ≤ S/N ≤ 4, and accounting for various systematics, such as multiplicative shear bias, mean redshift bias, baryon feedback, intrinsic alignment, and shear-position coupling. These constraints are ~25 per cent tighter than the constraints from the high significance peaks alone (3 ≤ S/N ≤ 4), which typically trace single massive haloes. This demonstrates the gain of information from low-S/N peaks. However, we find that including S/N KiDS-450. Combining shear peaks with non-tomographic measurements of the shear two-point correlation functions yields a ~20 per cent improvement in the uncertainty on S_8 compared to the shear two-point correlation functions alone, highlighting the great potential of peaks as a cosmological probe.

  10. Variations on Bayesian Prediction and Inference

    Science.gov (United States)

    2016-05-09

    There are a number of statistical inference problems that are not generally formulated via a full probability model. In the problem of inference about an unknown parameter, the Bayesian approach requires a full probability model/likelihood, which can be an obstacle.

  11. Statistical optimization of cell disruption techniques for releasing intracellular X-prolyl dipeptidyl aminopeptidase from Lactococcus lactis spp. lactis.

    Science.gov (United States)

    Üstün-Aytekin, Özlem; Arısoy, Sevda; Aytekin, Ali Özhan; Yıldız, Ece

    2016-03-01

    X-prolyl dipeptidyl aminopeptidase (PepX) is an intracellular enzyme from the Gram-positive bacterium Lactococcus lactis spp. lactis NRRL B-1821, and it has commercial importance. The objective of this study was to compare the effects of several cell disruption methods on the activity of PepX. Statistical optimization methods were applied to two cavitation methods, hydrodynamic (high-pressure homogenization) and acoustic (sonication), to determine the more appropriate disruption method. A two-level factorial design (2FI), with the parameters of number of cycles and pressure, and a Box-Behnken design (BBD), with the parameters of cycle, sonication time, and power, were used for the optimization of the high-pressure homogenization and sonication methods, respectively. In addition, disruption methods consisting of lysozyme, bead milling, heat treatment, freeze-thawing, liquid nitrogen, ethylenediaminetetraacetic acid (EDTA), Triton-X, sodium dodecyl sulfate (SDS), chloroform, and antibiotics were performed and compared with the high-pressure homogenization and sonication methods. The optimized values for high-pressure homogenization were one cycle at 130 MPa, providing an activity of 114.47 mU/ml, while sonication afforded an activity of 145.09 mU/ml at 28 min with 91% power and three cycles. In conclusion, sonication was the more effective disruption method, and its optimal operating parameters were established for the release of an intracellular enzyme from a L. lactis spp. lactis strain, a Gram-positive bacterium.

  12. Statistical Computing

    Indian Academy of Sciences (India)

    inference and finite population sampling. Sudhakar Kunte. Elements of statistical computing are discussed in this series. ... which captain gets an option to decide whether to field first or bat first ... may of course not be fair, in the sense that the team which wins ... describe two methods of drawing a random number between 0.
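
    The record above is truncated, but it alludes to methods of drawing a random number between 0 and 1. One classical textbook method is a linear congruential generator; the sketch below is a hedged illustration, and the constants are common illustrative choices, not necessarily those used in the series.

      def lcg(seed, a=1664525, c=1013904223, m=2**32):
          """Linear congruential generator yielding pseudo-random uniforms in [0, 1)."""
          state = seed
          while True:
              state = (a * state + c) % m
              yield state / m

      gen = lcg(seed=12345)
      print([round(next(gen), 4) for _ in range(5)])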

  13. Modeling and control of an unstable system using probabilistic fuzzy inference system

    Directory of Open Access Journals (Sweden)

    Sozhamadevi N.

    2015-09-01

    Full Text Available A new type of Fuzzy Inference System is proposed: a Probabilistic Fuzzy Inference System which models and minimizes the effects of statistical uncertainties. The blend of two different concepts, degree of truth and probability of truth, in a unique framework leads to this new concept. This combination is carried out both in fuzzy sets and fuzzy rules, which gives rise to Probabilistic Fuzzy Sets and Probabilistic Fuzzy Rules. Introducing these probabilistic elements, a distinctive probabilistic fuzzy inference system is developed, involving fuzzification, inference and output processing. This integrated approach accounts for all of the uncertainty present in the system, such as rule uncertainties and measurement uncertainties, and leads to a design which performs optimally after training. In this paper a Probabilistic Fuzzy Inference System is applied to the modeling and control of a highly nonlinear, unstable system, and its effectiveness is demonstrated.

  14. Automatic physical inference with information maximizing neural networks

    Science.gov (United States)

    Charnock, Tom; Lavaux, Guilhem; Wandelt, Benjamin D.

    2018-04-01

    Compressing large data sets to a manageable number of summaries that are informative about the underlying parameters vastly simplifies both frequentist and Bayesian inference. When only simulations are available, these summaries are typically chosen heuristically, so they may inadvertently miss important information. We introduce a simulation-based machine learning technique that trains artificial neural networks to find nonlinear functionals of data that maximize Fisher information: information maximizing neural networks (IMNNs). In test cases where the posterior can be derived exactly, likelihood-free inference based on automatically derived IMNN summaries produces nearly exact posteriors, showing that these summaries are good approximations to sufficient statistics. In a series of numerical examples of increasing complexity and astrophysical relevance we show that IMNNs are robustly capable of automatically finding optimal, nonlinear summaries of the data even in cases where linear compression fails: inferring the variance of Gaussian signal in the presence of noise, inferring cosmological parameters from mock simulations of the Lyman-α forest in quasar spectra, and inferring frequency-domain parameters from LISA-like detections of gravitational waveforms. In this final case, the IMNN summary outperforms linear data compression by avoiding the introduction of spurious likelihood maxima. We anticipate that the automatic physical inference method described in this paper will be essential to obtain both accurate and precise cosmological parameter estimates from complex and large astronomical data sets, including those from LSST and Euclid.
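
    The Fisher-information criterion that IMNNs maximize can be illustrated with a toy compression problem: estimate the Fisher information of a parameterized summary from forward simulations and compare an arbitrary summary with the sufficient statistic. This is a hedged sketch, not the IMNN code; the toy model (zero-mean Gaussian data with unknown variance), the linear-in-squares summary and all numbers are illustrative assumptions, and a real IMNN would replace the fixed weights with a trained neural network.

      import numpy as np

      rng = np.random.default_rng(0)

      def simulate(theta, n_draws, n_data=100):
          # Toy model: zero-mean Gaussian data whose variance is the parameter theta.
          return rng.normal(0.0, np.sqrt(theta), size=(n_draws, n_data))

      def summary(data, w):
          # A simple parameterized summary statistic: weighted sum of squared data.
          return data**2 @ w

      def fisher_of_summary(w, theta_fid=1.0, dtheta=0.05, n_sims=2000):
          # Variance of the summary at the fiducial parameter value.
          c = summary(simulate(theta_fid, n_sims), w).var(ddof=1)
          # Derivative of the mean summary with respect to theta (central difference).
          dmu = (summary(simulate(theta_fid + dtheta, n_sims), w).mean()
                 - summary(simulate(theta_fid - dtheta, n_sims), w).mean()) / (2 * dtheta)
          return dmu**2 / c

      w_random = rng.normal(size=100); w_random /= np.abs(w_random).sum()
      w_uniform = np.full(100, 1.0 / 100)   # proportional to the sufficient statistic
      print("arbitrary summary:", fisher_of_summary(w_random))
      print("sufficient-statistic summary:", fisher_of_summary(w_uniform))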

  15. Monte Carlo Bayesian inference on a statistical model of sub-gridcolumn moisture variability using high-resolution cloud observations. Part 2: Sensitivity tests and results

    Science.gov (United States)

    Norris, Peter M.; da Silva, Arlindo M.

    2018-01-01

    Part 1 of this series presented a Monte Carlo Bayesian method for constraining a complex statistical model of global circulation model (GCM) sub-gridcolumn moisture variability using high-resolution Moderate Resolution Imaging Spectroradiometer (MODIS) cloud data, thereby permitting parameter estimation and cloud data assimilation for large-scale models. This article performs some basic testing of this new approach, verifying that it does indeed reduce mean and standard deviation biases significantly with respect to the assimilated MODIS cloud optical depth, brightness temperature and cloud-top pressure and that it also improves the simulated rotational–Raman scattering cloud optical centroid pressure (OCP) against independent (non-assimilated) retrievals from the Ozone Monitoring Instrument (OMI). Of particular interest, the Monte Carlo method does show skill in the especially difficult case where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach allows non-gradient-based jumps into regions of non-zero cloud probability. In the example provided, the method is able to restore marine stratocumulus near the Californian coast, where the background state has a clear swath. This article also examines a number of algorithmic and physical sensitivities of the new method and provides guidance for its cost-effective implementation. One obvious difficulty for the method, and other cloud data assimilation methods as well, is the lack of information content in passive-radiometer-retrieved cloud observables on cloud vertical structure, beyond cloud-top pressure and optical thickness, thus necessitating strong dependence on the background vertical moisture structure. It is found that a simple flow-dependent correlation modification from Riishojgaard provides some help in this respect, by better honoring inversion structures in the background state.
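
    The contrast drawn above between linearized (gradient-based) assimilation and finite Monte Carlo jumps can be illustrated with a generic random-walk Metropolis sampler started near a mode that a local gradient step would never leave. This is a hedged sketch under assumed values: the bimodal toy target and all tuning constants are illustrative, not the cloud-assimilation model of the paper.

      import numpy as np

      rng = np.random.default_rng(0)

      def log_target(x):
          # Toy bimodal target: near x = -3 the gradient toward the mode at x = +3 is
          # essentially zero, but finite Metropolis jumps can still reach it.
          return np.logaddexp(-0.5 * (x + 3.0)**2, -0.5 * (x - 3.0)**2)

      def metropolis(n_steps, x0=-3.0, step=2.5):
          x, chain = x0, []
          for _ in range(n_steps):
              proposal = x + step * rng.normal()
              # Accept with probability min(1, target(proposal)/target(x)).
              if np.log(rng.uniform()) < log_target(proposal) - log_target(x):
                  x = proposal
              chain.append(x)
          return np.array(chain)

      samples = metropolis(20000)
      print("fraction of samples near the far mode:", np.mean(samples > 0))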

  16. Monte Carlo Bayesian Inference on a Statistical Model of Sub-gridcolumn Moisture Variability Using High-resolution Cloud Observations . Part II; Sensitivity Tests and Results

    Science.gov (United States)

    da Silva, Arlindo M.; Norris, Peter M.

    2013-01-01

    Part I presented a Monte Carlo Bayesian method for constraining a complex statistical model of GCM sub-gridcolumn moisture variability using high-resolution MODIS cloud data, thereby permitting large-scale model parameter estimation and cloud data assimilation. This part performs some basic testing of this new approach, verifying that it does indeed significantly reduce mean and standard deviation biases with respect to the assimilated MODIS cloud optical depth, brightness temperature and cloud top pressure, and that it also improves the simulated rotational-Raman scattering cloud optical centroid pressure (OCP) against independent (non-assimilated) retrievals from the OMI instrument. Of particular interest, the Monte Carlo method does show skill in the especially difficult case where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach allows finite jumps into regions of non-zero cloud probability. In the example provided, the method is able to restore marine stratocumulus near the Californian coast where the background state has a clear swath. This paper also examines a number of algorithmic and physical sensitivities of the new method and provides guidance for its cost-effective implementation. One obvious difficulty for the method, and other cloud data assimilation methods as well, is the lack of information content in the cloud observables on cloud vertical structure, beyond cloud top pressure and optical thickness, thus necessitating strong dependence on the background vertical moisture structure. It is found that a simple flow-dependent correlation modification due to Riishojgaard (1998) provides some help in this respect, by better honoring inversion structures in the background state.

  17. Monte Carlo Bayesian Inference on a Statistical Model of Sub-Gridcolumn Moisture Variability Using High-Resolution Cloud Observations. Part 2: Sensitivity Tests and Results

    Science.gov (United States)

    Norris, Peter M.; da Silva, Arlindo M.

    2016-01-01

    Part 1 of this series presented a Monte Carlo Bayesian method for constraining a complex statistical model of global circulation model (GCM) sub-gridcolumn moisture variability using high-resolution Moderate Resolution Imaging Spectroradiometer (MODIS) cloud data, thereby permitting parameter estimation and cloud data assimilation for large-scale models. This article performs some basic testing of this new approach, verifying that it does indeed reduce mean and standard deviation biases significantly with respect to the assimilated MODIS cloud optical depth, brightness temperature and cloud-top pressure and that it also improves the simulated rotational-Raman scattering cloud optical centroid pressure (OCP) against independent (non-assimilated) retrievals from the Ozone Monitoring Instrument (OMI). Of particular interest, the Monte Carlo method does show skill in the especially difficult case where the background state is clear but cloudy observations exist. In traditional linearized data assimilation methods, a subsaturated background cannot produce clouds via any infinitesimal equilibrium perturbation, but the Monte Carlo approach allows non-gradient-based jumps into regions of non-zero cloud probability. In the example provided, the method is able to restore marine stratocumulus near the Californian coast, where the background state has a clear swath. This article also examines a number of algorithmic and physical sensitivities of the new method and provides guidance for its cost-effective implementation. One obvious difficulty for the method, and other cloud data assimilation methods as well, is the lack of information content in passive-radiometer-retrieved cloud observables on cloud vertical structure, beyond cloud-top pressure and optical thickness, thus necessitating strong dependence on the background vertical moisture structure. It is found that a simple flow-dependent correlation modification from Riishojgaard provides some help in this respect, by better honoring inversion structures in the background state.

  18. Research on the optimization of air quality monitoring station layout based on spatial grid statistical analysis method.

    Science.gov (United States)

    Li, Tianxin; Zhou, Xing Chen; Ikhumhen, Harrison Odion; Difei, An

    2018-05-01

    In recent years, with the significant increase in urban development, it has become necessary to optimize the current air monitoring stations so that they reflect the quality of air in the environment. To assess the spatial representativeness of the air monitoring stations, Beijing's regional air monitoring station data from 2012 to 2014 were used to calculate the monthly mean particulate matter (PM10) concentration in the region, and the spatial distribution of PM10 concentration over the whole region was derived through IDW interpolation and a spatial grid statistical method in GIS. The spatial distribution variation across the districts of Beijing was analyzed with the gridding model (1.5 km × 1.5 km cell resolution), and the 3-year spatial analysis of PM10 concentration data, including its variation and spatial overlay, showed that total PM10 concentrations frequently exceeded the standard. It is very important to optimize the layout of the existing air monitoring stations by combining the concentration distribution of air pollutants with the spatial region using GIS.
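
    The IDW interpolation step mentioned above can be sketched in a few lines. This is a hedged illustration: the station coordinates, PM10 values and power exponent below are assumptions, with only the 1.5 km cell size taken from the abstract; it is not the Beijing dataset.

      import numpy as np

      def idw(stations_xy, values, grid_xy, power=2.0, eps=1e-12):
          """Inverse-distance-weighted estimate at each grid point."""
          # Pairwise distances between grid points and stations.
          d = np.linalg.norm(grid_xy[:, None, :] - stations_xy[None, :, :], axis=-1)
          w = 1.0 / (d + eps)**power
          return (w * values).sum(axis=1) / w.sum(axis=1)

      rng = np.random.default_rng(0)
      stations = rng.uniform(0, 30, size=(12, 2))     # station coordinates (km), assumed
      pm10 = rng.uniform(40, 160, size=12)            # monthly mean PM10 (ug/m3), assumed

      # 1.5 km x 1.5 km cells, matching the grid resolution quoted in the abstract.
      xs, ys = np.meshgrid(np.arange(0, 30, 1.5), np.arange(0, 30, 1.5))
      grid = np.column_stack([xs.ravel(), ys.ravel()])
      pm10_grid = idw(stations, pm10, grid).reshape(xs.shape)
      print(pm10_grid.shape, pm10_grid.min(), pm10_grid.max())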

  19. Statistical medium optimization of an alkaline protease from Pseudomonas aeruginosa MTCC 10501, its characterization and application in leather processing.

    Science.gov (United States)

    Boopathy, Naidu Ramachandra; Indhuja, Devadas; Srinivasan, Krishnan; Uthirappan, Mani; Gupta, Rishikesh; Ramudu, Kamini Numbi; Chellan, Rose

    2013-04-01

    Proteases have been shown to offer a greener mode of application in leather processing for the dehairing of goat skins and cow hides. Production of a protease with potent activity by submerged fermentation is reported using a new isolate, P. aeruginosa MTCC 10501. The production parameters were optimized by statistical methods, namely Plackett-Burman design and response surface methodology. The optimized production medium contained (g/L): tryptone, 2.5; yeast extract, 3.0; skim milk, 30.0; dextrose, 1.0; with an inoculum concentration of 4%, an initial pH of 6.0 and an incubation temperature of 30 °C, and optimum production at 48 h with a protease activity of 7.6 U/mL. The protease had the following characteristics: pH optimum, 9.0; temperature optimum, 50 °C; pH stability between 5.0 and 10.0; and temperature stability between 10 and 40 °C. The protease was observed to have high potential for the dehairing of goat skins in the pre-tanning process, comparable to that of the chemical process as evidenced by histology. The method offers cleaner processing using only enzyme, instead of toxic chemicals, in the pre-tanning stage of leather manufacture.

  20. Investigation of antimicrobial activity and statistical optimization of Bacillus subtilis SPB1 biosurfactant production in solid-state fermentation.

    Science.gov (United States)

    Ghribi, Dhouha; Abdelkefi-Mesrati, Lobna; Mnif, Ines; Kammoun, Radhouan; Ayadi, Imen; Saadaoui, Imen; Maktouf, Sameh; Chaabouni-Ellouze, Semia

    2012-01-01

    In recent years, several applications of biosurfactants for medical purposes have been reported. Biosurfactants are considered relevant molecules for applications in combating many diseases. However, their use is currently extremely limited due to their high cost in relation to that of chemical surfactants. Use of inexpensive substrates can drastically decrease the production cost. Here, twelve solid substrates were screened for the production of Bacillus subtilis SPB1 biosurfactant, and the maximum yield was found with millet. A Plackett-Burman design was then used to evaluate the effects of five variables (temperature, moisture, initial pH, inoculum age, and inoculum size). Statistical analyses showed that temperature, inoculum age, and moisture content had a significantly positive effect on SPB1 biosurfactant production. Their values were further optimized using a central composite design and response surface methodology. The optimal temperature, inoculum age, and moisture content obtained under the conditions of the study were 37°C, 14 h, and 88%, respectively. The antimicrobial activity of this compound was evaluated against 11 bacteria and 8 fungi. The results demonstrated that this biosurfactant exhibited important antimicrobial activity against microorganisms with multidrug-resistant profiles. Its activity was very effective against Staphylococcus aureus, Staphylococcus xylosus, Enterococcus faecalis, Klebsiella pneumoniae, and so forth.
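
    The Plackett-Burman screening followed by a central composite design and response surface methodology, as described above, ends with fitting a second-order response surface and locating its stationary point. A minimal sketch of that last step is given below on synthetic data; the coded factors, design points and response are hypothetical stand-ins, not the SPB1 measurements.

      import numpy as np
      from sklearn.preprocessing import PolynomialFeatures
      from sklearn.linear_model import LinearRegression

      rng = np.random.default_rng(1)

      # Coded factor levels (e.g. temperature, inoculum age, moisture), assumed design.
      X = rng.uniform(-1, 1, size=(20, 3))
      # Synthetic response with a known optimum inside the design space.
      y = 5 - (X**2).sum(axis=1) + 0.5 * X[:, 0] + rng.normal(0, 0.05, size=20)

      # Fit the full quadratic model y ~ b0 + b.x + x'Bx.
      poly = PolynomialFeatures(degree=2, include_bias=False)
      Z = poly.fit_transform(X)
      model = LinearRegression().fit(Z, y)

      # Recover the linear part b and symmetric quadratic part B, then solve
      # the stationarity condition grad = b + 2Bx = 0.
      names = poly.get_feature_names_out(["x1", "x2", "x3"])
      coef = dict(zip(names, model.coef_))
      b = np.array([coef["x1"], coef["x2"], coef["x3"]])
      B = np.zeros((3, 3))
      for i in range(3):
          B[i, i] = coef[f"x{i+1}^2"]
      for (i, j), key in zip([(0, 1), (0, 2), (1, 2)], ["x1 x2", "x1 x3", "x2 x3"]):
          B[i, j] = B[j, i] = coef[key] / 2
      x_opt = np.linalg.solve(-2 * B, b)
      print("stationary point (coded units):", x_opt)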

  1. Investigation of Antimicrobial Activity and Statistical Optimization of Bacillus subtilis SPB1 Biosurfactant Production in Solid-State Fermentation

    Directory of Open Access Journals (Sweden)

    Dhouha Ghribi

    2012-01-01

    Full Text Available In recent years, several applications of biosurfactants for medical purposes have been reported. Biosurfactants are considered relevant molecules for applications in combating many diseases. However, their use is currently extremely limited due to their high cost in relation to that of chemical surfactants. Use of inexpensive substrates can drastically decrease the production cost. Here, twelve solid substrates were screened for the production of Bacillus subtilis SPB1 biosurfactant, and the maximum yield was found with millet. A Plackett-Burman design was then used to evaluate the effects of five variables (temperature, moisture, initial pH, inoculum age, and inoculum size). Statistical analyses showed that temperature, inoculum age, and moisture content had a significantly positive effect on SPB1 biosurfactant production. Their values were further optimized using a central composite design and response surface methodology. The optimal temperature, inoculum age, and moisture content obtained under the conditions of the study were 37°C, 14 h, and 88%, respectively. The antimicrobial activity of this compound was evaluated against 11 bacteria and 8 fungi. The results demonstrated that this biosurfactant exhibited important antimicrobial activity against microorganisms with multidrug-resistant profiles. Its activity was very effective against Staphylococcus aureus, Staphylococcus xylosus, Enterococcus faecalis, Klebsiella pneumoniae, and so forth.

  2. Transethosomal gels as carriers for the transdermal delivery of colchicine: statistical optimization, characterization, and ex vivo evaluation.

    Science.gov (United States)

    Abdulbaqi, Ibrahim M; Darwis, Yusrida; Assi, Reem Abou; Khan, Nurzalina Abdul Karim

    2018-01-01

    Colchicine is used for the treatment of gout, pseudo-gout, familial Mediterranean fever, and many other illnesses. Its oral administration is associated with poor bioavailability and severe gastrointestinal side effects. The drug is also known to have a low therapeutic index. Thus, to overcome these drawbacks, the transdermal delivery of colchicine was investigated using transethosomal gels as potential carriers. Colchicine-loaded transethosomes (TEs) were prepared by the cold method and statistically optimized using three sets of 2^4 factorial design experiments. The optimized formulations were incorporated into a Carbopol 940® gel base. The prepared colchicine-loaded transethosomal gels were further characterized for vesicular size, dispersity, zeta potential, drug content, pH, viscosity, yield, rheological behavior, and ex vivo skin permeation through Sprague Dawley rats' back skin. The results showed that the colchicine-loaded TEs had an aspherical, irregular shape, a nanometric size range, and high entrapment efficiency. All the formulated gels exhibited non-Newtonian plastic flow without thixotropy. Colchicine-loaded transethosomal gels were able to significantly enhance the skin permeation parameters of the drug in comparison to the non-ethosomal gel. These findings suggest that transethosomal gels are promising carriers for the transdermal delivery of colchicine, providing an alternative route for drug administration.

  3. Transethosomal gels as carriers for the transdermal delivery of colchicine: statistical optimization, characterization, and ex vivo evaluation

    Directory of Open Access Journals (Sweden)

    Abdulbaqi IM

    2018-04-01

    Full Text Available Ibrahim M Abdulbaqi, Yusrida Darwis, Reem Abou Assi, Nurzalina Abdul Karim Khan School of Pharmaceutical Sciences, Universiti Sains Malaysia, Minden, Penang, Malaysia Introduction: Colchicine is used for the treatment of gout, pseudo-gout, familial Mediterranean fever, and many other illnesses. Its oral administration is associated with poor bioavailability and severe gastrointestinal side effects. The drug is also known to have a low therapeutic index. Thus, to overcome these drawbacks, the transdermal delivery of colchicine was investigated using transethosomal gels as potential carriers. Methods: Colchicine-loaded transethosomes (TEs) were prepared by the cold method and statistically optimized using three sets of 2^4 factorial design experiments. The optimized formulations were incorporated into a Carbopol 940® gel base. The prepared colchicine-loaded transethosomal gels were further characterized for vesicular size, dispersity, zeta potential, drug content, pH, viscosity, yield, rheological behavior, and ex vivo skin permeation through Sprague Dawley rats' back skin. Results: The results showed that the colchicine-loaded TEs had an aspherical, irregular shape, a nanometric size range, and high entrapment efficiency. All the formulated gels exhibited non-Newtonian plastic flow without thixotropy. Colchicine-loaded transethosomal gels were able to significantly enhance the skin permeation parameters of the drug in comparison to the non-ethosomal gel. Conclusion: These findings suggest that transethosomal gels are promising carriers for the transdermal delivery of colchicine, providing an alternative route for drug administration. Keywords: transethosomes, ethosomal nanocarriers, colchicine, factorial design, skin permeation, rheology

  4. Determination of kinetic parameters of 1,3-propanediol fermentation by Clostridium diolis using statistically optimized medium.

    Science.gov (United States)

    Kaur, Guneet; Srivastava, Ashok K; Chand, Subhash

    2012-09-01

    1,3-Propanediol (1,3-PD) is a chemical compound of immense importance, used primarily as a raw material for the fiber and textile industry. It can be produced by the fermentation of glycerol, which is abundantly available as a by-product of biodiesel plants. The present study was aimed at determining the key kinetic parameters of 1,3-PD fermentation by Clostridium diolis. Initial experiments on microbial growth inhibition were followed by statistical optimization of the nutrient medium recipe. Batch kinetic data from bioreactor studies, using the optimum concentrations of the variables obtained from the statistical medium design, were used to estimate the kinetic parameters of 1,3-PD production. Direct use of raw glycerol from a biodiesel plant without any pre-treatment for 1,3-PD production with this strain, investigated for the first time in this work, gave results comparable to commercial glycerol. The parameter values obtained in this study will be used to develop a mathematical model for 1,3-PD production, to serve as a guide for designing various reactor operating strategies for further improving 1,3-PD production. An outline of the protocol for model development is also discussed.

  5. Making Type Inference Practical

    DEFF Research Database (Denmark)

    Schwartzbach, Michael Ignatieff; Oxhøj, Nicholas; Palsberg, Jens

    1992-01-01

    We present the implementation of a type inference algorithm for untyped object-oriented programs with inheritance, assignments, and late binding. The algorithm significantly improves our previous one, presented at OOPSLA'91, since it can handle collection classes, such as List, in a useful way. Also, the complexity has been dramatically improved, from exponential time to low polynomial time. The implementation uses the techniques of incremental graph construction and constraint template instantiation to avoid representing intermediate results, doing superfluous work, and recomputing type information. Experiments indicate that the implementation type checks as much as 100 lines per second. This results in a mature product, on which a number of tools can be based, for example a safety tool, an image compression tool, a code optimization tool, and an annotation tool. This may make type inference for object...

  6. The dynamical core of the Aeolus 1.0 statistical-dynamical atmosphere model: validation and parameter optimization

    Science.gov (United States)

    Totz, Sonja; Eliseev, Alexey V.; Petri, Stefan; Flechsig, Michael; Caesar, Levke; Petoukhov, Vladimir; Coumou, Dim

    2018-02-01

    We present and validate a set of equations for representing the atmosphere's large-scale general circulation in an Earth system model of intermediate complexity (EMIC). These dynamical equations have been implemented in Aeolus 1.0, which is a statistical-dynamical atmosphere model (SDAM) and includes radiative transfer and cloud modules (Coumou et al., 2011; Eliseev et al., 2013). The statistical dynamical approach is computationally efficient and thus enables us to perform climate simulations at multimillennia timescales, which is a prime aim of our model development. Further, this computational efficiency enables us to scan large and high-dimensional parameter space to tune the model parameters, e.g., for sensitivity studies. Here, we present novel equations for the large-scale zonal-mean wind as well as those for planetary waves. Together with synoptic parameterization (as presented by Coumou et al., 2011), these form the mathematical description of the dynamical core of Aeolus 1.0. We optimize the dynamical core parameter values by tuning all relevant dynamical fields to ERA-Interim reanalysis data (1983-2009) forcing the dynamical core with prescribed surface temperature, surface humidity and cumulus cloud fraction. We test the model's performance in reproducing the seasonal cycle and the influence of the El Niño-Southern Oscillation (ENSO). We use a simulated annealing optimization algorithm, which approximates the global minimum of a high-dimensional function. With non-tuned parameter values, the model performs reasonably in terms of its representation of zonal-mean circulation, planetary waves and storm tracks. The simulated annealing optimization improves in particular the model's representation of the Northern Hemisphere jet stream and storm tracks as well as the Hadley circulation. The regions of high azonal wind velocities (planetary waves) are accurately captured for all validation experiments. The zonal-mean zonal wind and the integrated lower
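
    A minimal sketch of the simulated annealing step mentioned above, applied to a toy cost function; the cost function, cooling schedule and tuning constants are illustrative assumptions and merely stand in for the model-to-reanalysis misfit actually minimized in Aeolus.

      import numpy as np

      rng = np.random.default_rng(0)

      def cost(params):
          # Stand-in for a misfit between simulated fields and reanalysis data.
          target = np.array([0.3, -1.2, 2.5])
          return np.sum((params - target)**2) + 0.1 * np.sum(np.sin(5 * params)**2)

      def simulated_annealing(x0, n_steps=20000, t0=1.0, t_min=1e-3, step=0.5):
          x = np.asarray(x0, dtype=float)
          f = cost(x)
          best_x, best_f = x.copy(), f
          for k in range(n_steps):
              t = t0 * (t_min / t0)**(k / n_steps)        # geometric cooling schedule
              cand = x + step * rng.normal(size=x.size)   # random perturbation
              fc = cost(cand)
              # Always accept improvements; accept worse moves with Boltzmann probability.
              if fc < f or rng.uniform() < np.exp(-(fc - f) / t):
                  x, f = cand, fc
                  if f < best_f:
                      best_x, best_f = x.copy(), f
          return best_x, best_f

      print(simulated_annealing([5.0, 5.0, 5.0]))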

  7. Investigation on the Effects of Process Parameters on Laser Percussion Drilling Using Finite Element Methodology; Statistical Modelling and Optimization

    Directory of Open Access Journals (Sweden)

    Mahmoud Moradi

    Full Text Available Abstract In the present research, the simulation of the fiber-laser drilling process for the nickel-base superalloy Inconel 718 with a thickness of 1 mm is investigated through the finite element method. In order to specify the appropriate Gaussian distribution of the laser beam, the results of an experimental study on glass laser drilling were simulated using three types of Gaussian distribution. The DFLUX subroutine was used to implement the laser heat sources of the models using the Fortran language. After the appropriate Gaussian distribution was chosen, the model was validated against the experimental results of the Inconel 718 laser drilling process. The negligible error percentage between the experimental and simulation results demonstrates the high accuracy of this model. The experiments were performed based on the response surface methodology (RSM) as a statistical design of experiments (DOE) approach to investigate the influence of process parameters on the responses, obtain mathematical regressions and predict new results. Four parameters, i.e. laser pulse frequency (150 to 550 Hz), laser power (200 to 500 watts), laser focal plane position (-0.5 to +0.5 mm) and duty cycle (30 to 70%), were considered as the input variables at 5 levels, and four quantities, i.e. the hole's entrance and exit diameters, the hole taper angle and the weight of mass removed from the hole, were observed as the process output responses of this central composite design. The statistical analysis showed that the input and output parameters have a direct relation with each other: an increase in each of the input variables increases the entrance and exit hole diameters, the hole taper angle, and the weight of mass removed from the hole. Finally, using the results of the conducted simulations and statistical analyses, the laser drilling process was optimized by means of the desirability approach. Good

  8. Stochastic processes inference theory

    CERN Document Server

    Rao, Malempati M

    2014-01-01

    This is the revised and enlarged 2nd edition of the authors’ original text, which was intended to be a modest complement to Grenander's fundamental memoir on stochastic processes and related inference theory. The present volume gives a substantial account of regression analysis, both for stochastic processes and measures, and includes recent material on Ridge regression with some unexpected applications, for example in econometrics. The first three chapters can be used for a quarter or semester graduate course on inference on stochastic processes. The remaining chapters provide more advanced material on stochastic analysis suitable for graduate seminars and discussions, leading to dissertation or research work. In general, the book will be of interest to researchers in probability theory, mathematical statistics and electrical and information theory.

  9. The Pearson diffusions: A class of statistically tractable diffusion processes

    DEFF Research Database (Denmark)

    Forman, Julie Lyng; Sørensen, Michael

    The Pearson diffusions form a flexible class of diffusions defined by having linear drift and quadratic squared diffusion coefficient. It is demonstrated that for this class explicit statistical inference is feasible. Explicit optimal martingale estimating functions are found, and the corresponding
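
    The defining property quoted above (linear drift, quadratic squared diffusion coefficient) can be made concrete with an Euler-Maruyama simulation of one member of the class. This is a hedged sketch with assumed parameter values and discretization; the explicit estimating functions of the paper are not reproduced here.

      import numpy as np

      rng = np.random.default_rng(0)

      def simulate_pearson(x0, theta, mu, a, b, c, dt=1e-3, n_steps=10000):
          """Euler-Maruyama for dX = -theta*(X - mu) dt + sqrt(a*X**2 + b*X + c) dW."""
          x = np.empty(n_steps + 1)
          x[0] = x0
          for k in range(n_steps):
              drift = -theta * (x[k] - mu)
              diff2 = max(a * x[k]**2 + b * x[k] + c, 0.0)   # keep the argument non-negative
              x[k + 1] = x[k] + drift * dt + np.sqrt(diff2 * dt) * rng.normal()
          return x

      # a = b = 0 gives an Ornstein-Uhlenbeck process; a = 0, c = 0 gives a CIR-type process.
      path = simulate_pearson(x0=1.0, theta=2.0, mu=1.0, a=0.0, b=0.5, c=0.0)
      print(path.mean(), path.var())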

  10. ENHANCED PRODUCTION OF POLYHYDROXYBUTYRATE (PHB FROM AGRO-INDUSTRIAL WASTES; FED-BATCH CULTIVATION AND STATISTICAL MEDIA OPTIMIZATION

    Directory of Open Access Journals (Sweden)

    Mahmoud M. Berekaa

    2016-06-01

    Full Text Available Bacillus megaterium SW1-2 showed enhanced growth and polyhydroxybutyrate (PHB) production during cultivation on date palm syrup (DEPS) or sugar cane molasses. FT-IR and NMR spectroscopic analyses of the polymer accumulated during growth on DEPS revealed the specific absorption peaks characteristic of PHB. 1.65 g/L of PHB (56.9% CDW) was produced during growth on medium supplemented with 2 g/L of DEPS. Approximately 36.1% CDW of PHB was recorded during growth on sugar cane molasses. Six runs with different fed-batch cultivation strategies were tested; the optimal run showed an approximately 6.87-fold increase. The modified E2 medium was preferred, yielding 10.11 and 11.34 g/L of total PHB for runs 1 and 2, respectively, at the end of the 96 h incubation period. A decrease in PHB was recorded during growth on complex medium (runs 3 and 4). In another independent optimization strategy, ten variables were concurrently examined for their significance for PHB production by a Plackett-Burman statistical design for the first time. Among the variables, DEPS-II and inoculum concentration, followed by KH2PO4 and (NH4)2SO4, were found to be the most significant variables encouraging PHB production. Indeed, DEPS-II (fresh syrup) is more significant than the commercial syrup DEPS-I (p-value = 0.05). Agitation speed (rpm) and incubation period had a highly negative effect on PHB production. The role of agro-industrial wastes, especially DEPS, in enhancing PHB production is discussed in detail.

  11. Multi-criteria optimization of the flesh melons skin separation process by experimental and statistical analysis methods

    Directory of Open Access Journals (Sweden)

    Y. B. Medvedkov

    2016-01-01

    Full Text Available Research and innovation to create energy-efficient processes for melon processing is a significant task. Separating the skin from the melon flesh, with subsequent use of both fractions in new food products, is one of the most labour-intensive operations in this technology. The lack of a scientific and experimental basis for this operation holds back the development of high-performance machines for its implementation. In this connection, an experimental technique for separating melon skins in a pilot plant is proposed, together with a search for optimal operating regimes by statistical modeling methods. The late-ripening melon varieties Kalaysan, Thorlami and Gulab-sary are the objects of study. The interaction of the factors influencing the skin-separation process is analysed. A central composite rotatable design and a fractional factorial experiment were used. Using the method of experimental design, with the planning template processed in the Design Expert v.10 software, regression equations that adequately describe the actual process were obtained. Rational intervals for the input factor values are established: the ratio of the rotational speed of the drum to the rotational frequency of the abrasive supply roll; the gap between the supply drum and the shearing knife; the shearing blade sharpening angle; the number of feed drum spikes; and the abrading drum orifice diameter. The mean square error does not exceed 12.4%. A graphical interpretation of the regression equations is presented as scatter plots and engineering nomograms that can be used to predict rational values of the input factors for three optimization criteria: minimal specific energy consumption in the cutting process, maximal specific throughput of the pulp, and pulp extraction ratio. The data obtained can be used for operational management of the process parameters, taking into account the geometrical dimensions of the melon and its inhomogeneous structure.

  12. An introduction to probability and statistical inference

    CERN Document Server

    Roussas, George G

    2003-01-01

    "The text is wonderfully written and has the mostcomprehensive range of exercise problems that I have ever seen." - Tapas K. Das, University of South Florida"The exposition is great; a mixture between conversational tones and formal mathematics; the appropriate combination for a math text at [this] level. In my examination I could find no instance where I could improve the book." - H. Pat Goeters, Auburn, University, Alabama* Contains more than 200 illustrative examples discussed in detail, plus scores of numerical examples and applications* Chapters 1-8 can be used independently for an introductory course in probability* Provides a substantial number of proofs

  13. Statistical Inference for Partially Observed Diffusion Processes

    DEFF Research Database (Denmark)

    Jensen, Anders Christian

    This thesis is concerned with parameter estimation for multivariate diffusion models. It gives a short introduction to diffusion models and related mathematical concepts. We then introduce the method of prediction-based estimating functions and describe in detail its application to a two-dimensional Ornstein-Uhlenbeck process, while chapter eight describes the details of an R package that was developed in relation to the application of the estimation procedure of chapters five and six.

  14. Statistical Inference for Cultural Consensus Theory

    Science.gov (United States)

    2014-02-24

    Agrawal, K. (Presenter), and Batchelder, W. H. Cultural Consensus Theory ... Aggregating Complete Signed Graphs Under a Balance Constraint -- Part 2. International Sunbelt Social Network Conference XXXII, Redondo Beach, California, March 2012.

  15. Statistical optimization for Monacolin K and yellow pigment production and citrinin reduction by Monascus purpureus in solid-state fermentation.

    Science.gov (United States)

    Jirasatid, Sani; Nopharatana, Montira; Kitsubun, Panit; Vichitsoonthonkul, Taweerat; Tongta, Anan

    2013-03-01

    Monacolin K and yellow pigment, produced by Monascus sp., have each been proven to be beneficial compounds as antihypercholesterolemic and anti-inflammatory agents, respectively. However, citrinin, a substance toxic to humans, is also synthesized by this fungus. In this research, solid-state fermentation of M. purpureus TISTR 3541 was optimized by statistical methodology to obtain high production of monacolin K and yellow pigment along with a low level of citrinin. A fractional factorial design was applied to identify the significant factors. Among the 13 variables, five parameters (i.e., glycerol, methionine, sodium nitrate, cultivation time, and temperature) influencing monacolin K, yellow pigment, and citrinin production were identified. A central composite design was further employed to investigate the optimum levels of these five factors. The maximum production of monacolin K and yellow pigment, 5,900 mg/kg and 1,700 units/g, respectively, and the minimum citrinin concentration of 0.26 mg/kg were achieved in a medium containing 2% glycerol, 0.14% methionine, and 0.01% sodium nitrate at 25°C for 16 days of cultivation. The yields of monacolin K and yellow pigment were about 3 and 1.5 times higher than with the basal medium, respectively, whereas citrinin was dramatically reduced by 36 times.

  16. Statistical optimization of ultraviolet irradiate conditions for vitamin D₂ synthesis in oyster mushrooms (Pleurotus ostreatus using response surface methodology.

    Directory of Open Access Journals (Sweden)

    Wei-Jie Wu

    Full Text Available Response surface methodology (RSM) was used to determine the optimum vitamin D2 synthesis conditions in oyster mushrooms (Pleurotus ostreatus). Ultraviolet B (UV-B) was selected as the most efficient irradiation source for the preliminary experiment, in addition to the levels of three independent variables, which included ambient temperature (25-45°C), exposure time (40-120 min), and irradiation intensity (0.6-1.2 W/m2). The statistical analysis indicated that, for the range which was studied, irradiation intensity was the most critical factor that affected vitamin D2 synthesis in oyster mushrooms. Under optimal conditions (ambient temperature of 28.16°C, UV-B intensity of 1.14 W/m2, and exposure time of 94.28 min), the experimental vitamin D2 content of 239.67 µg/g (dry weight) was in very good agreement with the predicted value of 245.49 µg/g, which verified the practicability of this strategy. Compared to fresh mushrooms, the lyophilized mushroom powder can synthesize a remarkably higher level of vitamin D2 (498.10 µg/g) within a much shorter UV-B exposure time (10 min), and thus should receive attention from the food processing industry.

  17. Production and Statistical Optimization of Oxytetracycline from Streptomyces rimosus NCIM 2213 using a New Cellulosic Substrate, Prosopis juliflora

    Directory of Open Access Journals (Sweden)

    Surjith Ramasamy

    2014-10-01

    Full Text Available Prosopis juliflora is a drought-resistant evergreen spiny tree that grows in semi-arid and arid tracts of tropical and sub-tropical regions of the world. Dry pods of P. juliflora are a rich source of carbon (40% total sugar) and nitrogen (15% of total nitrogen) and so can be considered a good substrate for microbial growth. The present study focused on the utilization of these pods for the production and statistical optimization of oxytetracycline (OTC) from Streptomyces rimosus NCIM 2213 under SSF. The spectral characterization and chemical color reactions of the purified OTC by UV, FTIR, 1H NMR, 13C NMR, and HPLC revealed that its structure was homologous to a standard sample. A central composite design with 26 trials yielded the following critical values for the supplements added to the dry pods: maltose (0.125 g/gds), inoculum size (0.617 mL/gds), CaCO3 (0.0026 g/gds), and moisture content (74.87%), with a maximum OTC yield of 5.02 mg/gds.

  18. Nanoscale Optimization and Statistical Modeling of Photoelectrochemical Water Splitting Efficiency of N-Doped TiO2 Nanotubes

    KAUST Repository

    Isimjan, Tayirjan T.

    2014-12-19

    Highly ordered nitrogen-doped titanium dioxide (N-doped TiO2) nanotube array films with enhanced photo-electrochemical water splitting efficiency (PCE) for hydrogen generation were fabricated by electrochemical anodization, followed by annealing in a nitrogen atmosphere. The morphology, structure and composition of the N-doped TiO2 nanotube array films were investigated by FE-SEM, XPS, UV-Vis and XRD. The effects of annealing temperature, heating rate and annealing time on the morphology, structure and photo-electrochemical properties of the N-doped TiO2 nanotube array films were investigated. A design of experiments method was applied in order to minimize the number of experiments and obtain a statistical model for this system. From the modelling results, optimum values for the influential factors were obtained in order to achieve the maximum PCE. The optimized experiment resulted in 7.42% PCE, which was within the 95% confidence interval of the value predicted by the model.

  19. Statistical optimization of the growth factors for Chaetoceros neogracile using fractional factorial design and central composite design.

    Science.gov (United States)

    Jeong, Sung-Eun; Park, Jae-Kweon; Kim, Jeong-Dong; Chang, In-Jeong; Hong, Seong-Joo; Kang, Sung-Ho; Lee, Choul-Gyun

    2008-12-01

    Statistical experimental designs, involving (i) a fractional factorial design (FFD) and (ii) a central composite design (CCD), were applied to optimize the culture medium constituents for production of a unique antifreeze protein by the Antarctic microalga Chaetoceros neogracile. The results of the FFD suggested that NaCl, KCl, MgCl2, and Na2SiO3 were significant variables that highly influenced the growth rate and biomass production. The optimum culture medium for the production of an antifreeze protein from C. neogracile was found to be Kalle's artificial seawater, pH of 7.0 ± 0.5, consisting of 28.566 g/l of NaCl, 3.887 g/l of MgCl2, 1.787 g/l of MgSO4, 1.308 g/l of CaSO4, 0.832 g/l of K2SO4, 0.124 g/l of CaCO3, 0.103 g/l of KBr, 0.0288 g/l of SrSO4, and 0.0282 g/l of H3BO3. The antifreeze activity significantly increased after cells were treated with cold shock (at -5°C) for 14 h. To the best of our knowledge, this is the first report demonstrating an antifreeze-like protein of C. neogracile.

  20. Cortical hierarchies perform Bayesian causal inference in multisensory perception.

    Directory of Open Access Journals (Sweden)

    Tim Rohe

    2015-02-01

    Full Text Available To form a veridical percept of the environment, the brain needs to integrate sensory signals from a common source but segregate those from independent sources. Thus, perception inherently relies on solving the "causal inference problem." Behaviorally, humans solve this problem optimally as predicted by Bayesian Causal Inference; yet, the underlying neural mechanisms are unexplored. Combining psychophysics, Bayesian modeling, functional magnetic resonance imaging (fMRI, and multivariate decoding in an audiovisual spatial localization task, we demonstrate that Bayesian Causal Inference is performed by a hierarchy of multisensory processes in the human brain. At the bottom of the hierarchy, in auditory and visual areas, location is represented on the basis that the two signals are generated by independent sources (= segregation. At the next stage, in posterior intraparietal sulcus, location is estimated under the assumption that the two signals are from a common source (= forced fusion. Only at the top of the hierarchy, in anterior intraparietal sulcus, the uncertainty about the causal structure of the world is taken into account and sensory signals are combined as predicted by Bayesian Causal Inference. Characterizing the computational operations of signal interactions reveals the hierarchical nature of multisensory perception in human neocortex. It unravels how the brain accomplishes Bayesian Causal Inference, a statistical computation fundamental for perception and cognition. Our results demonstrate how the brain combines information in the face of uncertainty about the underlying causal structure of the world.
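
    The model-based computation behind Bayesian Causal Inference can be sketched for two noisy spatial cues using the standard common-cause versus independent-causes formulation; this is a hedged illustration, and the Gaussian spatial prior, noise levels and cue locations below are assumed values, not parameters fitted in the study.

      import numpy as np

      def p_common(x_a, x_v, sigma_a, sigma_v, sigma_p, prior_common=0.5):
          """Posterior probability that auditory and visual cues share one cause."""
          # Likelihood of both measurements under a single source, integrating the
          # unknown source location over a zero-mean Gaussian prior with sd sigma_p.
          var_c = (sigma_a**2 * sigma_v**2 + sigma_a**2 * sigma_p**2
                   + sigma_v**2 * sigma_p**2)
          like_c = np.exp(-0.5 * ((x_a - x_v)**2 * sigma_p**2 + x_a**2 * sigma_v**2
                                  + x_v**2 * sigma_a**2) / var_c) / (2 * np.pi * np.sqrt(var_c))
          # Likelihood under two independent sources, each with its own prior draw.
          var_a, var_v = sigma_a**2 + sigma_p**2, sigma_v**2 + sigma_p**2
          like_i = (np.exp(-0.5 * (x_a**2 / var_a + x_v**2 / var_v))
                    / (2 * np.pi * np.sqrt(var_a * var_v)))
          return like_c * prior_common / (like_c * prior_common + like_i * (1 - prior_common))

      # Nearby cues favour integration; widely separated cues favour segregation.
      print(p_common(x_a=2.0, x_v=4.0, sigma_a=4.0, sigma_v=1.0, sigma_p=15.0))
      print(p_common(x_a=-15.0, x_v=15.0, sigma_a=4.0, sigma_v=1.0, sigma_p=15.0))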

  1. Inference in hybrid Bayesian networks

    DEFF Research Database (Denmark)

    Lanseth, Helge; Nielsen, Thomas Dyhre; Rumí, Rafael

    2009-01-01

    Since the 1980s, Bayesian Networks (BNs) have become increasingly popular for building statistical models of complex systems. This is particularly true for boolean systems, where BNs often prove to be a more efficient modelling framework than traditional reliability-techniques (like fault trees...... decade's research on inference in hybrid Bayesian networks. The discussions are linked to an example model for estimating human reliability....

  2. Mixed normal inference on multicointegration

    NARCIS (Netherlands)

    Boswijk, H.P.

    2009-01-01

    Asymptotic likelihood analysis of cointegration in I(2) models, see Johansen (1997, 2006), Boswijk (2000) and Paruolo (2000), has shown that inference on most parameters is mixed normal, implying hypothesis test statistics with an asymptotic χ2 null distribution. The asymptotic distribution of the

  3. Statistical optimization of biodiesel production from sunflower waste cooking oil using basic heterogeneous biocatalyst prepared from eggshells

    Directory of Open Access Journals (Sweden)

    Nour Sh. El-Gendy

    2015-03-01

    Full Text Available A statistical design of experiments (DOE) was applied to investigate the biodiesel fuel (BDF) production process from sunflower waste cooking oil (SWCO) using a heterogeneous bio-catalyst produced from eggshells (ES). It was based on a 3-level D-optimal design involving as factors the methanol:oil (M:O) molar ratio, catalyst concentration (wt%), reaction time (min) and mixing rate (rpm). Twenty runs were carried out. A predictive linear interaction model was fitted to determine how significant the effects of these variables are in practice. LINGO software was used to find the optimum values of the aforementioned variables for enhancing the process. According to the results obtained, the most dominant positive factor influencing the response variable (% BDF yield) was the M:O molar ratio, followed by catalyst concentration (wt%) and mixing rate in decreasing order, while the reaction time showed a negative effect on the yield. The maximum BDF yield (98.8% and 97.5%, predicted and experimental, respectively) was obtained at a 6:1 M:O molar ratio, a catalyst concentration of 3 wt%, a reaction time of 30 min, a mixing rate of 350 rpm and 60 °C. Response surface methodology (RSM) was also applied to study the interactive effects of the independent variables on BDF yield. It was found that the interaction between the M:O ratio and catalyst concentration (wt%) has a more significant effect than the interactions between the other variables. The activity of the produced bio-catalyst was comparable to that of chemical CaO and the immobilized enzyme Novozym 435. All the physicochemical characteristics of the BDF produced using the prepared bio-catalyst, and of its blends with petro-diesel fuel (PDF), are acceptable and meet most of the required standard specifications.

  4. Optimized statistical parametric mapping procedure for NIRS data contaminated by motion artifacts : Neurometric analysis of body schema extension.

    Science.gov (United States)

    Suzuki, Satoshi

    2017-09-01

    This study investigated the spatial distribution of brain activity on body schema (BS) modification induced by natural body motion using two versions of a hand-tracing task. In Task 1, participants traced Japanese Hiragana characters using the right forefinger, requiring no BS expansion. In Task 2, participants performed the tracing task with a long stick, requiring BS expansion. Spatial distribution was analyzed using general linear model (GLM)-based statistical parametric mapping of near-infrared spectroscopy data contaminated with motion artifacts caused by the hand-tracing task. Three methods were utilized in series to counter the artifacts, and optimal conditions and modifications were investigated: a model-free method (Step 1), a convolution matrix method (Step 2), and a boxcar-function-based Gaussian convolution method (Step 3). The results revealed four methodological findings: (1) Deoxyhemoglobin was suitable for the GLM because both Akaike information criterion and the variance against the averaged hemodynamic response function were smaller than for other signals, (2) a high-pass filter with a cutoff frequency of .014 Hz was effective, (3) the hemodynamic response function computed from a Gaussian kernel function and its first- and second-derivative terms should be included in the GLM model, and (4) correction of non-autocorrelation and use of effective degrees of freedom were critical. Investigating z-maps computed according to these guidelines revealed that contiguous areas of BA7-BA40-BA21 in the right hemisphere became significantly activated ([Formula: see text], [Formula: see text], and [Formula: see text], respectively) during BS modification while performing the hand-tracing task.
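
    The GLM construction described above (a boxcar task regressor convolved with a Gaussian-kernel hemodynamic response and its first and second derivatives, followed by least-squares fitting) can be sketched as follows. This is a hedged illustration: the sampling rate, block timing, kernel delay and width, and the synthetic deoxyhemoglobin signal are all assumed values, not the study's acquisition parameters.

      import numpy as np

      fs = 10.0                                    # sampling rate (Hz), assumed
      t = np.arange(0, 300, 1 / fs)
      boxcar = ((t % 60) < 30).astype(float)       # alternating 30 s task / 30 s rest blocks

      # Gaussian kernel and its first and second derivatives (delay and width assumed).
      tau = np.arange(0, 15, 1 / fs)
      g = np.exp(-0.5 * ((tau - 6.0) / 2.0)**2)
      dg, ddg = np.gradient(g), np.gradient(np.gradient(g))

      # Convolve the boxcar with each kernel and assemble the design matrix.
      regs = [np.convolve(boxcar, k)[:t.size] for k in (g, dg, ddg)]
      X = np.column_stack([np.ones_like(t)] + regs)

      # Synthetic deoxyhemoglobin signal plus noise, then ordinary least squares.
      rng = np.random.default_rng(0)
      y = 0.8 * regs[0] + rng.normal(0, 1.0, size=t.size)
      beta, *_ = np.linalg.lstsq(X, y, rcond=None)
      print("estimated effect of the task regressor:", beta[1])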

  5. Optimal prediction intervals of wind power generation

    DEFF Research Database (Denmark)

    Wan, Can; Wu, Zhao; Pinson, Pierre

    2014-01-01

    direct optimization of both the coverage probability and sharpness to ensure the quality. The proposed method does not involve the statistical inference or distribution assumption of forecasting errors needed in most existing methods. Case studies using real wind farm data from Australia have been...
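
    Coverage probability and sharpness, the two interval-quality measures named above, can be computed directly from interval bounds and realized values. The sketch below is a hedged illustration with synthetic forecasts and intervals, not the Australian wind farm data or the paper's optimization procedure.

      import numpy as np

      def evaluate_intervals(lower, upper, actual, nominal=0.90):
          # Empirical coverage probability and mean width (sharpness) of the intervals.
          coverage = ((actual >= lower) & (actual <= upper)).mean()
          sharpness = (upper - lower).mean()
          return coverage, coverage - nominal, sharpness

      rng = np.random.default_rng(0)
      actual = rng.uniform(0, 1, size=500)                         # normalized wind power
      forecast = np.clip(actual + rng.normal(0, 0.08, 500), 0, 1)  # noisy point forecast
      half_width = rng.uniform(0.05, 0.25, 500)
      lower = np.clip(forecast - half_width, 0, 1)
      upper = np.clip(forecast + half_width, 0, 1)

      cov, cov_error, width = evaluate_intervals(lower, upper, actual)
      print(f"coverage={cov:.3f}  coverage error={cov_error:+.3f}  mean width={width:.3f}")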

  6. Inference in models with adaptive learning

    NARCIS (Netherlands)

    Chevillon, G.; Massmann, M.; Mavroeidis, S.

    2010-01-01

    Identification of structural parameters in models with adaptive learning can be weak, causing standard inference procedures to become unreliable. Learning also induces persistent dynamics, and this makes the distribution of estimators and test statistics non-standard. Valid inference can be

  7. The optimal monochromatic spectral computed tomographic imaging plus adaptive statistical iterative reconstruction algorithm can improve the superior mesenteric vessel image quality

    Energy Technology Data Exchange (ETDEWEB)

    Yin, Xiao-Ping; Zuo, Zi-Wei; Xu, Ying-Jin; Wang, Jia-Ning [CT/MRI room, Affiliated Hospital of Hebei University, Baoding, Hebei, 071000 (China); Liu, Huai-Jun, E-mail: hebeiliu@outlook.com [Department of Medical Imaging, The Second Hospital of Hebei Medical University, Shijiazhuang, Hebei, 050000 (China); Liang, Guang-Lu [CT/MRI room, Affiliated Hospital of Hebei University, Baoding, Hebei, 071000 (China); Gao, Bu-Lang, E-mail: browngao@163.com [Department of Medical Research, Shijiazhuang First Hospital, Shijiazhuang, Hebei, 050011 (China)

    2017-04-15

    Objective: To investigate the effect of optimal monochromatic spectral computed tomography (CT) imaging plus adaptive statistical iterative reconstruction on the image quality of the superior mesenteric artery and vein. Materials and methods: The gemstone spectral CT angiographic data of 25 patients were reconstructed in the following three groups: 70 keV, the optimal monochromatic imaging, and the optimal monochromatic imaging plus 40% iterative reconstruction. The CT value, image noise (IN), background CT value and noise, contrast-to-noise ratio (CNR), signal-to-noise ratio (SNR) and image scores of the vessels and surrounding tissues were analyzed. Results: In the 70 keV, the optimal monochromatic and the optimal monochromatic plus 40% iterative reconstruction groups, the mean image quality scores were 3.86, 4.24 and 4.25 for the superior mesenteric artery and 3.46, 3.78 and 3.81 for the superior mesenteric vein, respectively. The image quality scores for the optimal monochromatic and the optimal monochromatic plus 40% iterative reconstruction groups were significantly greater than for the 70 keV group (P < 0.05). The vascular CT value, image noise, background noise, CNR and SNR were significantly (P < 0.001) greater in the optimal monochromatic and the optimal monochromatic plus 40% iterative reconstruction groups than in the 70 keV group. The optimal monochromatic plus 40% iterative reconstruction group had significantly (P < 0.05) lower image and background noise but higher CNR and SNR than the other two groups. Conclusion: Optimal monochromatic imaging combined with 40% iterative reconstruction, using a low contrast-agent dosage and low injection rate, can significantly improve the image quality of the superior mesenteric artery and vein.

  8. Statistics of Extremes

    KAUST Repository

    Davison, Anthony C.; Huser, Raphaë l

    2015-01-01

    Statistics of extremes concerns inference for rare events. Often the events have never yet been observed, and their probabilities must therefore be estimated by extrapolation of tail models fitted to available data. Because data concerning the event

  9. Graphical Geometric and Learning/Optimization-Based Methods in Statistical Signal and Image Processing Object Recognition and Data Fusion

    National Research Council Canada - National Science Library

    Willsky, Alan S

    2008-01-01

    ...: (a) the use of graphical, hierarchical, and multiresolution representations for the development of statistical modeling methodologies for complex phenomena and for the construction of scalable algorithms...

  10. On Maximum Entropy and Inference

    Directory of Open Access Journals (Sweden)

    Luigi Gresele

    2017-11-01

    Full Text Available Maximum entropy is a powerful concept that entails a sharp separation between relevant and irrelevant variables. It is typically invoked in inference, once an assumption is made on what the relevant variables are, in order to estimate a model from data, that affords predictions on all other (dependent) variables. Conversely, maximum entropy can be invoked to retrieve the relevant variables (sufficient statistics) directly from the data, once a model is identified by Bayesian model selection. We explore this approach in the case of spin models with interactions of arbitrary order, and we discuss how relevant interactions can be inferred. In this perspective, the dimensionality of the inference problem is not set by the number of parameters in the model, but by the frequency distribution of the data. We illustrate the method showing its ability to recover the correct model in a few prototype cases and discuss its application on a real dataset.
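
    A minimal sketch of maximum-entropy inference under a single moment constraint (the classic loaded-die example), solved for its Lagrange multiplier; this is a toy illustration with assumed numbers, not the spin-model machinery of the paper.

      import numpy as np
      from scipy.optimize import brentq

      states = np.arange(1, 7)       # faces of a die
      target_mean = 4.5              # observed average, the constrained statistic (assumed)

      def mean_at(lam):
          # The maximum-entropy distribution under a mean constraint is exponential
          # in the state: p_i proportional to exp(lam * i).
          w = np.exp(lam * states)
          return (states * w).sum() / w.sum()

      # Solve for the Lagrange multiplier that reproduces the observed mean.
      lam = brentq(lambda l: mean_at(l) - target_mean, -10, 10)
      p = np.exp(lam * states)
      p /= p.sum()
      print("lambda =", round(lam, 4), "max-ent probabilities:", np.round(p, 4))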

  11. Application of Statistical Design to the Optimization of Culture Medium for Prodigiosin Production by Serratia marcescens SWML08

    OpenAIRE

    Venil, C. K.; Lakshmanaperumalsamy, P.

    2009-01-01

    A combination of Plackett-Burman design (PBD) and Box-Behnken design (BBD) was applied to optimize different factors for prodigiosin production by Serratia marcescens SWML08. Among 11 factors, incubation temperature and supplementation of the culture medium with (NH4)2PO4 and trace salts were selected due to their significant positive effect on prodigiosin yield. The Box-Behnken design, a response surface methodology, was used for further optimization of these selected factors for better pro...

  12. Statistical optimization of beta-carotene production by Arthrobacter agilis A17 using response surface methodology and Box-Behnken design

    Science.gov (United States)

    Özdal, Murat; Özdal, Özlem Gür; Gürkök, Sümeyra

    2017-04-01

    β-Carotene is a commercially important natural pigment and has been widely applied in the medicine, pharmaceutical, food, feed and cosmetic industries. The current study aimed to investigate the usability of molasses for β-carotene production by Arthrobacter agilis A17 (KP318146) and to optimize the production process. A Box-Behnken design within response surface methodology was used to determine the optimum levels and the interactions of three independent variables, namely molasses, yeast extract and KH2PO4, each at three levels. The β-carotene yield in the optimized medium, containing 70 g/l molasses, 25 g/l yeast extract and 0.96 g/l KH2PO4, reached up to 100 mg/l, which is approximately 2.5-fold higher than the yield obtained from the control cultivation. A remarkable β-carotene production on an inexpensive carbon source was achieved with the use of statistical optimization.

  13. Introduction to Bayesian statistics

    CERN Document Server

    Bolstad, William M

    2017-01-01

    There is a strong upsurge in the use of Bayesian methods in applied statistical analysis, yet most introductory statistics texts only present frequentist methods. Bayesian statistics has many important advantages that students should learn about if they are going into fields where statistics will be used. In this Third Edition, four newly added chapters address topics that reflect the rapid advances in the field of Bayesian statistics. The author continues to provide a Bayesian treatment of introductory statistical topics, such as scientific data gathering, discrete random variables, robust Bayesian methods, and Bayesian approaches to inference for discrete random variables, binomial proportion, Poisson, normal mean, and simple linear regression. In addition, newly developing topics in the field are presented in four new chapters: Bayesian inference with unknown mean and variance; Bayesian inference for a Multivariate Normal mean vector; Bayesian inference for the Multiple Linear Regression Model; and Computati...
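    One of the introductory topics listed above, Bayesian inference for a binomial proportion, reduces to a conjugate Beta-Binomial update; a minimal Python illustration with made-up numbers follows.

```python
# Conjugate Bayesian update for a binomial proportion (Beta prior -> Beta posterior).
# Illustrative numbers only.
from scipy.stats import beta

a0, b0 = 1.0, 1.0          # uniform Beta(1, 1) prior
successes, trials = 7, 20  # hypothetical data

a_post, b_post = a0 + successes, b0 + trials - successes
posterior = beta(a_post, b_post)

print("posterior mean:", posterior.mean())
print("95% credible interval:", posterior.interval(0.95))
```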

  14. Statistical utilitarianism

    OpenAIRE

    Pivato, Marcus

    2013-01-01

    We show that, in a sufficiently large population satisfying certain statistical regularities, it is often possible to accurately estimate the utilitarian social welfare function, even if we only have very noisy data about individual utility functions and interpersonal utility comparisons. In particular, we show that it is often possible to identify an optimal or close-to-optimal utilitarian social choice using voting rules such as the Borda rule, approval voting, relative utilitarianism, or a...
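    A toy Python simulation of the claim (not the paper's formal argument): with many voters and noisy perceptions of utility, the Borda winner tends to coincide with the utilitarian optimum. All numbers below are illustrative.

```python
# Toy check: with many voters and noisy utility perceptions, the Borda winner
# tends to coincide with the utilitarian optimum. Illustrative numbers only.
import numpy as np

rng = np.random.default_rng(3)
n_voters, n_alts = 10000, 5

quality = rng.normal(scale=0.5, size=n_alts)               # common quality component
true_util = quality + rng.normal(size=(n_voters, n_alts))  # true individual utilities
noisy_util = true_util + rng.normal(scale=2.0, size=true_util.shape)

utilitarian_winner = int(true_util.sum(axis=0).argmax())

# Borda: each voter ranks alternatives by noisy utility; rank positions are summed.
ranks = np.argsort(np.argsort(noisy_util, axis=1), axis=1)  # 0 = worst, n_alts-1 = best
borda_winner = int(ranks.sum(axis=0).argmax())

print("utilitarian optimum:", utilitarian_winner, "| Borda winner:", borda_winner)
```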

  15. STATISTICAL APPROACH FOR MULTI CRITERIA OPTIMIZATION OF CUTTING PARAMETERS OF TURNING ON HEAT TREATED BERYLLIUM COPPER ALLOY

    Directory of Open Access Journals (Sweden)

    K. DEVAKI DEVI

    2017-08-01

    Full Text Available In machining operations, achieving the desired performance features of the machined product is a challenging job, because these quality features are highly correlated and are influenced directly or indirectly by the process parameters and their interaction effects. This paper presents an effective method to determine optimal machining parameters in a turning operation on heat-treated beryllium copper alloy so as to minimize the surface roughness, cutting forces and work-tool interface temperature while maximizing the metal removal rate. The scope of this work is extended to multi-objective optimization. Response Surface Methodology is adopted for preparing the design matrix, generating the ANOVA, and performing the optimization. An accurate model is obtained to analyse the effect of each parameter on the output. The input parameters considered in this work are cutting speed, feed, depth of cut, work material (annealed and hardened) and tool material (CBN and HSS).
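    The record combines several responses (surface roughness, cutting forces, interface temperature and metal removal rate). The abstract does not say how they are aggregated; a common choice in RSM-based multi-objective work is a Derringer-Suich composite desirability, sketched below in Python with made-up response values and limits.

```python
# Composite desirability for multi-response optimisation (Derringer-Suich style):
# "smaller is better" and "larger is better" responses are each mapped to [0, 1]
# and combined by a geometric mean. All values below are made up.
import numpy as np

def d_smaller_is_better(y, low, high):
    # 1 at or below `low`, 0 at or above `high`, linear in between.
    return np.clip((high - y) / (high - low), 0.0, 1.0)

def d_larger_is_better(y, low, high):
    # 0 at or below `low`, 1 at or above `high`, linear in between.
    return np.clip((y - low) / (high - low), 0.0, 1.0)

# Hypothetical predicted responses for one candidate parameter setting.
roughness, force, temperature, mrr = 1.8, 210.0, 340.0, 95.0

d = np.array([
    d_smaller_is_better(roughness, 0.5, 3.0),
    d_smaller_is_better(force, 150.0, 400.0),
    d_smaller_is_better(temperature, 250.0, 500.0),
    d_larger_is_better(mrr, 40.0, 120.0),
])
composite = d.prod() ** (1.0 / len(d))   # geometric mean of individual desirabilities
print("individual desirabilities:", np.round(d, 3), "| composite:", round(composite, 3))
```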

  16. Statistical Optimization of the Induction of Phytase Production by Arabinose in a recombinant E. coli using Response Surface Methodology

    Directory of Open Access Journals (Sweden)

    Abd-El Aziem Farouk

    2017-11-01

    Full Text Available The production of phytase in a recombinant E. coli using the pBAD expression system was optimized using response surface methodology with a full-factorial face-centered central composite design. The ampicillin and arabinose concentrations in the cultivation medium and the incubation temperature were optimized in order to maximize phytase production using a 2³ central composite experimental design. With this design the number of actual experiments performed could be reduced while still allowing elucidation of possible interactions among these factors. The most significant effects were shown to be the linear and quadratic terms of the incubation temperature. Optimal conditions for phytase production were determined to be 100 µg/ml ampicillin, 0.2% arabinose and an incubation temperature of 37 °C. The production of phytase in the recombinant E. coli was scaled up to 100 ml and 1000 ml.
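    As a hedged illustration of the design named in this record, the Python sketch below constructs a face-centered central composite design for three factors in coded units; the mapping to natural units (ampicillin, arabinose, temperature) is hypothetical and not the study's exact settings.

```python
# Face-centred central composite design (CCD) for three factors in coded units:
# 2^3 factorial corners, 6 axial points on the faces (alpha = 1), plus centre runs.
# The decoding to natural units below is hypothetical, not the study's levels.
import itertools
import numpy as np

def face_centered_ccd(k=3, center_points=3):
    factorial = np.array(list(itertools.product([-1, 1], repeat=k)), dtype=float)
    axial = np.vstack([v for i in range(k) for v in (np.eye(k)[i], -np.eye(k)[i])])
    center = np.zeros((center_points, k))
    return np.vstack([factorial, axial, center])

design = face_centered_ccd()

# Hypothetical natural units: ampicillin (µg/ml), arabinose (%), temperature (°C).
lows = np.array([50.0, 0.05, 30.0])
highs = np.array([150.0, 0.4, 42.0])
natural = (design + 1) / 2 * (highs - lows) + lows
print(natural.round(2))
```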

  17. Development and statistical optimization of nefopam hydrochloride loaded nanospheres for neuropathic pain using Box–Behnken design

    Directory of Open Access Journals (Sweden)

    S. Sukhbir

    2016-09-01

    Full Text Available Nefopam hydrochloride (NFH) is a non-opioid, centrally acting analgesic drug used to treat chronic conditions such as neuropathic pain. In the current research, sustained-release nefopam hydrochloride loaded nanospheres (NFH-NS) were successfully synthesized from a binary mixture of Eudragit RL 100 and RS 100 with sorbitan monooleate as surfactant by the quasi-solvent-diffusion technique, and optimized by a 3⁵ Box–Behnken design to evaluate the effects of process and formulation variables. Fourier transform infrared spectroscopy (FTIR), differential scanning calorimetry (DSC) and X-ray diffraction (XRD) confirmed the absence of drug–polymer incompatibility and the formation of nanospheres. The desirability function obtained from Design-Expert software for the optimized formulation was 0.920. The optimized batch of NFH-NS had a mean particle size of 328.36 ± 2.23 nm, entrapment efficiency (% EE) of 84.97 ± 1.23%, process yield of 83.60 ± 1.31% and drug loading (% DL) of 21.41 ± 0.89%. Dynamic light scattering (DLS), zeta potential analysis and scanning electron microscopy (SEM) validated the size, charge and shape of the nanospheres, respectively. An in-vitro drug release study revealed a biphasic release pattern from the optimized nanospheres, and the Korsmeyer–Peppas model gave the best kinetic fit, with a release exponent less than 0.45. A chronic constriction injury (CCI) model of the optimized NFH-NS in Wistar rats produced a significant difference in neuropathic pain behaviour (p < 0.05) as compared to free NFH over 10 h, indicating sustained action. Long-term and accelerated stability testing of the optimized NFH-NS revealed a degradation rate constant of 1.695 × 10⁻⁴ and a shelf-life of 621 days at 25 ± 2 °C/60 ± 5% RH.
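    The release kinetics reported above follow the Korsmeyer–Peppas model, Mt/M∞ = k·tⁿ, which is usually fitted as a straight line on log–log axes using the early portion of the release curve. A minimal Python sketch with synthetic release data (not the study's measurements) follows.

```python
# Korsmeyer-Peppas fit: Mt/Minf = k * t^n, fitted on log-log axes using only the
# early (<= 60 %) portion of the release curve, as is conventional.
# The release data below are synthetic placeholders, not the study's measurements.
import numpy as np

t = np.array([0.5, 1, 2, 3, 4, 6, 8, 10])                              # hours
release = np.array([0.15, 0.20, 0.27, 0.31, 0.34, 0.41, 0.46, 0.50])   # fraction released

mask = release <= 0.60
slope, intercept = np.polyfit(np.log(t[mask]), np.log(release[mask]), 1)
n_exp, k = slope, np.exp(intercept)
print(f"release exponent n = {n_exp:.2f}, rate constant k = {k:.3f}")
# An exponent below 0.45 is commonly read as (quasi-)Fickian, diffusion-controlled release.
```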

  18. Development and statistical optimization of nefopam hydrochloride loaded nanospheres for neuropathic pain using Box-Behnken design.

    Science.gov (United States)

    Sukhbir, S; Yashpal, S; Sandeep, A

    2016-09-01

    Nefopam hydrochloride (NFH) is a non-opioid, centrally acting analgesic drug used to treat chronic conditions such as neuropathic pain. In the current research, sustained-release nefopam hydrochloride loaded nanospheres (NFH-NS) were successfully synthesized from a binary mixture of Eudragit RL 100 and RS 100 with sorbitan monooleate as surfactant by the quasi-solvent-diffusion technique, and optimized by a 3⁵ Box-Behnken design to evaluate the effects of process and formulation variables. Fourier transform infrared spectroscopy (FTIR), differential scanning calorimetry (DSC) and X-ray diffraction (XRD) confirmed the absence of drug-polymer incompatibility and the formation of nanospheres. The desirability function obtained from Design-Expert software for the optimized formulation was 0.920. The optimized batch of NFH-NS had a mean particle size of 328.36 ± 2.23 nm, entrapment efficiency (% EE) of 84.97 ± 1.23%, process yield of 83.60 ± 1.31% and drug loading (% DL) of 21.41 ± 0.89%. Dynamic light scattering (DLS), zeta potential analysis and scanning electron microscopy (SEM) validated the size, charge and shape of the nanospheres, respectively. An in-vitro drug release study revealed a biphasic release pattern from the optimized nanospheres, and the Korsmeyer-Peppas model gave the best kinetic fit, with a release exponent less than 0.45. A chronic constriction injury (CCI) model of the optimized NFH-NS in Wistar rats produced a significant difference in neuropathic pain behaviour (p < 0.05) as compared to free NFH over 10 h, indicating sustained action. Long-term and accelerated stability testing of the optimized NFH-NS revealed a degradation rate constant of 1.695 × 10⁻⁴ and a shelf-life of 621 days at 25 ± 2 °C/60 ± 5% RH.

  19. Direct Learning of Systematics-Aware Summary Statistics

    CERN Multimedia

    CERN. Geneva

    2018-01-01

    Complex machine learning tools, such as deep neural networks and gradient boosting algorithms, are increasingly being used to construct powerful discriminative features for High Energy Physics analyses. These methods are typically trained on simulated or auxiliary data samples by optimising some classification or regression surrogate objective. The learned feature representations are then used to build a sample-based statistical model to perform inference (e.g. interval estimation or hypothesis testing) over a set of parameters of interest. However, the effectiveness of this approach can be reduced by the presence of known uncertainties that cause differences between training and experimental data, which are included in the statistical model via nuisance parameters. This work presents an end-to-end algorithm which leverages existing deep learning technologies but directly aims to produce inference-optimal sample-summary statistics. By including the statistical model and a differentiable approximation of ...
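    As a simplified, hedged illustration of an inference-aware objective of the kind described here (not the speaker's implementation), the Python sketch below soft-bins a learned 1-D summary into Poisson expected counts and computes the Fisher information on a signal-strength parameter; an end-to-end method would maximise this quantity by backpropagating through the soft binning into the network weights. All inputs are synthetic.

```python
# Simplified inference-aware objective: a (soft) histogram of a learned summary
# statistic defines Poisson expected counts n_i(mu) = mu*s_i + b_i; the Fisher
# information on mu measures how well the summary constrains mu. Synthetic data.
import numpy as np

rng = np.random.default_rng(4)

# Stand-in network outputs for signal and background simulation events.
f_signal = rng.normal(1.0, 0.5, size=20000)
f_background = rng.normal(0.0, 0.7, size=100000)

def soft_hist(x, edges, temperature=0.1):
    """Soft binning via a softmax over squared distances to the bin centres."""
    centres = 0.5 * (edges[:-1] + edges[1:])
    logits = -((x[:, None] - centres[None, :]) ** 2) / temperature
    w = np.exp(logits - logits.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    return w.sum(axis=0)

edges = np.linspace(-2.5, 3.0, 11)
s = soft_hist(f_signal, edges) / len(f_signal)          # per-event signal shape
b = soft_hist(f_background, edges) / len(f_background)  # per-event background shape

mu, n_sig, n_bkg = 1.0, 500.0, 20000.0                  # parameter of interest and yields
expected = mu * n_sig * s + n_bkg * b
# Poisson Fisher information on mu: sum_i (d n_i / d mu)^2 / n_i.
fisher = np.sum((n_sig * s) ** 2 / expected)
print(f"approximate uncertainty on mu: {1 / np.sqrt(fisher):.3f}")
```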

  20. Nonparametric predictive inference in reliability

    International Nuclear Information System (INIS)

    Coolen, F.P.A.; Coolen-Schrijner, P.; Yan, K.J.

    2002-01-01

    We introduce a recently developed statistical approach, called nonparametric predictive inference (NPI), to reliability. Bounds for the survival function of a future observation are presented. We illustrate how NPI can deal with right-censored data, and discuss aspects of competing risks. We present possible applications of NPI for Bernoulli data, and we briefly outline applications of NPI to replacement decisions. The emphasis is on the introduction and illustration of NPI in reliability contexts; detailed mathematical justifications are presented elsewhere.
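    For the uncensored case, the NPI-style bounds for the survival function of the next observation follow from Hill's A(n) assumption: the next value falls in each of the n+1 intervals between ordered observations with probability 1/(n+1). A minimal Python sketch (right-censoring, which the paper also treats, is not handled here):

```python
# Lower and upper bounds for the survival function of the next observation,
# uncensored case, based on Hill's A(n) assumption: the next value falls in each
# of the n+1 intervals between ordered observations with probability 1/(n+1).
import numpy as np

def npi_survival_bounds(data, t):
    x = np.sort(np.asarray(data, dtype=float))
    n = x.size
    i = np.searchsorted(x, t, side="right")   # number of observations <= t
    lower = (n - i) / (n + 1)
    upper = min((n - i + 1) / (n + 1), 1.0)
    return lower, upper

failure_times = [12.0, 30.0, 45.0, 61.0, 80.0, 95.0]   # hypothetical lifetimes
for t in (10.0, 50.0, 100.0):
    lo, hi = npi_survival_bounds(failure_times, t)
    print(f"t = {t:5.1f}:  {lo:.3f} <= S(t) <= {hi:.3f}")
```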