A Statistical Approach For Modeling Tropical Cyclones. Synthetic Hurricanes Generator Model
Energy Technology Data Exchange (ETDEWEB)
Pasqualini, Donatella [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
2016-05-11
This manuscript brie y describes a statistical ap- proach to generate synthetic tropical cyclone tracks to be used in risk evaluations. The Synthetic Hur- ricane Generator (SynHurG) model allows model- ing hurricane risk in the United States supporting decision makers and implementations of adaptation strategies to extreme weather. In the literature there are mainly two approaches to model hurricane hazard for risk prediction: deterministic-statistical approaches, where the storm key physical parameters are calculated using physi- cal complex climate models and the tracks are usually determined statistically from historical data; and sta- tistical approaches, where both variables and tracks are estimated stochastically using historical records. SynHurG falls in the second category adopting a pure stochastic approach.
Risk prediction model: Statistical and artificial neural network approach
Paiman, Nuur Azreen; Hariri, Azian; Masood, Ibrahim
2017-04-01
Prediction models are increasingly gaining popularity and had been used in numerous areas of studies to complement and fulfilled clinical reasoning and decision making nowadays. The adoption of such models assist physician's decision making, individual's behavior, and consequently improve individual outcomes and the cost-effectiveness of care. The objective of this paper is to reviewed articles related to risk prediction model in order to understand the suitable approach, development and the validation process of risk prediction model. A qualitative review of the aims, methods and significant main outcomes of the nineteen published articles that developed risk prediction models from numerous fields were done. This paper also reviewed on how researchers develop and validate the risk prediction models based on statistical and artificial neural network approach. From the review done, some methodological recommendation in developing and validating the prediction model were highlighted. According to studies that had been done, artificial neural network approached in developing the prediction model were more accurate compared to statistical approach. However currently, only limited published literature discussed on which approach is more accurate for risk prediction model development.
Statistical sampling approaches for soil monitoring
Brus, D.J.
2014-01-01
This paper describes three statistical sampling approaches for regional soil monitoring, a design-based, a model-based and a hybrid approach. In the model-based approach a space-time model is exploited to predict global statistical parameters of interest such as the space-time mean. In the hybrid
Modelling diversity in building occupant behaviour: a novel statistical approach
DEFF Research Database (Denmark)
Haldi, Frédéric; Calì, Davide; Andersen, Rune Korsholm
2016-01-01
We propose an advanced modelling framework to predict the scope and effects of behavioural diversity regarding building occupant actions on window openings, shading devices and lighting. We develop a statistical approach based on generalised linear mixed models to account for the longitudinal nat...
Statistical inference an integrated Bayesianlikelihood approach
Aitkin, Murray
2010-01-01
Filling a gap in current Bayesian theory, Statistical Inference: An Integrated Bayesian/Likelihood Approach presents a unified Bayesian treatment of parameter inference and model comparisons that can be used with simple diffuse prior specifications. This novel approach provides new solutions to difficult model comparison problems and offers direct Bayesian counterparts of frequentist t-tests and other standard statistical methods for hypothesis testing.After an overview of the competing theories of statistical inference, the book introduces the Bayes/likelihood approach used throughout. It pre
Tornadoes and related damage costs: statistical modeling with a semi-Markov approach
Corini, Chiara; D'Amico, Guglielmo; Petroni, Filippo; Prattico, Flavio; Manca, Raimondo
2015-01-01
We propose a statistical approach to tornadoes modeling for predicting and simulating occurrences of tornadoes and accumulated cost distributions over a time interval. This is achieved by modeling the tornadoes intensity, measured with the Fujita scale, as a stochastic process. Since the Fujita scale divides tornadoes intensity into six states, it is possible to model the tornadoes intensity by using Markov and semi-Markov models. We demonstrate that the semi-Markov approach is able to reprod...
International Nuclear Information System (INIS)
Lü, Xiaoshu; Lu, Tao; Kibert, Charles J.; Viljanen, Martti
2015-01-01
Highlights: • This paper presents a new modeling method to forecast energy demands. • The model is based on physical–statistical approach to improving forecast accuracy. • A new method is proposed to address the heterogeneity challenge. • Comparison with measurements shows accurate forecasts of the model. • The first physical–statistical/heterogeneous building energy modeling approach is proposed and validated. - Abstract: Energy consumption forecasting is a critical and necessary input to planning and controlling energy usage in the building sector which accounts for 40% of the world’s energy use and the world’s greatest fraction of greenhouse gas emissions. However, due to the diversity and complexity of buildings as well as the random nature of weather conditions, energy consumption and loads are stochastic and difficult to predict. This paper presents a new methodology for energy demand forecasting that addresses the heterogeneity challenges in energy modeling of buildings. The new method is based on a physical–statistical approach designed to account for building heterogeneity to improve forecast accuracy. The physical model provides a theoretical input to characterize the underlying physical mechanism of energy flows. Then stochastic parameters are introduced into the physical model and the statistical time series model is formulated to reflect model uncertainties and individual heterogeneity in buildings. A new method of model generalization based on a convex hull technique is further derived to parameterize the individual-level model parameters for consistent model coefficients while maintaining satisfactory modeling accuracy for heterogeneous buildings. The proposed method and its validation are presented in detail for four different sports buildings with field measurements. The results show that the proposed methodology and model can provide a considerable improvement in forecasting accuracy
A statistical approach to plasma profile analysis
International Nuclear Information System (INIS)
Kardaun, O.J.W.F.; McCarthy, P.J.; Lackner, K.; Riedel, K.S.
1990-05-01
A general statistical approach to the parameterisation and analysis of tokamak profiles is presented. The modelling of the profile dependence on both the radius and the plasma parameters is discussed, and pertinent, classical as well as robust, methods of estimation are reviewed. Special attention is given to statistical tests for discriminating between the various models, and to the construction of confidence intervals for the parameterised profiles and the associated global quantities. The statistical approach is shown to provide a rigorous approach to the empirical testing of plasma profile invariance. (orig.)
Tornadoes and related damage costs: statistical modelling with a semi-Markov approach
Directory of Open Access Journals (Sweden)
Guglielmo D’Amico
2016-09-01
Full Text Available We propose a statistical approach to modelling for predicting and simulating occurrences of tornadoes and accumulated cost distributions over a time interval. This is achieved by modelling the tornado intensity, measured with the Fujita scale, as a stochastic process. Since the Fujita scale divides tornado intensity into six states, it is possible to model the tornado intensity by using Markov and semi-Markov models. We demonstrate that the semi-Markov approach is able to reproduce the duration effect that is detected in tornado occurrence. The superiority of the semi-Markov model as compared to the Markov chain model is also affirmed by means of a statistical test of hypothesis. As an application, we compute the expected value and the variance of the costs generated by the tornadoes over a given time interval in a given area. The paper contributes to the literature by demonstrating that semi-Markov models represent an effective tool for physical analysis of tornadoes as well as for the estimation of the economic damages to human things.
CSIR Research Space (South Africa)
Ntaka, L
2013-08-01
Full Text Available . In this work, statistical inference approach specifically the non-parametric bootstrapping and linear model were applied. Data used to develop the model were sourced from the literature. 104 data points with information on aggregation, natural organic matter...
Onisko, Agnieszka; Druzdzel, Marek J; Austin, R Marshall
2016-01-01
Classical statistics is a well-established approach in the analysis of medical data. While the medical community seems to be familiar with the concept of a statistical analysis and its interpretation, the Bayesian approach, argued by many of its proponents to be superior to the classical frequentist approach, is still not well-recognized in the analysis of medical data. The goal of this study is to encourage data analysts to use the Bayesian approach, such as modeling with graphical probabilistic networks, as an insightful alternative to classical statistical analysis of medical data. This paper offers a comparison of two approaches to analysis of medical time series data: (1) classical statistical approach, such as the Kaplan-Meier estimator and the Cox proportional hazards regression model, and (2) dynamic Bayesian network modeling. Our comparison is based on time series cervical cancer screening data collected at Magee-Womens Hospital, University of Pittsburgh Medical Center over 10 years. The main outcomes of our comparison are cervical cancer risk assessments produced by the three approaches. However, our analysis discusses also several aspects of the comparison, such as modeling assumptions, model building, dealing with incomplete data, individualized risk assessment, results interpretation, and model validation. Our study shows that the Bayesian approach is (1) much more flexible in terms of modeling effort, and (2) it offers an individualized risk assessment, which is more cumbersome for classical statistical approaches.
A statistical modeling approach to build expert credit risk rating systems
DEFF Research Database (Denmark)
Waagepetersen, Rasmus
2010-01-01
This paper presents an efficient method for extracting expert knowledge when building a credit risk rating system. Experts are asked to rate a sample of counterparty cases according to creditworthiness. Next, a statistical model is used to capture the relation between the characteristics...... of a counterparty and the expert rating. For any counterparty the model can identify the rating, which would be agreed upon by the majority of experts. Furthermore, the model can quantify the concurrence among experts. The approach is illustrated by a case study regarding the construction of an application score...
Statistical modelling for ship propulsion efficiency
DEFF Research Database (Denmark)
Petersen, Jóan Petur; Jacobsen, Daniel J.; Winther, Ole
2012-01-01
This paper presents a state-of-the-art systems approach to statistical modelling of fuel efficiency in ship propulsion, and also a novel and publicly available data set of high quality sensory data. Two statistical model approaches are investigated and compared: artificial neural networks...
A Formal Approach for RT-DVS Algorithms Evaluation Based on Statistical Model Checking
Directory of Open Access Journals (Sweden)
Shengxin Dai
2015-01-01
Full Text Available Energy saving is a crucial concern in embedded real time systems. Many RT-DVS algorithms have been proposed to save energy while preserving deadline guarantees. This paper presents a novel approach to evaluate RT-DVS algorithms using statistical model checking. A scalable framework is proposed for RT-DVS algorithms evaluation, in which the relevant components are modeled as stochastic timed automata, and the evaluation metrics including utilization bound, energy efficiency, battery awareness, and temperature awareness are expressed as statistical queries. Evaluation of these metrics is performed by verifying the corresponding queries using UPPAAL-SMC and analyzing the statistical information provided by the tool. We demonstrate the applicability of our framework via a case study of five classical RT-DVS algorithms.
Graphene growth process modeling: a physical-statistical approach
Wu, Jian; Huang, Qiang
2014-09-01
As a zero-band semiconductor, graphene is an attractive material for a wide variety of applications such as optoelectronics. Among various techniques developed for graphene synthesis, chemical vapor deposition on copper foils shows high potential for producing few-layer and large-area graphene. Since fabrication of high-quality graphene sheets requires the understanding of growth mechanisms, and methods of characterization and control of grain size of graphene flakes, analytical modeling of graphene growth process is therefore essential for controlled fabrication. The graphene growth process starts with randomly nucleated islands that gradually develop into complex shapes, grow in size, and eventually connect together to cover the copper foil. To model this complex process, we develop a physical-statistical approach under the assumption of self-similarity during graphene growth. The growth kinetics is uncovered by separating island shapes from area growth rate. We propose to characterize the area growth velocity using a confined exponential model, which not only has clear physical explanation, but also fits the real data well. For the shape modeling, we develop a parametric shape model which can be well explained by the angular-dependent growth rate. This work can provide useful information for the control and optimization of graphene growth process on Cu foil.
Statistical modelling with quantile functions
Gilchrist, Warren
2000-01-01
Galton used quantiles more than a hundred years ago in describing data. Tukey and Parzen used them in the 60s and 70s in describing populations. Since then, the authors of many papers, both theoretical and practical, have used various aspects of quantiles in their work. Until now, however, no one put all the ideas together to form what turns out to be a general approach to statistics.Statistical Modelling with Quantile Functions does just that. It systematically examines the entire process of statistical modelling, starting with using the quantile function to define continuous distributions. The author shows that by using this approach, it becomes possible to develop complex distributional models from simple components. A modelling kit can be developed that applies to the whole model - deterministic and stochastic components - and this kit operates by adding, multiplying, and transforming distributions rather than data.Statistical Modelling with Quantile Functions adds a new dimension to the practice of stati...
Statistical approach for uncertainty quantification of experimental modal model parameters
DEFF Research Database (Denmark)
Luczak, M.; Peeters, B.; Kahsin, M.
2014-01-01
Composite materials are widely used in manufacture of aerospace and wind energy structural components. These load carrying structures are subjected to dynamic time-varying loading conditions. Robust structural dynamics identification procedure impose tight constraints on the quality of modal models...... represent different complexity levels ranging from coupon, through sub-component up to fully assembled aerospace and wind energy structural components made of composite materials. The proposed method is demonstrated on two application cases of a small and large wind turbine blade........ This paper aims at a systematic approach for uncertainty quantification of the parameters of the modal models estimated from experimentally obtained data. Statistical analysis of modal parameters is implemented to derive an assessment of the entire modal model uncertainty measure. Investigated structures...
Statistical Model-Based Face Pose Estimation
Institute of Scientific and Technical Information of China (English)
GE Xinliang; YANG Jie; LI Feng; WANG Huahua
2007-01-01
A robust face pose estimation approach is proposed by using face shape statistical model approach and pose parameters are represented by trigonometric functions. The face shape statistical model is firstly built by analyzing the face shapes from different people under varying poses. The shape alignment is vital in the process of building the statistical model. Then, six trigonometric functions are employed to represent the face pose parameters. Lastly, the mapping function is constructed between face image and face pose by linearly relating different parameters. The proposed approach is able to estimate different face poses using a few face training samples. Experimental results are provided to demonstrate its efficiency and accuracy.
MASKED AREAS IN SHEAR PEAK STATISTICS: A FORWARD MODELING APPROACH
International Nuclear Information System (INIS)
Bard, D.; Kratochvil, J. M.; Dawson, W.
2016-01-01
The statistics of shear peaks have been shown to provide valuable cosmological information beyond the power spectrum, and will be an important constraint of models of cosmology in forthcoming astronomical surveys. Surveys include masked areas due to bright stars, bad pixels etc., which must be accounted for in producing constraints on cosmology from shear maps. We advocate a forward-modeling approach, where the impacts of masking and other survey artifacts are accounted for in the theoretical prediction of cosmological parameters, rather than correcting survey data to remove them. We use masks based on the Deep Lens Survey, and explore the impact of up to 37% of the survey area being masked on LSST and DES-scale surveys. By reconstructing maps of aperture mass the masking effect is smoothed out, resulting in up to 14% smaller statistical uncertainties compared to simply reducing the survey area by the masked area. We show that, even in the presence of large survey masks, the bias in cosmological parameter estimation produced in the forward-modeling process is ≈1%, dominated by bias caused by limited simulation volume. We also explore how this potential bias scales with survey area and evaluate how much small survey areas are impacted by the differences in cosmological structure in the data and simulated volumes, due to cosmic variance
A new Markov-chain-related statistical approach for modelling synthetic wind power time series
International Nuclear Information System (INIS)
Pesch, T; Hake, J F; Schröders, S; Allelein, H J
2015-01-01
The integration of rising shares of volatile wind power in the generation mix is a major challenge for the future energy system. To address the uncertainties involved in wind power generation, models analysing and simulating the stochastic nature of this energy source are becoming increasingly important. One statistical approach that has been frequently used in the literature is the Markov chain approach. Recently, the method was identified as being of limited use for generating wind time series with time steps shorter than 15–40 min as it is not capable of reproducing the autocorrelation characteristics accurately. This paper presents a new Markov-chain-related statistical approach that is capable of solving this problem by introducing a variable second lag. Furthermore, additional features are presented that allow for the further adjustment of the generated synthetic time series. The influences of the model parameter settings are examined by meaningful parameter variations. The suitability of the approach is demonstrated by an application analysis with the example of the wind feed-in in Germany. It shows that—in contrast to conventional Markov chain approaches—the generated synthetic time series do not systematically underestimate the required storage capacity to balance wind power fluctuation. (paper)
Extension of the direct statistical approach to a volume parameter model (non-integer splitting)
International Nuclear Information System (INIS)
Burn, K.W.
1990-01-01
The Direct Statistical Approach is a rigorous mathematical derivation of the second moment for surface splitting and Russian Roulette games attached to the Monte Carlo modelling of fixed source particle transport. It has been extended to a volume parameter model (involving non-integer ''expected value'' splitting), and then to a cell model. The cell model gives second moment and time functions that have a closed form. This suggests the possibility of two different methods of solution of the optimum splitting/Russian Roulette parameters. (author)
Computational and Statistical Models: A Comparison for Policy Modeling of Childhood Obesity
Mabry, Patricia L.; Hammond, Ross; Ip, Edward Hak-Sing; Huang, Terry T.-K.
As systems science methodologies have begun to emerge as a set of innovative approaches to address complex problems in behavioral, social science, and public health research, some apparent conflicts with traditional statistical methodologies for public health have arisen. Computational modeling is an approach set in context that integrates diverse sources of data to test the plausibility of working hypotheses and to elicit novel ones. Statistical models are reductionist approaches geared towards proving the null hypothesis. While these two approaches may seem contrary to each other, we propose that they are in fact complementary and can be used jointly to advance solutions to complex problems. Outputs from statistical models can be fed into computational models, and outputs from computational models can lead to further empirical data collection and statistical models. Together, this presents an iterative process that refines the models and contributes to a greater understanding of the problem and its potential solutions. The purpose of this panel is to foster communication and understanding between statistical and computational modelers. Our goal is to shed light on the differences between the approaches and convey what kinds of research inquiries each one is best for addressing and how they can serve complementary (and synergistic) roles in the research process, to mutual benefit. For each approach the panel will cover the relevant "assumptions" and how the differences in what is assumed can foster misunderstandings. The interpretations of the results from each approach will be compared and contrasted and the limitations for each approach will be delineated. We will use illustrative examples from CompMod, the Comparative Modeling Network for Childhood Obesity Policy. The panel will also incorporate interactive discussions with the audience on the issues raised here.
Probing NWP model deficiencies by statistical postprocessing
DEFF Research Database (Denmark)
Rosgaard, Martin Haubjerg; Nielsen, Henrik Aalborg; Nielsen, Torben S.
2016-01-01
The objective in this article is twofold. On one hand, a Model Output Statistics (MOS) framework for improved wind speed forecast accuracy is described and evaluated. On the other hand, the approach explored identifies unintuitive explanatory value from a diagnostic variable in an operational....... Based on the statistical model candidates inferred from the data, the lifted index NWP model diagnostic is consistently found among the NWP model predictors of the best performing statistical models across sites....
Uncertainty the soul of modeling, probability & statistics
Briggs, William
2016-01-01
This book presents a philosophical approach to probability and probabilistic thinking, considering the underpinnings of probabilistic reasoning and modeling, which effectively underlie everything in data science. The ultimate goal is to call into question many standard tenets and lay the philosophical and probabilistic groundwork and infrastructure for statistical modeling. It is the first book devoted to the philosophy of data aimed at working scientists and calls for a new consideration in the practice of probability and statistics to eliminate what has been referred to as the "Cult of Statistical Significance". The book explains the philosophy of these ideas and not the mathematics, though there are a handful of mathematical examples. The topics are logically laid out, starting with basic philosophy as related to probability, statistics, and science, and stepping through the key probabilistic ideas and concepts, and ending with statistical models. Its jargon-free approach asserts that standard methods, suc...
Performance modeling, loss networks, and statistical multiplexing
Mazumdar, Ravi
2009-01-01
This monograph presents a concise mathematical approach for modeling and analyzing the performance of communication networks with the aim of understanding the phenomenon of statistical multiplexing. The novelty of the monograph is the fresh approach and insights provided by a sample-path methodology for queueing models that highlights the important ideas of Palm distributions associated with traffic models and their role in performance measures. Also presented are recent ideas of large buffer, and many sources asymptotics that play an important role in understanding statistical multiplexing. I
Kassem, M.; Soize, C.; Gagliardini, L.
2009-06-01
In this paper, an energy-density field approach applied to the vibroacoustic analysis of complex industrial structures in the low- and medium-frequency ranges is presented. This approach uses a statistical computational model. The analyzed system consists of an automotive vehicle structure coupled with its internal acoustic cavity. The objective of this paper is to make use of the statistical properties of the frequency response functions of the vibroacoustic system observed from previous experimental and numerical work. The frequency response functions are expressed in terms of a dimensionless matrix which is estimated using the proposed energy approach. Using this dimensionless matrix, a simplified vibroacoustic model is proposed.
Improved model for statistical alignment
Energy Technology Data Exchange (ETDEWEB)
Miklos, I.; Toroczkai, Z. (Zoltan)
2001-01-01
The statistical approach to molecular sequence evolution involves the stochastic modeling of the substitution, insertion and deletion processes. Substitution has been modeled in a reliable way for more than three decades by using finite Markov-processes. Insertion and deletion, however, seem to be more difficult to model, and thc recent approaches cannot acceptably deal with multiple insertions and deletions. A new method based on a generating function approach is introduced to describe the multiple insertion process. The presented algorithm computes the approximate joint probability of two sequences in 0(13) running time where 1 is the geometric mean of the sequence lengths.
Comparing geological and statistical approaches for element selection in sediment tracing research
Laceby, J. Patrick; McMahon, Joe; Evrard, Olivier; Olley, Jon
2015-04-01
Elevated suspended sediment loads reduce reservoir capacity and significantly increase the cost of operating water treatment infrastructure, making the management of sediment supply to reservoirs of increasingly importance. Sediment fingerprinting techniques can be used to determine the relative contributions of different sources of sediment accumulating in reservoirs. The objective of this research is to compare geological and statistical approaches to element selection for sediment fingerprinting modelling. Time-integrated samplers (n=45) were used to obtain source samples from four major subcatchments flowing into the Baroon Pocket Dam in South East Queensland, Australia. The geochemistry of potential sources were compared to the geochemistry of sediment cores (n=12) sampled in the reservoir. The geochemical approach selected elements for modelling that provided expected, observed and statistical discrimination between sediment sources. Two statistical approaches selected elements for modelling with the Kruskal-Wallis H-test and Discriminatory Function Analysis (DFA). In particular, two different significance levels (0.05 & 0.35) for the DFA were included to investigate the importance of element selection on modelling results. A distribution model determined the relative contributions of different sources to sediment sampled in the Baroon Pocket Dam. Elemental discrimination was expected between one subcatchment (Obi Obi Creek) and the remaining subcatchments (Lexys, Falls and Bridge Creek). Six major elements were expected to provide discrimination. Of these six, only Fe2O3 and SiO2 provided expected, observed and statistical discrimination. Modelling results with this geological approach indicated 36% (+/- 9%) of sediment sampled in the reservoir cores were from mafic-derived sources and 64% (+/- 9%) were from felsic-derived sources. The geological and the first statistical approach (DFA0.05) differed by only 1% (σ 5%) for 5 out of 6 model groupings with only
Borsboom, D.; Haig, B.D.
2013-01-01
Unlike most other statistical frameworks, Bayesian statistical inference is wedded to a particular approach in the philosophy of science (see Howson & Urbach, 2006); this approach is called Bayesianism. Rather than being concerned with model fitting, this position in the philosophy of science
Shell model in large spaces and statistical spectroscopy
International Nuclear Information System (INIS)
Kota, V.K.B.
1996-01-01
For many nuclear structure problems of current interest it is essential to deal with shell model in large spaces. For this, three different approaches are now in use and two of them are: (i) the conventional shell model diagonalization approach but taking into account new advances in computer technology; (ii) the shell model Monte Carlo method. A brief overview of these two methods is given. Large space shell model studies raise fundamental questions regarding the information content of the shell model spectrum of complex nuclei. This led to the third approach- the statistical spectroscopy methods. The principles of statistical spectroscopy have their basis in nuclear quantum chaos and they are described (which are substantiated by large scale shell model calculations) in some detail. (author)
Mridula, Meenu R; Nair, Ashalatha S; Kumar, K Satheesh
2018-02-01
In this paper, we compared the efficacy of observation based modeling approach using a genetic algorithm with the regular statistical analysis as an alternative methodology in plant research. Preliminary experimental data on in vitro rooting was taken for this study with an aim to understand the effect of charcoal and naphthalene acetic acid (NAA) on successful rooting and also to optimize the two variables for maximum result. Observation-based modelling, as well as traditional approach, could identify NAA as a critical factor in rooting of the plantlets under the experimental conditions employed. Symbolic regression analysis using the software deployed here optimised the treatments studied and was successful in identifying the complex non-linear interaction among the variables, with minimalistic preliminary data. The presence of charcoal in the culture medium has a significant impact on root generation by reducing basal callus mass formation. Such an approach is advantageous for establishing in vitro culture protocols as these models will have significant potential for saving time and expenditure in plant tissue culture laboratories, and it further reduces the need for specialised background.
A statistical approach to optimizing concrete mixture design.
Ahmad, Shamsad; Alghamdi, Saeid A
2014-01-01
A step-by-step statistical approach is proposed to obtain optimum proportioning of concrete mixtures using the data obtained through a statistically planned experimental program. The utility of the proposed approach for optimizing the design of concrete mixture is illustrated considering a typical case in which trial mixtures were considered according to a full factorial experiment design involving three factors and their three levels (3(3)). A total of 27 concrete mixtures with three replicates (81 specimens) were considered by varying the levels of key factors affecting compressive strength of concrete, namely, water/cementitious materials ratio (0.38, 0.43, and 0.48), cementitious materials content (350, 375, and 400 kg/m(3)), and fine/total aggregate ratio (0.35, 0.40, and 0.45). The experimental data were utilized to carry out analysis of variance (ANOVA) and to develop a polynomial regression model for compressive strength in terms of the three design factors considered in this study. The developed statistical model was used to show how optimization of concrete mixtures can be carried out with different possible options.
A Statistical Approach to Optimizing Concrete Mixture Design
Directory of Open Access Journals (Sweden)
Shamsad Ahmad
2014-01-01
Full Text Available A step-by-step statistical approach is proposed to obtain optimum proportioning of concrete mixtures using the data obtained through a statistically planned experimental program. The utility of the proposed approach for optimizing the design of concrete mixture is illustrated considering a typical case in which trial mixtures were considered according to a full factorial experiment design involving three factors and their three levels (33. A total of 27 concrete mixtures with three replicates (81 specimens were considered by varying the levels of key factors affecting compressive strength of concrete, namely, water/cementitious materials ratio (0.38, 0.43, and 0.48, cementitious materials content (350, 375, and 400 kg/m3, and fine/total aggregate ratio (0.35, 0.40, and 0.45. The experimental data were utilized to carry out analysis of variance (ANOVA and to develop a polynomial regression model for compressive strength in terms of the three design factors considered in this study. The developed statistical model was used to show how optimization of concrete mixtures can be carried out with different possible options.
Petersson, K M; Nichols, T E; Poline, J B; Holmes, A P
1999-01-01
Functional neuroimaging (FNI) provides experimental access to the intact living brain making it possible to study higher cognitive functions in humans. In this review and in a companion paper in this issue, we discuss some common methods used to analyse FNI data. The emphasis in both papers is on assumptions and limitations of the methods reviewed. There are several methods available to analyse FNI data indicating that none is optimal for all purposes. In order to make optimal use of the methods available it is important to know the limits of applicability. For the interpretation of FNI results it is also important to take into account the assumptions, approximations and inherent limitations of the methods used. This paper gives a brief overview over some non-inferential descriptive methods and common statistical models used in FNI. Issues relating to the complex problem of model selection are discussed. In general, proper model selection is a necessary prerequisite for the validity of the subsequent statistical inference. The non-inferential section describes methods that, combined with inspection of parameter estimates and other simple measures, can aid in the process of model selection and verification of assumptions. The section on statistical models covers approaches to global normalization and some aspects of univariate, multivariate, and Bayesian models. Finally, approaches to functional connectivity and effective connectivity are discussed. In the companion paper we review issues related to signal detection and statistical inference. PMID:10466149
DEFF Research Database (Denmark)
Breinholt, Anders; Møller, Jan Kloppenborg; Madsen, Henrik
2012-01-01
While there seems to be consensus that hydrological model outputs should be accompanied with an uncertainty estimate the appropriate method for uncertainty estimation is not agreed upon and a debate is ongoing between advocators of formal statistical methods who consider errors as stochastic...... and GLUE advocators who consider errors as epistemic, arguing that the basis of formal statistical approaches that requires the residuals to be stationary and conform to a statistical distribution is unrealistic. In this paper we take a formal frequentist approach to parameter estimation and uncertainty...... necessary but the statistical assumptions were nevertheless not 100% justified. The residual analysis showed that significant autocorrelation was present for all simulation models. We believe users of formal approaches to uncertainty evaluation within hydrology and within environmental modelling in general...
Mohd Fo'ad Rohani; Mohd Aizaini Maarof; Ali Selamat; Houssain Kettani
2010-01-01
This paper proposes a Multi-Level Sampling (MLS) approach for continuous Loss of Self-Similarity (LoSS) detection using iterative window. The method defines LoSS based on Second Order Self-Similarity (SOSS) statistical model. The Optimization Method (OM) is used to estimate self-similarity parameter since it is fast and more accurate in comparison with other estimation methods known in the literature. Probability of LoSS detection is introduced to measure continuous LoSS detection performance...
Statistical Model Checking of Rich Models and Properties
DEFF Research Database (Denmark)
Poulsen, Danny Bøgsted
in undecidability issues for the traditional model checking approaches. Statistical model checking has proven itself a valuable supplement to model checking and this thesis is concerned with extending this software validation technique to stochastic hybrid systems. The thesis consists of two parts: the first part...... motivates why existing model checking technology should be supplemented by new techniques. It also contains a brief introduction to probability theory and concepts covered by the six papers making up the second part. The first two papers are concerned with developing online monitoring techniques...... systems. The fifth paper shows how stochastic hybrid automata are useful for modelling biological systems and the final paper is concerned with showing how statistical model checking is efficiently distributed. In parallel with developing the theory contained in the papers, a substantial part of this work...
Statistical approach for selection of regression model during validation of bioanalytical method
Directory of Open Access Journals (Sweden)
Natalija Nakov
2014-06-01
Full Text Available The selection of an adequate regression model is the basis for obtaining accurate and reproducible results during the bionalytical method validation. Given the wide concentration range, frequently present in bioanalytical assays, heteroscedasticity of the data may be expected. Several weighted linear and quadratic regression models were evaluated during the selection of the adequate curve fit using nonparametric statistical tests: One sample rank test and Wilcoxon signed rank test for two independent groups of samples. The results obtained with One sample rank test could not give statistical justification for the selection of linear vs. quadratic regression models because slight differences between the error (presented through the relative residuals were obtained. Estimation of the significance of the differences in the RR was achieved using Wilcoxon signed rank test, where linear and quadratic regression models were treated as two independent groups. The application of this simple non-parametric statistical test provides statistical confirmation of the choice of an adequate regression model.
Uniting statistical and individual-based approaches for animal movement modelling.
Latombe, Guillaume; Parrott, Lael; Basille, Mathieu; Fortin, Daniel
2014-01-01
The dynamic nature of their internal states and the environment directly shape animals' spatial behaviours and give rise to emergent properties at broader scales in natural systems. However, integrating these dynamic features into habitat selection studies remains challenging, due to practically impossible field work to access internal states and the inability of current statistical models to produce dynamic outputs. To address these issues, we developed a robust method, which combines statistical and individual-based modelling. Using a statistical technique for forward modelling of the IBM has the advantage of being faster for parameterization than a pure inverse modelling technique and allows for robust selection of parameters. Using GPS locations from caribou monitored in Québec, caribou movements were modelled based on generative mechanisms accounting for dynamic variables at a low level of emergence. These variables were accessed by replicating real individuals' movements in parallel sub-models, and movement parameters were then empirically parameterized using Step Selection Functions. The final IBM model was validated using both k-fold cross-validation and emergent patterns validation and was tested for two different scenarios, with varying hardwood encroachment. Our results highlighted a functional response in habitat selection, which suggests that our method was able to capture the complexity of the natural system, and adequately provided projections on future possible states of the system in response to different management plans. This is especially relevant for testing the long-term impact of scenarios corresponding to environmental configurations that have yet to be observed in real systems.
Hein, H.; Mai, S.; Mayer, B.; Pohlmann, T.; Barjenbruch, U.
2012-04-01
The interactions of tides, external surges, storm surges and waves with an additional role of the coastal bathymetry define the probability of extreme water levels at the coast. Probabilistic analysis and also process based numerical models allow the estimation of future states. From the physical point of view both, deterministic processes and stochastic residuals are the fundamentals of high water statistics. This study uses a so called model chain to reproduce historic statistics of tidal high water levels (Thw) as well as the prediction of future statistics high water levels. The results of the numerical models are post-processed by a stochastic analysis. Recent studies show, that for future extrapolation of extreme Thw nonstationary parametric approaches are required. With the presented methods a better prediction of time depended parameter sets seems possible. The investigation region of this study is the southern German Bright. The model-chain is the representation of a downscaling process, which starts with an emissions scenario. Regional atmospheric and ocean models refine the results of global climate models. The concept of downscaling was chosen to resolve coastal topography sufficiently. The North Sea and estuaries are modeled with the three-dimensional model HAMburg Shelf Ocean Model. The running time includes 150 years (1950 - 2100). Results of four different hindcast runs and also of one future prediction run are validated. Based on multi-scale analysis and the theory of entropy we analyze whether any significant periodicities are represented numerically. Results show that also hindcasting the climate of Thw with a model chain for the last 60 years is a challenging task. For example, an additional modeling activity must be the inclusion of tides into regional climate ocean models. It is found that the statistics of climate variables derived from model results differs from the statistics derived from measurements. E.g. there are considerable shifts in
Statistical Power Analysis with Missing Data A Structural Equation Modeling Approach
Davey, Adam
2009-01-01
Statistical power analysis has revolutionized the ways in which we conduct and evaluate research. Similar developments in the statistical analysis of incomplete (missing) data are gaining more widespread applications. This volume brings statistical power and incomplete data together under a common framework, in a way that is readily accessible to those with only an introductory familiarity with structural equation modeling. It answers many practical questions such as: How missing data affects the statistical power in a study How much power is likely with different amounts and types
Statistical modelling of transcript profiles of differentially regulated genes
Directory of Open Access Journals (Sweden)
Sergeant Martin J
2008-07-01
Full Text Available Abstract Background The vast quantities of gene expression profiling data produced in microarray studies, and the more precise quantitative PCR, are often not statistically analysed to their full potential. Previous studies have summarised gene expression profiles using simple descriptive statistics, basic analysis of variance (ANOVA and the clustering of genes based on simple models fitted to their expression profiles over time. We report the novel application of statistical non-linear regression modelling techniques to describe the shapes of expression profiles for the fungus Agaricus bisporus, quantified by PCR, and for E. coli and Rattus norvegicus, using microarray technology. The use of parametric non-linear regression models provides a more precise description of expression profiles, reducing the "noise" of the raw data to produce a clear "signal" given by the fitted curve, and describing each profile with a small number of biologically interpretable parameters. This approach then allows the direct comparison and clustering of the shapes of response patterns between genes and potentially enables a greater exploration and interpretation of the biological processes driving gene expression. Results Quantitative reverse transcriptase PCR-derived time-course data of genes were modelled. "Split-line" or "broken-stick" regression identified the initial time of gene up-regulation, enabling the classification of genes into those with primary and secondary responses. Five-day profiles were modelled using the biologically-oriented, critical exponential curve, y(t = A + (B + CtRt + ε. This non-linear regression approach allowed the expression patterns for different genes to be compared in terms of curve shape, time of maximal transcript level and the decline and asymptotic response levels. Three distinct regulatory patterns were identified for the five genes studied. Applying the regression modelling approach to microarray-derived time course data
Statistical mechanics of directed models of polymers in the square lattice
Rensburg, J V
2003-01-01
Directed square lattice models of polymers and vesicles have received considerable attention in the recent mathematical and physical sciences literature. These are idealized geometric directed lattice models introduced to study phase behaviour in polymers, and include Dyck paths, partially directed paths, directed trees and directed vesicles models. Directed models are closely related to models studied in the combinatorics literature (and are often exactly solvable). They are also simplified versions of a number of statistical mechanics models, including the self-avoiding walk, lattice animals and lattice vesicles. The exchange of approaches and ideas between statistical mechanics and combinatorics have considerably advanced the description and understanding of directed lattice models, and this will be explored in this review. The combinatorial nature of directed lattice path models makes a study using generating function approaches most natural. In contrast, the statistical mechanics approach would introduce...
Performance modeling, stochastic networks, and statistical multiplexing
Mazumdar, Ravi R
2013-01-01
This monograph presents a concise mathematical approach for modeling and analyzing the performance of communication networks with the aim of introducing an appropriate mathematical framework for modeling and analysis as well as understanding the phenomenon of statistical multiplexing. The models, techniques, and results presented form the core of traffic engineering methods used to design, control and allocate resources in communication networks.The novelty of the monograph is the fresh approach and insights provided by a sample-path methodology for queueing models that highlights the importan
Directory of Open Access Journals (Sweden)
Özlem TÜRKŞEN
2018-03-01
Full Text Available Some of the experimental designs can be composed of replicated response measures in which the replications cannot be identified exactly and may have uncertainty different than randomness. Then, the classical regression analysis may not be proper to model the designed data because of the violation of probabilistic modeling assumptions. In this case, fuzzy regression analysis can be used as a modeling tool. In this study, the replicated response values are newly formed to fuzzy numbers by using descriptive statistics of replications and golden ratio. The main aim of the study is obtaining the most suitable fuzzy model for replicated response measures through fuzzification of the replicated values by taking into account the data structure of the replications in statistical framework. Here, the response and unknown model coefficients are considered as triangular type-1 fuzzy numbers (TT1FNs whereas the inputs are crisp. Predicted fuzzy models are obtained according to the proposed fuzzification rules by using Fuzzy Least Squares (FLS approach. The performances of the predicted fuzzy models are compared by using Root Mean Squared Error (RMSE criteria. A data set from the literature, called wheel cover component data set, is used to illustrate the performance of the proposed approach and the obtained results are discussed. The calculation results show that the combined formulation of the descriptive statistics and the golden ratio is the most preferable fuzzification rule according to the well-known decision making method, called TOPSIS, for the data set.
Topology for statistical modeling of petascale data.
Energy Technology Data Exchange (ETDEWEB)
Pascucci, Valerio (University of Utah, Salt Lake City, UT); Mascarenhas, Ajith Arthur; Rusek, Korben (Texas A& M University, College Station, TX); Bennett, Janine Camille; Levine, Joshua (University of Utah, Salt Lake City, UT); Pebay, Philippe Pierre; Gyulassy, Attila (University of Utah, Salt Lake City, UT); Thompson, David C.; Rojas, Joseph Maurice (Texas A& M University, College Station, TX)
2011-07-01
This document presents current technical progress and dissemination of results for the Mathematics for Analysis of Petascale Data (MAPD) project titled 'Topology for Statistical Modeling of Petascale Data', funded by the Office of Science Advanced Scientific Computing Research (ASCR) Applied Math program. Many commonly used algorithms for mathematical analysis do not scale well enough to accommodate the size or complexity of petascale data produced by computational simulations. The primary goal of this project is thus to develop new mathematical tools that address both the petascale size and uncertain nature of current data. At a high level, our approach is based on the complementary techniques of combinatorial topology and statistical modeling. In particular, we use combinatorial topology to filter out spurious data that would otherwise skew statistical modeling techniques, and we employ advanced algorithms from algebraic statistics to efficiently find globally optimal fits to statistical models. This document summarizes the technical advances we have made to date that were made possible in whole or in part by MAPD funding. These technical contributions can be divided loosely into three categories: (1) advances in the field of combinatorial topology, (2) advances in statistical modeling, and (3) new integrated topological and statistical methods.
Energy Technology Data Exchange (ETDEWEB)
Reichert, B.K.; Bengtsson, L. [Max-Planck-Institut fuer Meteorologie, Hamburg (Germany); Aakesson, O. [Sveriges Meteorologiska och Hydrologiska Inst., Norrkoeping (Sweden)
1998-08-01
Recent proxy data obtained from ice core measurements, dendrochronology and valley glaciers provide important information on the evolution of the regional or local climate. General circulation models integrated over a long period of time could help to understand the (external and internal) forcing mechanisms of natural climate variability. For a systematic interpretation of in situ paleo proxy records, a combined method of dynamical and statistical modeling is proposed. Local 'paleo records' can be simulated from GCM output by first undertaking a model-consistent statistical downscaling and then using a process-based forward modeling approach to obtain the behavior of valley glaciers and the growth of trees under specific conditions. The simulated records can be compared to actual proxy records in order to investigate whether e.g. the response of glaciers to climatic change can be reproduced by models and to what extent climate variability obtained from proxy records (with the main focus on the last millennium) can be represented. For statistical downscaling to local weather conditions, a multiple linear forward regression model is used. Daily sets of observed weather station data and various large-scale predictors at 7 pressure levels obtained from ECMWF reanalyses are used for development of the model. Daily data give the closest and most robust relationships due to the strong dependence on individual synoptic-scale patterns. For some local variables, the performance of the model can be further increased by developing seasonal specific statistical relationships. The model is validated using both independent and restricted predictor data sets. The model is applied to a long integration of a mixed layer GCM experiment simulating pre-industrial climate variability. The dynamical-statistical local GCM output within a region around Nigardsbreen glacier, Norway is compared to nearby observed station data for the period 1868-1993. Patterns of observed
Statistical Model Checking for Biological Systems
DEFF Research Database (Denmark)
David, Alexandre; Larsen, Kim Guldstrand; Legay, Axel
2014-01-01
Statistical Model Checking (SMC) is a highly scalable simulation-based verification approach for testing and estimating the probability that a stochastic system satisfies a given linear temporal property. The technique has been applied to (discrete and continuous time) Markov chains, stochastic...
Wesonga, Ronald; Nabugoomu, Fabian
2016-01-01
The study derives a framework for assessing airport efficiency through evaluating optimal arrival and departure delay thresholds. Assumptions of airport efficiency measurements, though based upon minimum numeric values such as 15 min of turnaround time, cannot be extrapolated to determine proportions of delay-days of an airport. This study explored the concept of delay threshold to determine the proportion of delay-days as an expansion of the theory of delay and our previous work. Data-driven approach using statistical modelling was employed to a limited set of determinants of daily delay at an airport. For the purpose of testing the efficacy of the threshold levels, operational data for Entebbe International Airport were used as a case study. Findings show differences in the proportions of delay at departure (μ = 0.499; 95 % CI = 0.023) and arrival (μ = 0.363; 95 % CI = 0.022). Multivariate logistic model confirmed an optimal daily departure and arrival delay threshold of 60 % for the airport given the four probable thresholds {50, 60, 70, 80}. The decision for the threshold value was based on the number of significant determinants, the goodness of fit statistics based on the Wald test and the area under the receiver operating curves. These findings propose a modelling framework to generate relevant information for the Air Traffic Management relevant in planning and measurement of airport operational efficiency.
Bayesian models: A statistical primer for ecologists
Hobbs, N. Thompson; Hooten, Mevin B.
2015-01-01
Bayesian modeling has become an indispensable tool for ecological research because it is uniquely suited to deal with complexity in a statistically coherent way. This textbook provides a comprehensive and accessible introduction to the latest Bayesian methods—in language ecologists can understand. Unlike other books on the subject, this one emphasizes the principles behind the computations, giving ecologists a big-picture understanding of how to implement this powerful statistical approach.Bayesian Models is an essential primer for non-statisticians. It begins with a definition of probability and develops a step-by-step sequence of connected ideas, including basic distribution theory, network diagrams, hierarchical models, Markov chain Monte Carlo, and inference from single and multiple models. This unique book places less emphasis on computer coding, favoring instead a concise presentation of the mathematical statistics needed to understand how and why Bayesian analysis works. It also explains how to write out properly formulated hierarchical Bayesian models and use them in computing, research papers, and proposals.This primer enables ecologists to understand the statistical principles behind Bayesian modeling and apply them to research, teaching, policy, and management.Presents the mathematical and statistical foundations of Bayesian modeling in language accessible to non-statisticiansCovers basic distribution theory, network diagrams, hierarchical models, Markov chain Monte Carlo, and moreDeemphasizes computer coding in favor of basic principlesExplains how to write out properly factored statistical expressions representing Bayesian models
Awédikian , Roy; Yannou , Bernard
2012-01-01
International audience; With the growing complexity of industrial software applications, industrials are looking for efficient and practical methods to validate the software. This paper develops a model-based statistical testing approach that automatically generates online and offline test cases for embedded software. It discusses an integrated framework that combines solutions for three major software testing research questions: (i) how to select test inputs; (ii) how to predict the expected...
Bayesian models a statistical primer for ecologists
Hobbs, N Thompson
2015-01-01
Bayesian modeling has become an indispensable tool for ecological research because it is uniquely suited to deal with complexity in a statistically coherent way. This textbook provides a comprehensive and accessible introduction to the latest Bayesian methods-in language ecologists can understand. Unlike other books on the subject, this one emphasizes the principles behind the computations, giving ecologists a big-picture understanding of how to implement this powerful statistical approach. Bayesian Models is an essential primer for non-statisticians. It begins with a definition of probabili
The Precautionary Principle and statistical approaches to uncertainty
DEFF Research Database (Denmark)
Keiding, Niels; Budtz-Jørgensen, Esben
2004-01-01
is unhelpful, because lack of significance can be due either to uninformative data or to genuine lack of effect (the Type II error problem). Its inversion, bioequivalence testing, might sometimes be a model for the Precautionary Principle in its ability to "prove the null hypothesis". Current procedures...... for setting safe exposure levels are essentially derived from these classical statistical ideas, and we outline how uncertainties in the exposure and response measurements affect the no observed adverse effect level, the Benchmark approach and the "Hockey Stick" model. A particular problem concerns model...
Statistical Models of Adaptive Immune populations
Sethna, Zachary; Callan, Curtis; Walczak, Aleksandra; Mora, Thierry
The availability of large (104-106 sequences) datasets of B or T cell populations from a single individual allows reliable fitting of complex statistical models for naïve generation, somatic selection, and hypermutation. It is crucial to utilize a probabilistic/informational approach when modeling these populations. The inferred probability distributions allow for population characterization, calculation of probability distributions of various hidden variables (e.g. number of insertions), as well as statistical properties of the distribution itself (e.g. entropy). In particular, the differences between the T cell populations of embryonic and mature mice will be examined as a case study. Comparing these populations, as well as proposed mixed populations, provides a concrete exercise in model creation, comparison, choice, and validation.
Multiple commodities in statistical microeconomics: Model and market
Baaquie, Belal E.; Yu, Miao; Du, Xin
2016-11-01
A statistical generalization of microeconomics has been made in Baaquie (2013). In Baaquie et al. (2015), the market behavior of single commodities was analyzed and it was shown that market data provides strong support for the statistical microeconomic description of commodity prices. The case of multiple commodities is studied and a parsimonious generalization of the single commodity model is made for the multiple commodities case. Market data shows that the generalization can accurately model the simultaneous correlation functions of up to four commodities. To accurately model five or more commodities, further terms have to be included in the model. This study shows that the statistical microeconomics approach is a comprehensive and complete formulation of microeconomics, and which is independent to the mainstream formulation of microeconomics.
Predicting future protection of respirator users: Statistical approaches and practical implications.
Hu, Chengcheng; Harber, Philip; Su, Jing
2016-01-01
The purpose of this article is to describe a statistical approach for predicting a respirator user's fit factor in the future based upon results from initial tests. A statistical prediction model was developed based upon joint distribution of multiple fit factor measurements over time obtained from linear mixed effect models. The model accounts for within-subject correlation as well as short-term (within one day) and longer-term variability. As an example of applying this approach, model parameters were estimated from a research study in which volunteers were trained by three different modalities to use one of two types of respirators. They underwent two quantitative fit tests at the initial session and two on the same day approximately six months later. The fitted models demonstrated correlation and gave the estimated distribution of future fit test results conditional on past results for an individual worker. This approach can be applied to establishing a criterion value for passing an initial fit test to provide reasonable likelihood that a worker will be adequately protected in the future; and to optimizing the repeat fit factor test intervals individually for each user for cost-effective testing.
Papa, Lesther A; Litson, Kaylee; Lockhart, Ginger; Chassin, Laurie; Geiser, Christian
2015-01-01
Testing mediation models is critical for identifying potential variables that need to be targeted to effectively change one or more outcome variables. In addition, it is now common practice for clinicians to use multiple informant (MI) data in studies of statistical mediation. By coupling the use of MI data with statistical mediation analysis, clinical researchers can combine the benefits of both techniques. Integrating the information from MIs into a statistical mediation model creates various methodological and practical challenges. The authors review prior methodological approaches to MI mediation analysis in clinical research and propose a new latent variable approach that overcomes some limitations of prior approaches. An application of the new approach to mother, father, and child reports of impulsivity, frustration tolerance, and externalizing problems (N = 454) is presented. The results showed that frustration tolerance mediated the relationship between impulsivity and externalizing problems. The new approach allows for a more comprehensive and effective use of MI data when testing mediation models.
Statistical approaches for evaluating body composition markers in clinical cancer research.
Bayar, Mohamed Amine; Antoun, Sami; Lanoy, Emilie
2017-04-01
The term 'morphomics' stands for the markers of body composition in muscle and adipose tissues. in recent years, as part of clinical cancer research, several associations between morphomics and outcome or toxicity were found in different treatment settings leading to a growing interest. we aim to review statistical approaches used to evaluate these markers and suggest practical statistical recommendations. Area covered: We identified statistical methods used recently to take into account properties of morphomics measurements. We also reviewed adjustment methods on major confounding factors such as gender and approaches to model morphomic data, especially mixed models for repeated measures. Finally, we focused on methods for determining a cut-off for a morphomic marker that could be used in clinical practice and how to assess its robustness. Expert commentary: From our review, we proposed 13 key points to strengthen analyses and reporting of clinical research assessing associations between morphomics and outcome or toxicity.
Statistical Approaches for Spatiotemporal Prediction of Low Flows
Fangmann, A.; Haberlandt, U.
2017-12-01
An adequate assessment of regional climate change impacts on streamflow requires the integration of various sources of information and modeling approaches. This study proposes simple statistical tools for inclusion into model ensembles, which are fast and straightforward in their application, yet able to yield accurate streamflow predictions in time and space. Target variables for all approaches are annual low flow indices derived from a data set of 51 records of average daily discharge for northwestern Germany. The models require input of climatic data in the form of meteorological drought indices, derived from observed daily climatic variables, averaged over the streamflow gauges' catchments areas. Four different modeling approaches are analyzed. Basis for all pose multiple linear regression models that estimate low flows as a function of a set of meteorological indices and/or physiographic and climatic catchment descriptors. For the first method, individual regression models are fitted at each station, predicting annual low flow values from a set of annual meteorological indices, which are subsequently regionalized using a set of catchment characteristics. The second method combines temporal and spatial prediction within a single panel data regression model, allowing estimation of annual low flow values from input of both annual meteorological indices and catchment descriptors. The third and fourth methods represent non-stationary low flow frequency analyses and require fitting of regional distribution functions. Method three is subject to a spatiotemporal prediction of an index value, method four to estimation of L-moments that adapt the regional frequency distribution to the at-site conditions. The results show that method two outperforms successive prediction in time and space. Method three also shows a high performance in the near future period, but since it relies on a stationary distribution, its application for prediction of far future changes may be
General renormalized statistical approach with finite cross-field correlations
International Nuclear Information System (INIS)
Vakulenko, M.O.
1992-01-01
The renormalized statistical approach is proposed, accounting for finite correlations of potential and magnetic fluctuations. It may be used for analysis of a wide class of nonlinear model equations describing the cross-correlated plasma states. The influence of a cross spectrum on stationary potential and magnetic ones is investigated. 10 refs. (author)
Directory of Open Access Journals (Sweden)
Lesther ePapa
2015-11-01
Full Text Available Testing mediation models is critical for identifying potential variables that need to be targeted to effectively change one or more outcome variables. In addition, it is now common practice for clinicians to use multiple informant (MI data in studies of statistical mediation. By coupling the use of MI data with statistical mediation analysis, clinical researchers can combine the benefits of both techniques. Integrating the information from MIs into a statistical mediation model creates various methodological and practical challenges. The authors review prior methodological approaches to MI mediation analysis in clinical research and propose a new latent variable approach that overcomes some limitations of prior approaches. An application of the new approach to mother, father, and child reports of impulsivity, frustration tolerance, and externalizing problems (N = 454 is presented. The results showed that frustration tolerance mediated the relationship between impulsivity and externalizing problems. Advantages and limitations of the new approach are discussed. The new approach can help clinical researchers overcome limitations of prior techniques. It allows for a more comprehensive and effective use of MI data when testing mediation models.
Artificial intelligence approaches in statistics
International Nuclear Information System (INIS)
Phelps, R.I.; Musgrove, P.B.
1986-01-01
The role of pattern recognition and knowledge representation methods from Artificial Intelligence within statistics is considered. Two areas of potential use are identified and one, data exploration, is used to illustrate the possibilities. A method is presented to identify and separate overlapping groups within cluster analysis, using an AI approach. The potential of such ''intelligent'' approaches is stressed
Right-sizing statistical models for longitudinal data.
Wood, Phillip K; Steinley, Douglas; Jackson, Kristina M
2015-12-01
Arguments are proposed that researchers using longitudinal data should consider more and less complex statistical model alternatives to their initially chosen techniques in an effort to "right-size" the model to the data at hand. Such model comparisons may alert researchers who use poorly fitting, overly parsimonious models to more complex, better-fitting alternatives and, alternatively, may identify more parsimonious alternatives to overly complex (and perhaps empirically underidentified and/or less powerful) statistical models. A general framework is proposed for considering (often nested) relationships between a variety of psychometric and growth curve models. A 3-step approach is proposed in which models are evaluated based on the number and patterning of variance components prior to selection of better-fitting growth models that explain both mean and variation-covariation patterns. The orthogonal free curve slope intercept (FCSI) growth model is considered a general model that includes, as special cases, many models, including the factor mean (FM) model (McArdle & Epstein, 1987), McDonald's (1967) linearly constrained factor model, hierarchical linear models (HLMs), repeated-measures multivariate analysis of variance (MANOVA), and the linear slope intercept (linearSI) growth model. The FCSI model, in turn, is nested within the Tuckerized factor model. The approach is illustrated by comparing alternative models in a longitudinal study of children's vocabulary and by comparing several candidate parametric growth and chronometric models in a Monte Carlo study. (c) 2015 APA, all rights reserved).
Statistical approach to LHCD modeling using the wave kinetic equation
International Nuclear Information System (INIS)
Kupfer, K.; Moreau, D.; Litaudon, X.
1993-04-01
Recent work has shown that for parameter regimes typical of many present day current drive experiments, the orbits of the launched LH rays are chaotic (in the Hamiltonian sense), so that wave energy diffuses through the stochastic layer and fills the spectral gap. We have analyzed this problem using a statistical approach, by solving the wave kinetic equation for the coarse-grained spectral energy density. An interesting result is that the LH absorption profile is essentially independent of both the total injected power and the level of wave stochastic diffusion
Monitor-Based Statistical Model Checking for Weighted Metric Temporal Logic
DEFF Research Database (Denmark)
Bulychev, Petr; David, Alexandre; Larsen, Kim Guldstrand
2012-01-01
We present a novel approach and implementation for ana- lysing weighted timed automata (WTA) with respect to the weighted metric temporal logic (WMTL≤ ). Based on a stochastic semantics of WTAs, we apply statistical model checking (SMC) to estimate and test probabilities of satisfaction with desi......We present a novel approach and implementation for ana- lysing weighted timed automata (WTA) with respect to the weighted metric temporal logic (WMTL≤ ). Based on a stochastic semantics of WTAs, we apply statistical model checking (SMC) to estimate and test probabilities of satisfaction...
Lee, K. David; Wiesenfeld, Eric; Gelfand, Andrew
2007-04-01
One of the greatest challenges in modern combat is maintaining a high level of timely Situational Awareness (SA). In many situations, computational complexity and accuracy considerations make the development and deployment of real-time, high-level inference tools very difficult. An innovative hybrid framework that combines Bayesian inference, in the form of Bayesian Networks, and Possibility Theory, in the form of Fuzzy Logic systems, has recently been introduced to provide a rigorous framework for high-level inference. In previous research, the theoretical basis and benefits of the hybrid approach have been developed. However, lacking is a concrete experimental comparison of the hybrid framework with traditional fusion methods, to demonstrate and quantify this benefit. The goal of this research, therefore, is to provide a statistical analysis on the comparison of the accuracy and performance of hybrid network theory, with pure Bayesian and Fuzzy systems and an inexact Bayesian system approximated using Particle Filtering. To accomplish this task, domain specific models will be developed under these different theoretical approaches and then evaluated, via Monte Carlo Simulation, in comparison to situational ground truth to measure accuracy and fidelity. Following this, a rigorous statistical analysis of the performance results will be performed, to quantify the benefit of hybrid inference to other fusion tools.
Most, S.; Nowak, W.; Bijeljic, B.
2014-12-01
Transport processes in porous media are frequently simulated as particle movement. This process can be formulated as a stochastic process of particle position increments. At the pore scale, the geometry and micro-heterogeneities prohibit the commonly made assumption of independent and normally distributed increments to represent dispersion. Many recent particle methods seek to loosen this assumption. Recent experimental data suggest that we have not yet reached the end of the need to generalize, because particle increments show statistical dependency beyond linear correlation and over many time steps. The goal of this work is to better understand the validity regions of commonly made assumptions. We are investigating after what transport distances can we observe: A statistical dependence between increments, that can be modelled as an order-k Markov process, boils down to order 1. This would be the Markovian distance for the process, where the validity of yet-unexplored non-Gaussian-but-Markovian random walks would start. A bivariate statistical dependence that simplifies to a multi-Gaussian dependence based on simple linear correlation (validity of correlated PTRW). Complete absence of statistical dependence (validity of classical PTRW/CTRW). The approach is to derive a statistical model for pore-scale transport from a powerful experimental data set via copula analysis. The model is formulated as a non-Gaussian, mutually dependent Markov process of higher order, which allows us to investigate the validity ranges of simpler models.
Statistical validation of normal tissue complication probability models.
Xu, Cheng-Jian; van der Schaaf, Arjen; Van't Veld, Aart A; Langendijk, Johannes A; Schilstra, Cornelis
2012-09-01
To investigate the applicability and value of double cross-validation and permutation tests as established statistical approaches in the validation of normal tissue complication probability (NTCP) models. A penalized regression method, LASSO (least absolute shrinkage and selection operator), was used to build NTCP models for xerostomia after radiation therapy treatment of head-and-neck cancer. Model assessment was based on the likelihood function and the area under the receiver operating characteristic curve. Repeated double cross-validation showed the uncertainty and instability of the NTCP models and indicated that the statistical significance of model performance can be obtained by permutation testing. Repeated double cross-validation and permutation tests are recommended to validate NTCP models before clinical use. Copyright © 2012 Elsevier Inc. All rights reserved.
Statistical Validation of Normal Tissue Complication Probability Models
Energy Technology Data Exchange (ETDEWEB)
Xu Chengjian, E-mail: c.j.xu@umcg.nl [Department of Radiation Oncology, University of Groningen, University Medical Center Groningen, Groningen (Netherlands); Schaaf, Arjen van der; Veld, Aart A. van' t; Langendijk, Johannes A. [Department of Radiation Oncology, University of Groningen, University Medical Center Groningen, Groningen (Netherlands); Schilstra, Cornelis [Department of Radiation Oncology, University of Groningen, University Medical Center Groningen, Groningen (Netherlands); Radiotherapy Institute Friesland, Leeuwarden (Netherlands)
2012-09-01
Purpose: To investigate the applicability and value of double cross-validation and permutation tests as established statistical approaches in the validation of normal tissue complication probability (NTCP) models. Methods and Materials: A penalized regression method, LASSO (least absolute shrinkage and selection operator), was used to build NTCP models for xerostomia after radiation therapy treatment of head-and-neck cancer. Model assessment was based on the likelihood function and the area under the receiver operating characteristic curve. Results: Repeated double cross-validation showed the uncertainty and instability of the NTCP models and indicated that the statistical significance of model performance can be obtained by permutation testing. Conclusion: Repeated double cross-validation and permutation tests are recommended to validate NTCP models before clinical use.
Linear mixed models a practical guide using statistical software
West, Brady T; Galecki, Andrzej T
2006-01-01
Simplifying the often confusing array of software programs for fitting linear mixed models (LMMs), Linear Mixed Models: A Practical Guide Using Statistical Software provides a basic introduction to primary concepts, notation, software implementation, model interpretation, and visualization of clustered and longitudinal data. This easy-to-navigate reference details the use of procedures for fitting LMMs in five popular statistical software packages: SAS, SPSS, Stata, R/S-plus, and HLM. The authors introduce basic theoretical concepts, present a heuristic approach to fitting LMMs based on bo
Flores-Marquez, Leticia Elsa; Ramirez Rojaz, Alejandro; Telesca, Luciano
2015-04-01
The study of two statistical approaches is analyzed for two different types of data sets, one is the seismicity generated by the subduction processes occurred at south Pacific coast of Mexico between 2005 and 2012, and the other corresponds to the synthetic seismic data generated by a stick-slip experimental model. The statistical methods used for the present study are the visibility graph in order to investigate the time dynamics of the series and the scaled probability density function in the natural time domain to investigate the critical order of the system. This comparison has the purpose to show the similarities between the dynamical behaviors of both types of data sets, from the point of view of critical systems. The observed behaviors allow us to conclude that the experimental set up globally reproduces the behavior observed in the statistical approaches used to analyses the seismicity of the subduction zone. The present study was supported by the Bilateral Project Italy-Mexico Experimental Stick-slip models of tectonic faults: innovative statistical approaches applied to synthetic seismic sequences, jointly funded by MAECI (Italy) and AMEXCID (Mexico) in the framework of the Bilateral Agreement for Scientific and Technological Cooperation PE 2014-2016.
Stochastic Spatial Models in Ecology: A Statistical Physics Approach
Pigolotti, Simone; Cencini, Massimo; Molina, Daniel; Muñoz, Miguel A.
2017-11-01
Ecosystems display a complex spatial organization. Ecologists have long tried to characterize them by looking at how different measures of biodiversity change across spatial scales. Ecological neutral theory has provided simple predictions accounting for general empirical patterns in communities of competing species. However, while neutral theory in well-mixed ecosystems is mathematically well understood, spatial models still present several open problems, limiting the quantitative understanding of spatial biodiversity. In this review, we discuss the state of the art in spatial neutral theory. We emphasize the connection between spatial ecological models and the physics of non-equilibrium phase transitions and how concepts developed in statistical physics translate in population dynamics, and vice versa. We focus on non-trivial scaling laws arising at the critical dimension D = 2 of spatial neutral models, and their relevance for biological populations inhabiting two-dimensional environments. We conclude by discussing models incorporating non-neutral effects in the form of spatial and temporal disorder, and analyze how their predictions deviate from those of purely neutral theories.
Computationally efficient statistical differential equation modeling using homogenization
Hooten, Mevin B.; Garlick, Martha J.; Powell, James A.
2013-01-01
Statistical models using partial differential equations (PDEs) to describe dynamically evolving natural systems are appearing in the scientific literature with some regularity in recent years. Often such studies seek to characterize the dynamics of temporal or spatio-temporal phenomena such as invasive species, consumer-resource interactions, community evolution, and resource selection. Specifically, in the spatial setting, data are often available at varying spatial and temporal scales. Additionally, the necessary numerical integration of a PDE may be computationally infeasible over the spatial support of interest. We present an approach to impose computationally advantageous changes of support in statistical implementations of PDE models and demonstrate its utility through simulation using a form of PDE known as “ecological diffusion.” We also apply a statistical ecological diffusion model to a data set involving the spread of mountain pine beetle (Dendroctonus ponderosae) in Idaho, USA.
Statistical mechanics of directed models of polymers in the square lattice
International Nuclear Information System (INIS)
Rensburg, E J Janse van
2003-01-01
Directed square lattice models of polymers and vesicles have received considerable attention in the recent mathematical and physical sciences literature. These are idealized geometric directed lattice models introduced to study phase behaviour in polymers, and include Dyck paths, partially directed paths, directed trees and directed vesicles models. Directed models are closely related to models studied in the combinatorics literature (and are often exactly solvable). They are also simplified versions of a number of statistical mechanics models, including the self-avoiding walk, lattice animals and lattice vesicles. The exchange of approaches and ideas between statistical mechanics and combinatorics have considerably advanced the description and understanding of directed lattice models, and this will be explored in this review. The combinatorial nature of directed lattice path models makes a study using generating function approaches most natural. In contrast, the statistical mechanics approach would introduce partition functions and free energies, and then investigate these using the general framework of critical phenomena. Generating function and statistical mechanics approaches are closely related. For example, questions regarding the limiting free energy may be approached by considering the radius of convergence of a generating function, and the scaling properties of thermodynamic quantities are related to the asymptotic properties of the generating function. In this review the methods for obtaining generating functions and determining free energies in directed lattice path models of linear polymers is presented. These methods include decomposition methods leading to functional recursions, as well as the Temperley method (that is implemented by creating a combinatorial object, one slice at a time). A constant term formulation of the generating function will also be reviewed. The thermodynamic features and critical behaviour in models of directed paths may be
Improved air ventilation rate estimation based on a statistical model
International Nuclear Information System (INIS)
Brabec, M.; Jilek, K.
2004-01-01
A new approach to air ventilation rate estimation from CO measurement data is presented. The approach is based on a state-space dynamic statistical model, allowing for quick and efficient estimation. Underlying computations are based on Kalman filtering, whose practical software implementation is rather easy. The key property is the flexibility of the model, allowing various artificial regimens of CO level manipulation to be treated. The model is semi-parametric in nature and can efficiently handle time-varying ventilation rate. This is a major advantage, compared to some of the methods which are currently in practical use. After a formal introduction of the statistical model, its performance is demonstrated on real data from routine measurements. It is shown how the approach can be utilized in a more complex situation of major practical relevance, when time-varying air ventilation rate and radon entry rate are to be estimated simultaneously from concurrent radon and CO measurements
A statistical approach to instrument calibration
Robert R. Ziemer; David Strauss
1978-01-01
Summary - It has been found that two instruments will yield different numerical values when used to measure identical points. A statistical approach is presented that can be used to approximate the error associated with the calibration of instruments. Included are standard statistical tests that can be used to determine if a number of successive calibrations of the...
Cellular automata and statistical mechanical models
International Nuclear Information System (INIS)
Rujan, P.
1987-01-01
The authors elaborate on the analogy between the transfer matrix of usual lattice models and the master equation describing the time development of cellular automata. Transient and stationary properties of probabilistic automata are linked to surface and bulk properties, respectively, of restricted statistical mechanical systems. It is demonstrated that methods of statistical physics can be successfully used to describe the dynamic and the stationary behavior of such automata. Some exact results are derived, including duality transformations, exact mappings, disorder, and linear solutions. Many examples are worked out in detail to demonstrate how to use statistical physics in order to construct cellular automata with desired properties. This approach is considered to be a first step toward the design of fully parallel, probabilistic systems whose computational abilities rely on the cooperative behavior of their components
A κ-generalized statistical mechanics approach to income analysis
Clementi, F.; Gallegati, M.; Kaniadakis, G.
2009-02-01
This paper proposes a statistical mechanics approach to the analysis of income distribution and inequality. A new distribution function, having its roots in the framework of κ-generalized statistics, is derived that is particularly suitable for describing the whole spectrum of incomes, from the low-middle income region up to the high income Pareto power-law regime. Analytical expressions for the shape, moments and some other basic statistical properties are given. Furthermore, several well-known econometric tools for measuring inequality, which all exist in a closed form, are considered. A method for parameter estimation is also discussed. The model is shown to fit remarkably well the data on personal income for the United States, and the analysis of inequality performed in terms of its parameters is revealed as very powerful.
A κ-generalized statistical mechanics approach to income analysis
International Nuclear Information System (INIS)
Clementi, F; Gallegati, M; Kaniadakis, G
2009-01-01
This paper proposes a statistical mechanics approach to the analysis of income distribution and inequality. A new distribution function, having its roots in the framework of κ-generalized statistics, is derived that is particularly suitable for describing the whole spectrum of incomes, from the low–middle income region up to the high income Pareto power-law regime. Analytical expressions for the shape, moments and some other basic statistical properties are given. Furthermore, several well-known econometric tools for measuring inequality, which all exist in a closed form, are considered. A method for parameter estimation is also discussed. The model is shown to fit remarkably well the data on personal income for the United States, and the analysis of inequality performed in terms of its parameters is revealed as very powerful
Statistical validation of normal tissue complication probability models
Xu, Cheng-Jian; van der Schaaf, Arjen; van t Veld, Aart; Langendijk, Johannes A.; Schilstra, Cornelis
2012-01-01
PURPOSE: To investigate the applicability and value of double cross-validation and permutation tests as established statistical approaches in the validation of normal tissue complication probability (NTCP) models. METHODS AND MATERIALS: A penalized regression method, LASSO (least absolute shrinkage
Mixed deterministic statistical modelling of regional ozone air pollution
Kalenderski, Stoitchko
2011-03-17
We develop a physically motivated statistical model for regional ozone air pollution by separating the ground-level pollutant concentration field into three components, namely: transport, local production and large-scale mean trend mostly dominated by emission rates. The model is novel in the field of environmental spatial statistics in that it is a combined deterministic-statistical model, which gives a new perspective to the modelling of air pollution. The model is presented in a Bayesian hierarchical formalism, and explicitly accounts for advection of pollutants, using the advection equation. We apply the model to a specific case of regional ozone pollution-the Lower Fraser valley of British Columbia, Canada. As a predictive tool, we demonstrate that the model vastly outperforms existing, simpler modelling approaches. Our study highlights the importance of simultaneously considering different aspects of an air pollution problem as well as taking into account the physical bases that govern the processes of interest. © 2011 John Wiley & Sons, Ltd..
Permutation statistical methods an integrated approach
Berry, Kenneth J; Johnston, Janis E
2016-01-01
This research monograph provides a synthesis of a number of statistical tests and measures, which, at first consideration, appear disjoint and unrelated. Numerous comparisons of permutation and classical statistical methods are presented, and the two methods are compared via probability values and, where appropriate, measures of effect size. Permutation statistical methods, compared to classical statistical methods, do not rely on theoretical distributions, avoid the usual assumptions of normality and homogeneity of variance, and depend only on the data at hand. This text takes a unique approach to explaining statistics by integrating a large variety of statistical methods, and establishing the rigor of a topic that to many may seem to be a nascent field in statistics. This topic is new in that it took modern computing power to make permutation methods available to people working in the mainstream of research. This research monograph addresses a statistically-informed audience, and can also easily serve as a ...
International Nuclear Information System (INIS)
Zahedi, Gholamreza; Karami, Zohre; Yaghoobi, Hamed
2009-01-01
In this study, various estimation methods have been reviewed for hydrate formation temperature (HFT) and two procedures have been presented. In the first method, two general correlations have been proposed for HFT. One of the correlations has 11 parameters, and the second one has 18 parameters. In order to obtain constants in proposed equations, 203 experimental data points have been collected from literatures. The Engineering Equation Solver (EES) and Statistical Package for the Social Sciences (SPSS) soft wares have been employed for statistical analysis of the data. Accuracy of the obtained correlations also has been declared by comparison with experimental data and some recent common used correlations. In the second method, HFT is estimated by artificial neural network (ANN) approach. In this case, various architectures have been checked using 70% of experimental data for training of ANN. Among the various architectures multi layer perceptron (MLP) network with trainlm training algorithm was found as the best architecture. Comparing the obtained ANN model results with 30% of unseen data confirms ANN excellent estimation performance. It was found that ANN is more accurate than traditional methods and even our two proposed correlations for HFT estimation.
Advanced data analysis in neuroscience integrating statistical and computational models
Durstewitz, Daniel
2017-01-01
This book is intended for use in advanced graduate courses in statistics / machine learning, as well as for all experimental neuroscientists seeking to understand statistical methods at a deeper level, and theoretical neuroscientists with a limited background in statistics. It reviews almost all areas of applied statistics, from basic statistical estimation and test theory, linear and nonlinear approaches for regression and classification, to model selection and methods for dimensionality reduction, density estimation and unsupervised clustering. Its focus, however, is linear and nonlinear time series analysis from a dynamical systems perspective, based on which it aims to convey an understanding also of the dynamical mechanisms that could have generated observed time series. Further, it integrates computational modeling of behavioral and neural dynamics with statistical estimation and hypothesis testing. This way computational models in neuroscience are not only explanat ory frameworks, but become powerfu...
Energy Technology Data Exchange (ETDEWEB)
Yang, Jinzhong, E-mail: jyang4@mdanderson.org [Department of Radiation Physics, The University of Texas MD Anderson Cancer Center, Houston, Texas (United States); Woodward, Wendy A.; Reed, Valerie K.; Strom, Eric A.; Perkins, George H.; Tereffe, Welela; Buchholz, Thomas A. [Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas (United States); Zhang, Lifei; Balter, Peter; Court, Laurence E. [Department of Radiation Physics, The University of Texas MD Anderson Cancer Center, Houston, Texas (United States); Li, X. Allen [Department of Radiation Oncology, Medical College of Wisconsin, Milwaukee, Wisconsin (United States); Dong, Lei [Department of Radiation Physics, The University of Texas MD Anderson Cancer Center, Houston, Texas (United States); Scripps Proton Therapy Center, San Diego, California (United States)
2014-05-01
Purpose: To develop a new approach for interobserver variability analysis. Methods and Materials: Eight radiation oncologists specializing in breast cancer radiation therapy delineated a patient's left breast “from scratch” and from a template that was generated using deformable image registration. Three of the radiation oncologists had previously received training in Radiation Therapy Oncology Group consensus contouring for breast cancer atlas. The simultaneous truth and performance level estimation algorithm was applied to the 8 contours delineated “from scratch” to produce a group consensus contour. Individual Jaccard scores were fitted to a beta distribution model. We also applied this analysis to 2 or more patients, which were contoured by 9 breast radiation oncologists from 8 institutions. Results: The beta distribution model had a mean of 86.2%, standard deviation (SD) of ±5.9%, a skewness of −0.7, and excess kurtosis of 0.55, exemplifying broad interobserver variability. The 3 RTOG-trained physicians had higher agreement scores than average, indicating that their contours were close to the group consensus contour. One physician had high sensitivity but lower specificity than the others, which implies that this physician tended to contour a structure larger than those of the others. Two other physicians had low sensitivity but specificity similar to the others, which implies that they tended to contour a structure smaller than the others. With this information, they could adjust their contouring practice to be more consistent with others if desired. When contouring from the template, the beta distribution model had a mean of 92.3%, SD ± 3.4%, skewness of −0.79, and excess kurtosis of 0.83, which indicated a much better consistency among individual contours. Similar results were obtained for the analysis of 2 additional patients. Conclusions: The proposed statistical approach was able to measure interobserver variability quantitatively
Growth Curve Models and Applications : Indian Statistical Institute
2017-01-01
Growth curve models in longitudinal studies are widely used to model population size, body height, biomass, fungal growth, and other variables in the biological sciences, but these statistical methods for modeling growth curves and analyzing longitudinal data also extend to general statistics, economics, public health, demographics, epidemiology, SQC, sociology, nano-biotechnology, fluid mechanics, and other applied areas. There is no one-size-fits-all approach to growth measurement. The selected papers in this volume build on presentations from the GCM workshop held at the Indian Statistical Institute, Giridih, on March 28-29, 2016. They represent recent trends in GCM research on different subject areas, both theoretical and applied. This book includes tools and possibilities for further work through new techniques and modification of existing ones. The volume includes original studies, theoretical findings and case studies from a wide range of app lied work, and these contributions have been externally r...
Costa, Marco; A. Manuela Gonçalves
2012-01-01
In this work are discussed some statistical approaches that combine multivariate statistical techniques and time series analysis in order to describe and model spatial patterns and temporal evolution by observing hydrological series of water quality variables recorded in time and space. These approaches are illustrated with a data set collected in the River Ave hydrological basin located in the Northwest region of Portugal.
Meta-analysis a structural equation modeling approach
Cheung, Mike W-L
2015-01-01
Presents a novel approach to conducting meta-analysis using structural equation modeling. Structural equation modeling (SEM) and meta-analysis are two powerful statistical methods in the educational, social, behavioral, and medical sciences. They are often treated as two unrelated topics in the literature. This book presents a unified framework on analyzing meta-analytic data within the SEM framework, and illustrates how to conduct meta-analysis using the metaSEM package in the R statistical environment. Meta-Analysis: A Structural Equation Modeling Approach begins by introducing the impo
Statistical model selection with “Big Data”
Directory of Open Access Journals (Sweden)
Jurgen A. Doornik
2015-12-01
Full Text Available Big Data offer potential benefits for statistical modelling, but confront problems including an excess of false positives, mistaking correlations for causes, ignoring sampling biases and selecting by inappropriate methods. We consider the many important requirements when searching for a data-based relationship using Big Data, and the possible role of Autometrics in that context. Paramount considerations include embedding relationships in general initial models, possibly restricting the number of variables to be selected over by non-statistical criteria (the formulation problem, using good quality data on all variables, analyzed with tight significance levels by a powerful selection procedure, retaining available theory insights (the selection problem while testing for relationships being well specified and invariant to shifts in explanatory variables (the evaluation problem, using a viable approach that resolves the computational problem of immense numbers of possible models.
Topology for Statistical Modeling of Petascale Data
Energy Technology Data Exchange (ETDEWEB)
Pascucci, Valerio [Univ. of Utah, Salt Lake City, UT (United States); Levine, Joshua [Univ. of Utah, Salt Lake City, UT (United States); Gyulassy, Attila [Univ. of Utah, Salt Lake City, UT (United States); Bremer, P. -T. [Univ. of Utah, Salt Lake City, UT (United States)
2013-10-31
Many commonly used algorithms for mathematical analysis do not scale well enough to accommodate the size or complexity of petascale data produced by computational simulations. The primary goal of this project is to develop new mathematical tools that address both the petascale size and uncertain nature of current data. At a high level, the approach of the entire team involving all three institutions is based on the complementary techniques of combinatorial topology and statistical modelling. In particular, we use combinatorial topology to filter out spurious data that would otherwise skew statistical modelling techniques, and we employ advanced algorithms from algebraic statistics to efficiently find globally optimal fits to statistical models. The overall technical contributions can be divided loosely into three categories: (1) advances in the field of combinatorial topology, (2) advances in statistical modelling, and (3) new integrated topological and statistical methods. Roughly speaking, the division of labor between our 3 groups (Sandia Labs in Livermore, Texas A&M in College Station, and U Utah in Salt Lake City) is as follows: the Sandia group focuses on statistical methods and their formulation in algebraic terms, and finds the application problems (and data sets) most relevant to this project, the Texas A&M Group develops new algebraic geometry algorithms, in particular with fewnomial theory, and the Utah group develops new algorithms in computational topology via Discrete Morse Theory. However, we hasten to point out that our three groups stay in tight contact via videconference every 2 weeks, so there is much synergy of ideas between the groups. The following of this document is focused on the contributions that had grater direct involvement from the team at the University of Utah in Salt Lake City.
On the statistical comparison of climate model output and climate data
International Nuclear Information System (INIS)
Solow, A.R.
1991-01-01
Some broad issues arising in the statistical comparison of the output of climate models with the corresponding climate data are reviewed. Particular attention is paid to the question of detecting climate change. The purpose of this paper is to review some statistical approaches to the comparison of the output of climate models with climate data. There are many statistical issues arising in such a comparison. The author will focus on some of the broader issues, although some specific methodological questions will arise along the way. One important potential application of the approaches discussed in this paper is the detection of climate change. Although much of the discussion will be fairly general, he will try to point out the appropriate connections to the detection question. 9 refs
On the statistical comparison of climate model output and climate data
International Nuclear Information System (INIS)
Solow, A.R.
1990-01-01
Some broad issues arising in the statistical comparison of the output of climate models with the corresponding climate data are reviewed. Particular attention is paid to the question of detecting climate change. The purpose of this paper is to review some statistical approaches to the comparison of the output of climate models with climate data. There are many statistical issues arising in such a comparison. The author will focus on some of the broader issues, although some specific methodological questions will arise along the way. One important potential application of the approaches discussed in this paper is the detection of climate change. Although much of the discussion will be fairly general, he will try to point out the appropriate connections to the detection question
Statistical physics approach to earthquake occurrence and forecasting
Energy Technology Data Exchange (ETDEWEB)
Arcangelis, Lucilla de [Department of Industrial and Information Engineering, Second University of Naples, Aversa (CE) (Italy); Godano, Cataldo [Department of Mathematics and Physics, Second University of Naples, Caserta (Italy); Grasso, Jean Robert [ISTerre, IRD-CNRS-OSUG, University of Grenoble, Saint Martin d’Héres (France); Lippiello, Eugenio, E-mail: eugenio.lippiello@unina2.it [Department of Mathematics and Physics, Second University of Naples, Caserta (Italy)
2016-04-25
There is striking evidence that the dynamics of the Earth crust is controlled by a wide variety of mutually dependent mechanisms acting at different spatial and temporal scales. The interplay of these mechanisms produces instabilities in the stress field, leading to abrupt energy releases, i.e., earthquakes. As a consequence, the evolution towards instability before a single event is very difficult to monitor. On the other hand, collective behavior in stress transfer and relaxation within the Earth crust leads to emergent properties described by stable phenomenological laws for a population of many earthquakes in size, time and space domains. This observation has stimulated a statistical mechanics approach to earthquake occurrence, applying ideas and methods as scaling laws, universality, fractal dimension, renormalization group, to characterize the physics of earthquakes. In this review we first present a description of the phenomenological laws of earthquake occurrence which represent the frame of reference for a variety of statistical mechanical models, ranging from the spring-block to more complex fault models. Next, we discuss the problem of seismic forecasting in the general framework of stochastic processes, where seismic occurrence can be described as a branching process implementing space–time-energy correlations between earthquakes. In this context we show how correlations originate from dynamical scaling relations between time and energy, able to account for universality and provide a unifying description for the phenomenological power laws. Then we discuss how branching models can be implemented to forecast the temporal evolution of the earthquake occurrence probability and allow to discriminate among different physical mechanisms responsible for earthquake triggering. In particular, the forecasting problem will be presented in a rigorous mathematical framework, discussing the relevance of the processes acting at different temporal scales for
Kim, Seongho; Li, Lang
2014-02-01
The statistical identifiability of nonlinear pharmacokinetic (PK) models with the Michaelis-Menten (MM) kinetic equation is considered using a global optimization approach, which is particle swarm optimization (PSO). If a model is statistically non-identifiable, the conventional derivative-based estimation approach is often terminated earlier without converging, due to the singularity. To circumvent this difficulty, we develop a derivative-free global optimization algorithm by combining PSO with a derivative-free local optimization algorithm to improve the rate of convergence of PSO. We further propose an efficient approach to not only checking the convergence of estimation but also detecting the identifiability of nonlinear PK models. PK simulation studies demonstrate that the convergence and identifiability of the PK model can be detected efficiently through the proposed approach. The proposed approach is then applied to clinical PK data along with a two-compartmental model. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Statistical Emulation of Climate Model Projections Based on Precomputed GCM Runs*
Castruccio, Stefano
2014-03-01
The authors describe a new approach for emulating the output of a fully coupled climate model under arbitrary forcing scenarios that is based on a small set of precomputed runs from the model. Temperature and precipitation are expressed as simple functions of the past trajectory of atmospheric CO2 concentrations, and a statistical model is fit using a limited set of training runs. The approach is demonstrated to be a useful and computationally efficient alternative to pattern scaling and captures the nonlinear evolution of spatial patterns of climate anomalies inherent in transient climates. The approach does as well as pattern scaling in all circumstances and substantially better in many; it is not computationally demanding; and, once the statistical model is fit, it produces emulated climate output effectively instantaneously. It may therefore find wide application in climate impacts assessments and other policy analyses requiring rapid climate projections.
A statistical approach for water movement in the unsaturated zone
International Nuclear Information System (INIS)
Tielin Zang.
1991-01-01
This thesis presents a statistical approach for estimating and analyzing the downward transport pattern and distribution of soil water by the use of pattern analysis of space-time correlation structures. This approach, called the Space-time-Correlation Field, is mainly based on the analyses of correlation functions simultaneously in the space and time domain. The overall purpose of this work is to derive an alternative statistical procedure in soil moisture analysis without involving detailed information on hydraulic parameters and to visualize the dynamics of soil water variability in the space and time domains. A numerical model using method of characteristics is employed to provide hypothetical time series to use in the statistical method, which is, after the verification and calibration, applied to the field measured time series. The results of the application show that the space-time correlation fields reveal effects of soil layers with different hydraulic properties and boundaries between them. It is concluded that the approach poses special advantages when visualizing time and space dependent properties simultaneously. It can be used to investigate the hydrological response of soil water dynamics and characteristics in different dimensions (space and time) and scales. This approach can be used to identify the dominant component in unsaturated flow systems. It is possible to estimate the pattern and the propagation rate downwards of moisture movement in the soil profile. Small-scale soil heterogeneities can be identified by the correlation field. Since the correlation field technique give a statistical measure of the dependent property that varies within the space-time field, it is possible to interpolate the fields to points where observations are not available, estimating spatial or temporal averages from discrete observations. (au)
Applied systems ecology: models, data, and statistical methods
Energy Technology Data Exchange (ETDEWEB)
Eberhardt, L L
1976-01-01
In this report, systems ecology is largely equated to mathematical or computer simulation modelling. The need for models in ecology stems from the necessity to have an integrative device for the diversity of ecological data, much of which is observational, rather than experimental, as well as from the present lack of a theoretical structure for ecology. Different objectives in applied studies require specialized methods. The best predictive devices may be regression equations, often non-linear in form, extracted from much more detailed models. A variety of statistical aspects of modelling, including sampling, are discussed. Several aspects of population dynamics and food-chain kinetics are described, and it is suggested that the two presently separated approaches should be combined into a single theoretical framework. It is concluded that future efforts in systems ecology should emphasize actual data and statistical methods, as well as modelling.
Statistical power of model selection strategies for genome-wide association studies.
Directory of Open Access Journals (Sweden)
Zheyang Wu
2009-07-01
Full Text Available Genome-wide association studies (GWAS aim to identify genetic variants related to diseases by examining the associations between phenotypes and hundreds of thousands of genotyped markers. Because many genes are potentially involved in common diseases and a large number of markers are analyzed, it is crucial to devise an effective strategy to identify truly associated variants that have individual and/or interactive effects, while controlling false positives at the desired level. Although a number of model selection methods have been proposed in the literature, including marginal search, exhaustive search, and forward search, their relative performance has only been evaluated through limited simulations due to the lack of an analytical approach to calculating the power of these methods. This article develops a novel statistical approach for power calculation, derives accurate formulas for the power of different model selection strategies, and then uses the formulas to evaluate and compare these strategies in genetic model spaces. In contrast to previous studies, our theoretical framework allows for random genotypes, correlations among test statistics, and a false-positive control based on GWAS practice. After the accuracy of our analytical results is validated through simulations, they are utilized to systematically evaluate and compare the performance of these strategies in a wide class of genetic models. For a specific genetic model, our results clearly reveal how different factors, such as effect size, allele frequency, and interaction, jointly affect the statistical power of each strategy. An example is provided for the application of our approach to empirical research. The statistical approach used in our derivations is general and can be employed to address the model selection problems in other random predictor settings. We have developed an R package markerSearchPower to implement our formulas, which can be downloaded from the
A new method to determine the number of experimental data using statistical modeling methods
Energy Technology Data Exchange (ETDEWEB)
Jung, Jung-Ho; Kang, Young-Jin; Lim, O-Kaung; Noh, Yoojeong [Pusan National University, Busan (Korea, Republic of)
2017-06-15
For analyzing the statistical performance of physical systems, statistical characteristics of physical parameters such as material properties need to be estimated by collecting experimental data. For accurate statistical modeling, many such experiments may be required, but data are usually quite limited owing to the cost and time constraints of experiments. In this study, a new method for determining a rea- sonable number of experimental data is proposed using an area metric, after obtaining statistical models using the information on the underlying distribution, the Sequential statistical modeling (SSM) approach, and the Kernel density estimation (KDE) approach. The area metric is used as a convergence criterion to determine the necessary and sufficient number of experimental data to be acquired. The pro- posed method is validated in simulations, using different statistical modeling methods, different true models, and different convergence criteria. An example data set with 29 data describing the fatigue strength coefficient of SAE 950X is used for demonstrating the performance of the obtained statistical models that use a pre-determined number of experimental data in predicting the probability of failure for a target fatigue life.
Sampling, Probability Models and Statistical Reasoning Statistical
Indian Academy of Sciences (India)
Home; Journals; Resonance – Journal of Science Education; Volume 1; Issue 5. Sampling, Probability Models and Statistical Reasoning Statistical Inference. Mohan Delampady V R Padmawar. General Article Volume 1 Issue 5 May 1996 pp 49-58 ...
Functional integral approach to classical statistical dynamics
International Nuclear Information System (INIS)
Jensen, R.V.
1980-04-01
A functional integral method is developed for the statistical solution of nonlinear stochastic differential equations which arise in classical dynamics. The functional integral approach provides a very natural and elegant derivation of the statistical dynamical equations that have been derived using the operator formalism of Martin, Siggia, and Rose
Xu, Chet C; Chan, Roger W; Sun, Han; Zhan, Xiaowei
2017-11-01
A mixed-effects model approach was introduced in this study for the statistical analysis of rheological data of vocal fold tissues, in order to account for the data correlation caused by multiple measurements of each tissue sample across the test frequency range. Such data correlation had often been overlooked in previous studies in the past decades. The viscoelastic shear properties of the vocal fold lamina propria of two commonly used laryngeal research animal species (i.e. rabbit, porcine) were measured by a linear, controlled-strain simple-shear rheometer. Along with published canine and human rheological data, the vocal fold viscoelastic shear moduli of these animal species were compared to those of human over a frequency range of 1-250Hz using the mixed-effects models. Our results indicated that tissues of the rabbit, canine and porcine vocal fold lamina propria were significantly stiffer and more viscous than those of human. Mixed-effects models were shown to be able to more accurately analyze rheological data generated from repeated measurements. Copyright © 2017 Elsevier Ltd. All rights reserved.
Statistical downscaling of rainfall: a non-stationary and multi-resolution approach
Rashid, Md. Mamunur; Beecham, Simon; Chowdhury, Rezaul Kabir
2016-05-01
A novel downscaling technique is proposed in this study whereby the original rainfall and reanalysis variables are first decomposed by wavelet transforms and rainfall is modelled using the semi-parametric additive model formulation of Generalized Additive Model in Location, Scale and Shape (GAMLSS). The flexibility of the GAMLSS model makes it feasible as a framework for non-stationary modelling. Decomposition of a rainfall series into different components is useful to separate the scale-dependent properties of the rainfall as this varies both temporally and spatially. The study was conducted at the Onkaparinga river catchment in South Australia. The model was calibrated over the period 1960 to 1990 and validated over the period 1991 to 2010. The model reproduced the monthly variability and statistics of the observed rainfall well with Nash-Sutcliffe efficiency (NSE) values of 0.66 and 0.65 for the calibration and validation periods, respectively. It also reproduced well the seasonal rainfall over the calibration (NSE = 0.37) and validation (NSE = 0.69) periods for all seasons. The proposed model was better than the tradition modelling approach (application of GAMLSS to the original rainfall series without decomposition) at reproducing the time-frequency properties of the observed rainfall, and yet it still preserved the statistics produced by the traditional modelling approach. When downscaling models were developed with general circulation model (GCM) historical output datasets, the proposed wavelet-based downscaling model outperformed the traditional downscaling model in terms of reproducing monthly rainfall for both the calibration and validation periods.
Statistical Data Processing with R – Metadata Driven Approach
Directory of Open Access Journals (Sweden)
Rudi SELJAK
2016-06-01
Full Text Available In recent years the Statistical Office of the Republic of Slovenia has put a lot of effort into re-designing its statistical process. We replaced the classical stove-pipe oriented production system with general software solutions, based on the metadata driven approach. This means that one general program code, which is parametrized with process metadata, is used for data processing for a particular survey. Currently, the general program code is entirely based on SAS macros, but in the future we would like to explore how successfully statistical software R can be used for this approach. Paper describes the metadata driven principle for data validation, generic software solution and main issues connected with the use of statistical software R for this approach.
Efficient Parallel Statistical Model Checking of Biochemical Networks
Directory of Open Access Journals (Sweden)
Paolo Ballarini
2009-12-01
Full Text Available We consider the problem of verifying stochastic models of biochemical networks against behavioral properties expressed in temporal logic terms. Exact probabilistic verification approaches such as, for example, CSL/PCTL model checking, are undermined by a huge computational demand which rule them out for most real case studies. Less demanding approaches, such as statistical model checking, estimate the likelihood that a property is satisfied by sampling executions out of the stochastic model. We propose a methodology for efficiently estimating the likelihood that a LTL property P holds of a stochastic model of a biochemical network. As with other statistical verification techniques, the methodology we propose uses a stochastic simulation algorithm for generating execution samples, however there are three key aspects that improve the efficiency: first, the sample generation is driven by on-the-fly verification of P which results in optimal overall simulation time. Second, the confidence interval estimation for the probability of P to hold is based on an efficient variant of the Wilson method which ensures a faster convergence. Third, the whole methodology is designed according to a parallel fashion and a prototype software tool has been implemented that performs the sampling/verification process in parallel over an HPC architecture.
Statistical approaches to forecast gamma dose rates by using measurements from the atmosphere
International Nuclear Information System (INIS)
Jeong, H.J.; Hwang, W. T.; Kim, E.H.; Han, M.H.
2008-01-01
In this paper, the results obtained by inter-comparing several statistical techniques for estimating gamma dose rates, such as an exponential moving average model, a seasonal exponential smoothing model and an artificial neural networks model, are reported. Seven years of gamma dose rates data measured in Daejeon City, Korea, were divided into two parts to develop the models and validate the effectiveness of the generated predictions by the techniques mentioned above. Artificial neural networks model shows the best forecasting capability among the three statistical models. The reason why the artificial neural networks model provides a superior prediction to the other models would be its ability for a non-linear approximation. To replace the gamma dose rates when missing data for an environmental monitoring system occurs, the moving average model and the seasonal exponential smoothing model can be better because they are faster and easier for applicability than the artificial neural networks model. These kinds of statistical approaches will be helpful for a real-time control of radio emissions or for an environmental quality assessment. (authors)
Paprotny, D.; Morales Napoles, O.; Jonkman, S.N.
2017-01-01
Flood hazard is currently being researched on continental and global scales, using models of increasing complexity. In this paper we investigate a different, simplified approach, which combines statistical and physical models in place of conventional rainfall-run-off models to carry out flood
Computer modelling of statistical properties of SASE FEL radiation
International Nuclear Information System (INIS)
Saldin, E. L.; Schneidmiller, E. A.; Yurkov, M. V.
1997-01-01
The paper describes an approach to computer modelling of statistical properties of the radiation from self amplified spontaneous emission free electron laser (SASE FEL). The present approach allows one to calculate the following statistical properties of the SASE FEL radiation: time and spectral field correlation functions, distribution of the fluctuations of the instantaneous radiation power, distribution of the energy in the electron bunch, distribution of the radiation energy after monochromator installed at the FEL amplifier exit and the radiation spectrum. All numerical results presented in the paper have been calculated for the 70 nm SASE FEL at the TESLA Test Facility being under construction at DESY
A statistical mechanical approach to restricted integer partition functions
Zhou, Chi-Chun; Dai, Wu-Sheng
2018-05-01
The main aim of this paper is twofold: (1) suggesting a statistical mechanical approach to the calculation of the generating function of restricted integer partition functions which count the number of partitions—a way of writing an integer as a sum of other integers under certain restrictions. In this approach, the generating function of restricted integer partition functions is constructed from the canonical partition functions of various quantum gases. (2) Introducing a new type of restricted integer partition functions corresponding to general statistics which is a generalization of Gentile statistics in statistical mechanics; many kinds of restricted integer partition functions are special cases of this restricted integer partition function. Moreover, with statistical mechanics as a bridge, we reveal a mathematical fact: the generating function of restricted integer partition function is just the symmetric function which is a class of functions being invariant under the action of permutation groups. Using this approach, we provide some expressions of restricted integer partition functions as examples.
Statistical Model of Extreme Shear
DEFF Research Database (Denmark)
Larsen, Gunner Chr.; Hansen, Kurt Schaldemose
2004-01-01
In order to continue cost-optimisation of modern large wind turbines, it is important to continously increase the knowledge on wind field parameters relevant to design loads. This paper presents a general statistical model that offers site-specific prediction of the probability density function...... by a model that, on a statistically consistent basis, describe the most likely spatial shape of an extreme wind shear event. Predictions from the model have been compared with results from an extreme value data analysis, based on a large number of high-sampled full-scale time series measurements...... are consistent, given the inevitabel uncertainties associated with model as well as with the extreme value data analysis. Keywords: Statistical model, extreme wind conditions, statistical analysis, turbulence, wind loading, statistical analysis, turbulence, wind loading, wind shear, wind turbines....
Parametric statistical inference basic theory and modern approaches
Zacks, Shelemyahu; Tsokos, C P
1981-01-01
Parametric Statistical Inference: Basic Theory and Modern Approaches presents the developments and modern trends in statistical inference to students who do not have advanced mathematical and statistical preparation. The topics discussed in the book are basic and common to many fields of statistical inference and thus serve as a jumping board for in-depth study. The book is organized into eight chapters. Chapter 1 provides an overview of how the theory of statistical inference is presented in subsequent chapters. Chapter 2 briefly discusses statistical distributions and their properties. Chapt
Nonparametric statistics a step-by-step approach
Corder, Gregory W
2014-01-01
"…a very useful resource for courses in nonparametric statistics in which the emphasis is on applications rather than on theory. It also deserves a place in libraries of all institutions where introductory statistics courses are taught."" -CHOICE This Second Edition presents a practical and understandable approach that enhances and expands the statistical toolset for readers. This book includes: New coverage of the sign test and the Kolmogorov-Smirnov two-sample test in an effort to offer a logical and natural progression to statistical powerSPSS® (Version 21) software and updated screen ca
A Statistical Approach to Optimizing Concrete Mixture Design
Ahmad, Shamsad; Alghamdi, Saeid A.
2014-01-01
A step-by-step statistical approach is proposed to obtain optimum proportioning of concrete mixtures using the data obtained through a statistically planned experimental program. The utility of the proposed approach for optimizing the design of concrete mixture is illustrated considering a typical case in which trial mixtures were considered according to a full factorial experiment design involving three factors and their three levels (33). A total of 27 concrete mixtures with three replicate...
A matrix approach to the statistics of longevity in heterogeneous frailty models
Directory of Open Access Journals (Sweden)
Hal Caswell
2014-09-01
Full Text Available Background: The gamma-Gompertz model is a fixed frailty model in which baseline mortality increasesexponentially with age, frailty has a proportional effect on mortality, and frailty at birth follows a gamma distribution. Mortality selects against the more frail, so the marginal mortality rate decelerates, eventually reaching an asymptote. The gamma-Gompertz is one of a wider class of frailty models, characterized by the choice of baseline mortality, effects of frailty, distributions of frailty, and assumptions about the dynamics of frailty. Objective: To develop a matrix model to compute all the statistical properties of longevity from thegamma-Gompertz and related models. Methods: I use the vec-permutation matrix formulation to develop a model in which individuals are jointly classified by age and frailty. The matrix is used to project the age and frailty dynamicsof a cohort and the fundamental matrix is used to obtain the statistics of longevity. Results: The model permits calculation of the mean, variance, coefficient of variation, skewness and all moments of longevity, the marginal mortality and survivorship functions, the dynamics of the frailty distribution, and other quantities. The matrix formulation extends naturally to other frailty models. I apply the analysis to the gamma-Gompertz model (for humans and laboratory animals, the gamma-Makeham model, and the gamma-Siler model, and to a hypothetical dynamic frailty model characterized by diffusion of frailty with reflecting boundaries.The matrix model permits partitioning the variance in longevity into components due to heterogeneity and to individual stochasticity. In several published human data sets, heterogeneity accounts for less than 10Š of the variance in longevity. In laboratory populations of five invertebrate animal species, heterogeneity accounts for 46Š to 83Š ofthe total variance in longevity.
Müller, M. F.; Thompson, S. E.
2015-09-01
The prediction of flow duration curves (FDCs) in ungauged basins remains an important task for hydrologists given the practical relevance of FDCs for water management and infrastructure design. Predicting FDCs in ungauged basins typically requires spatial interpolation of statistical or model parameters. This task is complicated if climate becomes non-stationary, as the prediction challenge now also requires extrapolation through time. In this context, process-based models for FDCs that mechanistically link the streamflow distribution to climate and landscape factors may have an advantage over purely statistical methods to predict FDCs. This study compares a stochastic (process-based) and statistical method for FDC prediction in both stationary and non-stationary contexts, using Nepal as a case study. Under contemporary conditions, both models perform well in predicting FDCs, with Nash-Sutcliffe coefficients above 0.80 in 75 % of the tested catchments. The main drives of uncertainty differ between the models: parameter interpolation was the main source of error for the statistical model, while violations of the assumptions of the process-based model represented the main source of its error. The process-based approach performed better than the statistical approach in numerical simulations with non-stationary climate drivers. The predictions of the statistical method under non-stationary rainfall conditions were poor if (i) local runoff coefficients were not accurately determined from the gauge network, or (ii) streamflow variability was strongly affected by changes in rainfall. A Monte Carlo analysis shows that the streamflow regimes in catchments characterized by a strong wet-season runoff and a rapid, strongly non-linear hydrologic response are particularly sensitive to changes in rainfall statistics. In these cases, process-based prediction approaches are strongly favored over statistical models.
Thermodynamics and statistical mechanics an integrated approach
Shell, M Scott
2015-01-01
Learn classical thermodynamics alongside statistical mechanics with this fresh approach to the subjects. Molecular and macroscopic principles are explained in an integrated, side-by-side manner to give students a deep, intuitive understanding of thermodynamics and equip them to tackle future research topics that focus on the nanoscale. Entropy is introduced from the get-go, providing a clear explanation of how the classical laws connect to the molecular principles, and closing the gap between the atomic world and thermodynamics. Notation is streamlined throughout, with a focus on general concepts and simple models, for building basic physical intuition and gaining confidence in problem analysis and model development. Well over 400 guided end-of-chapter problems are included, addressing conceptual, fundamental, and applied skill sets. Numerous worked examples are also provided together with handy shaded boxes to emphasize key concepts, making this the complete teaching package for students in chemical engineer...
Theoretical approaches to the steady-state statistical physics of interacting dissipative units
Bertin, Eric
2017-02-01
The aim of this review is to provide a concise overview of some of the generic approaches that have been developed to deal with the statistical description of large systems of interacting dissipative ‘units’. The latter notion includes, e.g. inelastic grains, active or self-propelled particles, bubbles in a foam, low-dimensional dynamical systems like driven oscillators, or even spatially extended modes like Fourier modes of the velocity field in a fluid. We first review methods based on the statistical properties of a single unit, starting with elementary mean-field approximations, either static or dynamic, that describe a unit embedded in a ‘self-consistent’ environment. We then discuss how this basic mean-field approach can be extended to account for spatial dependences, in the form of space-dependent mean-field Fokker-Planck equations, for example. We also briefly review the use of kinetic theory in the framework of the Boltzmann equation, which is an appropriate description for dilute systems. We then turn to descriptions in terms of the full N-body distribution, starting from exact solutions of one-dimensional models, using a matrix-product ansatz method when correlations are present. Since exactly solvable models are scarce, we also present some approximation methods which can be used to determine the N-body distribution in a large system of dissipative units. These methods include the Edwards approach for dense granular matter and the approximate treatment of multiparticle Langevin equations with colored noise, which models systems of self-propelled particles. Throughout this review, emphasis is put on methodological aspects of the statistical modeling and on formal similarities between different physical problems, rather than on the specific behavior of a given system.
Statistical modelling for recurrent events: an application to sports injuries.
Ullah, Shahid; Gabbett, Tim J; Finch, Caroline F
2014-09-01
Injuries are often recurrent, with subsequent injuries influenced by previous occurrences and hence correlation between events needs to be taken into account when analysing such data. This paper compares five different survival models (Cox proportional hazards (CoxPH) model and the following generalisations to recurrent event data: Andersen-Gill (A-G), frailty, Wei-Lin-Weissfeld total time (WLW-TT) marginal, Prentice-Williams-Peterson gap time (PWP-GT) conditional models) for the analysis of recurrent injury data. Empirical evaluation and comparison of different models were performed using model selection criteria and goodness-of-fit statistics. Simulation studies assessed the size and power of each model fit. The modelling approach is demonstrated through direct application to Australian National Rugby League recurrent injury data collected over the 2008 playing season. Of the 35 players analysed, 14 (40%) players had more than 1 injury and 47 contact injuries were sustained over 29 matches. The CoxPH model provided the poorest fit to the recurrent sports injury data. The fit was improved with the A-G and frailty models, compared to WLW-TT and PWP-GT models. Despite little difference in model fit between the A-G and frailty models, in the interest of fewer statistical assumptions it is recommended that, where relevant, future studies involving modelling of recurrent sports injury data use the frailty model in preference to the CoxPH model or its other generalisations. The paper provides a rationale for future statistical modelling approaches for recurrent sports injury. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
STATISTICAL MODELS OF REPRESENTING INTELLECTUAL CAPITAL
Directory of Open Access Journals (Sweden)
Andreea Feraru
2016-06-01
Full Text Available This article entitled Statistical Models of Representing Intellectual Capital approaches and analyses the concept of intellectual capital, as well as the main models which can support enterprisers/managers in evaluating and quantifying the advantages of intellectual capital. Most authors examine intellectual capital from a static perspective and focus on the development of its various evaluation models. In this chapter we surveyed the classical static models: Sveiby, Edvisson, Balanced Scorecard, as well as the canonical model of intellectual capital. Among the group of static models for evaluating organisational intellectual capital the canonical model stands out. This model enables the structuring of organisational intellectual capital in: human capital, structural capital and relational capital. Although the model is widely spread, it is a static one and can thus create a series of errors in the process of evaluation, because all the three entities mentioned above are not independent from the viewpoint of their contents, as any logic of structuring complex entities requires.
Pseudo-dynamic source modelling with 1-point and 2-point statistics of earthquake source parameters
Song, S. G.
2013-12-24
Ground motion prediction is an essential element in seismic hazard and risk analysis. Empirical ground motion prediction approaches have been widely used in the community, but efficient simulation-based ground motion prediction methods are needed to complement empirical approaches, especially in the regions with limited data constraints. Recently, dynamic rupture modelling has been successfully adopted in physics-based source and ground motion modelling, but it is still computationally demanding and many input parameters are not well constrained by observational data. Pseudo-dynamic source modelling keeps the form of kinematic modelling with its computational efficiency, but also tries to emulate the physics of source process. In this paper, we develop a statistical framework that governs the finite-fault rupture process with 1-point and 2-point statistics of source parameters in order to quantify the variability of finite source models for future scenario events. We test this method by extracting 1-point and 2-point statistics from dynamically derived source models and simulating a number of rupture scenarios, given target 1-point and 2-point statistics. We propose a new rupture model generator for stochastic source modelling with the covariance matrix constructed from target 2-point statistics, that is, auto- and cross-correlations. Our sensitivity analysis of near-source ground motions to 1-point and 2-point statistics of source parameters provides insights into relations between statistical rupture properties and ground motions. We observe that larger standard deviation and stronger correlation produce stronger peak ground motions in general. The proposed new source modelling approach will contribute to understanding the effect of earthquake source on near-source ground motion characteristics in a more quantitative and systematic way.
Bayesian models based on test statistics for multiple hypothesis testing problems.
Ji, Yuan; Lu, Yiling; Mills, Gordon B
2008-04-01
We propose a Bayesian method for the problem of multiple hypothesis testing that is routinely encountered in bioinformatics research, such as the differential gene expression analysis. Our algorithm is based on modeling the distributions of test statistics under both null and alternative hypotheses. We substantially reduce the complexity of the process of defining posterior model probabilities by modeling the test statistics directly instead of modeling the full data. Computationally, we apply a Bayesian FDR approach to control the number of rejections of null hypotheses. To check if our model assumptions for the test statistics are valid for various bioinformatics experiments, we also propose a simple graphical model-assessment tool. Using extensive simulations, we demonstrate the performance of our models and the utility of the model-assessment tool. In the end, we apply the proposed methodology to an siRNA screening and a gene expression experiment.
Thiessen, Erik D
2017-01-05
Statistical learning has been studied in a variety of different tasks, including word segmentation, object identification, category learning, artificial grammar learning and serial reaction time tasks (e.g. Saffran et al. 1996 Science 274: , 1926-1928; Orban et al. 2008 Proceedings of the National Academy of Sciences 105: , 2745-2750; Thiessen & Yee 2010 Child Development 81: , 1287-1303; Saffran 2002 Journal of Memory and Language 47: , 172-196; Misyak & Christiansen 2012 Language Learning 62: , 302-331). The difference among these tasks raises questions about whether they all depend on the same kinds of underlying processes and computations, or whether they are tapping into different underlying mechanisms. Prior theoretical approaches to statistical learning have often tried to explain or model learning in a single task. However, in many cases these approaches appear inadequate to explain performance in multiple tasks. For example, explaining word segmentation via the computation of sequential statistics (such as transitional probability) provides little insight into the nature of sensitivity to regularities among simultaneously presented features. In this article, we will present a formal computational approach that we believe is a good candidate to provide a unifying framework to explore and explain learning in a wide variety of statistical learning tasks. This framework suggests that statistical learning arises from a set of processes that are inherent in memory systems, including activation, interference, integration of information and forgetting (e.g. Perruchet & Vinter 1998 Journal of Memory and Language 39: , 246-263; Thiessen et al. 2013 Psychological Bulletin 139: , 792-814). From this perspective, statistical learning does not involve explicit computation of statistics, but rather the extraction of elements of the input into memory traces, and subsequent integration across those memory traces that emphasize consistent information (Thiessen and Pavlik
Exclusion statistics and integrable models
International Nuclear Information System (INIS)
Mashkevich, S.
1998-01-01
The definition of exclusion statistics that was given by Haldane admits a 'statistical interaction' between distinguishable particles (multispecies statistics). For such statistics, thermodynamic quantities can be evaluated exactly; explicit expressions are presented here for cluster coefficients. Furthermore, single-species exclusion statistics is realized in one-dimensional integrable models of the Calogero-Sutherland type. The interesting questions of generalizing this correspondence to the higher-dimensional and the multispecies cases remain essentially open; however, our results provide some hints as to searches for the models in question
A statistical approach to the prediction of pressure tube fracture toughness
International Nuclear Information System (INIS)
Pandey, M.D.; Radford, D.D.
2008-01-01
The fracture toughness of the zirconium alloy (Zr-2.5Nb) is an important parameter in determining the flaw tolerance for operation of pressure tubes in a nuclear reactor. Fracture toughness data have been generated by performing rising pressure burst tests on sections of pressure tubes removed from operating reactors. The test data were used to generate a lower-bound fracture toughness curve, which is used in defining the operational limits of pressure tubes. The paper presents a comprehensive statistical analysis of burst test data and develops a multivariate statistical model to relate toughness with material chemistry, mechanical properties, and operational history. The proposed model can be useful in predicting fracture toughness of specific in-service pressure tubes, thereby minimizing conservatism associated with a generic lower-bound approach
Statistical-mechanical lattice models for protein-DNA binding in chromatin
International Nuclear Information System (INIS)
Teif, Vladimir B; Rippe, Karsten
2010-01-01
Statistical-mechanical lattice models for protein-DNA binding are well established as a method to describe complex ligand binding equilibria measured in vitro with purified DNA and protein components. Recently, a new field of applications has opened up for this approach since it has become possible to experimentally quantify genome-wide protein occupancies in relation to the DNA sequence. In particular, the organization of the eukaryotic genome by histone proteins into a nucleoprotein complex termed chromatin has been recognized as a key parameter that controls the access of transcription factors to the DNA sequence. New approaches have to be developed to derive statistical-mechanical lattice descriptions of chromatin-associated protein-DNA interactions. Here, we present the theoretical framework for lattice models of histone-DNA interactions in chromatin and investigate the (competitive) DNA binding of other chromosomal proteins and transcription factors. The results have a number of applications for quantitative models for the regulation of gene expression.
Directory of Open Access Journals (Sweden)
Jeng-Wen Lin
2009-01-01
Full Text Available This paper proposes a statistical confidence interval based nonlinear model parameter refinement approach for the health monitoring of structural systems subjected to seismic excitations. The developed model refinement approach uses the 95% confidence interval of the estimated structural parameters to determine their statistical significance in a least-squares regression setting. When the parameters' confidence interval covers the zero value, it is statistically sustainable to truncate such parameters. The remaining parameters will repetitively undergo such parameter sifting process for model refinement until all the parameters' statistical significance cannot be further improved. This newly developed model refinement approach is implemented for the series models of multivariable polynomial expansions: the linear, the Taylor series, and the power series model, leading to a more accurate identification as well as a more controllable design for system vibration control. Because the statistical regression based model refinement approach is intrinsically used to process a “batch” of data and obtain an ensemble average estimation such as the structural stiffness, the Kalman filter and one of its extended versions is introduced to the refined power series model for structural health monitoring.
The Practicality of Statistical Physics Handout Based on KKNI and the Constructivist Approach
Sari, S. Y.; Afrizon, R.
2018-04-01
Statistical physics lecture shows that: 1) the performance of lecturers, social climate, students’ competence and soft skills needed at work are in enough category, 2) students feel difficulties in following the lectures of statistical physics because it is abstract, 3) 40.72% of students needs more understanding in the form of repetition, practice questions and structured tasks, and 4) the depth of statistical physics material needs to be improved gradually and structured. This indicates that learning materials in accordance of The Indonesian National Qualification Framework or Kerangka Kualifikasi Nasional Indonesia (KKNI) with the appropriate learning approach are needed to help lecturers and students in lectures. The author has designed statistical physics handouts which have very valid criteria (90.89%) according to expert judgment. In addition, the practical level of handouts designed also needs to be considered in order to be easy to use, interesting and efficient in lectures. The purpose of this research is to know the practical level of statistical physics handout based on KKNI and a constructivist approach. This research is a part of research and development with 4-D model developed by Thiagarajan. This research activity has reached part of development test at Development stage. Data collection took place by using a questionnaire distributed to lecturers and students. Data analysis using descriptive data analysis techniques in the form of percentage. The analysis of the questionnaire shows that the handout of statistical physics has very practical criteria. The conclusion of this study is statistical physics handouts based on the KKNI and constructivist approach have been practically used in lectures.
Two statistical approaches, weighted regression on time, discharge, and season and generalized additive models, have recently been used to evaluate water quality trends in estuaries. Both models have been used in similar contexts despite differences in statistical foundations and...
Statistical approach of weakly nonlinear ablative Rayleigh-Taylor instability
International Nuclear Information System (INIS)
Garnier, J.; Masse, L.
2005-01-01
A weakly nonlinear model is proposed for the Rayleigh-Taylor instability in presence of ablation and thermal transport. The nonlinear effects for a single-mode disturbance are computed, included the nonlinear correction to the exponential growth of the fundamental modulation. Mode coupling in the spectrum of a multimode disturbance is thoroughly analyzed by a statistical approach. The exponential growth of the linear regime is shown to be reduced by the nonlinear mode coupling. The saturation amplitude is around 0.1λ for long wavelengths, but higher for short instable wavelengths in the ablative regime
Decoding β-decay systematics: A global statistical model for β- half-lives
International Nuclear Information System (INIS)
Costiris, N. J.; Mavrommatis, E.; Gernoth, K. A.; Clark, J. W.
2009-01-01
Statistical modeling of nuclear data provides a novel approach to nuclear systematics complementary to established theoretical and phenomenological approaches based on quantum theory. Continuing previous studies in which global statistical modeling is pursued within the general framework of machine learning theory, we implement advances in training algorithms designed to improve generalization, in application to the problem of reproducing and predicting the half-lives of nuclear ground states that decay 100% by the β - mode. More specifically, fully connected, multilayer feed-forward artificial neural network models are developed using the Levenberg-Marquardt optimization algorithm together with Bayesian regularization and cross-validation. The predictive performance of models emerging from extensive computer experiments is compared with that of traditional microscopic and phenomenological models as well as with the performance of other learning systems, including earlier neural network models as well as the support vector machines recently applied to the same problem. In discussing the results, emphasis is placed on predictions for nuclei that are far from the stability line, and especially those involved in r-process nucleosynthesis. It is found that the new statistical models can match or even surpass the predictive performance of conventional models for β-decay systematics and accordingly should provide a valuable additional tool for exploring the expanding nuclear landscape.
Statistical analysis of probabilistic models of software product lines with quantitative constraints
DEFF Research Database (Denmark)
Beek, M.H. ter; Legay, A.; Lluch Lafuente, Alberto
2015-01-01
We investigate the suitability of statistical model checking for the analysis of probabilistic models of software product lines with complex quantitative constraints and advanced feature installation options. Such models are specified in the feature-oriented language QFLan, a rich process algebra...... of certain behaviour to the expected average cost of products. This is supported by a Maude implementation of QFLan, integrated with the SMT solver Z3 and the distributed statistical model checker MultiVeStA. Our approach is illustrated with a bikes product line case study....
Cressie, Noel; Calder, Catherine A; Clark, James S; Ver Hoef, Jay M; Wikle, Christopher K
2009-04-01
Analyses of ecological data should account for the uncertainty in the process(es) that generated the data. However, accounting for these uncertainties is a difficult task, since ecology is known for its complexity. Measurement and/or process errors are often the only sources of uncertainty modeled when addressing complex ecological problems, yet analyses should also account for uncertainty in sampling design, in model specification, in parameters governing the specified model, and in initial and boundary conditions. Only then can we be confident in the scientific inferences and forecasts made from an analysis. Probability and statistics provide a framework that accounts for multiple sources of uncertainty. Given the complexities of ecological studies, the hierarchical statistical model is an invaluable tool. This approach is not new in ecology, and there are many examples (both Bayesian and non-Bayesian) in the literature illustrating the benefits of this approach. In this article, we provide a baseline for concepts, notation, and methods, from which discussion on hierarchical statistical modeling in ecology can proceed. We have also planted some seeds for discussion and tried to show where the practical difficulties lie. Our thesis is that hierarchical statistical modeling is a powerful way of approaching ecological analysis in the presence of inevitable but quantifiable uncertainties, even if practical issues sometimes require pragmatic compromises.
A Statistical Programme Assignment Model
DEFF Research Database (Denmark)
Rosholm, Michael; Staghøj, Jonas; Svarer, Michael
When treatment effects of active labour market programmes are heterogeneous in an observable way across the population, the allocation of the unemployed into different programmes becomes a particularly important issue. In this paper, we present a statistical model designed to improve the present...... duration of unemployment spells may result if a statistical programme assignment model is introduced. We discuss several issues regarding the plementation of such a system, especially the interplay between the statistical model and case workers....
Tsutsumi, Morito; Seya, Hajime
2009-12-01
This study discusses the theoretical foundation of the application of spatial hedonic approaches—the hedonic approach employing spatial econometrics or/and spatial statistics—to benefits evaluation. The study highlights the limitations of the spatial econometrics approach since it uses a spatial weight matrix that is not employed by the spatial statistics approach. Further, the study presents empirical analyses by applying the Spatial Autoregressive Error Model (SAEM), which is based on the spatial econometrics approach, and the Spatial Process Model (SPM), which is based on the spatial statistics approach. SPMs are conducted based on both isotropy and anisotropy and applied to different mesh sizes. The empirical analysis reveals that the estimated benefits are quite different, especially between isotropic and anisotropic SPM and between isotropic SPM and SAEM; the estimated benefits are similar for SAEM and anisotropic SPM. The study demonstrates that the mesh size does not affect the estimated amount of benefits. Finally, the study provides a confidence interval for the estimated benefits and raises an issue with regard to benefit evaluation.
Glass viscosity calculation based on a global statistical modelling approach
Energy Technology Data Exchange (ETDEWEB)
Fluegel, Alex
2007-02-01
A global statistical glass viscosity model was developed for predicting the complete viscosity curve, based on more than 2200 composition-property data of silicate glasses from the scientific literature, including soda-lime-silica container and float glasses, TV panel glasses, borosilicate fiber wool and E type glasses, low expansion borosilicate glasses, glasses for nuclear waste vitrification, lead crystal glasses, binary alkali silicates, and various further compositions from over half a century. It is shown that within a measurement series from a specific laboratory the reported viscosity values are often over-estimated at higher temperatures due to alkali and boron oxide evaporation during the measurement and glass preparation, including data by Lakatos et al. (1972) and the recently published High temperature glass melt property database for process modeling by Seward et al. (2005). Similarly, in the glass transition range many experimental data of borosilicate glasses are reported too high due to phase separation effects. The developed global model corrects those errors. The model standard error was 9-17°C, with R^2 = 0.985-0.989. The prediction 95% confidence interval for glass in mass production largely depends on the glass composition of interest, the composition uncertainty, and the viscosity level. New insights in the mixed-alkali effect are provided.
Introducing linear functions: an alternative statistical approach
Nolan, Caroline; Herbert, Sandra
2015-12-01
The introduction of linear functions is the turning point where many students decide if mathematics is useful or not. This means the role of parameters and variables in linear functions could be considered to be `threshold concepts'. There is recognition that linear functions can be taught in context through the exploration of linear modelling examples, but this has its limitations. Currently, statistical data is easily attainable, and graphics or computer algebra system (CAS) calculators are common in many classrooms. The use of this technology provides ease of access to different representations of linear functions as well as the ability to fit a least-squares line for real-life data. This means these calculators could support a possible alternative approach to the introduction of linear functions. This study compares the results of an end-of-topic test for two classes of Australian middle secondary students at a regional school to determine if such an alternative approach is feasible. In this study, test questions were grouped by concept and subjected to concept by concept analysis of the means of test results of the two classes. This analysis revealed that the students following the alternative approach demonstrated greater competence with non-standard questions.
Directory of Open Access Journals (Sweden)
Gilles Feron
Full Text Available For human beings, the mouth is the first organ to perceive food and the different signalling events associated to food breakdown. These events are very complex and as such, their description necessitates combining different data sets. This study proposed an integrated approach to understand the relative contribution of main food oral processing events involved in aroma release during cheese consumption. In vivo aroma release was monitored on forty eight subjects who were asked to eat four different model cheeses varying in fat content and firmness and flavoured with ethyl propanoate and nonan-2-one. A multiblock partial least square regression was performed to explain aroma release from the different physiological data sets (masticatory behaviour, bolus rheology, saliva composition and flux, mouth coating and bolus moistening. This statistical approach was relevant to point out that aroma release was mostly explained by masticatory behaviour whatever the cheese and the aroma, with a specific influence of mean amplitude on aroma release after swallowing. Aroma release from the firmer cheeses was explained mainly by bolus rheology. The persistence of hydrophobic compounds in the breath was mainly explained by bolus spreadability, in close relation with bolus moistening. Resting saliva poorly contributed to the analysis whereas the composition of stimulated saliva was negatively correlated with aroma release and mostly for soft cheeses, when significant.
Müller, M. F.; Thompson, S. E.
2016-02-01
The prediction of flow duration curves (FDCs) in ungauged basins remains an important task for hydrologists given the practical relevance of FDCs for water management and infrastructure design. Predicting FDCs in ungauged basins typically requires spatial interpolation of statistical or model parameters. This task is complicated if climate becomes non-stationary, as the prediction challenge now also requires extrapolation through time. In this context, process-based models for FDCs that mechanistically link the streamflow distribution to climate and landscape factors may have an advantage over purely statistical methods to predict FDCs. This study compares a stochastic (process-based) and statistical method for FDC prediction in both stationary and non-stationary contexts, using Nepal as a case study. Under contemporary conditions, both models perform well in predicting FDCs, with Nash-Sutcliffe coefficients above 0.80 in 75 % of the tested catchments. The main drivers of uncertainty differ between the models: parameter interpolation was the main source of error for the statistical model, while violations of the assumptions of the process-based model represented the main source of its error. The process-based approach performed better than the statistical approach in numerical simulations with non-stationary climate drivers. The predictions of the statistical method under non-stationary rainfall conditions were poor if (i) local runoff coefficients were not accurately determined from the gauge network, or (ii) streamflow variability was strongly affected by changes in rainfall. A Monte Carlo analysis shows that the streamflow regimes in catchments characterized by frequent wet-season runoff and a rapid, strongly non-linear hydrologic response are particularly sensitive to changes in rainfall statistics. In these cases, process-based prediction approaches are favored over statistical models.
Fast optimization of statistical potentials for structurally constrained phylogenetic models
Directory of Open Access Journals (Sweden)
Rodrigue Nicolas
2009-09-01
Full Text Available Abstract Background Statistical approaches for protein design are relevant in the field of molecular evolutionary studies. In recent years, new, so-called structurally constrained (SC models of protein-coding sequence evolution have been proposed, which use statistical potentials to assess sequence-structure compatibility. In a previous work, we defined a statistical framework for optimizing knowledge-based potentials especially suited to SC models. Our method used the maximum likelihood principle and provided what we call the joint potentials. However, the method required numerical estimations by the use of computationally heavy Markov Chain Monte Carlo sampling algorithms. Results Here, we develop an alternative optimization procedure, based on a leave-one-out argument coupled to fast gradient descent algorithms. We assess that the leave-one-out potential yields very similar results to the joint approach developed previously, both in terms of the resulting potential parameters, and by Bayes factor evaluation in a phylogenetic context. On the other hand, the leave-one-out approach results in a considerable computational benefit (up to a 1,000 fold decrease in computational time for the optimization procedure. Conclusion Due to its computational speed, the optimization method we propose offers an attractive alternative for the design and empirical evaluation of alternative forms of potentials, using large data sets and high-dimensional parameterizations.
Statistical modeling to support power system planning
Staid, Andrea
This dissertation focuses on data-analytic approaches that improve our understanding of power system applications to promote better decision-making. It tackles issues of risk analysis, uncertainty management, resource estimation, and the impacts of climate change. Tools of data mining and statistical modeling are used to bring new insight to a variety of complex problems facing today's power system. The overarching goal of this research is to improve the understanding of the power system risk environment for improved operation, investment, and planning decisions. The first chapter introduces some challenges faced in planning for a sustainable power system. Chapter 2 analyzes the driving factors behind the disparity in wind energy investments among states with a goal of determining the impact that state-level policies have on incentivizing wind energy. Findings show that policy differences do not explain the disparities; physical and geographical factors are more important. Chapter 3 extends conventional wind forecasting to a risk-based focus of predicting maximum wind speeds, which are dangerous for offshore operations. Statistical models are presented that issue probabilistic predictions for the highest wind speed expected in a three-hour interval. These models achieve a high degree of accuracy and their use can improve safety and reliability in practice. Chapter 4 examines the challenges of wind power estimation for onshore wind farms. Several methods for wind power resource assessment are compared, and the weaknesses of the Jensen model are demonstrated. For two onshore farms, statistical models outperform other methods, even when very little information is known about the wind farm. Lastly, chapter 5 focuses on the power system more broadly in the context of the risks expected from tropical cyclones in a changing climate. Risks to U.S. power system infrastructure are simulated under different scenarios of tropical cyclone behavior that may result from climate
Benchmark validation of statistical models: Application to mediation analysis of imagery and memory.
MacKinnon, David P; Valente, Matthew J; Wurpts, Ingrid C
2018-03-29
This article describes benchmark validation, an approach to validating a statistical model. According to benchmark validation, a valid model generates estimates and research conclusions consistent with a known substantive effect. Three types of benchmark validation-(a) benchmark value, (b) benchmark estimate, and (c) benchmark effect-are described and illustrated with examples. Benchmark validation methods are especially useful for statistical models with assumptions that are untestable or very difficult to test. Benchmark effect validation methods were applied to evaluate statistical mediation analysis in eight studies using the established effect that increasing mental imagery improves recall of words. Statistical mediation analysis led to conclusions about mediation that were consistent with established theory that increased imagery leads to increased word recall. Benchmark validation based on established substantive theory is discussed as a general way to investigate characteristics of statistical models and a complement to mathematical proof and statistical simulation. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
International Nuclear Information System (INIS)
Yokoyama, Masayuki
2014-01-01
A statistical approach is proposed to predict thermal diffusivity profiles as a transport “model” in fusion plasmas. It can provide regression expressions for the ion and electron heat diffusivities (χ i and χ e ), separately, to construct their radial profiles. An approach that this letter is proposing outstrips the conventional scaling laws for the global confinement time (τ E ) since it also deals with profiles (temperature, density, heating depositions etc.). This approach has become possible with the analysis database accumulated by the extensive application of the integrated transport analysis suite to experiment data. In this letter, TASK3D-a analysis database for high-ion-temperature (high-T i ) plasmas in the LHD (Large Helical Device) is used as an example to describe an approach. (author)
Cierniak, Robert; Lorent, Anna
2016-09-01
The main aim of this paper is to investigate properties of our originally formulated statistical model-based iterative approach applied to the image reconstruction from projections problem which are related to its conditioning, and, in this manner, to prove a superiority of this approach over ones recently used by other authors. The reconstruction algorithm based on this conception uses a maximum likelihood estimation with an objective adjusted to the probability distribution of measured signals obtained from an X-ray computed tomography system with parallel beam geometry. The analysis and experimental results presented here show that our analytical approach outperforms the referential algebraic methodology which is explored widely in the literature and exploited in various commercial implementations. Copyright © 2016 Elsevier Ltd. All rights reserved.
Brandt, Laura A.; Benscoter, Allison; Harvey, Rebecca G.; Speroterra, Carolina; Bucklin, David N.; Romañach, Stephanie; Watling, James I.; Mazzotti, Frank J.
2017-01-01
Climate envelope models are widely used to describe potential future distribution of species under different climate change scenarios. It is broadly recognized that there are both strengths and limitations to using climate envelope models and that outcomes are sensitive to initial assumptions, inputs, and modeling methods Selection of predictor variables, a central step in modeling, is one of the areas where different techniques can yield varying results. Selection of climate variables to use as predictors is often done using statistical approaches that develop correlations between occurrences and climate data. These approaches have received criticism in that they rely on the statistical properties of the data rather than directly incorporating biological information about species responses to temperature and precipitation. We evaluated and compared models and prediction maps for 15 threatened or endangered species in Florida based on two variable selection techniques: expert opinion and a statistical method. We compared model performance between these two approaches for contemporary predictions, and the spatial correlation, spatial overlap and area predicted for contemporary and future climate predictions. In general, experts identified more variables as being important than the statistical method and there was low overlap in the variable sets (0.9 for area under the curve (AUC) and >0.7 for true skill statistic (TSS). Spatial overlap, which compares the spatial configuration between maps constructed using the different variable selection techniques, was only moderate overall (about 60%), with a great deal of variability across species. Difference in spatial overlap was even greater under future climate projections, indicating additional divergence of model outputs from different variable selection techniques. Our work is in agreement with other studies which have found that for broad-scale species distribution modeling, using statistical methods of variable
Analytical model of SiPM time resolution and order statistics with crosstalk
International Nuclear Information System (INIS)
Vinogradov, S.
2015-01-01
Time resolution is the most important parameter of photon detectors in a wide range of time-of-flight and time correlation applications within the areas of high energy physics, medical imaging, and others. Silicon photomultipliers (SiPM) have been initially recognized as perfect photon-number-resolving detectors; now they also provide outstanding results in the scintillator timing resolution. However, crosstalk and afterpulsing introduce false secondary non-Poissonian events, and SiPM time resolution models are experiencing significant difficulties with that. This study presents an attempt to develop an analytical model of the timing resolution of an SiPM taking into account statistics of secondary events resulting from a crosstalk. Two approaches have been utilized to derive an analytical expression for time resolution: the first one based on statistics of independent identically distributed detection event times and the second one based on order statistics of these times. The first approach is found to be more straightforward and “analytical-friendly” to model analog SiPMs. Comparisons of coincidence resolving times predicted by the model with the known experimental results from a LYSO:Ce scintillator and a Hamamatsu MPPC are presented
Analytical model of SiPM time resolution and order statistics with crosstalk
Energy Technology Data Exchange (ETDEWEB)
Vinogradov, S., E-mail: Sergey.Vinogradov@liverpool.ac.uk [University of Liverpool and Cockcroft Institute, Sci-Tech Daresbury, Keckwick Lane, Warrington WA4 4AD (United Kingdom); P.N. Lebedev Physical Institute of the Russian Academy of Sciences, 119991 Leninskiy Prospekt 53, Moscow (Russian Federation)
2015-07-01
Time resolution is the most important parameter of photon detectors in a wide range of time-of-flight and time correlation applications within the areas of high energy physics, medical imaging, and others. Silicon photomultipliers (SiPM) have been initially recognized as perfect photon-number-resolving detectors; now they also provide outstanding results in the scintillator timing resolution. However, crosstalk and afterpulsing introduce false secondary non-Poissonian events, and SiPM time resolution models are experiencing significant difficulties with that. This study presents an attempt to develop an analytical model of the timing resolution of an SiPM taking into account statistics of secondary events resulting from a crosstalk. Two approaches have been utilized to derive an analytical expression for time resolution: the first one based on statistics of independent identically distributed detection event times and the second one based on order statistics of these times. The first approach is found to be more straightforward and “analytical-friendly” to model analog SiPMs. Comparisons of coincidence resolving times predicted by the model with the known experimental results from a LYSO:Ce scintillator and a Hamamatsu MPPC are presented.
International Nuclear Information System (INIS)
Ching, W-H; K H Leung, Michael; Leung, Dennis Y C
2009-01-01
Transient turbulent dispersion phenomena can be found in various practical problems, such as the accidental release of toxic chemical vapor and the airborne transmission of infectious droplets. Computational fluid dynamics (CFD) is an effective tool for analyzing such transient dispersion behaviors. However, the transient CFD analysis is often computationally expensive and time consuming. In the present study, a computationally efficient CFD-statistical hybrid modeling method has been developed for studying transient turbulent dispersion. In this method, the source emission is represented by emissions of many infinitesimal puffs. Statistical analysis is performed to obtain first the statistical properties of the puff trajectories and subsequently the most probable distribution of the puff trajectories that represent the macroscopic dispersion behaviors. In two case studies of ambient dispersion, the numerical modeling results obtained agree reasonably well with both experimental measurements and conventional k-ε modeling results published in the literature. More importantly, the proposed many-puff CFD-statistical hybrid modeling method effectively reduces the computational time by two orders of magnitude.
A novel approach for choosing summary statistics in approximate Bayesian computation.
Aeschbacher, Simon; Beaumont, Mark A; Futschik, Andreas
2012-11-01
The choice of summary statistics is a crucial step in approximate Bayesian computation (ABC). Since statistics are often not sufficient, this choice involves a trade-off between loss of information and reduction of dimensionality. The latter may increase the efficiency of ABC. Here, we propose an approach for choosing summary statistics based on boosting, a technique from the machine-learning literature. We consider different types of boosting and compare them to partial least-squares regression as an alternative. To mitigate the lack of sufficiency, we also propose an approach for choosing summary statistics locally, in the putative neighborhood of the true parameter value. We study a demographic model motivated by the reintroduction of Alpine ibex (Capra ibex) into the Swiss Alps. The parameters of interest are the mean and standard deviation across microsatellites of the scaled ancestral mutation rate (θ(anc) = 4N(e)u) and the proportion of males obtaining access to matings per breeding season (ω). By simulation, we assess the properties of the posterior distribution obtained with the various methods. According to our criteria, ABC with summary statistics chosen locally via boosting with the L(2)-loss performs best. Applying that method to the ibex data, we estimate θ(anc)≈ 1.288 and find that most of the variation across loci of the ancestral mutation rate u is between 7.7 × 10(-4) and 3.5 × 10(-3) per locus per generation. The proportion of males with access to matings is estimated as ω≈ 0.21, which is in good agreement with recent independent estimates.
Diffeomorphic Statistical Deformation Models
DEFF Research Database (Denmark)
Hansen, Michael Sass; Hansen, Mads/Fogtman; Larsen, Rasmus
2007-01-01
In this paper we present a new method for constructing diffeomorphic statistical deformation models in arbitrary dimensional images with a nonlinear generative model and a linear parameter space. Our deformation model is a modified version of the diffeomorphic model introduced by Cootes et al....... The modifications ensure that no boundary restriction has to be enforced on the parameter space to prevent folds or tears in the deformation field. For straightforward statistical analysis, principal component analysis and sparse methods, we assume that the parameters for a class of deformations lie on a linear...... with ground truth in form of manual expert annotations, and compared to Cootes's model. We anticipate applications in unconstrained diffeomorphic synthesis of images, e.g. for tracking, segmentation, registration or classification purposes....
International Nuclear Information System (INIS)
Sengupta, S.K.; Boyle, J.S.
1993-05-01
Variables describing atmospheric circulation and other climate parameters derived from various GCMs and obtained from observations can be represented on a spatio-temporal grid (lattice) structure. The primary objective of this paper is to explore existing as well as some new statistical methods to analyze such data structures for the purpose of model diagnostics and intercomparison from a statistical perspective. Among the several statistical methods considered here, a new method based on common principal components appears most promising for the purpose of intercomparison of spatio-temporal data structures arising in the task of model/model and model/data intercomparison. A complete strategy for such an intercomparison is outlined. The strategy includes two steps. First, the commonality of spatial structures in two (or more) fields is captured in the common principal vectors. Second, the corresponding principal components obtained as time series are then compared on the basis of similarities in their temporal evolution
Interactive statistics with ILLMO
Martens, J.B.O.S.
2014-01-01
Progress in empirical research relies on adequate statistical analysis and reporting. This article proposes an alternative approach to statistical modeling that is based on an old but mostly forgotten idea, namely Thurstone modeling. Traditional statistical methods assume that either the measured
Process simulation and statistical approaches for validating waste form qualification models
International Nuclear Information System (INIS)
Kuhn, W.L.; Toland, M.R.; Pulsipher, B.A.
1989-05-01
This report describes recent progress toward one of the principal objectives of the Nuclear Waste Treatment Program (NWTP) at the Pacific Northwest Laboratory (PNL): to establish relationships between vitrification process control and glass product quality. during testing of a vitrification system, it is important to show that departures affecting the product quality can be sufficiently detected through process measurements to prevent an unacceptable canister from being produced. Meeting this goal is a practical definition of a successful sampling, data analysis, and process control strategy. A simulation model has been developed and preliminarily tested by applying it to approximate operation of the West Valley Demonstration Project (WVDP) vitrification system at West Valley, New York. Multivariate statistical techniques have been identified and described that can be applied to analyze large sets of process measurements. Information on components, tanks, and time is then combined to create a single statistic through which all of the information can be used at once to determine whether the process has shifted away from a normal condition
Bayesian statistic methods and theri application in probabilistic simulation models
Directory of Open Access Journals (Sweden)
Sergio Iannazzo
2007-03-01
Full Text Available Bayesian statistic methods are facing a rapidly growing level of interest and acceptance in the field of health economics. The reasons of this success are probably to be found on the theoretical fundaments of the discipline that make these techniques more appealing to decision analysis. To this point should be added the modern IT progress that has developed different flexible and powerful statistical software framework. Among them probably one of the most noticeably is the BUGS language project and its standalone application for MS Windows WinBUGS. Scope of this paper is to introduce the subject and to show some interesting applications of WinBUGS in developing complex economical models based on Markov chains. The advantages of this approach reside on the elegance of the code produced and in its capability to easily develop probabilistic simulations. Moreover an example of the integration of bayesian inference models in a Markov model is shown. This last feature let the analyst conduce statistical analyses on the available sources of evidence and exploit them directly as inputs in the economic model.
A statistical mechanical model of economics
Lubbers, Nicholas Edward Williams
Statistical mechanics pursues low-dimensional descriptions of systems with a very large number of degrees of freedom. I explore this theme in two contexts. The main body of this dissertation explores and extends the Yard Sale Model (YSM) of economic transactions using a combination of simulations and theory. The YSM is a simple interacting model for wealth distributions which has the potential to explain the empirical observation of Pareto distributions of wealth. I develop the link between wealth condensation and the breakdown of ergodicity due to nonlinear diffusion effects which are analogous to the geometric random walk. Using this, I develop a deterministic effective theory of wealth transfer in the YSM that is useful for explaining many quantitative results. I introduce various forms of growth to the model, paying attention to the effect of growth on wealth condensation, inequality, and ergodicity. Arithmetic growth is found to partially break condensation, and geometric growth is found to completely break condensation. Further generalizations of geometric growth with growth in- equality show that the system is divided into two phases by a tipping point in the inequality parameter. The tipping point marks the line between systems which are ergodic and systems which exhibit wealth condensation. I explore generalizations of the YSM transaction scheme to arbitrary betting functions to develop notions of universality in YSM-like models. I find that wealth vi condensation is universal to a large class of models which can be divided into two phases. The first exhibits slow, power-law condensation dynamics, and the second exhibits fast, finite-time condensation dynamics. I find that the YSM, which exhibits exponential dynamics, is the critical, self-similar model which marks the dividing line between the two phases. The final chapter develops a low-dimensional approach to materials microstructure quantification. Modern materials design harnesses complex
Hadronic equation of state in the statistical bootstrap model and linear graph theory
International Nuclear Information System (INIS)
Fre, P.; Page, R.
1976-01-01
Taking a statistical mechanical point og view, the statistical bootstrap model is discussed and, from a critical analysis of the bootstrap volume comcept, it is reached a physical ipothesis, which leads immediately to the hadronic equation of state provided by the bootstrap integral equation. In this context also the connection between the statistical bootstrap and the linear graph theory approach to interacting gases is analyzed
Quantitative and statistical approaches to geography a practical manual
Matthews, John A
2013-01-01
Quantitative and Statistical Approaches to Geography: A Practical Manual is a practical introduction to some quantitative and statistical techniques of use to geographers and related scientists. This book is composed of 15 chapters, each begins with an outline of the purpose and necessary mechanics of a technique or group of techniques and is concluded with exercises and the particular approach adopted. These exercises aim to enhance student's ability to use the techniques as part of the process by which sound judgments are made according to scientific standards while tackling complex problems. After a brief introduction to the principles of quantitative and statistical geography, this book goes on dealing with the topics of measures of central tendency; probability statements and maps; the problem of time-dependence, time-series analysis, non-normality, and data transformations; and the elements of sampling methodology. Other chapters cover the confidence intervals and estimation from samples, statistical hy...
New robust statistical procedures for the polytomous logistic regression models.
Castilla, Elena; Ghosh, Abhik; Martin, Nirian; Pardo, Leandro
2018-05-17
This article derives a new family of estimators, namely the minimum density power divergence estimators, as a robust generalization of the maximum likelihood estimator for the polytomous logistic regression model. Based on these estimators, a family of Wald-type test statistics for linear hypotheses is introduced. Robustness properties of both the proposed estimators and the test statistics are theoretically studied through the classical influence function analysis. Appropriate real life examples are presented to justify the requirement of suitable robust statistical procedures in place of the likelihood based inference for the polytomous logistic regression model. The validity of the theoretical results established in the article are further confirmed empirically through suitable simulation studies. Finally, an approach for the data-driven selection of the robustness tuning parameter is proposed with empirical justifications. © 2018, The International Biometric Society.
Statistical mechanics of sparse generalization and graphical model selection
International Nuclear Information System (INIS)
Lage-Castellanos, Alejandro; Pagnani, Andrea; Weigt, Martin
2009-01-01
One of the crucial tasks in many inference problems is the extraction of an underlying sparse graphical model from a given number of high-dimensional measurements. In machine learning, this is frequently achieved using, as a penalty term, the L p norm of the model parameters, with p≤1 for efficient dilution. Here we propose a statistical mechanics analysis of the problem in the setting of perceptron memorization and generalization. Using a replica approach, we are able to evaluate the relative performance of naive dilution (obtained by learning without dilution, following by applying a threshold to the model parameters), L 1 dilution (which is frequently used in convex optimization) and L 0 dilution (which is optimal but computationally hard to implement). Whereas both L p diluted approaches clearly outperform the naive approach, we find a small region where L 0 works almost perfectly and strongly outperforms the simpler to implement L 1 dilution
International Nuclear Information System (INIS)
Calvin W. Johnson
2005-01-01
The general goal of the project is to develop and implement computer codes and input files to compute nuclear densities of state. Such densities are important input into calculations of statistical neutron capture, and are difficult to access experimentally. In particular, we will focus on calculating densities for nuclides in the mass range A ∼ 50-100. We use statistical spectroscopy, a moments method based upon a microscopic framework, the interacting shell model. Second year goals and milestones: Develop two or three competing interactions (based upon surface-delta, Gogny, and NN-scattering) suitable for application to nuclei up to A = 100. Begin calculations for nuclides with A = 50-70
Statistical modeling for degradation data
Lio, Yuhlong; Ng, Hon; Tsai, Tzong-Ru
2017-01-01
This book focuses on the statistical aspects of the analysis of degradation data. In recent years, degradation data analysis has come to play an increasingly important role in different disciplines such as reliability, public health sciences, and finance. For example, information on products’ reliability can be obtained by analyzing degradation data. In addition, statistical modeling and inference techniques have been developed on the basis of different degradation measures. The book brings together experts engaged in statistical modeling and inference, presenting and discussing important recent advances in degradation data analysis and related applications. The topics covered are timely and have considerable potential to impact both statistics and reliability engineering.
Exclusion statistics and integrable models
International Nuclear Information System (INIS)
Mashkevich, S.
1998-01-01
The definition of exclusion statistics, as given by Haldane, allows for a statistical interaction between distinguishable particles (multi-species statistics). The thermodynamic quantities for such statistics ca be evaluated exactly. The explicit expressions for the cluster coefficients are presented. Furthermore, single-species exclusion statistics is realized in one-dimensional integrable models. The interesting questions of generalizing this correspondence onto the higher-dimensional and the multi-species cases remain essentially open
Smooth extrapolation of unknown anatomy via statistical shape models
Grupp, R. B.; Chiang, H.; Otake, Y.; Murphy, R. J.; Gordon, C. R.; Armand, M.; Taylor, R. H.
2015-03-01
Several methods to perform extrapolation of unknown anatomy were evaluated. The primary application is to enhance surgical procedures that may use partial medical images or medical images of incomplete anatomy. Le Fort-based, face-jaw-teeth transplant is one such procedure. From CT data of 36 skulls and 21 mandibles separate Statistical Shape Models of the anatomical surfaces were created. Using the Statistical Shape Models, incomplete surfaces were projected to obtain complete surface estimates. The surface estimates exhibit non-zero error in regions where the true surface is known; it is desirable to keep the true surface and seamlessly merge the estimated unknown surface. Existing extrapolation techniques produce non-smooth transitions from the true surface to the estimated surface, resulting in additional error and a less aesthetically pleasing result. The three extrapolation techniques evaluated were: copying and pasting of the surface estimate (non-smooth baseline), a feathering between the patient surface and surface estimate, and an estimate generated via a Thin Plate Spline trained from displacements between the surface estimate and corresponding vertices of the known patient surface. Feathering and Thin Plate Spline approaches both yielded smooth transitions. However, feathering corrupted known vertex values. Leave-one-out analyses were conducted, with 5% to 50% of known anatomy removed from the left-out patient and estimated via the proposed approaches. The Thin Plate Spline approach yielded smaller errors than the other two approaches, with an average vertex error improvement of 1.46 mm and 1.38 mm for the skull and mandible respectively, over the baseline approach.
A New Approach to Monte Carlo Simulations in Statistical Physics
Landau, David P.
2002-08-01
Monte Carlo simulations [1] have become a powerful tool for the study of diverse problems in statistical/condensed matter physics. Standard methods sample the probability distribution for the states of the system, most often in the canonical ensemble, and over the past several decades enormous improvements have been made in performance. Nonetheless, difficulties arise near phase transitions-due to critical slowing down near 2nd order transitions and to metastability near 1st order transitions, and these complications limit the applicability of the method. We shall describe a new Monte Carlo approach [2] that uses a random walk in energy space to determine the density of states directly. Once the density of states is known, all thermodynamic properties can be calculated. This approach can be extended to multi-dimensional parameter spaces and should be effective for systems with complex energy landscapes, e.g., spin glasses, protein folding models, etc. Generalizations should produce a broadly applicable optimization tool. 1. A Guide to Monte Carlo Simulations in Statistical Physics, D. P. Landau and K. Binder (Cambridge U. Press, Cambridge, 2000). 2. Fugao Wang and D. P. Landau, Phys. Rev. Lett. 86, 2050 (2001); Phys. Rev. E64, 056101-1 (2001).
Validation of statistical models for creep rupture by parametric analysis
Energy Technology Data Exchange (ETDEWEB)
Bolton, J., E-mail: john.bolton@uwclub.net [65, Fisher Ave., Rugby, Warks CV22 5HW (United Kingdom)
2012-01-15
Statistical analysis is an efficient method for the optimisation of any candidate mathematical model of creep rupture data, and for the comparative ranking of competing models. However, when a series of candidate models has been examined and the best of the series has been identified, there is no statistical criterion to determine whether a yet more accurate model might be devised. Hence there remains some uncertainty that the best of any series examined is sufficiently accurate to be considered reliable as a basis for extrapolation. This paper proposes that models should be validated primarily by parametric graphical comparison to rupture data and rupture gradient data. It proposes that no mathematical model should be considered reliable for extrapolation unless the visible divergence between model and data is so small as to leave no apparent scope for further reduction. This study is based on the data for a 12% Cr alloy steel used in BS PD6605:1998 to exemplify its recommended statistical analysis procedure. The models considered in this paper include a) a relatively simple model, b) the PD6605 recommended model and c) a more accurate model of somewhat greater complexity. - Highlights: Black-Right-Pointing-Pointer The paper discusses the validation of creep rupture models derived from statistical analysis. Black-Right-Pointing-Pointer It demonstrates that models can be satisfactorily validated by a visual-graphic comparison of models to data. Black-Right-Pointing-Pointer The method proposed utilises test data both as conventional rupture stress and as rupture stress gradient. Black-Right-Pointing-Pointer The approach is shown to be more reliable than a well-established and widely used method (BS PD6605).
Statistical Emulation of Climate Model Projections Based on Precomputed GCM Runs*
Castruccio, Stefano; McInerney, David J.; Stein, Michael L.; Liu Crouch, Feifei; Jacob, Robert L.; Moyer, Elisabeth J.
2014-01-01
functions of the past trajectory of atmospheric CO2 concentrations, and a statistical model is fit using a limited set of training runs. The approach is demonstrated to be a useful and computationally efficient alternative to pattern scaling and captures
Statistical approach for selection of biologically informative genes.
Das, Samarendra; Rai, Anil; Mishra, D C; Rai, Shesh N
2018-05-20
Selection of informative genes from high dimensional gene expression data has emerged as an important research area in genomics. Many gene selection techniques have been proposed so far are either based on relevancy or redundancy measure. Further, the performance of these techniques has been adjudged through post selection classification accuracy computed through a classifier using the selected genes. This performance metric may be statistically sound but may not be biologically relevant. A statistical approach, i.e. Boot-MRMR, was proposed based on a composite measure of maximum relevance and minimum redundancy, which is both statistically sound and biologically relevant for informative gene selection. For comparative evaluation of the proposed approach, we developed two biological sufficient criteria, i.e. Gene Set Enrichment with QTL (GSEQ) and biological similarity score based on Gene Ontology (GO). Further, a systematic and rigorous evaluation of the proposed technique with 12 existing gene selection techniques was carried out using five gene expression datasets. This evaluation was based on a broad spectrum of statistically sound (e.g. subject classification) and biological relevant (based on QTL and GO) criteria under a multiple criteria decision-making framework. The performance analysis showed that the proposed technique selects informative genes which are more biologically relevant. The proposed technique is also found to be quite competitive with the existing techniques with respect to subject classification and computational time. Our results also showed that under the multiple criteria decision-making setup, the proposed technique is best for informative gene selection over the available alternatives. Based on the proposed approach, an R Package, i.e. BootMRMR has been developed and available at https://cran.r-project.org/web/packages/BootMRMR. This study will provide a practical guide to select statistical techniques for selecting informative genes
The large break LOCA evaluation method with the simplified statistic approach
International Nuclear Information System (INIS)
Kamata, Shinya; Kubo, Kazuo
2004-01-01
USNRC published the Code Scaling, Applicability and Uncertainty (CSAU) evaluation methodology to large break LOCA which supported the revised rule for Emergency Core Cooling System performance in 1989. In USNRC regulatory guide 1.157, it is required that the peak cladding temperature (PCT) cannot exceed 2200deg F with high probability 95th percentile. In recent years, overseas countries have developed statistical methodology and best estimate code with the model which can provide more realistic simulation for the phenomena based on the CSAU evaluation methodology. In order to calculate PCT probability distribution by Monte Carlo trials, there are approaches such as the response surface technique using polynomials, the order statistics method, etc. For the purpose of performing rational statistic analysis, Mitsubishi Heavy Industries, LTD (MHI) tried to develop the statistic LOCA method using the best estimate LOCA code MCOBRA/TRAC and the simplified code HOTSPOT. HOTSPOT is a Monte Carlo heat conduction solver to evaluate the uncertainties of the significant fuel parameters at the PCT positions of the hot rod. The direct uncertainty sensitivity studies can be performed without the response surface because the Monte Carlo simulation for key parameters can be performed in short time using HOTSPOT. With regard to the parameter uncertainties, MHI established the treatment that the bounding conditions are given for LOCA boundary and plant initial conditions, the Monte Carlo simulation using HOTSPOT is applied to the significant fuel parameters. The paper describes the large break LOCA evaluation method with the simplified statistic approach and the results of the application of the method to the representative four-loop nuclear power plant. (author)
Online Statistical Modeling (Regression Analysis) for Independent Responses
Made Tirta, I.; Anggraeni, Dian; Pandutama, Martinus
2017-06-01
Regression analysis (statistical analmodelling) are among statistical methods which are frequently needed in analyzing quantitative data, especially to model relationship between response and explanatory variables. Nowadays, statistical models have been developed into various directions to model various type and complex relationship of data. Rich varieties of advanced and recent statistical modelling are mostly available on open source software (one of them is R). However, these advanced statistical modelling, are not very friendly to novice R users, since they are based on programming script or command line interface. Our research aims to developed web interface (based on R and shiny), so that most recent and advanced statistical modelling are readily available, accessible and applicable on web. We have previously made interface in the form of e-tutorial for several modern and advanced statistical modelling on R especially for independent responses (including linear models/LM, generalized linier models/GLM, generalized additive model/GAM and generalized additive model for location scale and shape/GAMLSS). In this research we unified them in the form of data analysis, including model using Computer Intensive Statistics (Bootstrap and Markov Chain Monte Carlo/ MCMC). All are readily accessible on our online Virtual Statistics Laboratory. The web (interface) make the statistical modeling becomes easier to apply and easier to compare them in order to find the most appropriate model for the data.
Whole vertebral bone segmentation method with a statistical intensity-shape model based approach
Hanaoka, Shouhei; Fritscher, Karl; Schuler, Benedikt; Masutani, Yoshitaka; Hayashi, Naoto; Ohtomo, Kuni; Schubert, Rainer
2011-03-01
An automatic segmentation algorithm for the vertebrae in human body CT images is presented. Especially we focused on constructing and utilizing 4 different statistical intensity-shape combined models for the cervical, upper / lower thoracic and lumbar vertebrae, respectively. For this purpose, two previously reported methods were combined: a deformable model-based initial segmentation method and a statistical shape-intensity model-based precise segmentation method. The former is used as a pre-processing to detect the position and orientation of each vertebra, which determines the initial condition for the latter precise segmentation method. The precise segmentation method needs prior knowledge on both the intensities and the shapes of the objects. After PCA analysis of such shape-intensity expressions obtained from training image sets, vertebrae were parametrically modeled as a linear combination of the principal component vectors. The segmentation of each target vertebra was performed as fitting of this parametric model to the target image by maximum a posteriori estimation, combined with the geodesic active contour method. In the experimental result by using 10 cases, the initial segmentation was successful in 6 cases and only partially failed in 4 cases (2 in the cervical area and 2 in the lumbo-sacral). In the precise segmentation, the mean error distances were 2.078, 1.416, 0.777, 0.939 mm for cervical, upper and lower thoracic, lumbar spines, respectively. In conclusion, our automatic segmentation algorithm for the vertebrae in human body CT images showed a fair performance for cervical, thoracic and lumbar vertebrae.
A Statistical Graphical Model of the California Reservoir System
Taeb, A.; Reager, J. T.; Turmon, M.; Chandrasekaran, V.
2017-11-01
The recent California drought has highlighted the potential vulnerability of the state's water management infrastructure to multiyear dry intervals. Due to the high complexity of the network, dynamic storage changes in California reservoirs on a state-wide scale have previously been difficult to model using either traditional statistical or physical approaches. Indeed, although there is a significant line of research on exploring models for single (or a small number of) reservoirs, these approaches are not amenable to a system-wide modeling of the California reservoir network due to the spatial and hydrological heterogeneities of the system. In this work, we develop a state-wide statistical graphical model to characterize the dependencies among a collection of 55 major California reservoirs across the state; this model is defined with respect to a graph in which the nodes index reservoirs and the edges specify the relationships or dependencies between reservoirs. We obtain and validate this model in a data-driven manner based on reservoir volumes over the period 2003-2016. A key feature of our framework is a quantification of the effects of external phenomena that influence the entire reservoir network. We further characterize the degree to which physical factors (e.g., state-wide Palmer Drought Severity Index (PDSI), average temperature, snow pack) and economic factors (e.g., consumer price index, number of agricultural workers) explain these external influences. As a consequence of this analysis, we obtain a system-wide health diagnosis of the reservoir network as a function of PDSI.
Directory of Open Access Journals (Sweden)
Guozhu Zhang
Full Text Available Zebrafish have become an important alternative model for characterizing chemical bioactivity, partly due to the efficiency at which systematic, high-dimensional data can be generated. However, these new data present analytical challenges associated with scale and diversity. We developed a novel, robust statistical approach to characterize chemical-elicited effects in behavioral data from high-throughput screening (HTS of all 1,060 Toxicity Forecaster (ToxCast™ chemicals across 5 concentrations at 120 hours post-fertilization (hpf. Taking advantage of the immense scale of data for a global view, we show that this new approach reduces bias introduced by extreme values yet allows for diverse response patterns that confound the application of traditional statistics. We have also shown that, as a summary measure of response for local tests of chemical-associated behavioral effects, it achieves a significant reduction in coefficient of variation compared to many traditional statistical modeling methods. This effective increase in signal-to-noise ratio augments statistical power and is observed across experimental periods (light/dark conditions that display varied distributional response patterns. Finally, we integrated results with data from concomitant developmental endpoint measurements to show that appropriate statistical handling of HTS behavioral data can add important biological context that informs mechanistic hypotheses.
Statistical inference to advance network models in epidemiology.
Welch, David; Bansal, Shweta; Hunter, David R
2011-03-01
Contact networks are playing an increasingly important role in the study of epidemiology. Most of the existing work in this area has focused on considering the effect of underlying network structure on epidemic dynamics by using tools from probability theory and computer simulation. This work has provided much insight on the role that heterogeneity in host contact patterns plays on infectious disease dynamics. Despite the important understanding afforded by the probability and simulation paradigm, this approach does not directly address important questions about the structure of contact networks such as what is the best network model for a particular mode of disease transmission, how parameter values of a given model should be estimated, or how precisely the data allow us to estimate these parameter values. We argue that these questions are best answered within a statistical framework and discuss the role of statistical inference in estimating contact networks from epidemiological data. Copyright © 2011 Elsevier B.V. All rights reserved.
Exploiting linkage disequilibrium in statistical modelling in quantitative genomics
DEFF Research Database (Denmark)
Wang, Lei
Alleles at two loci are said to be in linkage disequilibrium (LD) when they are correlated or statistically dependent. Genomic prediction and gene mapping rely on the existence of LD between gentic markers and causul variants of complex traits. In the first part of the thesis, a novel method...... to quantify and visualize local variation in LD along chromosomes in describet, and applied to characterize LD patters at the local and genome-wide scale in three Danish pig breeds. In the second part, different ways of taking LD into account in genomic prediction models are studied. One approach is to use...... the recently proposed antedependence models, which treat neighbouring marker effects as correlated; another approach involves use of haplotype block information derived using the program Beagle. The overall conclusion is that taking LD information into account in genomic prediction models potentially improves...
Daniel Goodman’s empirical approach to Bayesian statistics
Gerrodette, Tim; Ward, Eric; Taylor, Rebecca L.; Schwarz, Lisa K.; Eguchi, Tomoharu; Wade, Paul; Himes Boor, Gina
2016-01-01
Bayesian statistics, in contrast to classical statistics, uses probability to represent uncertainty about the state of knowledge. Bayesian statistics has often been associated with the idea that knowledge is subjective and that a probability distribution represents a personal degree of belief. Dr. Daniel Goodman considered this viewpoint problematic for issues of public policy. He sought to ground his Bayesian approach in data, and advocated the construction of a prior as an empirical histogram of “similar” cases. In this way, the posterior distribution that results from a Bayesian analysis combined comparable previous data with case-specific current data, using Bayes’ formula. Goodman championed such a data-based approach, but he acknowledged that it was difficult in practice. If based on a true representation of our knowledge and uncertainty, Goodman argued that risk assessment and decision-making could be an exact science, despite the uncertainties. In his view, Bayesian statistics is a critical component of this science because a Bayesian analysis produces the probabilities of future outcomes. Indeed, Goodman maintained that the Bayesian machinery, following the rules of conditional probability, offered the best legitimate inference from available data. We give an example of an informative prior in a recent study of Steller sea lion spatial use patterns in Alaska.
Data and Dynamics Driven Approaches for Modelling and Forecasting the Red Sea Chlorophyll
Dreano, Denis
2017-05-31
Phytoplankton is at the basis of the marine food chain and therefore play a fundamental role in the ocean ecosystem. However, the large-scale phytoplankton dynamics of the Red Sea are not well understood yet, mainly due to the lack of historical in situ measurements. As a result, our knowledge in this area relies mostly on remotely-sensed observations and large-scale numerical marine ecosystem models. Models are very useful to identify the mechanisms driving the variations in chlorophyll concentration and have practical applications for fisheries operation and harmful algae blooms monitoring. Modelling approaches can be divided between physics- driven (dynamical) approaches, and data-driven (statistical) approaches. Dynamical models are based on a set of differential equations representing the transfer of energy and matter between different subsets of the biota, whereas statistical models identify relationships between variables based on statistical relations within the available data. The goal of this thesis is to develop, implement and test novel dynamical and statistical modelling approaches for studying and forecasting the variability of chlorophyll concentration in the Red Sea. These new models are evaluated in term of their ability to efficiently forecast and explain the regional chlorophyll variability. We also propose innovative synergistic strategies to combine data- and physics-driven approaches to further enhance chlorophyll forecasting capabilities and efficiency.
Behavioral investment strategy matters: a statistical arbitrage approach
Sun, David; Tsai, Shih-Chuan; Wang, Wei
2011-01-01
In this study, we employ a statistical arbitrage approach to demonstrate that momentum investment strategy tend to work better in periods longer than six months, a result different from findings in past literature. Compared with standard parametric tests, the statistical arbitrage method produces more clearly that momentum strategies work only in longer formation and holding periods. Also they yield positive significant returns in an up market, but negative yet insignificant returns in a down...
Statistical Modelling of Synaptic Vesicles Distribution and Analysing their Physical Characteristics
DEFF Research Database (Denmark)
Khanmohammadi, Mahdieh
transmission electron microscopy is used to acquire images from two experimental groups of rats: 1) rats subjected to a behavioral model of stress and 2) rats subjected to sham stress as the control group. The synaptic vesicle distribution and interactions are modeled by employing a point process approach......This Ph.D. thesis deals with mathematical and statistical modeling of synaptic vesicle distribution, shape, orientation and interactions. The first major part of this thesis treats the problem of determining the effect of stress on synaptic vesicle distribution and interactions. Serial section...... on differences of statistical measures in section and the same measures in between sections. Three-dimensional (3D) datasets are reconstructed by using image registration techniques and estimated thicknesses. We distinguish the effect of stress by estimating the synaptic vesicle densities and modeling...
Hagenlocher, Michael; Delmelle, Eric; Casas, Irene; Kienberger, Stefan
2013-08-14
As a result of changes in climatic conditions and greater resistance to insecticides, many regions across the globe, including Colombia, have been facing a resurgence of vector-borne diseases, and dengue fever in particular. Timely information on both (1) the spatial distribution of the disease, and (2) prevailing vulnerabilities of the population are needed to adequately plan targeted preventive intervention. We propose a methodology for the spatial assessment of current socioeconomic vulnerabilities to dengue fever in Cali, a tropical urban environment of Colombia. Based on a set of socioeconomic and demographic indicators derived from census data and ancillary geospatial datasets, we develop a spatial approach for both expert-based and purely statistical-based modeling of current vulnerability levels across 340 neighborhoods of the city using a Geographic Information System (GIS). The results of both approaches are comparatively evaluated by means of spatial statistics. A web-based approach is proposed to facilitate the visualization and the dissemination of the output vulnerability index to the community. The statistical and the expert-based modeling approach exhibit a high concordance, globally, and spatially. The expert-based approach indicates a slightly higher vulnerability mean (0.53) and vulnerability median (0.56) across all neighborhoods, compared to the purely statistical approach (mean = 0.48; median = 0.49). Both approaches reveal that high values of vulnerability tend to cluster in the eastern, north-eastern, and western part of the city. These are poor neighborhoods with high percentages of young (i.e., local expertise, statistical approaches could be used, with caution. By decomposing identified vulnerability "hotspots" into their underlying factors, our approach provides valuable information on both (1) the location of neighborhoods, and (2) vulnerability factors that should be given priority in the context of targeted intervention
Spherical Process Models for Global Spatial Statistics
Jeong, Jaehong
2017-11-28
Statistical models used in geophysical, environmental, and climate science applications must reflect the curvature of the spatial domain in global data. Over the past few decades, statisticians have developed covariance models that capture the spatial and temporal behavior of these global data sets. Though the geodesic distance is the most natural metric for measuring distance on the surface of a sphere, mathematical limitations have compelled statisticians to use the chordal distance to compute the covariance matrix in many applications instead, which may cause physically unrealistic distortions. Therefore, covariance functions directly defined on a sphere using the geodesic distance are needed. We discuss the issues that arise when dealing with spherical data sets on a global scale and provide references to recent literature. We review the current approaches to building process models on spheres, including the differential operator, the stochastic partial differential equation, the kernel convolution, and the deformation approaches. We illustrate realizations obtained from Gaussian processes with different covariance structures and the use of isotropic and nonstationary covariance models through deformations and geographical indicators for global surface temperature data. To assess the suitability of each method, we compare their log-likelihood values and prediction scores, and we end with a discussion of related research problems.
Statistical and machine learning approaches for network analysis
Dehmer, Matthias
2012-01-01
Explore the multidisciplinary nature of complex networks through machine learning techniques Statistical and Machine Learning Approaches for Network Analysis provides an accessible framework for structurally analyzing graphs by bringing together known and novel approaches on graph classes and graph measures for classification. By providing different approaches based on experimental data, the book uniquely sets itself apart from the current literature by exploring the application of machine learning techniques to various types of complex networks. Comprised of chapters written by internation
Araya, Takao; Kubo, Takuya; von Wirén, Nicolaus; Takahashi, Hideki
2016-03-01
Plant root development is strongly affected by nutrient availability. Despite the importance of structure and function of roots in nutrient acquisition, statistical modeling approaches to evaluate dynamic and temporal modulations of root system architecture in response to nutrient availability have remained as widely open and exploratory areas in root biology. In this study, we developed a statistical modeling approach to investigate modulations of root system architecture in response to nitrogen availability. Mathematical models were designed for quantitative assessment of root growth and root branching phenotypes and their dynamic relationships based on hierarchical configuration of primary and lateral roots formulating the fishbone-shaped root system architecture in Arabidopsis thaliana. Time-series datasets reporting dynamic changes in root developmental traits on different nitrate or ammonium concentrations were generated for statistical analyses. Regression analyses unraveled key parameters associated with: (i) inhibition of primary root growth under nitrogen limitation or on ammonium; (ii) rapid progression of lateral root emergence in response to ammonium; and (iii) inhibition of lateral root elongation in the presence of excess nitrate or ammonium. This study provides a statistical framework for interpreting dynamic modulation of root system architecture, supported by meta-analysis of datasets displaying morphological responses of roots to diverse nitrogen supplies. © 2015 Institute of Botany, Chinese Academy of Sciences.
Statistical analysis of questionnaires a unified approach based on R and Stata
Bartolucci, Francesco; Gnaldi, Michela
2015-01-01
Statistical Analysis of Questionnaires: A Unified Approach Based on R and Stata presents special statistical methods for analyzing data collected by questionnaires. The book takes an applied approach to testing and measurement tasks, mirroring the growing use of statistical methods and software in education, psychology, sociology, and other fields. It is suitable for graduate students in applied statistics and psychometrics and practitioners in education, health, and marketing.The book covers the foundations of classical test theory (CTT), test reliability, va
Classical model of intermediate statistics
International Nuclear Information System (INIS)
Kaniadakis, G.
1994-01-01
In this work we present a classical kinetic model of intermediate statistics. In the case of Brownian particles we show that the Fermi-Dirac (FD) and Bose-Einstein (BE) distributions can be obtained, just as the Maxwell-Boltzmann (MD) distribution, as steady states of a classical kinetic equation that intrinsically takes into account an exclusion-inclusion principle. In our model the intermediate statistics are obtained as steady states of a system of coupled nonlinear kinetic equations, where the coupling constants are the transmutational potentials η κκ' . We show that, besides the FD-BE intermediate statistics extensively studied from the quantum point of view, we can also study the MB-FD and MB-BE ones. Moreover, our model allows us to treat the three-state mixing FD-MB-BE intermediate statistics. For boson and fermion mixing in a D-dimensional space, we obtain a family of FD-BE intermediate statistics by varying the transmutational potential η BF . This family contains, as a particular case when η BF =0, the quantum statistics recently proposed by L. Wu, Z. Wu, and J. Sun [Phys. Lett. A 170, 280 (1992)]. When we consider the two-dimensional FD-BE statistics, we derive an analytic expression of the fraction of fermions. When the temperature T→∞, the system is composed by an equal number of bosons and fermions, regardless of the value of η BF . On the contrary, when T=0, η BF becomes important and, according to its value, the system can be completely bosonic or fermionic, or composed both by bosons and fermions
INNOVATIVE APPROACH TO EDUCATION AND TEACHING OF STATISTICS
Directory of Open Access Journals (Sweden)
Andrea Jindrová
2010-06-01
Full Text Available Educational and tutorial programs are being developed together, with the changing world of information technology it is a necessary course to adapt to and accept new possibilities and needs. Use of online learning tools can amplify our teaching resources and create new types of learning opportunities that did not exist in the pre-Internet age. The world is full of information, which needs to be constantly updated. Virtualisation of studying materials enables us to update and manage them quickly and easily. As an advantage, we see an asynchronous approach towards learning materials that can be tailored for the students´ needs and adjusted according to their time and availability. The specificness of statistical learning lies in various statistical programs. The high technical demands of these programs require tutorials (instructional presentations, which can help students to learn how to use them efficiently. Instructional presentation may be understood as a demonstration of how the statistical software program works. This is one of the options that students may use to simplify the utilization of control and navigation through the statistical system. Thanks to instructional presentations, students will be able to transfer their theoretical statistical knowledge into practical situation and real life and, therefore, improve their personal development process. The goal of this tutorial is to show an innovative approach for learning of statistics in the Czech University of Life Sciences. The use of presentations and their benefits for students was evaluated according to results obtained from a questionnaire survey completed by students of the 4th grade of the Faculty of Economics and Management. The aim of this pilot survey was to evaluate the benefits of these instructional presentations, and the students interest in using them. The information obtained was used as essential data for the evaluation of the efficiency of this new approach. Firstly
International Nuclear Information System (INIS)
Ichinose, Shoichi
2010-01-01
A geometric approach to general quantum statistical systems (including the harmonic oscillator) is presented. It is applied to Casimir energy and the dissipative system with friction. We regard the (N+1)-dimensional Euclidean coordinate system (X i ,τ) as the quantum statistical system of N quantum (statistical) variables (X τ ) and one Euclidean time variable (t). Introducing paths (lines or hypersurfaces) in this space (X τ ,t), we adopt the path-integral method to quantize the mechanical system. This is a new view of (statistical) quantization of the mechanical system. The system Hamiltonian appears as the area. We show quantization is realized by the minimal area principle in the present geometric approach. When we take a line as the path, the path-integral expressions of the free energy are shown to be the ordinary ones (such as N harmonic oscillators) or their simple variation. When we take a hyper-surface as the path, the system Hamiltonian is given by the area of the hyper-surface which is defined as a closed-string configuration in the bulk space. In this case, the system becomes a O(N) non-linear model. We show the recently-proposed 5 dimensional Casimir energy (ArXiv:0801.3064,0812.1263) is valid. We apply this approach to the visco-elastic system, and present a new method using the path-integral for the calculation of the dissipative properties.
Statistical multi-model approach for performance assessment of cooling tower
International Nuclear Information System (INIS)
Pan, Tian-Hong; Shieh, Shyan-Shu; Jang, Shi-Shang; Tseng, Wen-Hung; Wu, Chan-Wei; Ou, Jenq-Jang
2011-01-01
This paper presents a data-driven model-based assessment strategy to investigate the performance of a cooling tower. In order to achieve this objective, the operations of a cooling tower are first characterized using a data-driven method, multiple models, which presents a set of local models in the format of linear equations. Satisfactory fuzzy c-mean clustering algorithm is used to classify operating data into several groups to build local models. The developed models are then applied to predict the performance of the system based on design input parameters provided by the manufacturer. The tower characteristics are also investigated using the proposed models via the effects of the water/air flow ratio. The predicted results tend to agree well with the calculated tower characteristics using actual measured operating data from an industrial plant. By comparison with the design characteristic curve provided by the manufacturer, the effectiveness of cooling tower can be obtained in the end. A case study conducted in a commercial plant demonstrates the validity of proposed approach. It should be noted that this is the first attempt to assess the cooling efficiency which is deviated from the original design value using operating data for an industrial scale process. Moreover, the evaluated process need not interrupt the normal operation of the cooling tower. This should be of particular interest in industrial applications.
Evolutionary modeling-based approach for model errors correction
Directory of Open Access Journals (Sweden)
S. Q. Wan
2012-08-01
Full Text Available The inverse problem of using the information of historical data to estimate model errors is one of the science frontier research topics. In this study, we investigate such a problem using the classic Lorenz (1963 equation as a prediction model and the Lorenz equation with a periodic evolutionary function as an accurate representation of reality to generate "observational data."
On the basis of the intelligent features of evolutionary modeling (EM, including self-organization, self-adaptive and self-learning, the dynamic information contained in the historical data can be identified and extracted by computer automatically. Thereby, a new approach is proposed to estimate model errors based on EM in the present paper. Numerical tests demonstrate the ability of the new approach to correct model structural errors. In fact, it can actualize the combination of the statistics and dynamics to certain extent.
Urban pavement surface temperature. Comparison of numerical and statistical approach
Marchetti, Mario; Khalifa, Abderrahmen; Bues, Michel; Bouilloud, Ludovic; Martin, Eric; Chancibaut, Katia
2015-04-01
The forecast of pavement surface temperature is very specific in the context of urban winter maintenance. to manage snow plowing and salting of roads. Such forecast mainly relies on numerical models based on a description of the energy balance between the atmosphere, the buildings and the pavement, with a canyon configuration. Nevertheless, there is a specific need in the physical description and the numerical implementation of the traffic in the energy flux balance. This traffic was originally considered as a constant. Many changes were performed in a numerical model to describe as accurately as possible the traffic effects on this urban energy balance, such as tires friction, pavement-air exchange coefficient, and infrared flux neat balance. Some experiments based on infrared thermography and radiometry were then conducted to quantify the effect fo traffic on urban pavement surface. Based on meteorological data, corresponding pavement temperature forecast were calculated and were compared with fiels measurements. Results indicated a good agreement between the forecast from the numerical model based on this energy balance approach. A complementary forecast approach based on principal component analysis (PCA) and partial least-square regression (PLS) was also developed, with data from thermal mapping usng infrared radiometry. The forecast of pavement surface temperature with air temperature was obtained in the specific case of urban configurtation, and considering traffic into measurements used for the statistical analysis. A comparison between results from the numerical model based on energy balance, and PCA/PLS was then conducted, indicating the advantages and limits of each approach.
Statistical modelling of networked human-automation performance using working memory capacity.
Ahmed, Nisar; de Visser, Ewart; Shaw, Tyler; Mohamed-Ameen, Amira; Campbell, Mark; Parasuraman, Raja
2014-01-01
This study examines the challenging problem of modelling the interaction between individual attentional limitations and decision-making performance in networked human-automation system tasks. Analysis of real experimental data from a task involving networked supervision of multiple unmanned aerial vehicles by human participants shows that both task load and network message quality affect performance, but that these effects are modulated by individual differences in working memory (WM) capacity. These insights were used to assess three statistical approaches for modelling and making predictions with real experimental networked supervisory performance data: classical linear regression, non-parametric Gaussian processes and probabilistic Bayesian networks. It is shown that each of these approaches can help designers of networked human-automated systems cope with various uncertainties in order to accommodate future users by linking expected operating conditions and performance from real experimental data to observable cognitive traits like WM capacity. Practitioner Summary: Working memory (WM) capacity helps account for inter-individual variability in operator performance in networked unmanned aerial vehicle supervisory tasks. This is useful for reliable performance prediction near experimental conditions via linear models; robust statistical prediction beyond experimental conditions via Gaussian process models and probabilistic inference about unknown task conditions/WM capacities via Bayesian network models.
Statistical Model of Extreme Shear
DEFF Research Database (Denmark)
Hansen, Kurt Schaldemose; Larsen, Gunner Chr.
2005-01-01
In order to continue cost-optimisation of modern large wind turbines, it is important to continuously increase the knowledge of wind field parameters relevant to design loads. This paper presents a general statistical model that offers site-specific prediction of the probability density function...... by a model that, on a statistically consistent basis, describes the most likely spatial shape of an extreme wind shear event. Predictions from the model have been compared with results from an extreme value data analysis, based on a large number of full-scale measurements recorded with a high sampling rate...
Straub, Annette; Beck, Christoph; Breitner, Susanne; Cyrys, Josef; Geruschkat, Uta; Jacobeit, Jucundus; Kühlbach, Benjamin; Kusch, Thomas; Richter, Katja; Schneider, Alexandra; Umminger, Robin; Wolf, Kathrin
2017-04-01
Frequently spatial variations of air temperature of considerable magnitude occur within urban areas. They correspond to varying land use/land cover characteristics and vary with season, time of day and synoptic conditions. These temperature differences have an impact on human health and comfort directly by inducing thermal stress as well as indirectly by means of affecting air quality. Therefore, knowledge of the spatial patterns of air temperature in cities and the factors causing them is of great importance, e.g. for urban planners. A multitude of studies have shown statistical modelling to be a suitable tool for generating spatial air temperature patterns. This contribution presents a comparison of different statistical modelling approaches for deriving spatial air temperature patterns in the urban environment of Augsburg, Southern Germany. In Augsburg there exists a measurement network for air temperature and humidity currently comprising 48 stations in the city and its rural surroundings (corporately operated by the Institute of Epidemiology II, Helmholtz Zentrum München, German Research Center for Environmental Health and the Institute of Geography, University of Augsburg). Using different datasets for land surface characteristics (Open Street Map, Urban Atlas) area percentages of different types of land cover were calculated for quadratic buffer zones of different size (25, 50, 100, 250, 500 m) around the stations as well for source regions of advective air flow and used as predictors together with additional variables such as sky view factor, ground level and distance from the city centre. Multiple Linear Regression and Random Forest models for different situations taking into account season, time of day and weather condition were applied utilizing selected subsets of these predictors in order to model spatial distributions of mean hourly and daily air temperature deviations from a rural reference station. Furthermore, the different model setups were
Aspects of statistical model for multifragmentation
International Nuclear Information System (INIS)
Bhattacharyya, P.; Das Gupta, S.; Mekjian, A. Z.
1999-01-01
We deal with two different aspects of an exactly soluble statistical model of fragmentation. First we show, using zero range force and finite temperature Thomas-Fermi theory, that a common link can be found between finite temperature mean field theory and the statistical fragmentation model. We show the latter naturally arises in the spinodal region. Next we show that although the exact statistical model is a canonical model and uses temperature, microcanonical results which use constant energy rather than constant temperature can also be obtained from the canonical model using saddle-point approximation. The methodology is extremely simple to implement and at least in all the examples studied in this work is very accurate. (c) 1999 The American Physical Society
Statistical Compression for Climate Model Output
Hammerling, D.; Guinness, J.; Soh, Y. J.
2017-12-01
Numerical climate model simulations run at high spatial and temporal resolutions generate massive quantities of data. As our computing capabilities continue to increase, storing all of the data is not sustainable, and thus is it important to develop methods for representing the full datasets by smaller compressed versions. We propose a statistical compression and decompression algorithm based on storing a set of summary statistics as well as a statistical model describing the conditional distribution of the full dataset given the summary statistics. We decompress the data by computing conditional expectations and conditional simulations from the model given the summary statistics. Conditional expectations represent our best estimate of the original data but are subject to oversmoothing in space and time. Conditional simulations introduce realistic small-scale noise so that the decompressed fields are neither too smooth nor too rough compared with the original data. Considerable attention is paid to accurately modeling the original dataset-one year of daily mean temperature data-particularly with regard to the inherent spatial nonstationarity in global fields, and to determining the statistics to be stored, so that the variation in the original data can be closely captured, while allowing for fast decompression and conditional emulation on modest computers.
Automated statistical modeling of analytical measurement systems
International Nuclear Information System (INIS)
Jacobson, J.J.
1992-01-01
The statistical modeling of analytical measurement systems at the Idaho Chemical Processing Plant (ICPP) has been completely automated through computer software. The statistical modeling of analytical measurement systems is one part of a complete quality control program used by the Remote Analytical Laboratory (RAL) at the ICPP. The quality control program is an integration of automated data input, measurement system calibration, database management, and statistical process control. The quality control program and statistical modeling program meet the guidelines set forth by the American Society for Testing Materials and American National Standards Institute. A statistical model is a set of mathematical equations describing any systematic bias inherent in a measurement system and the precision of a measurement system. A statistical model is developed from data generated from the analysis of control standards. Control standards are samples which are made up at precise known levels by an independent laboratory and submitted to the RAL. The RAL analysts who process control standards do not know the values of those control standards. The object behind statistical modeling is to describe real process samples in terms of their bias and precision and, to verify that a measurement system is operating satisfactorily. The processing of control standards gives us this ability
A Tensor Statistical Model for Quantifying Dynamic Functional Connectivity.
Zhu, Yingying; Zhu, Xiaofeng; Kim, Minjeong; Yan, Jin; Wu, Guorong
2017-06-01
Functional connectivity (FC) has been widely investigated in many imaging-based neuroscience and clinical studies. Since functional Magnetic Resonance Image (MRI) signal is just an indirect reflection of brain activity, it is difficult to accurately quantify the FC strength only based on signal correlation. To address this limitation, we propose a learning-based tensor model to derive high sensitivity and specificity connectome biomarkers at the individual level from resting-state fMRI images. First, we propose a learning-based approach to estimate the intrinsic functional connectivity. In addition to the low level region-to-region signal correlation, latent module-to-module connection is also estimated and used to provide high level heuristics for measuring connectivity strength. Furthermore, sparsity constraint is employed to automatically remove the spurious connections, thus alleviating the issue of searching for optimal threshold. Second, we integrate our learning-based approach with the sliding-window technique to further reveal the dynamics of functional connectivity. Specifically, we stack the functional connectivity matrix within each sliding window and form a 3D tensor where the third dimension denotes for time. Then we obtain dynamic functional connectivity (dFC) for each individual subject by simultaneously estimating the within-sliding-window functional connectivity and characterizing the across-sliding-window temporal dynamics. Third, in order to enhance the robustness of the connectome patterns extracted from dFC, we extend the individual-based 3D tensors to a population-based 4D tensor (with the fourth dimension stands for the training subjects) and learn the statistics of connectome patterns via 4D tensor analysis. Since our 4D tensor model jointly (1) optimizes dFC for each training subject and (2) captures the principle connectome patterns, our statistical model gains more statistical power of representing new subject than current state
Inverse statistical approach in heartbeat time series
International Nuclear Information System (INIS)
Ebadi, H; Shirazi, A H; Mani, Ali R; Jafari, G R
2011-01-01
We present an investigation on heart cycle time series, using inverse statistical analysis, a concept borrowed from studying turbulence. Using this approach, we studied the distribution of the exit times needed to achieve a predefined level of heart rate alteration. Such analysis uncovers the most likely waiting time needed to reach a certain change in the rate of heart beat. This analysis showed a significant difference between the raw data and shuffled data, when the heart rate accelerates or decelerates to a rare event. We also report that inverse statistical analysis can distinguish between the electrocardiograms taken from healthy volunteers and patients with heart failure
Model output statistics applied to wind power prediction
Energy Technology Data Exchange (ETDEWEB)
Joensen, A; Giebel, G; Landberg, L [Risoe National Lab., Roskilde (Denmark); Madsen, H; Nielsen, H A [The Technical Univ. of Denmark, Dept. of Mathematical Modelling, Lyngby (Denmark)
1999-03-01
Being able to predict the output of a wind farm online for a day or two in advance has significant advantages for utilities, such as better possibility to schedule fossil fuelled power plants and a better position on electricity spot markets. In this paper prediction methods based on Numerical Weather Prediction (NWP) models are considered. The spatial resolution used in NWP models implies that these predictions are not valid locally at a specific wind farm. Furthermore, due to the non-stationary nature and complexity of the processes in the atmosphere, and occasional changes of NWP models, the deviation between the predicted and the measured wind will be time dependent. If observational data is available, and if the deviation between the predictions and the observations exhibits systematic behavior, this should be corrected for; if statistical methods are used, this approaches is usually referred to as MOS (Model Output Statistics). The influence of atmospheric turbulence intensity, topography, prediction horizon length and auto-correlation of wind speed and power is considered, and to take the time-variations into account, adaptive estimation methods are applied. Three estimation techniques are considered and compared, Extended Kalman Filtering, recursive least squares and a new modified recursive least squares algorithm. (au) EU-JOULE-3. 11 refs.
Paradigms and pragmatism: approaches to medical statistics.
Healy, M J
2000-01-01
Until recently, the dominant philosophy of science was that due to Karl Popper, with its doctrine that the proper task of science was the formulation of hypotheses followed by attempts at refuting them. In spite of the close analogy with significance testing, these ideas do not fit well with the practice of medical statistics. The same can be said of the later philosophy of Thomas Kuhn, who maintains that science proceeds by way of revolutionary upheavals separated by periods of relatively pedestrian research which are governed by what Kuhn refers to as paradigms. Through there have been paradigm shifts in the history of statistics, a degree of continuity can also be discerned. A current paradigm shift is embodied in the spread of Bayesian ideas. It may be that a future paradigm will emphasise the pragmatic approach to statistics that is associated with the name of Daniel Schwartz.
A statistical mechanics approach to Granovetter theory
Barra, Adriano; Agliari, Elena
2012-05-01
In this paper we try to bridge breakthroughs in quantitative sociology/econometrics, pioneered during the last decades by Mac Fadden, Brock-Durlauf, Granovetter and Watts-Strogatz, by introducing a minimal model able to reproduce essentially all the features of social behavior highlighted by these authors. Our model relies on a pairwise Hamiltonian for decision-maker interactions which naturally extends the multi-populations approaches by shifting and biasing the pattern definitions of a Hopfield model of neural networks. Once introduced, the model is investigated through graph theory (to recover Granovetter and Watts-Strogatz results) and statistical mechanics (to recover Mac-Fadden and Brock-Durlauf results). Due to the internal symmetries of our model, the latter is obtained as the relaxation of a proper Markov process, allowing even to study its out-of-equilibrium properties. The method used to solve its equilibrium is an adaptation of the Hamilton-Jacobi technique recently introduced by Guerra in the spin-glass scenario and the picture obtained is the following: shifting the patterns from [-1,+1]→[0.+1] implies that the larger the amount of similarities among decision makers, the stronger their relative influence, and this is enough to explain both the different role of strong and weak ties in the social network as well as its small-world properties. As a result, imitative interaction strengths seem essentially a robust request (enough to break the gauge symmetry in the couplings), furthermore, this naturally leads to a discrete choice modelization when dealing with the external influences and to imitative behavior à la Curie-Weiss as the one introduced by Brock and Durlauf.
Students' Attitudes toward Statistics across the Disciplines: A Mixed-Methods Approach
Griffith, James D.; Adams, Lea T.; Gu, Lucy L.; Hart, Christian L.; Nichols-Whitehead, Penney
2012-01-01
Students' attitudes toward statistics were investigated using a mixed-methods approach including a discovery-oriented qualitative methodology among 684 undergraduate students across business, criminal justice, and psychology majors where at least one course in statistics was required. Students were asked about their attitudes toward statistics and…
Data and Dynamics Driven Approaches for Modelling and Forecasting the Red Sea Chlorophyll
Dreano, Denis
2017-01-01
concentration and have practical applications for fisheries operation and harmful algae blooms monitoring. Modelling approaches can be divided between physics- driven (dynamical) approaches, and data-driven (statistical) approaches. Dynamical models are based
The cybernetic-statistical approach to the search for substances with pre-set properties
International Nuclear Information System (INIS)
Savitskij, E.M.; Kiseleva, N.N.; Shkatova, T.M.
1982-01-01
Using the cybernetic methods the forecast for a new phase Agsub(x)Mosub(6)Ssub(8), the data on which have not been used when computer learning, is given. Its Tsub(c) is evaluated and optimum contents of silver and conditions of phase obtaining with Tsub(c) somewhat higher than it is mentioned in literature for the same ratio of molybdenum and sulphur in the compound are found using statistical methods. The results obtained when solving the model task of the search for new ternary superconducting phases Asub(x)Bsub(6)Ssub(8) permitted to make a conclusion on correctness of the cybernetic-statistical approach suggested [ru
Sensometrics: Thurstonian and Statistical Models
DEFF Research Database (Denmark)
Christensen, Rune Haubo Bojesen
. sensR is a package for sensory discrimination testing with Thurstonian models and ordinal supports analysis of ordinal data with cumulative link (mixed) models. While sensR is closely connected to the sensometrics field, the ordinal package has developed into a generic statistical package applicable......This thesis is concerned with the development and bridging of Thurstonian and statistical models for sensory discrimination testing as applied in the scientific discipline of sensometrics. In sensory discrimination testing sensory differences between products are detected and quantified by the use...... and sensory discrimination testing in particular in a series of papers by advancing Thurstonian models for a range of sensory discrimination protocols in addition to facilitating their application by providing software for fitting these models. The main focus is on identifying Thurstonian models...
Statistical modelling for social researchers principles and practice
Tarling, Roger
2008-01-01
This book explains the principles and theory of statistical modelling in an intelligible way for the non-mathematical social scientist looking to apply statistical modelling techniques in research. The book also serves as an introduction for those wishing to develop more detailed knowledge and skills in statistical modelling. Rather than present a limited number of statistical models in great depth, the aim is to provide a comprehensive overview of the statistical models currently adopted in social research, in order that the researcher can make appropriate choices and select the most suitable model for the research question to be addressed. To facilitate application, the book also offers practical guidance and instruction in fitting models using SPSS and Stata, the most popular statistical computer software which is available to most social researchers. Instruction in using MLwiN is also given. Models covered in the book include; multiple regression, binary, multinomial and ordered logistic regression, log-l...
Statistical modeling of static strengths of nuclear graphites with relevance to structural design
International Nuclear Information System (INIS)
Arai, Taketoshi
1992-02-01
Use of graphite materials for structural members poses a problem as to how to take into account of statistical properties of static strength, especially tensile fracture stresses, in component structural design. The present study concerns comprehensive examinations on statistical data base and modelings on nuclear graphites. First, the report provides individual samples and their analyses on strengths of IG-110 and PGX graphites for HTTR components. Those statistical characteristics on other HTGR graphites are also exemplified from the literature. Most of statistical distributions of individual samples are found to be approximately normal. The goodness of fit to normal distributions is more satisfactory with larger sample sizes. Molded and extruded graphites, however, possess a variety of statistical properties depending of samples from different with-in-log locations and/or different orientations. Second, the previous statistical models including the Weibull theory are assessed from the viewpoint of applicability to design procedures. This leads to a conclusion that the Weibull theory and its modified ones are satisfactory only for limited parts of tensile fracture behavior. They are not consistent for whole observations. Only normal statistics are justifiable as practical approaches to discuss specified minimum ultimate strengths as statistical confidence limits for individual samples. Third, the assessment of various statistical models emphasizes the need to develop advanced analytical ones which should involve modeling of microstructural features of actual graphite materials. Improvements of other structural design methodologies are also presented. (author)
Li, Changyang; Wang, Xiuying; Eberl, Stefan; Fulham, Michael; Yin, Yong; Dagan Feng, David
2015-01-01
Automated and general medical image segmentation can be challenging because the foreground and the background may have complicated and overlapping density distributions in medical imaging. Conventional region-based level set algorithms often assume piecewise constant or piecewise smooth for segments, which are implausible for general medical image segmentation. Furthermore, low contrast and noise make identification of the boundaries between foreground and background difficult for edge-based level set algorithms. Thus, to address these problems, we suggest a supervised variational level set segmentation model to harness the statistical region energy functional with a weighted probability approximation. Our approach models the region density distributions by using the mixture-of-mixtures Gaussian model to better approximate real intensity distributions and distinguish statistical intensity differences between foreground and background. The region-based statistical model in our algorithm can intuitively provide better performance on noisy images. We constructed a weighted probability map on graphs to incorporate spatial indications from user input with a contextual constraint based on the minimization of contextual graphs energy functional. We measured the performance of our approach on ten noisy synthetic images and 58 medical datasets with heterogeneous intensities and ill-defined boundaries and compared our technique to the Chan-Vese region-based level set model, the geodesic active contour model with distance regularization, and the random walker model. Our method consistently achieved the highest Dice similarity coefficient when compared to the other methods.
Predicting energy performance of a net-zero energy building: A statistical approach
International Nuclear Information System (INIS)
Kneifel, Joshua; Webb, David
2016-01-01
Highlights: • A regression model is applied to actual energy data from a net-zero energy building. • The model is validated through a rigorous statistical analysis. • Comparisons are made between model predictions and those of a physics-based model. • The model is a viable baseline for evaluating future models from the energy data. - Abstract: Performance-based building requirements have become more prevalent because it gives freedom in building design while still maintaining or exceeding the energy performance required by prescriptive-based requirements. In order to determine if building designs reach target energy efficiency improvements, it is necessary to estimate the energy performance of a building using predictive models and different weather conditions. Physics-based whole building energy simulation modeling is the most common approach. However, these physics-based models include underlying assumptions and require significant amounts of information in order to specify the input parameter values. An alternative approach to test the performance of a building is to develop a statistically derived predictive regression model using post-occupancy data that can accurately predict energy consumption and production based on a few common weather-based factors, thus requiring less information than simulation models. A regression model based on measured data should be able to predict energy performance of a building for a given day as long as the weather conditions are similar to those during the data collection time frame. This article uses data from the National Institute of Standards and Technology (NIST) Net-Zero Energy Residential Test Facility (NZERTF) to develop and validate a regression model to predict the energy performance of the NZERTF using two weather variables aggregated to the daily level, applies the model to estimate the energy performance of hypothetical NZERTFs located in different cities in the Mixed-Humid Climate Zone, and compares these
UPPAAL-SMC: Statistical Model Checking for Priced Timed Automata
DEFF Research Database (Denmark)
Bulychev, Petr; David, Alexandre; Larsen, Kim Guldstrand
2012-01-01
on a series of extensions of the statistical model checking approach generalized to handle real-time systems and estimate undecidable problems. U PPAAL - SMC comes together with a friendly user interface that allows a user to specify complex problems in an efficient manner as well as to get feedback...... in the form of probability distributions and compare probabilities to analyze performance aspects of systems. The focus of the survey is on the evolution of the tool – including modeling and specification formalisms as well as techniques applied – together with applications of the tool to case studies....
International Nuclear Information System (INIS)
Koyumdjieva, N.
2006-01-01
A statistical model for the resonant cross section structure in the Unresolved Resonance Region has been developed in the framework of the R-matrix formalism in Reich Moore approach with effective accounting of the resonance parameters fluctuations. The model uses only the average resonance parameters and can be effectively applied for analyses of cross sections functional, averaged over many resonances. Those are cross section moments, transmission and self-indication functions measured through thick sample. In this statistical model the resonant cross sections structure is accepted to be periodic and the R-matrix is a function of ε=E/D with period 0≤ε≤N; R nc (ε)=π/2√(S n *S c )1/NΣ(i=1,N)(β in *β ic *ctg[π(ε i - = ε-iS i )/N]; Here S n ,S c ,S i is respectively neutron strength function, strength function for fission or inelastic channel and strength function for radiative capture, N is the number of resonances (ε i ,β i ) that obey the statistic of Porter-Thomas and Wigner's one. The simple case of this statistical model concerns the resonant cross section structure for non-fissile nuclei under the threshold for inelastic scattering - the model of the characteristic function with HARFOR program. In the above model some improvements of calculation of the phases and logarithmic derivatives of neutron channels have been done. In the parameterization we use the free parameter R l ∞ , which accounts the influence of long-distant resonances. The above scheme for statistical modelling of the resonant cross section structure has been applied for evaluation of experimental data for total, capture and inelastic cross sections for 232 Th in the URR (4-150) keV and also the transmission and self-indication functions in (4-175) keV. The set of evaluated average resonance parameters have been obtained. The evaluated average resonance parameters in the URR are consistent with those in the Resolved Resonance Region (CRP for Th-U cycle, Vienna, 2006
Statistical modeling of urban air temperature distributions under different synoptic conditions
Beck, Christoph; Breitner, Susanne; Cyrys, Josef; Hald, Cornelius; Hartz, Uwe; Jacobeit, Jucundus; Richter, Katja; Schneider, Alexandra; Wolf, Kathrin
2015-04-01
Within urban areas air temperature may vary distinctly between different locations. These intra-urban air temperature variations partly reach magnitudes that are relevant with respect to human thermal comfort. Therefore and furthermore taking into account potential interrelations with other health related environmental factors (e.g. air quality) it is important to estimate spatial patterns of intra-urban air temperature distributions that may be incorporated into urban planning processes. In this contribution we present an approach to estimate spatial temperature distributions in the urban area of Augsburg (Germany) by means of statistical modeling. At 36 locations in the urban area of Augsburg air temperatures are measured with high temporal resolution (4 min.) since December 2012. These 36 locations represent different typical urban land use characteristics in terms of varying percentage coverages of different land cover categories (e.g. impervious, built-up, vegetated). Percentage coverages of these land cover categories have been extracted from different sources (Open Street Map, European Urban Atlas, Urban Morphological Zones) for regular grids of varying size (50, 100, 200 meter horizonal resolution) for the urban area of Augsburg. It is well known from numerous studies that land use characteristics have a distinct influence on air temperature and as well other climatic variables at a certain location. Therefore air temperatures at the 36 locations are modeled utilizing land use characteristics (percentage coverages of land cover categories) as predictor variables in Stepwise Multiple Regression models and in Random Forest based model approaches. After model evaluation via cross-validation appropriate statistical models are applied to gridded land use data to derive spatial urban air temperature distributions. Varying models are tested and applied for different seasons and times of the day and also for different synoptic conditions (e.g. clear and calm
Statistical model for OCT image denoising
Li, Muxingzi
2017-08-01
Optical coherence tomography (OCT) is a non-invasive technique with a large array of applications in clinical imaging and biological tissue visualization. However, the presence of speckle noise affects the analysis of OCT images and their diagnostic utility. In this article, we introduce a new OCT denoising algorithm. The proposed method is founded on a numerical optimization framework based on maximum-a-posteriori estimate of the noise-free OCT image. It combines a novel speckle noise model, derived from local statistics of empirical spectral domain OCT (SD-OCT) data, with a Huber variant of total variation regularization for edge preservation. The proposed approach exhibits satisfying results in terms of speckle noise reduction as well as edge preservation, at reduced computational cost.
Statistical power analysis a simple and general model for traditional and modern hypothesis tests
Murphy, Kevin R; Wolach, Allen
2014-01-01
Noted for its accessible approach, this text applies the latest approaches of power analysis to both null hypothesis and minimum-effect testing using the same basic unified model. Through the use of a few simple procedures and examples, the authors show readers with little expertise in statistical analysis how to obtain the values needed to carry out the power analysis for their research. Illustrations of how these analyses work and how they can be used to choose the appropriate criterion for defining statistically significant outcomes are sprinkled throughout. The book presents a simple and g
Simple statistical model for branched aggregates
DEFF Research Database (Denmark)
Lemarchand, Claire; Hansen, Jesper Schmidt
2015-01-01
, given that it already has bonds with others. The model is applied here to asphaltene nanoaggregates observed in molecular dynamics simulations of Cooee bitumen. The variation with temperature of the probabilities deduced from this model is discussed in terms of statistical mechanics arguments....... The relevance of the statistical model in the case of asphaltene nanoaggregates is checked by comparing the predicted value of the probability for one molecule to have exactly i bonds with the same probability directly measured in the molecular dynamics simulations. The agreement is satisfactory......We propose a statistical model that can reproduce the size distribution of any branched aggregate, including amylopectin, dendrimers, molecular clusters of monoalcohols, and asphaltene nanoaggregates. It is based on the conditional probability for one molecule to form a new bond with a molecule...
Matrix Tricks for Linear Statistical Models
Puntanen, Simo; Styan, George PH
2011-01-01
In teaching linear statistical models to first-year graduate students or to final-year undergraduate students there is no way to proceed smoothly without matrices and related concepts of linear algebra; their use is really essential. Our experience is that making some particular matrix tricks very familiar to students can substantially increase their insight into linear statistical models (and also multivariate statistical analysis). In matrix algebra, there are handy, sometimes even very simple "tricks" which simplify and clarify the treatment of a problem - both for the student and
a Statistical Dynamic Approach to Structural Evolution of Complex Capital Market Systems
Shao, Xiao; Chai, Li H.
As an important part of modern financial systems, capital market has played a crucial role on diverse social resource allocations and economical exchanges. Beyond traditional models and/or theories based on neoclassical economics, considering capital markets as typical complex open systems, this paper attempts to develop a new approach to overcome some shortcomings of the available researches. By defining the generalized entropy of capital market systems, a theoretical model and nonlinear dynamic equation on the operations of capital market are proposed from statistical dynamic perspectives. The US security market from 1995 to 2001 is then simulated and analyzed as a typical case. Some instructive results are discussed and summarized.
Six sigma for organizational excellence a statistical approach
Muralidharan, K
2015-01-01
This book discusses the integrated concepts of statistical quality engineering and management tools. It will help readers to understand and apply the concepts of quality through project management and technical analysis, using statistical methods. Prepared in a ready-to-use form, the text will equip practitioners to implement the Six Sigma principles in projects. The concepts discussed are all critically assessed and explained, allowing them to be practically applied in managerial decision-making, and in each chapter, the objectives and connections to the rest of the work are clearly illustrated. To aid in understanding, the book includes a wealth of tables, graphs, descriptions and checklists, as well as charts and plots, worked-out examples and exercises. Perhaps the most unique feature of the book is its approach, using statistical tools, to explain the science behind Six Sigma project management and integrated in engineering concepts. The material on quality engineering and statistical management tools of...
Statistical Modelling of Wind Proles - Data Analysis and Modelling
DEFF Research Database (Denmark)
Jónsson, Tryggvi; Pinson, Pierre
The aim of the analysis presented in this document is to investigate whether statistical models can be used to make very short-term predictions of wind profiles.......The aim of the analysis presented in this document is to investigate whether statistical models can be used to make very short-term predictions of wind profiles....
Statistical bootstrap approach to hadronic matter and multiparticle reactions
International Nuclear Information System (INIS)
Ilgenfritz, E.M.; Kripfganz, J.; Moehring, H.J.
1977-01-01
The authors present the main ideas behind the statistical bootstrap model and recent developments within this model related to the description of fireball cascade decay. Mathematical methods developed in this model might be useful in other phenomenological schemes of strong interaction physics; they are described in detail. The present status of applications of the model to various hadronic reactions is discussed. When discussing the relations of the statistical bootstrap model to other models of hadron physics the authors point out possibly fruitful analogies and dynamical mechanisms which are modelled by the bootstrap dynamics under definite conditions. This offers interpretations for the critical temperature typical for the model and indicates futher fields of application. (author)
Linear mixed-effects models for central statistical monitoring of multicenter clinical trials
Desmet, L.; Venet, D.; Doffagne, E.; Timmermans, C.; BURZYKOWSKI, Tomasz; LEGRAND, Catherine; BUYSE, Marc
2014-01-01
Multicenter studies are widely used to meet accrual targets in clinical trials. Clinical data monitoring is required to ensure the quality and validity of the data gathered across centers. One approach to this end is central statistical monitoring, which aims at detecting atypical patterns in the data by means of statistical methods. In this context, we consider the simple case of a continuous variable, and we propose a detection procedure based on a linear mixed-effects model to detect locat...
Gridded Calibration of Ensemble Wind Vector Forecasts Using Ensemble Model Output Statistics
Lazarus, S. M.; Holman, B. P.; Splitt, M. E.
2017-12-01
A computationally efficient method is developed that performs gridded post processing of ensemble wind vector forecasts. An expansive set of idealized WRF model simulations are generated to provide physically consistent high resolution winds over a coastal domain characterized by an intricate land / water mask. Ensemble model output statistics (EMOS) is used to calibrate the ensemble wind vector forecasts at observation locations. The local EMOS predictive parameters (mean and variance) are then spread throughout the grid utilizing flow-dependent statistical relationships extracted from the downscaled WRF winds. Using data withdrawal and 28 east central Florida stations, the method is applied to one year of 24 h wind forecasts from the Global Ensemble Forecast System (GEFS). Compared to the raw GEFS, the approach improves both the deterministic and probabilistic forecast skill. Analysis of multivariate rank histograms indicate the post processed forecasts are calibrated. Two downscaling case studies are presented, a quiescent easterly flow event and a frontal passage. Strengths and weaknesses of the approach are presented and discussed.
Statistical algebraic approach to quantum mechanics
International Nuclear Information System (INIS)
Slavnov, D.A.
2001-01-01
The scheme for plotting the quantum theory with application of the statistical algebraic approach is proposed. The noncommutative algebra elements (observed ones) and nonlinear functionals on this algebra (physical state) are used as the primary constituents. The latter ones are associated with the single-unit measurement results. Certain physical state groups are proposed to consider as quantum states of the standard quantum mechanics. It is shown that the mathematical apparatus of the standard quantum mechanics may be reproduced in such a scheme in full volume [ru
Estimating Predictive Variance for Statistical Gas Distribution Modelling
International Nuclear Information System (INIS)
Lilienthal, Achim J.; Asadi, Sahar; Reggente, Matteo
2009-01-01
Recent publications in statistical gas distribution modelling have proposed algorithms that model mean and variance of a distribution. This paper argues that estimating the predictive concentration variance entails not only a gradual improvement but is rather a significant step to advance the field. This is, first, since the models much better fit the particular structure of gas distributions, which exhibit strong fluctuations with considerable spatial variations as a result of the intermittent character of gas dispersal. Second, because estimating the predictive variance allows to evaluate the model quality in terms of the data likelihood. This offers a solution to the problem of ground truth evaluation, which has always been a critical issue for gas distribution modelling. It also enables solid comparisons of different modelling approaches, and provides the means to learn meta parameters of the model, to determine when the model should be updated or re-initialised, or to suggest new measurement locations based on the current model. We also point out directions of related ongoing or potential future research work.
Statistical Analysis of fMRI Time-Series: A Critical Review of the GLM Approach
Directory of Open Access Journals (Sweden)
Martin M Monti
2011-03-01
Full Text Available Functional Magnetic Resonance Imaging (fMRI is one of the most widely used tools to study the neural underpinnings of human cognition. Standard analysis of fMRI data relies on a General Linear Model (GLM approach to separate stimulus induced signals from noise. Crucially, this approach relies on a number of assumptions about the data which, for inferences to be valid, must be met. The current paper reviews the GLM approach to analysis of fMRI time-series, focusing in particular on the degree to which such data abides by the assumptions of the GLM framework, and on the methods that have been developed to correct for any violation of those assumptions. Rather than biasing estimates of effect size, the major consequence of non-conformity to the assumptions is to introduce bias into estimates of the variance, thus affecting test statistics, power and false positive rates. Furthermore, this bias can have pervasive effects on both individual subject and group-level statistics, potentially yielding qualitatively different results across replications, especially after the thresholding procedures commonly used for inference-making.
Statistical physics of pairwise probability models
DEFF Research Database (Denmark)
Roudi, Yasser; Aurell, Erik; Hertz, John
2009-01-01
(dansk abstrakt findes ikke) Statistical models for describing the probability distribution over the states of biological systems are commonly used for dimensional reduction. Among these models, pairwise models are very attractive in part because they can be fit using a reasonable amount of data......: knowledge of the means and correlations between pairs of elements in the system is sufficient. Not surprisingly, then, using pairwise models for studying neural data has been the focus of many studies in recent years. In this paper, we describe how tools from statistical physics can be employed for studying...
Experiential Approach to Teaching Statistics and Research Methods ...
African Journals Online (AJOL)
Statistics and research methods are among the more demanding topics for students of education to master at both the undergraduate and postgraduate levels. It is our conviction that teaching these topics should be combined with real practical experiences. We discuss an experiential teaching/ learning approach that ...
Statistical Approaches to Assess Biosimilarity from Analytical Data.
Burdick, Richard; Coffey, Todd; Gutka, Hiten; Gratzl, Gyöngyi; Conlon, Hugh D; Huang, Chi-Ting; Boyne, Michael; Kuehne, Henriette
2017-01-01
Protein therapeutics have unique critical quality attributes (CQAs) that define their purity, potency, and safety. The analytical methods used to assess CQAs must be able to distinguish clinically meaningful differences in comparator products, and the most important CQAs should be evaluated with the most statistical rigor. High-risk CQA measurements assess the most important attributes that directly impact the clinical mechanism of action or have known implications for safety, while the moderate- to low-risk characteristics may have a lower direct impact and thereby may have a broader range to establish similarity. Statistical equivalence testing is applied for high-risk CQA measurements to establish the degree of similarity (e.g., highly similar fingerprint, highly similar, or similar) of selected attributes. Notably, some high-risk CQAs (e.g., primary sequence or disulfide bonding) are qualitative (e.g., the same as the originator or not the same) and therefore not amenable to equivalence testing. For biosimilars, an important step is the acquisition of a sufficient number of unique originator drug product lots to measure the variability in the originator drug manufacturing process and provide sufficient statistical power for the analytical data comparisons. Together, these analytical evaluations, along with PK/PD and safety data (immunogenicity), provide the data necessary to determine if the totality of the evidence warrants a designation of biosimilarity and subsequent licensure for marketing in the USA. In this paper, a case study approach is used to provide examples of analytical similarity exercises and the appropriateness of statistical approaches for the example data.
A statistical approach to nuclear fuel design and performance
Cunning, Travis Andrew
As CANDU fuel failures can have significant economic and operational consequences on the Canadian nuclear power industry, it is essential that factors impacting fuel performance are adequately understood. Current industrial practice relies on deterministic safety analysis and the highly conservative "limit of operating envelope" approach, where all parameters are assumed to be at their limits simultaneously. This results in a conservative prediction of event consequences with little consideration given to the high quality and precision of current manufacturing processes. This study employs a novel approach to the prediction of CANDU fuel reliability. Probability distributions are fitted to actual fuel manufacturing datasets provided by Cameco Fuel Manufacturing, Inc. They are used to form input for two industry-standard fuel performance codes: ELESTRES for the steady-state case and ELOCA for the transient case---a hypothesized 80% reactor outlet header break loss of coolant accident. Using a Monte Carlo technique for input generation, 105 independent trials are conducted and probability distributions are fitted to key model output quantities. Comparing model output against recognized industrial acceptance criteria, no fuel failures are predicted for either case. Output distributions are well removed from failure limit values, implying that margin exists in current fuel manufacturing and design. To validate the results and attempt to reduce the simulation burden of the methodology, two dimensional reduction methods are assessed. Using just 36 trials, both methods are able to produce output distributions that agree strongly with those obtained via the brute-force Monte Carlo method, often to a relative discrepancy of less than 0.3% when predicting the first statistical moment, and a relative discrepancy of less than 5% when predicting the second statistical moment. In terms of global sensitivity, pellet density proves to have the greatest impact on fuel performance
A Bayesian approach for parameter estimation and prediction using a computationally intensive model
International Nuclear Information System (INIS)
Higdon, Dave; McDonnell, Jordan D; Schunck, Nicolas; Sarich, Jason; Wild, Stefan M
2015-01-01
Bayesian methods have been successful in quantifying uncertainty in physics-based problems in parameter estimation and prediction. In these cases, physical measurements y are modeled as the best fit of a physics-based model η(θ), where θ denotes the uncertain, best input setting. Hence the statistical model is of the form y=η(θ)+ϵ, where ϵ accounts for measurement, and possibly other, error sources. When nonlinearity is present in η(⋅), the resulting posterior distribution for the unknown parameters in the Bayesian formulation is typically complex and nonstandard, requiring computationally demanding computational approaches such as Markov chain Monte Carlo (MCMC) to produce multivariate draws from the posterior. Although generally applicable, MCMC requires thousands (or even millions) of evaluations of the physics model η(⋅). This requirement is problematic if the model takes hours or days to evaluate. To overcome this computational bottleneck, we present an approach adapted from Bayesian model calibration. This approach combines output from an ensemble of computational model runs with physical measurements, within a statistical formulation, to carry out inference. A key component of this approach is a statistical response surface, or emulator, estimated from the ensemble of model runs. We demonstrate this approach with a case study in estimating parameters for a density functional theory model, using experimental mass/binding energy measurements from a collection of atomic nuclei. We also demonstrate how this approach produces uncertainties in predictions for recent mass measurements obtained at Argonne National Laboratory. (paper)
Risk Modelling for Passages in Approach Channel
Directory of Open Access Journals (Sweden)
Leszek Smolarek
2013-01-01
Full Text Available Methods of multivariate statistics, stochastic processes, and simulation methods are used to identify and assess the risk measures. This paper presents the use of generalized linear models and Markov models to study risks to ships along the approach channel. These models combined with simulation testing are used to determine the time required for continuous monitoring of endangered objects or period at which the level of risk should be verified.
Statistical inference an integrated approach
Migon, Helio S; Louzada, Francisco
2014-01-01
Introduction Information The concept of probability Assessing subjective probabilities An example Linear algebra and probability Notation Outline of the bookElements of Inference Common statistical modelsLikelihood-based functions Bayes theorem Exchangeability Sufficiency and exponential family Parameter elimination Prior Distribution Entirely subjective specification Specification through functional forms Conjugacy with the exponential family Non-informative priors Hierarchical priors Estimation Introduction to decision theoryBayesian point estimation Classical point estimation Empirical Bayes estimation Comparison of estimators Interval estimation Estimation in the Normal model Approximating Methods The general problem of inference Optimization techniquesAsymptotic theory Other analytical approximations Numerical integration methods Simulation methods Hypothesis Testing Introduction Classical hypothesis testingBayesian hypothesis testing Hypothesis testing and confidence intervalsAsymptotic tests Prediction...
A combined statistical model for multiple motifs search
International Nuclear Information System (INIS)
Gao Lifeng; Liu Xin; Guan Shan
2008-01-01
Transcription factor binding sites (TFBS) play key roles in genebior 6.8 wavelet expression and regulation. They are short sequence segments with definite structure and can be recognized by the corresponding transcription factors correctly. From the viewpoint of statistics, the candidates of TFBS should be quite different from the segments that are randomly combined together by nucleotide. This paper proposes a combined statistical model for finding over-represented short sequence segments in different kinds of data set. While the over-represented short sequence segment is described by position weight matrix, the nucleotide distribution at most sites of the segment should be far from the background nucleotide distribution. The central idea of this approach is to search for such kind of signals. This algorithm is tested on 3 data sets, including binding sites data set of cyclic AMP receptor protein in E.coli, PlantProm DB which is a non-redundant collection of proximal promoter sequences from different species, collection of the intergenic sequences of the whole genome of E.Coli. Even though the complexity of these three data sets is quite different, the results show that this model is rather general and sensible. (general)
Crimp, Steven; Jin, Huidong; Kokic, Philip; Bakar, Shuvo; Nicholls, Neville
2018-04-01
Anthropogenic climate change has already been shown to effect the frequency, intensity, spatial extent, duration and seasonality of extreme climate events. Understanding these changes is an important step in determining exposure, vulnerability and focus for adaptation. In an attempt to support adaptation decision-making we have examined statistical modelling techniques to improve the representation of global climate model (GCM) derived projections of minimum temperature extremes (frosts) in Australia. We examine the spatial changes in minimum temperature extreme metrics (e.g. monthly and seasonal frost frequency etc.), for a region exhibiting the strongest station trends in Australia, and compare these changes with minimum temperature extreme metrics derived from 10 GCMs, from the Coupled Model Inter-comparison Project Phase 5 (CMIP 5) datasets, and via statistical downscaling. We compare the observed trends with those derived from the "raw" GCM minimum temperature data as well as examine whether quantile matching (QM) or spatio-temporal (spTimerQM) modelling with Quantile Matching can be used to improve the correlation between observed and simulated extreme minimum temperatures. We demonstrate, that the spTimerQM modelling approach provides correlations with observed daily minimum temperatures for the period August to November of 0.22. This represents an almost fourfold improvement over either the "raw" GCM or QM results. The spTimerQM modelling approach also improves correlations with observed monthly frost frequency statistics to 0.84 as opposed to 0.37 and 0.81 for the "raw" GCM and QM results respectively. We apply the spatio-temporal model to examine future extreme minimum temperature projections for the period 2016 to 2048. The spTimerQM modelling results suggest the persistence of current levels of frost risk out to 2030, with the evidence of continuing decadal variation.
Isotopic safeguards statistics
International Nuclear Information System (INIS)
Timmerman, C.L.; Stewart, K.B.
1978-06-01
The methods and results of our statistical analysis of isotopic data using isotopic safeguards techniques are illustrated using example data from the Yankee Rowe reactor. The statistical methods used in this analysis are the paired comparison and the regression analyses. A paired comparison results when a sample from a batch is analyzed by two different laboratories. Paired comparison techniques can be used with regression analysis to detect and identify outlier batches. The second analysis tool, linear regression, involves comparing various regression approaches. These approaches use two basic types of models: the intercept model (y = α + βx) and the initial point model [y - y 0 = β(x - x 0 )]. The intercept model fits strictly the exposure or burnup values of isotopic functions, while the initial point model utilizes the exposure values plus the initial or fabricator's data values in the regression analysis. Two fitting methods are applied to each of these models. These methods are: (1) the usual least squares fitting approach where x is measured without error, and (2) Deming's approach which uses the variance estimates obtained from the paired comparison results and considers x and y are both measured with error. The Yankee Rowe data were first measured by Nuclear Fuel Services (NFS) and remeasured by Nuclear Audit and Testing Company (NATCO). The ratio of Pu/U versus 235 D (in which 235 D is the amount of depleted 235 U expressed in weight percent) using actual numbers is the isotopic function illustrated. Statistical results using the Yankee Rowe data indicates the attractiveness of Deming's regression model over the usual approach by simple comparison of the given regression variances with the random variance from the paired comparison results
Propensity Score Analysis: An Alternative Statistical Approach for HRD Researchers
Keiffer, Greggory L.; Lane, Forrest C.
2016-01-01
Purpose: This paper aims to introduce matching in propensity score analysis (PSA) as an alternative statistical approach for researchers looking to make causal inferences using intact groups. Design/methodology/approach: An illustrative example demonstrated the varying results of analysis of variance, analysis of covariance and PSA on a heuristic…
Measuring University Students' Approaches to Learning Statistics: An Invariance Study
Chiesi, Francesca; Primi, Caterina; Bilgin, Ayse Aysin; Lopez, Maria Virginia; del Carmen Fabrizio, Maria; Gozlu, Sitki; Tuan, Nguyen Minh
2016-01-01
The aim of the current study was to provide evidence that an abbreviated version of the Approaches and Study Skills Inventory for Students (ASSIST) was invariant across different languages and educational contexts in measuring university students' learning approaches to statistics. Data were collected on samples of university students attending…
Energy Technology Data Exchange (ETDEWEB)
Mishra, Srikanta; Schuetter, Jared
2014-11-01
We compare two approaches for building a statistical proxy model (metamodel) for CO₂ geologic sequestration from the results of full-physics compositional simulations. The first approach involves a classical Box-Behnken or Augmented Pairs experimental design with a quadratic polynomial response surface. The second approach used a space-filling maxmin Latin Hypercube sampling or maximum entropy design with the choice of five different meta-modeling techniques: quadratic polynomial, kriging with constant and quadratic trend terms, multivariate adaptive regression spline (MARS) and additivity and variance stabilization (AVAS). Simulations results for CO₂ injection into a reservoir-caprock system with 9 design variables (and 97 samples) were used to generate the data for developing the proxy models. The fitted models were validated with using an independent data set and a cross-validation approach for three different performance metrics: total storage efficiency, CO₂ plume radius and average reservoir pressure. The Box-Behnken–quadratic polynomial metamodel performed the best, followed closely by the maximin LHS–kriging metamodel.
A statistical approach to evaluate flood risk at the regional level: an application to Italy
Rossi, Mauro; Marchesini, Ivan; Salvati, Paola; Donnini, Marco; Guzzetti, Fausto; Sterlacchini, Simone; Zazzeri, Marco; Bonazzi, Alessandro; Carlesi, Andrea
2016-04-01
Floods are frequent and widespread in Italy, causing every year multiple fatalities and extensive damages to public and private structures. A pre-requisite for the development of mitigation schemes, including financial instruments such as insurance, is the ability to quantify their costs starting from the estimation of the underlying flood hazard. However, comprehensive and coherent information on flood prone areas, and estimates on the frequency and intensity of flood events, are not often available at scales appropriate for risk pooling and diversification. In Italy, River Basins Hydrogeological Plans (PAI), prepared by basin administrations, are the basic descriptive, regulatory, technical and operational tools for environmental planning in flood prone areas. Nevertheless, such plans do not cover the entire Italian territory, having significant gaps along the minor hydrographic network and in ungauged basins. Several process-based modelling approaches have been used by different basin administrations for the flood hazard assessment, resulting in an inhomogeneous hazard zonation of the territory. As a result, flood hazard assessments expected and damage estimations across the different Italian basin administrations are not always coherent. To overcome these limitations, we propose a simplified multivariate statistical approach for the regional flood hazard zonation coupled with a flood impact model. This modelling approach has been applied in different Italian basin administrations, allowing a preliminary but coherent and comparable estimation of the flood hazard and the relative impact. Model performances are evaluated comparing the predicted flood prone areas with the corresponding PAI zonation. The proposed approach will provide standardized information (following the EU Floods Directive specifications) on flood risk at a regional level which can in turn be more readily applied to assess flood economic impacts. Furthermore, in the assumption of an appropriate
Statistical analysis tolerance using jacobian torsor model based on uncertainty propagation method
Directory of Open Access Journals (Sweden)
W Ghie
2016-04-01
Full Text Available One risk inherent in the use of assembly components is that the behaviourof these components is discovered only at the moment an assembly isbeing carried out. The objective of our work is to enable designers to useknown component tolerances as parameters in models that can be usedto predict properties at the assembly level. In this paper we present astatistical approach to assemblability evaluation, based on tolerance andclearance propagations. This new statistical analysis method for toleranceis based on the Jacobian-Torsor model and the uncertainty measurementapproach. We show how this can be accomplished by modeling thedistribution of manufactured dimensions through applying a probabilitydensity function. By presenting an example we show how statisticaltolerance analysis should be used in the Jacobian-Torsor model. This workis supported by previous efforts aimed at developing a new generation ofcomputational tools for tolerance analysis and synthesis, using theJacobian-Torsor approach. This approach is illustrated on a simple threepartassembly, demonstrating the method’s capability in handling threedimensionalgeometry.
Statistical Models for Social Networks
Snijders, Tom A. B.; Cook, KS; Massey, DS
2011-01-01
Statistical models for social networks as dependent variables must represent the typical network dependencies between tie variables such as reciprocity, homophily, transitivity, etc. This review first treats models for single (cross-sectionally observed) networks and then for network dynamics. For
Strategists and Non-Strategists in Austrian Enterprises—Statistical Approaches
Duller, Christine
2011-09-01
The purpose of this work is to determine with a modern statistical approach which variables can indicate whether an arbitrary enterprise uses strategic management as basic business concept. "Strategic management is an ongoing process that evaluates and controls the business and the industries in which the company is involved; assesses its competitors and sets goals and strategies to meet all existing and potential competitors; and then reassesses each strategy annually or quarterly (i.e. regularly) to determine how it has been implemented and whether it has succeeded or needs replacement by a new strategy to meet changed circumstances, new technology, new competitors, a new economic environment or a new social, financial or political environment." [12] In Austria 70% to 80% of all enterprises can be classified as family firms. In literature the empirically untested hypothesis can be found that family firms tend to have less formalised management accounting systems than non-family enterprises. But it is unknown whether the use of strategic management accounting systems is influenced more by the fact of structure (family or non-family enterprise) or by the effect of size (number of employees). Therefore, the goal is to split up enterprises into two subgroups, namely strategists and non-strategists and to get information on the variables of influence (size, structure, branches, etc.). Two statistical approaches are used: On the one hand a classical cluster analysis is implemented to design two subgroups and on the other hand a latent class model is built up for this problem. After a description of the theoretical background first results of both strategies are compared.
Statistical geological discrete fracture network model. Forsmark modelling stage 2.2
Energy Technology Data Exchange (ETDEWEB)
Fox, Aaron; La Pointe, Paul [Golder Associates Inc (United States); Simeonov, Assen [Swedish Nuclear Fuel and Waste Management Co., Stockholm (Sweden); Hermanson, Jan; Oehman, Johan [Golder Associates AB, Stockholm (Sweden)
2007-11-15
The Swedish Nuclear Fuel and Waste Management Company (SKB) is performing site characterization at two different locations, Forsmark and Laxemar, in order to locate a site for a final geologic repository for spent nuclear fuel. The program is built upon the development of Site Descriptive Models (SDMs) at specific timed data freezes. Each SDM is formed from discipline-specific reports from across the scientific spectrum. This report describes the methods, analyses, and conclusions of the geological modeling team with respect to a geological and statistical model of fractures and minor deformation zones (henceforth referred to as the geological DFN), version 2.2, at the Forsmark site. The geological DFN builds upon the work of other geological modelers, including the deformation zone (DZ), rock domain (RD), and fracture domain (FD) models. The geological DFN is a statistical model for stochastically simulating rock fractures and minor deformation zones as a scale of less than 1,000 m (the lower cut-off of the DZ models). The geological DFN is valid within four specific fracture domains inside the local model region, and encompassing the candidate volume at Forsmark: FFM01, FFM02, FFM03, and FFM06. The models are build using data from detailed surface outcrop maps and the cored borehole record at Forsmark. The conceptual model for the Forsmark 2.2 geological revolves around the concept of orientation sets; for each fracture domain, other model parameters such as size and intensity are tied to the orientation sets. Two classes of orientation sets were described; Global sets, which are encountered everywhere in the model region, and Local sets, which represent highly localized stress environments. Orientation sets were described in terms of their general cardinal direction (NE, NW, etc). Two alternatives are presented for fracture size modeling: - the tectonic continuum approach (TCM, TCMF) described by coupled size-intensity scaling following power law distributions
Statistical geological discrete fracture network model. Forsmark modelling stage 2.2
International Nuclear Information System (INIS)
Fox, Aaron; La Pointe, Paul; Simeonov, Assen; Hermanson, Jan; Oehman, Johan
2007-11-01
The Swedish Nuclear Fuel and Waste Management Company (SKB) is performing site characterization at two different locations, Forsmark and Laxemar, in order to locate a site for a final geologic repository for spent nuclear fuel. The program is built upon the development of Site Descriptive Models (SDMs) at specific timed data freezes. Each SDM is formed from discipline-specific reports from across the scientific spectrum. This report describes the methods, analyses, and conclusions of the geological modeling team with respect to a geological and statistical model of fractures and minor deformation zones (henceforth referred to as the geological DFN), version 2.2, at the Forsmark site. The geological DFN builds upon the work of other geological modelers, including the deformation zone (DZ), rock domain (RD), and fracture domain (FD) models. The geological DFN is a statistical model for stochastically simulating rock fractures and minor deformation zones as a scale of less than 1,000 m (the lower cut-off of the DZ models). The geological DFN is valid within four specific fracture domains inside the local model region, and encompassing the candidate volume at Forsmark: FFM01, FFM02, FFM03, and FFM06. The models are build using data from detailed surface outcrop maps and the cored borehole record at Forsmark. The conceptual model for the Forsmark 2.2 geological revolves around the concept of orientation sets; for each fracture domain, other model parameters such as size and intensity are tied to the orientation sets. Two classes of orientation sets were described; Global sets, which are encountered everywhere in the model region, and Local sets, which represent highly localized stress environments. Orientation sets were described in terms of their general cardinal direction (NE, NW, etc). Two alternatives are presented for fracture size modeling: - the tectonic continuum approach (TCM, TCMF) described by coupled size-intensity scaling following power law distributions
International Nuclear Information System (INIS)
Lee, Jae Bong; Park, Jae Hak; Kim, Hong Deok; Chung, Han Sub; Kim, Tae Ryong
2005-01-01
The growth of AVB wear in Model F steam generator tubes is predicted using the Monte Carlo Method and statistical approaches. The statistical parameters that represent the characteristics of wear growth and wear initiation are derived from In-Service Inspection (ISI) Non-Destructive Evaluation (NDE) data. Based on the statistical approaches, wear growth model are proposed and applied to predict wear distribution at the End Of Cycle (EOC). Probabilistic distributions of the number of wear flaws and maximum wear depth at EOC are obtained from the analysis. Comparing the predicted EOC wear flaw data with the known EOC data the usefulness of the proposed method is examined and satisfactory results are obtained
Energy Technology Data Exchange (ETDEWEB)
Lee, Jae Bong; Park, Jae Hak [Chungbuk National Univ., Cheongju (Korea, Republic of); Kim, Hong Deok; Chung, Han Sub; Kim, Tae Ryong [Korea Electtric Power Research Institute, Daejeon (Korea, Republic of)
2005-07-01
The growth of AVB wear in Model F steam generator tubes is predicted using the Monte Carlo Method and statistical approaches. The statistical parameters that represent the characteristics of wear growth and wear initiation are derived from In-Service Inspection (ISI) Non-Destructive Evaluation (NDE) data. Based on the statistical approaches, wear growth model are proposed and applied to predict wear distribution at the End Of Cycle (EOC). Probabilistic distributions of the number of wear flaws and maximum wear depth at EOC are obtained from the analysis. Comparing the predicted EOC wear flaw data with the known EOC data the usefulness of the proposed method is examined and satisfactory results are obtained.
Functional summary statistics for the Johnson-Mehl model
DEFF Research Database (Denmark)
Møller, Jesper; Ghorbani, Mohammad
The Johnson-Mehl germination-growth model is a spatio-temporal point process model which among other things have been used for the description of neurotransmitters datasets. However, for such datasets parametric Johnson-Mehl models fitted by maximum likelihood have yet not been evaluated by means...... of functional summary statistics. This paper therefore invents four functional summary statistics adapted to the Johnson-Mehl model, with two of them based on the second-order properties and the other two on the nuclei-boundary distances for the associated Johnson-Mehl tessellation. The functional summary...... statistics theoretical properties are investigated, non-parametric estimators are suggested, and their usefulness for model checking is examined in a simulation study. The functional summary statistics are also used for checking fitted parametric Johnson-Mehl models for a neurotransmitters dataset....
Model unspecific search in CMS. Treatment of insufficient Monte Carlo statistics
Energy Technology Data Exchange (ETDEWEB)
Lieb, Jonas; Albert, Andreas; Duchardt, Deborah; Hebbeker, Thomas; Knutzen, Simon; Meyer, Arnd; Pook, Tobias; Roemer, Jonas [III. Physikalisches Institut A, RWTH Aachen University (Germany)
2016-07-01
In 2015, the CMS detector recorded proton-proton collisions at an unprecedented center of mass energy of √(s)=13 TeV. The Model Unspecific Search in CMS (MUSiC) offers an analysis approach of these data which is complementary to dedicated analyses: By taking all produced final states into consideration, MUSiC is sensitive to indicators of new physics appearing in final states that are usually not investigated. In a two step process, MUSiC first classifies events according to their physics content and then searches kinematic distributions for the most significant deviations between Monte Carlo simulations and observed data. Such a general approach introduces its own set of challenges. One of them is the treatment of situations with insufficient Monte Carlo statistics. Complementing introductory presentations on the MUSiC event selection and classification, this talk will present a method of dealing with the issue of low Monte Carlo statistics.
Elementary statistical thermodynamics a problems approach
Smith, Norman O
1982-01-01
This book is a sequel to my Chemical Thermodynamics: A Prob lems Approach published in 1967, which concerned classical thermodynamics almost exclusively. Most books on statistical thermodynamics now available are written either for the superior general chemistry student or for the specialist. The author has felt the need for a text which would bring the intermediate reader to the point where he could not only appreciate the roots of the subject but also have some facility in calculating thermodynamic quantities. Although statistical thermodynamics comprises an essential part of the college training of a chemist, its treatment in general physical chem istry texts is, of necessity, compressed to the point where the less competent student is unable to appreciate or comprehend its logic and beauty, and is reduced to memorizing a series of formulas. It has been my aim to fill this need by writing a logical account of the foundations and applications of the sub ject at a level which can be grasped by an under...
Statistical mechanics of attractor neural network models with synaptic depression
International Nuclear Information System (INIS)
Igarashi, Yasuhiko; Oizumi, Masafumi; Otsubo, Yosuke; Nagata, Kenji; Okada, Masato
2009-01-01
Synaptic depression is known to control gain for presynaptic inputs. Since cortical neurons receive thousands of presynaptic inputs, and their outputs are fed into thousands of other neurons, the synaptic depression should influence macroscopic properties of neural networks. We employ simple neural network models to explore the macroscopic effects of synaptic depression. Systems with the synaptic depression cannot be analyzed due to asymmetry of connections with the conventional equilibrium statistical-mechanical approach. Thus, we first propose a microscopic dynamical mean field theory. Next, we derive macroscopic steady state equations and discuss the stabilities of steady states for various types of neural network models.
Distributions with given marginals and statistical modelling
Fortiana, Josep; Rodriguez-Lallena, José
2002-01-01
This book contains a selection of the papers presented at the meeting `Distributions with given marginals and statistical modelling', held in Barcelona (Spain), July 17-20, 2000. In 24 chapters, this book covers topics such as the theory of copulas and quasi-copulas, the theory and compatibility of distributions, models for survival distributions and other well-known distributions, time series, categorical models, definition and estimation of measures of dependence, monotonicity and stochastic ordering, shape and separability of distributions, hidden truncation models, diagonal families, orthogonal expansions, tests of independence, and goodness of fit assessment. These topics share the use and properties of distributions with given marginals, this being the fourth specialised text on this theme. The innovative aspect of the book is the inclusion of statistical aspects such as modelling, Bayesian statistics, estimation, and tests.
Statistical modelling of subdiffusive dynamics in the cytoplasm of living cells: A FARIMA approach
Burnecki, K.; Muszkieta, M.; Sikora, G.; Weron, A.
2012-04-01
Golding and Cox (Phys. Rev. Lett., 96 (2006) 098102) tracked the motion of individual fluorescently labelled mRNA molecules inside live E. coli cells. They found that in the set of 23 trajectories from 3 different experiments, the automatically recognized motion is subdiffusive and published an intriguing microscopy video. Here, we extract the corresponding time series from this video by image segmentation method and present its detailed statistical analysis. We find that this trajectory was not included in the data set already studied and has different statistical properties. It is best fitted by a fractional autoregressive integrated moving average (FARIMA) process with the normal-inverse Gaussian (NIG) noise and the negative memory. In contrast to earlier studies, this shows that the fractional Brownian motion is not the best model for the dynamics documented in this video.
International Nuclear Information System (INIS)
Shafieloo, Arman
2012-01-01
By introducing Crossing functions and hyper-parameters I show that the Bayesian interpretation of the Crossing Statistics [1] can be used trivially for the purpose of model selection among cosmological models. In this approach to falsify a cosmological model there is no need to compare it with other models or assume any particular form of parametrization for the cosmological quantities like luminosity distance, Hubble parameter or equation of state of dark energy. Instead, hyper-parameters of Crossing functions perform as discriminators between correct and wrong models. Using this approach one can falsify any assumed cosmological model without putting priors on the underlying actual model of the universe and its parameters, hence the issue of dark energy parametrization is resolved. It will be also shown that the sensitivity of the method to the intrinsic dispersion of the data is small that is another important characteristic of the method in testing cosmological models dealing with data with high uncertainties
Statistical network analysis for analyzing policy networks
DEFF Research Database (Denmark)
Robins, Garry; Lewis, Jenny; Wang, Peng
2012-01-01
and policy network methodology is the development of statistical modeling approaches that can accommodate such dependent data. In this article, we review three network statistical methods commonly used in the current literature: quadratic assignment procedures, exponential random graph models (ERGMs......To analyze social network data using standard statistical approaches is to risk incorrect inference. The dependencies among observations implied in a network conceptualization undermine standard assumptions of the usual general linear models. One of the most quickly expanding areas of social......), and stochastic actor-oriented models. We focus most attention on ERGMs by providing an illustrative example of a model for a strategic information network within a local government. We draw inferences about the structural role played by individuals recognized as key innovators and conclude that such an approach...
Ayam, Rufus Tekoh
2011-01-01
PURPOSE: The two approaches to audit sampling; statistical and nonstatistical have been examined in this study. The overall purpose of the study is to explore the current extent at which statistical and nonstatistical sampling approaches are utilized by independent auditors during auditing practices. Moreover, the study also seeks to achieve two additional purposes; the first is to find out whether auditors utilize different sampling techniques when auditing SME´s (Small and Medium-Sized Ente...
An introduction to statistical computing a simulation-based approach
Voss, Jochen
2014-01-01
A comprehensive introduction to sampling-based methods in statistical computing The use of computers in mathematics and statistics has opened up a wide range of techniques for studying otherwise intractable problems. Sampling-based simulation techniques are now an invaluable tool for exploring statistical models. This book gives a comprehensive introduction to the exciting area of sampling-based methods. An Introduction to Statistical Computing introduces the classical topics of random number generation and Monte Carlo methods. It also includes some advanced met
A model of seismic focus and related statistical distributions of earthquakes
International Nuclear Information System (INIS)
Apostol, Bogdan-Felix
2006-01-01
A growth model for accumulating seismic energy in a localized seismic focus is described, which introduces a fractional parameter r on geometrical grounds. The model is employed for deriving a power-type law for the statistical distribution in energy, where the parameter r contributes to the exponent, as well as corresponding time and magnitude distributions for earthquakes. The accompanying seismic activity of foreshocks and aftershocks is discussed in connection with this approach, as based on Omori distributions, and the rate of released energy is derived
Actuarial statistics with generalized linear mixed models
Antonio, K.; Beirlant, J.
2007-01-01
Over the last decade the use of generalized linear models (GLMs) in actuarial statistics has received a lot of attention, starting from the actuarial illustrations in the standard text by McCullagh and Nelder [McCullagh, P., Nelder, J.A., 1989. Generalized linear models. In: Monographs on Statistics
A Statistical Mechanics Approach to Approximate Analytical Bootstrap Averages
DEFF Research Database (Denmark)
Malzahn, Dorthe; Opper, Manfred
2003-01-01
We apply the replica method of Statistical Physics combined with a variational method to the approximate analytical computation of bootstrap averages for estimating the generalization error. We demonstrate our approach on regression with Gaussian processes and compare our results with averages...
Statistical Method to Overcome Overfitting Issue in Rational Function Models
Alizadeh Moghaddam, S. H.; Mokhtarzade, M.; Alizadeh Naeini, A.; Alizadeh Moghaddam, S. A.
2017-09-01
Rational function models (RFMs) are known as one of the most appealing models which are extensively applied in geometric correction of satellite images and map production. Overfitting is a common issue, in the case of terrain dependent RFMs, that degrades the accuracy of RFMs-derived geospatial products. This issue, resulting from the high number of RFMs' parameters, leads to ill-posedness of the RFMs. To tackle this problem, in this study, a fast and robust statistical approach is proposed and compared to Tikhonov regularization (TR) method, as a frequently-used solution to RFMs' overfitting. In the proposed method, a statistical test, namely, significance test is applied to search for the RFMs' parameters that are resistant against overfitting issue. The performance of the proposed method was evaluated for two real data sets of Cartosat-1 satellite images. The obtained results demonstrate the efficiency of the proposed method in term of the achievable level of accuracy. This technique, indeed, shows an improvement of 50-80% over the TR.
Statistical pairwise interaction model of stock market
Bury, Thomas
2013-03-01
Financial markets are a classical example of complex systems as they are compound by many interacting stocks. As such, we can obtain a surprisingly good description of their structure by making the rough simplification of binary daily returns. Spin glass models have been applied and gave some valuable results but at the price of restrictive assumptions on the market dynamics or they are agent-based models with rules designed in order to recover some empirical behaviors. Here we show that the pairwise model is actually a statistically consistent model with the observed first and second moments of the stocks orientation without making such restrictive assumptions. This is done with an approach only based on empirical data of price returns. Our data analysis of six major indices suggests that the actual interaction structure may be thought as an Ising model on a complex network with interaction strengths scaling as the inverse of the system size. This has potentially important implications since many properties of such a model are already known and some techniques of the spin glass theory can be straightforwardly applied. Typical behaviors, as multiple equilibria or metastable states, different characteristic time scales, spatial patterns, order-disorder, could find an explanation in this picture.
Validating an Air Traffic Management Concept of Operation Using Statistical Modeling
He, Yuning; Davies, Misty Dawn
2013-01-01
Validating a concept of operation for a complex, safety-critical system (like the National Airspace System) is challenging because of the high dimensionality of the controllable parameters and the infinite number of states of the system. In this paper, we use statistical modeling techniques to explore the behavior of a conflict detection and resolution algorithm designed for the terminal airspace. These techniques predict the robustness of the system simulation to both nominal and off-nominal behaviors within the overall airspace. They also can be used to evaluate the output of the simulation against recorded airspace data. Additionally, the techniques carry with them a mathematical value of the worth of each prediction-a statistical uncertainty for any robustness estimate. Uncertainty Quantification (UQ) is the process of quantitative characterization and ultimately a reduction of uncertainties in complex systems. UQ is important for understanding the influence of uncertainties on the behavior of a system and therefore is valuable for design, analysis, and verification and validation. In this paper, we apply advanced statistical modeling methodologies and techniques on an advanced air traffic management system, namely the Terminal Tactical Separation Assured Flight Environment (T-TSAFE). We show initial results for a parameter analysis and safety boundary (envelope) detection in the high-dimensional parameter space. For our boundary analysis, we developed a new sequential approach based upon the design of computer experiments, allowing us to incorporate knowledge from domain experts into our modeling and to determine the most likely boundary shapes and its parameters. We carried out the analysis on system parameters and describe an initial approach that will allow us to include time-series inputs, such as the radar track data, into the analysis
Structured statistical models of inductive reasoning.
Kemp, Charles; Tenenbaum, Joshua B
2009-01-01
Everyday inductive inferences are often guided by rich background knowledge. Formal models of induction should aim to incorporate this knowledge and should explain how different kinds of knowledge lead to the distinctive patterns of reasoning found in different inductive contexts. This article presents a Bayesian framework that attempts to meet both goals and describes [corrected] 4 applications of the framework: a taxonomic model, a spatial model, a threshold model, and a causal model. Each model makes probabilistic inferences about the extensions of novel properties, but the priors for the 4 models are defined over different kinds of structures that capture different relationships between the categories in a domain. The framework therefore shows how statistical inference can operate over structured background knowledge, and the authors argue that this interaction between structure and statistics is critical for explaining the power and flexibility of human reasoning.
Statistical modelling in biostatistics and bioinformatics selected papers
Peng, Defen
2014-01-01
This book presents selected papers on statistical model development related mainly to the fields of Biostatistics and Bioinformatics. The coverage of the material falls squarely into the following categories: (a) Survival analysis and multivariate survival analysis, (b) Time series and longitudinal data analysis, (c) Statistical model development and (d) Applied statistical modelling. Innovations in statistical modelling are presented throughout each of the four areas, with some intriguing new ideas on hierarchical generalized non-linear models and on frailty models with structural dispersion, just to mention two examples. The contributors include distinguished international statisticians such as Philip Hougaard, John Hinde, Il Do Ha, Roger Payne and Alessandra Durio, among others, as well as promising newcomers. Some of the contributions have come from researchers working in the BIO-SI research programme on Biostatistics and Bioinformatics, centred on the Universities of Limerick and Galway in Ireland and fu...
A statistical mechanics approach to mixing in stratified fluids
Venaille , Antoine; Gostiaux , Louis; Sommeria , Joël
2016-01-01
Accepted for the Journal of Fluid Mechanics; Predicting how much mixing occurs when a given amount of energy is injected into a Boussinesq fluid is a longstanding problem in stratified turbulence. The huge number of degrees of freedom involved in these processes renders extremely difficult a deterministic approach to the problem. Here we present a statistical mechanics approach yielding a prediction for a cumulative, global mixing efficiency as a function of a global Richard-son number and th...
DEFF Research Database (Denmark)
Ranaboldo, Matteo; Giebel, Gregor; Codina, Bernat
2013-01-01
A combination of physical and statistical treatments to post‐process numerical weather predictions (NWP) outputs is needed for successful short‐term wind power forecasts. One of the most promising and effective approaches for statistical treatment is the Model Output Statistics (MOS) technique....... The proposed MOS performed well in both wind farms, and its forecasts compare positively with an actual operative model in use at Risø DTU and other MOS types, showing minimum BIAS and improving NWP power forecast of around 15% in terms of root mean square error. Further improvements could be obtained...
International Nuclear Information System (INIS)
Frepoli, Cesare; Oriani, Luca
2006-01-01
In recent years, non-parametric or order statistics methods have been widely used to assess the impact of the uncertainties within Best-Estimate LOCA evaluation models. The bounding of the uncertainties is achieved with a direct Monte Carlo sampling of the uncertainty attributes, with the minimum trial number selected to 'stabilize' the estimation of the critical output values (peak cladding temperature (PCT), local maximum oxidation (LMO), and core-wide oxidation (CWO A non-parametric order statistics uncertainty analysis was recently implemented within the Westinghouse Realistic Large Break LOCA evaluation model, also referred to as 'Automated Statistical Treatment of Uncertainty Method' (ASTRUM). The implementation or interpretation of order statistics in safety analysis is not fully consistent within the industry. This has led to an extensive public debate among regulators and researchers which can be found in the open literature. The USNRC-approved Westinghouse method follows a rigorous implementation of the order statistics theory, which leads to the execution of 124 simulations within a Large Break LOCA analysis. This is a solid approach which guarantees that a bounding value (at 95% probability) of the 95 th percentile for each of the three 10 CFR 50.46 ECCS design acceptance criteria (PCT, LMO and CWO) is obtained. The objective of this paper is to provide additional insights on the ASTRUM statistical approach, with a more in-depth analysis of pros and cons of the order statistics and of the Westinghouse approach in the implementation of this statistical methodology. (authors)
Carlson, Kieth A.; Winquist, Jennifer R.
2011-01-01
The study evaluates a semester-long workbook curriculum approach to teaching a college level introductory statistics course. The workbook curriculum required students to read content before and during class and then work in groups to complete problems and answer conceptual questions pertaining to the material they read. Instructors spent class…
Craniofacial Statistical Deformation Models of Wild-type mice and Crouzon mice
DEFF Research Database (Denmark)
Ólafsdóttir, Hildur; Darvann, Tron Andre; Ersbøll, Bjarne Kjær
2007-01-01
Crouzon syndrome is characterised by the premature fusion of cranial sutures and synchondroses leading to craniofacial growth disturbances. The gene causing the syndrome was discovered approximately a decade ago and recently the first mouse model of the syndrome was generated. In this study, a set...... of Micro CT scannings of the heads of wild-type (normal) mice and Crouzon mice were investigated. Statistical deformation models were built to assess the anatomical differences between the groups, as well as the within-group anatomical variation. Following the approach by Rueckert et al. we built an atlas...
International Nuclear Information System (INIS)
Yamaoka, Naoto; Watanabe, Wataru; Hontani, Hidekata
2010-01-01
Most of the time when we construct statistical point cloud model, we need to calculate the corresponding points. Constructed statistical model will not be the same if we use different types of method to calculate the corresponding points. This article proposes the effect to statistical model of human organ made by different types of method to calculate the corresponding points. We validated the performance of statistical model by registering a surface of an organ in a 3D medical image. We compare two methods to calculate corresponding points. The first, the 'Generalized Multi-Dimensional Scaling (GMDS)', determines the corresponding points by the shapes of two curved surfaces. The second approach, the 'Entropy-based Particle system', chooses corresponding points by calculating a number of curved surfaces statistically. By these methods we construct the statistical models and using these models we conducted registration with the medical image. For the estimation, we use non-parametric belief propagation and this method estimates not only the position of the organ but also the probability density of the organ position. We evaluate how the two different types of method that calculates corresponding points affects the statistical model by change in probability density of each points. (author)
Drought episodes over Greece as simulated by dynamical and statistical downscaling approaches
Anagnostopoulou, Christina
2017-07-01
Drought over the Greek region is characterized by a strong seasonal cycle and large spatial variability. Dry spells longer than 10 consecutive days mainly characterize the duration and the intensity of Greek drought. Moreover, an increasing trend of the frequency of drought episodes has been observed, especially during the last 20 years of the 20th century. Moreover, the most recent regional circulation models (RCMs) present discrepancies compared to observed precipitation, while they are able to reproduce the main patterns of atmospheric circulation. In this study, both a statistical and a dynamical downscaling approach are used to quantify drought episodes over Greece by simulating the Standardized Precipitation Index (SPI) for different time steps (3, 6, and 12 months). A statistical downscaling technique based on artificial neural network is employed for the estimation of SPI over Greece, while this drought index is also estimated using the RCM precipitation for the time period of 1961-1990. Overall, it was found that the drought characteristics (intensity, duration, and spatial extent) were well reproduced by the regional climate models for long term drought indices (SPI12) while ANN simulations are better for the short-term drought indices (SPI3).
Directory of Open Access Journals (Sweden)
Jay Krishna Thakur
2015-08-01
Full Text Available The aim of this work is to investigate new approaches using methods based on statistics and geo-statistics for spatio-temporal optimization of groundwater monitoring networks. The formulated and integrated methods were tested with the groundwater quality data set of Bitterfeld/Wolfen, Germany. Spatially, the monitoring network was optimized using geo-statistical methods. Temporal optimization of the monitoring network was carried out using Sen’s method (1968. For geostatistical network optimization, a geostatistical spatio-temporal algorithm was used to identify redundant wells in 2- and 2.5-D Quaternary and Tertiary aquifers. Influences of interpolation block width, dimension, contaminant association, groundwater flow direction and aquifer homogeneity on statistical and geostatistical methods for monitoring network optimization were analysed. The integrated approach shows 37% and 28% redundancies in the monitoring network in Quaternary aquifer and Tertiary aquifer respectively. The geostatistical method also recommends 41 and 22 new monitoring wells in the Quaternary and Tertiary aquifers respectively. In temporal optimization, an overall optimized sampling interval was recommended in terms of lower quartile (238 days, median quartile (317 days and upper quartile (401 days in the research area of Bitterfeld/Wolfen. Demonstrated methods for improving groundwater monitoring network can be used in real monitoring network optimization with due consideration given to influencing factors.
A statistical approach to traditional Vietnamese medical diagnoses standardization
International Nuclear Information System (INIS)
Nguyen Hoang Phuong; Nguyen Quang Hoa; Le Dinh Long
1990-12-01
In this paper the first results of the statistical approach for Cold-Heat diagnosis standardization as a first work in the ''eight rules diagnoses'' standardization of Traditional Vietnamese Medicine are briefly described. Some conclusions and suggestions for further work are given. 3 refs, 2 tabs
Improving UWB-Based Localization in IoT Scenarios with Statistical Models of Distance Error.
Monica, Stefania; Ferrari, Gianluigi
2018-05-17
Interest in the Internet of Things (IoT) is rapidly increasing, as the number of connected devices is exponentially growing. One of the application scenarios envisaged for IoT technologies involves indoor localization and context awareness. In this paper, we focus on a localization approach that relies on a particular type of communication technology, namely Ultra Wide Band (UWB). UWB technology is an attractive choice for indoor localization, owing to its high accuracy. Since localization algorithms typically rely on estimated inter-node distances, the goal of this paper is to evaluate the improvement brought by a simple (linear) statistical model of the distance error. On the basis of an extensive experimental measurement campaign, we propose a general analytical framework, based on a Least Square (LS) method, to derive a novel statistical model for the range estimation error between a pair of UWB nodes. The proposed statistical model is then applied to improve the performance of a few illustrative localization algorithms in various realistic scenarios. The obtained experimental results show that the use of the proposed statistical model improves the accuracy of the considered localization algorithms with a reduction of the localization error up to 66%.
A Bayesian approach for quantification of model uncertainty
International Nuclear Information System (INIS)
Park, Inseok; Amarchinta, Hemanth K.; Grandhi, Ramana V.
2010-01-01
In most engineering problems, more than one model can be created to represent an engineering system's behavior. Uncertainty is inevitably involved in selecting the best model from among the models that are possible. Uncertainty in model selection cannot be ignored, especially when the differences between the predictions of competing models are significant. In this research, a methodology is proposed to quantify model uncertainty using measured differences between experimental data and model outcomes under a Bayesian statistical framework. The adjustment factor approach is used to propagate model uncertainty into prediction of a system response. A nonlinear vibration system is used to demonstrate the processes for implementing the adjustment factor approach. Finally, the methodology is applied on the engineering benefits of a laser peening process, and a confidence band for residual stresses is established to indicate the reliability of model prediction.
MacCuspie, Robert I; Gorka, Danielle E
2013-10-01
Recently, an atomic force microscopy (AFM)-based approach for quantifying the number of biological molecules conjugated to a nanoparticle surface at low number densities was reported. The number of target molecules conjugated to the analyte nanoparticle can be determined with single nanoparticle fidelity using antibody-mediated self-assembly to decorate the analyte nanoparticles with probe nanoparticles (i.e., quantitative immunostaining). This work refines the statistical models used to quantitatively interpret the observations when AFM is used to image the resulting structures. The refinements add terms to the previous statistical models to account for the physical sizes of the analyte nanoparticles, conjugated molecules, antibodies, and probe nanoparticles. Thus, a more physically realistic statistical computation can be implemented for a given sample of known qualitative composition, using the software scripts provided. Example AFM data sets, using horseradish peroxidase conjugated to gold nanoparticles, are presented to illustrate how to implement this method successfully.
Data Analysis A Model Comparison Approach, Second Edition
Judd, Charles M; Ryan, Carey S
2008-01-01
This completely rewritten classic text features many new examples, insights and topics including mediational, categorical, and multilevel models. Substantially reorganized, this edition provides a briefer, more streamlined examination of data analysis. Noted for its model-comparison approach and unified framework based on the general linear model, the book provides readers with a greater understanding of a variety of statistical procedures. This consistent framework, including consistent vocabulary and notation, is used throughout to develop fewer but more powerful model building techniques. T
A Stochastic Fractional Dynamics Model of Rainfall Statistics
Kundu, Prasun; Travis, James
2013-04-01
Rainfall varies in space and time in a highly irregular manner and is described naturally in terms of a stochastic process. A characteristic feature of rainfall statistics is that they depend strongly on the space-time scales over which rain data are averaged. A spectral model of precipitation has been developed based on a stochastic differential equation of fractional order for the point rain rate, that allows a concise description of the second moment statistics of rain at any prescribed space-time averaging scale. The model is designed to faithfully reflect the scale dependence and is thus capable of providing a unified description of the statistics of both radar and rain gauge data. The underlying dynamical equation can be expressed in terms of space-time derivatives of fractional orders that are adjusted together with other model parameters to fit the data. The form of the resulting spectrum gives the model adequate flexibility to capture the subtle interplay between the spatial and temporal scales of variability of rain but strongly constrains the predicted statistical behavior as a function of the averaging length and times scales. The main restriction is the assumption that the statistics of the precipitation field is spatially homogeneous and isotropic and stationary in time. We test the model with radar and gauge data collected contemporaneously at the NASA TRMM ground validation sites located near Melbourne, Florida and in Kwajalein Atoll, Marshall Islands in the tropical Pacific. We estimate the parameters by tuning them to the second moment statistics of the radar data. The model predictions are then found to fit the second moment statistics of the gauge data reasonably well without any further adjustment. Some data sets containing periods of non-stationary behavior that involves occasional anomalously correlated rain events, present a challenge for the model.
Statistical Models and Methods for Lifetime Data
Lawless, Jerald F
2011-01-01
Praise for the First Edition"An indispensable addition to any serious collection on lifetime data analysis and . . . a valuable contribution to the statistical literature. Highly recommended . . ."-Choice"This is an important book, which will appeal to statisticians working on survival analysis problems."-Biometrics"A thorough, unified treatment of statistical models and methods used in the analysis of lifetime data . . . this is a highly competent and agreeable statistical textbook."-Statistics in MedicineThe statistical analysis of lifetime or response time data is a key tool in engineering,
A survey on computational intelligence approaches for predictive modeling in prostate cancer
Cosma, G; Brown, D; Archer, M; Khan, M; Pockley, AG
2017-01-01
Predictive modeling in medicine involves the development of computational models which are capable of analysing large amounts of data in order to predict healthcare outcomes for individual patients. Computational intelligence approaches are suitable when the data to be modelled are too complex forconventional statistical techniques to process quickly and eciently. These advanced approaches are based on mathematical models that have been especially developed for dealing with the uncertainty an...
White, H; Racine, J
2001-01-01
We propose tests for individual and joint irrelevance of network inputs. Such tests can be used to determine whether an input or group of inputs "belong" in a particular model, thus permitting valid statistical inference based on estimated feedforward neural-network models. The approaches employ well-known statistical resampling techniques. We conduct a small Monte Carlo experiment showing that our tests have reasonable level and power behavior, and we apply our methods to examine whether there are predictable regularities in foreign exchange rates. We find that exchange rates do appear to contain information that is exploitable for enhanced point prediction, but the nature of the predictive relations evolves through time.
Statistical approach to partial equilibrium analysis
Wang, Yougui; Stanley, H. E.
2009-04-01
A statistical approach to market equilibrium and efficiency analysis is proposed in this paper. One factor that governs the exchange decisions of traders in a market, named willingness price, is highlighted and constitutes the whole theory. The supply and demand functions are formulated as the distributions of corresponding willing exchange over the willingness price. The laws of supply and demand can be derived directly from these distributions. The characteristics of excess demand function are analyzed and the necessary conditions for the existence and uniqueness of equilibrium point of the market are specified. The rationing rates of buyers and sellers are introduced to describe the ratio of realized exchange to willing exchange, and their dependence on the market price is studied in the cases of shortage and surplus. The realized market surplus, which is the criterion of market efficiency, can be written as a function of the distributions of willing exchange and the rationing rates. With this approach we can strictly prove that a market is efficient in the state of equilibrium.
Statistical models and methods for reliability and survival analysis
Couallier, Vincent; Huber-Carol, Catherine; Mesbah, Mounir; Huber -Carol, Catherine; Limnios, Nikolaos; Gerville-Reache, Leo
2013-01-01
Statistical Models and Methods for Reliability and Survival Analysis brings together contributions by specialists in statistical theory as they discuss their applications providing up-to-date developments in methods used in survival analysis, statistical goodness of fit, stochastic processes for system reliability, amongst others. Many of these are related to the work of Professor M. Nikulin in statistics over the past 30 years. The authors gather together various contributions with a broad array of techniques and results, divided into three parts - Statistical Models and Methods, Statistical
Current algebra, statistical mechanics and quantum models
Vilela Mendes, R.
2017-11-01
Results obtained in the past for free boson systems at zero and nonzero temperatures are revisited to clarify the physical meaning of current algebra reducible functionals which are associated to systems with density fluctuations, leading to observable effects on phase transitions. To use current algebra as a tool for the formulation of quantum statistical mechanics amounts to the construction of unitary representations of diffeomorphism groups. Two mathematical equivalent procedures exist for this purpose. One searches for quasi-invariant measures on configuration spaces, the other for a cyclic vector in Hilbert space. Here, one argues that the second approach is closer to the physical intuition when modelling complex systems. An example of application of the current algebra methodology to the pairing phenomenon in two-dimensional fermion systems is discussed.
An invariant approach to statistical analysis of shapes
Lele, Subhash R
2001-01-01
INTRODUCTIONA Brief History of MorphometricsFoundations for the Study of Biological FormsDescription of the data SetsMORPHOMETRIC DATATypes of Morphometric DataLandmark Homology and CorrespondenceCollection of Landmark CoordinatesReliability of Landmark Coordinate DataSummarySTATISTICAL MODELS FOR LANDMARK COORDINATE DATAStatistical Models in GeneralModels for Intra-Group VariabilityEffect of Nuisance ParametersInvariance and Elimination of Nuisance ParametersA Definition of FormCoordinate System Free Representation of FormEst
Multivariate statistical analysis a high-dimensional approach
Serdobolskii, V
2000-01-01
In the last few decades the accumulation of large amounts of in formation in numerous applications. has stimtllated an increased in terest in multivariate analysis. Computer technologies allow one to use multi-dimensional and multi-parametric models successfully. At the same time, an interest arose in statistical analysis with a de ficiency of sample data. Nevertheless, it is difficult to describe the recent state of affairs in applied multivariate methods as satisfactory. Unimprovable (dominating) statistical procedures are still unknown except for a few specific cases. The simplest problem of estimat ing the mean vector with minimum quadratic risk is unsolved, even for normal distributions. Commonly used standard linear multivari ate procedures based on the inversion of sample covariance matrices can lead to unstable results or provide no solution in dependence of data. Programs included in standard statistical packages cannot process 'multi-collinear data' and there are no theoretical recommen ...
Model-generated air quality statistics for application in vegetation response models in Alberta
International Nuclear Information System (INIS)
McVehil, G.E.; Nosal, M.
1990-01-01
To test and apply vegetation response models in Alberta, air pollution statistics representative of various parts of the Province are required. At this time, air quality monitoring data of the requisite accuracy and time resolution are not available for most parts of Alberta. Therefore, there exists a need to develop appropriate air quality statistics. The objectives of the work reported here were to determine the applicability of model generated air quality statistics and to develop by modelling, realistic and representative time series of hourly SO 2 concentrations that could be used to generate the statistics demanded by vegetation response models
Statistical modelling of fish stocks
DEFF Research Database (Denmark)
Kvist, Trine
1999-01-01
for modelling the dynamics of a fish population is suggested. A new approach is introduced to analyse the sources of variation in age composition data, which is one of the most important sources of information in the cohort based models for estimation of stock abundancies and mortalities. The approach combines...... and it is argued that an approach utilising stochastic differential equations might be advantagous in fish stoch assessments....
Steinberg, P. D.; Brener, G.; Duffy, D.; Nearing, G. S.; Pelissier, C.
2017-12-01
Hyperparameterization, of statistical models, i.e. automated model scoring and selection, such as evolutionary algorithms, grid searches, and randomized searches, can improve forecast model skill by reducing errors associated with model parameterization, model structure, and statistical properties of training data. Ensemble Learning Models (Elm), and the related Earthio package, provide a flexible interface for automating the selection of parameters and model structure for machine learning models common in climate science and land cover classification, offering convenient tools for loading NetCDF, HDF, Grib, or GeoTiff files, decomposition methods like PCA and manifold learning, and parallel training and prediction with unsupervised and supervised classification, clustering, and regression estimators. Continuum Analytics is using Elm to experiment with statistical soil moisture forecasting based on meteorological forcing data from NASA's North American Land Data Assimilation System (NLDAS). There Elm is using the NSGA-2 multiobjective optimization algorithm for optimizing statistical preprocessing of forcing data to improve goodness-of-fit for statistical models (i.e. feature engineering). This presentation will discuss Elm and its components, including dask (distributed task scheduling), xarray (data structures for n-dimensional arrays), and scikit-learn (statistical preprocessing, clustering, classification, regression), and it will show how NSGA-2 is being used for automate selection of soil moisture forecast statistical models for North America.
The Precautionary Principle and statistical approaches to uncertainty
DEFF Research Database (Denmark)
Keiding, Niels; Budtz-Jørgensen, Esben
2003-01-01
Bayesian model averaging; Benchmark approach to safety standards in toxicology; dose-response relationship; environmental standards; exposure measurement uncertainty; Popper falsification......Bayesian model averaging; Benchmark approach to safety standards in toxicology; dose-response relationship; environmental standards; exposure measurement uncertainty; Popper falsification...
The Precautionary Principle and Statistical Approaches to Uncertainty
DEFF Research Database (Denmark)
Keiding, Niels; Budtz-Jørgensen, Esben
2005-01-01
Bayesian model averaging; Benchmark approach to safety standars in toxicology; dose-response relationships; environmental standards; exposure measurement uncertainty; Popper falsification......Bayesian model averaging; Benchmark approach to safety standars in toxicology; dose-response relationships; environmental standards; exposure measurement uncertainty; Popper falsification...
An analytical statistical approach to the 3D reconstruction problem
Energy Technology Data Exchange (ETDEWEB)
Cierniak, Robert [Czestochowa Univ. of Technology (Poland). Inst. of Computer Engineering
2011-07-01
The presented here approach is concerned with the reconstruction problem for 3D spiral X-ray tomography. The reconstruction problem is formulated taking into considerations the statistical properties of signals obtained in X-ray CT. Additinally, image processing performed in our approach is involved in analytical methodology. This conception significantly improves quality of the obtained after reconstruction images and decreases the complexity of the reconstruction problem in comparison with other approaches. Computer simulations proved that schematically described here reconstruction algorithm outperforms conventional analytical methods in obtained image quality. (orig.)
Modeling Time-Dependent Association in Longitudinal Data: A Lag as Moderator Approach
Selig, James P.; Preacher, Kristopher J.; Little, Todd D.
2012-01-01
We describe a straightforward, yet novel, approach to examine time-dependent association between variables. The approach relies on a measurement-lag research design in conjunction with statistical interaction models. We base arguments in favor of this approach on the potential for better understanding the associations between variables by…
Tropical geometry of statistical models.
Pachter, Lior; Sturmfels, Bernd
2004-11-16
This article presents a unified mathematical framework for inference in graphical models, building on the observation that graphical models are algebraic varieties. From this geometric viewpoint, observations generated from a model are coordinates of a point in the variety, and the sum-product algorithm is an efficient tool for evaluating specific coordinates. Here, we address the question of how the solutions to various inference problems depend on the model parameters. The proposed answer is expressed in terms of tropical algebraic geometry. The Newton polytope of a statistical model plays a key role. Our results are applied to the hidden Markov model and the general Markov model on a binary tree.
12th Workshop on Stochastic Models, Statistics and Their Applications
Rafajłowicz, Ewaryst; Szajowski, Krzysztof
2015-01-01
This volume presents the latest advances and trends in stochastic models and related statistical procedures. Selected peer-reviewed contributions focus on statistical inference, quality control, change-point analysis and detection, empirical processes, time series analysis, survival analysis and reliability, statistics for stochastic processes, big data in technology and the sciences, statistical genetics, experiment design, and stochastic models in engineering. Stochastic models and related statistical procedures play an important part in furthering our understanding of the challenging problems currently arising in areas of application such as the natural sciences, information technology, engineering, image analysis, genetics, energy and finance, to name but a few. This collection arises from the 12th Workshop on Stochastic Models, Statistics and Their Applications, Wroclaw, Poland.
Accurate phenotyping: Reconciling approaches through Bayesian model averaging.
Directory of Open Access Journals (Sweden)
Carla Chia-Ming Chen
Full Text Available Genetic research into complex diseases is frequently hindered by a lack of clear biomarkers for phenotype ascertainment. Phenotypes for such diseases are often identified on the basis of clinically defined criteria; however such criteria may not be suitable for understanding the genetic composition of the diseases. Various statistical approaches have been proposed for phenotype definition; however our previous studies have shown that differences in phenotypes estimated using different approaches have substantial impact on subsequent analyses. Instead of obtaining results based upon a single model, we propose a new method, using Bayesian model averaging to overcome problems associated with phenotype definition. Although Bayesian model averaging has been used in other fields of research, this is the first study that uses Bayesian model averaging to reconcile phenotypes obtained using multiple models. We illustrate the new method by applying it to simulated genetic and phenotypic data for Kofendred personality disorder-an imaginary disease with several sub-types. Two separate statistical methods were used to identify clusters of individuals with distinct phenotypes: latent class analysis and grade of membership. Bayesian model averaging was then used to combine the two clusterings for the purpose of subsequent linkage analyses. We found that causative genetic loci for the disease produced higher LOD scores using model averaging than under either individual model separately. We attribute this improvement to consolidation of the cores of phenotype clusters identified using each individual method.
Directory of Open Access Journals (Sweden)
Sutikno Sutikno
2010-08-01
Full Text Available One of the climate models used to predict the climatic conditions is Global Circulation Models (GCM. GCM is a computer-based model that consists of different equations. It uses numerical and deterministic equation which follows the physics rules. GCM is a main tool to predict climate and weather, also it uses as primary information source to review the climate change effect. Statistical Downscaling (SD technique is used to bridge the large-scale GCM with a small scale (the study area. GCM data is spatial and temporal data most likely to occur where the spatial correlation between different data on the grid in a single domain. Multicollinearity problems require the need for pre-processing of variable data X. Continuum Regression (CR and pre-processing with Principal Component Analysis (PCA methods is an alternative to SD modelling. CR is one method which was developed by Stone and Brooks (1990. This method is a generalization from Ordinary Least Square (OLS, Principal Component Regression (PCR and Partial Least Square method (PLS methods, used to overcome multicollinearity problems. Data processing for the station in Ambon, Pontianak, Losarang, Indramayu and Yuntinyuat show that the RMSEP values and R2 predict in the domain 8x8 and 12x12 by uses CR method produces results better than by PCR and PLS.
Multivariate meta-analysis: a robust approach based on the theory of U-statistic.
Ma, Yan; Mazumdar, Madhu
2011-10-30
Meta-analysis is the methodology for combining findings from similar research studies asking the same question. When the question of interest involves multiple outcomes, multivariate meta-analysis is used to synthesize the outcomes simultaneously taking into account the correlation between the outcomes. Likelihood-based approaches, in particular restricted maximum likelihood (REML) method, are commonly utilized in this context. REML assumes a multivariate normal distribution for the random-effects model. This assumption is difficult to verify, especially for meta-analysis with small number of component studies. The use of REML also requires iterative estimation between parameters, needing moderately high computation time, especially when the dimension of outcomes is large. A multivariate method of moments (MMM) is available and is shown to perform equally well to REML. However, there is a lack of information on the performance of these two methods when the true data distribution is far from normality. In this paper, we propose a new nonparametric and non-iterative method for multivariate meta-analysis on the basis of the theory of U-statistic and compare the properties of these three procedures under both normal and skewed data through simulation studies. It is shown that the effect on estimates from REML because of non-normal data distribution is marginal and that the estimates from MMM and U-statistic-based approaches are very similar. Therefore, we conclude that for performing multivariate meta-analysis, the U-statistic estimation procedure is a viable alternative to REML and MMM. Easy implementation of all three methods are illustrated by their application to data from two published meta-analysis from the fields of hip fracture and periodontal disease. We discuss ideas for future research based on U-statistic for testing significance of between-study heterogeneity and for extending the work to meta-regression setting. Copyright © 2011 John Wiley & Sons, Ltd.
Challenges and Approaches to Statistical Design and Inference in High Dimensional Investigations
Garrett, Karen A.; Allison, David B.
2015-01-01
Summary Advances in modern technologies have facilitated high-dimensional experiments (HDEs) that generate tremendous amounts of genomic, proteomic, and other “omic” data. HDEs involving whole-genome sequences and polymorphisms, expression levels of genes, protein abundance measurements, and combinations thereof have become a vanguard for new analytic approaches to the analysis of HDE data. Such situations demand creative approaches to the processes of statistical inference, estimation, prediction, classification, and study design. The novel and challenging biological questions asked from HDE data have resulted in many specialized analytic techniques being developed. This chapter discusses some of the unique statistical challenges facing investigators studying high-dimensional biology, and describes some approaches being developed by statistical scientists. We have included some focus on the increasing interest in questions involving testing multiple propositions simultaneously, appropriate inferential indicators for the types of questions biologists are interested in, and the need for replication of results across independent studies, investigators, and settings. A key consideration inherent throughout is the challenge in providing methods that a statistician judges to be sound and a biologist finds informative. PMID:19588106
Challenges and approaches to statistical design and inference in high-dimensional investigations.
Gadbury, Gary L; Garrett, Karen A; Allison, David B
2009-01-01
Advances in modern technologies have facilitated high-dimensional experiments (HDEs) that generate tremendous amounts of genomic, proteomic, and other "omic" data. HDEs involving whole-genome sequences and polymorphisms, expression levels of genes, protein abundance measurements, and combinations thereof have become a vanguard for new analytic approaches to the analysis of HDE data. Such situations demand creative approaches to the processes of statistical inference, estimation, prediction, classification, and study design. The novel and challenging biological questions asked from HDE data have resulted in many specialized analytic techniques being developed. This chapter discusses some of the unique statistical challenges facing investigators studying high-dimensional biology and describes some approaches being developed by statistical scientists. We have included some focus on the increasing interest in questions involving testing multiple propositions simultaneously, appropriate inferential indicators for the types of questions biologists are interested in, and the need for replication of results across independent studies, investigators, and settings. A key consideration inherent throughout is the challenge in providing methods that a statistician judges to be sound and a biologist finds informative.
Statistical Validation of Engineering and Scientific Models: Background
International Nuclear Information System (INIS)
Hills, Richard G.; Trucano, Timothy G.
1999-01-01
A tutorial is presented discussing the basic issues associated with propagation of uncertainty analysis and statistical validation of engineering and scientific models. The propagation of uncertainty tutorial illustrates the use of the sensitivity method and the Monte Carlo method to evaluate the uncertainty in predictions for linear and nonlinear models. Four example applications are presented; a linear model, a model for the behavior of a damped spring-mass system, a transient thermal conduction model, and a nonlinear transient convective-diffusive model based on Burger's equation. Correlated and uncorrelated model input parameters are considered. The model validation tutorial builds on the material presented in the propagation of uncertainty tutoriaI and uses the damp spring-mass system as the example application. The validation tutorial illustrates several concepts associated with the application of statistical inference to test model predictions against experimental observations. Several validation methods are presented including error band based, multivariate, sum of squares of residuals, and optimization methods. After completion of the tutorial, a survey of statistical model validation literature is presented and recommendations for future work are made
Directory of Open Access Journals (Sweden)
Liang Meng
Full Text Available A fundamental issue in neuroscience is how to identify the multiple biophysical mechanisms through which neurons generate observed patterns of spiking activity. In previous work, we proposed a method for linking observed patterns of spiking activity to specific biophysical mechanisms based on a state space modeling framework and a sequential Monte Carlo, or particle filter, estimation algorithm. We have shown, in simulation, that this approach is able to identify a space of simple biophysical models that were consistent with observed spiking data (and included the model that generated the data, but have yet to demonstrate the application of the method to identify realistic currents from real spike train data. Here, we apply the particle filter to spiking data recorded from rat layer V cortical neurons, and correctly identify the dynamics of an slow, intrinsic current. The underlying intrinsic current is successfully identified in four distinct neurons, even though the cells exhibit two distinct classes of spiking activity: regular spiking and bursting. This approach--linking statistical, computational, and experimental neuroscience--provides an effective technique to constrain detailed biophysical models to specific mechanisms consistent with observed spike train data.
International Nuclear Information System (INIS)
Seeliger, D.
1993-01-01
This contribution contains a brief presentation and comparison of the different Statistical Multistep Approaches, presently available for practical nuclear data calculations. (author). 46 refs, 5 figs
Statistical models for optimizing mineral exploration
International Nuclear Information System (INIS)
Wignall, T.K.; DeGeoffroy, J.
1987-01-01
The primary purpose of mineral exploration is to discover ore deposits. The emphasis of this volume is on the mathematical and computational aspects of optimizing mineral exploration. The seven chapters that make up the main body of the book are devoted to the description and application of various types of computerized geomathematical models. These chapters include: (1) the optimal selection of ore deposit types and regions of search, as well as prospecting selected areas, (2) designing airborne and ground field programs for the optimal coverage of prospecting areas, and (3) delineating and evaluating exploration targets within prospecting areas by means of statistical modeling. Many of these statistical programs are innovative and are designed to be useful for mineral exploration modeling. Examples of geomathematical models are applied to exploring for six main types of base and precious metal deposits, as well as other mineral resources (such as bauxite and uranium)
An Analysis of the Effectiveness of the Constructivist Approach in Teaching Business Statistics
Directory of Open Access Journals (Sweden)
Greeni Maheshwari
2017-05-01
Full Text Available Aim/Purpose: The main aim of the research is to examine the performance of second language English speaking students enrolled in the Business Statistics course and to investigate the academic performance of students when taught under the constructivist and non-constructivist approaches in a classroom environment. Background: There are different learning theories that are established based on how students learn. Each of these theories has its own benefits based on the different type of learners and context of the environment. The students in this research are new to the University environment and to a challenging technical course like Business Statistics. This research has been carried out to see the effectiveness of the constructivist approach in motivating and increasing the student engagement and their academic performance. Methodology\t: A total of 1373 students were involved in the quasi-experiment method using Stratified Sampling Method from the year 2015 until 2016. Contribution: To consider curriculum adjustments for first year programs and implications for teacher education. Findings: The t-test for unequal variances was used to understand the mean score. Results indicate students have high motivation level and achieve higher mean scores when they are taught using the constructivist teaching approach compared to the non-constructivist teaching approach. Recommendations for Practitioners: To consider the challenges faced by first year students and create a teaching approach that fits their needs. Recommendation for Researchers: To explore in depth other teaching approaches of the Business Statistics course in improving students’ academic performance. Impact on Society\t: The constructivist approach will enable learning to be enjoyable and students to be more confident. Future Research: The research will assist other lectures teaching Business Statistics in creating a more conducive environment to encourage second language English
Statistical fracture mechanics approach to the strength of brittle rock
International Nuclear Information System (INIS)
Ratigan, J.L.
1981-06-01
Statistical fracture mechanics concepts used in the past for rock are critically reviewed and modifications are proposed which are warranted by (1) increased understanding of fracture provided by modern fracture mechanics and (2) laboratory test data both from the literature and from this research. Over 600 direct and indirect tension tests have been performed on three different rock types; Stripa Granite, Sierra White Granite and Carrara Marble. In several instances assumptions which are common in the literature were found to be invalid. A three parameter statistical fracture mechanics model with Mode I critical strain energy release rate as the variant is presented. Methodologies for evaluating the parameters in this model as well as the more commonly employed two parameter models are discussed. The experimental results and analysis of this research indicate that surfacially distributed flaws, rather than volumetrically distributed flaws are responsible for rupture in many testing situations. For several of the rock types tested, anisotropy (both in apparent tensile strength and size effect) precludes the use of contemporary statistical fracture mechanics models
Basics of modern mathematical statistics
Spokoiny, Vladimir
2015-01-01
This textbook provides a unified and self-contained presentation of the main approaches to and ideas of mathematical statistics. It collects the basic mathematical ideas and tools needed as a basis for more serious studies or even independent research in statistics. The majority of existing textbooks in mathematical statistics follow the classical asymptotic framework. Yet, as modern statistics has changed rapidly in recent years, new methods and approaches have appeared. The emphasis is on finite sample behavior, large parameter dimensions, and model misspecifications. The present book provides a fully self-contained introduction to the world of modern mathematical statistics, collecting the basic knowledge, concepts and findings needed for doing further research in the modern theoretical and applied statistics. This textbook is primarily intended for graduate and postdoc students and young researchers who are interested in modern statistical methods.
Statistical Analysis of Input Parameters Impact on the Modelling of Underground Structures
Directory of Open Access Journals (Sweden)
M. Hilar
2008-01-01
Full Text Available The behaviour of a geomechanical model and its final results are strongly affected by the input parameters. As the inherent variability of rock mass is difficult to model, engineers are frequently forced to face the question “Which input values should be used for analyses?” The correct answer to such a question requires a probabilistic approach, considering the uncertainty of site investigations and variation in the ground. This paper describes the statistical analysis of input parameters for FEM calculations of traffic tunnels in the city of Prague. At the beginning of the paper, the inaccuracy in the geotechnical modelling is discussed. In the following part the Fuzzy techniques are summarized, including information about an application of the Fuzzy arithmetic on the shotcrete parameters. The next part of the paper is focused on the stochastic simulation – Monte Carlo Simulation is briefly described, Latin Hypercubes method is described more in details. At the end several practical examples are described: statistical analysis of the input parameters on the numerical modelling of the completed Mrázovka tunnel (profile West Tunnel Tube km 5.160 and modelling of the constructed tunnel Špejchar – Pelc Tyrolka.
2017-09-01
application of statistical inference. Even when human forecasters leverage their professional experience, which is often gained through long periods of... application throughout statistics and Bayesian data analysis. The multivariate form of 2( , ) (e.g., Figure 12) is similarly analytically...data (i.e., no systematic manipulations with analytical functions), it is common in the statistical literature to apply mathematical transformations
Wang, Ming; Long, Qi
2016-09-01
Prediction models for disease risk and prognosis play an important role in biomedical research, and evaluating their predictive accuracy in the presence of censored data is of substantial interest. The standard concordance (c) statistic has been extended to provide a summary measure of predictive accuracy for survival models. Motivated by a prostate cancer study, we address several issues associated with evaluating survival prediction models based on c-statistic with a focus on estimators using the technique of inverse probability of censoring weighting (IPCW). Compared to the existing work, we provide complete results on the asymptotic properties of the IPCW estimators under the assumption of coarsening at random (CAR), and propose a sensitivity analysis under the mechanism of noncoarsening at random (NCAR). In addition, we extend the IPCW approach as well as the sensitivity analysis to high-dimensional settings. The predictive accuracy of prediction models for cancer recurrence after prostatectomy is assessed by applying the proposed approaches. We find that the estimated predictive accuracy for the models in consideration is sensitive to NCAR assumption, and thus identify the best predictive model. Finally, we further evaluate the performance of the proposed methods in both settings of low-dimensional and high-dimensional data under CAR and NCAR through simulations. © 2016, The International Biometric Society.
Statistical physics of pairwise probability models
Directory of Open Access Journals (Sweden)
Yasser Roudi
2009-11-01
Full Text Available Statistical models for describing the probability distribution over the states of biological systems are commonly used for dimensional reduction. Among these models, pairwise models are very attractive in part because they can be fit using a reasonable amount of data: knowledge of the means and correlations between pairs of elements in the system is sufficient. Not surprisingly, then, using pairwise models for studying neural data has been the focus of many studies in recent years. In this paper, we describe how tools from statistical physics can be employed for studying and using pairwise models. We build on our previous work on the subject and study the relation between different methods for fitting these models and evaluating their quality. In particular, using data from simulated cortical networks we study how the quality of various approximate methods for inferring the parameters in a pairwise model depends on the time bin chosen for binning the data. We also study the effect of the size of the time bin on the model quality itself, again using simulated data. We show that using finer time bins increases the quality of the pairwise model. We offer new ways of deriving the expressions reported in our previous work for assessing the quality of pairwise models.
Improving the Statistical Modeling of the TRMM Extreme Precipitation Monitoring System
Demirdjian, L.; Zhou, Y.; Huffman, G. J.
2016-12-01
This project improves upon an existing extreme precipitation monitoring system based on the Tropical Rainfall Measuring Mission (TRMM) daily product (3B42) using new statistical models. The proposed system utilizes a regional modeling approach, where data from similar grid locations are pooled to increase the quality and stability of the resulting model parameter estimates to compensate for the short data record. The regional frequency analysis is divided into two stages. In the first stage, the region defined by the TRMM measurements is partitioned into approximately 27,000 non-overlapping clusters using a recursive k-means clustering scheme. In the second stage, a statistical model is used to characterize the extreme precipitation events occurring in each cluster. Instead of utilizing the block-maxima approach used in the existing system, where annual maxima are fit to the Generalized Extreme Value (GEV) probability distribution at each cluster separately, the present work adopts the peak-over-threshold (POT) method of classifying points as extreme if they exceed a pre-specified threshold. Theoretical considerations motivate the use of the Generalized-Pareto (GP) distribution for fitting threshold exceedances. The fitted parameters can be used to construct simple and intuitive average recurrence interval (ARI) maps which reveal how rare a particular precipitation event is given its spatial location. The new methodology eliminates much of the random noise that was produced by the existing models due to a short data record, producing more reasonable ARI maps when compared with NOAA's long-term Climate Prediction Center (CPC) ground based observations. The resulting ARI maps can be useful for disaster preparation, warning, and management, as well as increased public awareness of the severity of precipitation events. Furthermore, the proposed methodology can be applied to various other extreme climate records.
Experimental design techniques in statistical practice a practical software-based approach
Gardiner, W P
1998-01-01
Provides an introduction to the diverse subject area of experimental design, with many practical and applicable exercises to help the reader understand, present and analyse the data. The pragmatic approach offers technical training for use of designs and teaches statistical and non-statistical skills in design and analysis of project studies throughout science and industry. Provides an introduction to the diverse subject area of experimental design and includes practical and applicable exercises to help understand, present and analyse the data Offers technical training for use of designs and teaches statistical and non-statistical skills in design and analysis of project studies throughout science and industry Discusses one-factor designs and blocking designs, factorial experimental designs, Taguchi methods and response surface methods, among other topics.
A neighborhood statistics model for predicting stream pathogen indicator levels.
Pandey, Pramod K; Pasternack, Gregory B; Majumder, Mahbubul; Soupir, Michelle L; Kaiser, Mark S
2015-03-01
Because elevated levels of water-borne Escherichia coli in streams are a leading cause of water quality impairments in the U.S., water-quality managers need tools for predicting aqueous E. coli levels. Presently, E. coli levels may be predicted using complex mechanistic models that have a high degree of unchecked uncertainty or simpler statistical models. To assess spatio-temporal patterns of instream E. coli levels, herein we measured E. coli, a pathogen indicator, at 16 sites (at four different times) within the Squaw Creek watershed, Iowa, and subsequently, the Markov Random Field model was exploited to develop a neighborhood statistics model for predicting instream E. coli levels. Two observed covariates, local water temperature (degrees Celsius) and mean cross-sectional depth (meters), were used as inputs to the model. Predictions of E. coli levels in the water column were compared with independent observational data collected from 16 in-stream locations. The results revealed that spatio-temporal averages of predicted and observed E. coli levels were extremely close. Approximately 66 % of individual predicted E. coli concentrations were within a factor of 2 of the observed values. In only one event, the difference between prediction and observation was beyond one order of magnitude. The mean of all predicted values at 16 locations was approximately 1 % higher than the mean of the observed values. The approach presented here will be useful while assessing instream contaminations such as pathogen/pathogen indicator levels at the watershed scale.
Statistical mechanics of learning: A variational approach for real data
International Nuclear Information System (INIS)
Malzahn, Doerthe; Opper, Manfred
2002-01-01
Using a variational technique, we generalize the statistical physics approach of learning from random examples to make it applicable to real data. We demonstrate the validity and relevance of our method by computing approximate estimators for generalization errors that are based on training data alone
A statistical-based approach for fault detection and diagnosis in a photovoltaic system
Garoudja, Elyes
2017-07-10
This paper reports a development of a statistical approach for fault detection and diagnosis in a PV system. Specifically, the overarching goal of this work is to early detect and identify faults on the DC side of a PV system (e.g., short-circuit faults; open-circuit faults; and partial shading faults). Towards this end, we apply exponentially-weighted moving average (EWMA) control chart on the residuals obtained from the one-diode model. Such a choice is motivated by the greater sensitivity of EWMA chart to incipient faults and its low-computational cost making it easy to implement in real time. Practical data from a 3.2 KWp photovoltaic plant located within an Algerian research center is used to validate the proposed approach. Results show clearly the efficiency of the developed method in monitoring PV system status.
Yuan, Ke-Hai; Tian, Yubin; Yanagihara, Hirokazu
2015-06-01
Survey data typically contain many variables. Structural equation modeling (SEM) is commonly used in analyzing such data. The most widely used statistic for evaluating the adequacy of a SEM model is T ML, a slight modification to the likelihood ratio statistic. Under normality assumption, T ML approximately follows a chi-square distribution when the number of observations (N) is large and the number of items or variables (p) is small. However, in practice, p can be rather large while N is always limited due to not having enough participants. Even with a relatively large N, empirical results show that T ML rejects the correct model too often when p is not too small. Various corrections to T ML have been proposed, but they are mostly heuristic. Following the principle of the Bartlett correction, this paper proposes an empirical approach to correct T ML so that the mean of the resulting statistic approximately equals the degrees of freedom of the nominal chi-square distribution. Results show that empirically corrected statistics follow the nominal chi-square distribution much more closely than previously proposed corrections to T ML, and they control type I errors reasonably well whenever N ≥ max(50,2p). The formulations of the empirically corrected statistics are further used to predict type I errors of T ML as reported in the literature, and they perform well.
International Nuclear Information System (INIS)
Peluso, E; Gelfusa, M; Gaudio, P; Murari, A
2014-01-01
Access to the H mode of confinement in tokamaks is characterized by an abrupt transition, which has been the subject of continuous investigation for decades. Various theoretical models have been developed and multi-machine databases of experimental data have been collected. In this paper, a new methodology is reviewed for the investigation of the scaling laws for the temperature threshold to access the H mode. The approach is based on symbolic regression via genetic programming and allows first the extraction of the most statistically reliable models from the available experimental data. Nonlinear fitting is then applied to the mathematical expressions found by symbolic regression; this second step permits to easily compare the quality of the data-driven scalings with the most widely accepted theoretical models. The application of a complete set of statistical indicators shows that the data-driven scaling laws are qualitatively better than the theoretical models. The main limitations of the theoretical models are that they are all expressed as power laws, which are too rigid to fit the available experimental data and to extrapolate to ITER. The proposed method is absolutely general and can be applied to the extraction or scaling law from any experimental database of sufficient statistical relevance. (paper)
Statistical Models for Inferring Vegetation Composition from Fossil Pollen
Paciorek, C.; McLachlan, J. S.; Shang, Z.
2011-12-01
Fossil pollen provide information about vegetation composition that can be used to help understand how vegetation has changed over the past. However, these data have not traditionally been analyzed in a way that allows for statistical inference about spatio-temporal patterns and trends. We build a Bayesian hierarchical model called STEPPS (Spatio-Temporal Empirical Prediction from Pollen in Sediments) that predicts forest composition in southern New England, USA, over the last two millenia based on fossil pollen. The critical relationships between abundances of tree taxa in the pollen record and abundances in actual vegetation are estimated using modern (Forest Inventory Analysis) data and (witness tree) data from colonial records. This gives us two time points at which both pollen and direct vegetation data are available. Based on these relationships, and incorporating our uncertainty about them, we predict forest composition using fossil pollen. We estimate the spatial distribution and relative abundances of tree species and draw inference about how these patterns have changed over time. Finally, we describe ongoing work to extend the modeling to the upper Midwest of the U.S., including an approach to infer tree density and thereby estimate the prairie-forest boundary in Minnesota and Wisconsin. This work is part of the PalEON project, which brings together a team of ecosystem modelers, paleoecologists, and statisticians with the goal of reconstructing vegetation responses to climate during the last two millenia in the northeastern and midwestern United States. The estimates from the statistical modeling will be used to assess and calibrate ecosystem models that are used to project ecological changes in response to global change.
Can spatial statistical river temperature models be transferred between catchments?
Jackson, Faye L.; Fryer, Robert J.; Hannah, David M.; Malcolm, Iain A.
2017-09-01
There has been increasing use of spatial statistical models to understand and predict river temperature (Tw) from landscape covariates. However, it is not financially or logistically feasible to monitor all rivers and the transferability of such models has not been explored. This paper uses Tw data from four river catchments collected in August 2015 to assess how well spatial regression models predict the maximum 7-day rolling mean of daily maximum Tw (Twmax) within and between catchments. Models were fitted for each catchment separately using (1) landscape covariates only (LS models) and (2) landscape covariates and an air temperature (Ta) metric (LS_Ta models). All the LS models included upstream catchment area and three included a river network smoother (RNS) that accounted for unexplained spatial structure. The LS models transferred reasonably to other catchments, at least when predicting relative levels of Twmax. However, the predictions were biased when mean Twmax differed between catchments. The RNS was needed to characterise and predict finer-scale spatially correlated variation. Because the RNS was unique to each catchment and thus non-transferable, predictions were better within catchments than between catchments. A single model fitted to all catchments found no interactions between the landscape covariates and catchment, suggesting that the landscape relationships were transferable. The LS_Ta models transferred less well, with particularly poor performance when the relationship with the Ta metric was physically implausible or required extrapolation outside the range of the data. A single model fitted to all catchments found catchment-specific relationships between Twmax and the Ta metric, indicating that the Ta metric was not transferable. These findings improve our understanding of the transferability of spatial statistical river temperature models and provide a foundation for developing new approaches for predicting Tw at unmonitored locations across
Angeler, David G; Viedma, Olga; Moreno, José M
2009-11-01
Time lag analysis (TLA) is a distance-based approach used to study temporal dynamics of ecological communities by measuring community dissimilarity over increasing time lags. Despite its increased use in recent years, its performance in comparison with other more direct methods (i.e., canonical ordination) has not been evaluated. This study fills this gap using extensive simulations and real data sets from experimental temporary ponds (true zooplankton communities) and landscape studies (landscape categories as pseudo-communities) that differ in community structure and anthropogenic stress history. Modeling time with a principal coordinate of neighborhood matrices (PCNM) approach, the canonical ordination technique (redundancy analysis; RDA) consistently outperformed the other statistical tests (i.e., TLAs, Mantel test, and RDA based on linear time trends) using all real data. In addition, the RDA-PCNM revealed different patterns of temporal change, and the strength of each individual time pattern, in terms of adjusted variance explained, could be evaluated, It also identified species contributions to these patterns of temporal change. This additional information is not provided by distance-based methods. The simulation study revealed better Type I error properties of the canonical ordination techniques compared with the distance-based approaches when no deterministic component of change was imposed on the communities. The simulation also revealed that strong emphasis on uniform deterministic change and low variability at other temporal scales is needed to result in decreased statistical power of the RDA-PCNM approach relative to the other methods. Based on the statistical performance of and information content provided by RDA-PCNM models, this technique serves ecologists as a powerful tool for modeling temporal change of ecological (pseudo-) communities.
Statistical transmutation in doped quantum dimer models.
Lamas, C A; Ralko, A; Cabra, D C; Poilblanc, D; Pujol, P
2012-07-06
We prove a "statistical transmutation" symmetry of doped quantum dimer models on the square, triangular, and kagome lattices: the energy spectrum is invariant under a simultaneous change of statistics (i.e., bosonic into fermionic or vice versa) of the holes and of the signs of all the dimer resonance loops. This exact transformation enables us to define the duality equivalence between doped quantum dimer Hamiltonians and provides the analytic framework to analyze dynamical statistical transmutations. We investigate numerically the doping of the triangular quantum dimer model with special focus on the topological Z(2) dimer liquid. Doping leads to four (instead of two for the square lattice) inequivalent families of Hamiltonians. Competition between phase separation, superfluidity, supersolidity, and fermionic phases is investigated in the four families.
International Nuclear Information System (INIS)
Calvin W. Johnson
2004-01-01
The general goal of the project is to develop and implement computer codes and input files to compute nuclear densities of state. Such densities are important input into calculations of statistical neutron capture, and are difficult to access experimentally. In particular, we will focus on calculating densities for nuclides in the mass range A ?????? 50 - 100. We use statistical spectroscopy, a moments method based upon a microscopic framework, the interacting shell model. In this report we present our progress for the past year
Off-critical statistical models: factorized scattering theories and bootstrap program
International Nuclear Information System (INIS)
Mussardo, G.
1992-01-01
We analyze those integrable statistical systems which originate from some relevant perturbations of the minimal models of conformal field theories. When only massive excitations are present, the systems can be efficiently characterized in terms of the relativistic scattering data. We review the general properties of the factorizable S-matrix in two dimensions with particular emphasis on the bootstrap principle. The classification program of the allowed spins of conserved currents and of the non-degenerate S-matrices is discussed and illustrated by means of some significant examples. The scattering theories of several massive perturbations of the minimal models are fully discussed. Among them are the Ising model, the tricritical Ising model, the Potts models, the series of the non-unitary minimal models M 2,2n+3 , the non-unitary model M 3,5 and the scaling limit of the polymer system. The ultraviolet limit of these massive integrable theories can be exploited by the thermodynamics Bethe ansatz, in particular the central charge of the original conformal theories can be recovered from the scattering data. We also consider the numerical method based on the so-called conformal space truncated approach which confirms the theoretical results and allows a direct measurement of the scattering data, i.e. the masses and the S-matrix of the particles in bootstrap interaction. The problem of computing the off-critical correlation functions is discussed in terms of the form-factor approach
DEFF Research Database (Denmark)
Löwe, Roland; Del Giudice, Dario; Mikkelsen, Peter Steen
of input uncertainties observed in the models. The explicit inclusion of such variations in the modelling process will lead to a better fulfilment of the assumptions made in formal statistical frameworks, thus reducing the need to resolve to informal methods. The two approaches presented here...
Eeuwijk, van, F.A.
1996-01-01
In plant breeding it is a common observation to see genotypes react differently to environmental changes. This phenomenon is called genotype by environment interaction. Many statistical approaches for analysing genotype by environment interaction rely heavily on the analysis of variance model. Genotype by environment interaction is then taken to be equivalent to non-additivity. This thesis criticizes the analysis of variance approach. Modelling genotype by environment interaction by non-addit...
Textual information access statistical models
Gaussier, Eric
2013-01-01
This book presents statistical models that have recently been developed within several research communities to access information contained in text collections. The problems considered are linked to applications aiming at facilitating information access:- information extraction and retrieval;- text classification and clustering;- opinion mining;- comprehension aids (automatic summarization, machine translation, visualization).In order to give the reader as complete a description as possible, the focus is placed on the probability models used in the applications
Mazzitello, Karina I.; Candia, Julián
2012-12-01
In every country, public and private agencies allocate extensive funding to collect large-scale statistical data, which in turn are studied and analyzed in order to determine local, regional, national, and international policies regarding all aspects relevant to the welfare of society. One important aspect of that process is the visualization of statistical data with embedded geographical information, which most often relies on archaic methods such as maps colored according to graded scales. In this work, we apply nonstandard visualization techniques based on physical principles. We illustrate the method with recent statistics on homicide rates in Brazil and their correlation to other publicly available data. This physics-based approach provides a novel tool that can be used by interdisciplinary teams investigating statistics and model projections in a variety of fields such as economics and gross domestic product research, public health and epidemiology, sociodemographics, political science, business and marketing, and many others.
Verfaillie, Deborah; Déqué, Michel; Morin, Samuel; Lafaysse, Matthieu
2017-04-01
Projections of future climate change have been increasingly called for lately, as the reality of climate change has been gradually accepted and societies and governments have started to plan upcoming mitigation and adaptation policies. In mountain regions such as the Alps or the Pyrenees, where winter tourism and hydropower production are large contributors to the regional revenue, particular attention is brought to current and future snow availability. The question of the vulnerability of mountain ecosystems as well as the occurrence of climate-related hazards such as avalanches and debris-flows is also under consideration. In order to generate projections of snow conditions, however, downscaling global climate models (GCMs) by using regional climate models (RCMs) is not sufficient to capture the fine-scale processes and thresholds at play. In particular, the altitudinal resolution matters, since the phase of precipitation is mainly controlled by the temperature which is altitude-dependent. Simulations from GCMs and RCMs moreover suffer from biases compared to local observations, due to their rather coarse spatial and altitudinal resolution, and often provide outputs at too coarse time resolution to drive impact models. RCM simulations must therefore be adjusted using empirical-statistical downscaling and error correction methods, before they can be used to drive specific models such as energy balance land surface models. In this study, time series of hourly temperature, precipitation, wind speed, humidity, and short- and longwave radiation were generated over the Pyrenees and the French Alps for the period 1950-2100, by using a new approach (named ADAMONT for ADjustment of RCM outputs to MOuNTain regions) based on quantile mapping applied to daily data, followed by time disaggregation accounting for weather patterns selection. We first introduce a thorough evaluation of the method using using model runs from the ALADIN RCM driven by a global reanalysis over the
International Nuclear Information System (INIS)
Gao, S; Meyer, R; Shi, L; D'Souza, W; Zhang, H
2014-01-01
Purpose: To apply a statistical modeling approach, threshold modeling (TM), for quality control of intensity-modulated radiation therapy (IMRT) treatment plans. Methods: A quantitative measure, which was the weighted sum of violations of dose/dose-volume constraints, was first developed to represent the quality of each IMRT plan. Threshold modeling approach, which is is an extension of extreme value theory in statistics and is an effect way to model extreme values, was then applied to analyze the quality of the plans summarized by our quantitative measures. Our approach modeled the plans generated by planners as a series of independent and identically distributed random variables and described the behaviors of them if the plan quality was controlled below certain threshold. We tested our approach with five locally advanced head and neck cancer patients retrospectively. Two statistics were incorporated for numerical analysis: probability of quality improvement (PQI) of the plans and expected amount of improvement on the quantitative measure (EQI). Results: After clinical planners generated 15 plans for each patient, we applied our approach to obtain the PQI and EQI as if planners would generate additional 15 plans. For two of the patients, the PQI was significantly higher than the other three (0.17 and 0.18 comparing to 0.08, 0.01 and 0.01). The actual percentage of the additional 15 plans that outperformed the best of initial 15 plans was 20% and 27% comparing to 11%, 0% and 0%. EQI for the two potential patients were 34.5 and 32.9 and the rest of three patients were 9.9, 1.4 and 6.6. The actual improvements obtained were 28.3 and 20.5 comparing to 6.2, 0 and 0. Conclusion: TM is capable of reliably identifying the potential quality improvement of IMRT plans. It provides clinicians an effective tool to assess the trade-off between extra planning effort and achievable plan quality. This work was supported in part by NIH/NCI grant CA130814
Prediction of Frost Occurrences Using Statistical Modeling Approaches
Directory of Open Access Journals (Sweden)
Hyojin Lee
2016-01-01
Full Text Available We developed the frost prediction models in spring in Korea using logistic regression and decision tree techniques. Hit Rate (HR, Probability of Detection (POD, and False Alarm Rate (FAR from both models were calculated and compared. Threshold values for the logistic regression models were selected to maximize HR and POD and minimize FAR for each station, and the split for the decision tree models was stopped when change in entropy was relatively small. Average HR values were 0.92 and 0.91 for logistic regression and decision tree techniques, respectively, average POD values were 0.78 and 0.80 for logistic regression and decision tree techniques, respectively, and average FAR values were 0.22 and 0.28 for logistic regression and decision tree techniques, respectively. The average numbers of selected explanatory variables were 5.7 and 2.3 for logistic regression and decision tree techniques, respectively. Fewer explanatory variables can be more appropriate for operational activities to provide a timely warning for the prevention of the frost damages to agricultural crops. We concluded that the decision tree model can be more useful for the timely warning system. It is recommended that the models should be improved to reflect local topological features.
Quantum mechanics and field theory with fractional spin and statistics
International Nuclear Information System (INIS)
Forte, S.
1992-01-01
Planar systems admit quantum states that are neither bosons nor fermions, i.e., whose angular momentum is neither integer nor half-integer. After a discussion of some examples of familiar models in which fractional spin may arise, the relevant (nonrelativistic) quantum mechanics is developed from first principles. The appropriate generalization of statistics is also discussed. Some physical effects of fractional spin and statistics are worked out explicitly. The group theory underlying relativistic models with fractional spin and statistics is then introduced and applied to relativistic particle mechanics and field theory. Field-theoretical models in 2+1 dimensions are presented which admit solitons that carry fractional statistics, and are discussed in a semiclassical approach, in the functional integral approach, and in the canonical approach. Finally, fundamental field theories whose Fock states carry fractional spin and statistics are discussed
Model for neural signaling leap statistics
International Nuclear Information System (INIS)
Chevrollier, Martine; Oria, Marcos
2011-01-01
We present a simple model for neural signaling leaps in the brain considering only the thermodynamic (Nernst) potential in neuron cells and brain temperature. We numerically simulated connections between arbitrarily localized neurons and analyzed the frequency distribution of the distances reached. We observed qualitative change between Normal statistics (with T 37.5 0 C, awaken regime) and Levy statistics (T = 35.5 0 C, sleeping period), characterized by rare events of long range connections.
Model for neural signaling leap statistics
Chevrollier, Martine; Oriá, Marcos
2011-03-01
We present a simple model for neural signaling leaps in the brain considering only the thermodynamic (Nernst) potential in neuron cells and brain temperature. We numerically simulated connections between arbitrarily localized neurons and analyzed the frequency distribution of the distances reached. We observed qualitative change between Normal statistics (with T = 37.5°C, awaken regime) and Lévy statistics (T = 35.5°C, sleeping period), characterized by rare events of long range connections.
WE-A-201-02: Modern Statistical Modeling
Energy Technology Data Exchange (ETDEWEB)
Niemierko, A.
2016-06-15
Chris Marshall: Memorial Introduction Donald Edmonds Herbert Jr., or Don to his colleagues and friends, exemplified the “big tent” vision of medical physics, specializing in Applied Statistics and Dynamical Systems theory. He saw, more clearly than most, that “Making models is the difference between doing science and just fooling around [ref Woodworth, 2004]”. Don developed an interest in chemistry at school by “reading a book” - a recurring theme in his story. He was awarded a Westinghouse Science scholarship and attended the Carnegie Institute of Technology (later Carnegie Mellon University) where his interest turned to physics and led to a BS in Physics after transfer to Northwestern University. After (voluntary) service in the Navy he earned his MS in Physics from the University of Oklahoma, which led him to Johns Hopkins University in Baltimore to pursue a PhD. The early death of his wife led him to take a salaried position in the Physics Department of Colorado College in Colorado Springs so as to better care for their young daughter. There, a chance invitation from Dr. Juan del Regato to teach physics to residents at the Penrose Cancer Hospital introduced him to Medical Physics, and he decided to enter the field. He received his PhD from the University of London (UK) under Prof. Joseph Rotblat, where I first met him, and where he taught himself statistics. He returned to Penrose as a clinical medical physicist, also largely self-taught. In 1975 he formalized an evolving interest in statistical analysis as Professor of Radiology and Head of the Division of Physics and Statistics at the College of Medicine of the University of South Alabama in Mobile, AL where he remained for the rest of his career. He also served as the first Director of their Bio-Statistics and Epidemiology Core Unit working in part on a sickle-cell disease. After retirement he remained active as Professor Emeritus. Don served for several years as a consultant to the Nuclear
WE-A-201-02: Modern Statistical Modeling
International Nuclear Information System (INIS)
Niemierko, A.
2016-01-01
Chris Marshall: Memorial Introduction Donald Edmonds Herbert Jr., or Don to his colleagues and friends, exemplified the “big tent” vision of medical physics, specializing in Applied Statistics and Dynamical Systems theory. He saw, more clearly than most, that “Making models is the difference between doing science and just fooling around [ref Woodworth, 2004]”. Don developed an interest in chemistry at school by “reading a book” - a recurring theme in his story. He was awarded a Westinghouse Science scholarship and attended the Carnegie Institute of Technology (later Carnegie Mellon University) where his interest turned to physics and led to a BS in Physics after transfer to Northwestern University. After (voluntary) service in the Navy he earned his MS in Physics from the University of Oklahoma, which led him to Johns Hopkins University in Baltimore to pursue a PhD. The early death of his wife led him to take a salaried position in the Physics Department of Colorado College in Colorado Springs so as to better care for their young daughter. There, a chance invitation from Dr. Juan del Regato to teach physics to residents at the Penrose Cancer Hospital introduced him to Medical Physics, and he decided to enter the field. He received his PhD from the University of London (UK) under Prof. Joseph Rotblat, where I first met him, and where he taught himself statistics. He returned to Penrose as a clinical medical physicist, also largely self-taught. In 1975 he formalized an evolving interest in statistical analysis as Professor of Radiology and Head of the Division of Physics and Statistics at the College of Medicine of the University of South Alabama in Mobile, AL where he remained for the rest of his career. He also served as the first Director of their Bio-Statistics and Epidemiology Core Unit working in part on a sickle-cell disease. After retirement he remained active as Professor Emeritus. Don served for several years as a consultant to the Nuclear
New advances in the statistical parton distributions approach*
Directory of Open Access Journals (Sweden)
Soffer Jacques
2016-01-01
Full Text Available The quantum statistical parton distributions approach proposed more than one decade ago is revisited by considering a larger set of recent and accurate Deep Inelastic Scattering experimental results. It enables us to improve the description of the data by means of a new determination of the parton distributions. This global next-to-leading order QCD analysis leads to a good description of several structure functions, involving unpolarized parton distributions and helicity distributions, in terms of a rather small number of free parameters. There are many serious challenging issues. The predictions of this theoretical approach will be tested for single-jet production and charge asymmetry in W± production in p̄p and pp collisions up to LHC energies, using recent data and also for forthcoming experimental results.
Application of the statistical approach in diagnosing in medical and biological researches
Directory of Open Access Journals (Sweden)
Komleva N. О.
2017-09-01
Full Text Available The task of diagnosis in biomedical research in a number of cases can be solved using a statistical approach. Current research is the possibility of using statistical analysis to diagnose the state of the human respiratory system based on the values of the percentage contributions of particles of different sizes contained in the exhaled air. The aim of the research is to identify certain regularities in the values of the diagnostic signs of the moisture condensation of the exhaled air, which will make it possible to consider the groups under investigation as disjoint classes. Three groups of individuals were examined: healthy people and patients with bronchitis and pneumonia. For each group, the identification of the particles that are the primary diagnostic data using the laser correlation spectroscopy method and the further processing of the data using the discriminant analysis method are performed. Selection of variables discriminating the study groups in the best possible manner is done; the model of variables and classification functions is constructed. There are presented the results of the main steps of the analysis – the set of variables included in the model and the coefficients of the classification functions for the three groups – which formed the basis for the algorithm for the work of the developed software product.
Valid statistical approaches for analyzing sholl data: Mixed effects versus simple linear models.
Wilson, Machelle D; Sethi, Sunjay; Lein, Pamela J; Keil, Kimberly P
2017-03-01
The Sholl technique is widely used to quantify dendritic morphology. Data from such studies, which typically sample multiple neurons per animal, are often analyzed using simple linear models. However, simple linear models fail to account for intra-class correlation that occurs with clustered data, which can lead to faulty inferences. Mixed effects models account for intra-class correlation that occurs with clustered data; thus, these models more accurately estimate the standard deviation of the parameter estimate, which produces more accurate p-values. While mixed models are not new, their use in neuroscience has lagged behind their use in other disciplines. A review of the published literature illustrates common mistakes in analyses of Sholl data. Analysis of Sholl data collected from Golgi-stained pyramidal neurons in the hippocampus of male and female mice using both simple linear and mixed effects models demonstrates that the p-values and standard deviations obtained using the simple linear models are biased downwards and lead to erroneous rejection of the null hypothesis in some analyses. The mixed effects approach more accurately models the true variability in the data set, which leads to correct inference. Mixed effects models avoid faulty inference in Sholl analysis of data sampled from multiple neurons per animal by accounting for intra-class correlation. Given the widespread practice in neuroscience of obtaining multiple measurements per subject, there is a critical need to apply mixed effects models more widely. Copyright © 2017 Elsevier B.V. All rights reserved.
Equilibrium statistical mechanics of lattice models
Lavis, David A
2015-01-01
Most interesting and difficult problems in equilibrium statistical mechanics concern models which exhibit phase transitions. For graduate students and more experienced researchers this book provides an invaluable reference source of approximate and exact solutions for a comprehensive range of such models. Part I contains background material on classical thermodynamics and statistical mechanics, together with a classification and survey of lattice models. The geometry of phase transitions is described and scaling theory is used to introduce critical exponents and scaling laws. An introduction is given to finite-size scaling, conformal invariance and Schramm—Loewner evolution. Part II contains accounts of classical mean-field methods. The parallels between Landau expansions and catastrophe theory are discussed and Ginzburg—Landau theory is introduced. The extension of mean-field theory to higher-orders is explored using the Kikuchi—Hijmans—De Boer hierarchy of approximations. In Part III the use of alge...
Huttary, Rudolf; Goubergrits, Leonid; Schütte, Christof; Bernhard, Stefan
2017-08-01
It has not yet been possible to obtain modeling approaches suitable for covering a wide range of real world scenarios in cardiovascular physiology because many of the system parameters are uncertain or even unknown. Natural variability and statistical variation of cardiovascular system parameters in healthy and diseased conditions are characteristic features for understanding cardiovascular diseases in more detail. This paper presents SISCA, a novel software framework for cardiovascular system modeling and its MATLAB implementation. The framework defines a multi-model statistical ensemble approach for dimension reduced, multi-compartment models and focuses on statistical variation, system identification and patient-specific simulation based on clinical data. We also discuss a data-driven modeling scenario as a use case example. The regarded dataset originated from routine clinical examinations and comprised typical pre and post surgery clinical data from a patient diagnosed with coarctation of aorta. We conducted patient and disease specific pre/post surgery modeling by adapting a validated nominal multi-compartment model with respect to structure and parametrization using metadata and MRI geometry. In both models, the simulation reproduced measured pressures and flows fairly well with respect to stenosis and stent treatment and by pre-treatment cross stenosis phase shift of the pulse wave. However, with post-treatment data showing unrealistic phase shifts and other more obvious inconsistencies within the dataset, the methods and results we present suggest that conditioning and uncertainty management of routine clinical data sets needs significantly more attention to obtain reasonable results in patient-specific cardiovascular modeling. Copyright © 2017 Elsevier Ltd. All rights reserved.
Spherical Process Models for Global Spatial Statistics
Jeong, Jaehong; Jun, Mikyoung; Genton, Marc G.
2017-01-01
Statistical models used in geophysical, environmental, and climate science applications must reflect the curvature of the spatial domain in global data. Over the past few decades, statisticians have developed covariance models that capture
Model for neural signaling leap statistics
Energy Technology Data Exchange (ETDEWEB)
Chevrollier, Martine; Oria, Marcos, E-mail: oria@otica.ufpb.br [Laboratorio de Fisica Atomica e Lasers Departamento de Fisica, Universidade Federal da ParaIba Caixa Postal 5086 58051-900 Joao Pessoa, Paraiba (Brazil)
2011-03-01
We present a simple model for neural signaling leaps in the brain considering only the thermodynamic (Nernst) potential in neuron cells and brain temperature. We numerically simulated connections between arbitrarily localized neurons and analyzed the frequency distribution of the distances reached. We observed qualitative change between Normal statistics (with T 37.5{sup 0}C, awaken regime) and Levy statistics (T = 35.5{sup 0}C, sleeping period), characterized by rare events of long range connections.
He, Yuning
2015-01-01
The behavior of complex aerospace systems is governed by numerous parameters. For safety analysis it is important to understand how the system behaves with respect to these parameter values. In particular, understanding the boundaries between safe and unsafe regions is of major importance. In this paper, we describe a hierarchical Bayesian statistical modeling approach for the online detection and characterization of such boundaries. Our method for classification with active learning uses a particle filter-based model and a boundary-aware metric for best performance. From a library of candidate shapes incorporated with domain expert knowledge, the location and parameters of the boundaries are estimated using advanced Bayesian modeling techniques. The results of our boundary analysis are then provided in a form understandable by the domain expert. We illustrate our approach using a simulation model of a NASA neuro-adaptive flight control system, as well as a system for the detection of separation violations in the terminal airspace.
Using a Statistical Approach to Anticipate Leaf Wetness Duration Under Climate Change in France
Huard, F.; Imig, A. F.; Perrin, P.
2014-12-01
Leaf wetness plays a major role in the development of fungal plant diseases. Leaf wetness duration (LWD) above a threshold value is determinant for infection and can be seen as a good indicator of impact of climate on infection occurrence and risk. As LWD is not widely measured, several methods, based on physics and empirical approach, have been developed to estimate it from weather data. Many LWD statistical models do exist, but the lack of standard for measurements require reassessments. A new empirical LWD model, called MEDHI (Modèle d'Estimation de la Durée d'Humectation à l'Inra) was developed for french configuration for wetness sensors (angle : 90°, height : 50 cm). This deployment is different from what is usually recommended from constructors or authors in other countries (angle from 10 to 60°, height from 10 to 150 cm…). MEDHI is a decision support system based on hourly climatic conditions at time steps n and n-1 taking account relative humidity, rainfall and previously simulated LWD. Air temperature, relative humidity, wind speed, rain and LWD data from several sensors with 2 configurations were measured during 6 months in Toulouse and Avignon (South West and South East of France) to calibrate MEDHI. A comparison of empirical models : NHRH (RH threshold), DPD (dew point depression), CART (classification and regression tree analysis dependant on RH, wind speed and dew point depression) and MEDHI, using meteorological and LWD measurements obtained during 5 months in Toulouse, showed that the development of this new model MEHDI was definitely better adapted to French conditions. In the context of climate change, MEDHI was used for mapping the evolution of leaf wetness duration in France from 1950 to 2100 with the French regional climate model ALADIN under different Representative Concentration Pathways (RCPs) and using a QM (Quantile-Mapping) statistical downscaling method. Results give information on the spatial distribution of infection risks
Jenkinson, Garrett; Abante, Jordi; Feinberg, Andrew P; Goutsias, John
2018-03-07
DNA methylation is a stable form of epigenetic memory used by cells to control gene expression. Whole genome bisulfite sequencing (WGBS) has emerged as a gold-standard experimental technique for studying DNA methylation by producing high resolution genome-wide methylation profiles. Statistical modeling and analysis is employed to computationally extract and quantify information from these profiles in an effort to identify regions of the genome that demonstrate crucial or aberrant epigenetic behavior. However, the performance of most currently available methods for methylation analysis is hampered by their inability to directly account for statistical dependencies between neighboring methylation sites, thus ignoring significant information available in WGBS reads. We present a powerful information-theoretic approach for genome-wide modeling and analysis of WGBS data based on the 1D Ising model of statistical physics. This approach takes into account correlations in methylation by utilizing a joint probability model that encapsulates all information available in WGBS methylation reads and produces accurate results even when applied on single WGBS samples with low coverage. Using the Shannon entropy, our approach provides a rigorous quantification of methylation stochasticity in individual WGBS samples genome-wide. Furthermore, it utilizes the Jensen-Shannon distance to evaluate differences in methylation distributions between a test and a reference sample. Differential performance assessment using simulated and real human lung normal/cancer data demonstrate a clear superiority of our approach over DSS, a recently proposed method for WGBS data analysis. Critically, these results demonstrate that marginal methods become statistically invalid when correlations are present in the data. This contribution demonstrates clear benefits and the necessity of modeling joint probability distributions of methylation using the 1D Ising model of statistical physics and of
Biagini, Francesca
2016-01-01
This book provides an introduction to elementary probability and to Bayesian statistics using de Finetti's subjectivist approach. One of the features of this approach is that it does not require the introduction of sample space – a non-intrinsic concept that makes the treatment of elementary probability unnecessarily complicate – but introduces as fundamental the concept of random numbers directly related to their interpretation in applications. Events become a particular case of random numbers and probability a particular case of expectation when it is applied to events. The subjective evaluation of expectation and of conditional expectation is based on an economic choice of an acceptable bet or penalty. The properties of expectation and conditional expectation are derived by applying a coherence criterion that the evaluation has to follow. The book is suitable for all introductory courses in probability and statistics for students in Mathematics, Informatics, Engineering, and Physics.
Analysis and Evaluation of Statistical Models for Integrated Circuits Design
Directory of Open Access Journals (Sweden)
Sáenz-Noval J.J.
2011-10-01
Full Text Available Statistical models for integrated circuits (IC allow us to estimate the percentage of acceptable devices in the batch before fabrication. Actually, Pelgrom is the statistical model most accepted in the industry; however it was derived from a micrometer technology, which does not guarantee reliability in nanometric manufacturing processes. This work considers three of the most relevant statistical models in the industry and evaluates their limitations and advantages in analog design, so that the designer has a better criterion to make a choice. Moreover, it shows how several statistical models can be used for each one of the stages and design purposes.
The issue of statistical power for overall model fit in evaluating structural equation models
Directory of Open Access Journals (Sweden)
Richard HERMIDA
2015-06-01
Full Text Available Statistical power is an important concept for psychological research. However, examining the power of a structural equation model (SEM is rare in practice. This article provides an accessible review of the concept of statistical power for the Root Mean Square Error of Approximation (RMSEA index of overall model fit in structural equation modeling. By way of example, we examine the current state of power in the literature by reviewing studies in top Industrial-Organizational (I/O Psychology journals using SEMs. Results indicate that in many studies, power is very low, which implies acceptance of invalid models. Additionally, we examined methodological situations which may have an influence on statistical power of SEMs. Results showed that power varies significantly as a function of model type and whether or not the model is the main model for the study. Finally, results indicated that power is significantly related to model fit statistics used in evaluating SEMs. The results from this quantitative review imply that researchers should be more vigilant with respect to power in structural equation modeling. We therefore conclude by offering methodological best practices to increase confidence in the interpretation of structural equation modeling results with respect to statistical power issues.
Gomez-Ramirez, Jaime; Sanz, Ricardo
2013-09-01
One of the most important scientific challenges today is the quantitative and predictive understanding of biological function. Classical mathematical and computational approaches have been enormously successful in modeling inert matter, but they may be inadequate to address inherent features of biological systems. We address the conceptual and methodological obstacles that lie in the inverse problem in biological systems modeling. We introduce a full Bayesian approach (FBA), a theoretical framework to study biological function, in which probability distributions are conditional on biophysical information that physically resides in the biological system that is studied by the scientist. Copyright © 2013 Elsevier Ltd. All rights reserved.
Santos, José António; Galante-Oliveira, Susana; Barroso, Carlos
2011-03-01
The current work presents an innovative statistical approach to model ordinal variables in environmental monitoring studies. An ordinal variable has values that can only be compared as "less", "equal" or "greater" and it is not possible to have information about the size of the difference between two particular values. The example of ordinal variable under this study is the vas deferens sequence (VDS) used in imposex (superimposition of male sexual characters onto prosobranch females) field assessment programmes for monitoring tributyltin (TBT) pollution. The statistical methodology presented here is the ordered logit regression model. It assumes that the VDS is an ordinal variable whose values match up a process of imposex development that can be considered continuous in both biological and statistical senses and can be described by a latent non-observable continuous variable. This model was applied to the case study of Nucella lapillus imposex monitoring surveys conducted in the Portuguese coast between 2003 and 2008 to evaluate the temporal evolution of TBT pollution in this country. In order to produce more reliable conclusions, the proposed model includes covariates that may influence the imposex response besides TBT (e.g. the shell size). The model also provides an analysis of the environmental risk associated to TBT pollution by estimating the probability of the occurrence of females with VDS ≥ 2 in each year, according to OSPAR criteria. We consider that the proposed application of this statistical methodology has a great potential in environmental monitoring whenever there is the need to model variables that can only be assessed through an ordinal scale of values.
A statistical-thermodynamic model for ordering phenomena in thin film intermetallic structures
International Nuclear Information System (INIS)
Semenova, Olga; Krachler, Regina
2008-01-01
Ordering phenomena in bcc (110) binary thin film intermetallics are studied by a statistical-thermodynamic model. The system is modeled by an Ising approach that includes only nearest-neighbor chemical interactions and is solved in a mean-field approximation. Vacancies and anti-structure atoms are considered on both sublattices. The model describes long-range ordering and simultaneously short-range ordering in the thin film. It is applied to NiAl thin films with B2 structure. Vacancy concentrations, thermodynamic activity profiles and the virtual critical temperature of order-disorder as a function of film composition and thickness are presented. The results point to an important role of vacancies in near-stoichiometric and Ni-rich NiAl thin films
Understanding and forecasting polar stratospheric variability with statistical models
Directory of Open Access Journals (Sweden)
C. Blume
2012-07-01
Full Text Available The variability of the north-polar stratospheric vortex is a prominent aspect of the middle atmosphere. This work investigates a wide class of statistical models with respect to their ability to model geopotential and temperature anomalies, representing variability in the polar stratosphere. Four partly nonstationary, nonlinear models are assessed: linear discriminant analysis (LDA; a cluster method based on finite elements (FEM-VARX; a neural network, namely the multi-layer perceptron (MLP; and support vector regression (SVR. These methods model time series by incorporating all significant external factors simultaneously, including ENSO, QBO, the solar cycle, volcanoes, to then quantify their statistical importance. We show that variability in reanalysis data from 1980 to 2005 is successfully modeled. The period from 2005 to 2011 can be hindcasted to a certain extent, where MLP performs significantly better than the remaining models. However, variability remains that cannot be statistically hindcasted within the current framework, such as the unexpected major warming in January 2009. Finally, the statistical model with the best generalization performance is used to predict a winter 2011/12 with warm and weak vortex conditions. A vortex breakdown is predicted for late January, early February 2012.
Robot Trajectories Comparison: A Statistical Approach
Directory of Open Access Journals (Sweden)
A. Ansuategui
2014-01-01
Full Text Available The task of planning a collision-free trajectory from a start to a goal position is fundamental for an autonomous mobile robot. Although path planning has been extensively investigated since the beginning of robotics, there is no agreement on how to measure the performance of a motion algorithm. This paper presents a new approach to perform robot trajectories comparison that could be applied to any kind of trajectories and in both simulated and real environments. Given an initial set of features, it automatically selects the most significant ones and performs a statistical comparison using them. Additionally, a graphical data visualization named polygraph which helps to better understand the obtained results is provided. The proposed method has been applied, as an example, to compare two different motion planners, FM2 and WaveFront, using different environments, robots, and local planners.
Robot Trajectories Comparison: A Statistical Approach
Ansuategui, A.; Arruti, A.; Susperregi, L.; Yurramendi, Y.; Jauregi, E.; Lazkano, E.; Sierra, B.
2014-01-01
The task of planning a collision-free trajectory from a start to a goal position is fundamental for an autonomous mobile robot. Although path planning has been extensively investigated since the beginning of robotics, there is no agreement on how to measure the performance of a motion algorithm. This paper presents a new approach to perform robot trajectories comparison that could be applied to any kind of trajectories and in both simulated and real environments. Given an initial set of features, it automatically selects the most significant ones and performs a statistical comparison using them. Additionally, a graphical data visualization named polygraph which helps to better understand the obtained results is provided. The proposed method has been applied, as an example, to compare two different motion planners, FM2 and WaveFront, using different environments, robots, and local planners. PMID:25525618
Probabilistic Forecasting of Photovoltaic Generation: An Efficient Statistical Approach
DEFF Research Database (Denmark)
Wan, Can; Lin, Jin; Song, Yonghua
2017-01-01
This letter proposes a novel efficient probabilistic forecasting approach to accurately quantify the variability and uncertainty of the power production from photovoltaic (PV) systems. Distinguished from most existing models, a linear programming based prediction interval construction model for P...... power generation is proposed based on extreme learning machine and quantile regression, featuring high reliability and computational efficiency. The proposed approach is validated through the numerical studies on PV data from Denmark.......This letter proposes a novel efficient probabilistic forecasting approach to accurately quantify the variability and uncertainty of the power production from photovoltaic (PV) systems. Distinguished from most existing models, a linear programming based prediction interval construction model for PV...
Daily precipitation statistics in regional climate models
DEFF Research Database (Denmark)
Frei, Christoph; Christensen, Jens Hesselbjerg; Déqué, Michel
2003-01-01
An evaluation is undertaken of the statistics of daily precipitation as simulated by five regional climate models using comprehensive observations in the region of the European Alps. Four limited area models and one variable-resolution global model are considered, all with a grid spacing of 50 km...
Infinite Random Graphs as Statistical Mechanical Models
DEFF Research Database (Denmark)
Durhuus, Bergfinnur Jøgvan; Napolitano, George Maria
2011-01-01
We discuss two examples of infinite random graphs obtained as limits of finite statistical mechanical systems: a model of two-dimensional dis-cretized quantum gravity defined in terms of causal triangulated surfaces, and the Ising model on generic random trees. For the former model we describe a ...
Snow cover and End of Summer Snowline statistics from a simple stochastic model
Petrelli, A.; Crouzy, B.; Perona, P.
2012-04-01
One essential parameter characterizing snow cover statistics is the End Of Summer Snowline (EOSS), which is also a good indicator of actual climatic trends in mountain regions. EOSS is usually modelled by means of spatially distributed physically based models, and typically require heavy parameterization. In this paper we validate the simple stochastic model proposed by Perona et al. (2007), by showing that the snow cover statistics and the position of EOSS can in principle be explained by only four essential (meteorological) parameters. Perona et al. (2007) proposed a model accounting for stochastic snow accumulation in the cold period, and deterministic melting dynamics in the warm period, and studied the statistical distribution of the snowdepth on the long term. By reworking the ensemble average of the steady state evolution equation we single out a relationship between the snowdepth statistics (including the position of EOSS) and the involved parameters. The validation of the established relationship is done using 50 years of field data from 73 Swiss stations located above 2000 m a.s.l. First an estimation of the meteorological parameters is made. Snow height data are used as a precipitation proxy, using temperature data to estimate SWE during the precipitation event. Thresholds are used both to separate accumulation from actual precipitation and wind transport phenomena, and to better assess summer melting rate, considered to be constant over the melting period according to the simplified model. First results show that data for most of the weather stations actually scales with the proposed relationship. This indicates that, on the long term, the effect of spatial and temporal noise masks most of the process detail so that minimalist models suffice to obtain reliable statistics. Future works will test the validity of this approach at different spatial scales, e.g., regional, continental and planetary. Reference: P. Perona, A. Porporato, and L. Ridolfi, "A
An R2 statistic for fixed effects in the linear mixed model.
Edwards, Lloyd J; Muller, Keith E; Wolfinger, Russell D; Qaqish, Bahjat F; Schabenberger, Oliver
2008-12-20
Statisticians most often use the linear mixed model to analyze Gaussian longitudinal data. The value and familiarity of the R(2) statistic in the linear univariate model naturally creates great interest in extending it to the linear mixed model. We define and describe how to compute a model R(2) statistic for the linear mixed model by using only a single model. The proposed R(2) statistic measures multivariate association between the repeated outcomes and the fixed effects in the linear mixed model. The R(2) statistic arises as a 1-1 function of an appropriate F statistic for testing all fixed effects (except typically the intercept) in a full model. The statistic compares the full model with a null model with all fixed effects deleted (except typically the intercept) while retaining exactly the same covariance structure. Furthermore, the R(2) statistic leads immediately to a natural definition of a partial R(2) statistic. A mixed model in which ethnicity gives a very small p-value as a longitudinal predictor of blood pressure (BP) compellingly illustrates the value of the statistic. In sharp contrast to the extreme p-value, a very small R(2) , a measure of statistical and scientific importance, indicates that ethnicity has an almost negligible association with the repeated BP outcomes for the study.
Directory of Open Access Journals (Sweden)
Peter E. Land
2018-05-01
Full Text Available Uncertainty estimation is crucial to establishing confidence in any data analysis, and this is especially true for Essential Climate Variables, including ocean colour. Methods for deriving uncertainty vary greatly across data types, so a generic statistics-based approach applicable to multiple data types is an advantage to simplify the use and understanding of uncertainty data. Progress towards rigorous uncertainty analysis of ocean colour has been slow, in part because of the complexity of ocean colour processing. Here, we present a general approach to uncertainty characterisation, using a database of satellite-in situ matchups to generate a statistical model of satellite uncertainty as a function of its contributing variables. With an example NASA MODIS-Aqua chlorophyll-a matchups database mostly covering the north Atlantic, we demonstrate a model that explains 67% of the squared error in log(chlorophyll-a as a potentially correctable bias, with the remaining uncertainty being characterised as standard deviation and standard error at each pixel. The method is quite general, depending only on the existence of a suitable database of matchups or reference values, and can be applied to other sensors and data types such as other satellite observed Essential Climate Variables, empirical algorithms derived from in situ data, or even model data.
Directory of Open Access Journals (Sweden)
Elise eVaumourin
2014-05-01
Full Text Available A growing number of studies are reporting simultaneous infections by parasites in many different hosts. The detection of whether these parasites are significantly associated is important in medicine and epidemiology. Numerous approaches to detect associations are available, but only a few provide statistical tests. Furthermore, they generally test for an overall detection of association and do not identify which parasite is associated with which other one. Here, we developed a new approach, the association screening approach, to detect the overall and the detail of multi-parasite associations. We studied the power of this new approach and of three other known ones (i.e. the generalized chi-square, the network and the multinomial GLM approaches to identify parasite associations either due to parasite interactions or to confounding factors. We applied these four approaches to detect associations within two populations of multi-infected hosts: 1 rodents infected with Bartonella sp., Babesia microti and Anaplasma phagocytophilum and 2 bovine population infected with Theileria sp. and Babesia sp.. We found that the best power is obtained with the screening model and the generalized chi-square test. The differentiation between associations, which are due to confounding factors and parasite interactions was not possible. The screening approach significantly identified associations between Bartonella doshiae and B. microti, and between T. parva, T. mutans and T. velifera. Thus, the screening approach was relevant to test the overall presence of parasite associations and identify the parasite combinations that are significantly over- or under-represented. Unravelling whether the associations are due to real biological interactions or confounding factors should be further investigated. Nevertheless, in the age of genomics and the advent of new technologies, it is a considerable asset to speed up researches focusing on the mechanisms driving interactions
Directory of Open Access Journals (Sweden)
KUDRYAVTSEV Pavel Gennadievich
2015-04-01
Full Text Available The paper deals with possibilities to use quasi-homogenous approximation for discription of properties of dispersed systems. The authors applied statistical polymer ethod based on consideration of average structures of all possible macromolecules of the same weight. The equiations which allow evaluating many additive parameters of macromolecules and the systems with them were deduced. Statistical polymer method makes it possible to model branched, cross-linked macromolecules and the systems with them which are in equilibrium or non-equilibrium state. Fractal analysis of statistical polymer allows modeling different types of random fractal and other objects examined with the mehods of fractal theory. The method of fractal polymer can be also applied not only to polymers but also to composites, gels, associates in polar liquids and other packaged systems. There is also a description of the states of colloid solutions of silica oxide from the point of view of statistical physics. This approach is based on the idea that colloid solution of silica dioxide – sol of silica dioxide – consists of enormous number of interacting particles which are always in move. The paper is devoted to the research of ideal system of colliding but not interacting particles of sol. The analysis of behavior of silica sol was performed according to distribution Maxwell-Boltzmann and free path length was calculated. Using this data the number of the particles which can overcome the potential barrier in collision was calculated. To model kinetics of sol-gel transition different approaches were studied.
Statistical approach for calculating opacities of high-Z plasmas
International Nuclear Information System (INIS)
Nishikawa, Takeshi; Nakamura, Shinji; Takabe, Hideaki; Mima, Kunioki
1992-01-01
For simulating the X-ray radiation from laser produced high-Z plasma, an appropriate atomic modeling is necessary. Based on the average ion model, we have used a rather simple atomic model for opacity calculation in a hydrodynamic code and obtained a fairly good agreement with the experiment on the X-ray spectra from the laser-produced plasmas. We have investigated the accuracy of the atomic model used in the hydrodynamic code. It is found that transition energies of 4p-4d, 4d-4f, 4p-5d, 4d-5f and 4f-5g, which are important in laser produced high-Z plasma, can be given within an error of 15 % compared to the values by the Hartree-Fock-Slater (HFS) calculation and their oscillator strengths obtained by HFS calculation vary by a factor two according to the difference of charge state. We also propose a statistical method to carry out detail configuration accounting for electronic state by use of the population of bound electrons calculated with the average ion model. The statistical method is relatively simple and provides much improvement in calculating spectral opacities of line radiation, when we use the average ion model to determine electronic state. (author)
SOCR: Statistics Online Computational Resource
Dinov, Ivo D.
2006-01-01
The need for hands-on computer laboratory experience in undergraduate and graduate statistics education has been firmly established in the past decade. As a result a number of attempts have been undertaken to develop novel approaches for problem-driven statistical thinking, data analysis and result interpretation. In this paper we describe an integrated educational web-based framework for: interactive distribution modeling, virtual online probability experimentation, statistical data analysis...
Predicting The Exit Time Of Employees In An Organization Using Statistical Model
Directory of Open Access Journals (Sweden)
Ahmed Al Kuwaiti
2015-08-01
Full Text Available Employees are considered as an asset to any organization and each organization provide a better and flexible working environment to retain its best and resourceful workforce. As such continuous efforts are being taken to avoid or extend the exitwithdrawal of employees from the organization. Human resource managers are facing a challenge to predict the exit time of employees and there is no precise model existing at present in the literature. This study has been conducted to predict the probability of exit of an employee in an organization using appropriate statistical model. Accordingly authors designed a model using Additive Weibull distribution to predict the expected exit time of employee in an organization. In addition a Shock model approach is also executed to check how well the Additive Weibull distribution suits in an organization. The analytical results showed that when the inter-arrival time increases the expected time for the employees to exit also increases. This study concluded that Additive Weibull distribution can be considered as an alternative in the place of Shock model approach to predict the exit time of employee in an organization.
Adaptive Maneuvering Frequency Method of Current Statistical Model
Institute of Scientific and Technical Information of China (English)
Wei Sun; Yongjian Yang
2017-01-01
Current statistical model(CSM) has a good performance in maneuvering target tracking. However, the fixed maneuvering frequency will deteriorate the tracking results, such as a serious dynamic delay, a slowly converging speedy and a limited precision when using Kalman filter(KF) algorithm. In this study, a new current statistical model and a new Kalman filter are proposed to improve the performance of maneuvering target tracking. The new model which employs innovation dominated subjection function to adaptively adjust maneuvering frequency has a better performance in step maneuvering target tracking, while a fluctuant phenomenon appears. As far as this problem is concerned, a new adaptive fading Kalman filter is proposed as well. In the new Kalman filter, the prediction values are amended in time by setting judgment and amendment rules,so that tracking precision and fluctuant phenomenon of the new current statistical model are improved. The results of simulation indicate the effectiveness of the new algorithm and the practical guiding significance.
Speech emotion recognition based on statistical pitch model
Institute of Scientific and Technical Information of China (English)
WANG Zhiping; ZHAO Li; ZOU Cairong
2006-01-01
A modified Parzen-window method, which keep high resolution in low frequencies and keep smoothness in high frequencies, is proposed to obtain statistical model. Then, a gender classification method utilizing the statistical model is proposed, which have a 98% accuracy of gender classification while long sentence is dealt with. By separation the male voice and female voice, the mean and standard deviation of speech training samples with different emotion are used to create the corresponding emotion models. Then the Bhattacharyya distance between the test sample and statistical models of pitch, are utilized for emotion recognition in speech.The normalization of pitch for the male voice and female voice are also considered, in order to illustrate them into a uniform space. Finally, the speech emotion recognition experiment based on K Nearest Neighbor shows that, the correct rate of 81% is achieved, where it is only 73.85%if the traditional parameters are utilized.
Statistical distance and the approach to KNO scaling
International Nuclear Information System (INIS)
Diosi, L.; Hegyi, S.; Krasznovszky, S.
1990-05-01
A new method is proposed for characterizing the approach to KNO scaling. The essence of our method lies in the concept of statistical distance between nearby KNO distributions which reflects their distinguishability in spite of multiplicity fluctuations. It is shown that the geometry induced by the distance function defines a natural metric on the parameter space of a certain family of KNO distributions. Some examples are given in which the energy dependences of distinguishability of neighbouring KNO distributions are compared in nondiffractive hadron-hadron collisions and electron-positron annihilation. (author) 19 refs.; 4 figs
Statistical modelling of citation exchange between statistics journals.
Varin, Cristiano; Cattelan, Manuela; Firth, David
2016-01-01
Rankings of scholarly journals based on citation data are often met with scepticism by the scientific community. Part of the scepticism is due to disparity between the common perception of journals' prestige and their ranking based on citation counts. A more serious concern is the inappropriate use of journal rankings to evaluate the scientific influence of researchers. The paper focuses on analysis of the table of cross-citations among a selection of statistics journals. Data are collected from the Web of Science database published by Thomson Reuters. Our results suggest that modelling the exchange of citations between journals is useful to highlight the most prestigious journals, but also that journal citation data are characterized by considerable heterogeneity, which needs to be properly summarized. Inferential conclusions require care to avoid potential overinterpretation of insignificant differences between journal ratings. Comparison with published ratings of institutions from the UK's research assessment exercise shows strong correlation at aggregate level between assessed research quality and journal citation 'export scores' within the discipline of statistics.
Directory of Open Access Journals (Sweden)
Isabel M. Joao
2017-09-01
Full Text Available Projects thematically focused on simulation and statistical techniques for designing and optimizing chemical processes can be helpful in chemical engineering education in order to meet the needs of engineers. We argue for the relevance of the projects to improve a student centred approach and boost higher order thinking skills. This paper addresses the use of Aspen HYSYS by Portuguese chemical engineering master students to model distillation systems together with statistical experimental design techniques in order to optimize the systems highlighting the value of applying problem specific knowledge, simulation tools and sound statistical techniques. The paper summarizes the work developed by the students in order to model steady-state processes, dynamic processes and optimize the distillation systems emphasizing the benefits of the simulation tools and statistical techniques in helping the students learn how to learn. Students strengthened their domain specific knowledge and became motivated to rethink and improve chemical processes in their future chemical engineering profession. We discuss the main advantages of the methodology from the students’ and teachers perspective
Advances in statistical models for data analysis
Minerva, Tommaso; Vichi, Maurizio
2015-01-01
This edited volume focuses on recent research results in classification, multivariate statistics and machine learning and highlights advances in statistical models for data analysis. The volume provides both methodological developments and contributions to a wide range of application areas such as economics, marketing, education, social sciences and environment. The papers in this volume were first presented at the 9th biannual meeting of the Classification and Data Analysis Group (CLADAG) of the Italian Statistical Society, held in September 2013 at the University of Modena and Reggio Emilia, Italy.
Bryan, Rebecca; Nair, Prasanth B; Taylor, Mark
2009-09-18
Interpatient variability is often overlooked in orthopaedic computational studies due to the substantial challenges involved in sourcing and generating large numbers of bone models. A statistical model of the whole femur incorporating both geometric and material property variation was developed as a potential solution to this problem. The statistical model was constructed using principal component analysis, applied to 21 individual computer tomography scans. To test the ability of the statistical model to generate realistic, unique, finite element (FE) femur models it was used as a source of 1000 femurs to drive a study on femoral neck fracture risk. The study simulated the impact of an oblique fall to the side, a scenario known to account for a large proportion of hip fractures in the elderly and have a lower fracture load than alternative loading approaches. FE model generation, application of subject specific loading and boundary conditions, FE processing and post processing of the solutions were completed automatically. The generated models were within the bounds of the training data used to create the statistical model with a high mesh quality, able to be used directly by the FE solver without remeshing. The results indicated that 28 of the 1000 femurs were at highest risk of fracture. Closer analysis revealed the percentage of cortical bone in the proximal femur to be a crucial differentiator between the failed and non-failed groups. The likely fracture location was indicated to be intertrochantic. Comparison to previous computational, clinical and experimental work revealed support for these findings.
Butaciu, Sinziana; Senila, Marin; Sarbu, Costel; Ponta, Michaela; Tanaselia, Claudiu; Cadar, Oana; Roman, Marius; Radu, Emil; Sima, Mihaela; Frentiu, Tiberiu
2017-04-01
The study proposes a combined model based on diagrams (Gibbs, Piper, Stuyfzand Hydrogeochemical Classification System) and unsupervised statistical approaches (Cluster Analysis, Principal Component Analysis, Fuzzy Principal Component Analysis, Fuzzy Hierarchical Cross-Clustering) to describe natural enrichment of inorganic arsenic and co-occurring species in groundwater in the Banat Plain, southwestern Romania. Speciation of inorganic As (arsenite, arsenate), ion concentrations (Na + , K + , Ca 2+ , Mg 2+ , HCO 3 - , Cl - , F - , SO 4 2- , PO 4 3- , NO 3 - ), pH, redox potential, conductivity and total dissolved substances were performed. Classical diagrams provided the hydrochemical characterization, while statistical approaches were helpful to establish (i) the mechanism of naturally occurring of As and F - species and the anthropogenic one for NO 3 - , SO 4 2- , PO 4 3- and K + and (ii) classification of groundwater based on content of arsenic species. The HCO 3 - type of local groundwater and alkaline pH (8.31-8.49) were found to be responsible for the enrichment of arsenic species and occurrence of F - but by different paths. The PO 4 3- -AsO 4 3- ion exchange, water-rock interaction (silicates hydrolysis and desorption from clay) were associated to arsenate enrichment in the oxidizing aquifer. Fuzzy Hierarchical Cross-Clustering was the strongest tool for the rapid simultaneous classification of groundwaters as a function of arsenic content and hydrogeochemical characteristics. The approach indicated the Na + -F - -pH cluster as marker for groundwater with naturally elevated As and highlighted which parameters need to be monitored. A chemical conceptual model illustrating the natural and anthropogenic paths and enrichment of As and co-occurring species in the local groundwater supported by mineralogical analysis of rocks was established. Copyright © 2016 Elsevier Ltd. All rights reserved.
Statistical techniques for modeling extreme price dynamics in the energy market
International Nuclear Information System (INIS)
Mbugua, L N; Mwita, P N
2013-01-01
Extreme events have large impact throughout the span of engineering, science and economics. This is because extreme events often lead to failure and losses due to the nature unobservable of extra ordinary occurrences. In this context this paper focuses on appropriate statistical methods relating to a combination of quantile regression approach and extreme value theory to model the excesses. This plays a vital role in risk management. Locally, nonparametric quantile regression is used, a method that is flexible and best suited when one knows little about the functional forms of the object being estimated. The conditions are derived in order to estimate the extreme value distribution function. The threshold model of extreme values is used to circumvent the lack of adequate observation problem at the tail of the distribution function. The application of a selection of these techniques is demonstrated on the volatile fuel market. The results indicate that the method used can extract maximum possible reliable information from the data. The key attraction of this method is that it offers a set of ready made approaches to the most difficult problem of risk modeling.
Models for probability and statistical inference theory and applications
Stapleton, James H
2007-01-01
This concise, yet thorough, book is enhanced with simulations and graphs to build the intuition of readersModels for Probability and Statistical Inference was written over a five-year period and serves as a comprehensive treatment of the fundamentals of probability and statistical inference. With detailed theoretical coverage found throughout the book, readers acquire the fundamentals needed to advance to more specialized topics, such as sampling, linear models, design of experiments, statistical computing, survival analysis, and bootstrapping.Ideal as a textbook for a two-semester sequence on probability and statistical inference, early chapters provide coverage on probability and include discussions of: discrete models and random variables; discrete distributions including binomial, hypergeometric, geometric, and Poisson; continuous, normal, gamma, and conditional distributions; and limit theory. Since limit theory is usually the most difficult topic for readers to master, the author thoroughly discusses mo...
Fluctuations and correlations in statistical models of hadron production
International Nuclear Information System (INIS)
Gorenstein, M. I.
2012-01-01
An extension of the standard concept of the statistical ensembles is suggested. Namely, the statistical ensembles with extensive quantities fluctuating according to an externally given distribution are introduced. Applications in the statistical models of multiple hadron production in high energy physics are discussed.
Sileshi, G
2006-10-01
Researchers and regulatory agencies often make statistical inferences from insect count data using modelling approaches that assume homogeneous variance. Such models do not allow for formal appraisal of variability which in its different forms is the subject of interest in ecology. Therefore, the objectives of this paper were to (i) compare models suitable for handling variance heterogeneity and (ii) select optimal models to ensure valid statistical inferences from insect count data. The log-normal, standard Poisson, Poisson corrected for overdispersion, zero-inflated Poisson, the negative binomial distribution and zero-inflated negative binomial models were compared using six count datasets on foliage-dwelling insects and five families of soil-dwelling insects. Akaike's and Schwarz Bayesian information criteria were used for comparing the various models. Over 50% of the counts were zeros even in locally abundant species such as Ootheca bennigseni Weise, Mesoplatys ochroptera Stål and Diaecoderus spp. The Poisson model after correction for overdispersion and the standard negative binomial distribution model provided better description of the probability distribution of seven out of the 11 insects than the log-normal, standard Poisson, zero-inflated Poisson or zero-inflated negative binomial models. It is concluded that excess zeros and variance heterogeneity are common data phenomena in insect counts. If not properly modelled, these properties can invalidate the normal distribution assumptions resulting in biased estimation of ecological effects and jeopardizing the integrity of the scientific inferences. Therefore, it is recommended that statistical models appropriate for handling these data properties be selected using objective criteria to ensure efficient statistical inference.
A statistical approach to evaluate hydrocarbon remediation in the unsaturated zone
International Nuclear Information System (INIS)
Hajali, P.; Marshall, T.; Overman, S.
1991-01-01
This paper presents an evaluation of performance and cleanup effectiveness of a vapor extraction system (VES) in extracting chlorinated hydrocarbons and petroleum-based hydrocarbons (mineral spirits) from the unsaturated zone. The statistical analysis of soil concentration data to evaluate the VES remediation success is described. The site is a former electronics refurbishing facility in southern California; soil contamination from organic solvents was found mainly in five areas (Area A through E) beneath two buildings. The evaluation begins with a brief description of the site background, discusses the statistical approach, and presents conclusions
Growth curve models and statistical diagnostics
Pan, Jian-Xin
2002-01-01
Growth-curve models are generalized multivariate analysis-of-variance models. These models are especially useful for investigating growth problems on short times in economics, biology, medical research, and epidemiology. This book systematically introduces the theory of the GCM with particular emphasis on their multivariate statistical diagnostics, which are based mainly on recent developments made by the authors and their collaborators. The authors provide complete proofs of theorems as well as practical data sets and MATLAB code.
Magnetic moments of J{sup P} = (3)/(2){sup +} decuplet baryons using the statistical model
Energy Technology Data Exchange (ETDEWEB)
Kaur, Amanpreet; Upadhyay, Alka [Thapar University, School of Physics and Materials Science, Patiala (India)
2016-04-15
A suitable wave function for the baryon decuplet is framed with the inclusion of the sea containing quark-gluon Fock states. Relevant operator formalism is applied to calculate the magnetic moments of J{sup P} = (3)/(2){sup +} baryon decuplet. The statistical model assumes the decomposition of the baryonic state in various quark-gluon Fock states and is used in combination with the detailed balance principle to find the relative probabilities of these Fock states in flavor, spin and color space. The upper limit to the gluon is restricted to three with the possibility of emission of quark-antiquark pairs. We study the importance of strangeness in the sea (scalar, vector and tensor) and its contribution to the magnetic moments. Our approach has confirmed the scalar-tensor sea dominancy over the vector sea. Various modifications in the model are used to check the validity of the statistical approach. The results are matched with the available theoretical data. A good consistency with the experimental data has been achieved for Δ{sup ++}, Δ{sup +} and Ω{sup -}. (orig.)
Milic, Natasa M.; Trajkovic, Goran Z.; Bukumiric, Zoran M.; Cirkovic, Andja; Nikolic, Ivan M.; Milin, Jelena S.; Milic, Nikola V.; Savic, Marko D.; Corac, Aleksandar M.; Marinkovic, Jelena M.; Stanisavljevic, Dejana M.
2016-01-01
Background Although recent studies report on the benefits of blended learning in improving medical student education, there is still no empirical evidence on the relative effectiveness of blended over traditional learning approaches in medical statistics. We implemented blended along with on-site (i.e. face-to-face) learning to further assess the potential value of web-based learning in medical statistics. Methods This was a prospective study conducted with third year medical undergraduate students attending the Faculty of Medicine, University of Belgrade, who passed (440 of 545) the final exam of the obligatory introductory statistics course during 2013–14. Student statistics achievements were stratified based on the two methods of education delivery: blended learning and on-site learning. Blended learning included a combination of face-to-face and distance learning methodologies integrated into a single course. Results Mean exam scores for the blended learning student group were higher than for the on-site student group for both final statistics score (89.36±6.60 vs. 86.06±8.48; p = 0.001) and knowledge test score (7.88±1.30 vs. 7.51±1.36; p = 0.023) with a medium effect size. There were no differences in sex or study duration between the groups. Current grade point average (GPA) was higher in the blended group. In a multivariable regression model, current GPA and knowledge test scores were associated with the final statistics score after adjusting for study duration and learning modality (plearning environments for teaching medical statistics to undergraduate medical students. Blended and on-site training formats led to similar knowledge acquisition; however, students with higher GPA preferred the technology assisted learning format. Implementation of blended learning approaches can be considered an attractive, cost-effective, and efficient alternative to traditional classroom training in medical statistics. PMID:26859832
New applications of statistical tools in plant pathology.
Garrett, K A; Madden, L V; Hughes, G; Pfender, W F
2004-09-01
ABSTRACT The series of papers introduced by this one address a range of statistical applications in plant pathology, including survival analysis, nonparametric analysis of disease associations, multivariate analyses, neural networks, meta-analysis, and Bayesian statistics. Here we present an overview of additional applications of statistics in plant pathology. An analysis of variance based on the assumption of normally distributed responses with equal variances has been a standard approach in biology for decades. Advances in statistical theory and computation now make it convenient to appropriately deal with discrete responses using generalized linear models, with adjustments for overdispersion as needed. New nonparametric approaches are available for analysis of ordinal data such as disease ratings. Many experiments require the use of models with fixed and random effects for data analysis. New or expanded computing packages, such as SAS PROC MIXED, coupled with extensive advances in statistical theory, allow for appropriate analyses of normally distributed data using linear mixed models, and discrete data with generalized linear mixed models. Decision theory offers a framework in plant pathology for contexts such as the decision about whether to apply or withhold a treatment. Model selection can be performed using Akaike's information criterion. Plant pathologists studying pathogens at the population level have traditionally been the main consumers of statistical approaches in plant pathology, but new technologies such as microarrays supply estimates of gene expression for thousands of genes simultaneously and present challenges for statistical analysis. Applications to the study of the landscape of the field and of the genome share the risk of pseudoreplication, the problem of determining the appropriate scale of the experimental unit and of obtaining sufficient replication at that scale.
A Bifactor Approach to Model Multifaceted Constructs in Statistical Mediation Analysis
Gonzalez, Oscar; MacKinnon, David P.
2018-01-01
Statistical mediation analysis allows researchers to identify the most important mediating constructs in the causal process studied. Identifying specific mediators is especially relevant when the hypothesized mediating construct consists of multiple related facets. The general definition of the construct and its facets might relate differently to…
International Nuclear Information System (INIS)
Zhang, Jinzhao; Segurado, Jacobo; Schneidesch, Christophe
2013-01-01
Since 1980's, Tractebel Engineering (TE) has being developed and applied a multi-physical modelling and safety analyses capability, based on a code package consisting of the best estimate 3D neutronic (PANTHER), system thermal hydraulic (RELAP5), core sub-channel thermal hydraulic (COBRA-3C), and fuel thermal mechanic (FRAPCON/FRAPTRAN) codes. A series of methodologies have been developed to perform and to license the reactor safety analysis and core reload design, based on the deterministic bounding approach. Following the recent trends in research and development as well as in industrial applications, TE has been working since 2010 towards the application of the statistical sensitivity and uncertainty analysis methods to the multi-physical modelling and licensing safety analyses. In this paper, the TE multi-physical modelling and safety analyses capability is first described, followed by the proposed TE best estimate plus statistical uncertainty analysis method (BESUAM). The chosen statistical sensitivity and uncertainty analysis methods (non-parametric order statistic method or bootstrap) and tool (DAKOTA) are then presented, followed by some preliminary results of their applications to FRAPCON/FRAPTRAN simulation of OECD RIA fuel rod codes benchmark and RELAP5/MOD3.3 simulation of THTF tests. (authors)
Statistical modelling of survival data with random effects h-likelihood approach
Ha, Il Do; Lee, Youngjo
2017-01-01
This book provides a groundbreaking introduction to the likelihood inference for correlated survival data via the hierarchical (or h-) likelihood in order to obtain the (marginal) likelihood and to address the computational difficulties in inferences and extensions. The approach presented in the book overcomes shortcomings in the traditional likelihood-based methods for clustered survival data such as intractable integration. The text includes technical materials such as derivations and proofs in each chapter, as well as recently developed software programs in R (“frailtyHL”), while the real-world data examples together with an R package, “frailtyHL” in CRAN, provide readers with useful hands-on tools. Reviewing new developments since the introduction of the h-likelihood to survival analysis (methods for interval estimation of the individual frailty and for variable selection of the fixed effects in the general class of frailty models) and guiding future directions, the book is of interest to research...
Improving statistical reasoning theoretical models and practical implications
Sedlmeier, Peter
1999-01-01
This book focuses on how statistical reasoning works and on training programs that can exploit people''s natural cognitive capabilities to improve their statistical reasoning. Training programs that take into account findings from evolutionary psychology and instructional theory are shown to have substantially larger effects that are more stable over time than previous training regimens. The theoretical implications are traced in a neural network model of human performance on statistical reasoning problems. This book apppeals to judgment and decision making researchers and other cognitive scientists, as well as to teachers of statistics and probabilistic reasoning.
A statistical state dynamics approach to wall turbulence.
Farrell, B F; Gayme, D F; Ioannou, P J
2017-03-13
This paper reviews results obtained using statistical state dynamics (SSD) that demonstrate the benefits of adopting this perspective for understanding turbulence in wall-bounded shear flows. The SSD approach used in this work employs a second-order closure that retains only the interaction between the streamwise mean flow and the streamwise mean perturbation covariance. This closure restricts nonlinearity in the SSD to that explicitly retained in the streamwise constant mean flow together with nonlinear interactions between the mean flow and the perturbation covariance. This dynamical restriction, in which explicit perturbation-perturbation nonlinearity is removed from the perturbation equation, results in a simplified dynamics referred to as the restricted nonlinear (RNL) dynamics. RNL systems, in which a finite ensemble of realizations of the perturbation equation share the same mean flow, provide tractable approximations to the SSD, which is equivalent to an infinite ensemble RNL system. This infinite ensemble system, referred to as the stochastic structural stability theory system, introduces new analysis tools for studying turbulence. RNL systems provide computationally efficient means to approximate the SSD and produce self-sustaining turbulence exhibiting qualitative features similar to those observed in direct numerical simulations despite greatly simplified dynamics. The results presented show that RNL turbulence can be supported by as few as a single streamwise varying component interacting with the streamwise constant mean flow and that judicious selection of this truncated support or 'band-limiting' can be used to improve quantitative accuracy of RNL turbulence. These results suggest that the SSD approach provides new analytical and computational tools that allow new insights into wall turbulence.This article is part of the themed issue 'Toward the development of high-fidelity models of wall turbulence at large Reynolds number'. © 2017 The Author(s).
The statistics of multi-step direct reactions
International Nuclear Information System (INIS)
Koning, A.J.; Akkermans, J.M.
1991-01-01
We propose a quantum-statistical framework that provides an integrated perspective on the differences and similarities between the many current models for multi-step direct reactions in the continuum. It is argued that to obtain a statistical theory two physically different approaches are conceivable to postulate randomness, respectively called leading-particle statistics and residual-system statistics. We present a new leading-particle statistics theory for multi-step direct reactions. It is shown that the model of Feshbach et al. can be derived as a simplification of this theory and thus can be founded solely upon leading-particle statistics. The models developed by Tamura et al. and Nishioka et al. are based upon residual-system statistics and hence fall into a physically different class of multi-step direct theories, although the resulting cross-section formulae for the important first step are shown to be the same. The widely used semi-classical models such as the generalized exciton model can be interpreted as further phenomenological simplifications of the leading-particle statistics theory. A more comprehensive exposition will appear before long. (author). 32 refs, 4 figs
Zammit-Mangion, Andrew; Stavert, Ann; Rigby, Matthew; Ganesan, Anita; Rayner, Peter; Cressie, Noel
2017-04-01
The Orbiting Carbon Observatory-2 (OCO-2) satellite was launched on 2 July 2014, and it has been a source of atmospheric CO2 data since September 2014. The OCO-2 dataset contains a number of variables, but the one of most interest for flux inversion has been the column-averaged dry-air mole fraction (in units of ppm). These global level-2 data offer the possibility of inferring CO2 fluxes at Earth's surface and tracking those fluxes over time. However, as well as having a component of random error, the OCO-2 data have a component of systematic error that is dependent on the instrument's mode, namely land nadir, land glint, and ocean glint. Our statistical approach to CO2-flux inversion starts with constructing a statistical model for the random and systematic errors with parameters that can be estimated from the OCO-2 data and possibly in situ sources from flasks, towers, and the Total Column Carbon Observing Network (TCCON). Dimension reduction of the flux field is achieved through the use of physical basis functions, while temporal evolution of the flux is captured by modelling the basis-function coefficients as a vector autoregressive process. For computational efficiency, flux inversion uses only three months of sensitivities of mole fraction to changes in flux, computed using MOZART; any residual variation is captured through the modelling of a stochastic process that varies smoothly as a function of latitude. The second stage of our statistical approach is to simulate from the posterior distribution of the basis-function coefficients and all unknown parameters given the data using a fully Bayesian Markov chain Monte Carlo (MCMC) algorithm. Estimates and posterior variances of the flux field can then be obtained straightforwardly from this distribution. Our statistical approach is different than others, as it simultaneously makes inference (and quantifies uncertainty) on both the error components' parameters and the CO2 fluxes. We compare it to more classical
Solar radiation data - statistical analysis and simulation models
Energy Technology Data Exchange (ETDEWEB)
Mustacchi, C; Cena, V; Rocchi, M; Haghigat, F
1984-01-01
The activities consisted in collecting meteorological data on magnetic tape for ten european locations (with latitudes ranging from 42/sup 0/ to 56/sup 0/ N), analysing the multi-year sequences, developing mathematical models to generate synthetic sequences having the same statistical properties of the original data sets, and producing one or more Short Reference Years (SRY's) for each location. The meteorological parameters examinated were (for all the locations) global + diffuse radiation on horizontal surface, dry bulb temperature, sunshine duration. For some of the locations additional parameters were available, namely, global, beam and diffuse radiation on surfaces other than horizontal, wet bulb temperature, wind velocity, cloud type, cloud cover. The statistical properties investigated were mean, variance, autocorrelation, crosscorrelation with selected parameters, probability density function. For all the meteorological parameters, various mathematical models were built: linear regression, stochastic models of the AR and the DAR type. In each case, the model with the best statistical behaviour was selected for the production of a SRY for the relevant parameter/location.
A DoS/DDoS Attack Detection System Using Chi-Square Statistic Approach
Directory of Open Access Journals (Sweden)
Fang-Yie Leu
2010-04-01
Full Text Available Nowadays, users can easily access and download network attack tools, which often provide friendly interfaces and easily operated features, from the Internet. Therefore, even a naive hacker can also launch a large scale DoS or DDoS attack to prevent a system, i.e., the victim, from providing Internet services. In this paper, we propose an agent based intrusion detection architecture, which is a distributed detection system, to detect DoS/DDoS attacks by invoking a statistic approach that compares source IP addresses' normal and current packet statistics to discriminate whether there is a DoS/DDoS attack. It first collects all resource IPs' packet statistics so as to create their normal packet distribution. Once some IPs' current packet distribution suddenly changes, very often it is an attack. Experimental results show that this approach can effectively detect DoS/DDoS attacks.
A hybrid finite element - statistical energy analysis approach to robust sound transmission modeling
Reynders, Edwin; Langley, Robin S.; Dijckmans, Arne; Vermeir, Gerrit
2014-09-01
When considering the sound transmission through a wall in between two rooms, in an important part of the audio frequency range, the local response of the rooms is highly sensitive to uncertainty in spatial variations in geometry, material properties and boundary conditions, which have a wave scattering effect, while the local response of the wall is rather insensitive to such uncertainty. For this mid-frequency range, a computationally efficient modeling strategy is adopted that accounts for this uncertainty. The partitioning wall is modeled deterministically, e.g. with finite elements. The rooms are modeled in a very efficient, nonparametric stochastic way, as in statistical energy analysis. All components are coupled by means of a rigorous power balance. This hybrid strategy is extended so that the mean and variance of the sound transmission loss can be computed as well as the transition frequency that loosely marks the boundary between low- and high-frequency behavior of a vibro-acoustic component. The method is first validated in a simulation study, and then applied for predicting the airborne sound insulation of a series of partition walls of increasing complexity: a thin plastic plate, a wall consisting of gypsum blocks, a thicker masonry wall and a double glazing. It is found that the uncertainty caused by random scattering is important except at very high frequencies, where the modal overlap of the rooms is very high. The results are compared with laboratory measurements, and both are found to agree within the prediction uncertainty in the considered frequency range.
Sadyś, Magdalena; Skjøth, Carsten Ambelas; Kennedy, Roy
2016-04-01
High concentration levels of Ganoderma spp. spores were observed in Worcester, UK, during 2006-2010. These basidiospores are known to cause sensitization due to the allergen content and their small dimensions. This enables them to penetrate the lower part of the respiratory tract in humans. Establishment of a link between occurring symptoms of sensitization to Ganoderma spp. and other basidiospores is challenging due to lack of information regarding spore concentration in the air. Hence, aerobiological monitoring should be conducted, and if possible extended with the construction of forecast models. Daily mean concentration of allergenic Ganoderma spp. spores in the atmosphere of Worcester was measured using 7-day volumetric spore sampler through five consecutive years. The relationships between the presence of spores in the air and the weather parameters were examined. Forecast models were constructed for Ganoderma spp. spores using advanced statistical techniques, i.e. multivariate regression trees and artificial neural networks. Dew point temperature along with maximum temperature was the most important factor influencing the presence of spores in the air of Worcester. Based on these two major factors and several others of lesser importance, thresholds for certain levels of fungal spore concentration, i.e. low (0-49 s m-3), moderate (50-99 s m-3), high (100-149 s m-3) and very high (150 < n s m-3), could be designated. Despite some deviation in results obtained by artificial neural networks, authors have achieved a forecasting model, which was accurate (correlation between observed and predicted values varied from r s = 0.57 to r s = 0.68).
Simulating metabolism with statistical thermodynamics.
Cannon, William R
2014-01-01
New methods are needed for large scale modeling of metabolism that predict metabolite levels and characterize the thermodynamics of individual reactions and pathways. Current approaches use either kinetic simulations, which are difficult to extend to large networks of reactions because of the need for rate constants, or flux-based methods, which have a large number of feasible solutions because they are unconstrained by the law of mass action. This report presents an alternative modeling approach based on statistical thermodynamics. The principles of this approach are demonstrated using a simple set of coupled reactions, and then the system is characterized with respect to the changes in energy, entropy, free energy, and entropy production. Finally, the physical and biochemical insights that this approach can provide for metabolism are demonstrated by application to the tricarboxylic acid (TCA) cycle of Escherichia coli. The reaction and pathway thermodynamics are evaluated and predictions are made regarding changes in concentration of TCA cycle intermediates due to 10- and 100-fold changes in the ratio of NAD+:NADH concentrations. Finally, the assumptions and caveats regarding the use of statistical thermodynamics to model non-equilibrium reactions are discussed.
De Backer, A; Martinez, G T; Rosenauer, A; Van Aert, S
2013-11-01
In the present paper, a statistical model-based method to count the number of atoms of monotype crystalline nanostructures from high resolution high-angle annular dark-field (HAADF) scanning transmission electron microscopy (STEM) images is discussed in detail together with a thorough study on the possibilities and inherent limitations. In order to count the number of atoms, it is assumed that the total scattered intensity scales with the number of atoms per atom column. These intensities are quantitatively determined using model-based statistical parameter estimation theory. The distribution describing the probability that intensity values are generated by atomic columns containing a specific number of atoms is inferred on the basis of the experimental scattered intensities. Finally, the number of atoms per atom column is quantified using this estimated probability distribution. The number of atom columns available in the observed STEM image, the number of components in the estimated probability distribution, the width of the components of the probability distribution, and the typical shape of a criterion to assess the number of components in the probability distribution directly affect the accuracy and precision with which the number of atoms in a particular atom column can be estimated. It is shown that single atom sensitivity is feasible taking the latter aspects into consideration. © 2013 Elsevier B.V. All rights reserved.
Statistical inference for template aging
Schuckers, Michael E.
2006-04-01
A change in classification error rates for a biometric device is often referred to as template aging. Here we offer two methods for determining whether the effect of time is statistically significant. The first of these is the use of a generalized linear model to determine if these error rates change linearly over time. This approach generalizes previous work assessing the impact of covariates using generalized linear models. The second approach uses of likelihood ratio tests methodology. The focus here is on statistical methods for estimation not the underlying cause of the change in error rates over time. These methodologies are applied to data from the National Institutes of Standards and Technology Biometric Score Set Release 1. The results of these applications are discussed.
Siegert, Stefan
2017-04-01
Initialised climate forecasts on seasonal time scales, run several months or even years ahead, are now an integral part of the battery of products offered by climate services world-wide. The availability of seasonal climate forecasts from various modeling centres gives rise to multi-model ensemble forecasts. Post-processing such seasonal-to-decadal multi-model forecasts is challenging 1) because the cross-correlation structure between multiple models and observations can be complicated, 2) because the amount of training data to fit the post-processing parameters is very limited, and 3) because the forecast skill of numerical models tends to be low on seasonal time scales. In this talk I will review new statistical post-processing frameworks for multi-model ensembles. I will focus particularly on Bayesian hierarchical modelling approaches, which are flexible enough to capture commonly made assumptions about collective and model-specific biases of multi-model ensembles. Despite the advances in statistical methodology, it turns out to be very difficult to out-perform the simplest post-processing method, which just recalibrates the multi-model ensemble mean by linear regression. I will discuss reasons for this, which are closely linked to the specific characteristics of seasonal multi-model forecasts. I explore possible directions for improvements, for example using informative priors on the post-processing parameters, and jointly modelling forecasts and observations.
Statistical models based on conditional probability distributions
International Nuclear Information System (INIS)
Narayanan, R.S.
1991-10-01
We present a formulation of statistical mechanics models based on conditional probability distribution rather than a Hamiltonian. We show that it is possible to realize critical phenomena through this procedure. Closely linked with this formulation is a Monte Carlo algorithm, in which a configuration generated is guaranteed to be statistically independent from any other configuration for all values of the parameters, in particular near the critical point. (orig.)
A statistical model for mapping morphological shape
Directory of Open Access Journals (Sweden)
Li Jiahan
2010-07-01
Full Text Available Abstract Background Living things come in all shapes and sizes, from bacteria, plants, and animals to humans. Knowledge about the genetic mechanisms for biological shape has far-reaching implications for a range spectrum of scientific disciplines including anthropology, agriculture, developmental biology, evolution and biomedicine. Results We derived a statistical model for mapping specific genes or quantitative trait loci (QTLs that control morphological shape. The model was formulated within the mixture framework, in which different types of shape are thought to result from genotypic discrepancies at a QTL. The EM algorithm was implemented to estimate QTL genotype-specific shapes based on a shape correspondence analysis. Computer simulation was used to investigate the statistical property of the model. Conclusion By identifying specific QTLs for morphological shape, the model developed will help to ask, disseminate and address many major integrative biological and genetic questions and challenges in the genetic control of biological shape and function.
Zielke, Olaf; McDougall, Damon; Mai, Martin; Babuska, Ivo
2014-05-01
Seismic, often augmented with geodetic data, are frequently used to invert for the spatio-temporal evolution of slip along a rupture plane. The resulting images of the slip evolution for a single event, inferred by different research teams, often vary distinctly, depending on the adopted inversion approach and rupture model parameterization. This observation raises the question, which of the provided kinematic source inversion solutions is most reliable and most robust, and — more generally — how accurate are fault parameterization and solution predictions? These issues are not included in "standard" source inversion approaches. Here, we present a statistical inversion approach to constrain kinematic rupture parameters from teleseismic body waves. The approach is based a) on a forward-modeling scheme that computes synthetic (body-)waves for a given kinematic rupture model, and b) on the QUESO (Quantification of Uncertainty for Estimation, Simulation, and Optimization) library that uses MCMC algorithms and Bayes theorem for sample selection. We present Bayesian inversions for rupture parameters in synthetic earthquakes (i.e. for which the exact rupture history is known) in an attempt to identify the cross-over at which further model discretization (spatial and temporal resolution of the parameter space) is no longer attributed to a decreasing misfit. Identification of this cross-over is of importance as it reveals the resolution power of the studied data set (i.e. teleseismic body waves), enabling one to constrain kinematic earthquake rupture histories of real earthquakes at a resolution that is supported by data. In addition, the Bayesian approach allows for mapping complete posterior probability density functions of the desired kinematic source parameters, thus enabling us to rigorously assess the uncertainties in earthquake source inversions.
Mathematical and statistical modeling for emerging and re-emerging infectious diseases
Hyman, James
2016-01-01
The contributions by epidemic modeling experts describe how mathematical models and statistical forecasting are created to capture the most important aspects of an emerging epidemic.Readers will discover a broad range of approaches to address questions, such as Can we control Ebola via ring vaccination strategies? How quickly should we detect Ebola cases to ensure epidemic control? What is the likelihood that an Ebola epidemic in West Africa leads to secondary outbreaks in other parts of the world? When does it matter to incorporate the role of disease-induced mortality on epidemic models? What is the role of behavior changes on Ebola dynamics? How can we better understand the control of cholera or Ebola using optimal control theory? How should a population be structured in order to mimic the transmission dynamics of diseases such as chlamydia, Ebola, or cholera? How can we objectively determine the end of an epidemic? How can we use metapopulation models to understand the role of movement restrictions and mi...
Statistical shape and appearance models of bones.
Sarkalkan, Nazli; Weinans, Harrie; Zadpoor, Amir A
2014-03-01
When applied to bones, statistical shape models (SSM) and statistical appearance models (SAM) respectively describe the mean shape and mean density distribution of bones within a certain population as well as the main modes of variations of shape and density distribution from their mean values. The availability of this quantitative information regarding the detailed anatomy of bones provides new opportunities for diagnosis, evaluation, and treatment of skeletal diseases. The potential of SSM and SAM has been recently recognized within the bone research community. For example, these models have been applied for studying the effects of bone shape on the etiology of osteoarthritis, improving the accuracy of clinical osteoporotic fracture prediction techniques, design of orthopedic implants, and surgery planning. This paper reviews the main concepts, methods, and applications of SSM and SAM as applied to bone. Copyright © 2013 Elsevier Inc. All rights reserved.
Statistical models and NMR analysis of polymer microstructure
Statistical models can be used in conjunction with NMR spectroscopy to study polymer microstructure and polymerization mechanisms. Thus, Bernoullian, Markovian, and enantiomorphic-site models are well known. Many additional models have been formulated over the years for additional situations. Typica...
Workshop on Model Uncertainty and its Statistical Implications
1988-01-01
In this book problems related to the choice of models in such diverse fields as regression, covariance structure, time series analysis and multinomial experiments are discussed. The emphasis is on the statistical implications for model assessment when the assessment is done with the same data that generated the model. This is a problem of long standing, notorious for its difficulty. Some contributors discuss this problem in an illuminating way. Others, and this is a truly novel feature, investigate systematically whether sample re-use methods like the bootstrap can be used to assess the quality of estimators or predictors in a reliable way given the initial model uncertainty. The book should prove to be valuable for advanced practitioners and statistical methodologists alike.
From inverse problems to learning: a Statistical Mechanics approach
Baldassi, Carlo; Gerace, Federica; Saglietti, Luca; Zecchina, Riccardo
2018-01-01
We present a brief introduction to the statistical mechanics approaches for the study of inverse problems in data science. We then provide concrete new results on inferring couplings from sampled configurations in systems characterized by an extensive number of stable attractors in the low temperature regime. We also show how these result are connected to the problem of learning with realistic weak signals in computational neuroscience. Our techniques and algorithms rely on advanced mean-field methods developed in the context of disordered systems.
Milic, Natasa M; Trajkovic, Goran Z; Bukumiric, Zoran M; Cirkovic, Andja; Nikolic, Ivan M; Milin, Jelena S; Milic, Nikola V; Savic, Marko D; Corac, Aleksandar M; Marinkovic, Jelena M; Stanisavljevic, Dejana M
2016-01-01
Although recent studies report on the benefits of blended learning in improving medical student education, there is still no empirical evidence on the relative effectiveness of blended over traditional learning approaches in medical statistics. We implemented blended along with on-site (i.e. face-to-face) learning to further assess the potential value of web-based learning in medical statistics. This was a prospective study conducted with third year medical undergraduate students attending the Faculty of Medicine, University of Belgrade, who passed (440 of 545) the final exam of the obligatory introductory statistics course during 2013-14. Student statistics achievements were stratified based on the two methods of education delivery: blended learning and on-site learning. Blended learning included a combination of face-to-face and distance learning methodologies integrated into a single course. Mean exam scores for the blended learning student group were higher than for the on-site student group for both final statistics score (89.36±6.60 vs. 86.06±8.48; p = 0.001) and knowledge test score (7.88±1.30 vs. 7.51±1.36; p = 0.023) with a medium effect size. There were no differences in sex or study duration between the groups. Current grade point average (GPA) was higher in the blended group. In a multivariable regression model, current GPA and knowledge test scores were associated with the final statistics score after adjusting for study duration and learning modality (pstatistics to undergraduate medical students. Blended and on-site training formats led to similar knowledge acquisition; however, students with higher GPA preferred the technology assisted learning format. Implementation of blended learning approaches can be considered an attractive, cost-effective, and efficient alternative to traditional classroom training in medical statistics.
Statistical inference based on divergence measures
Pardo, Leandro
2005-01-01
The idea of using functionals of Information Theory, such as entropies or divergences, in statistical inference is not new. However, in spite of the fact that divergence statistics have become a very good alternative to the classical likelihood ratio test and the Pearson-type statistic in discrete models, many statisticians remain unaware of this powerful approach.Statistical Inference Based on Divergence Measures explores classical problems of statistical inference, such as estimation and hypothesis testing, on the basis of measures of entropy and divergence. The first two chapters form an overview, from a statistical perspective, of the most important measures of entropy and divergence and study their properties. The author then examines the statistical analysis of discrete multivariate data with emphasis is on problems in contingency tables and loglinear models using phi-divergence test statistics as well as minimum phi-divergence estimators. The final chapter looks at testing in general populations, prese...
Cimler, Richard; Tomaskova, Hana; Kuhnova, Jitka; Dolezal, Ondrej; Pscheidl, Pavel; Kuca, Kamil
2018-02-01
Alzheimer's disease is one of the most common mental illnesses. It is posited that more than 25 % of the population is affected by some mental disease during their lifetime. Treatment of each patient draws resources from the economy concerned. Therefore, it is important to quantify the potential economic impact. Agent-based, system dynamics and numerical approaches to dynamic modeling of the population of the European Union and its patients with Alzheimer's disease are presented in this article. Simulations, their characteristics, and the results from different modeling tools are compared. The results of these approaches are compared with EU population growth predictions from the statistical office of the EU by Eurostat. The methodology of a creation of the models is described and all three modeling approaches are compared. The suitability of each modeling approach for the population modeling is discussed. In this case study, all three approaches gave us the results corresponding with the EU population prediction. Moreover, we were able to predict the number of patients with AD and, based on the modeling method, we were also able to monitor different characteristics of the population. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Kolmogorov complexity, pseudorandom generators and statistical models testing
Czech Academy of Sciences Publication Activity Database
Šindelář, Jan; Boček, Pavel
2002-01-01
Roč. 38, č. 6 (2002), s. 747-759 ISSN 0023-5954 R&D Projects: GA ČR GA102/99/1564 Institutional research plan: CEZ:AV0Z1075907 Keywords : Kolmogorov complexity * pseudorandom generators * statistical models testing Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.341, year: 2002
Eeuwijk, van F.A.
1996-01-01
In plant breeding it is a common observation to see genotypes react differently to environmental changes. This phenomenon is called genotype by environment interaction. Many statistical approaches for analysing genotype by environment interaction rely heavily on the analysis of variance model.
Bayesian Multi-Energy Computed Tomography reconstruction approaches based on decomposition models
International Nuclear Information System (INIS)
Cai, Caifang
2013-01-01
Multi-Energy Computed Tomography (MECT) makes it possible to get multiple fractions of basis materials without segmentation. In medical application, one is the soft-tissue equivalent water fraction and the other is the hard-matter equivalent bone fraction. Practical MECT measurements are usually obtained with polychromatic X-ray beams. Existing reconstruction approaches based on linear forward models without counting the beam poly-chromaticity fail to estimate the correct decomposition fractions and result in Beam-Hardening Artifacts (BHA). The existing BHA correction approaches either need to refer to calibration measurements or suffer from the noise amplification caused by the negative-log pre-processing and the water and bone separation problem. To overcome these problems, statistical DECT reconstruction approaches based on non-linear forward models counting the beam poly-chromaticity show great potential for giving accurate fraction images.This work proposes a full-spectral Bayesian reconstruction approach which allows the reconstruction of high quality fraction images from ordinary polychromatic measurements. This approach is based on a Gaussian noise model with unknown variance assigned directly to the projections without taking negative-log. Referring to Bayesian inferences, the decomposition fractions and observation variance are estimated by using the joint Maximum A Posteriori (MAP) estimation method. Subject to an adaptive prior model assigned to the variance, the joint estimation problem is then simplified into a single estimation problem. It transforms the joint MAP estimation problem into a minimization problem with a non-quadratic cost function. To solve it, the use of a monotone Conjugate Gradient (CG) algorithm with suboptimal descent steps is proposed.The performances of the proposed approach are analyzed with both simulated and experimental data. The results show that the proposed Bayesian approach is robust to noise and materials. It is also
Villamizar-Mejia, Rodolfo; Mujica-Delgado, Luis-Eduardo; Ruiz-Ordóñez, Magda-Liliana; Camacho-Navarro, Jhonatan; Moreno-Beltrán, Gustavo
2017-05-01
In previous works, damage detection of metallic specimens exposed to temperature changes has been achieved by using a statistical baseline model based on Principal Component Analysis (PCA), piezodiagnostics principle and taking into account temperature effect by augmenting the baseline model or by using several baseline models according to the current temperature. In this paper a new approach is presented, where damage detection is based in a new index that combine Q and T2 statistical indices with current temperature measurements. Experimental tests were achieved in a carbon-steel pipe of 1m length and 1.5 inches diameter, instrumented with piezodevices acting as actuators or sensors. A PCA baseline model was obtained to a temperature of 21º and then T2 and Q statistical indices were obtained for a 24h temperature profile. Also, mass adding at different points of pipe between sensor and actuator was used as damage. By using the combined index the temperature contribution can be separated and a better differentiation of damages respect to undamaged cases can be graphically obtained.
Visualization of the variability of 3D statistical shape models by animation.
Lamecker, Hans; Seebass, Martin; Lange, Thomas; Hege, Hans-Christian; Deuflhard, Peter
2004-01-01
Models of the 3D shape of anatomical objects and the knowledge about their statistical variability are of great benefit in many computer assisted medical applications like images analysis, therapy or surgery planning. Statistical model of shapes have successfully been applied to automate the task of image segmentation. The generation of 3D statistical shape models requires the identification of corresponding points on two shapes. This remains a difficult problem, especially for shapes of complicated topology. In order to interpret and validate variations encoded in a statistical shape model, visual inspection is of great importance. This work describes the generation and interpretation of statistical shape models of the liver and the pelvic bone.
In silico model-based inference: a contemporary approach for hypothesis testing in network biology.
Klinke, David J
2014-01-01
Inductive inference plays a central role in the study of biological systems where one aims to increase their understanding of the system by reasoning backwards from uncertain observations to identify causal relationships among components of the system. These causal relationships are postulated from prior knowledge as a hypothesis or simply a model. Experiments are designed to test the model. Inferential statistics are used to establish a level of confidence in how well our postulated model explains the acquired data. This iterative process, commonly referred to as the scientific method, either improves our confidence in a model or suggests that we revisit our prior knowledge to develop a new model. Advances in technology impact how we use prior knowledge and data to formulate models of biological networks and how we observe cellular behavior. However, the approach for model-based inference has remained largely unchanged since Fisher, Neyman and Pearson developed the ideas in the early 1900s that gave rise to what is now known as classical statistical hypothesis (model) testing. Here, I will summarize conventional methods for model-based inference and suggest a contemporary approach to aid in our quest to discover how cells dynamically interpret and transmit information for therapeutic aims that integrates ideas drawn from high performance computing, Bayesian statistics, and chemical kinetics. © 2014 American Institute of Chemical Engineers.
The use of statistical models in heavy-ion reactions studies
International Nuclear Information System (INIS)
Stokstad, R.G.
1984-01-01
This chapter reviews the use of statistical models to describe nuclear level densities and the decay of equilibrated nuclei. The statistical models of nuclear structure and nuclear reactions presented here have wide application in the analysis of heavy-ion reaction data. Applications are illustrated with examples of gamma-ray decay, the emission of light particles and heavier clusters of nucleons, and fission. In addition to the compound nucleus, the treatment of equilibrated fragments formed in binary reactions is discussed. The statistical model is shown to be an important tool for the identification of products from nonequilibrium decay
A d-statistic for single-case designs that is equivalent to the usual between-groups d-statistic.
Shadish, William R; Hedges, Larry V; Pustejovsky, James E; Boyajian, Jonathan G; Sullivan, Kristynn J; Andrade, Alma; Barrientos, Jeannette L
2014-01-01
We describe a standardised mean difference statistic (d) for single-case designs that is equivalent to the usual d in between-groups experiments. We show how it can be used to summarise treatment effects over cases within a study, to do power analyses in planning new studies and grant proposals, and to meta-analyse effects across studies of the same question. We discuss limitations of this d-statistic, and possible remedies to them. Even so, this d-statistic is better founded statistically than other effect size measures for single-case design, and unlike many general linear model approaches such as multilevel modelling or generalised additive models, it produces a standardised effect size that can be integrated over studies with different outcome measures. SPSS macros for both effect size computation and power analysis are available.
Multivariate statistical modelling based on generalized linear models
Fahrmeir, Ludwig
1994-01-01
This book is concerned with the use of generalized linear models for univariate and multivariate regression analysis. Its emphasis is to provide a detailed introductory survey of the subject based on the analysis of real data drawn from a variety of subjects including the biological sciences, economics, and the social sciences. Where possible, technical details and proofs are deferred to an appendix in order to provide an accessible account for non-experts. Topics covered include: models for multi-categorical responses, model checking, time series and longitudinal data, random effects models, and state-space models. Throughout, the authors have taken great pains to discuss the underlying theoretical ideas in ways that relate well to the data at hand. As a result, numerous researchers whose work relies on the use of these models will find this an invaluable account to have on their desks. "The basic aim of the authors is to bring together and review a large part of recent advances in statistical modelling of m...
Enhancing Dairy Manufacturing through customer feedback: A statistical approach
Vineesh, D.; Anbuudayasankar, S. P.; Narassima, M. S.
2018-02-01
Dairy products have become inevitable of habitual diet. This study aims to investigate the consumers’ satisfaction towards dairy products so as to provide useful information for the manufacturers which would serve as useful inputs for enriching the quality of products delivered. The study involved consumers of dairy products from various demographical backgrounds across South India. The questionnaire focussed on quality aspects of dairy products and also the service provided. A customer satisfaction model was developed based on various factors identified, with robust hypotheses that govern the use of the product. The developed model proved to be statistically significant as it passed the required statistical tests for reliability, construct validity and interdependency between the constructs. Some major concerns detected were regarding the fat content, taste and odour of packaged milk. A minor proportion of people (15.64%) were unsatisfied with the quality of service provided, which is another issue to be addressed to eliminate the sense of dissatisfaction in the minds of consumers.
Hagell, Peter; Westergren, Albert
Sample size is a major factor in statistical null hypothesis testing, which is the basis for many approaches to testing Rasch model fit. Few sample size recommendations for testing fit to the Rasch model concern the Rasch Unidimensional Measurement Models (RUMM) software, which features chi-square and ANOVA/F-ratio based fit statistics, including Bonferroni and algebraic sample size adjustments. This paper explores the occurrence of Type I errors with RUMM fit statistics, and the effects of algebraic sample size adjustments. Data with simulated Rasch model fitting 25-item dichotomous scales and sample sizes ranging from N = 50 to N = 2500 were analysed with and without algebraically adjusted sample sizes. Results suggest the occurrence of Type I errors with N less then or equal to 500, and that Bonferroni correction as well as downward algebraic sample size adjustment are useful to avoid such errors, whereas upward adjustment of smaller samples falsely signal misfit. Our observations suggest that sample sizes around N = 250 to N = 500 may provide a good balance for the statistical interpretation of the RUMM fit statistics studied here with respect to Type I errors and under the assumption of Rasch model fit within the examined frame of reference (i.e., about 25 item parameters well targeted to the sample).
International Nuclear Information System (INIS)
Boning, Duane S.; Chung, James E.
1998-01-01
Advanced process technology will require more detailed understanding and tighter control of variation in devices and interconnects. The purpose of statistical metrology is to provide methods to measure and characterize variation, to model systematic and random components of that variation, and to understand the impact of variation on both yield and performance of advanced circuits. Of particular concern are spatial or pattern-dependencies within individual chips; such systematic variation within the chip can have a much larger impact on performance than wafer-level random variation. Statistical metrology methods will play an important role in the creation of design rules for advanced technologies. For example, a key issue in multilayer interconnect is the uniformity of interlevel dielectric (ILD) thickness within the chip. For the case of ILD thickness, we describe phases of statistical metrology development and application to understanding and modeling thickness variation arising from chemical-mechanical polishing (CMP). These phases include screening experiments including design of test structures and test masks to gather electrical or optical data, techniques for statistical decomposition and analysis of the data, and approaches to calibrating empirical and physical variation models. These models can be integrated with circuit CAD tools to evaluate different process integration or design rule strategies. One focus for the generation of interconnect design rules are guidelines for the use of 'dummy fill' or 'metal fill' to improve the uniformity of underlying metal density and thus improve the uniformity of oxide thickness within the die. Trade-offs that can be evaluated via statistical metrology include the improvements to uniformity possible versus the effect of increased capacitance due to additional metal
A Statistical Approach for Gain Bandwidth Prediction of Phoenix-Cell Based Reflect arrays
Directory of Open Access Journals (Sweden)
Hassan Salti
2018-01-01
Full Text Available A new statistical approach to predict the gain bandwidth of Phoenix-cell based reflectarrays is proposed. It combines the effects of both main factors that limit the bandwidth of reflectarrays: spatial phase delays and intrinsic bandwidth of radiating cells. As an illustration, the proposed approach is successfully applied to two reflectarrays based on new Phoenix cells.
Active Learning with Statistical Models.
1995-01-01
Active Learning with Statistical Models ASC-9217041, NSF CDA-9309300 6. AUTHOR(S) David A. Cohn, Zoubin Ghahramani, and Michael I. Jordan 7. PERFORMING...TERMS 15. NUMBER OF PAGES Al, MIT, Artificial Intelligence, active learning , queries, locally weighted 6 regression, LOESS, mixtures of gaussians...COMPUTATIONAL LEARNING DEPARTMENT OF BRAIN AND COGNITIVE SCIENCES A.I. Memo No. 1522 January 9. 1995 C.B.C.L. Paper No. 110 Active Learning with
Dai, Mingwei; Ming, Jingsi; Cai, Mingxuan; Liu, Jin; Yang, Can; Wan, Xiang; Xu, Zongben
2017-09-15
Results from genome-wide association studies (GWAS) suggest that a complex phenotype is often affected by many variants with small effects, known as 'polygenicity'. Tens of thousands of samples are often required to ensure statistical power of identifying these variants with small effects. However, it is often the case that a research group can only get approval for the access to individual-level genotype data with a limited sample size (e.g. a few hundreds or thousands). Meanwhile, summary statistics generated using single-variant-based analysis are becoming publicly available. The sample sizes associated with the summary statistics datasets are usually quite large. How to make the most efficient use of existing abundant data resources largely remains an open question. In this study, we propose a statistical approach, IGESS, to increasing statistical power of identifying risk variants and improving accuracy of risk prediction by i ntegrating individual level ge notype data and s ummary s tatistics. An efficient algorithm based on variational inference is developed to handle the genome-wide analysis. Through comprehensive simulation studies, we demonstrated the advantages of IGESS over the methods which take either individual-level data or summary statistics data as input. We applied IGESS to perform integrative analysis of Crohns Disease from WTCCC and summary statistics from other studies. IGESS was able to significantly increase the statistical power of identifying risk variants and improve the risk prediction accuracy from 63.2% ( ±0.4% ) to 69.4% ( ±0.1% ) using about 240 000 variants. The IGESS software is available at https://github.com/daviddaigithub/IGESS . zbxu@xjtu.edu.cn or xwan@comp.hkbu.edu.hk or eeyang@hkbu.edu.hk. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Bayesian Statistics: Concepts and Applications in Animal Breeding – A Review
Directory of Open Access Journals (Sweden)
Lsxmikant-Sambhaji Kokate
2011-07-01
Full Text Available Statistics uses two major approaches- conventional (or frequentist and Bayesian approach. Bayesian approach provides a complete paradigm for both statistical inference and decision making under uncertainty. Bayesian methods solve many of the difficulties faced by conventional statistical methods, and extend the applicability of statistical methods. It exploits the use of probabilistic models to formulate scientific problems. To use Bayesian statistics, there is computational difficulty and secondly, Bayesian methods require specifying prior probability distributions. Markov Chain Monte-Carlo (MCMC methods were applied to overcome the computational difficulty, and interest in Bayesian methods was renewed. In Bayesian statistics, Bayesian structural equation model (SEM is used. It provides a powerful and flexible approach for studying quantitative traits for wide spectrum problems and thus it has no operational difficulties, with the exception of some complex cases. In this method, the problems are solved at ease, and the statisticians feel it comfortable with the particular way of expressing the results and employing the software available to analyze a large variety of problems.
Serdobolskii, Vadim Ivanovich
2007-01-01
This monograph presents mathematical theory of statistical models described by the essentially large number of unknown parameters, comparable with sample size but can also be much larger. In this meaning, the proposed theory can be called "essentially multiparametric". It is developed on the basis of the Kolmogorov asymptotic approach in which sample size increases along with the number of unknown parameters.This theory opens a way for solution of central problems of multivariate statistics, which up until now have not been solved. Traditional statistical methods based on the idea of an infinite sampling often break down in the solution of real problems, and, dependent on data, can be inefficient, unstable and even not applicable. In this situation, practical statisticians are forced to use various heuristic methods in the hope the will find a satisfactory solution.Mathematical theory developed in this book presents a regular technique for implementing new, more efficient versions of statistical procedures. ...
DEFF Research Database (Denmark)
Fock, Jeppe; Sørensen, Jakob Kryger; Lörtscher, Emanuel
2011-01-01
We report on the vibrational fingerprint of single C(60) terminated molecules in a mechanically controlled break junction (MCBJ) setup using a novel statistical approach manipulating the junction mechanically to address different molecular configurations and to monitor the corresponding vibration...
Statistical aspects of fish stock assessment
DEFF Research Database (Denmark)
Berg, Casper Willestofte
for stock assessment by application of state-of-the-art statistical methodology. The main contributions are presented in the form of six research papers. The major part of the thesis deals with age-structured assessment models, which is the most common approach. Conversion from length to age distributions...... statistical aspects of fish stocks assessment, which includes topics such as time series analysis, generalized additive models (GAMs), and non-linear state-space/mixed models capable of handling missing data and a high number of latent states and parameters. The aim is to improve the existing methods...
Parametric analysis of the statistical model of the stick-slip process
Lima, Roberta; Sampaio, Rubens
2017-06-01
In this paper it is performed a parametric analysis of the statistical model of the response of a dry-friction oscillator. The oscillator is a spring-mass system which moves over a base with a rough surface. Due to this roughness, the mass is subject to a dry-frictional force modeled as a Coulomb friction. The system is stochastically excited by an imposed bang-bang base motion. The base velocity is modeled by a Poisson process for which a probabilistic model is fully specified. The excitation induces in the system stochastic stick-slip oscillations. The system response is composed by a random sequence alternating stick and slip-modes. With realizations of the system, a statistical model is constructed for this sequence. In this statistical model, the variables of interest of the sequence are modeled as random variables, as for example, the number of time intervals in which stick or slip occur, the instants at which they begin, and their duration. Samples of the system response are computed by integration of the dynamic equation of the system using independent samples of the base motion. Statistics and histograms of the random variables which characterize the stick-slip process are estimated for the generated samples. The objective of the paper is to analyze how these estimated statistics and histograms vary with the system parameters, i.e., to make a parametric analysis of the statistical model of the stick-slip process.
An approach to the interpretation of backpropagation neural network models in QSAR studies.
Baskin, I I; Ait, A O; Halberstam, N M; Palyulin, V A; Zefirov, N S
2002-03-01
An approach to the interpretation of backpropagation neural network models for quantitative structure-activity and structure-property relationships (QSAR/QSPR) studies is proposed. The method is based on analyzing the first and second moments of distribution of the values of the first and the second partial derivatives of neural network outputs with respect to inputs calculated at data points. The use of such statistics makes it possible not only to obtain actually the same characteristics as for the case of traditional "interpretable" statistical methods, such as the linear regression analysis, but also to reveal important additional information regarding the non-linear character of QSAR/QSPR relationships. The approach is illustrated by an example of interpreting a backpropagation neural network model for predicting position of the long-wave absorption band of cyane dyes.
Study on Semi-Parametric Statistical Model of Safety Monitoring of Cracks in Concrete Dams
Directory of Open Access Journals (Sweden)
Chongshi Gu
2013-01-01
Full Text Available Cracks are one of the hidden dangers in concrete dams. The study on safety monitoring models of concrete dam cracks has always been difficult. Using the parametric statistical model of safety monitoring of cracks in concrete dams, with the help of the semi-parametric statistical theory, and considering the abnormal behaviors of these cracks, the semi-parametric statistical model of safety monitoring of concrete dam cracks is established to overcome the limitation of the parametric model in expressing the objective model. Previous projects show that the semi-parametric statistical model has a stronger fitting effect and has a better explanation for cracks in concrete dams than the parametric statistical model. However, when used for forecast, the forecast capability of the semi-parametric statistical model is equivalent to that of the parametric statistical model. The modeling of the semi-parametric statistical model is simple, has a reasonable principle, and has a strong practicality, with a good application prospect in the actual project.
International Nuclear Information System (INIS)
Yeh, L.
1992-01-01
The phase-space-picture approach to quantum non-equilibrium statistical mechanics via the characteristic function of infinite- mode squeezed coherent states is introduced. We use quantum Brownian motion as an example to show how this approach provides an interesting geometrical interpretation of quantum non-equilibrium phenomena
Statistical models for competing risk analysis
International Nuclear Information System (INIS)
Sather, H.N.
1976-08-01
Research results on three new models for potential applications in competing risks problems. One section covers the basic statistical relationships underlying the subsequent competing risks model development. Another discusses the problem of comparing cause-specific risk structure by competing risks theory in two homogeneous populations, P1 and P2. Weibull models which allow more generality than the Berkson and Elveback models are studied for the effect of time on the hazard function. The use of concomitant information for modeling single-risk survival is extended to the multiple failure mode domain of competing risks. The model used to illustrate the use of this methodology is a life table model which has constant hazards within pre-designated intervals of the time scale. Two parametric models for bivariate dependent competing risks, which provide interesting alternatives, are proposed and examined
Directory of Open Access Journals (Sweden)
Natasa M Milic
Full Text Available Although recent studies report on the benefits of blended learning in improving medical student education, there is still no empirical evidence on the relative effectiveness of blended over traditional learning approaches in medical statistics. We implemented blended along with on-site (i.e. face-to-face learning to further assess the potential value of web-based learning in medical statistics.This was a prospective study conducted with third year medical undergraduate students attending the Faculty of Medicine, University of Belgrade, who passed (440 of 545 the final exam of the obligatory introductory statistics course during 2013-14. Student statistics achievements were stratified based on the two methods of education delivery: blended learning and on-site learning. Blended learning included a combination of face-to-face and distance learning methodologies integrated into a single course.Mean exam scores for the blended learning student group were higher than for the on-site student group for both final statistics score (89.36±6.60 vs. 86.06±8.48; p = 0.001 and knowledge test score (7.88±1.30 vs. 7.51±1.36; p = 0.023 with a medium effect size. There were no differences in sex or study duration between the groups. Current grade point average (GPA was higher in the blended group. In a multivariable regression model, current GPA and knowledge test scores were associated with the final statistics score after adjusting for study duration and learning modality (p<0.001.This study provides empirical evidence to support educator decisions to implement different learning environments for teaching medical statistics to undergraduate medical students. Blended and on-site training formats led to similar knowledge acquisition; however, students with higher GPA preferred the technology assisted learning format. Implementation of blended learning approaches can be considered an attractive, cost-effective, and efficient alternative to traditional
Zhang, Harrison G; Ying, Gui-Shuang
2018-02-09
The aim of this study is to evaluate the current practice of statistical analysis of eye data in clinical science papers published in British Journal of Ophthalmology ( BJO ) and to determine whether the practice of statistical analysis has improved in the past two decades. All clinical science papers (n=125) published in BJO in January-June 2017 were reviewed for their statistical analysis approaches for analysing primary ocular measure. We compared our findings to the results from a previous paper that reviewed BJO papers in 1995. Of 112 papers eligible for analysis, half of the studies analysed the data at an individual level because of the nature of observation, 16 (14%) studies analysed data from one eye only, 36 (32%) studies analysed data from both eyes at ocular level, one study (1%) analysed the overall summary of ocular finding per individual and three (3%) studies used the paired comparison. Among studies with data available from both eyes, 50 (89%) of 56 papers in 2017 did not analyse data from both eyes or ignored the intereye correlation, as compared with in 60 (90%) of 67 papers in 1995 (P=0.96). Among studies that analysed data from both eyes at an ocular level, 33 (92%) of 36 studies completely ignored the intereye correlation in 2017, as compared with in 16 (89%) of 18 studies in 1995 (P=0.40). A majority of studies did not analyse the data properly when data from both eyes were available. The practice of statistical analysis did not improve in the past two decades. Collaborative efforts should be made in the vision research community to improve the practice of statistical analysis for ocular data. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
SoS contract verification using statistical model checking
Directory of Open Access Journals (Sweden)
Alessandro Mignogna
2013-11-01
Full Text Available Exhaustive formal verification for systems of systems (SoS is impractical and cannot be applied on a large scale. In this paper we propose to use statistical model checking for efficient verification of SoS. We address three relevant aspects for systems of systems: 1 the model of the SoS, which includes stochastic aspects; 2 the formalization of the SoS requirements in the form of contracts; 3 the tool-chain to support statistical model checking for SoS. We adapt the SMC technique for application to heterogeneous SoS. We extend the UPDM/SysML specification language to express the SoS requirements that the implemented strategies over the SoS must satisfy. The requirements are specified with a new contract language specifically designed for SoS, targeting a high-level English- pattern language, but relying on an accurate semantics given by the standard temporal logics. The contracts are verified against the UPDM/SysML specification using the Statistical Model Checker (SMC PLASMA combined with the simulation engine DESYRE, which integrates heterogeneous behavioral models through the functional mock-up interface (FMI standard. The tool-chain allows computing an estimation of the satisfiability of the contracts by the SoS. The results help the system architect to trade-off different solutions to guide the evolution of the SoS.
Directory of Open Access Journals (Sweden)
Achim Schilling
2017-10-01
Full Text Available Background: An increasingly used behavioral paradigm for the objective assessment of a possible tinnitus percept in animal models has been proposed by Turner and coworkers in 2006. It is based on gap-prepulse inhibition (PPI of the acoustic startle reflex (ASR and usually referred to as GPIAS. As it does not require conditioning it became the method of choice to study neuroplastic phenomena associated with the development of tinnitus.Objective: It is still controversial if GPIAS is really appropriate for tinnitus screening, as the hypothesis that a tinnitus percept impairs the gap detection ability (“filling-in interpretation” is still questioned. Furthermore, a wide range of criteria for positive tinnitus detection in GPIAS have been used across different laboratories and there still is no consensus on a best practice for statistical evaluation of GPIAS results. Current approaches are often based on simple averaging of measured PPI values and comparisons on a population level without the possibility to perform valid statistics on the level of the single animal.Methods: A total number of 32 animals were measured using the standard GPIAS paradigm with varying number of measurement repetitions. Based on this data further statistical considerations were performed.Results: We here present a new statistical approach to overcome the methodological limitations of GPIAS. In a first step we show that ASR amplitudes are not normally distributed. Next we estimate the distribution of the measured PPI values by exploiting the full combinatorial power of all measured ASR amplitudes. We demonstrate that the amplitude ratios (1-PPI are approximately lognormally distributed, allowing for parametrical testing of the logarithmized values and present a new statistical approach allowing for a valid and reliable statistical assessment of PPI changes in GPIAS.Conclusion: Based on our statistical approach we recommend using a constant criterion, which does not
Model selection and inference a practical information-theoretic approach
Burnham, Kenneth P
1998-01-01
This book is unique in that it covers the philosophy of model-based data analysis and an omnibus strategy for the analysis of empirical data The book introduces information theoretic approaches and focuses critical attention on a priori modeling and the selection of a good approximating model that best represents the inference supported by the data Kullback-Leibler information represents a fundamental quantity in science and is Hirotugu Akaike's basis for model selection The maximized log-likelihood function can be bias-corrected to provide an estimate of expected, relative Kullback-Leibler information This leads to Akaike's Information Criterion (AIC) and various extensions and these are relatively simple and easy to use in practice, but little taught in statistics classes and far less understood in the applied sciences than should be the case The information theoretic approaches provide a unified and rigorous theory, an extension of likelihood theory, an important application of information theory, and are ...
Spatial Statistical Data Fusion (SSDF)
Braverman, Amy J.; Nguyen, Hai M.; Cressie, Noel
2013-01-01
As remote sensing for scientific purposes has transitioned from an experimental technology to an operational one, the selection of instruments has become more coordinated, so that the scientific community can exploit complementary measurements. However, tech nological and scientific heterogeneity across devices means that the statistical characteristics of the data they collect are different. The challenge addressed here is how to combine heterogeneous remote sensing data sets in a way that yields optimal statistical estimates of the underlying geophysical field, and provides rigorous uncertainty measures for those estimates. Different remote sensing data sets may have different spatial resolutions, different measurement error biases and variances, and other disparate characteristics. A state-of-the-art spatial statistical model was used to relate the true, but not directly observed, geophysical field to noisy, spatial aggregates observed by remote sensing instruments. The spatial covariances of the true field and the covariances of the true field with the observations were modeled. The observations are spatial averages of the true field values, over pixels, with different measurement noise superimposed. A kriging framework is used to infer optimal (minimum mean squared error and unbiased) estimates of the true field at point locations from pixel-level, noisy observations. A key feature of the spatial statistical model is the spatial mixed effects model that underlies it. The approach models the spatial covariance function of the underlying field using linear combinations of basis functions of fixed size. Approaches based on kriging require the inversion of very large spatial covariance matrices, and this is usually done by making simplifying assumptions about spatial covariance structure that simply do not hold for geophysical variables. In contrast, this method does not require these assumptions, and is also computationally much faster. This method is
Statistical margin to DNB safety analysis approach for LOFT
International Nuclear Information System (INIS)
Atkinson, S.A.
1982-01-01
A method was developed and used for LOFT thermal safety analysis to estimate the statistical margin to DNB for the hot rod, and to base safety analysis on desired DNB probability limits. This method is an advanced approach using response surface analysis methods, a very efficient experimental design, and a 2nd-order response surface equation with a 2nd-order error propagation analysis to define the MDNBR probability density function. Calculations for limiting transients were used in the response surface analysis thereby including transient interactions and trip uncertainties in the MDNBR probability density
Complex Data Modeling and Computationally Intensive Statistical Methods
Mantovan, Pietro
2010-01-01
The last years have seen the advent and development of many devices able to record and store an always increasing amount of complex and high dimensional data; 3D images generated by medical scanners or satellite remote sensing, DNA microarrays, real time financial data, system control datasets. The analysis of this data poses new challenging problems and requires the development of novel statistical models and computational methods, fueling many fascinating and fast growing research areas of modern statistics. The book offers a wide variety of statistical methods and is addressed to statistici
Moraes, Alvaro
2015-01-01
aspect, we first mention an innovative multi- scale approach, where we introduce a deterministic systematic way of using up-scaled likelihoods for parameter estimation while the statistical fittings are done in the base model through the use of the Master Equation. In a di↵erent approach, we derive a new forward-reverse representation for simulating stochastic bridges between con- secutive observations. This allows us to use the well-known EM Algorithm to infer the reaction rates. The forward-reverse methodology is boosted by an initial phase where, using multi-scale approximation techniques, we provide initial values for the EM Algorithm.
A statistical model for porous structure of rocks
Institute of Scientific and Technical Information of China (English)
JU Yang; YANG YongMing; SONG ZhenDuo; XU WenJing
2008-01-01
The geometric features and the distribution properties of pores in rocks were In-vestigated by means of CT scanning tests of sandstones. The centroidal coordl-nares of pores, the statistic characterristics of pore distance, quantity, size and their probability density functions were formulated in this paper. The Monte Carlo method and the random number generating algorithm were employed to generate two series of random numbers with the desired statistic characteristics and prob-ability density functions upon which the random distribution of pore position, dis-tance and quantity were determined. A three-dimensional porous structural model of sandstone was constructed based on the FLAC3D program and the information of the pore position and distribution that the series of random numbers defined. On the basis of modelling, the Brazil split tests of rock discs were carried out to ex-amine the stress distribution, the pattern of element failure and the inoaculation of failed elements. The simulation indicated that the proposed model was consistent with the realistic porous structure of rock in terms of their statistic properties of pores and geometric similarity. The built-up model disclosed the influence of pores on the stress distribution, failure mode of material elements and the inosculation of failed elements.
A statistical model for porous structure of rocks
Institute of Scientific and Technical Information of China (English)
2008-01-01
The geometric features and the distribution properties of pores in rocks were in- vestigated by means of CT scanning tests of sandstones. The centroidal coordi- nates of pores, the statistic characterristics of pore distance, quantity, size and their probability density functions were formulated in this paper. The Monte Carlo method and the random number generating algorithm were employed to generate two series of random numbers with the desired statistic characteristics and prob- ability density functions upon which the random distribution of pore position, dis- tance and quantity were determined. A three-dimensional porous structural model of sandstone was constructed based on the FLAC3D program and the information of the pore position and distribution that the series of random numbers defined. On the basis of modelling, the Brazil split tests of rock discs were carried out to ex- amine the stress distribution, the pattern of element failure and the inosculation of failed elements. The simulation indicated that the proposed model was consistent with the realistic porous structure of rock in terms of their statistic properties of pores and geometric similarity. The built-up model disclosed the influence of pores on the stress distribution, failure mode of material elements and the inosculation of failed elements.
(ajst) statistical mechanics model for orientational
African Journals Online (AJOL)
Science and Engineering Series Vol. 6, No. 2, pp. 94 - 101. STATISTICAL MECHANICS MODEL FOR ORIENTATIONAL. MOTION OF TWO-DIMENSIONAL RIGID ROTATOR. Malo, J.O. ... there is no translational motion and that they are well separated so .... constant and I is the moment of inertia of a linear rotator. Thus, the ...
Martin, Jordan S; Suarez, Scott A
2017-08-01
Interest in quantifying consistent among-individual variation in primate behavior, also known as personality, has grown rapidly in recent decades. Although behavioral coding is the most frequently utilized method for assessing primate personality, limitations in current statistical practice prevent researchers' from utilizing the full potential of their coding datasets. These limitations include the use of extensive data aggregation, not modeling biologically relevant sources of individual variance during repeatability estimation, not partitioning between-individual (co)variance prior to modeling personality structure, the misuse of principal component analysis, and an over-reliance upon exploratory statistical techniques to compare personality models across populations, species, and data collection methods. In this paper, we propose a statistical framework for primate personality research designed to address these limitations. Our framework synthesizes recently developed mixed-effects modeling approaches for quantifying behavioral variation with an information-theoretic model selection paradigm for confirmatory personality research. After detailing a multi-step analytic procedure for personality assessment and model comparison, we employ this framework to evaluate seven models of personality structure in zoo-housed bonobos (Pan paniscus). We find that differences between sexes, ages, zoos, time of observation, and social group composition contributed to significant behavioral variance. Independently of these factors, however, personality nonetheless accounted for a moderate to high proportion of variance in average behavior across observational periods. A personality structure derived from past rating research receives the strongest support relative to our model set. This model suggests that personality variation across the measured behavioral traits is best described by two correlated but distinct dimensions reflecting individual differences in affiliation and
Dry Sliding Wear Behavior of Super Duplex Stainless Steel AISI 2507: a Statistical Approach
Directory of Open Access Journals (Sweden)
Davanageri M.
2016-12-01
Full Text Available The dry sliding wear behavior of heat-treated super duplex stainless steel AISI 2507 was examined by taking pin-on-disc type of wear-test rig. Independent parameters, namely applied load, sliding distance, and sliding speed, influence mainly the wear rate of super duplex stainless steel. The said material was heat treated to a temperature of 850°C for 1 hour followed by water quenching. The heat treatment was carried out to precipitate the secondary sigma phase formation. Experiments were conducted to study the influence of independent parameters set at three factor levels using the L27 orthogonal array of the Taguchi experimental design on the wear rate. Statistical significance of both individual and combined factor effects was determined for specific wear rate. Surface plots were drawn to explain the behavior of independent variables on the measured wear rate. Statistically, the models were validated using the analysis of variance test. Multiple non-linear regression equations were derived for wear rate expressed as non-linear functions of independent variables. Further, the prediction accuracy of the developed regression equation was tested with the actual experiments. The independent parameters responsible for the desired minimum wear rate were determined by using the desirability function approach. The worn-out surface characteristics obtained for the minimum wear rate was examined using the scanning electron microscope. The desired smooth surface was obtained for the determined optimal condition by desirability function approach.
Computer simulation of HTGR fuel microspheres using a Monte-Carlo statistical approach
International Nuclear Information System (INIS)
Hedrick, C.E.
1976-01-01
The concept and computational aspects of a Monte-Carlo statistical approach in relating structure of HTGR fuel microspheres to the uranium content of fuel samples have been verified. Results of the preliminary validation tests and the benefits to be derived from the program are summarized
Maximum entropy approach to statistical inference for an ocean acoustic waveguide.
Knobles, D P; Sagers, J D; Koch, R A
2012-02-01
A conditional probability distribution suitable for estimating the statistical properties of ocean seabed parameter values inferred from acoustic measurements is derived from a maximum entropy principle. The specification of the expectation value for an error function constrains the maximization of an entropy functional. This constraint determines the sensitivity factor (β) to the error function of the resulting probability distribution, which is a canonical form that provides a conservative estimate of the uncertainty of the parameter values. From the conditional distribution, marginal distributions for individual parameters can be determined from integration over the other parameters. The approach is an alternative to obtaining the posterior probability distribution without an intermediary determination of the likelihood function followed by an application of Bayes' rule. In this paper the expectation value that specifies the constraint is determined from the values of the error function for the model solutions obtained from a sparse number of data samples. The method is applied to ocean acoustic measurements taken on the New Jersey continental shelf. The marginal probability distribution for the values of the sound speed ratio at the surface of the seabed and the source levels of a towed source are examined for different geoacoustic model representations. © 2012 Acoustical Society of America
A Review of Modeling Bioelectrochemical Systems: Engineering and Statistical Aspects
Directory of Open Access Journals (Sweden)
Shuai Luo
2016-02-01
Full Text Available Bioelectrochemical systems (BES are promising technologies to convert organic compounds in wastewater to electrical energy through a series of complex physical-chemical, biological and electrochemical processes. Representative BES such as microbial fuel cells (MFCs have been studied and advanced for energy recovery. Substantial experimental and modeling efforts have been made for investigating the processes involved in electricity generation toward the improvement of the BES performance for practical applications. However, there are many parameters that will potentially affect these processes, thereby making the optimization of system performance hard to be achieved. Mathematical models, including engineering models and statistical models, are powerful tools to help understand the interactions among the parameters in BES and perform optimization of BES configuration/operation. This review paper aims to introduce and discuss the recent developments of BES modeling from engineering and statistical aspects, including analysis on the model structure, description of application cases and sensitivity analysis of various parameters. It is expected to serves as a compass for integrating the engineering and statistical modeling strategies to improve model accuracy for BES development.
Kalantari, Zahra; Cavalli, Marco; Cantone, Carolina; Crema, Stefano; Destouni, Georgia
2017-03-01
Climate-driven increase in the frequency of extreme hydrological events is expected to impose greater strain on the built environment and major transport infrastructure, such as roads and railways. This study develops a data-driven spatial-statistical approach to quantifying and mapping the probability of flooding at critical road-stream intersection locations, where water flow and sediment transport may accumulate and cause serious road damage. The approach is based on novel integration of key watershed and road characteristics, including also measures of sediment connectivity. The approach is concretely applied to and quantified for two specific study case examples in southwest Sweden, with documented road flooding effects of recorded extreme rainfall. The novel contributions of this study in combining a sediment connectivity account with that of soil type, land use, spatial precipitation-runoff variability and road drainage in catchments, and in extending the connectivity measure use for different types of catchments, improve the accuracy of model results for road flood probability. Copyright © 2016 Elsevier B.V. All rights reserved.
Intuitive introductory statistics
Wolfe, Douglas A
2017-01-01
This textbook is designed to give an engaging introduction to statistics and the art of data analysis. The unique scope includes, but also goes beyond, classical methodology associated with the normal distribution. What if the normal model is not valid for a particular data set? This cutting-edge approach provides the alternatives. It is an introduction to the world and possibilities of statistics that uses exercises, computer analyses, and simulations throughout the core lessons. These elementary statistical methods are intuitive. Counting and ranking features prominently in the text. Nonparametric methods, for instance, are often based on counts and ranks and are very easy to integrate into an introductory course. The ease of computation with advanced calculators and statistical software, both of which factor into this text, allows important techniques to be introduced earlier in the study of statistics. This book's novel scope also includes measuring symmetry with Walsh averages, finding a nonp...
Ahmed, Kazi Farzan; Wang, Guiling; Silander, John; Wilson, Adam M.; Allen, Jenica M.; Horton, Radley; Anyah, Richard
2013-01-01
Statistical downscaling can be used to efficiently downscale a large number of General Circulation Model (GCM) outputs to a fine temporal and spatial scale. To facilitate regional impact assessments, this study statistically downscales (to 1/8deg spatial resolution) and corrects the bias of daily maximum and minimum temperature and daily precipitation data from six GCMs and four Regional Climate Models (RCMs) for the northeast United States (US) using the Statistical Downscaling and Bias Correction (SDBC) approach. Based on these downscaled data from multiple models, five extreme indices were analyzed for the future climate to quantify future changes of climate extremes. For a subset of models and indices, results based on raw and bias corrected model outputs for the present-day climate were compared with observations, which demonstrated that bias correction is important not only for GCM outputs, but also for RCM outputs. For future climate, bias correction led to a higher level of agreements among the models in predicting the magnitude and capturing the spatial pattern of the extreme climate indices. We found that the incorporation of dynamical downscaling as an intermediate step does not lead to considerable differences in the results of statistical downscaling for the study domain.
Adair, Desmond; Jaeger, Martin; Price, Owen M.
2018-01-01
The use of a portfolio curriculum approach, when teaching a university introductory statistics and probability course to engineering students, is developed and evaluated. The portfolio curriculum approach, so called, as the students need to keep extensive records both as hard copies and digitally of reading materials, interactions with faculty,…
Pathak, Lakshmi; Singh, Vineeta; Niwas, Ram; Osama, Khwaja; Khan, Saif; Haque, Shafiul; Tripathi, C K M; Mishra, B N
2015-01-01
Cholesterol oxidase (COD) is a bi-functional FAD-containing oxidoreductase which catalyzes the oxidation of cholesterol into 4-cholesten-3-one. The wider biological functions and clinical applications of COD have urged the screening, isolation and characterization of newer microbes from diverse habitats as a source of COD and optimization and over-production of COD for various uses. The practicability of statistical/ artificial intelligence techniques, such as response surface methodology (RSM), artificial neural network (ANN) and genetic algorithm (GA) have been tested to optimize the medium composition for the production of COD from novel strain Streptomyces sp. NCIM 5500. All experiments were performed according to the five factor central composite design (CCD) and the generated data was analysed using RSM and ANN. GA was employed to optimize the models generated by RSM and ANN. Based upon the predicted COD concentration, the model developed with ANN was found to be superior to the model developed with RSM. The RSM-GA approach predicted maximum of 6.283 U/mL COD production, whereas the ANN-GA approach predicted a maximum of 9.93 U/mL COD concentration. The optimum concentrations of the medium variables predicted through ANN-GA approach were: 1.431 g/50 mL soybean, 1.389 g/50 mL maltose, 0.029 g/50 mL MgSO4, 0.45 g/50 mL NaCl and 2.235 ml/50 mL glycerol. The experimental COD concentration was concurrent with the GA predicted yield and led to 9.75 U/mL COD production, which was nearly two times higher than the yield (4.2 U/mL) obtained with the un-optimized medium. This is the very first time we are reporting the statistical versus artificial intelligence based modeling and optimization of COD production by Streptomyces sp. NCIM 5500.
Directory of Open Access Journals (Sweden)
Lakshmi Pathak
Full Text Available Cholesterol oxidase (COD is a bi-functional FAD-containing oxidoreductase which catalyzes the oxidation of cholesterol into 4-cholesten-3-one. The wider biological functions and clinical applications of COD have urged the screening, isolation and characterization of newer microbes from diverse habitats as a source of COD and optimization and over-production of COD for various uses. The practicability of statistical/ artificial intelligence techniques, such as response surface methodology (RSM, artificial neural network (ANN and genetic algorithm (GA have been tested to optimize the medium composition for the production of COD from novel strain Streptomyces sp. NCIM 5500. All experiments were performed according to the five factor central composite design (CCD and the generated data was analysed using RSM and ANN. GA was employed to optimize the models generated by RSM and ANN. Based upon the predicted COD concentration, the model developed with ANN was found to be superior to the model developed with RSM. The RSM-GA approach predicted maximum of 6.283 U/mL COD production, whereas the ANN-GA approach predicted a maximum of 9.93 U/mL COD concentration. The optimum concentrations of the medium variables predicted through ANN-GA approach were: 1.431 g/50 mL soybean, 1.389 g/50 mL maltose, 0.029 g/50 mL MgSO4, 0.45 g/50 mL NaCl and 2.235 ml/50 mL glycerol. The experimental COD concentration was concurrent with the GA predicted yield and led to 9.75 U/mL COD production, which was nearly two times higher than the yield (4.2 U/mL obtained with the un-optimized medium. This is the very first time we are reporting the statistical versus artificial intelligence based modeling and optimization of COD production by Streptomyces sp. NCIM 5500.
Acceleration transforms and statistical kinetic models
International Nuclear Information System (INIS)
LuValle, M.J.; Welsher, T.L.; Svoboda, K.
1988-01-01
For a restricted class of problems a mathematical model of microscopic degradation processes, statistical kinetics, is developed and linked through acceleration transforms to the information which can be obtained from a system in which the only observable sign of degradation is sudden and catastrophic failure. The acceleration transforms were developed in accelerated life testing applications as a tool for extrapolating from the observable results of an accelerated life test to the dynamics of the underlying degradation processes. A particular concern of a physicist attempting to interpreted the results of an analysis based on acceleration transforms is determining the physical species involved in the degradation process. These species may be (a) relatively abundant or (b) relatively rare. The main results of this paper are a theorem showing that for an important subclass of statistical kinetic models, acceleration transforms cannot be used to distinguish between cases a and b, and an example showing that in some cases falling outside the restrictions of the theorem, cases a and b can be distinguished by their acceleration transforms
Statistical models describing the energy signature of buildings
DEFF Research Database (Denmark)
Bacher, Peder; Madsen, Henrik; Thavlov, Anders
2010-01-01
Approximately one third of the primary energy production in Denmark is used for heating in buildings. Therefore efforts to accurately describe and improve energy performance of the building mass are very important. For this purpose statistical models describing the energy signature of a building, i...... or varying energy prices. The paper will give an overview of statistical methods and applied models based on experiments carried out in FlexHouse, which is an experimental building in SYSLAB, Risø DTU. The models are of different complexity and can provide estimates of physical quantities such as UA......-values, time constants of the building, and other parameters related to the heat dynamics. A method for selecting the most appropriate model for a given building is outlined and finally a perspective of the applications is given. Aknowledgements to the Danish Energy Saving Trust and the Interreg IV ``Vind i...
"The Two Brothers": Reconciling Perceptual-Cognitive and Statistical Models of Musical Evolution.
Jan, Steven
2018-01-01
While the "units, events and dynamics" of memetic evolution have been abstractly theorized (Lynch, 1998), they have not been applied systematically to real corpora in music. Some researchers, convinced of the validity of cultural evolution in more than the metaphorical sense adopted by much musicology, but perhaps skeptical of some or all of the claims of memetics, have attempted statistically based corpus-analysis techniques of music drawn from molecular biology, and these have offered strong evidence in favor of system-level change over time (Savage, 2017). This article argues that such statistical approaches, while illuminating, ignore the psychological realities of music-information grouping, the transmission of such groups with varying degrees of fidelity, their selection according to relative perceptual-cognitive salience, and the power of this Darwinian process to drive the systemic changes (such as the development over time of systems of tonal organization in music) that statistical methodologies measure. It asserts that a synthesis between such statistical approaches to the study of music-cultural change and the theory of memetics as applied to music (Jan, 2007), in particular the latter's perceptual-cognitive elements, would harness the strengths of each approach and deepen understanding of cultural evolution in music.
Statistical Approaches to Aerosol Dynamics for Climate Simulation
Energy Technology Data Exchange (ETDEWEB)
Zhu, Wei
2014-09-02
In this work, we introduce two general non-parametric regression analysis methods for errors-in-variable (EIV) models: the compound regression, and the constrained regression. It is shown that these approaches are equivalent to each other and, to the general parametric structural modeling approach. The advantages of these methods lie in their intuitive geometric representations, their distribution free nature, and their ability to offer a practical solution when the ratio of the error variances is unknown. Each includes the classic non-parametric regression methods of ordinary least squares, geometric mean regression, and orthogonal regression as special cases. Both methods can be readily generalized to multiple linear regression with two or more random regressors.
Statistical tests for person misfit in computerized adaptive testing
Glas, Cornelis A.W.; Meijer, R.R.; van Krimpen-Stoop, Edith
1998-01-01
Recently, several person-fit statistics have been proposed to detect nonfitting response patterns. This study is designed to generalize an approach followed by Klauer (1995) to an adaptive testing system using the two-parameter logistic model (2PL) as a null model. The approach developed by Klauer
Dynamic Statistical Models for Pyroclastic Density Current Generation at Soufrière Hills Volcano
Wolpert, Robert L.; Spiller, Elaine T.; Calder, Eliza S.
2018-05-01
To mitigate volcanic hazards from pyroclastic density currents, volcanologists generate hazard maps that provide long-term forecasts of areas of potential impact. Several recent efforts in the field develop new statistical methods for application of flow models to generate fully probabilistic hazard maps that both account for, and quantify, uncertainty. However a limitation to the use of most statistical hazard models, and a key source of uncertainty within them, is the time-averaged nature of the datasets by which the volcanic activity is statistically characterized. Where the level, or directionality, of volcanic activity frequently changes, e.g. during protracted eruptive episodes, or at volcanoes that are classified as persistently active, it is not appropriate to make short term forecasts based on longer time-averaged metrics of the activity. Thus, here we build, fit and explore dynamic statistical models for the generation of pyroclastic density current from Soufrière Hills Volcano (SHV) on Montserrat including their respective collapse direction and flow volumes based on 1996-2008 flow datasets. The development of this approach allows for short-term behavioral changes to be taken into account in probabilistic volcanic hazard assessments. We show that collapses from the SHV lava dome follow a clear pattern, and that a series of smaller flows in a given direction often culminate in a larger collapse and thereafter directionality of the flows change. Such models enable short term forecasting (weeks to months) that can reflect evolving conditions such as dome and crater morphology changes and non-stationary eruptive behavior such as extrusion rate variations. For example, the probability of inundation of the Belham Valley in the first 180 days of a forecast period is about twice as high for lava domes facing Northwest toward that valley as it is for domes pointing East toward the Tar River Valley. As rich multi-parametric volcano monitoring dataset become
The epistemology of mathematical and statistical modeling: a quiet methodological revolution.
Rodgers, Joseph Lee
2010-01-01
A quiet methodological revolution, a modeling revolution, has occurred over the past several decades, almost without discussion. In contrast, the 20th century ended with contentious argument over the utility of null hypothesis significance testing (NHST). The NHST controversy may have been at least partially irrelevant, because in certain ways the modeling revolution obviated the NHST argument. I begin with a history of NHST and modeling and their relation to one another. Next, I define and illustrate principles involved in developing and evaluating mathematical models. Following, I discuss the difference between using statistical procedures within a rule-based framework and building mathematical models from a scientific epistemology. Only the former is treated carefully in most psychology graduate training. The pedagogical implications of this imbalance and the revised pedagogy required to account for the modeling revolution are described. To conclude, I discuss how attention to modeling implies shifting statistical practice in certain progressive ways. The epistemological basis of statistics has moved away from being a set of procedures, applied mechanistically, and moved toward building and evaluating statistical and scientific models. Copyrigiht 2009 APA, all rights reserved.
Establishing statistical models of manufacturing parameters
International Nuclear Information System (INIS)
Senevat, J.; Pape, J.L.; Deshayes, J.F.
1991-01-01
This paper reports on the effect of pilgering and cold-work parameters on contractile strain ratio and mechanical properties that were investigated using a large population of Zircaloy tubes. Statistical models were established between: contractile strain ratio and tooling parameters, mechanical properties (tensile test, creep test) and cold-work parameters, and mechanical properties and stress-relieving temperature
Kuik, Friderike; Lauer, Axel; von Schneidemesser, Erika; Butler, Tim
2017-04-01
Many European cities continue to struggle with meeting the European air quality limits for NO2. In Berlin, Germany, most of the exceedances in NO2 recorded at monitoring sites near busy roads can be largely attributed to emissions from traffic. In order to assess the impact of changes in traffic emissions on air quality at policy relevant scales, we combine the regional atmosphere-chemistry transport model WRF-Chem at a resolution of 1kmx1km with a statistical downscaling approach. Here, we build on the recently published study evaluating the performance of a WRF-Chem setup in representing observed urban background NO2 concentrations from Kuik et al. (2016) and extend this setup by developing and testing an approach to statistically downscale simulated urban background NO2 concentrations to street level. The approach uses a multilinear regression model to relate roadside NO2 concentrations observed with the municipal monitoring network with observed NO2 concentrations at urban background sites and observed traffic counts. For this, the urban background NO2 concentrations are decomposed into a long term, a synoptic and a diurnal component using the Kolmogorov-Zurbenko filtering method. We estimate the coefficients of the regression model for five different roadside stations in Berlin representing different street types. In a next step we combine the coefficients with simulated urban background concentrations and observed traffic counts, in order to estimate roadside NO2 concentrations based on the results obtained with WRF-Chem at the five selected stations. In a third step, we extrapolate the NO2 concentrations to all major roads in Berlin. The latter is based on available data for Berlin of daily mean traffic counts, diurnal and weekly cycles of traffic as well as simulated urban background NO2 concentrations. We evaluate the NO2 concentrations estimated with this method at street level for Berlin with additional observational data from stationary measurements and
Zhang, Bo; Liu, Wei; Zhang, Zhiwei; Qu, Yanping; Chen, Zhen; Albert, Paul S
2017-08-01
Joint modeling and within-cluster resampling are two approaches that are used for analyzing correlated data with informative cluster sizes. Motivated by a developmental toxicity study, we examined the performances and validity of these two approaches in testing covariate effects in generalized linear mixed-effects models. We show that the joint modeling approach is robust to the misspecification of cluster size models in terms of Type I and Type II errors when the corresponding covariates are not included in the random effects structure; otherwise, statistical tests may be affected. We also evaluate the performance of the within-cluster resampling procedure and thoroughly investigate the validity of it in modeling correlated data with informative cluster sizes. We show that within-cluster resampling is a valid alternative to joint modeling for cluster-specific covariates, but it is invalid for time-dependent covariates. The two methods are applied to a developmental toxicity study that investigated the effect of exposure to diethylene glycol dimethyl ether.
Directory of Open Access Journals (Sweden)
Xiao-meng Song
2013-01-01
Full Text Available Parameter identification, model calibration, and uncertainty quantification are important steps in the model-building process, and are necessary for obtaining credible results and valuable information. Sensitivity analysis of hydrological model is a key step in model uncertainty quantification, which can identify the dominant parameters, reduce the model calibration uncertainty, and enhance the model optimization efficiency. There are, however, some shortcomings in classical approaches, including the long duration of time and high computation cost required to quantitatively assess the sensitivity of a multiple-parameter hydrological model. For this reason, a two-step statistical evaluation framework using global techniques is presented. It is based on (1 a screening method (Morris for qualitative ranking of parameters, and (2 a variance-based method integrated with a meta-model for quantitative sensitivity analysis, i.e., the Sobol method integrated with the response surface model (RSMSobol. First, the Morris screening method was used to qualitatively identify the parameters' sensitivity, and then ten parameters were selected to quantify the sensitivity indices. Subsequently, the RSMSobol method was used to quantify the sensitivity, i.e., the first-order and total sensitivity indices based on the response surface model (RSM were calculated. The RSMSobol method can not only quantify the sensitivity, but also reduce the computational cost, with good accuracy compared to the classical approaches. This approach will be effective and reliable in the global sensitivity analysis of a complex large-scale distributed hydrological model.
Automatic generation of 3D statistical shape models with optimal landmark distributions.
Heimann, T; Wolf, I; Meinzer, H-P
2007-01-01
To point out the problem of non-uniform landmark placement in statistical shape modeling, to present an improved method for generating landmarks in the 3D case and to propose an unbiased evaluation metric to determine model quality. Our approach minimizes a cost function based on the minimum description length (MDL) of the shape model to optimize landmark correspondences over the training set. In addition to the standard technique, we employ an extended remeshing method to change the landmark distribution without losing correspondences, thus ensuring a uniform distribution over all training samples. To break the dependency of the established evaluation measures generalization and specificity from the landmark distribution, we change the internal metric from landmark distance to volumetric overlap. Redistributing landmarks to an equally spaced distribution during the model construction phase improves the quality of the resulting models significantly if the shapes feature prominent bulges or other complex geometry. The distribution of landmarks on the training shapes is -- beyond the correspondence issue -- a crucial point in model construction.
Applied statistical thermodynamics
Lucas, Klaus
1991-01-01
The book guides the reader from the foundations of statisti- cal thermodynamics including the theory of intermolecular forces to modern computer-aided applications in chemical en- gineering and physical chemistry. The approach is new. The foundations of quantum and statistical mechanics are presen- ted in a simple way and their applications to the prediction of fluid phase behavior of real systems are demonstrated. A particular effort is made to introduce the reader to expli- cit formulations of intermolecular interaction models and to show how these models influence the properties of fluid sy- stems. The established methods of statistical mechanics - computer simulation, perturbation theory, and numerical in- tegration - are discussed in a style appropriate for newcom- ers and are extensively applied. Numerous worked examples illustrate how practical calculations should be carried out.
Statistics for Petroleum Engineers and Geoscientists
International Nuclear Information System (INIS)
Jensen, J.L.; Lake, L.W.; Corbett, P.W.M.; Goggin, D.J.
2000-01-01
Geostatistics is a common tool in reservoir characterisation. Several texts discuss the subject, however this book differs in its approach and audience from currently available material. Written from the basics of statistics it covers only those topics that are needed for the two goals of the text: to exhibit the diagnostic potential of statistics and to introduce the important features of statistical modeling. This revised edition contains expanded discussions of some materials, in particular conditional probabilities, Bayes Theorem, correlation, and Kriging. The coverage of estimation, variability, and modeling applications have been updated. Seventy examples illustrate concepts and show the role of geology for providing important information for data analysis and model building. Four reservoir case studies conclude the presentation, illustrating the application and importance of the earlier material. This book can help petroleum professionals develop more accurate models, leading to lower sampling costs
Genser, Bernd; Fischer, Joachim E; Figueiredo, Camila A; Alcântara-Neves, Neuza; Barreto, Mauricio L; Cooper, Philip J; Amorim, Leila D; Saemann, Marcus D; Weichhart, Thomas; Rodrigues, Laura C
2016-05-20
Immunologists often measure several correlated immunological markers, such as concentrations of different cytokines produced by different immune cells and/or measured under different conditions, to draw insights from complex immunological mechanisms. Although there have been recent methodological efforts to improve the statistical analysis of immunological data, a framework is still needed for the simultaneous analysis of multiple, often correlated, immune markers. This framework would allow the immunologists' hypotheses about the underlying biological mechanisms to be integrated. We present an analytical approach for statistical analysis of correlated immune markers, such as those commonly collected in modern immuno-epidemiological studies. We demonstrate i) how to deal with interdependencies among multiple measurements of the same immune marker, ii) how to analyse association patterns among different markers, iii) how to aggregate different measures and/or markers to immunological summary scores, iv) how to model the inter-relationships among these scores, and v) how to use these scores in epidemiological association analyses. We illustrate the application of our approach to multiple cytokine measurements from 818 children enrolled in a large immuno-epidemiological study (SCAALA Salvador), which aimed to quantify the major immunological mechanisms underlying atopic diseases or asthma. We demonstrate how to aggregate systematically the information captured in multiple cytokine measurements to immunological summary scores aimed at reflecting the presumed underlying immunological mechanisms (Th1/Th2 balance and immune regulatory network). We show how these aggregated immune scores can be used as predictors in regression models with outcomes of immunological studies (e.g. specific IgE) and compare the results to those obtained by a traditional multivariate regression approach. The proposed analytical approach may be especially useful to quantify complex immune
REMAINING LIFE TIME PREDICTION OF BEARINGS USING K-STAR ALGORITHM – A STATISTICAL APPROACH
Directory of Open Access Journals (Sweden)
R. SATISHKUMAR
2017-01-01
Full Text Available The role of bearings is significant in reducing the down time of all rotating machineries. The increasing trend of bearing failures in recent times has triggered the need and importance of deployment of condition monitoring. There are multiple factors associated to a bearing failure while it is in operation. Hence, a predictive strategy is required to evaluate the current state of the bearings in operation. In past, predictive models with regression techniques were widely used for bearing lifetime estimations. The Objective of this paper is to estimate the remaining useful life of bearings through a machine learning approach. The ultimate objective of this study is to strengthen the predictive maintenance. The present study was done using classification approach following the concepts of machine learning and a predictive model was built to calculate the residual lifetime of bearings in operation. Vibration signals were acquired on a continuous basis from an experiment wherein the bearings are made to run till it fails naturally. It should be noted that the experiment was carried out with new bearings at pre-defined load and speed conditions until the bearing fails on its own. In the present work, statistical features were deployed and feature selection process was carried out using J48 decision tree and selected features were used to develop the prognostic model. The K-Star classification algorithm, a supervised machine learning technique is made use of in building a predictive model to estimate the lifetime of bearings. The performance of classifier was cross validated with distinct data. The result shows that the K-Star classification model gives 98.56% classification accuracy with selected features.
Franceschi, Massimo; Caffarra, Paolo; Savarè, Rita; Cerutti, Renata; Grossi, Enzo
2011-01-01
The early differentiation of Alzheimer's disease (AD) from frontotemporal dementia (FTD) may be difficult. The Tower of London (ToL), thought to assess executive functions such as planning and visuo-spatial working memory, could help in this purpose. Twentytwo Dementia Centers consecutively recruited patients with early FTD or AD. ToL performances of these groups were analyzed using both the conventional statistical approaches and the Artificial Neural Networks (ANNs) modelling. Ninety-four non aphasic FTD and 160 AD patients were recruited. ToL Accuracy Score (AS) significantly (p advanced ANNs developed by Semeion Institute. The best ANNs were selected and submitted to ROC curves. The non-linear model was able to discriminate FTD from AD with an average AUC for 7 independent trials of 0.82. The use of hidden information contained in the different items of ToL and the non linear processing of the data through ANNs allows a high discrimination between FTD and AD in individual patients.
Chen, Wen Li Kelly; Likhitpanichkul, Morakot; Ho, Anthony; Simmons, Craig A
2010-03-01
Cell-substrate interactions are multifaceted, involving the integration of various physical and biochemical signals. The interactions among these microenvironmental factors cannot be facilely elucidated and quantified by conventional experimentation, and necessitate multifactorial strategies. Here we describe an approach that integrates statistical design and analysis of experiments with automated microscopy to systematically investigate the combinatorial effects of substrate-derived stimuli (substrate stiffness and matrix protein concentration) on mesenchymal stem cell (MSC) spreading, proliferation and osteogenic differentiation. C3H10T1/2 cells were grown on type I collagen- or fibronectin-coated polyacrylamide hydrogels with tunable mechanical properties. Experimental conditions, which were defined according to central composite design, consisted of specific permutations of substrate stiffness (3-144 kPa) and adhesion protein concentration (7-520 microg/mL). Spreading area, BrdU incorporation and Runx2 nuclear translocation were quantified using high-content microscopy and modeled as mathematical functions of substrate stiffness and protein concentration. The resulting response surfaces revealed distinct patterns of protein-specific, substrate stiffness-dependent modulation of MSC proliferation and differentiation, demonstrating the advantage of statistical modeling in the detection and description of higher-order cellular responses. In a broader context, this approach can be adapted to study other types of cell-material interactions and can facilitate the efficient screening and optimization of substrate properties for applications involving cell-material interfaces. Copyright 2009 Elsevier Ltd. All rights reserved.
Bayesian approach to errors-in-variables in regression models
Rozliman, Nur Aainaa; Ibrahim, Adriana Irawati Nur; Yunus, Rossita Mohammad
2017-05-01
In many applications and experiments, data sets are often contaminated with error or mismeasured covariates. When at least one of the covariates in a model is measured with error, Errors-in-Variables (EIV) model can be used. Measurement error, when not corrected, would cause misleading statistical inferences and analysis. Therefore, our goal is to examine the relationship of the outcome variable and the unobserved exposure variable given the observed mismeasured surrogate by applying the Bayesian formulation to the EIV model. We shall extend the flexible parametric method proposed by Hossain and Gustafson (2009) to another nonlinear regression model which is the Poisson regression model. We shall then illustrate the application of this approach via a simulation study using Markov chain Monte Carlo sampling methods.
Statistical Hadronization and Holography
DEFF Research Database (Denmark)
Bechi, Jacopo
2009-01-01
In this paper we consider some issues about the statistical model of the hadronization in a holographic approach. We introduce a Rindler like horizon in the bulk and we understand the string breaking as a tunneling event under this horizon. We calculate the hadron spectrum and we get a thermal...
Alexandrov, Mikhail Dmitrievic; Geogdzhayev, Igor V.; Tsigaridis, Konstantinos; Marshak, Alexander; Levy, Robert; Cairns, Brian
2016-01-01
A novel model for the variability in aerosol optical thickness (AOT) is presented. This model is based on the consideration of AOT fields as realizations of a stochastic process, that is the exponent of an underlying Gaussian process with a specific autocorrelation function. In this approach AOT fields have lognormal PDFs and structure functions having the correct asymptotic behavior at large scales. The latter is an advantage compared with fractal (scale-invariant) approaches. The simple analytical form of the structure function in the proposed model facilitates its use for the parameterization of AOT statistics derived from remote sensing data. The new approach is illustrated using a month-long global MODIS AOT dataset (over ocean) with 10 km resolution. It was used to compute AOT statistics for sample cells forming a grid with 5deg spacing. The observed shapes of the structure functions indicated that in a large number of cases the AOT variability is split into two regimes that exhibit different patterns of behavior: small-scale stationary processes and trends reflecting variations at larger scales. The small-scale patterns are suggested to be generated by local aerosols within the marine boundary layer, while the large-scale trends are indicative of elevated aerosols transported from remote continental sources. This assumption is evaluated by comparison of the geographical distributions of these patterns derived from MODIS data with those obtained from the GISS GCM. This study shows considerable potential to enhance comparisons between remote sensing datasets and climate models beyond regional mean AOTs.
A quantum information approach to statistical mechanics
International Nuclear Information System (INIS)
Cuevas, G.
2011-01-01
The field of quantum information and computation harnesses and exploits the properties of quantum mechanics to perform tasks more efficiently than their classical counterparts, or that may uniquely be possible in the quantum world. Its findings and techniques have been applied to a number of fields, such as the study of entanglement in strongly correlated systems, new simulation techniques for many-body physics or, generally, to quantum optics. This thesis aims at broadening the scope of quantum information theory by applying it to problems in statistical mechanics. We focus on classical spin models, which are toy models used in a variety of systems, ranging from magnetism, neural networks, to quantum gravity. We tackle these models using quantum information tools from three different angles. First, we show how the partition function of a class of widely different classical spin models (models in different dimensions, different types of many-body interactions, different symmetries, etc) can be mapped to the partition function of a single model. We prove this by first establishing a relation between partition functions and quantum states, and then transforming the corresponding quantum states to each other. Second, we give efficient quantum algorithms to estimate the partition function of various classical spin models, such as the Ising or the Potts model. The proof is based on a relation between partition functions and quantum circuits, which allows us to determine the quantum computational complexity of the partition function by studying the corresponding quantum circuit. Finally, we outline the possibility of applying quantum information concepts and tools to certain models of dis- crete quantum gravity. The latter provide a natural route to generalize our results, insofar as the central quantity has the form of a partition function, and as classical spin models are used as toy models of matter. (author)
Statistical Modelling of the Soil Dielectric Constant
Usowicz, Boguslaw; Marczewski, Wojciech; Bogdan Usowicz, Jerzy; Lipiec, Jerzy
2010-05-01
The dielectric constant of soil is the physical property being very sensitive on water content. It funds several electrical measurement techniques for determining the water content by means of direct (TDR, FDR, and others related to effects of electrical conductance and/or capacitance) and indirect RS (Remote Sensing) methods. The work is devoted to a particular statistical manner of modelling the dielectric constant as the property accounting a wide range of specific soil composition, porosity, and mass density, within the unsaturated water content. Usually, similar models are determined for few particular soil types, and changing the soil type one needs switching the model on another type or to adjust it by parametrization of soil compounds. Therefore, it is difficult comparing and referring results between models. The presented model was developed for a generic representation of soil being a hypothetical mixture of spheres, each representing a soil fraction, in its proper phase state. The model generates a serial-parallel mesh of conductive and capacitive paths, which is analysed for a total conductive or capacitive property. The model was firstly developed to determine the thermal conductivity property, and now it is extended on the dielectric constant by analysing the capacitive mesh. The analysis is provided by statistical means obeying physical laws related to the serial-parallel branching of the representative electrical mesh. Physical relevance of the analysis is established electrically, but the definition of the electrical mesh is controlled statistically by parametrization of compound fractions, by determining the number of representative spheres per unitary volume per fraction, and by determining the number of fractions. That way the model is capable covering properties of nearly all possible soil types, all phase states within recognition of the Lorenz and Knudsen conditions. In effect the model allows on generating a hypothetical representative of
An Update on Statistical Boosting in Biomedicine.
Mayr, Andreas; Hofner, Benjamin; Waldmann, Elisabeth; Hepp, Tobias; Meyer, Sebastian; Gefeller, Olaf
2017-01-01
Statistical boosting algorithms have triggered a lot of research during the last decade. They combine a powerful machine learning approach with classical statistical modelling, offering various practical advantages like automated variable selection and implicit regularization of effect estimates. They are extremely flexible, as the underlying base-learners (regression functions defining the type of effect for the explanatory variables) can be combined with any kind of loss function (target function to be optimized, defining the type of regression setting). In this review article, we highlight the most recent methodological developments on statistical boosting regarding variable selection, functional regression, and advanced time-to-event modelling. Additionally, we provide a short overview on relevant applications of statistical boosting in biomedicine.
Qi, D.; Majda, A.
2017-12-01
A low-dimensional reduced-order statistical closure model is developed for quantifying the uncertainty in statistical sensitivity and intermittency in principal model directions with largest variability in high-dimensional turbulent system and turbulent transport models. Imperfect model sensitivity is improved through a recent mathematical strategy for calibrating model errors in a training phase, where information theory and linear statistical response theory are combined in a systematic fashion to achieve the optimal model performance. The idea in the reduced-order method is from a self-consistent mathematical framework for general systems with quadratic nonlinearity, where crucial high-order statistics are approximated by a systematic model calibration procedure. Model efficiency is improved through additional damping and noise corrections to replace the expensive energy-conserving nonlinear interactions. Model errors due to the imperfect nonlinear approximation are corrected by tuning the model parameters using linear response theory with an information metric in a training phase before prediction. A statistical energy principle is adopted to introduce a global scaling factor in characterizing the higher-order moments in a consistent way to improve model sensitivity. Stringent models of barotropic and baroclinic turbulence are used to display the feasibility of the reduced-order methods. Principal statistical responses in mean and variance can be captured by the reduced-order models with accuracy and efficiency. Besides, the reduced-order models are also used to capture crucial passive tracer field that is advected by the baroclinic turbulent flow. It is demonstrated that crucial principal statistical quantities like the tracer spectrum and fat-tails in the tracer probability density functions in the most important large scales can be captured efficiently with accuracy using the reduced-order tracer model in various dynamical regimes of the flow field with
On-the-fly confluence detection for statistical model checking (extended version)
Hartmanns, Arnd; Timmer, Mark
Statistical model checking is an analysis method that circumvents the state space explosion problem in model-based verification by combining probabilistic simulation with statistical methods that provide clear error bounds. As a simulation-based technique, it can only provide sound results if the
浅野, 美代子; マーコ, ユー K.W.
2007-01-01
This paper introduces the hybrid approach of neural networks and linear regression model proposed by Asano and Tsubaki (2003). Neural networks are often credited with its superiority in data consistency whereas the linear regression model provides simple interpretation of the data enabling researchers to verify their hypotheses. The hybrid approach aims at combing the strengths of these two well-established statistical methods. A step-by-step procedure for performing the hybrid approach is pr...
A Bayesian Approach to Model Selection in Hierarchical Mixtures-of-Experts Architectures.
Tanner, Martin A.; Peng, Fengchun; Jacobs, Robert A.
1997-03-01
There does not exist a statistical model that shows good performance on all tasks. Consequently, the model selection problem is unavoidable; investigators must decide which model is best at summarizing the data for each task of interest. This article presents an approach to the model selection problem in hierarchical mixtures-of-experts architectures. These architectures combine aspects of generalized linear models with those of finite mixture models in order to perform tasks via a recursive "divide-and-conquer" strategy. Markov chain Monte Carlo methodology is used to estimate the distribution of the architectures' parameters. One part of our approach to model selection attempts to estimate the worth of each component of an architecture so that relatively unused components can be pruned from the architecture's structure. A second part of this approach uses a Bayesian hypothesis testing procedure in order to differentiate inputs that carry useful information from nuisance inputs. Simulation results suggest that the approach presented here adheres to the dictum of Occam's razor; simple architectures that are adequate for summarizing the data are favored over more complex structures. Copyright 1997 Elsevier Science Ltd. All Rights Reserved.
Harrou, Fouzi; Sun, Ying; Taghezouit, Bilal; Saidi, Ahmed; Hamlati, Mohamed-Elkarim
2017-01-01
This study reports the development of an innovative fault detection and diagnosis scheme to monitor the direct current (DC) side of photovoltaic (PV) systems. Towards this end, we propose a statistical approach that exploits the advantages of one
Energy Technology Data Exchange (ETDEWEB)
Piepel, Gregory F.; Matzke, Brett D.; Sego, Landon H.; Amidan, Brett G.
2013-04-27
This report discusses the methodology, formulas, and inputs needed to make characterization and clearance decisions for Bacillus anthracis-contaminated and uncontaminated (or decontaminated) areas using a statistical sampling approach. Specifically, the report includes the methods and formulas for calculating the • number of samples required to achieve a specified confidence in characterization and clearance decisions • confidence in making characterization and clearance decisions for a specified number of samples for two common statistically based environmental sampling approaches. In particular, the report addresses an issue raised by the Government Accountability Office by providing methods and formulas to calculate the confidence that a decision area is uncontaminated (or successfully decontaminated) if all samples collected according to a statistical sampling approach have negative results. Key to addressing this topic is the probability that an individual sample result is a false negative, which is commonly referred to as the false negative rate (FNR). The two statistical sampling approaches currently discussed in this report are 1) hotspot sampling to detect small isolated contaminated locations during the characterization phase, and 2) combined judgment and random (CJR) sampling during the clearance phase. Typically if contamination is widely distributed in a decision area, it will be detectable via judgment sampling during the characterization phrase. Hotspot sampling is appropriate for characterization situations where contamination is not widely distributed and may not be detected by judgment sampling. CJR sampling is appropriate during the clearance phase when it is desired to augment judgment samples with statistical (random) samples. The hotspot and CJR statistical sampling approaches are discussed in the report for four situations: 1. qualitative data (detect and non-detect) when the FNR = 0 or when using statistical sampling methods that account
Topology for Statistical Modeling of Petascale Data
Energy Technology Data Exchange (ETDEWEB)
Bennett, Janine Camille [Sandia National Lab. (SNL-CA), Livermore, CA (United States); Pebay, Philippe Pierre [Sandia National Lab. (SNL-CA), Livermore, CA (United States); Pascucci, Valerio [Univ. of Utah, Salt Lake City, UT (United States); Levine, Joshua [Univ. of Utah, Salt Lake City, UT (United States); Gyulassy, Attila [Univ. of Utah, Salt Lake City, UT (United States); Rojas, Maurice [Texas A & M Univ., College Station, TX (United States)
2014-07-01
This document presents current technical progress and dissemination of results for the Mathematics for Analysis of Petascale Data (MAPD) project titled "Topology for Statistical Modeling of Petascale Data", funded by the Office of Science Advanced Scientific Computing Research (ASCR) Applied Math program.
An efficient statistical-based approach for road traffic congestion monitoring
Abdelhafid, Zeroual
2017-12-14
In this paper, we propose an effective approach which has to detect traffic congestion. The detection strategy is based on the combinational use of piecewise switched linear traffic (PWSL) model with exponentially-weighted moving average (EWMA) chart. PWSL model describes traffic flow dynamics. Then, PWSL residuals are used as the input of EWMA chart to detect traffic congestions. The evaluation results of the developed approach using data from a portion of the I210-W highway in Califorina showed the efficiency of the PWSL-EWMA approach in in detecting traffic congestions.
An efficient statistical-based approach for road traffic congestion monitoring
Abdelhafid, Zeroual; Harrou, Fouzi; Sun, Ying
2017-01-01
In this paper, we propose an effective approach which has to detect traffic congestion. The detection strategy is based on the combinational use of piecewise switched linear traffic (PWSL) model with exponentially-weighted moving average (EWMA) chart. PWSL model describes traffic flow dynamics. Then, PWSL residuals are used as the input of EWMA chart to detect traffic congestions. The evaluation results of the developed approach using data from a portion of the I210-W highway in Califorina showed the efficiency of the PWSL-EWMA approach in in detecting traffic congestions.
Zeroual, Abdelhafid; Harrou, Fouzi; Sun, Ying; Messai, Nadhir
2017-01-01
Monitoring vehicle traffic flow plays a central role in enhancing traffic management, transportation safety and cost savings. In this paper, we propose an innovative approach for detection of traffic congestion. Specifically, we combine the flexibility and simplicity of a piecewise switched linear (PWSL) macroscopic traffic model and the greater capacity of the exponentially-weighted moving average (EWMA) monitoring chart. Macroscopic models, which have few, easily calibrated parameters, are employed to describe a free traffic flow at the macroscopic level. Then, we apply the EWMA monitoring chart to the uncorrelated residuals obtained from the constructed PWSL model to detect congested situations. In this strategy, wavelet-based multiscale filtering of data has been used before the application of the EWMA scheme to improve further the robustness of this method to measurement noise and reduce the false alarms due to modeling errors. The performance of the PWSL-EWMA approach is successfully tested on traffic data from the three lane highway portion of the Interstate 210 (I-210) highway of the west of California and the four lane highway portion of the State Route 60 (SR60) highway from the east of California, provided by the Caltrans Performance Measurement System (PeMS). Results show the ability of the PWSL-EWMA approach to monitor vehicle traffic, confirming the promising application of this statistical tool to the supervision of traffic flow congestion.
Zeroual, Abdelhafid
2017-08-19
Monitoring vehicle traffic flow plays a central role in enhancing traffic management, transportation safety and cost savings. In this paper, we propose an innovative approach for detection of traffic congestion. Specifically, we combine the flexibility and simplicity of a piecewise switched linear (PWSL) macroscopic traffic model and the greater capacity of the exponentially-weighted moving average (EWMA) monitoring chart. Macroscopic models, which have few, easily calibrated parameters, are employed to describe a free traffic flow at the macroscopic level. Then, we apply the EWMA monitoring chart to the uncorrelated residuals obtained from the constructed PWSL model to detect congested situations. In this strategy, wavelet-based multiscale filtering of data has been used before the application of the EWMA scheme to improve further the robustness of this method to measurement noise and reduce the false alarms due to modeling errors. The performance of the PWSL-EWMA approach is successfully tested on traffic data from the three lane highway portion of the Interstate 210 (I-210) highway of the west of California and the four lane highway portion of the State Route 60 (SR60) highway from the east of California, provided by the Caltrans Performance Measurement System (PeMS). Results show the ability of the PWSL-EWMA approach to monitor vehicle traffic, confirming the promising application of this statistical tool to the supervision of traffic flow congestion.
Directory of Open Access Journals (Sweden)
Rochelle E. Tractenberg
2016-12-01
Full Text Available Statistical literacy is essential to an informed citizenry; and two emerging trends highlight a growing need for training that achieves this literacy. The first trend is towards “big” data: while automated analyses can exploit massive amounts of data, the interpretation—and possibly more importantly, the replication—of results are challenging without adequate statistical literacy. The second trend is that science and scientific publishing are struggling with insufficient/inappropriate statistical reasoning in writing, reviewing, and editing. This paper describes a model for statistical literacy (SL and its development that can support modern scientific practice. An established curriculum development and evaluation tool—the Mastery Rubric—is integrated with a new, developmental, model of statistical literacy that reflects the complexity of reasoning and habits of mind that scientists need to cultivate in order to recognize, choose, and interpret statistical methods. This developmental model provides actionable evidence, and explicit opportunities for consequential assessment that serves students, instructors, developers/reviewers/accreditors of a curriculum, and institutions. By supporting the enrichment, rather than increasing the amount, of statistical training in the basic and life sciences, this approach supports curriculum development, evaluation, and delivery to promote statistical literacy for students and a collective quantitative proficiency more broadly.
Analyzing sickness absence with statistical models for survival data
DEFF Research Database (Denmark)
Christensen, Karl Bang; Andersen, Per Kragh; Smith-Hansen, Lars
2007-01-01
OBJECTIVES: Sickness absence is the outcome in many epidemiologic studies and is often based on summary measures such as the number of sickness absences per year. In this study the use of modern statistical methods was examined by making better use of the available information. Since sickness...... absence data deal with events occurring over time, the use of statistical models for survival data has been reviewed, and the use of frailty models has been proposed for the analysis of such data. METHODS: Three methods for analyzing data on sickness absences were compared using a simulation study...... involving the following: (i) Poisson regression using a single outcome variable (number of sickness absences), (ii) analysis of time to first event using the Cox proportional hazards model, and (iii) frailty models, which are random effects proportional hazards models. Data from a study of the relation...
Strauch, R. L.; Istanbulluoglu, E.
2017-12-01
We develop a landslide hazard modeling approach that integrates a data-driven statistical model and a probabilistic process-based shallow landslide model for mapping probability of landslide initiation, transport, and deposition at regional scales. The empirical model integrates the influence of seven site attribute (SA) classes: elevation, slope, curvature, aspect, land use-land cover, lithology, and topographic wetness index, on over 1,600 observed landslides using a frequency ratio (FR) approach. A susceptibility index is calculated by adding FRs for each SA on a grid-cell basis. Using landslide observations we relate susceptibility index to an empirically-derived probability of landslide impact. This probability is combined with results from a physically-based model to produce an integrated probabilistic map. Slope was key in landslide initiation while deposition was linked to lithology and elevation. Vegetation transition from forest to alpine vegetation and barren land cover with lower root cohesion leads to higher frequency of initiation. Aspect effects are likely linked to differences in root cohesion and moisture controlled by solar insulation and snow. We demonstrate the model in the North Cascades of Washington, USA and identify locations of high and low probability of landslide impacts that can be used by land managers in their design, planning, and maintenance.
Physics-based statistical model and simulation method of RF propagation in urban environments
Pao, Hsueh-Yuan; Dvorak, Steven L.
2010-09-14
A physics-based statistical model and simulation/modeling method and system of electromagnetic wave propagation (wireless communication) in urban environments. In particular, the model is a computationally efficient close-formed parametric model of RF propagation in an urban environment which is extracted from a physics-based statistical wireless channel simulation method and system. The simulation divides the complex urban environment into a network of interconnected urban canyon waveguides which can be analyzed individually; calculates spectral coefficients of modal fields in the waveguides excited by the propagation using a database of statistical impedance boundary conditions which incorporates the complexity of building walls in the propagation model; determines statistical parameters of the calculated modal fields; and determines a parametric propagation model based on the statistical parameters of the calculated modal fields from which predictions of communications capability may be made.
Directory of Open Access Journals (Sweden)
Petrova Elena Аleksandrovna
2014-11-01
Full Text Available Formation of strategy of long-term social and economic development is a basis for effective functioning of executive authorities and the assessment of its efficiency in general. Modern theories of assessment of public administration productivity are guided by the process approach when it is expedient to carry out the formation of business processes of regional executive authorities according to strategic indicators of territorial development. In this regard, there is a problem of modeling of interrelation of indicators of social and economic development of the region and quantitative indices of results of business processes of executive authorities. At the first stage of modeling, two main directions of strategic development, namely innovative and investment activity of regional economic systems are considered. In this regard, the work presents the results of modeling the interrelation between the indicators of regional social and economic development and innovative and investment activity. Therefore, for carrying out the analysis, the social and economic system of the region is presented in space of the main indicators of social and economic development of the territory and indicators of innovative and investment activity. The analysis is made on values of the indicators calculated for regions of the Russian Federation during 2000, 2005, 2008, 2010 and 2011. It was revealed that strategic indicators of innovative and investment activity have the most significant impact on key signs of social and economic development.
A statistical approach for segregating cognitive task stages from multivariate fMRI BOLD time series
Directory of Open Access Journals (Sweden)
Charmaine eDemanuele
2015-10-01
Full Text Available Multivariate pattern analysis can reveal new information from neuroimaging data to illuminate human cognition and its disturbances. Here, we develop a methodological approach, based on multivariate statistical/machine learning and time series analysis, to discern cognitive processing stages from fMRI blood oxygenation level dependent (BOLD time series. We apply this method to data recorded from a group of healthy adults whilst performing a virtual reality version of the delayed win-shift radial arm maze task. This task has been frequently used to study working memory and decision making in rodents. Using linear classifiers and multivariate test statistics in conjunction with time series bootstraps, we show that different cognitive stages of the task, as defined by the experimenter, namely, the encoding/retrieval, choice, reward and delay stages, can be statistically discriminated from the BOLD time series in brain areas relevant for decision making and working memory. Discrimination of these task stages was significantly reduced during poor behavioral performance in dorsolateral prefrontal cortex (DLPFC, but not in the primary visual cortex (V1. Experimenter-defined dissection of time series into class labels based on task structure was confirmed by an unsupervised, bottom-up approach based on Hidden Markov Models. Furthermore, we show that different groupings of recorded time points into cognitive event classes can be used to test hypotheses about the specific cognitive role of a given brain region during task execution. We found that whilst the DLPFC strongly differentiated between task stages associated with different memory loads, but not between different visual-spatial aspects, the reverse was true for V1. Our methodology illustrates how different aspects of cognitive information processing during one and the same task can be separated and attributed to specific brain regions based on information contained in multivariate patterns of voxel
Statistical methods for ranking data
Alvo, Mayer
2014-01-01
This book introduces advanced undergraduate, graduate students and practitioners to statistical methods for ranking data. An important aspect of nonparametric statistics is oriented towards the use of ranking data. Rank correlation is defined through the notion of distance functions and the notion of compatibility is introduced to deal with incomplete data. Ranking data are also modeled using a variety of modern tools such as CART, MCMC, EM algorithm and factor analysis. This book deals with statistical methods used for analyzing such data and provides a novel and unifying approach for hypotheses testing. The techniques described in the book are illustrated with examples and the statistical software is provided on the authors’ website.
Castruccio, Stefano; Genton, Marc G.
2015-01-01
One of the main challenges when working with modern climate model ensembles is the increasingly larger size of the data produced, and the consequent difficulty in storing large amounts of spatio-temporally resolved information. Many compression algorithms can be used to mitigate this problem, but since they are designed to compress generic scientific data sets, they do not account for the nature of climate model output and they compress only individual simulations. In this work, we propose a different, statistics-based approach that explicitly accounts for the space-time dependence of the data for annual global three-dimensional temperature fields in an initial condition ensemble. The set of estimated parameters is small (compared to the data size) and can be regarded as a summary of the essential structure of the ensemble output; therefore, it can be used to instantaneously reproduce the temperature fields in an ensemble with a substantial saving in storage and time. The statistical model exploits the gridded geometry of the data and parallelization across processors. It is therefore computationally convenient and allows to fit a non-trivial model to a data set of one billion data points with a covariance matrix comprising of 10^18 entries.
Castruccio, Stefano
2015-04-02
One of the main challenges when working with modern climate model ensembles is the increasingly larger size of the data produced, and the consequent difficulty in storing large amounts of spatio-temporally resolved information. Many compression algorithms can be used to mitigate this problem, but since they are designed to compress generic scientific data sets, they do not account for the nature of climate model output and they compress only individual simulations. In this work, we propose a different, statistics-based approach that explicitly accounts for the space-time dependence of the data for annual global three-dimensional temperature fields in an initial condition ensemble. The set of estimated parameters is small (compared to the data size) and can be regarded as a summary of the essential structure of the ensemble output; therefore, it can be used to instantaneously reproduce the temperature fields in an ensemble with a substantial saving in storage and time. The statistical model exploits the gridded geometry of the data and parallelization across processors. It is therefore computationally convenient and allows to fit a non-trivial model to a data set of one billion data points with a covariance matrix comprising of 10^18 entries.
Encoding Dissimilarity Data for Statistical Model Building.
Wahba, Grace
2010-12-01
We summarize, review and comment upon three papers which discuss the use of discrete, noisy, incomplete, scattered pairwise dissimilarity data in statistical model building. Convex cone optimization codes are used to embed the objects into a Euclidean space which respects the dissimilarity information while controlling the dimension of the space. A "newbie" algorithm is provided for embedding new objects into this space. This allows the dissimilarity information to be incorporated into a Smoothing Spline ANOVA penalized likelihood model, a Support Vector Machine, or any model that will admit Reproducing Kernel Hilbert Space components, for nonparametric regression, supervised learning, or semi-supervised learning. Future work and open questions are discussed. The papers are: F. Lu, S. Keles, S. Wright and G. Wahba 2005. A framework for kernel regularization with application to protein clustering. Proceedings of the National Academy of Sciences 102, 12332-1233.G. Corrada Bravo, G. Wahba, K. Lee, B. Klein, R. Klein and S. Iyengar 2009. Examining the relative influence of familial, genetic and environmental covariate information in flexible risk models. Proceedings of the National Academy of Sciences 106, 8128-8133F. Lu, Y. Lin and G. Wahba. Robust manifold unfolding with kernel regularization. TR 1008, Department of Statistics, University of Wisconsin-Madison.
International Nuclear Information System (INIS)
Pumir, Alain; Naso, Aurore
2010-01-01
A proper description of the velocity gradient tensor is crucial for understanding the dynamics of turbulent flows, in particular the energy transfer from large to small scales. Insight into the statistical properties of the velocity gradient tensor and into its coarse-grained generalization can be obtained with the help of a stochastic 'tetrad model' that describes the coarse-grained velocity gradient tensor based on the evolution of four points. Although the solution of the stochastic model can be formally expressed in terms of path integrals, its numerical determination in terms of the Monte-Carlo method is very challenging, as very few configurations contribute effectively to the statistical weight. Here, we discuss a strategy that allows us to solve the tetrad model numerically. The algorithm is based on the importance sampling method, which consists here of identifying and sampling preferentially the configurations that are likely to correspond to a large statistical weight, and selectively rejecting configurations with a small statistical weight. The algorithm leads to an efficient numerical determination of the solutions of the model and allows us to determine their qualitative behavior as a function of scale. We find that the moments of order n≤4 of the solutions of the model scale with the coarse-graining scale and that the scaling exponents a