WorldWideScience

Sample records for linear statistical model

  1. Matrix Tricks for Linear Statistical Models

    CERN Document Server

    Puntanen, Simo; Styan, George PH

    2011-01-01

    In teaching linear statistical models to first-year graduate students or to final-year undergraduate students there is no way to proceed smoothly without matrices and related concepts of linear algebra; their use is really essential. Our experience is that making some particular matrix tricks very familiar to students can substantially increase their insight into linear statistical models (and also multivariate statistical analysis). In matrix algebra, there are handy, sometimes even very simple "tricks" which simplify and clarify the treatment of a problem - both for the student and

  2. Actuarial statistics with generalized linear mixed models

    NARCIS (Netherlands)

    Antonio, K.; Beirlant, J.

    2007-01-01

    Over the last decade the use of generalized linear models (GLMs) in actuarial statistics has received a lot of attention, starting from the actuarial illustrations in the standard text by McCullagh and Nelder [McCullagh, P., Nelder, J.A., 1989. Generalized linear models. In: Monographs on Statistics

  3. An R2 statistic for fixed effects in the linear mixed model.

    Science.gov (United States)

    Edwards, Lloyd J; Muller, Keith E; Wolfinger, Russell D; Qaqish, Bahjat F; Schabenberger, Oliver

    2008-12-20

    Statisticians most often use the linear mixed model to analyze Gaussian longitudinal data. The value and familiarity of the R(2) statistic in the linear univariate model naturally creates great interest in extending it to the linear mixed model. We define and describe how to compute a model R(2) statistic for the linear mixed model by using only a single model. The proposed R(2) statistic measures multivariate association between the repeated outcomes and the fixed effects in the linear mixed model. The R(2) statistic arises as a 1-1 function of an appropriate F statistic for testing all fixed effects (except typically the intercept) in a full model. The statistic compares the full model with a null model with all fixed effects deleted (except typically the intercept) while retaining exactly the same covariance structure. Furthermore, the R(2) statistic leads immediately to a natural definition of a partial R(2) statistic. A mixed model in which ethnicity gives a very small p-value as a longitudinal predictor of blood pressure (BP) compellingly illustrates the value of the statistic. In sharp contrast to the extreme p-value, a very small R(2) , a measure of statistical and scientific importance, indicates that ethnicity has an almost negligible association with the repeated BP outcomes for the study.

  4. Linear mixed models a practical guide using statistical software

    CERN Document Server

    West, Brady T; Galecki, Andrzej T

    2006-01-01

    Simplifying the often confusing array of software programs for fitting linear mixed models (LMMs), Linear Mixed Models: A Practical Guide Using Statistical Software provides a basic introduction to primary concepts, notation, software implementation, model interpretation, and visualization of clustered and longitudinal data. This easy-to-navigate reference details the use of procedures for fitting LMMs in five popular statistical software packages: SAS, SPSS, Stata, R/S-plus, and HLM. The authors introduce basic theoretical concepts, present a heuristic approach to fitting LMMs based on bo

  5. Non-linear scaling of a musculoskeletal model of the lower limb using statistical shape models.

    Science.gov (United States)

    Nolte, Daniel; Tsang, Chui Kit; Zhang, Kai Yu; Ding, Ziyun; Kedgley, Angela E; Bull, Anthony M J

    2016-10-03

    Accurate muscle geometry for musculoskeletal models is important to enable accurate subject-specific simulations. Commonly, linear scaling is used to obtain individualised muscle geometry. More advanced methods include non-linear scaling using segmented bone surfaces and manual or semi-automatic digitisation of muscle paths from medical images. In this study, a new scaling method combining non-linear scaling with reconstructions of bone surfaces using statistical shape modelling is presented. Statistical Shape Models (SSMs) of femur and tibia/fibula were used to reconstruct bone surfaces of nine subjects. Reference models were created by morphing manually digitised muscle paths to mean shapes of the SSMs using non-linear transformations and inter-subject variability was calculated. Subject-specific models of muscle attachment and via points were created from three reference models. The accuracy was evaluated by calculating the differences between the scaled and manually digitised models. The points defining the muscle paths showed large inter-subject variability at the thigh and shank - up to 26mm; this was found to limit the accuracy of all studied scaling methods. Errors for the subject-specific muscle point reconstructions of the thigh could be decreased by 9% to 20% by using the non-linear scaling compared to a typical linear scaling method. We conclude that the proposed non-linear scaling method is more accurate than linear scaling methods. Thus, when combined with the ability to reconstruct bone surfaces from incomplete or scattered geometry data using statistical shape models our proposed method is an alternative to linear scaling methods. Copyright © 2016 The Author. Published by Elsevier Ltd.. All rights reserved.

  6. Multivariate statistical modelling based on generalized linear models

    CERN Document Server

    Fahrmeir, Ludwig

    1994-01-01

    This book is concerned with the use of generalized linear models for univariate and multivariate regression analysis. Its emphasis is to provide a detailed introductory survey of the subject based on the analysis of real data drawn from a variety of subjects including the biological sciences, economics, and the social sciences. Where possible, technical details and proofs are deferred to an appendix in order to provide an accessible account for non-experts. Topics covered include: models for multi-categorical responses, model checking, time series and longitudinal data, random effects models, and state-space models. Throughout, the authors have taken great pains to discuss the underlying theoretical ideas in ways that relate well to the data at hand. As a result, numerous researchers whose work relies on the use of these models will find this an invaluable account to have on their desks. "The basic aim of the authors is to bring together and review a large part of recent advances in statistical modelling of m...

  7. Foundations of linear and generalized linear models

    CERN Document Server

    Agresti, Alan

    2015-01-01

    A valuable overview of the most important ideas and results in statistical analysis Written by a highly-experienced author, Foundations of Linear and Generalized Linear Models is a clear and comprehensive guide to the key concepts and results of linear statistical models. The book presents a broad, in-depth overview of the most commonly used statistical models by discussing the theory underlying the models, R software applications, and examples with crafted models to elucidate key ideas and promote practical model building. The book begins by illustrating the fundamentals of linear models,

  8. Linear mixed models a practical guide using statistical software

    CERN Document Server

    West, Brady T; Galecki, Andrzej T

    2014-01-01

    Highly recommended by JASA, Technometrics, and other journals, the first edition of this bestseller showed how to easily perform complex linear mixed model (LMM) analyses via a variety of software programs. Linear Mixed Models: A Practical Guide Using Statistical Software, Second Edition continues to lead readers step by step through the process of fitting LMMs. This second edition covers additional topics on the application of LMMs that are valuable for data analysts in all fields. It also updates the case studies using the latest versions of the software procedures and provides up-to-date information on the options and features of the software procedures available for fitting LMMs in SAS, SPSS, Stata, R/S-plus, and HLM.New to the Second Edition A new chapter on models with crossed random effects that uses a case study to illustrate software procedures capable of fitting these models Power analysis methods for longitudinal and clustered study designs, including software options for power analyses and suggest...

  9. OPLS statistical model versus linear regression to assess sonographic predictors of stroke prognosis.

    Science.gov (United States)

    Vajargah, Kianoush Fathi; Sadeghi-Bazargani, Homayoun; Mehdizadeh-Esfanjani, Robab; Savadi-Oskouei, Daryoush; Farhoudi, Mehdi

    2012-01-01

    The objective of the present study was to assess the comparable applicability of orthogonal projections to latent structures (OPLS) statistical model vs traditional linear regression in order to investigate the role of trans cranial doppler (TCD) sonography in predicting ischemic stroke prognosis. The study was conducted on 116 ischemic stroke patients admitted to a specialty neurology ward. The Unified Neurological Stroke Scale was used once for clinical evaluation on the first week of admission and again six months later. All data was primarily analyzed using simple linear regression and later considered for multivariate analysis using PLS/OPLS models through the SIMCA P+12 statistical software package. The linear regression analysis results used for the identification of TCD predictors of stroke prognosis were confirmed through the OPLS modeling technique. Moreover, in comparison to linear regression, the OPLS model appeared to have higher sensitivity in detecting the predictors of ischemic stroke prognosis and detected several more predictors. Applying the OPLS model made it possible to use both single TCD measures/indicators and arbitrarily dichotomized measures of TCD single vessel involvement as well as the overall TCD result. In conclusion, the authors recommend PLS/OPLS methods as complementary rather than alternative to the available classical regression models such as linear regression.

  10. Statistical Tests for Mixed Linear Models

    CERN Document Server

    Khuri, André I; Sinha, Bimal K

    2011-01-01

    An advanced discussion of linear models with mixed or random effects. In recent years a breakthrough has occurred in our ability to draw inferences from exact and optimum tests of variance component models, generating much research activity that relies on linear models with mixed and random effects. This volume covers the most important research of the past decade as well as the latest developments in hypothesis testing. It compiles all currently available results in the area of exact and optimum tests for variance component models and offers the only comprehensive treatment for these models a

  11. Hadronic equation of state in the statistical bootstrap model and linear graph theory

    International Nuclear Information System (INIS)

    Fre, P.; Page, R.

    1976-01-01

    Taking a statistical mechanical point og view, the statistical bootstrap model is discussed and, from a critical analysis of the bootstrap volume comcept, it is reached a physical ipothesis, which leads immediately to the hadronic equation of state provided by the bootstrap integral equation. In this context also the connection between the statistical bootstrap and the linear graph theory approach to interacting gases is analyzed

  12. International Conference on Trends and Perspectives in Linear Statistical Inference

    CERN Document Server

    Rosen, Dietrich

    2018-01-01

    This volume features selected contributions on a variety of topics related to linear statistical inference. The peer-reviewed papers from the International Conference on Trends and Perspectives in Linear Statistical Inference (LinStat 2016) held in Istanbul, Turkey, 22-25 August 2016, cover topics in both theoretical and applied statistics, such as linear models, high-dimensional statistics, computational statistics, the design of experiments, and multivariate analysis. The book is intended for statisticians, Ph.D. students, and professionals who are interested in statistical inference. .

  13. Role of Statistical Random-Effects Linear Models in Personalized Medicine.

    Science.gov (United States)

    Diaz, Francisco J; Yeh, Hung-Wen; de Leon, Jose

    2012-03-01

    Some empirical studies and recent developments in pharmacokinetic theory suggest that statistical random-effects linear models are valuable tools that allow describing simultaneously patient populations as a whole and patients as individuals. This remarkable characteristic indicates that these models may be useful in the development of personalized medicine, which aims at finding treatment regimes that are appropriate for particular patients, not just appropriate for the average patient. In fact, published developments show that random-effects linear models may provide a solid theoretical framework for drug dosage individualization in chronic diseases. In particular, individualized dosages computed with these models by means of an empirical Bayesian approach may produce better results than dosages computed with some methods routinely used in therapeutic drug monitoring. This is further supported by published empirical and theoretical findings that show that random effects linear models may provide accurate representations of phase III and IV steady-state pharmacokinetic data, and may be useful for dosage computations. These models have applications in the design of clinical algorithms for drug dosage individualization in chronic diseases; in the computation of dose correction factors; computation of the minimum number of blood samples from a patient that are necessary for calculating an optimal individualized drug dosage in therapeutic drug monitoring; measure of the clinical importance of clinical, demographic, environmental or genetic covariates; study of drug-drug interactions in clinical settings; the implementation of computational tools for web-site-based evidence farming; design of pharmacogenomic studies; and in the development of a pharmacological theory of dosage individualization.

  14. Linear mixed-effects models for central statistical monitoring of multicenter clinical trials

    OpenAIRE

    Desmet, L.; Venet, D.; Doffagne, E.; Timmermans, C.; BURZYKOWSKI, Tomasz; LEGRAND, Catherine; BUYSE, Marc

    2014-01-01

    Multicenter studies are widely used to meet accrual targets in clinical trials. Clinical data monitoring is required to ensure the quality and validity of the data gathered across centers. One approach to this end is central statistical monitoring, which aims at detecting atypical patterns in the data by means of statistical methods. In this context, we consider the simple case of a continuous variable, and we propose a detection procedure based on a linear mixed-effects model to detect locat...

  15. Introduction to generalized linear models

    CERN Document Server

    Dobson, Annette J

    2008-01-01

    Introduction Background Scope Notation Distributions Related to the Normal Distribution Quadratic Forms Estimation Model Fitting Introduction Examples Some Principles of Statistical Modeling Notation and Coding for Explanatory Variables Exponential Family and Generalized Linear Models Introduction Exponential Family of Distributions Properties of Distributions in the Exponential Family Generalized Linear Models Examples Estimation Introduction Example: Failure Times for Pressure Vessels Maximum Likelihood Estimation Poisson Regression Example Inference Introduction Sampling Distribution for Score Statistics Taylor Series Approximations Sampling Distribution for MLEs Log-Likelihood Ratio Statistic Sampling Distribution for the Deviance Hypothesis Testing Normal Linear Models Introduction Basic Results Multiple Linear Regression Analysis of Variance Analysis of Covariance General Linear Models Binary Variables and Logistic Regression Probability Distributions ...

  16. Linear models with R

    CERN Document Server

    Faraway, Julian J

    2014-01-01

    A Hands-On Way to Learning Data AnalysisPart of the core of statistics, linear models are used to make predictions and explain the relationship between the response and the predictors. Understanding linear models is crucial to a broader competence in the practice of statistics. Linear Models with R, Second Edition explains how to use linear models in physical science, engineering, social science, and business applications. The book incorporates several improvements that reflect how the world of R has greatly expanded since the publication of the first edition.New to the Second EditionReorganiz

  17. Extending the linear model with R generalized linear, mixed effects and nonparametric regression models

    CERN Document Server

    Faraway, Julian J

    2005-01-01

    Linear models are central to the practice of statistics and form the foundation of a vast range of statistical methodologies. Julian J. Faraway''s critically acclaimed Linear Models with R examined regression and analysis of variance, demonstrated the different methods available, and showed in which situations each one applies. Following in those footsteps, Extending the Linear Model with R surveys the techniques that grow from the regression model, presenting three extensions to that framework: generalized linear models (GLMs), mixed effect models, and nonparametric regression models. The author''s treatment is thoroughly modern and covers topics that include GLM diagnostics, generalized linear mixed models, trees, and even the use of neural networks in statistics. To demonstrate the interplay of theory and practice, throughout the book the author weaves the use of the R software environment to analyze the data of real examples, providing all of the R commands necessary to reproduce the analyses. All of the ...

  18. Advanced statistics: linear regression, part II: multiple linear regression.

    Science.gov (United States)

    Marill, Keith A

    2004-01-01

    The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.

  19. A comparison of linear and nonlinear statistical techniques in performance attribution.

    Science.gov (United States)

    Chan, N H; Genovese, C R

    2001-01-01

    Performance attribution is usually conducted under the linear framework of multifactor models. Although commonly used by practitioners in finance, linear multifactor models are known to be less than satisfactory in many situations. After a brief survey of nonlinear methods, nonlinear statistical techniques are applied to performance attribution of a portfolio constructed from a fixed universe of stocks using factors derived from some commonly used cross sectional linear multifactor models. By rebalancing this portfolio monthly, the cumulative returns for procedures based on standard linear multifactor model and three nonlinear techniques-model selection, additive models, and neural networks-are calculated and compared. It is found that the first two nonlinear techniques, especially in combination, outperform the standard linear model. The results in the neural-network case are inconclusive because of the great variety of possible models. Although these methods are more complicated and may require some tuning, toolboxes are developed and suggestions on calibration are proposed. This paper demonstrates the usefulness of modern nonlinear statistical techniques in performance attribution.

  20. Normality of raw data in general linear models: The most widespread myth in statistics

    Science.gov (United States)

    Kery, Marc; Hatfield, Jeff S.

    2003-01-01

    In years of statistical consulting for ecologists and wildlife biologists, by far the most common misconception we have come across has been the one about normality in general linear models. These comprise a very large part of the statistical models used in ecology and include t tests, simple and multiple linear regression, polynomial regression, and analysis of variance (ANOVA) and covariance (ANCOVA). There is a widely held belief that the normality assumption pertains to the raw data rather than to the model residuals. We suspect that this error may also occur in countless published studies, whenever the normality assumption is tested prior to analysis. This may lead to the use of nonparametric alternatives (if there are any), when parametric tests would indeed be appropriate, or to use of transformations of raw data, which may introduce hidden assumptions such as multiplicative effects on the natural scale in the case of log-transformed data. Our aim here is to dispel this myth. We very briefly describe relevant theory for two cases of general linear models to show that the residuals need to be normally distributed if tests requiring normality are to be used, such as t and F tests. We then give two examples demonstrating that the distribution of the response variable may be nonnormal, and yet the residuals are well behaved. We do not go into the issue of how to test normality; instead we display the distributions of response variables and residuals graphically.

  1. Linear Models

    CERN Document Server

    Searle, Shayle R

    2012-01-01

    This 1971 classic on linear models is once again available--as a Wiley Classics Library Edition. It features material that can be understood by any statistician who understands matrix algebra and basic statistical methods.

  2. Generalized, Linear, and Mixed Models

    CERN Document Server

    McCulloch, Charles E; Neuhaus, John M

    2011-01-01

    An accessible and self-contained introduction to statistical models-now in a modernized new editionGeneralized, Linear, and Mixed Models, Second Edition provides an up-to-date treatment of the essential techniques for developing and applying a wide variety of statistical models. The book presents thorough and unified coverage of the theory behind generalized, linear, and mixed models and highlights their similarities and differences in various construction, application, and computational aspects.A clear introduction to the basic ideas of fixed effects models, random effects models, and mixed m

  3. MIDAS: Regionally linear multivariate discriminative statistical mapping.

    Science.gov (United States)

    Varol, Erdem; Sotiras, Aristeidis; Davatzikos, Christos

    2018-07-01

    Statistical parametric maps formed via voxel-wise mass-univariate tests, such as the general linear model, are commonly used to test hypotheses about regionally specific effects in neuroimaging cross-sectional studies where each subject is represented by a single image. Despite being informative, these techniques remain limited as they ignore multivariate relationships in the data. Most importantly, the commonly employed local Gaussian smoothing, which is important for accounting for registration errors and making the data follow Gaussian distributions, is usually chosen in an ad hoc fashion. Thus, it is often suboptimal for the task of detecting group differences and correlations with non-imaging variables. Information mapping techniques, such as searchlight, which use pattern classifiers to exploit multivariate information and obtain more powerful statistical maps, have become increasingly popular in recent years. However, existing methods may lead to important interpretation errors in practice (i.e., misidentifying a cluster as informative, or failing to detect truly informative voxels), while often being computationally expensive. To address these issues, we introduce a novel efficient multivariate statistical framework for cross-sectional studies, termed MIDAS, seeking highly sensitive and specific voxel-wise brain maps, while leveraging the power of regional discriminant analysis. In MIDAS, locally linear discriminative learning is applied to estimate the pattern that best discriminates between two groups, or predicts a variable of interest. This pattern is equivalent to local filtering by an optimal kernel whose coefficients are the weights of the linear discriminant. By composing information from all neighborhoods that contain a given voxel, MIDAS produces a statistic that collectively reflects the contribution of the voxel to the regional classifiers as well as the discriminative power of the classifiers. Critically, MIDAS efficiently assesses the

  4. Application of a Statistical Linear Time-Varying System Model of High Grazing Angle Sea Clutter for Computing Interference Power

    Science.gov (United States)

    2017-12-08

    STATISTICAL LINEAR TIME-VARYING SYSTEM MODEL OF HIGH GRAZING ANGLE SEA CLUTTER FOR COMPUTING INTERFERENCE POWER 1. INTRODUCTION Statistical linear time...beam. We can approximate one of the sinc factors using the Dirichlet kernel to facilitate computation of the integral in (6) as follows: ∣∣∣∣sinc(WB...plotted in Figure 4. The resultant autocorrelation can then be found by substituting (18) into (28). The Python code used to generate Figures 1-4 is found

  5. Introduction to statistical modelling 2: categorical variables and interactions in linear regression.

    Science.gov (United States)

    Lunt, Mark

    2015-07-01

    In the first article in this series we explored the use of linear regression to predict an outcome variable from a number of predictive factors. It assumed that the predictive factors were measured on an interval scale. However, this article shows how categorical variables can also be included in a linear regression model, enabling predictions to be made separately for different groups and allowing for testing the hypothesis that the outcome differs between groups. The use of interaction terms to measure whether the effect of a particular predictor variable differs between groups is also explained. An alternative approach to testing the difference between groups of the effect of a given predictor, which consists of measuring the effect in each group separately and seeing whether the statistical significance differs between the groups, is shown to be misleading. © The Author 2013. Published by Oxford University Press on behalf of the British Society for Rheumatology. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  6. Summary goodness-of-fit statistics for binary generalized linear models with noncanonical link functions.

    Science.gov (United States)

    Canary, Jana D; Blizzard, Leigh; Barry, Ronald P; Hosmer, David W; Quinn, Stephen J

    2016-05-01

    Generalized linear models (GLM) with a canonical logit link function are the primary modeling technique used to relate a binary outcome to predictor variables. However, noncanonical links can offer more flexibility, producing convenient analytical quantities (e.g., probit GLMs in toxicology) and desired measures of effect (e.g., relative risk from log GLMs). Many summary goodness-of-fit (GOF) statistics exist for logistic GLM. Their properties make the development of GOF statistics relatively straightforward, but it can be more difficult under noncanonical links. Although GOF tests for logistic GLM with continuous covariates (GLMCC) have been applied to GLMCCs with log links, we know of no GOF tests in the literature specifically developed for GLMCCs that can be applied regardless of link function chosen. We generalize the Tsiatis GOF statistic originally developed for logistic GLMCCs, (TG), so that it can be applied under any link function. Further, we show that the algebraically related Hosmer-Lemeshow (HL) and Pigeon-Heyse (J(2) ) statistics can be applied directly. In a simulation study, TG, HL, and J(2) were used to evaluate the fit of probit, log-log, complementary log-log, and log models, all calculated with a common grouping method. The TG statistic consistently maintained Type I error rates, while those of HL and J(2) were often lower than expected if terms with little influence were included. Generally, the statistics had similar power to detect an incorrect model. An exception occurred when a log GLMCC was incorrectly fit to data generated from a logistic GLMCC. In this case, TG had more power than HL or J(2) . © 2015 John Wiley & Sons Ltd/London School of Economics.

  7. Diffeomorphic Statistical Deformation Models

    DEFF Research Database (Denmark)

    Hansen, Michael Sass; Hansen, Mads/Fogtman; Larsen, Rasmus

    2007-01-01

    In this paper we present a new method for constructing diffeomorphic statistical deformation models in arbitrary dimensional images with a nonlinear generative model and a linear parameter space. Our deformation model is a modified version of the diffeomorphic model introduced by Cootes et al....... The modifications ensure that no boundary restriction has to be enforced on the parameter space to prevent folds or tears in the deformation field. For straightforward statistical analysis, principal component analysis and sparse methods, we assume that the parameters for a class of deformations lie on a linear...... with ground truth in form of manual expert annotations, and compared to Cootes's model. We anticipate applications in unconstrained diffeomorphic synthesis of images, e.g. for tracking, segmentation, registration or classification purposes....

  8. Multivariate mixed linear model analysis of longitudinal data: an information-rich statistical technique for analyzing disease resistance data

    Science.gov (United States)

    The mixed linear model (MLM) is currently among the most advanced and flexible statistical modeling techniques and its use in tackling problems in plant pathology has begun surfacing in the literature. The longitudinal MLM is a multivariate extension that handles repeatedly measured data, such as r...

  9. Linear regression models and k-means clustering for statistical analysis of fNIRS data.

    Science.gov (United States)

    Bonomini, Viola; Zucchelli, Lucia; Re, Rebecca; Ieva, Francesca; Spinelli, Lorenzo; Contini, Davide; Paganoni, Anna; Torricelli, Alessandro

    2015-02-01

    We propose a new algorithm, based on a linear regression model, to statistically estimate the hemodynamic activations in fNIRS data sets. The main concern guiding the algorithm development was the minimization of assumptions and approximations made on the data set for the application of statistical tests. Further, we propose a K-means method to cluster fNIRS data (i.e. channels) as activated or not activated. The methods were validated both on simulated and in vivo fNIRS data. A time domain (TD) fNIRS technique was preferred because of its high performances in discriminating cortical activation and superficial physiological changes. However, the proposed method is also applicable to continuous wave or frequency domain fNIRS data sets.

  10. Response statistics of rotating shaft with non-linear elastic restoring forces by path integration

    Science.gov (United States)

    Gaidai, Oleg; Naess, Arvid; Dimentberg, Michael

    2017-07-01

    Extreme statistics of random vibrations is studied for a Jeffcott rotor under uniaxial white noise excitation. Restoring force is modelled as elastic non-linear; comparison is done with linearized restoring force to see the force non-linearity effect on the response statistics. While for the linear model analytical solutions and stability conditions are available, it is not generally the case for non-linear system except for some special cases. The statistics of non-linear case is studied by applying path integration (PI) method, which is based on the Markov property of the coupled dynamic system. The Jeffcott rotor response statistics can be obtained by solving the Fokker-Planck (FP) equation of the 4D dynamic system. An efficient implementation of PI algorithm is applied, namely fast Fourier transform (FFT) is used to simulate dynamic system additive noise. The latter allows significantly reduce computational time, compared to the classical PI. Excitation is modelled as Gaussian white noise, however any kind distributed white noise can be implemented with the same PI technique. Also multidirectional Markov noise can be modelled with PI in the same way as unidirectional. PI is accelerated by using Monte Carlo (MC) estimated joint probability density function (PDF) as initial input. Symmetry of dynamic system was utilized to afford higher mesh resolution. Both internal (rotating) and external damping are included in mechanical model of the rotor. The main advantage of using PI rather than MC is that PI offers high accuracy in the probability distribution tail. The latter is of critical importance for e.g. extreme value statistics, system reliability, and first passage probability.

  11. Statistical monitoring of linear antenna arrays

    KAUST Repository

    Harrou, Fouzi

    2016-11-03

    The paper concerns the problem of monitoring linear antenna arrays using the generalized likelihood ratio (GLR) test. When an abnormal event (fault) affects an array of antenna elements, the radiation pattern changes and significant deviation from the desired design performance specifications can resulted. In this paper, the detection of faults is addressed from a statistical point of view as a fault detection problem. Specifically, a statistical method rested on the GLR principle is used to detect potential faults in linear arrays. To assess the strength of the GLR-based monitoring scheme, three case studies involving different types of faults were performed. Simulation results clearly shown the effectiveness of the GLR-based fault-detection method to monitor the performance of linear antenna arrays.

  12. Introducing linear functions: an alternative statistical approach

    Science.gov (United States)

    Nolan, Caroline; Herbert, Sandra

    2015-12-01

    The introduction of linear functions is the turning point where many students decide if mathematics is useful or not. This means the role of parameters and variables in linear functions could be considered to be `threshold concepts'. There is recognition that linear functions can be taught in context through the exploration of linear modelling examples, but this has its limitations. Currently, statistical data is easily attainable, and graphics or computer algebra system (CAS) calculators are common in many classrooms. The use of this technology provides ease of access to different representations of linear functions as well as the ability to fit a least-squares line for real-life data. This means these calculators could support a possible alternative approach to the introduction of linear functions. This study compares the results of an end-of-topic test for two classes of Australian middle secondary students at a regional school to determine if such an alternative approach is feasible. In this study, test questions were grouped by concept and subjected to concept by concept analysis of the means of test results of the two classes. This analysis revealed that the students following the alternative approach demonstrated greater competence with non-standard questions.

  13. Minimal agent based model for financial markets II. Statistical properties of the linear and multiplicative dynamics

    Science.gov (United States)

    Alfi, V.; Cristelli, M.; Pietronero, L.; Zaccaria, A.

    2009-02-01

    We present a detailed study of the statistical properties of the Agent Based Model introduced in paper I [Eur. Phys. J. B, DOI: 10.1140/epjb/e2009-00028-4] and of its generalization to the multiplicative dynamics. The aim of the model is to consider the minimal elements for the understanding of the origin of the stylized facts and their self-organization. The key elements are fundamentalist agents, chartist agents, herding dynamics and price behavior. The first two elements correspond to the competition between stability and instability tendencies in the market. The herding behavior governs the possibility of the agents to change strategy and it is a crucial element of this class of models. We consider a linear approximation for the price dynamics which permits a simple interpretation of the model dynamics and, for many properties, it is possible to derive analytical results. The generalized non linear dynamics results to be extremely more sensible to the parameter space and much more difficult to analyze and control. The main results for the nature and self-organization of the stylized facts are, however, very similar in the two cases. The main peculiarity of the non linear dynamics is an enhancement of the fluctuations and a more marked evidence of the stylized facts. We will also discuss some modifications of the model to introduce more realistic elements with respect to the real markets.

  14. Testing Parametric versus Semiparametric Modelling in Generalized Linear Models

    NARCIS (Netherlands)

    Härdle, W.K.; Mammen, E.; Müller, M.D.

    1996-01-01

    We consider a generalized partially linear model E(Y|X,T) = G{X'b + m(T)} where G is a known function, b is an unknown parameter vector, and m is an unknown function.The paper introduces a test statistic which allows to decide between a parametric and a semiparametric model: (i) m is linear, i.e.

  15. Correlations and Non-Linear Probability Models

    DEFF Research Database (Denmark)

    Breen, Richard; Holm, Anders; Karlson, Kristian Bernt

    2014-01-01

    the dependent variable of the latent variable model and its predictor variables. We show how this correlation can be derived from the parameters of non-linear probability models, develop tests for the statistical significance of the derived correlation, and illustrate its usefulness in two applications. Under......Although the parameters of logit and probit and other non-linear probability models are often explained and interpreted in relation to the regression coefficients of an underlying linear latent variable model, we argue that they may also be usefully interpreted in terms of the correlations between...... certain circumstances, which we explain, the derived correlation provides a way of overcoming the problems inherent in cross-sample comparisons of the parameters of non-linear probability models....

  16. Development of statistical linear regression model for metals from transportation land uses.

    Science.gov (United States)

    Maniquiz, Marla C; Lee, Soyoung; Lee, Eunju; Kim, Lee-Hyung

    2009-01-01

    The transportation landuses possessing impervious surfaces such as highways, parking lots, roads, and bridges were recognized as the highly polluted non-point sources (NPSs) in the urban areas. Lots of pollutants from urban transportation are accumulating on the paved surfaces during dry periods and are washed-off during a storm. In Korea, the identification and monitoring of NPSs still represent a great challenge. Since 2004, the Ministry of Environment (MOE) has been engaged in several researches and monitoring to develop stormwater management policies and treatment systems for future implementation. The data over 131 storm events during May 2004 to September 2008 at eleven sites were analyzed to identify correlation relationships between particulates and metals, and to develop simple linear regression (SLR) model to estimate event mean concentration (EMC). Results indicate that there was no significant relationship between metals and TSS EMC. However, the SLR estimation models although not providing useful results are valuable indicators of high uncertainties that NPS pollution possess. Therefore, long term monitoring employing proper methods and precise statistical analysis of the data should be undertaken to eliminate these uncertainties.

  17. Composite Linear Models | Division of Cancer Prevention

    Science.gov (United States)

    By Stuart G. Baker The composite linear models software is a matrix approach to compute maximum likelihood estimates and asymptotic standard errors for models for incomplete multinomial data. It implements the method described in Baker SG. Composite linear models for incomplete multinomial data. Statistics in Medicine 1994;13:609-622. The software includes a library of thirty

  18. Statistical mechanical analysis of the linear vector channel in digital communication

    International Nuclear Information System (INIS)

    Takeda, Koujin; Hatabu, Atsushi; Kabashima, Yoshiyuki

    2007-01-01

    A statistical mechanical framework to analyze linear vector channel models in digital wireless communication is proposed for a large system. The framework is a generalization of that proposed for code-division multiple-access systems in Takeda et al (2006 Europhys. Lett. 76 1193) and enables the analysis of the system in which the elements of the channel transfer matrix are statistically correlated with each other. The significance of the proposed scheme is demonstrated by assessing the performance of an existing model of multi-input multi-output communication systems

  19. Bayes linear statistics, theory & methods

    CERN Document Server

    Goldstein, Michael

    2007-01-01

    Bayesian methods combine information available from data with any prior information available from expert knowledge. The Bayes linear approach follows this path, offering a quantitative structure for expressing beliefs, and systematic methods for adjusting these beliefs, given observational data. The methodology differs from the full Bayesian methodology in that it establishes simpler approaches to belief specification and analysis based around expectation judgements. Bayes Linear Statistics presents an authoritative account of this approach, explaining the foundations, theory, methodology, and practicalities of this important field. The text provides a thorough coverage of Bayes linear analysis, from the development of the basic language to the collection of algebraic results needed for efficient implementation, with detailed practical examples. The book covers:The importance of partial prior specifications for complex problems where it is difficult to supply a meaningful full prior probability specification...

  20. Commentary on the statistical properties of noise and its implication on general linear models in functional near-infrared spectroscopy.

    Science.gov (United States)

    Huppert, Theodore J

    2016-01-01

    Functional near-infrared spectroscopy (fNIRS) is a noninvasive neuroimaging technique that uses low levels of light to measure changes in cerebral blood oxygenation levels. In the majority of NIRS functional brain studies, analysis of this data is based on a statistical comparison of hemodynamic levels between a baseline and task or between multiple task conditions by means of a linear regression model: the so-called general linear model. Although these methods are similar to their implementation in other fields, particularly for functional magnetic resonance imaging, the specific application of these methods in fNIRS research differs in several key ways related to the sources of noise and artifacts unique to fNIRS. In this brief communication, we discuss the application of linear regression models in fNIRS and the modifications needed to generalize these models in order to deal with structured (colored) noise due to systemic physiology and noise heteroscedasticity due to motion artifacts. The objective of this work is to present an overview of these noise properties in the context of the linear model as it applies to fNIRS data. This work is aimed at explaining these mathematical issues to the general fNIRS experimental researcher but is not intended to be a complete mathematical treatment of these concepts.

  1. Advanced statistics: linear regression, part I: simple linear regression.

    Science.gov (United States)

    Marill, Keith A

    2004-01-01

    Simple linear regression is a mathematical technique used to model the relationship between a single independent predictor variable and a single dependent outcome variable. In this, the first of a two-part series exploring concepts in linear regression analysis, the four fundamental assumptions and the mechanics of simple linear regression are reviewed. The most common technique used to derive the regression line, the method of least squares, is described. The reader will be acquainted with other important concepts in simple linear regression, including: variable transformations, dummy variables, relationship to inference testing, and leverage. Simplified clinical examples with small datasets and graphic models are used to illustrate the points. This will provide a foundation for the second article in this series: a discussion of multiple linear regression, in which there are multiple predictor variables.

  2. Identification of Influential Points in a Linear Regression Model

    Directory of Open Access Journals (Sweden)

    Jan Grosz

    2011-03-01

    Full Text Available The article deals with the detection and identification of influential points in the linear regression model. Three methods of detection of outliers and leverage points are described. These procedures can also be used for one-sample (independentdatasets. This paper briefly describes theoretical aspects of several robust methods as well. Robust statistics is a powerful tool to increase the reliability and accuracy of statistical modelling and data analysis. A simulation model of the simple linear regression is presented.

  3. linear-quadratic-linear model

    Directory of Open Access Journals (Sweden)

    Tanwiwat Jaikuna

    2017-02-01

    Full Text Available Purpose: To develop an in-house software program that is able to calculate and generate the biological dose distribution and biological dose volume histogram by physical dose conversion using the linear-quadratic-linear (LQL model. Material and methods : The Isobio software was developed using MATLAB version 2014b to calculate and generate the biological dose distribution and biological dose volume histograms. The physical dose from each voxel in treatment planning was extracted through Computational Environment for Radiotherapy Research (CERR, and the accuracy was verified by the differentiation between the dose volume histogram from CERR and the treatment planning system. An equivalent dose in 2 Gy fraction (EQD2 was calculated using biological effective dose (BED based on the LQL model. The software calculation and the manual calculation were compared for EQD2 verification with pair t-test statistical analysis using IBM SPSS Statistics version 22 (64-bit. Results: Two and three-dimensional biological dose distribution and biological dose volume histogram were displayed correctly by the Isobio software. Different physical doses were found between CERR and treatment planning system (TPS in Oncentra, with 3.33% in high-risk clinical target volume (HR-CTV determined by D90%, 0.56% in the bladder, 1.74% in the rectum when determined by D2cc, and less than 1% in Pinnacle. The difference in the EQD2 between the software calculation and the manual calculation was not significantly different with 0.00% at p-values 0.820, 0.095, and 0.593 for external beam radiation therapy (EBRT and 0.240, 0.320, and 0.849 for brachytherapy (BT in HR-CTV, bladder, and rectum, respectively. Conclusions : The Isobio software is a feasible tool to generate the biological dose distribution and biological dose volume histogram for treatment plan evaluation in both EBRT and BT.

  4. Non-linearity consideration when analyzing reactor noise statistical characteristics. [BWR

    Energy Technology Data Exchange (ETDEWEB)

    Kebadze, B V; Adamovski, L A

    1975-06-01

    Statistical characteristics of boiling water reactor noise in the vicinity of stability threshold are studied. The reactor is considered as a non-linear system affected by random perturbations. To solve a non-linear problem the principle of statistical linearization is used. It is shown that the halfwidth of resonance peak in neutron power noise spectrum density as well as the reciprocal of noise dispersion, which are used in predicting a stable operation theshold, are different from zero both within and beyond the stability boundary the determination of which was based on linear criteria.

  5. Matrix algebra for linear models

    CERN Document Server

    Gruber, Marvin H J

    2013-01-01

    Matrix methods have evolved from a tool for expressing statistical problems to an indispensable part of the development, understanding, and use of various types of complex statistical analyses. This evolution has made matrix methods a vital part of statistical education. Traditionally, matrix methods are taught in courses on everything from regression analysis to stochastic processes, thus creating a fractured view of the topic. Matrix Algebra for Linear Models offers readers a unique, unified view of matrix analysis theory (where and when necessary), methods, and their applications. Written f

  6. Right-sizing statistical models for longitudinal data.

    Science.gov (United States)

    Wood, Phillip K; Steinley, Douglas; Jackson, Kristina M

    2015-12-01

    Arguments are proposed that researchers using longitudinal data should consider more and less complex statistical model alternatives to their initially chosen techniques in an effort to "right-size" the model to the data at hand. Such model comparisons may alert researchers who use poorly fitting, overly parsimonious models to more complex, better-fitting alternatives and, alternatively, may identify more parsimonious alternatives to overly complex (and perhaps empirically underidentified and/or less powerful) statistical models. A general framework is proposed for considering (often nested) relationships between a variety of psychometric and growth curve models. A 3-step approach is proposed in which models are evaluated based on the number and patterning of variance components prior to selection of better-fitting growth models that explain both mean and variation-covariation patterns. The orthogonal free curve slope intercept (FCSI) growth model is considered a general model that includes, as special cases, many models, including the factor mean (FM) model (McArdle & Epstein, 1987), McDonald's (1967) linearly constrained factor model, hierarchical linear models (HLMs), repeated-measures multivariate analysis of variance (MANOVA), and the linear slope intercept (linearSI) growth model. The FCSI model, in turn, is nested within the Tuckerized factor model. The approach is illustrated by comparing alternative models in a longitudinal study of children's vocabulary and by comparing several candidate parametric growth and chronometric models in a Monte Carlo study. (c) 2015 APA, all rights reserved).

  7. Valid statistical approaches for analyzing sholl data: Mixed effects versus simple linear models.

    Science.gov (United States)

    Wilson, Machelle D; Sethi, Sunjay; Lein, Pamela J; Keil, Kimberly P

    2017-03-01

    The Sholl technique is widely used to quantify dendritic morphology. Data from such studies, which typically sample multiple neurons per animal, are often analyzed using simple linear models. However, simple linear models fail to account for intra-class correlation that occurs with clustered data, which can lead to faulty inferences. Mixed effects models account for intra-class correlation that occurs with clustered data; thus, these models more accurately estimate the standard deviation of the parameter estimate, which produces more accurate p-values. While mixed models are not new, their use in neuroscience has lagged behind their use in other disciplines. A review of the published literature illustrates common mistakes in analyses of Sholl data. Analysis of Sholl data collected from Golgi-stained pyramidal neurons in the hippocampus of male and female mice using both simple linear and mixed effects models demonstrates that the p-values and standard deviations obtained using the simple linear models are biased downwards and lead to erroneous rejection of the null hypothesis in some analyses. The mixed effects approach more accurately models the true variability in the data set, which leads to correct inference. Mixed effects models avoid faulty inference in Sholl analysis of data sampled from multiple neurons per animal by accounting for intra-class correlation. Given the widespread practice in neuroscience of obtaining multiple measurements per subject, there is a critical need to apply mixed effects models more widely. Copyright © 2017 Elsevier B.V. All rights reserved.

  8. Statistical Model-Based Face Pose Estimation

    Institute of Scientific and Technical Information of China (English)

    GE Xinliang; YANG Jie; LI Feng; WANG Huahua

    2007-01-01

    A robust face pose estimation approach is proposed by using face shape statistical model approach and pose parameters are represented by trigonometric functions. The face shape statistical model is firstly built by analyzing the face shapes from different people under varying poses. The shape alignment is vital in the process of building the statistical model. Then, six trigonometric functions are employed to represent the face pose parameters. Lastly, the mapping function is constructed between face image and face pose by linearly relating different parameters. The proposed approach is able to estimate different face poses using a few face training samples. Experimental results are provided to demonstrate its efficiency and accuracy.

  9. Huffman and linear scanning methods with statistical language models.

    Science.gov (United States)

    Roark, Brian; Fried-Oken, Melanie; Gibbons, Chris

    2015-03-01

    Current scanning access methods for text generation in AAC devices are limited to relatively few options, most notably row/column variations within a matrix. We present Huffman scanning, a new method for applying statistical language models to binary-switch, static-grid typing AAC interfaces, and compare it to other scanning options under a variety of conditions. We present results for 16 adults without disabilities and one 36-year-old man with locked-in syndrome who presents with complex communication needs and uses AAC scanning devices for writing. Huffman scanning with a statistical language model yielded significant typing speedups for the 16 participants without disabilities versus any of the other methods tested, including two row/column scanning methods. A similar pattern of results was found with the individual with locked-in syndrome. Interestingly, faster typing speeds were obtained with Huffman scanning using a more leisurely scan rate than relatively fast individually calibrated scan rates. Overall, the results reported here demonstrate great promise for the usability of Huffman scanning as a faster alternative to row/column scanning.

  10. Statistical Validation of Engineering and Scientific Models: Background

    International Nuclear Information System (INIS)

    Hills, Richard G.; Trucano, Timothy G.

    1999-01-01

    A tutorial is presented discussing the basic issues associated with propagation of uncertainty analysis and statistical validation of engineering and scientific models. The propagation of uncertainty tutorial illustrates the use of the sensitivity method and the Monte Carlo method to evaluate the uncertainty in predictions for linear and nonlinear models. Four example applications are presented; a linear model, a model for the behavior of a damped spring-mass system, a transient thermal conduction model, and a nonlinear transient convective-diffusive model based on Burger's equation. Correlated and uncorrelated model input parameters are considered. The model validation tutorial builds on the material presented in the propagation of uncertainty tutoriaI and uses the damp spring-mass system as the example application. The validation tutorial illustrates several concepts associated with the application of statistical inference to test model predictions against experimental observations. Several validation methods are presented including error band based, multivariate, sum of squares of residuals, and optimization methods. After completion of the tutorial, a survey of statistical model validation literature is presented and recommendations for future work are made

  11. Analyzing longitudinal data with the linear mixed models procedure in SPSS.

    Science.gov (United States)

    West, Brady T

    2009-09-01

    Many applied researchers analyzing longitudinal data share a common misconception: that specialized statistical software is necessary to fit hierarchical linear models (also known as linear mixed models [LMMs], or multilevel models) to longitudinal data sets. Although several specialized statistical software programs of high quality are available that allow researchers to fit these models to longitudinal data sets (e.g., HLM), rapid advances in general purpose statistical software packages have recently enabled analysts to fit these same models when using preferred packages that also enable other more common analyses. One of these general purpose statistical packages is SPSS, which includes a very flexible and powerful procedure for fitting LMMs to longitudinal data sets with continuous outcomes. This article aims to present readers with a practical discussion of how to analyze longitudinal data using the LMMs procedure in the SPSS statistical software package.

  12. On the analysis of clonogenic survival data: Statistical alternatives to the linear-quadratic model

    International Nuclear Information System (INIS)

    Unkel, Steffen; Belka, Claus; Lauber, Kirsten

    2016-01-01

    The most frequently used method to quantitatively describe the response to ionizing irradiation in terms of clonogenic survival is the linear-quadratic (LQ) model. In the LQ model, the logarithm of the surviving fraction is regressed linearly on the radiation dose by means of a second-degree polynomial. The ratio of the estimated parameters for the linear and quadratic term, respectively, represents the dose at which both terms have the same weight in the abrogation of clonogenic survival. This ratio is known as the α/β ratio. However, there are plausible scenarios in which the α/β ratio fails to sufficiently reflect differences between dose-response curves, for example when curves with similar α/β ratio but different overall steepness are being compared. In such situations, the interpretation of the LQ model is severely limited. Colony formation assays were performed in order to measure the clonogenic survival of nine human pancreatic cancer cell lines and immortalized human pancreatic ductal epithelial cells upon irradiation at 0-10 Gy. The resulting dataset was subjected to LQ regression and non-linear log-logistic regression. Dimensionality reduction of the data was performed by cluster analysis and principal component analysis. Both the LQ model and the non-linear log-logistic regression model resulted in accurate approximations of the observed dose-response relationships in the dataset of clonogenic survival. However, in contrast to the LQ model the non-linear regression model allowed the discrimination of curves with different overall steepness but similar α/β ratio and revealed an improved goodness-of-fit. Additionally, the estimated parameters in the non-linear model exhibit a more direct interpretation than the α/β ratio. Dimensionality reduction of clonogenic survival data by means of cluster analysis was shown to be a useful tool for classifying radioresistant and sensitive cell lines. More quantitatively, principal component analysis allowed

  13. Advanced data analysis in neuroscience integrating statistical and computational models

    CERN Document Server

    Durstewitz, Daniel

    2017-01-01

    This book is intended for use in advanced graduate courses in statistics / machine learning, as well as for all experimental neuroscientists seeking to understand statistical methods at a deeper level, and theoretical neuroscientists with a limited background in statistics. It reviews almost all areas of applied statistics, from basic statistical estimation and test theory, linear and nonlinear approaches for regression and classification, to model selection and methods for dimensionality reduction, density estimation and unsupervised clustering.  Its focus, however, is linear and nonlinear time series analysis from a dynamical systems perspective, based on which it aims to convey an understanding also of the dynamical mechanisms that could have generated observed time series. Further, it integrates computational modeling of behavioral and neural dynamics with statistical estimation and hypothesis testing. This way computational models in neuroscience are not only explanat ory frameworks, but become powerfu...

  14. Statistical Model Checking for Biological Systems

    DEFF Research Database (Denmark)

    David, Alexandre; Larsen, Kim Guldstrand; Legay, Axel

    2014-01-01

    Statistical Model Checking (SMC) is a highly scalable simulation-based verification approach for testing and estimating the probability that a stochastic system satisfies a given linear temporal property. The technique has been applied to (discrete and continuous time) Markov chains, stochastic...

  15. Non-linear statistical downscaling of present and LGM precipitation and temperatures over Europe

    Directory of Open Access Journals (Sweden)

    M. Vrac

    2007-12-01

    Full Text Available Local-scale climate information is increasingly needed for the study of past, present and future climate changes. In this study we develop a non-linear statistical downscaling method to generate local temperatures and precipitation values from large-scale variables of a Earth System Model of Intermediate Complexity (here CLIMBER. Our statistical downscaling scheme is based on the concept of Generalized Additive Models (GAMs, capturing non-linearities via non-parametric techniques. Our GAMs are calibrated on the present Western Europe climate. For this region, annual GAMs (i.e. models based on 12 monthly values per location are fitted by combining two types of large-scale explanatory variables: geographical (e.g. topographical information and physical (i.e. entirely simulated by the CLIMBER model.

    To evaluate the adequacy of the non-linear transfer functions fitted on the present Western European climate, they are applied to different spatial and temporal large-scale conditions. Local projections for present North America and Northern Europe climates are obtained and compared to local observations. This partially addresses the issue of spatial robustness of our transfer functions by answering the question "does our statistical model remain valid when applied to large-scale climate conditions from a region different from the one used for calibration?". To asses their temporal performances, local projections for the Last Glacial Maximum period are derived and compared to local reconstructions and General Circulation Model outputs.

    Our downscaling methodology performs adequately for the Western Europe climate. Concerning the spatial and temporal evaluations, it does not behave as well for Northern America and Northern Europe climates because the calibration domain may be too different from the targeted regions. The physical explanatory variables alone are not capable of downscaling realistic values. However, the inclusion of

  16. Mixed models, linear dependency, and identification in age-period-cohort models.

    Science.gov (United States)

    O'Brien, Robert M

    2017-07-20

    This paper examines the identification problem in age-period-cohort models that use either linear or categorically coded ages, periods, and cohorts or combinations of these parameterizations. These models are not identified using the traditional fixed effect regression model approach because of a linear dependency between the ages, periods, and cohorts. However, these models can be identified if the researcher introduces a single just identifying constraint on the model coefficients. The problem with such constraints is that the results can differ substantially depending on the constraint chosen. Somewhat surprisingly, age-period-cohort models that specify one or more of ages and/or periods and/or cohorts as random effects are identified. This is the case without introducing an additional constraint. I label this identification as statistical model identification and show how statistical model identification comes about in mixed models and why which effects are treated as fixed and which are treated as random can substantially change the estimates of the age, period, and cohort effects. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.

  17. Linear Mixed Models in Statistical Genetics

    NARCIS (Netherlands)

    R. de Vlaming (Ronald)

    2017-01-01

    markdownabstractOne of the goals of statistical genetics is to elucidate the genetic architecture of phenotypes (i.e., observable individual characteristics) that are affected by many genetic variants (e.g., single-nucleotide polymorphisms; SNPs). A particular aim is to identify specific SNPs that

  18. Statistical approach for selection of regression model during validation of bioanalytical method

    Directory of Open Access Journals (Sweden)

    Natalija Nakov

    2014-06-01

    Full Text Available The selection of an adequate regression model is the basis for obtaining accurate and reproducible results during the bionalytical method validation. Given the wide concentration range, frequently present in bioanalytical assays, heteroscedasticity of the data may be expected. Several weighted linear and quadratic regression models were evaluated during the selection of the adequate curve fit using nonparametric statistical tests: One sample rank test and Wilcoxon signed rank test for two independent groups of samples. The results obtained with One sample rank test could not give statistical justification for the selection of linear vs. quadratic regression models because slight differences between the error (presented through the relative residuals were obtained. Estimation of the significance of the differences in the RR was achieved using Wilcoxon signed rank test, where linear and quadratic regression models were treated as two independent groups. The application of this simple non-parametric statistical test provides statistical confirmation of the choice of an adequate regression model.

  19. Online Statistical Modeling (Regression Analysis) for Independent Responses

    Science.gov (United States)

    Made Tirta, I.; Anggraeni, Dian; Pandutama, Martinus

    2017-06-01

    Regression analysis (statistical analmodelling) are among statistical methods which are frequently needed in analyzing quantitative data, especially to model relationship between response and explanatory variables. Nowadays, statistical models have been developed into various directions to model various type and complex relationship of data. Rich varieties of advanced and recent statistical modelling are mostly available on open source software (one of them is R). However, these advanced statistical modelling, are not very friendly to novice R users, since they are based on programming script or command line interface. Our research aims to developed web interface (based on R and shiny), so that most recent and advanced statistical modelling are readily available, accessible and applicable on web. We have previously made interface in the form of e-tutorial for several modern and advanced statistical modelling on R especially for independent responses (including linear models/LM, generalized linier models/GLM, generalized additive model/GAM and generalized additive model for location scale and shape/GAMLSS). In this research we unified them in the form of data analysis, including model using Computer Intensive Statistics (Bootstrap and Markov Chain Monte Carlo/ MCMC). All are readily accessible on our online Virtual Statistics Laboratory. The web (interface) make the statistical modeling becomes easier to apply and easier to compare them in order to find the most appropriate model for the data.

  20. Distributed Monitoring of the R(sup 2) Statistic for Linear Regression

    Science.gov (United States)

    Bhaduri, Kanishka; Das, Kamalika; Giannella, Chris R.

    2011-01-01

    The problem of monitoring a multivariate linear regression model is relevant in studying the evolving relationship between a set of input variables (features) and one or more dependent target variables. This problem becomes challenging for large scale data in a distributed computing environment when only a subset of instances is available at individual nodes and the local data changes frequently. Data centralization and periodic model recomputation can add high overhead to tasks like anomaly detection in such dynamic settings. Therefore, the goal is to develop techniques for monitoring and updating the model over the union of all nodes data in a communication-efficient fashion. Correctness guarantees on such techniques are also often highly desirable, especially in safety-critical application scenarios. In this paper we develop DReMo a distributed algorithm with very low resource overhead, for monitoring the quality of a regression model in terms of its coefficient of determination (R2 statistic). When the nodes collectively determine that R2 has dropped below a fixed threshold, the linear regression model is recomputed via a network-wide convergecast and the updated model is broadcast back to all nodes. We show empirically, using both synthetic and real data, that our proposed method is highly communication-efficient and scalable, and also provide theoretical guarantees on correctness.

  1. Thurstonian models for sensory discrimination tests as generalized linear models

    DEFF Research Database (Denmark)

    Brockhoff, Per B.; Christensen, Rune Haubo Bojesen

    2010-01-01

    as a so-called generalized linear model. The underlying sensory difference 6 becomes directly a parameter of the statistical model and the estimate d' and it's standard error becomes the "usual" output of the statistical analysis. The d' for the monadic A-NOT A method is shown to appear as a standard......Sensory discrimination tests such as the triangle, duo-trio, 2-AFC and 3-AFC tests produce binary data and the Thurstonian decision rule links the underlying sensory difference 6 to the observed number of correct responses. In this paper it is shown how each of these four situations can be viewed...

  2. Statistical Modeling for the Effect of Rotor Speed, Yarn Twist and Linear Density on Production and Quality Characteristics of Rotor Spun Yarn

    Directory of Open Access Journals (Sweden)

    Farooq Ahmed Arain

    2012-01-01

    Full Text Available The aim of this study was to develop a statistical model for the effect of RS (Rotor Speed, YT (Yarn Twist and YLD (Yarn Linear Density on production and quality characteristics of rotor spun yarn. Cotton yarns of 30, 35 and 40 tex were produced on rotor spinning machine at different rotor speeds (i.e. 70000, 80000, 90000 and 100000 rpm and with different twist levels (i.e. 450, 500, 550, 600 and 700 tpm. Yarn production (g/hr and quality characteristics were determined for all the experiments. Based on the results, models were developed using response surface regression on MINITAB�16 statistical tool. The developed models not only characterize the intricate relationships among the factors but may also be used to predict the yarn production and quality characteristics at any level of factors within the range of experimental values.

  3. (ajst) statistical mechanics model for orientational

    African Journals Online (AJOL)

    Science and Engineering Series Vol. 6, No. 2, pp. 94 - 101. STATISTICAL MECHANICS MODEL FOR ORIENTATIONAL. MOTION OF TWO-DIMENSIONAL RIGID ROTATOR. Malo, J.O. ... there is no translational motion and that they are well separated so .... constant and I is the moment of inertia of a linear rotator. Thus, the ...

  4. Modeling exposure–lag–response associations with distributed lag non-linear models

    Science.gov (United States)

    Gasparrini, Antonio

    2014-01-01

    In biomedical research, a health effect is frequently associated with protracted exposures of varying intensity sustained in the past. The main complexity of modeling and interpreting such phenomena lies in the additional temporal dimension needed to express the association, as the risk depends on both intensity and timing of past exposures. This type of dependency is defined here as exposure–lag–response association. In this contribution, I illustrate a general statistical framework for such associations, established through the extension of distributed lag non-linear models, originally developed in time series analysis. This modeling class is based on the definition of a cross-basis, obtained by the combination of two functions to flexibly model linear or nonlinear exposure-responses and the lag structure of the relationship, respectively. The methodology is illustrated with an example application to cohort data and validated through a simulation study. This modeling framework generalizes to various study designs and regression models, and can be applied to study the health effects of protracted exposures to environmental factors, drugs or carcinogenic agents, among others. © 2013 The Authors. Statistics in Medicine published by John Wiley & Sons, Ltd. PMID:24027094

  5. Statistical modelling in biostatistics and bioinformatics selected papers

    CERN Document Server

    Peng, Defen

    2014-01-01

    This book presents selected papers on statistical model development related mainly to the fields of Biostatistics and Bioinformatics. The coverage of the material falls squarely into the following categories: (a) Survival analysis and multivariate survival analysis, (b) Time series and longitudinal data analysis, (c) Statistical model development and (d) Applied statistical modelling. Innovations in statistical modelling are presented throughout each of the four areas, with some intriguing new ideas on hierarchical generalized non-linear models and on frailty models with structural dispersion, just to mention two examples. The contributors include distinguished international statisticians such as Philip Hougaard, John Hinde, Il Do Ha, Roger Payne and Alessandra Durio, among others, as well as promising newcomers. Some of the contributions have come from researchers working in the BIO-SI research programme on Biostatistics and Bioinformatics, centred on the Universities of Limerick and Galway in Ireland and fu...

  6. Statistical study of the non-linear propagation of a partially coherent laser beam

    International Nuclear Information System (INIS)

    Ayanides, J.P.

    2001-01-01

    This research thesis is related to the LMJ project (Laser MegaJoule) and thus to the study and development of thermonuclear fusion. It reports the study of the propagation of a partially-coherent laser beam by using a statistical modelling in order to obtain mean values for the field, and thus bypassing a complex and costly calculation of deterministic quantities. Random fluctuations of the propagated field are supposed to comply with a Gaussian statistics; the laser central wavelength is supposed to be small with respect with fluctuation magnitude; a scale factor is introduced to clearly distinguish the scale of the random and fast variations of the field fluctuations, and the scale of the slow deterministic variations of the field envelopes. The author reports the study of propagation through a purely linear media and through a non-dispersive media, and then through slow non-dispersive and non-linear media (in which the reaction time is large with respect to grain correlation duration, but small with respect to the variation scale of the field macroscopic envelope), and thirdly through an instantaneous dispersive and non linear media (which instantaneously reacts to the field) [fr

  7. The importance of statistical modelling in clinical research : Comparing multidimensional Rasch-, structural equation and linear regression models for analyzing the depression of relatives of psychiatric patients.

    Science.gov (United States)

    Alexandrowicz, Rainer W; Jahn, Rebecca; Friedrich, Fabian; Unger, Anne

    2016-06-01

    Various studies have shown that caregiving relatives of schizophrenic patients are at risk of suffering from depression. These studies differ with respect to the applied statistical methods, which could influence the findings. Therefore, the present study analyzes to which extent different methods may cause differing results. The present study contrasts by means of one data set the results of three different modelling approaches, Rasch Modelling (RM), Structural Equation Modelling (SEM), and Linear Regression Modelling (LRM). The results of the three models varied considerably, reflecting the different assumptions of the respective models. Latent trait models (i. e., RM and SEM) generally provide more convincing results by correcting for measurement error and the RM specifically proves superior for it treats ordered categorical data most adequately.

  8. Statistical modelling of transcript profiles of differentially regulated genes

    Directory of Open Access Journals (Sweden)

    Sergeant Martin J

    2008-07-01

    Full Text Available Abstract Background The vast quantities of gene expression profiling data produced in microarray studies, and the more precise quantitative PCR, are often not statistically analysed to their full potential. Previous studies have summarised gene expression profiles using simple descriptive statistics, basic analysis of variance (ANOVA and the clustering of genes based on simple models fitted to their expression profiles over time. We report the novel application of statistical non-linear regression modelling techniques to describe the shapes of expression profiles for the fungus Agaricus bisporus, quantified by PCR, and for E. coli and Rattus norvegicus, using microarray technology. The use of parametric non-linear regression models provides a more precise description of expression profiles, reducing the "noise" of the raw data to produce a clear "signal" given by the fitted curve, and describing each profile with a small number of biologically interpretable parameters. This approach then allows the direct comparison and clustering of the shapes of response patterns between genes and potentially enables a greater exploration and interpretation of the biological processes driving gene expression. Results Quantitative reverse transcriptase PCR-derived time-course data of genes were modelled. "Split-line" or "broken-stick" regression identified the initial time of gene up-regulation, enabling the classification of genes into those with primary and secondary responses. Five-day profiles were modelled using the biologically-oriented, critical exponential curve, y(t = A + (B + CtRt + ε. This non-linear regression approach allowed the expression patterns for different genes to be compared in terms of curve shape, time of maximal transcript level and the decline and asymptotic response levels. Three distinct regulatory patterns were identified for the five genes studied. Applying the regression modelling approach to microarray-derived time course data

  9. A phenomenological biological dose model for proton therapy based on linear energy transfer spectra.

    Science.gov (United States)

    Rørvik, Eivind; Thörnqvist, Sara; Stokkevåg, Camilla H; Dahle, Tordis J; Fjaera, Lars Fredrik; Ytre-Hauge, Kristian S

    2017-06-01

    The relative biological effectiveness (RBE) of protons varies with the radiation quality, quantified by the linear energy transfer (LET). Most phenomenological models employ a linear dependency of the dose-averaged LET (LET d ) to calculate the biological dose. However, several experiments have indicated a possible non-linear trend. Our aim was to investigate if biological dose models including non-linear LET dependencies should be considered, by introducing a LET spectrum based dose model. The RBE-LET relationship was investigated by fitting of polynomials from 1st to 5th degree to a database of 85 data points from aerobic in vitro experiments. We included both unweighted and weighted regression, the latter taking into account experimental uncertainties. Statistical testing was performed to decide whether higher degree polynomials provided better fits to the data as compared to lower degrees. The newly developed models were compared to three published LET d based models for a simulated spread out Bragg peak (SOBP) scenario. The statistical analysis of the weighted regression analysis favored a non-linear RBE-LET relationship, with the quartic polynomial found to best represent the experimental data (P = 0.010). The results of the unweighted regression analysis were on the borderline of statistical significance for non-linear functions (P = 0.053), and with the current database a linear dependency could not be rejected. For the SOBP scenario, the weighted non-linear model estimated a similar mean RBE value (1.14) compared to the three established models (1.13-1.17). The unweighted model calculated a considerably higher RBE value (1.22). The analysis indicated that non-linear models could give a better representation of the RBE-LET relationship. However, this is not decisive, as inclusion of the experimental uncertainties in the regression analysis had a significant impact on the determination and ranking of the models. As differences between the models were

  10. Statistical mechanical analysis of linear programming relaxation for combinatorial optimization problems

    Science.gov (United States)

    Takabe, Satoshi; Hukushima, Koji

    2016-05-01

    Typical behavior of the linear programming (LP) problem is studied as a relaxation of the minimum vertex cover (min-VC), a type of integer programming (IP) problem. A lattice-gas model on the Erdös-Rényi random graphs of α -uniform hyperedges is proposed to express both the LP and IP problems of the min-VC in the common statistical mechanical model with a one-parameter family. Statistical mechanical analyses reveal for α =2 that the LP optimal solution is typically equal to that given by the IP below the critical average degree c =e in the thermodynamic limit. The critical threshold for good accuracy of the relaxation extends the mathematical result c =1 and coincides with the replica symmetry-breaking threshold of the IP. The LP relaxation for the minimum hitting sets with α ≥3 , minimum vertex covers on α -uniform random graphs, is also studied. Analytic and numerical results strongly suggest that the LP relaxation fails to estimate optimal values above the critical average degree c =e /(α -1 ) where the replica symmetry is broken.

  11. Statistical mechanical analysis of linear programming relaxation for combinatorial optimization problems.

    Science.gov (United States)

    Takabe, Satoshi; Hukushima, Koji

    2016-05-01

    Typical behavior of the linear programming (LP) problem is studied as a relaxation of the minimum vertex cover (min-VC), a type of integer programming (IP) problem. A lattice-gas model on the Erdös-Rényi random graphs of α-uniform hyperedges is proposed to express both the LP and IP problems of the min-VC in the common statistical mechanical model with a one-parameter family. Statistical mechanical analyses reveal for α=2 that the LP optimal solution is typically equal to that given by the IP below the critical average degree c=e in the thermodynamic limit. The critical threshold for good accuracy of the relaxation extends the mathematical result c=1 and coincides with the replica symmetry-breaking threshold of the IP. The LP relaxation for the minimum hitting sets with α≥3, minimum vertex covers on α-uniform random graphs, is also studied. Analytic and numerical results strongly suggest that the LP relaxation fails to estimate optimal values above the critical average degree c=e/(α-1) where the replica symmetry is broken.

  12. On the Use of Rank Tests and Estimates in the Linear Model.

    Science.gov (United States)

    1982-06-01

    models," Journal of the Royal Statistical Society , Series B, 42, 366-371. Neter, J. and Wasserman, W. (1974), Applied Linear Statistical Models...University Park, PA. Schuster, E. (1974), "On the rate of convergence of an estimate of a functional of a probability density," Scandinavian Acturial

  13. Statistical properties of the linear σ model used in dynamical simulations of DCC formation

    International Nuclear Information System (INIS)

    Randrup, J.

    1997-01-01

    The present work develops a simple approximate framework for initializing and interpreting dynamical simulations with the linear σ model exploring the formation of disoriented chiral condensates in high-energy collisions. By enclosing the system in a rectangular box with periodic boundary conditions, it is possible to decompose uniquely the chiral field into its spatial average (the order parameter) and its fluctuations (the quasiparticles) which can be treated in the Hartree approximation. The quasiparticle modes are then described approximately by Klein-Gordon dispersion relations containing an effective mass depending on both the temperature and the magnitude of the order parameter; their fluctuations are instrumental in shaping the effective potential governing the order parameter, and the emerging statistical description is thermodynamicially consistent. The temperature dependence of the statistical distribution of the order parameter is discussed, as is the behavior of the associated effective masses; as the system is cooled, the field fluctuations subside, causing a smooth change from the high-temperature phase in which chiral symmetry is approximately restored towards the normal phase. Of practical interest is the fact that the equilibrium field configurations can be sampled in a simple manner, thus providing a convenient means for specifying the initial conditions in dynamical simulations of the nonequilibrium relaxation of the chiral field; in particular, the correlation function is much more realistic than those emerging in previous initialization methods. It is illustrated how such samples remain approximately invariant under propagation by the unapproximated equation of motion over times that are long on the scale of interest, thereby suggesting that the treatment is sufficiently accurate to be of practical utility. copyright 1997 The American Physical Society

  14. SOME STATISTICAL ISSUES RELATED TO MULTIPLE LINEAR REGRESSION MODELING OF BEACH BACTERIA CONCENTRATIONS

    Science.gov (United States)

    As a fast and effective technique, the multiple linear regression (MLR) method has been widely used in modeling and prediction of beach bacteria concentrations. Among previous works on this subject, however, several issues were insufficiently or inconsistently addressed. Those is...

  15. Robust Linear Models for Cis-eQTL Analysis.

    Science.gov (United States)

    Rantalainen, Mattias; Lindgren, Cecilia M; Holmes, Christopher C

    2015-01-01

    Expression Quantitative Trait Loci (eQTL) analysis enables characterisation of functional genetic variation influencing expression levels of individual genes. In outbread populations, including humans, eQTLs are commonly analysed using the conventional linear model, adjusting for relevant covariates, assuming an allelic dosage model and a Gaussian error term. However, gene expression data generally have noise that induces heavy-tailed errors relative to the Gaussian distribution and often include atypical observations, or outliers. Such departures from modelling assumptions can lead to an increased rate of type II errors (false negatives), and to some extent also type I errors (false positives). Careful model checking can reduce the risk of type-I errors but often not type II errors, since it is generally too time-consuming to carefully check all models with a non-significant effect in large-scale and genome-wide studies. Here we propose the application of a robust linear model for eQTL analysis to reduce adverse effects of deviations from the assumption of Gaussian residuals. We present results from a simulation study as well as results from the analysis of real eQTL data sets. Our findings suggest that in many situations robust models have the potential to provide more reliable eQTL results compared to conventional linear models, particularly in respect to reducing type II errors due to non-Gaussian noise. Post-genomic data, such as that generated in genome-wide eQTL studies, are often noisy and frequently contain atypical observations. Robust statistical models have the potential to provide more reliable results and increased statistical power under non-Gaussian conditions. The results presented here suggest that robust models should be considered routinely alongside other commonly used methodologies for eQTL analysis.

  16. Modelling female fertility traits in beef cattle using linear and non-linear models.

    Science.gov (United States)

    Naya, H; Peñagaricano, F; Urioste, J I

    2017-06-01

    Female fertility traits are key components of the profitability of beef cattle production. However, these traits are difficult and expensive to measure, particularly under extensive pastoral conditions, and consequently, fertility records are in general scarce and somehow incomplete. Moreover, fertility traits are usually dominated by the effects of herd-year environment, and it is generally assumed that relatively small margins are kept for genetic improvement. New ways of modelling genetic variation in these traits are needed. Inspired in the methodological developments made by Prof. Daniel Gianola and co-workers, we assayed linear (Gaussian), Poisson, probit (threshold), censored Poisson and censored Gaussian models to three different kinds of endpoints, namely calving success (CS), number of days from first calving (CD) and number of failed oestrus (FE). For models involving FE and CS, non-linear models overperformed their linear counterparts. For models derived from CD, linear versions displayed better adjustment than the non-linear counterparts. Non-linear models showed consistently higher estimates of heritability and repeatability in all cases (h 2  linear models; h 2  > 0.23 and r > 0.24, for non-linear models). While additive and permanent environment effects showed highly favourable correlations between all models (>0.789), consistency in selecting the 10% best sires showed important differences, mainly amongst the considered endpoints (FE, CS and CD). In consequence, endpoints should be considered as modelling different underlying genetic effects, with linear models more appropriate to describe CD and non-linear models better for FE and CS. © 2017 Blackwell Verlag GmbH.

  17. Linear theory for filtering nonlinear multiscale systems with model error.

    Science.gov (United States)

    Berry, Tyrus; Harlim, John

    2014-07-08

    In this paper, we study filtering of multiscale dynamical systems with model error arising from limitations in resolving the smaller scale processes. In particular, the analysis assumes the availability of continuous-time noisy observations of all components of the slow variables. Mathematically, this paper presents new results on higher order asymptotic expansion of the first two moments of a conditional measure. In particular, we are interested in the application of filtering multiscale problems in which the conditional distribution is defined over the slow variables, given noisy observation of the slow variables alone. From the mathematical analysis, we learn that for a continuous time linear model with Gaussian noise, there exists a unique choice of parameters in a linear reduced model for the slow variables which gives the optimal filtering when only the slow variables are observed. Moreover, these parameters simultaneously give the optimal equilibrium statistical estimates of the underlying system, and as a consequence they can be estimated offline from the equilibrium statistics of the true signal. By examining a nonlinear test model, we show that the linear theory extends in this non-Gaussian, nonlinear configuration as long as we know the optimal stochastic parametrization and the correct observation model. However, when the stochastic parametrization model is inappropriate, parameters chosen for good filter performance may give poor equilibrium statistical estimates and vice versa; this finding is based on analytical and numerical results on our nonlinear test model and the two-layer Lorenz-96 model. Finally, even when the correct stochastic ansatz is given, it is imperative to estimate the parameters simultaneously and to account for the nonlinear feedback of the stochastic parameters into the reduced filter estimates. In numerical experiments on the two-layer Lorenz-96 model, we find that the parameters estimated online , as part of a filtering

  18. Comparison of linear, skewed-linear, and proportional hazard models for the analysis of lambing interval in Ripollesa ewes.

    Science.gov (United States)

    Casellas, J; Bach, R

    2012-06-01

    Lambing interval is a relevant reproductive indicator for sheep populations under continuous mating systems, although there is a shortage of selection programs accounting for this trait in the sheep industry. Both the historical assumption of small genetic background and its unorthodox distribution pattern have limited its implementation as a breeding objective. In this manuscript, statistical performances of 3 alternative parametrizations [i.e., symmetric Gaussian mixed linear (GML) model, skew-Gaussian mixed linear (SGML) model, and piecewise Weibull proportional hazard (PWPH) model] have been compared to elucidate the preferred methodology to handle lambing interval data. More specifically, flock-by-flock analyses were performed on 31,986 lambing interval records (257.3 ± 0.2 d) from 6 purebred Ripollesa flocks. Model performances were compared in terms of deviance information criterion (DIC) and Bayes factor (BF). For all flocks, PWPH models were clearly preferred; they generated a reduction of 1,900 or more DIC units and provided BF estimates larger than 100 (i.e., PWPH models against linear models). These differences were reduced when comparing PWPH models with different number of change points for the baseline hazard function. In 4 flocks, only 2 change points were required to minimize the DIC, whereas 4 and 6 change points were needed for the 2 remaining flocks. These differences demonstrated a remarkable degree of heterogeneity across sheep flocks that must be properly accounted for in genetic evaluation models to avoid statistical biases and suboptimal genetic trends. Within this context, all 6 Ripollesa flocks revealed substantial genetic background for lambing interval with heritabilities ranging between 0.13 and 0.19. This study provides the first evidence of the suitability of PWPH models for lambing interval analysis, clearly discarding previous parametrizations focused on mixed linear models.

  19. Modelling diversity in building occupant behaviour: a novel statistical approach

    DEFF Research Database (Denmark)

    Haldi, Frédéric; Calì, Davide; Andersen, Rune Korsholm

    2016-01-01

    We propose an advanced modelling framework to predict the scope and effects of behavioural diversity regarding building occupant actions on window openings, shading devices and lighting. We develop a statistical approach based on generalised linear mixed models to account for the longitudinal nat...

  20. Modelling subject-specific childhood growth using linear mixed-effect models with cubic regression splines.

    Science.gov (United States)

    Grajeda, Laura M; Ivanescu, Andrada; Saito, Mayuko; Crainiceanu, Ciprian; Jaganath, Devan; Gilman, Robert H; Crabtree, Jean E; Kelleher, Dermott; Cabrera, Lilia; Cama, Vitaliano; Checkley, William

    2016-01-01

    Childhood growth is a cornerstone of pediatric research. Statistical models need to consider individual trajectories to adequately describe growth outcomes. Specifically, well-defined longitudinal models are essential to characterize both population and subject-specific growth. Linear mixed-effect models with cubic regression splines can account for the nonlinearity of growth curves and provide reasonable estimators of population and subject-specific growth, velocity and acceleration. We provide a stepwise approach that builds from simple to complex models, and account for the intrinsic complexity of the data. We start with standard cubic splines regression models and build up to a model that includes subject-specific random intercepts and slopes and residual autocorrelation. We then compared cubic regression splines vis-à-vis linear piecewise splines, and with varying number of knots and positions. Statistical code is provided to ensure reproducibility and improve dissemination of methods. Models are applied to longitudinal height measurements in a cohort of 215 Peruvian children followed from birth until their fourth year of life. Unexplained variability, as measured by the variance of the regression model, was reduced from 7.34 when using ordinary least squares to 0.81 (p linear mixed-effect models with random slopes and a first order continuous autoregressive error term. There was substantial heterogeneity in both the intercept (p modeled with a first order continuous autoregressive error term as evidenced by the variogram of the residuals and by a lack of association among residuals. The final model provides a parametric linear regression equation for both estimation and prediction of population- and individual-level growth in height. We show that cubic regression splines are superior to linear regression splines for the case of a small number of knots in both estimation and prediction with the full linear mixed effect model (AIC 19,352 vs. 19

  1. Cellular automata and statistical mechanical models

    International Nuclear Information System (INIS)

    Rujan, P.

    1987-01-01

    The authors elaborate on the analogy between the transfer matrix of usual lattice models and the master equation describing the time development of cellular automata. Transient and stationary properties of probabilistic automata are linked to surface and bulk properties, respectively, of restricted statistical mechanical systems. It is demonstrated that methods of statistical physics can be successfully used to describe the dynamic and the stationary behavior of such automata. Some exact results are derived, including duality transformations, exact mappings, disorder, and linear solutions. Many examples are worked out in detail to demonstrate how to use statistical physics in order to construct cellular automata with desired properties. This approach is considered to be a first step toward the design of fully parallel, probabilistic systems whose computational abilities rely on the cooperative behavior of their components

  2. Models for probability and statistical inference theory and applications

    CERN Document Server

    Stapleton, James H

    2007-01-01

    This concise, yet thorough, book is enhanced with simulations and graphs to build the intuition of readersModels for Probability and Statistical Inference was written over a five-year period and serves as a comprehensive treatment of the fundamentals of probability and statistical inference. With detailed theoretical coverage found throughout the book, readers acquire the fundamentals needed to advance to more specialized topics, such as sampling, linear models, design of experiments, statistical computing, survival analysis, and bootstrapping.Ideal as a textbook for a two-semester sequence on probability and statistical inference, early chapters provide coverage on probability and include discussions of: discrete models and random variables; discrete distributions including binomial, hypergeometric, geometric, and Poisson; continuous, normal, gamma, and conditional distributions; and limit theory. Since limit theory is usually the most difficult topic for readers to master, the author thoroughly discusses mo...

  3. Log-normal frailty models fitted as Poisson generalized linear mixed models.

    Science.gov (United States)

    Hirsch, Katharina; Wienke, Andreas; Kuss, Oliver

    2016-12-01

    The equivalence of a survival model with a piecewise constant baseline hazard function and a Poisson regression model has been known since decades. As shown in recent studies, this equivalence carries over to clustered survival data: A frailty model with a log-normal frailty term can be interpreted and estimated as a generalized linear mixed model with a binary response, a Poisson likelihood, and a specific offset. Proceeding this way, statistical theory and software for generalized linear mixed models are readily available for fitting frailty models. This gain in flexibility comes at the small price of (1) having to fix the number of pieces for the baseline hazard in advance and (2) having to "explode" the data set by the number of pieces. In this paper we extend the simulations of former studies by using a more realistic baseline hazard (Gompertz) and by comparing the model under consideration with competing models. Furthermore, the SAS macro %PCFrailty is introduced to apply the Poisson generalized linear mixed approach to frailty models. The simulations show good results for the shared frailty model. Our new %PCFrailty macro provides proper estimates, especially in case of 4 events per piece. The suggested Poisson generalized linear mixed approach for log-normal frailty models based on the %PCFrailty macro provides several advantages in the analysis of clustered survival data with respect to more flexible modelling of fixed and random effects, exact (in the sense of non-approximate) maximum likelihood estimation, and standard errors and different types of confidence intervals for all variance parameters. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  4. Comparison between linear and non-parametric regression models for genome-enabled prediction in wheat.

    Science.gov (United States)

    Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne

    2012-12-01

    In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models.

  5. A question of separation: disentangling tracer bias and gravitational non-linearity with counts-in-cells statistics

    Science.gov (United States)

    Uhlemann, C.; Feix, M.; Codis, S.; Pichon, C.; Bernardeau, F.; L'Huillier, B.; Kim, J.; Hong, S. E.; Laigle, C.; Park, C.; Shin, J.; Pogosyan, D.

    2018-02-01

    Starting from a very accurate model for density-in-cells statistics of dark matter based on large deviation theory, a bias model for the tracer density in spheres is formulated. It adopts a mean bias relation based on a quadratic bias model to relate the log-densities of dark matter to those of mass-weighted dark haloes in real and redshift space. The validity of the parametrized bias model is established using a parametrization-independent extraction of the bias function. This average bias model is then combined with the dark matter PDF, neglecting any scatter around it: it nevertheless yields an excellent model for densities-in-cells statistics of mass tracers that is parametrized in terms of the underlying dark matter variance and three bias parameters. The procedure is validated on measurements of both the one- and two-point statistics of subhalo densities in the state-of-the-art Horizon Run 4 simulation showing excellent agreement for measured dark matter variance and bias parameters. Finally, it is demonstrated that this formalism allows for a joint estimation of the non-linear dark matter variance and the bias parameters using solely the statistics of subhaloes. Having verified that galaxy counts in hydrodynamical simulations sampled on a scale of 10 Mpc h-1 closely resemble those of subhaloes, this work provides important steps towards making theoretical predictions for density-in-cells statistics applicable to upcoming galaxy surveys like Euclid or WFIRST.

  6. Subset Statistics in the linear IV regression model

    NARCIS (Netherlands)

    Kleibergen, F.R.

    2005-01-01

    We show that the limiting distributions of subset generalizations of the weak instrument robust instrumental variable statistics are boundedly similar when the remaining structural parameters are estimated using maximum likelihood. They are bounded from above by the limiting distributions which

  7. Applied systems ecology: models, data, and statistical methods

    Energy Technology Data Exchange (ETDEWEB)

    Eberhardt, L L

    1976-01-01

    In this report, systems ecology is largely equated to mathematical or computer simulation modelling. The need for models in ecology stems from the necessity to have an integrative device for the diversity of ecological data, much of which is observational, rather than experimental, as well as from the present lack of a theoretical structure for ecology. Different objectives in applied studies require specialized methods. The best predictive devices may be regression equations, often non-linear in form, extracted from much more detailed models. A variety of statistical aspects of modelling, including sampling, are discussed. Several aspects of population dynamics and food-chain kinetics are described, and it is suggested that the two presently separated approaches should be combined into a single theoretical framework. It is concluded that future efforts in systems ecology should emphasize actual data and statistical methods, as well as modelling.

  8. Introduction to statistical modelling: linear regression.

    Science.gov (United States)

    Lunt, Mark

    2015-07-01

    In many studies we wish to assess how a range of variables are associated with a particular outcome and also determine the strength of such relationships so that we can begin to understand how these factors relate to each other at a population level. Ultimately, we may also be interested in predicting the outcome from a series of predictive factors available at, say, a routine clinic visit. In a recent article in Rheumatology, Desai et al. did precisely that when they studied the prediction of hip and spine BMD from hand BMD and various demographic, lifestyle, disease and therapy variables in patients with RA. This article aims to introduce the statistical methodology that can be used in such a situation and explain the meaning of some of the terms employed. It will also outline some common pitfalls encountered when performing such analyses. © The Author 2013. Published by Oxford University Press on behalf of the British Society for Rheumatology. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  9. Predicting Statistical Response and Extreme Events in Uncertainty Quantification through Reduced-Order Models

    Science.gov (United States)

    Qi, D.; Majda, A.

    2017-12-01

    A low-dimensional reduced-order statistical closure model is developed for quantifying the uncertainty in statistical sensitivity and intermittency in principal model directions with largest variability in high-dimensional turbulent system and turbulent transport models. Imperfect model sensitivity is improved through a recent mathematical strategy for calibrating model errors in a training phase, where information theory and linear statistical response theory are combined in a systematic fashion to achieve the optimal model performance. The idea in the reduced-order method is from a self-consistent mathematical framework for general systems with quadratic nonlinearity, where crucial high-order statistics are approximated by a systematic model calibration procedure. Model efficiency is improved through additional damping and noise corrections to replace the expensive energy-conserving nonlinear interactions. Model errors due to the imperfect nonlinear approximation are corrected by tuning the model parameters using linear response theory with an information metric in a training phase before prediction. A statistical energy principle is adopted to introduce a global scaling factor in characterizing the higher-order moments in a consistent way to improve model sensitivity. Stringent models of barotropic and baroclinic turbulence are used to display the feasibility of the reduced-order methods. Principal statistical responses in mean and variance can be captured by the reduced-order models with accuracy and efficiency. Besides, the reduced-order models are also used to capture crucial passive tracer field that is advected by the baroclinic turbulent flow. It is demonstrated that crucial principal statistical quantities like the tracer spectrum and fat-tails in the tracer probability density functions in the most important large scales can be captured efficiently with accuracy using the reduced-order tracer model in various dynamical regimes of the flow field with

  10. Linear and nonlinear methods in modeling the aqueous solubility of organic compounds.

    Science.gov (United States)

    Catana, Cornel; Gao, Hua; Orrenius, Christian; Stouten, Pieter F W

    2005-01-01

    Solubility data for 930 diverse compounds have been analyzed using linear Partial Least Square (PLS) and nonlinear PLS methods, Continuum Regression (CR), and Neural Networks (NN). 1D and 2D descriptors from MOE package in combination with E-state or ISIS keys have been used. The best model was obtained using linear PLS for a combination between 22 MOE descriptors and 65 ISIS keys. It has a correlation coefficient (r2) of 0.935 and a root-mean-square error (RMSE) of 0.468 log molar solubility (log S(w)). The model validated on a test set of 177 compounds not included in the training set has r2 0.911 and RMSE 0.475 log S(w). The descriptors were ranked according to their importance, and at the top of the list have been found the 22 MOE descriptors. The CR model produced results as good as PLS, and because of the way in which cross-validation has been done it is expected to be a valuable tool in prediction besides PLS model. The statistics obtained using nonlinear methods did not surpass those got with linear ones. The good statistic obtained for linear PLS and CR recommends these models to be used in prediction when it is difficult or impossible to make experimental measurements, for virtual screening, combinatorial library design, and efficient leads optimization.

  11. Linear and Generalized Linear Mixed Models and Their Applications

    CERN Document Server

    Jiang, Jiming

    2007-01-01

    This book covers two major classes of mixed effects models, linear mixed models and generalized linear mixed models, and it presents an up-to-date account of theory and methods in analysis of these models as well as their applications in various fields. The book offers a systematic approach to inference about non-Gaussian linear mixed models. Furthermore, it has included recently developed methods, such as mixed model diagnostics, mixed model selection, and jackknife method in the context of mixed models. The book is aimed at students, researchers and other practitioners who are interested

  12. Understanding and forecasting polar stratospheric variability with statistical models

    Directory of Open Access Journals (Sweden)

    C. Blume

    2012-07-01

    Full Text Available The variability of the north-polar stratospheric vortex is a prominent aspect of the middle atmosphere. This work investigates a wide class of statistical models with respect to their ability to model geopotential and temperature anomalies, representing variability in the polar stratosphere. Four partly nonstationary, nonlinear models are assessed: linear discriminant analysis (LDA; a cluster method based on finite elements (FEM-VARX; a neural network, namely the multi-layer perceptron (MLP; and support vector regression (SVR. These methods model time series by incorporating all significant external factors simultaneously, including ENSO, QBO, the solar cycle, volcanoes, to then quantify their statistical importance. We show that variability in reanalysis data from 1980 to 2005 is successfully modeled. The period from 2005 to 2011 can be hindcasted to a certain extent, where MLP performs significantly better than the remaining models. However, variability remains that cannot be statistically hindcasted within the current framework, such as the unexpected major warming in January 2009. Finally, the statistical model with the best generalization performance is used to predict a winter 2011/12 with warm and weak vortex conditions. A vortex breakdown is predicted for late January, early February 2012.

  13. Solar radiation data - statistical analysis and simulation models

    Energy Technology Data Exchange (ETDEWEB)

    Mustacchi, C; Cena, V; Rocchi, M; Haghigat, F

    1984-01-01

    The activities consisted in collecting meteorological data on magnetic tape for ten european locations (with latitudes ranging from 42/sup 0/ to 56/sup 0/ N), analysing the multi-year sequences, developing mathematical models to generate synthetic sequences having the same statistical properties of the original data sets, and producing one or more Short Reference Years (SRY's) for each location. The meteorological parameters examinated were (for all the locations) global + diffuse radiation on horizontal surface, dry bulb temperature, sunshine duration. For some of the locations additional parameters were available, namely, global, beam and diffuse radiation on surfaces other than horizontal, wet bulb temperature, wind velocity, cloud type, cloud cover. The statistical properties investigated were mean, variance, autocorrelation, crosscorrelation with selected parameters, probability density function. For all the meteorological parameters, various mathematical models were built: linear regression, stochastic models of the AR and the DAR type. In each case, the model with the best statistical behaviour was selected for the production of a SRY for the relevant parameter/location.

  14. Effective connectivity between superior temporal gyrus and Heschl's gyrus during white noise listening: linear versus non-linear models.

    Science.gov (United States)

    Hamid, Ka; Yusoff, An; Rahman, Mza; Mohamad, M; Hamid, Aia

    2012-04-01

    This fMRI study is about modelling the effective connectivity between Heschl's gyrus (HG) and the superior temporal gyrus (STG) in human primary auditory cortices. MATERIALS #ENTITYSTARTX00026; Ten healthy male participants were required to listen to white noise stimuli during functional magnetic resonance imaging (fMRI) scans. Statistical parametric mapping (SPM) was used to generate individual and group brain activation maps. For input region determination, two intrinsic connectivity models comprising bilateral HG and STG were constructed using dynamic causal modelling (DCM). The models were estimated and inferred using DCM while Bayesian Model Selection (BMS) for group studies was used for model comparison and selection. Based on the winning model, six linear and six non-linear causal models were derived and were again estimated, inferred, and compared to obtain a model that best represents the effective connectivity between HG and the STG, balancing accuracy and complexity. Group results indicated significant asymmetrical activation (p(uncorr) Model comparison results showed strong evidence of STG as the input centre. The winning model is preferred by 6 out of 10 participants. The results were supported by BMS results for group studies with the expected posterior probability, r = 0.7830 and exceedance probability, ϕ = 0.9823. One-sample t-tests performed on connection values obtained from the winning model indicated that the valid connections for the winning model are the unidirectional parallel connections from STG to bilateral HG (p model comparison between linear and non-linear models using BMS prefers non-linear connection (r = 0.9160, ϕ = 1.000) from which the connectivity between STG and the ipsi- and contralateral HG is gated by the activity in STG itself. We are able to demonstrate that the effective connectivity between HG and STG while listening to white noise for the respective participants can be explained by a non-linear dynamic causal model with

  15. The Relationship between Economic Growth and Money Laundering – a Linear Regression Model

    Directory of Open Access Journals (Sweden)

    Daniel Rece

    2009-09-01

    Full Text Available This study provides an overview of the relationship between economic growth and money laundering modeled by a least squares function. The report analyzes statistically data collected from USA, Russia, Romania and other eleven European countries, rendering a linear regression model. The study illustrates that 23.7% of the total variance in the regressand (level of money laundering is “explained” by the linear regression model. In our opinion, this model will provide critical auxiliary judgment and decision support for anti-money laundering service systems.

  16. A Multiphase Non-Linear Mixed Effects Model: An Application to Spirometry after Lung Transplantation

    Science.gov (United States)

    Rajeswaran, Jeevanantham; Blackstone, Eugene H.

    2014-01-01

    In medical sciences, we often encounter longitudinal temporal relationships that are non-linear in nature. The influence of risk factors may also change across longitudinal follow-up. A system of multiphase non-linear mixed effects model is presented to model temporal patterns of longitudinal continuous measurements, with temporal decomposition to identify the phases and risk factors within each phase. Application of this model is illustrated using spirometry data after lung transplantation using readily available statistical software. This application illustrates the usefulness of our flexible model when dealing with complex non-linear patterns and time varying coefficients. PMID:24919830

  17. Transformation of Summary Statistics from Linear Mixed Model Association on All-or-None Traits to Odds Ratio.

    Science.gov (United States)

    Lloyd-Jones, Luke R; Robinson, Matthew R; Yang, Jian; Visscher, Peter M

    2018-04-01

    Genome-wide association studies (GWAS) have identified thousands of loci that are robustly associated with complex diseases. The use of linear mixed model (LMM) methodology for GWAS is becoming more prevalent due to its ability to control for population structure and cryptic relatedness and to increase power. The odds ratio (OR) is a common measure of the association of a disease with an exposure ( e.g. , a genetic variant) and is readably available from logistic regression. However, when the LMM is applied to all-or-none traits it provides estimates of genetic effects on the observed 0-1 scale, a different scale to that in logistic regression. This limits the comparability of results across studies, for example in a meta-analysis, and makes the interpretation of the magnitude of an effect from an LMM GWAS difficult. In this study, we derived transformations from the genetic effects estimated under the LMM to the OR that only rely on summary statistics. To test the proposed transformations, we used real genotypes from two large, publicly available data sets to simulate all-or-none phenotypes for a set of scenarios that differ in underlying model, disease prevalence, and heritability. Furthermore, we applied these transformations to GWAS summary statistics for type 2 diabetes generated from 108,042 individuals in the UK Biobank. In both simulation and real-data application, we observed very high concordance between the transformed OR from the LMM and either the simulated truth or estimates from logistic regression. The transformations derived and validated in this study improve the comparability of results from prospective and already performed LMM GWAS on complex diseases by providing a reliable transformation to a common comparative scale for the genetic effects. Copyright © 2018 by the Genetics Society of America.

  18. Dimension of linear models

    DEFF Research Database (Denmark)

    Høskuldsson, Agnar

    1996-01-01

    Determination of the proper dimension of a given linear model is one of the most important tasks in the applied modeling work. We consider here eight criteria that can be used to determine the dimension of the model, or equivalently, the number of components to use in the model. Four of these cri......Determination of the proper dimension of a given linear model is one of the most important tasks in the applied modeling work. We consider here eight criteria that can be used to determine the dimension of the model, or equivalently, the number of components to use in the model. Four...... the basic problems in determining the dimension of linear models. Then each of the eight measures are treated. The results are illustrated by examples....

  19. Comparative analysis of insect succession data from Victoria (Australia) using summary statistics versus preceding mean ambient temperature models.

    Science.gov (United States)

    Archer, Mel

    2014-03-01

    Minimum postmortem interval (mPMI) can be estimated with preceding mean ambient temperature models that predict carrion taxon pre-appearance interval. But accuracy has not been compared with using summary statistics (mean ± SD of taxon arrival/departure day, range, 95% CI). This study collected succession data from ten experimental and five control (infrequently sampled) pig carcasses over two summers (n = 2 experimental, n = 1 control per placement date). Linear and exponential preceding mean ambient temperature models for appearance and departure times were constructed for 17 taxa/developmental stages. There was minimal difference in linear or exponential model success, although arrival models were more often significant: 65% of linear arrival (r2 = 0.09–0.79) and exponential arrival models (r2 = 0.05–81.0) were significant, and 35% of linear departure (r2 = 0.0–0.71) and exponential departure models (r2 = 0.0–0.72) were significant. Performance of models and summary statistics for estimating mPMI was compared in two forensic cases. Only summary statistics produced accurate mPMI estimates.

  20. Feature network models for proximity data : statistical inference, model selection, network representations and links with related models

    NARCIS (Netherlands)

    Frank, Laurence Emmanuelle

    2006-01-01

    Feature Network Models (FNM) are graphical structures that represent proximity data in a discrete space with the use of features. A statistical inference theory is introduced, based on the additivity properties of networks and the linear regression framework. Considering features as predictor

  1. A primer on linear models

    CERN Document Server

    Monahan, John F

    2008-01-01

    Preface Examples of the General Linear Model Introduction One-Sample Problem Simple Linear Regression Multiple Regression One-Way ANOVA First Discussion The Two-Way Nested Model Two-Way Crossed Model Analysis of Covariance Autoregression Discussion The Linear Least Squares Problem The Normal Equations The Geometry of Least Squares Reparameterization Gram-Schmidt Orthonormalization Estimability and Least Squares Estimators Assumptions for the Linear Mean Model Confounding, Identifiability, and Estimability Estimability and Least Squares Estimators F

  2. A semiempirical linear model of indirect, flat-panel x-ray detectors

    Energy Technology Data Exchange (ETDEWEB)

    Huang, Shih-Ying; Yang Kai; Abbey, Craig K.; Boone, John M. [Department of Biomedical Engineering, University of California, Davis, California, One Shields Avenue, Davis, California 95616 (United States) and Department of Radiology, University of California, Davis, Medical Center, 4860 Y Street, Ambulatory Care Center Suite 0505, Sacramento, California 95817 (United States); Department of Radiology, University of California, Davis, Medical Center, 4860 Y Street, Ambulatory Care Center Suite 0505, Sacramento, California 95817 (United States); Department of Psychological and Brain Sciences, University of California, Santa Barbara, California 92106 (United States); Department of Biomedical Engineering, University of California, Davis, California, One Shields Avenue, Davis, California 95616 (United States) and Department of Radiology, University of California, Davis, Medical Center, 4860 Y Street, Ambulatory Care Center Suite 3100, Sacramento, California 95817 (United States)

    2012-04-15

    Purpose: It is important to understand signal and noise transfer in the indirect, flat-panel x-ray detector when developing and optimizing imaging systems. For optimization where simulating images is necessary, this study introduces a semiempirical model to simulate projection images with user-defined x-ray fluence interaction. Methods: The signal and noise transfer in the indirect, flat-panel x-ray detectors is characterized by statistics consistent with energy-integration of x-ray photons. For an incident x-ray spectrum, x-ray photons are attenuated and absorbed in the x-ray scintillator to produce light photons, which are coupled to photodiodes for signal readout. The signal mean and variance are linearly related to the energy-integrated x-ray spectrum by empirically determined factors. With the known first- and second-order statistics, images can be simulated by incorporating multipixel signal statistics and the modulation transfer function of the imaging system. To estimate the semiempirical input to this model, 500 projection images (using an indirect, flat-panel x-ray detector in the breast CT system) were acquired with 50-100 kilovolt (kV) x-ray spectra filtered with 0.1-mm tin (Sn), 0.2-mm copper (Cu), 1.5-mm aluminum (Al), or 0.05-mm silver (Ag). The signal mean and variance of each detector element and the noise power spectra (NPS) were calculated and incorporated into this model for accuracy. Additionally, the modulation transfer function of the detector system was physically measured and incorporated in the image simulation steps. For validation purposes, simulated and measured projection images of air scans were compared using 40 kV/0.1-mm Sn, 65 kV/0.2-mm Cu, 85 kV/1.5-mm Al, and 95 kV/0.05-mm Ag. Results: The linear relationship between the measured signal statistics and the energy-integrated x-ray spectrum was confirmed and incorporated into the model. The signal mean and variance factors were linearly related to kV for each filter material (r

  3. A semiempirical linear model of indirect, flat-panel x-ray detectors

    International Nuclear Information System (INIS)

    Huang, Shih-Ying; Yang Kai; Abbey, Craig K.; Boone, John M.

    2012-01-01

    Purpose: It is important to understand signal and noise transfer in the indirect, flat-panel x-ray detector when developing and optimizing imaging systems. For optimization where simulating images is necessary, this study introduces a semiempirical model to simulate projection images with user-defined x-ray fluence interaction. Methods: The signal and noise transfer in the indirect, flat-panel x-ray detectors is characterized by statistics consistent with energy-integration of x-ray photons. For an incident x-ray spectrum, x-ray photons are attenuated and absorbed in the x-ray scintillator to produce light photons, which are coupled to photodiodes for signal readout. The signal mean and variance are linearly related to the energy-integrated x-ray spectrum by empirically determined factors. With the known first- and second-order statistics, images can be simulated by incorporating multipixel signal statistics and the modulation transfer function of the imaging system. To estimate the semiempirical input to this model, 500 projection images (using an indirect, flat-panel x-ray detector in the breast CT system) were acquired with 50-100 kilovolt (kV) x-ray spectra filtered with 0.1-mm tin (Sn), 0.2-mm copper (Cu), 1.5-mm aluminum (Al), or 0.05-mm silver (Ag). The signal mean and variance of each detector element and the noise power spectra (NPS) were calculated and incorporated into this model for accuracy. Additionally, the modulation transfer function of the detector system was physically measured and incorporated in the image simulation steps. For validation purposes, simulated and measured projection images of air scans were compared using 40 kV/0.1-mm Sn, 65 kV/0.2-mm Cu, 85 kV/1.5-mm Al, and 95 kV/0.05-mm Ag. Results: The linear relationship between the measured signal statistics and the energy-integrated x-ray spectrum was confirmed and incorporated into the model. The signal mean and variance factors were linearly related to kV for each filter material (r 2 of

  4. A semiempirical linear model of indirect, flat-panel x-ray detectors.

    Science.gov (United States)

    Huang, Shih-Ying; Yang, Kai; Abbey, Craig K; Boone, John M

    2012-04-01

    It is important to understand signal and noise transfer in the indirect, flat-panel x-ray detector when developing and optimizing imaging systems. For optimization where simulating images is necessary, this study introduces a semiempirical model to simulate projection images with user-defined x-ray fluence interaction. The signal and noise transfer in the indirect, flat-panel x-ray detectors is characterized by statistics consistent with energy-integration of x-ray photons. For an incident x-ray spectrum, x-ray photons are attenuated and absorbed in the x-ray scintillator to produce light photons, which are coupled to photodiodes for signal readout. The signal mean and variance are linearly related to the energy-integrated x-ray spectrum by empirically determined factors. With the known first- and second-order statistics, images can be simulated by incorporating multipixel signal statistics and the modulation transfer function of the imaging system. To estimate the semiempirical input to this model, 500 projection images (using an indirect, flat-panel x-ray detector in the breast CT system) were acquired with 50-100 kilovolt (kV) x-ray spectra filtered with 0.1-mm tin (Sn), 0.2-mm copper (Cu), 1.5-mm aluminum (Al), or 0.05-mm silver (Ag). The signal mean and variance of each detector element and the noise power spectra (NPS) were calculated and incorporated into this model for accuracy. Additionally, the modulation transfer function of the detector system was physically measured and incorporated in the image simulation steps. For validation purposes, simulated and measured projection images of air scans were compared using 40 kV∕0.1-mm Sn, 65 kV∕0.2-mm Cu, 85 kV∕1.5-mm Al, and 95 kV∕0.05-mm Ag. The linear relationship between the measured signal statistics and the energy-integrated x-ray spectrum was confirmed and incorporated into the model. The signal mean and variance factors were linearly related to kV for each filter material (r(2) of signal mean to k

  5. Statistically accurate low-order models for uncertainty quantification in turbulent dynamical systems.

    Science.gov (United States)

    Sapsis, Themistoklis P; Majda, Andrew J

    2013-08-20

    A framework for low-order predictive statistical modeling and uncertainty quantification in turbulent dynamical systems is developed here. These reduced-order, modified quasilinear Gaussian (ROMQG) algorithms apply to turbulent dynamical systems in which there is significant linear instability or linear nonnormal dynamics in the unperturbed system and energy-conserving nonlinear interactions that transfer energy from the unstable modes to the stable modes where dissipation occurs, resulting in a statistical steady state; such turbulent dynamical systems are ubiquitous in geophysical and engineering turbulence. The ROMQG method involves constructing a low-order, nonlinear, dynamical system for the mean and covariance statistics in the reduced subspace that has the unperturbed statistics as a stable fixed point and optimally incorporates the indirect effect of non-Gaussian third-order statistics for the unperturbed system in a systematic calibration stage. This calibration procedure is achieved through information involving only the mean and covariance statistics for the unperturbed equilibrium. The performance of the ROMQG algorithm is assessed on two stringent test cases: the 40-mode Lorenz 96 model mimicking midlatitude atmospheric turbulence and two-layer baroclinic models for high-latitude ocean turbulence with over 125,000 degrees of freedom. In the Lorenz 96 model, the ROMQG algorithm with just a single mode captures the transient response to random or deterministic forcing. For the baroclinic ocean turbulence models, the inexpensive ROMQG algorithm with 252 modes, less than 0.2% of the total, captures the nonlinear response of the energy, the heat flux, and even the one-dimensional energy and heat flux spectra.

  6. What type of statistical model to choose for the analysis of radioimmunoassays

    International Nuclear Information System (INIS)

    Huet, S.

    1984-01-01

    The current techniques used for statistical analysis of radioimmunoassays are not very satisfactory for either the statistician or the biologist. They are based on an attempt to make the response curve linear to avoid complicated computations. The present article shows that this practice has considerable effects (often neglected) on the statistical assumptions which must be formulated. A more strict analysis is proposed by applying the four-parameter logistic model. The advantages of this method are: the statistical assumptions formulated are based on observed data, and the model can be applied to almost all radioimmunoassays [fr

  7. Estimation and variable selection for generalized additive partial linear models

    KAUST Repository

    Wang, Li

    2011-08-01

    We study generalized additive partial linear models, proposing the use of polynomial spline smoothing for estimation of nonparametric functions, and deriving quasi-likelihood based estimators for the linear parameters. We establish asymptotic normality for the estimators of the parametric components. The procedure avoids solving large systems of equations as in kernel-based procedures and thus results in gains in computational simplicity. We further develop a class of variable selection procedures for the linear parameters by employing a nonconcave penalized quasi-likelihood, which is shown to have an asymptotic oracle property. Monte Carlo simulations and an empirical example are presented for illustration. © Institute of Mathematical Statistics, 2011.

  8. Connection between weighted LPC and higher-order statistics for AR model estimation

    NARCIS (Netherlands)

    Kamp, Y.; Ma, C.

    1993-01-01

    This paper establishes the relationship between a weighted linear prediction method used for robust analysis of voiced speech and the autoregressive modelling based on higher-order statistics, known as cumulants

  9. Application of Hierarchical Linear Models/Linear Mixed-Effects Models in School Effectiveness Research

    Science.gov (United States)

    Ker, H. W.

    2014-01-01

    Multilevel data are very common in educational research. Hierarchical linear models/linear mixed-effects models (HLMs/LMEs) are often utilized to analyze multilevel data nowadays. This paper discusses the problems of utilizing ordinary regressions for modeling multilevel educational data, compare the data analytic results from three regression…

  10. A STATISTICAL ANALYSIS OF GDP AND FINAL CONSUMPTION USING SIMPLE LINEAR REGRESSION. THE CASE OF ROMANIA 1990–2010

    OpenAIRE

    Aniela Balacescu; Marian Zaharia

    2011-01-01

    This paper aims to examine the causal relationship between GDP and final consumption. The authors used linear regression model in which GDP is considered variable results, and final consumption variable factor. In drafting article we used Excel software application that is a modern computing and statistical data analysis.

  11. Vector generalized linear and additive models with an implementation in R

    CERN Document Server

    Yee, Thomas W

    2015-01-01

    This book presents a statistical framework that expands generalized linear models (GLMs) for regression modelling. The framework shared in this book allows analyses based on many semi-traditional applied statistics models to be performed as a coherent whole. This is possible through the approximately half-a-dozen major classes of statistical models included in the book and the software infrastructure component, which makes the models easily operable.    The book’s methodology and accompanying software (the extensive VGAM R package) are directed at these limitations, and this is the first time the methodology and software are covered comprehensively in one volume. Since their advent in 1972, GLMs have unified important distributions under a single umbrella with enormous implications. The demands of practical data analysis, however, require a flexibility that GLMs do not have. Data-driven GLMs, in the form of generalized additive models (GAMs), are also largely confined to the exponential family. This book ...

  12. MetabR: an R script for linear model analysis of quantitative metabolomic data

    Directory of Open Access Journals (Sweden)

    Ernest Ben

    2012-10-01

    Full Text Available Abstract Background Metabolomics is an emerging high-throughput approach to systems biology, but data analysis tools are lacking compared to other systems level disciplines such as transcriptomics and proteomics. Metabolomic data analysis requires a normalization step to remove systematic effects of confounding variables on metabolite measurements. Current tools may not correctly normalize every metabolite when the relationships between each metabolite quantity and fixed-effect confounding variables are different, or for the effects of random-effect confounding variables. Linear mixed models, an established methodology in the microarray literature, offer a standardized and flexible approach for removing the effects of fixed- and random-effect confounding variables from metabolomic data. Findings Here we present a simple menu-driven program, “MetabR”, designed to aid researchers with no programming background in statistical analysis of metabolomic data. Written in the open-source statistical programming language R, MetabR implements linear mixed models to normalize metabolomic data and analysis of variance (ANOVA to test treatment differences. MetabR exports normalized data, checks statistical model assumptions, identifies differentially abundant metabolites, and produces output files to help with data interpretation. Example data are provided to illustrate normalization for common confounding variables and to demonstrate the utility of the MetabR program. Conclusions We developed MetabR as a simple and user-friendly tool for implementing linear mixed model-based normalization and statistical analysis of targeted metabolomic data, which helps to fill a lack of available data analysis tools in this field. The program, user guide, example data, and any future news or updates related to the program may be found at http://metabr.r-forge.r-project.org/.

  13. Statistical modelling approach to derive quantitative nanowastes classification index; estimation of nanomaterials exposure

    CSIR Research Space (South Africa)

    Ntaka, L

    2013-08-01

    Full Text Available . In this work, statistical inference approach specifically the non-parametric bootstrapping and linear model were applied. Data used to develop the model were sourced from the literature. 104 data points with information on aggregation, natural organic matter...

  14. From spiking neuron models to linear-nonlinear models.

    Science.gov (United States)

    Ostojic, Srdjan; Brunel, Nicolas

    2011-01-20

    Neurons transform time-varying inputs into action potentials emitted stochastically at a time dependent rate. The mapping from current input to output firing rate is often represented with the help of phenomenological models such as the linear-nonlinear (LN) cascade, in which the output firing rate is estimated by applying to the input successively a linear temporal filter and a static non-linear transformation. These simplified models leave out the biophysical details of action potential generation. It is not a priori clear to which extent the input-output mapping of biophysically more realistic, spiking neuron models can be reduced to a simple linear-nonlinear cascade. Here we investigate this question for the leaky integrate-and-fire (LIF), exponential integrate-and-fire (EIF) and conductance-based Wang-Buzsáki models in presence of background synaptic activity. We exploit available analytic results for these models to determine the corresponding linear filter and static non-linearity in a parameter-free form. We show that the obtained functions are identical to the linear filter and static non-linearity determined using standard reverse correlation analysis. We then quantitatively compare the output of the corresponding linear-nonlinear cascade with numerical simulations of spiking neurons, systematically varying the parameters of input signal and background noise. We find that the LN cascade provides accurate estimates of the firing rates of spiking neurons in most of parameter space. For the EIF and Wang-Buzsáki models, we show that the LN cascade can be reduced to a firing rate model, the timescale of which we determine analytically. Finally we introduce an adaptive timescale rate model in which the timescale of the linear filter depends on the instantaneous firing rate. This model leads to highly accurate estimates of instantaneous firing rates.

  15. Aspects of general linear modelling of migration.

    Science.gov (United States)

    Congdon, P

    1992-01-01

    "This paper investigates the application of general linear modelling principles to analysing migration flows between areas. Particular attention is paid to specifying the form of the regression and error components, and the nature of departures from Poisson randomness. Extensions to take account of spatial and temporal correlation are discussed as well as constrained estimation. The issue of specification bears on the testing of migration theories, and assessing the role migration plays in job and housing markets: the direction and significance of the effects of economic variates on migration depends on the specification of the statistical model. The application is in the context of migration in London and South East England in the 1970s and 1980s." excerpt

  16. A Graphical User Interface to Generalized Linear Models in MATLAB

    Directory of Open Access Journals (Sweden)

    Peter Dunn

    1999-07-01

    Full Text Available Generalized linear models unite a wide variety of statistical models in a common theoretical framework. This paper discusses GLMLAB-software that enables such models to be fitted in the popular mathematical package MATLAB. It provides a graphical user interface to the powerful MATLAB computational engine to produce a program that is easy to use but with many features, including offsets, prior weights and user-defined distributions and link functions. MATLAB's graphical capacities are also utilized in providing a number of simple residual diagnostic plots.

  17. Dynamic Linear Models with R

    CERN Document Server

    Campagnoli, Patrizia; Petris, Giovanni

    2009-01-01

    State space models have gained tremendous popularity in as disparate fields as engineering, economics, genetics and ecology. Introducing general state space models, this book focuses on dynamic linear models, emphasizing their Bayesian analysis. It illustrates the fundamental steps needed to use dynamic linear models in practice, using R package.

  18. Modelling and Predicting Backstroke Start Performance Using Non-Linear and Linear Models.

    Science.gov (United States)

    de Jesus, Karla; Ayala, Helon V H; de Jesus, Kelly; Coelho, Leandro Dos S; Medeiros, Alexandre I A; Abraldes, José A; Vaz, Mário A P; Fernandes, Ricardo J; Vilas-Boas, João Paulo

    2018-03-01

    Our aim was to compare non-linear and linear mathematical model responses for backstroke start performance prediction. Ten swimmers randomly completed eight 15 m backstroke starts with feet over the wedge, four with hands on the highest horizontal and four on the vertical handgrip. Swimmers were videotaped using a dual media camera set-up, with the starts being performed over an instrumented block with four force plates. Artificial neural networks were applied to predict 5 m start time using kinematic and kinetic variables and to determine the accuracy of the mean absolute percentage error. Artificial neural networks predicted start time more robustly than the linear model with respect to changing training to the validation dataset for the vertical handgrip (3.95 ± 1.67 vs. 5.92 ± 3.27%). Artificial neural networks obtained a smaller mean absolute percentage error than the linear model in the horizontal (0.43 ± 0.19 vs. 0.98 ± 0.19%) and vertical handgrip (0.45 ± 0.19 vs. 1.38 ± 0.30%) using all input data. The best artificial neural network validation revealed a smaller mean absolute error than the linear model for the horizontal (0.007 vs. 0.04 s) and vertical handgrip (0.01 vs. 0.03 s). Artificial neural networks should be used for backstroke 5 m start time prediction due to the quite small differences among the elite level performances.

  19. Comparing Regression Coefficients between Nested Linear Models for Clustered Data with Generalized Estimating Equations

    Science.gov (United States)

    Yan, Jun; Aseltine, Robert H., Jr.; Harel, Ofer

    2013-01-01

    Comparing regression coefficients between models when one model is nested within another is of great practical interest when two explanations of a given phenomenon are specified as linear models. The statistical problem is whether the coefficients associated with a given set of covariates change significantly when other covariates are added into…

  20. Statistical models of petrol engines vehicles dynamics

    Science.gov (United States)

    Ilie, C. O.; Marinescu, M.; Alexa, O.; Vilău, R.; Grosu, D.

    2017-10-01

    This paper focuses on studying statistical models of vehicles dynamics. It was design and perform a one year testing program. There were used many same type cars with gasoline engines and different mileage. Experimental data were collected of onboard sensors and those on the engine test stand. A database containing data of 64th tests was created. Several mathematical modelling were developed using database and the system identification method. Each modelling is a SISO or a MISO linear predictive ARMAX (AutoRegressive-Moving-Average with eXogenous inputs) model. It represents a differential equation with constant coefficients. It were made 64th equations for each dependency like engine torque as output and engine’s load and intake manifold pressure, as inputs. There were obtained strings with 64 values for each type of model. The final models were obtained using average values of the coefficients. The accuracy of models was assessed.

  1. Piecewise Linear-Linear Latent Growth Mixture Models with Unknown Knots

    Science.gov (United States)

    Kohli, Nidhi; Harring, Jeffrey R.; Hancock, Gregory R.

    2013-01-01

    Latent growth curve models with piecewise functions are flexible and useful analytic models for investigating individual behaviors that exhibit distinct phases of development in observed variables. As an extension of this framework, this study considers a piecewise linear-linear latent growth mixture model (LGMM) for describing segmented change of…

  2. New robust statistical procedures for the polytomous logistic regression models.

    Science.gov (United States)

    Castilla, Elena; Ghosh, Abhik; Martin, Nirian; Pardo, Leandro

    2018-05-17

    This article derives a new family of estimators, namely the minimum density power divergence estimators, as a robust generalization of the maximum likelihood estimator for the polytomous logistic regression model. Based on these estimators, a family of Wald-type test statistics for linear hypotheses is introduced. Robustness properties of both the proposed estimators and the test statistics are theoretically studied through the classical influence function analysis. Appropriate real life examples are presented to justify the requirement of suitable robust statistical procedures in place of the likelihood based inference for the polytomous logistic regression model. The validity of the theoretical results established in the article are further confirmed empirically through suitable simulation studies. Finally, an approach for the data-driven selection of the robustness tuning parameter is proposed with empirical justifications. © 2018, The International Biometric Society.

  3. Optimizing refiner operation with statistical modelling

    Energy Technology Data Exchange (ETDEWEB)

    Broderick, G [Noranda Research Centre, Pointe Claire, PQ (Canada)

    1997-02-01

    The impact of refining conditions on the energy efficiency of the process and on the handsheet quality of a chemi-mechanical pulp was studied as part of a series of pilot scale refining trials. Statistical models of refiner performance were constructed from these results and non-linear optimization of process conditions were conducted. Optimization results indicated that increasing the ratio of specific energy applied in the first stage led to a reduction of some 15 per cent in the total energy requirement. The strategy can also be used to obtain significant increases in pulp quality for a given energy input. 20 refs., 6 tabs.

  4. Linear indices in nonlinear structural equation models : best fitting proper indices and other composites

    NARCIS (Netherlands)

    Dijkstra, T.K.; Henseler, J.

    2011-01-01

    The recent advent of nonlinear structural equation models with indices poses a new challenge to the measurement of scientific constructs. We discuss, exemplify and add to a family of statistical methods aimed at creating linear indices, and compare their suitability in a complex path model with

  5. Generalized Linear Models in Vehicle Insurance

    Directory of Open Access Journals (Sweden)

    Silvie Kafková

    2014-01-01

    Full Text Available Actuaries in insurance companies try to find the best model for an estimation of insurance premium. It depends on many risk factors, e.g. the car characteristics and the profile of the driver. In this paper, an analysis of the portfolio of vehicle insurance data using a generalized linear model (GLM is performed. The main advantage of the approach presented in this article is that the GLMs are not limited by inflexible preconditions. Our aim is to predict the relation of annual claim frequency on given risk factors. Based on a large real-world sample of data from 57 410 vehicles, the present study proposed a classification analysis approach that addresses the selection of predictor variables. The models with different predictor variables are compared by analysis of deviance and Akaike information criterion (AIC. Based on this comparison, the model for the best estimate of annual claim frequency is chosen. All statistical calculations are computed in R environment, which contains stats package with the function for the estimation of parameters of GLM and the function for analysis of deviation.

  6. Age related neuromuscular changes in sEMG of m. Tibialis Anterior using higher order statistics (Gaussianity & linearity test).

    Science.gov (United States)

    Siddiqi, Ariba; Arjunan, Sridhar P; Kumar, Dinesh K

    2016-08-01

    Age-associated changes in the surface electromyogram (sEMG) of Tibialis Anterior (TA) muscle can be attributable to neuromuscular alterations that precede strength loss. We have used our sEMG model of the Tibialis Anterior to interpret the age-related changes and compared with the experimental sEMG. Eighteen young (20-30 years) and 18 older (60-85 years) performed isometric dorsiflexion at 6 different percentage levels of maximum voluntary contractions (MVC), and their sEMG from the TA muscle was recorded. Six different age-related changes in the neuromuscular system were simulated using the sEMG model at the same MVCs as the experiment. The maximal power of the spectrum, Gaussianity and Linearity Test Statistics were computed from the simulated and experimental sEMG. A correlation analysis at α=0.05 was performed between the simulated and experimental age-related change in the sEMG features. The results show the loss in motor units was distinguished by the Gaussianity and Linearity test statistics; while the maximal power of the PSD distinguished between the muscular factors. The simulated condition of 40% loss of motor units with halved the number of fast fibers best correlated with the age-related change observed in the experimental sEMG higher order statistical features. The simulated aging condition found by this study corresponds with the moderate motor unit remodelling and negligible strength loss reported in literature for the cohorts aged 60-70 years.

  7. INTRODUCTION TO A COMBINED MULTIPLE LINEAR REGRESSION AND ARMA MODELING APPROACH FOR BEACH BACTERIA PREDICTION

    Science.gov (United States)

    Due to the complexity of the processes contributing to beach bacteria concentrations, many researchers rely on statistical modeling, among which multiple linear regression (MLR) modeling is most widely used. Despite its ease of use and interpretation, there may be time dependence...

  8. White Noise Assumptions Revisited : Regression Models and Statistical Designs for Simulation Practice

    NARCIS (Netherlands)

    Kleijnen, J.P.C.

    2006-01-01

    Classic linear regression models and their concomitant statistical designs assume a univariate response and white noise.By definition, white noise is normally, independently, and identically distributed with zero mean.This survey tries to answer the following questions: (i) How realistic are these

  9. Statistical modelling of networked human-automation performance using working memory capacity.

    Science.gov (United States)

    Ahmed, Nisar; de Visser, Ewart; Shaw, Tyler; Mohamed-Ameen, Amira; Campbell, Mark; Parasuraman, Raja

    2014-01-01

    This study examines the challenging problem of modelling the interaction between individual attentional limitations and decision-making performance in networked human-automation system tasks. Analysis of real experimental data from a task involving networked supervision of multiple unmanned aerial vehicles by human participants shows that both task load and network message quality affect performance, but that these effects are modulated by individual differences in working memory (WM) capacity. These insights were used to assess three statistical approaches for modelling and making predictions with real experimental networked supervisory performance data: classical linear regression, non-parametric Gaussian processes and probabilistic Bayesian networks. It is shown that each of these approaches can help designers of networked human-automated systems cope with various uncertainties in order to accommodate future users by linking expected operating conditions and performance from real experimental data to observable cognitive traits like WM capacity. Practitioner Summary: Working memory (WM) capacity helps account for inter-individual variability in operator performance in networked unmanned aerial vehicle supervisory tasks. This is useful for reliable performance prediction near experimental conditions via linear models; robust statistical prediction beyond experimental conditions via Gaussian process models and probabilistic inference about unknown task conditions/WM capacities via Bayesian network models.

  10. Linear models in the mathematics of uncertainty

    CERN Document Server

    Mordeson, John N; Clark, Terry D; Pham, Alex; Redmond, Michael A

    2013-01-01

    The purpose of this book is to present new mathematical techniques for modeling global issues. These mathematical techniques are used to determine linear equations between a dependent variable and one or more independent variables in cases where standard techniques such as linear regression are not suitable. In this book, we examine cases where the number of data points is small (effects of nuclear warfare), where the experiment is not repeatable (the breakup of the former Soviet Union), and where the data is derived from expert opinion (how conservative is a political party). In all these cases the data  is difficult to measure and an assumption of randomness and/or statistical validity is questionable.  We apply our methods to real world issues in international relations such as  nuclear deterrence, smart power, and cooperative threat reduction. We next apply our methods to issues in comparative politics such as successful democratization, quality of life, economic freedom, political stability, and fail...

  11. Applied linear regression

    CERN Document Server

    Weisberg, Sanford

    2013-01-01

    Praise for the Third Edition ""...this is an excellent book which could easily be used as a course text...""-International Statistical Institute The Fourth Edition of Applied Linear Regression provides a thorough update of the basic theory and methodology of linear regression modeling. Demonstrating the practical applications of linear regression analysis techniques, the Fourth Edition uses interesting, real-world exercises and examples. Stressing central concepts such as model building, understanding parameters, assessing fit and reliability, and drawing conclusions, the new edition illus

  12. Explicit estimating equations for semiparametric generalized linear latent variable models

    KAUST Repository

    Ma, Yanyuan

    2010-07-05

    We study generalized linear latent variable models without requiring a distributional assumption of the latent variables. Using a geometric approach, we derive consistent semiparametric estimators. We demonstrate that these models have a property which is similar to that of a sufficient complete statistic, which enables us to simplify the estimating procedure and explicitly to formulate the semiparametric estimating equations. We further show that the explicit estimators have the usual root n consistency and asymptotic normality. We explain the computational implementation of our method and illustrate the numerical performance of the estimators in finite sample situations via extensive simulation studies. The advantage of our estimators over the existing likelihood approach is also shown via numerical comparison. We employ the method to analyse a real data example from economics. © 2010 Royal Statistical Society.

  13. Variability aware compact model characterization for statistical circuit design optimization

    Science.gov (United States)

    Qiao, Ying; Qian, Kun; Spanos, Costas J.

    2012-03-01

    Variability modeling at the compact transistor model level can enable statistically optimized designs in view of limitations imposed by the fabrication technology. In this work we propose an efficient variabilityaware compact model characterization methodology based on the linear propagation of variance. Hierarchical spatial variability patterns of selected compact model parameters are directly calculated from transistor array test structures. This methodology has been implemented and tested using transistor I-V measurements and the EKV-EPFL compact model. Calculation results compare well to full-wafer direct model parameter extractions. Further studies are done on the proper selection of both compact model parameters and electrical measurement metrics used in the method.

  14. Longitudinal Data Analyses Using Linear Mixed Models in SPSS: Concepts, Procedures and Illustrations

    Directory of Open Access Journals (Sweden)

    Daniel T. L. Shek

    2011-01-01

    Full Text Available Although different methods are available for the analyses of longitudinal data, analyses based on generalized linear models (GLM are criticized as violating the assumption of independence of observations. Alternatively, linear mixed models (LMM are commonly used to understand changes in human behavior over time. In this paper, the basic concepts surrounding LMM (or hierarchical linear models are outlined. Although SPSS is a statistical analyses package commonly used by researchers, documentation on LMM procedures in SPSS is not thorough or user friendly. With reference to this limitation, the related procedures for performing analyses based on LMM in SPSS are described. To demonstrate the application of LMM analyses in SPSS, findings based on six waves of data collected in the Project P.A.T.H.S. (Positive Adolescent Training through Holistic Social Programmes in Hong Kong are presented.

  15. Longitudinal data analyses using linear mixed models in SPSS: concepts, procedures and illustrations.

    Science.gov (United States)

    Shek, Daniel T L; Ma, Cecilia M S

    2011-01-05

    Although different methods are available for the analyses of longitudinal data, analyses based on generalized linear models (GLM) are criticized as violating the assumption of independence of observations. Alternatively, linear mixed models (LMM) are commonly used to understand changes in human behavior over time. In this paper, the basic concepts surrounding LMM (or hierarchical linear models) are outlined. Although SPSS is a statistical analyses package commonly used by researchers, documentation on LMM procedures in SPSS is not thorough or user friendly. With reference to this limitation, the related procedures for performing analyses based on LMM in SPSS are described. To demonstrate the application of LMM analyses in SPSS, findings based on six waves of data collected in the Project P.A.T.H.S. (Positive Adolescent Training through Holistic Social Programmes) in Hong Kong are presented.

  16. Linear and non-linear energy barriers in systems of interacting single-domain ferromagnetic particles

    International Nuclear Information System (INIS)

    Petrila, Iulian; Bodale, Ilie; Rotarescu, Cristian; Stancu, Alexandru

    2011-01-01

    A comparative analysis between linear and non-linear energy barriers used for modeling statistical thermally-excited ferromagnetic systems is presented. The linear energy barrier is obtained by new symmetry considerations about the anisotropy energy and the link with the non-linear energy barrier is also presented. For a relevant analysis we compare the effects of linear and non-linear energy barriers implemented in two different models: Preisach-Neel and Ising-Metropolis. The differences between energy barriers which are reflected in different coercive field dependence of the temperature are also presented. -- Highlights: → The linear energy barrier is obtained from symmetry considerations. → The linear and non-linear energy barriers are calibrated and implemented in Preisach-Neel and Ising-Metropolis models. → The temperature and time effects of the linear and non-linear energy barriers are analyzed.

  17. Dimension of linear models

    DEFF Research Database (Denmark)

    Høskuldsson, Agnar

    1996-01-01

    Determination of the proper dimension of a given linear model is one of the most important tasks in the applied modeling work. We consider here eight criteria that can be used to determine the dimension of the model, or equivalently, the number of components to use in the model. Four...... the basic problems in determining the dimension of linear models. Then each of the eight measures are treated. The results are illustrated by examples....... of these criteria are widely used ones, while the remaining four are ones derived from the H-principle of mathematical modeling. Many examples from practice show that the criteria derived from the H-principle function better than the known and popular criteria for the number of components. We shall briefly review...

  18. Ordinal Log-Linear Models for Contingency Tables

    Directory of Open Access Journals (Sweden)

    Brzezińska Justyna

    2016-12-01

    Full Text Available A log-linear analysis is a method providing a comprehensive scheme to describe the association for categorical variables in a contingency table. The log-linear model specifies how the expected counts depend on the levels of the categorical variables for these cells and provide detailed information on the associations. The aim of this paper is to present theoretical, as well as empirical, aspects of ordinal log-linear models used for contingency tables with ordinal variables. We introduce log-linear models for ordinal variables: linear-by-linear association, row effect model, column effect model and RC Goodman’s model. Algorithm, advantages and disadvantages will be discussed in the paper. An empirical analysis will be conducted with the use of R.

  19. Parameterized Linear Longitudinal Airship Model

    Science.gov (United States)

    Kulczycki, Eric; Elfes, Alberto; Bayard, David; Quadrelli, Marco; Johnson, Joseph

    2010-01-01

    A parameterized linear mathematical model of the longitudinal dynamics of an airship is undergoing development. This model is intended to be used in designing control systems for future airships that would operate in the atmospheres of Earth and remote planets. Heretofore, the development of linearized models of the longitudinal dynamics of airships has been costly in that it has been necessary to perform extensive flight testing and to use system-identification techniques to construct models that fit the flight-test data. The present model is a generic one that can be relatively easily specialized to approximate the dynamics of specific airships at specific operating points, without need for further system identification, and with significantly less flight testing. The approach taken in the present development is to merge the linearized dynamical equations of an airship with techniques for estimation of aircraft stability derivatives, and to thereby make it possible to construct a linearized dynamical model of the longitudinal dynamics of a specific airship from geometric and aerodynamic data pertaining to that airship. (It is also planned to develop a model of the lateral dynamics by use of the same methods.) All of the aerodynamic data needed to construct the model of a specific airship can be obtained from wind-tunnel testing and computational fluid dynamics

  20. Linear and non-linear autoregressive models for short-term wind speed forecasting

    International Nuclear Information System (INIS)

    Lydia, M.; Suresh Kumar, S.; Immanuel Selvakumar, A.; Edwin Prem Kumar, G.

    2016-01-01

    Highlights: • Models for wind speed prediction at 10-min intervals up to 1 h built on time-series wind speed data. • Four different multivariate models for wind speed built based on exogenous variables. • Non-linear models built using three data mining algorithms outperform the linear models. • Autoregressive models based on wind direction perform better than other models. - Abstract: Wind speed forecasting aids in estimating the energy produced from wind farms. The soaring energy demands of the world and minimal availability of conventional energy sources have significantly increased the role of non-conventional sources of energy like solar, wind, etc. Development of models for wind speed forecasting with higher reliability and greater accuracy is the need of the hour. In this paper, models for predicting wind speed at 10-min intervals up to 1 h have been built based on linear and non-linear autoregressive moving average models with and without external variables. The autoregressive moving average models based on wind direction and annual trends have been built using data obtained from Sotavento Galicia Plc. and autoregressive moving average models based on wind direction, wind shear and temperature have been built on data obtained from Centre for Wind Energy Technology, Chennai, India. While the parameters of the linear models are obtained using the Gauss–Newton algorithm, the non-linear autoregressive models are developed using three different data mining algorithms. The accuracy of the models has been measured using three performance metrics namely, the Mean Absolute Error, Root Mean Squared Error and Mean Absolute Percentage Error.

  1. Robust-BD Estimation and Inference for General Partially Linear Models

    Directory of Open Access Journals (Sweden)

    Chunming Zhang

    2017-11-01

    Full Text Available The classical quadratic loss for the partially linear model (PLM and the likelihood function for the generalized PLM are not resistant to outliers. This inspires us to propose a class of “robust-Bregman divergence (BD” estimators of both the parametric and nonparametric components in the general partially linear model (GPLM, which allows the distribution of the response variable to be partially specified, without being fully known. Using the local-polynomial function estimation method, we propose a computationally-efficient procedure for obtaining “robust-BD” estimators and establish the consistency and asymptotic normality of the “robust-BD” estimator of the parametric component β o . For inference procedures of β o in the GPLM, we show that the Wald-type test statistic W n constructed from the “robust-BD” estimators is asymptotically distribution free under the null, whereas the likelihood ratio-type test statistic Λ n is not. This provides an insight into the distinction from the asymptotic equivalence (Fan and Huang 2005 between W n and Λ n in the PLM constructed from profile least-squares estimators using the non-robust quadratic loss. Numerical examples illustrate the computational effectiveness of the proposed “robust-BD” estimators and robust Wald-type test in the appearance of outlying observations.

  2. Wavelet-linear genetic programming: A new approach for modeling monthly streamflow

    Science.gov (United States)

    Ravansalar, Masoud; Rajaee, Taher; Kisi, Ozgur

    2017-06-01

    The streamflows are important and effective factors in stream ecosystems and its accurate prediction is an essential and important issue in water resources and environmental engineering systems. A hybrid wavelet-linear genetic programming (WLGP) model, which includes a discrete wavelet transform (DWT) and a linear genetic programming (LGP) to predict the monthly streamflow (Q) in two gauging stations, Pataveh and Shahmokhtar, on the Beshar River at the Yasuj, Iran were used in this study. In the proposed WLGP model, the wavelet analysis was linked to the LGP model where the original time series of streamflow were decomposed into the sub-time series comprising wavelet coefficients. The results were compared with the single LGP, artificial neural network (ANN), a hybrid wavelet-ANN (WANN) and Multi Linear Regression (MLR) models. The comparisons were done by some of the commonly utilized relevant physical statistics. The Nash coefficients (E) were found as 0.877 and 0.817 for the WLGP model, for the Pataveh and Shahmokhtar stations, respectively. The comparison of the results showed that the WLGP model could significantly increase the streamflow prediction accuracy in both stations. Since, the results demonstrate a closer approximation of the peak streamflow values by the WLGP model, this model could be utilized for the simulation of cumulative streamflow data prediction in one month ahead.

  3. Lecture Notes in Statistics. 3rd Semester

    DEFF Research Database (Denmark)

    The lecture note is prepared to meet the requirements for the 3rd semester course in statistics at the Aarhus School of Business. It focuses on multiple regression models, analysis of variance, and log-linear models.......The lecture note is prepared to meet the requirements for the 3rd semester course in statistics at the Aarhus School of Business. It focuses on multiple regression models, analysis of variance, and log-linear models....

  4. Multivariate generalized linear mixed models using R

    CERN Document Server

    Berridge, Damon Mark

    2011-01-01

    Multivariate Generalized Linear Mixed Models Using R presents robust and methodologically sound models for analyzing large and complex data sets, enabling readers to answer increasingly complex research questions. The book applies the principles of modeling to longitudinal data from panel and related studies via the Sabre software package in R. A Unified Framework for a Broad Class of Models The authors first discuss members of the family of generalized linear models, gradually adding complexity to the modeling framework by incorporating random effects. After reviewing the generalized linear model notation, they illustrate a range of random effects models, including three-level, multivariate, endpoint, event history, and state dependence models. They estimate the multivariate generalized linear mixed models (MGLMMs) using either standard or adaptive Gaussian quadrature. The authors also compare two-level fixed and random effects linear models. The appendices contain additional information on quadrature, model...

  5. Modeling patterns in data using linear and related models

    International Nuclear Information System (INIS)

    Engelhardt, M.E.

    1996-06-01

    This report considers the use of linear models for analyzing data related to reliability and safety issues of the type usually associated with nuclear power plants. The report discusses some of the general results of linear regression analysis, such as the model assumptions and properties of the estimators of the parameters. The results are motivated with examples of operational data. Results about the important case of a linear regression model with one covariate are covered in detail. This case includes analysis of time trends. The analysis is applied with two different sets of time trend data. Diagnostic procedures and tests for the adequacy of the model are discussed. Some related methods such as weighted regression and nonlinear models are also considered. A discussion of the general linear model is also included. Appendix A gives some basic SAS programs and outputs for some of the analyses discussed in the body of the report. Appendix B is a review of some of the matrix theoretic results which are useful in the development of linear models

  6. Sampling, Probability Models and Statistical Reasoning Statistical

    Indian Academy of Sciences (India)

    Home; Journals; Resonance – Journal of Science Education; Volume 1; Issue 5. Sampling, Probability Models and Statistical Reasoning Statistical Inference. Mohan Delampady V R Padmawar. General Article Volume 1 Issue 5 May 1996 pp 49-58 ...

  7. Bayesian Subset Modeling for High-Dimensional Generalized Linear Models

    KAUST Repository

    Liang, Faming

    2013-06-01

    This article presents a new prior setting for high-dimensional generalized linear models, which leads to a Bayesian subset regression (BSR) with the maximum a posteriori model approximately equivalent to the minimum extended Bayesian information criterion model. The consistency of the resulting posterior is established under mild conditions. Further, a variable screening procedure is proposed based on the marginal inclusion probability, which shares the same properties of sure screening and consistency with the existing sure independence screening (SIS) and iterative sure independence screening (ISIS) procedures. However, since the proposed procedure makes use of joint information from all predictors, it generally outperforms SIS and ISIS in real applications. This article also makes extensive comparisons of BSR with the popular penalized likelihood methods, including Lasso, elastic net, SIS, and ISIS. The numerical results indicate that BSR can generally outperform the penalized likelihood methods. The models selected by BSR tend to be sparser and, more importantly, of higher prediction ability. In addition, the performance of the penalized likelihood methods tends to deteriorate as the number of predictors increases, while this is not significant for BSR. Supplementary materials for this article are available online. © 2013 American Statistical Association.

  8. Comparison of linear and non-linear models for predicting energy expenditure from raw accelerometer data.

    Science.gov (United States)

    Montoye, Alexander H K; Begum, Munni; Henning, Zachary; Pfeiffer, Karin A

    2017-02-01

    This study had three purposes, all related to evaluating energy expenditure (EE) prediction accuracy from body-worn accelerometers: (1) compare linear regression to linear mixed models, (2) compare linear models to artificial neural network models, and (3) compare accuracy of accelerometers placed on the hip, thigh, and wrists. Forty individuals performed 13 activities in a 90 min semi-structured, laboratory-based protocol. Participants wore accelerometers on the right hip, right thigh, and both wrists and a portable metabolic analyzer (EE criterion). Four EE prediction models were developed for each accelerometer: linear regression, linear mixed, and two ANN models. EE prediction accuracy was assessed using correlations, root mean square error (RMSE), and bias and was compared across models and accelerometers using repeated-measures analysis of variance. For all accelerometer placements, there were no significant differences for correlations or RMSE between linear regression and linear mixed models (correlations: r  =  0.71-0.88, RMSE: 1.11-1.61 METs; p  >  0.05). For the thigh-worn accelerometer, there were no differences in correlations or RMSE between linear and ANN models (ANN-correlations: r  =  0.89, RMSE: 1.07-1.08 METs. Linear models-correlations: r  =  0.88, RMSE: 1.10-1.11 METs; p  >  0.05). Conversely, one ANN had higher correlations and lower RMSE than both linear models for the hip (ANN-correlation: r  =  0.88, RMSE: 1.12 METs. Linear models-correlations: r  =  0.86, RMSE: 1.18-1.19 METs; p  linear models for the wrist-worn accelerometers (ANN-correlations: r  =  0.82-0.84, RMSE: 1.26-1.32 METs. Linear models-correlations: r  =  0.71-0.73, RMSE: 1.55-1.61 METs; p  models offer a significant improvement in EE prediction accuracy over linear models. Conversely, linear models showed similar EE prediction accuracy to machine learning models for hip- and thigh

  9. Prediction of minimum temperatures in an alpine region by linear and non-linear post-processing of meteorological models

    Directory of Open Access Journals (Sweden)

    R. Barbiero

    2007-05-01

    Full Text Available Model Output Statistics (MOS refers to a method of post-processing the direct outputs of numerical weather prediction (NWP models in order to reduce the biases introduced by a coarse horizontal resolution. This technique is especially useful in orographically complex regions, where large differences can be found between the NWP elevation model and the true orography. This study carries out a comparison of linear and non-linear MOS methods, aimed at the prediction of minimum temperatures in a fruit-growing region of the Italian Alps, based on the output of two different NWPs (ECMWF T511–L60 and LAMI-3. Temperature, of course, is a particularly important NWP output; among other roles it drives the local frost forecast, which is of great interest to agriculture. The mechanisms of cold air drainage, a distinctive aspect of mountain environments, are often unsatisfactorily captured by global circulation models. The simplest post-processing technique applied in this work was a correction for the mean bias, assessed at individual model grid points. We also implemented a multivariate linear regression on the output at the grid points surrounding the target area, and two non-linear models based on machine learning techniques: Neural Networks and Random Forest. We compare the performance of all these techniques on four different NWP data sets. Downscaling the temperatures clearly improved the temperature forecasts with respect to the raw NWP output, and also with respect to the basic mean bias correction. Multivariate methods generally yielded better results, but the advantage of using non-linear algorithms was small if not negligible. RF, the best performing method, was implemented on ECMWF prognostic output at 06:00 UTC over the 9 grid points surrounding the target area. Mean absolute errors in the prediction of 2 m temperature at 06:00 UTC were approximately 1.2°C, close to the natural variability inside the area itself.

  10. Statistical model of a gas diffusion electrode. III. Photomicrograph study

    Energy Technology Data Exchange (ETDEWEB)

    Winsel, A W

    1965-12-01

    A linear section through a gas diffusion electrode produces a certain distribution function of sinews with the pores. From this distribution function some qualities of the pore structure are derived, and an automatic device to determine the distribution function is described. With a statistical model of a gas diffusion electrode the behavior of a DSK electrode is discussed and compared with earlier measurements of the flow resistance of this material.

  11. Core seismic behaviour: linear and non-linear models

    International Nuclear Information System (INIS)

    Bernard, M.; Van Dorsselaere, M.; Gauvain, M.; Jenapierre-Gantenbein, M.

    1981-08-01

    The usual methodology for the core seismic behaviour analysis leads to a double complementary approach: to define a core model to be included in the reactor-block seismic response analysis, simple enough but representative of basic movements (diagrid or slab), to define a finer core model, with basic data issued from the first model. This paper presents the history of the different models of both kinds. The inert mass model (IMM) yielded a first rough diagrid movement. The direct linear model (DLM), without shocks and with sodium as an added mass, let to two different ones: DLM 1 with independent movements of the fuel and radial blanket subassemblies, and DLM 2 with a core combined movement. The non-linear (NLM) ''CORALIE'' uses the same basic modelization (Finite Element Beams) but accounts for shocks. It studies the response of a diameter on flats and takes into account the fluid coupling and the wrapper tube flexibility at the pad level. Damping consists of one modal part of 2% and one part due to shocks. Finally, ''CORALIE'' yields the time-history of the displacements and efforts on the supports, but damping (probably greater than 2%) and fluid-structures interaction are still to be precised. The validation experiments were performed on a RAPSODIE core mock-up on scale 1, in similitude of 1/3 as to SPX 1. The equivalent linear model (ELM) was developed for the SPX 1 reactor-block response analysis and a specified seismic level (SB or SM). It is composed of several oscillators fixed to the diagrid and yields the same maximum displacements and efforts than the NLM. The SPX 1 core seismic analysis with a diagrid input spectrum which corresponds to a 0,1 g group acceleration, has been carried out with these models: some aspects of these calculations are presented here

  12. Linear Logistic Test Modeling with R

    Science.gov (United States)

    Baghaei, Purya; Kubinger, Klaus D.

    2015-01-01

    The present paper gives a general introduction to the linear logistic test model (Fischer, 1973), an extension of the Rasch model with linear constraints on item parameters, along with eRm (an R package to estimate different types of Rasch models; Mair, Hatzinger, & Mair, 2014) functions to estimate the model and interpret its parameters. The…

  13. Reflexion on linear regression trip production modelling method for ensuring good model quality

    Science.gov (United States)

    Suprayitno, Hitapriya; Ratnasari, Vita

    2017-11-01

    Transport Modelling is important. For certain cases, the conventional model still has to be used, in which having a good trip production model is capital. A good model can only be obtained from a good sample. Two of the basic principles of a good sampling is having a sample capable to represent the population characteristics and capable to produce an acceptable error at a certain confidence level. It seems that this principle is not yet quite understood and used in trip production modeling. Therefore, investigating the Trip Production Modelling practice in Indonesia and try to formulate a better modeling method for ensuring the Model Quality is necessary. This research result is presented as follows. Statistics knows a method to calculate span of prediction value at a certain confidence level for linear regression, which is called Confidence Interval of Predicted Value. The common modeling practice uses R2 as the principal quality measure, the sampling practice varies and not always conform to the sampling principles. An experiment indicates that small sample is already capable to give excellent R2 value and sample composition can significantly change the model. Hence, good R2 value, in fact, does not always mean good model quality. These lead to three basic ideas for ensuring good model quality, i.e. reformulating quality measure, calculation procedure, and sampling method. A quality measure is defined as having a good R2 value and a good Confidence Interval of Predicted Value. Calculation procedure must incorporate statistical calculation method and appropriate statistical tests needed. A good sampling method must incorporate random well distributed stratified sampling with a certain minimum number of samples. These three ideas need to be more developed and tested.

  14. Explorative methods in linear models

    DEFF Research Database (Denmark)

    Høskuldsson, Agnar

    2004-01-01

    The author has developed the H-method of mathematical modeling that builds up the model by parts, where each part is optimized with respect to prediction. Besides providing with better predictions than traditional methods, these methods provide with graphic procedures for analyzing different feat...... features in data. These graphic methods extend the well-known methods and results of Principal Component Analysis to any linear model. Here the graphic procedures are applied to linear regression and Ridge Regression....

  15. Sparse Linear Identifiable Multivariate Modeling

    DEFF Research Database (Denmark)

    Henao, Ricardo; Winther, Ole

    2011-01-01

    and bench-marked on artificial and real biological data sets. SLIM is closest in spirit to LiNGAM (Shimizu et al., 2006), but differs substantially in inference, Bayesian network structure learning and model comparison. Experimentally, SLIM performs equally well or better than LiNGAM with comparable......In this paper we consider sparse and identifiable linear latent variable (factor) and linear Bayesian network models for parsimonious analysis of multivariate data. We propose a computationally efficient method for joint parameter and model inference, and model comparison. It consists of a fully...

  16. USE OF THE SIMPLE LINEAR REGRESSION MODEL IN MACRO-ECONOMICAL ANALYSES

    Directory of Open Access Journals (Sweden)

    Constantin ANGHELACHE

    2011-10-01

    Full Text Available The article presents the fundamental aspects of the linear regression, as a toolbox which can be used in macroeconomic analyses. The article describes the estimation of the parameters, the statistical tests used, the homoscesasticity and heteroskedasticity. The use of econometrics instrument in macroeconomics is an important factor that guarantees the quality of the models, analyses, results and possible interpretation that can be drawn at this level.

  17. Linearization effect in multifractal analysis: Insights from the Random Energy Model

    Science.gov (United States)

    Angeletti, Florian; Mézard, Marc; Bertin, Eric; Abry, Patrice

    2011-08-01

    The analysis of the linearization effect in multifractal analysis, and hence of the estimation of moments for multifractal processes, is revisited borrowing concepts from the statistical physics of disordered systems, notably from the analysis of the so-called Random Energy Model. Considering a standard multifractal process (compound Poisson motion), chosen as a simple representative example, we show the following: (i) the existence of a critical order q∗ beyond which moments, though finite, cannot be estimated through empirical averages, irrespective of the sample size of the observation; (ii) multifractal exponents necessarily behave linearly in q, for q>q∗. Tailoring the analysis conducted for the Random Energy Model to that of Compound Poisson motion, we provide explicative and quantitative predictions for the values of q∗ and for the slope controlling the linear behavior of the multifractal exponents. These quantities are shown to be related only to the definition of the multifractal process and not to depend on the sample size of the observation. Monte Carlo simulations, conducted over a large number of large sample size realizations of compound Poisson motion, comfort and extend these analyses.

  18. Statistical Model of Extreme Shear

    DEFF Research Database (Denmark)

    Larsen, Gunner Chr.; Hansen, Kurt Schaldemose

    2004-01-01

    In order to continue cost-optimisation of modern large wind turbines, it is important to continously increase the knowledge on wind field parameters relevant to design loads. This paper presents a general statistical model that offers site-specific prediction of the probability density function...... by a model that, on a statistically consistent basis, describe the most likely spatial shape of an extreme wind shear event. Predictions from the model have been compared with results from an extreme value data analysis, based on a large number of high-sampled full-scale time series measurements...... are consistent, given the inevitabel uncertainties associated with model as well as with the extreme value data analysis. Keywords: Statistical model, extreme wind conditions, statistical analysis, turbulence, wind loading, statistical analysis, turbulence, wind loading, wind shear, wind turbines....

  19. Central Limit Theorem for Exponentially Quasi-local Statistics of Spin Models on Cayley Graphs

    Science.gov (United States)

    Reddy, Tulasi Ram; Vadlamani, Sreekar; Yogeshwaran, D.

    2018-04-01

    Central limit theorems for linear statistics of lattice random fields (including spin models) are usually proven under suitable mixing conditions or quasi-associativity. Many interesting examples of spin models do not satisfy mixing conditions, and on the other hand, it does not seem easy to show central limit theorem for local statistics via quasi-associativity. In this work, we prove general central limit theorems for local statistics and exponentially quasi-local statistics of spin models on discrete Cayley graphs with polynomial growth. Further, we supplement these results by proving similar central limit theorems for random fields on discrete Cayley graphs taking values in a countable space, but under the stronger assumptions of α -mixing (for local statistics) and exponential α -mixing (for exponentially quasi-local statistics). All our central limit theorems assume a suitable variance lower bound like many others in the literature. We illustrate our general central limit theorem with specific examples of lattice spin models and statistics arising in computational topology, statistical physics and random networks. Examples of clustering spin models include quasi-associated spin models with fast decaying covariances like the off-critical Ising model, level sets of Gaussian random fields with fast decaying covariances like the massive Gaussian free field and determinantal point processes with fast decaying kernels. Examples of local statistics include intrinsic volumes, face counts, component counts of random cubical complexes while exponentially quasi-local statistics include nearest neighbour distances in spin models and Betti numbers of sub-critical random cubical complexes.

  20. Latent log-linear models for handwritten digit classification.

    Science.gov (United States)

    Deselaers, Thomas; Gass, Tobias; Heigold, Georg; Ney, Hermann

    2012-06-01

    We present latent log-linear models, an extension of log-linear models incorporating latent variables, and we propose two applications thereof: log-linear mixture models and image deformation-aware log-linear models. The resulting models are fully discriminative, can be trained efficiently, and the model complexity can be controlled. Log-linear mixture models offer additional flexibility within the log-linear modeling framework. Unlike previous approaches, the image deformation-aware model directly considers image deformations and allows for a discriminative training of the deformation parameters. Both are trained using alternating optimization. For certain variants, convergence to a stationary point is guaranteed and, in practice, even variants without this guarantee converge and find models that perform well. We tune the methods on the USPS data set and evaluate on the MNIST data set, demonstrating the generalization capabilities of our proposed models. Our models, although using significantly fewer parameters, are able to obtain competitive results with models proposed in the literature.

  1. Equivalent linear damping characterization in linear and nonlinear force-stiffness muscle models.

    Science.gov (United States)

    Ovesy, Marzieh; Nazari, Mohammad Ali; Mahdavian, Mohammad

    2016-02-01

    In the current research, the muscle equivalent linear damping coefficient which is introduced as the force-velocity relation in a muscle model and the corresponding time constant are investigated. In order to reach this goal, a 1D skeletal muscle model was used. Two characterizations of this model using a linear force-stiffness relationship (Hill-type model) and a nonlinear one have been implemented. The OpenSim platform was used for verification of the model. The isometric activation has been used for the simulation. The equivalent linear damping and the time constant of each model were extracted by using the results obtained from the simulation. The results provide a better insight into the characteristics of each model. It is found that the nonlinear models had a response rate closer to the reality compared to the Hill-type models.

  2. Capturing spike variability in noisy Izhikevich neurons using point process generalized linear models

    DEFF Research Database (Denmark)

    Østergaard, Jacob; Kramer, Mark A.; Eden, Uri T.

    2018-01-01

    current. We then fit these spike train datawith a statistical model (a generalized linear model, GLM, with multiplicative influences of past spiking). For different levels of noise, we show how the GLM captures both the deterministic features of the Izhikevich neuron and the variability driven...... by the noise. We conclude that the GLM captures essential features of the simulated spike trains, but for near-deterministic spike trains, goodness-of-fit analyses reveal that the model does not fit very well in a statistical sense; the essential random part of the GLM is not captured....... are separately applied; understanding the relationships between these modeling approaches remains an area of active research. In this letter, we examine this relationship using simulation. To do so, we first generate spike train data from a well-known dynamical model, the Izhikevich neuron, with a noisy input...

  3. Tracking Electroencephalographic Changes Using Distributions of Linear Models: Application to Propofol-Based Depth of Anesthesia Monitoring.

    Science.gov (United States)

    Kuhlmann, Levin; Manton, Jonathan H; Heyse, Bjorn; Vereecke, Hugo E M; Lipping, Tarmo; Struys, Michel M R F; Liley, David T J

    2017-04-01

    Tracking brain states with electrophysiological measurements often relies on short-term averages of extracted features and this may not adequately capture the variability of brain dynamics. The objective is to assess the hypotheses that this can be overcome by tracking distributions of linear models using anesthesia data, and that anesthetic brain state tracking performance of linear models is comparable to that of a high performing depth of anesthesia monitoring feature. Individuals' brain states are classified by comparing the distribution of linear (auto-regressive moving average-ARMA) model parameters estimated from electroencephalographic (EEG) data obtained with a sliding window to distributions of linear model parameters for each brain state. The method is applied to frontal EEG data from 15 subjects undergoing propofol anesthesia and classified by the observers assessment of alertness/sedation (OAA/S) scale. Classification of the OAA/S score was performed using distributions of either ARMA parameters or the benchmark feature, Higuchi fractal dimension. The highest average testing sensitivity of 59% (chance sensitivity: 17%) was found for ARMA (2,1) models and Higuchi fractal dimension achieved 52%, however, no statistical difference was observed. For the same ARMA case, there was no statistical difference if medians are used instead of distributions (sensitivity: 56%). The model-based distribution approach is not necessarily more effective than a median/short-term average approach, however, it performs well compared with a distribution approach based on a high performing anesthesia monitoring measure. These techniques hold potential for anesthesia monitoring and may be generally applicable for tracking brain states.

  4. ARSENIC CONTAMINATION IN GROUNDWATER: A STATISTICAL MODELING

    Directory of Open Access Journals (Sweden)

    Palas Roy

    2013-01-01

    Full Text Available High arsenic in natural groundwater in most of the tubewells of the Purbasthali- Block II area of Burdwan district (W.B, India has recently been focused as a serious environmental concern. This paper is intending to illustrate the statistical modeling of the arsenic contaminated groundwater to identify the interrelation of that arsenic contain with other participating groundwater parameters so that the arsenic contamination level can easily be predicted by analyzing only such parameters. Multivariate data analysis was done with the collected groundwater samples from the 132 tubewells of this contaminated region shows that three variable parameters are significantly related with the arsenic. Based on these relationships, a multiple linear regression model has been developed that estimated the arsenic contamination by measuring such three predictor parameters of the groundwater variables in the contaminated aquifer. This model could also be a suggestive tool while designing the arsenic removal scheme for any affected groundwater.

  5. Distributed Monitoring of the R2 Statistic for Linear Regression

    Data.gov (United States)

    National Aeronautics and Space Administration — The problem of monitoring a multivariate linear regression model is relevant in studying the evolving relationship between a set of input variables (features) and...

  6. An Introduction to the Hybrid Approach of Neural Networks and the Linear Regression Model : An Illustration in the Hedonic Pricing Model of Building Costs

    OpenAIRE

    浅野, 美代子; マーコ, ユー K.W.

    2007-01-01

    This paper introduces the hybrid approach of neural networks and linear regression model proposed by Asano and Tsubaki (2003). Neural networks are often credited with its superiority in data consistency whereas the linear regression model provides simple interpretation of the data enabling researchers to verify their hypotheses. The hybrid approach aims at combing the strengths of these two well-established statistical methods. A step-by-step procedure for performing the hybrid approach is pr...

  7. From linear to generalized linear mixed models: A case study in repeated measures

    Science.gov (United States)

    Compared to traditional linear mixed models, generalized linear mixed models (GLMMs) can offer better correspondence between response variables and explanatory models, yielding more efficient estimates and tests in the analysis of data from designed experiments. Using proportion data from a designed...

  8. Extended Linear Models with Gaussian Priors

    DEFF Research Database (Denmark)

    Quinonero, Joaquin

    2002-01-01

    In extended linear models the input space is projected onto a feature space by means of an arbitrary non-linear transformation. A linear model is then applied to the feature space to construct the model output. The dimension of the feature space can be very large, or even infinite, giving the model...... a very big flexibility. Support Vector Machines (SVM's) and Gaussian processes are two examples of such models. In this technical report I present a model in which the dimension of the feature space remains finite, and where a Bayesian approach is used to train the model with Gaussian priors...... on the parameters. The Relevance Vector Machine, introduced by Tipping, is a particular case of such a model. I give the detailed derivations of the expectation-maximisation (EM) algorithm used in the training. These derivations are not found in the literature, and might be helpful for newcomers....

  9. Linear mixed models for longitudinal data

    CERN Document Server

    Molenberghs, Geert

    2000-01-01

    This paperback edition is a reprint of the 2000 edition. This book provides a comprehensive treatment of linear mixed models for continuous longitudinal data. Next to model formulation, this edition puts major emphasis on exploratory data analysis for all aspects of the model, such as the marginal model, subject-specific profiles, and residual covariance structure. Further, model diagnostics and missing data receive extensive treatment. Sensitivity analysis for incomplete data is given a prominent place. Several variations to the conventional linear mixed model are discussed (a heterogeity model, conditional linear mixed models). This book will be of interest to applied statisticians and biomedical researchers in industry, public health organizations, contract research organizations, and academia. The book is explanatory rather than mathematically rigorous. Most analyses were done with the MIXED procedure of the SAS software package, and many of its features are clearly elucidated. However, some other commerc...

  10. Introduction to applied Bayesian statistics and estimation for social scientists

    CERN Document Server

    Lynch, Scott M

    2007-01-01

    ""Introduction to Applied Bayesian Statistics and Estimation for Social Scientists"" covers the complete process of Bayesian statistical analysis in great detail from the development of a model through the process of making statistical inference. The key feature of this book is that it covers models that are most commonly used in social science research - including the linear regression model, generalized linear models, hierarchical models, and multivariate regression models - and it thoroughly develops each real-data example in painstaking detail.The first part of the book provides a detailed

  11. Statistical inference for template aging

    Science.gov (United States)

    Schuckers, Michael E.

    2006-04-01

    A change in classification error rates for a biometric device is often referred to as template aging. Here we offer two methods for determining whether the effect of time is statistically significant. The first of these is the use of a generalized linear model to determine if these error rates change linearly over time. This approach generalizes previous work assessing the impact of covariates using generalized linear models. The second approach uses of likelihood ratio tests methodology. The focus here is on statistical methods for estimation not the underlying cause of the change in error rates over time. These methodologies are applied to data from the National Institutes of Standards and Technology Biometric Score Set Release 1. The results of these applications are discussed.

  12. Non-linear finite element modeling

    DEFF Research Database (Denmark)

    Mikkelsen, Lars Pilgaard

    The note is written for courses in "Non-linear finite element method". The note has been used by the author teaching non-linear finite element modeling at Civil Engineering at Aalborg University, Computational Mechanics at Aalborg University Esbjerg, Structural Engineering at the University...

  13. A Linear Regression Model for Global Solar Radiation on Horizontal Surfaces at Warri, Nigeria

    Directory of Open Access Journals (Sweden)

    Michael S. Okundamiya

    2013-10-01

    Full Text Available The growing anxiety on the negative effects of fossil fuels on the environment and the global emission reduction targets call for a more extensive use of renewable energy alternatives. Efficient solar energy utilization is an essential solution to the high atmospheric pollution caused by fossil fuel combustion. Global solar radiation (GSR data, which are useful for the design and evaluation of solar energy conversion system, are not measured at the forty-five meteorological stations in Nigeria. The dearth of the measured solar radiation data calls for accurate estimation. This study proposed a temperature-based linear regression, for predicting the monthly average daily GSR on horizontal surfaces, at Warri (latitude 5.020N and longitude 7.880E an oil city located in the south-south geopolitical zone, in Nigeria. The proposed model is analyzed based on five statistical indicators (coefficient of correlation, coefficient of determination, mean bias error, root mean square error, and t-statistic, and compared with the existing sunshine-based model for the same study. The results indicate that the proposed temperature-based linear regression model could replace the existing sunshine-based model for generating global solar radiation data. Keywords: air temperature; empirical model; global solar radiation; regression analysis; renewable energy; Warri

  14. Exclusion statistics and integrable models

    International Nuclear Information System (INIS)

    Mashkevich, S.

    1998-01-01

    The definition of exclusion statistics that was given by Haldane admits a 'statistical interaction' between distinguishable particles (multispecies statistics). For such statistics, thermodynamic quantities can be evaluated exactly; explicit expressions are presented here for cluster coefficients. Furthermore, single-species exclusion statistics is realized in one-dimensional integrable models of the Calogero-Sutherland type. The interesting questions of generalizing this correspondence to the higher-dimensional and the multispecies cases remain essentially open; however, our results provide some hints as to searches for the models in question

  15. Introducing Linear Functions: An Alternative Statistical Approach

    Science.gov (United States)

    Nolan, Caroline; Herbert, Sandra

    2015-01-01

    The introduction of linear functions is the turning point where many students decide if mathematics is useful or not. This means the role of parameters and variables in linear functions could be considered to be "threshold concepts". There is recognition that linear functions can be taught in context through the exploration of linear…

  16. Falling in the elderly: Do statistical models matter for performance criteria of fall prediction? Results from two large population-based studies.

    Science.gov (United States)

    Kabeshova, Anastasiia; Launay, Cyrille P; Gromov, Vasilii A; Fantino, Bruno; Levinoff, Elise J; Allali, Gilles; Beauchet, Olivier

    2016-01-01

    To compare performance criteria (i.e., sensitivity, specificity, positive predictive value, negative predictive value, area under receiver operating characteristic curve and accuracy) of linear and non-linear statistical models for fall risk in older community-dwellers. Participants were recruited in two large population-based studies, "Prévention des Chutes, Réseau 4" (PCR4, n=1760, cross-sectional design, retrospective collection of falls) and "Prévention des Chutes Personnes Agées" (PCPA, n=1765, cohort design, prospective collection of falls). Six linear statistical models (i.e., logistic regression, discriminant analysis, Bayes network algorithm, decision tree, random forest, boosted trees), three non-linear statistical models corresponding to artificial neural networks (multilayer perceptron, genetic algorithm and neuroevolution of augmenting topologies [NEAT]) and the adaptive neuro fuzzy interference system (ANFIS) were used. Falls ≥1 characterizing fallers and falls ≥2 characterizing recurrent fallers were used as outcomes. Data of studies were analyzed separately and together. NEAT and ANFIS had better performance criteria compared to other models. The highest performance criteria were reported with NEAT when using PCR4 database and falls ≥1, and with both NEAT and ANFIS when pooling data together and using falls ≥2. However, sensitivity and specificity were unbalanced. Sensitivity was higher than specificity when identifying fallers, whereas the converse was found when predicting recurrent fallers. Our results showed that NEAT and ANFIS were non-linear statistical models with the best performance criteria for the prediction of falls but their sensitivity and specificity were unbalanced, underscoring that models should be used respectively for the screening of fallers and the diagnosis of recurrent fallers. Copyright © 2015 European Federation of Internal Medicine. Published by Elsevier B.V. All rights reserved.

  17. Statistical distributions of earthquakes and related non-linear features in seismic waves

    International Nuclear Information System (INIS)

    Apostol, B.-F.

    2006-01-01

    A few basic facts in the science of the earthquakes are briefly reviewed. An accumulation, or growth, model is put forward for the focal mechanisms and the critical focal zone of the earthquakes, which relates the earthquake average recurrence time to the released seismic energy. The temporal statistical distribution for average recurrence time is introduced for earthquakes, and, on this basis, the Omori-type distribution in energy is derived, as well as the distribution in magnitude, by making use of the semi-empirical Gutenberg-Richter law relating seismic energy to earthquake magnitude. On geometric grounds, the accumulation model suggests the value r = 1/3 for the Omori parameter in the power-law of energy distribution, which leads to β = 1,17 for the coefficient in the Gutenberg-Richter recurrence law, in fair agreement with the statistical analysis of the empirical data. Making use of this value, the empirical Bath's law is discussed for the average magnitude of the aftershocks (which is 1.2 less than the magnitude of the main seismic shock), by assuming that the aftershocks are relaxation events of the seismic zone. The time distribution of the earthquakes with a fixed average recurrence time is also derived, the earthquake occurrence prediction is discussed by means of the average recurrence time and the seismicity rate, and application of this discussion to the seismic region Vrancea, Romania, is outlined. Finally, a special effect of non-linear behaviour of the seismic waves is discussed, by describing an exact solution derived recently for the elastic waves equation with cubic anharmonicities, its relevance, and its connection to the approximate quasi-plane waves picture. The properties of the seismic activity accompanying a main seismic shock, both like foreshocks and aftershocks, are relegated to forthcoming publications. (author)

  18. A heteroscedastic generalized linear model with a non-normal speed factor for responses and response times.

    Science.gov (United States)

    Molenaar, Dylan; Bolsinova, Maria

    2017-05-01

    In generalized linear modelling of responses and response times, the observed response time variables are commonly transformed to make their distribution approximately normal. A normal distribution for the transformed response times is desirable as it justifies the linearity and homoscedasticity assumptions in the underlying linear model. Past research has, however, shown that the transformed response times are not always normal. Models have been developed to accommodate this violation. In the present study, we propose a modelling approach for responses and response times to test and model non-normality in the transformed response times. Most importantly, we distinguish between non-normality due to heteroscedastic residual variances, and non-normality due to a skewed speed factor. In a simulation study, we establish parameter recovery and the power to separate both effects. In addition, we apply the model to a real data set. © 2017 The Authors. British Journal of Mathematical and Statistical Psychology published by John Wiley & Sons Ltd on behalf of British Psychological Society.

  19. Analysis of dental caries using generalized linear and count regression models

    Directory of Open Access Journals (Sweden)

    Javali M. Phil

    2013-11-01

    Full Text Available Generalized linear models (GLM are generalization of linear regression models, which allow fitting regression models to response data in all the sciences especially medical and dental sciences that follow a general exponential family. These are flexible and widely used class of such models that can accommodate response variables. Count data are frequently characterized by overdispersion and excess zeros. Zero-inflated count models provide a parsimonious yet powerful way to model this type of situation. Such models assume that the data are a mixture of two separate data generation processes: one generates only zeros, and the other is either a Poisson or a negative binomial data-generating process. Zero inflated count regression models such as the zero-inflated Poisson (ZIP, zero-inflated negative binomial (ZINB regression models have been used to handle dental caries count data with many zeros. We present an evaluation framework to the suitability of applying the GLM, Poisson, NB, ZIP and ZINB to dental caries data set where the count data may exhibit evidence of many zeros and over-dispersion. Estimation of the model parameters using the method of maximum likelihood is provided. Based on the Vuong test statistic and the goodness of fit measure for dental caries data, the NB and ZINB regression models perform better than other count regression models.

  20. An Investigation of the Fit of Linear Regression Models to Data from an SAT[R] Validity Study. Research Report 2011-3

    Science.gov (United States)

    Kobrin, Jennifer L.; Sinharay, Sandip; Haberman, Shelby J.; Chajewski, Michael

    2011-01-01

    This study examined the adequacy of a multiple linear regression model for predicting first-year college grade point average (FYGPA) using SAT[R] scores and high school grade point average (HSGPA). A variety of techniques, both graphical and statistical, were used to examine if it is possible to improve on the linear regression model. The results…

  1. Application of the Hyper-Poisson Generalized Linear Model for Analyzing Motor Vehicle Crashes.

    Science.gov (United States)

    Khazraee, S Hadi; Sáez-Castillo, Antonio Jose; Geedipally, Srinivas Reddy; Lord, Dominique

    2015-05-01

    The hyper-Poisson distribution can handle both over- and underdispersion, and its generalized linear model formulation allows the dispersion of the distribution to be observation-specific and dependent on model covariates. This study's objective is to examine the potential applicability of a newly proposed generalized linear model framework for the hyper-Poisson distribution in analyzing motor vehicle crash count data. The hyper-Poisson generalized linear model was first fitted to intersection crash data from Toronto, characterized by overdispersion, and then to crash data from railway-highway crossings in Korea, characterized by underdispersion. The results of this study are promising. When fitted to the Toronto data set, the goodness-of-fit measures indicated that the hyper-Poisson model with a variable dispersion parameter provided a statistical fit as good as the traditional negative binomial model. The hyper-Poisson model was also successful in handling the underdispersed data from Korea; the model performed as well as the gamma probability model and the Conway-Maxwell-Poisson model previously developed for the same data set. The advantages of the hyper-Poisson model studied in this article are noteworthy. Unlike the negative binomial model, which has difficulties in handling underdispersed data, the hyper-Poisson model can handle both over- and underdispersed crash data. Although not a major issue for the Conway-Maxwell-Poisson model, the effect of each variable on the expected mean of crashes is easily interpretable in the case of this new model. © 2014 Society for Risk Analysis.

  2. Statistical modelling with quantile functions

    CERN Document Server

    Gilchrist, Warren

    2000-01-01

    Galton used quantiles more than a hundred years ago in describing data. Tukey and Parzen used them in the 60s and 70s in describing populations. Since then, the authors of many papers, both theoretical and practical, have used various aspects of quantiles in their work. Until now, however, no one put all the ideas together to form what turns out to be a general approach to statistics.Statistical Modelling with Quantile Functions does just that. It systematically examines the entire process of statistical modelling, starting with using the quantile function to define continuous distributions. The author shows that by using this approach, it becomes possible to develop complex distributional models from simple components. A modelling kit can be developed that applies to the whole model - deterministic and stochastic components - and this kit operates by adding, multiplying, and transforming distributions rather than data.Statistical Modelling with Quantile Functions adds a new dimension to the practice of stati...

  3. A Statistical Programme Assignment Model

    DEFF Research Database (Denmark)

    Rosholm, Michael; Staghøj, Jonas; Svarer, Michael

    When treatment effects of active labour market programmes are heterogeneous in an observable way  across the population, the allocation of the unemployed into different programmes becomes a particularly  important issue. In this paper, we present a statistical model designed to improve the present...... duration of unemployment spells may result if a statistical programme assignment model is introduced. We discuss several issues regarding the  plementation of such a system, especially the interplay between the statistical model and  case workers....

  4. An online re-linearization scheme suited for Model Predictive and Linear Quadratic Control

    DEFF Research Database (Denmark)

    Henriksen, Lars Christian; Poulsen, Niels Kjølstad

    This technical note documents the equations for primal-dual interior-point quadratic programming problem solver used for MPC. The algorithm exploits the special structure of the MPC problem and is able to reduce the computational burden such that the computational burden scales with prediction...... horizon length in a linear way rather than cubic, which would be the case if the structure was not exploited. It is also shown how models used for design of model-based controllers, e.g. linear quadratic and model predictive, can be linearized both at equilibrium and non-equilibrium points, making...

  5. Statistical Multipath Model Based on Experimental GNSS Data in Static Urban Canyon Environment

    Directory of Open Access Journals (Sweden)

    Yuze Wang

    2018-04-01

    Full Text Available A deep understanding of multipath characteristics is essential to design signal simulators and receivers in global navigation satellite system applications. As a new constellation is deployed and more applications occur in the urban environment, the statistical multipath models of navigation signal need further study. In this paper, we present statistical distribution models of multipath time delay, multipath power attenuation, and multipath fading frequency based on the experimental data in the urban canyon environment. The raw data of multipath characteristics are obtained by processing real navigation signal to study the statistical distribution. By fitting the statistical data, it shows that the probability distribution of time delay follows a gamma distribution which is related to the waiting time of Poisson distributed events. The fading frequency follows an exponential distribution, and the mean of multipath power attenuation decreases linearly with an increasing time delay. In addition, the detailed statistical characteristics for different elevations and orbits satellites is studied, and the parameters of each distribution are quite different. The research results give useful guidance for navigation simulator and receiver designers.

  6. Modeling of Volatility with Non-linear Time Series Model

    OpenAIRE

    Kim Song Yon; Kim Mun Chol

    2013-01-01

    In this paper, non-linear time series models are used to describe volatility in financial time series data. To describe volatility, two of the non-linear time series are combined into form TAR (Threshold Auto-Regressive Model) with AARCH (Asymmetric Auto-Regressive Conditional Heteroskedasticity) error term and its parameter estimation is studied.

  7. Applicability of linear and non-linear potential flow models on a Wavestar float

    DEFF Research Database (Denmark)

    Bozonnet, Pauline; Dupin, Victor; Tona, Paolino

    2017-01-01

    as a model based on non-linear potential flow theory and weakscatterer hypothesis are successively considered. Simple tests, such as dip tests, decay tests and captive tests enable to highlight the improvements obtained with the introduction of nonlinearities. Float motion under wave actions and without...... control action, limited to small amplitude motion with a single float, is well predicted by the numerical models, including the linear one. Still, float velocity is better predicted by accounting for non-linear hydrostatic and Froude-Krylov forces.......Numerical models based on potential flow theory, including different types of nonlinearities are compared and validated against experimental data for the Wavestar wave energy converter technology. Exact resolution of the rotational motion, non-linear hydrostatic and Froude-Krylov forces as well...

  8. Forecasting Volatility of Dhaka Stock Exchange: Linear Vs Non-linear models

    Directory of Open Access Journals (Sweden)

    Masudul Islam

    2012-10-01

    Full Text Available Prior information about a financial market is very essential for investor to invest money on parches share from the stock market which can strengthen the economy. The study examines the relative ability of various models to forecast daily stock indexes future volatility. The forecasting models that employed from simple to relatively complex ARCH-class models. It is found that among linear models of stock indexes volatility, the moving average model ranks first using root mean square error, mean absolute percent error, Theil-U and Linex loss function  criteria. We also examine five nonlinear models. These models are ARCH, GARCH, EGARCH, TGARCH and restricted GARCH models. We find that nonlinear models failed to dominate linear models utilizing different error measurement criteria and moving average model appears to be the best. Then we forecast the next two months future stock index price volatility by the best (moving average model.

  9. Comparing linear probability model coefficients across groups

    DEFF Research Database (Denmark)

    Holm, Anders; Ejrnæs, Mette; Karlson, Kristian Bernt

    2015-01-01

    of the following three components: outcome truncation, scale parameters and distributional shape of the predictor variable. These results point to limitations in using linear probability model coefficients for group comparisons. We also provide Monte Carlo simulations and real examples to illustrate......This article offers a formal identification analysis of the problem in comparing coefficients from linear probability models between groups. We show that differences in coefficients from these models can result not only from genuine differences in effects, but also from differences in one or more...... these limitations, and we suggest a restricted approach to using linear probability model coefficients in group comparisons....

  10. Approximate Forward Difference Equations for the Lower Order Non-Stationary Statistics of Geometrically Non-Linear Systems subject to Random Excitation

    DEFF Research Database (Denmark)

    Köylüoglu, H. U.; Nielsen, Søren R. K.; Cakmak, A. S.

    Geometrically non-linear multi-degree-of-freedom (MDOF) systems subject to random excitation are considered. New semi-analytical approximate forward difference equations for the lower order non-stationary statistical moments of the response are derived from the stochastic differential equations...... of motion, and, the accuracy of these equations is numerically investigated. For stationary excitations, the proposed method computes the stationary statistical moments of the response from the solution of non-linear algebraic equations....

  11. Linear and non-linear quantitative structure-activity relationship models on indole substitution patterns as inhibitors of HIV-1 attachment.

    Science.gov (United States)

    Nirouei, Mahyar; Ghasemi, Ghasem; Abdolmaleki, Parviz; Tavakoli, Abdolreza; Shariati, Shahab

    2012-06-01

    The antiviral drugs that inhibit human immunodeficiency virus (HIV) entry to the target cells are already in different phases of clinical trials. They prevent viral entry and have a highly specific mechanism of action with a low toxicity profile. Few QSAR studies have been performed on this group of inhibitors. This study was performed to develop a quantitative structure-activity relationship (QSAR) model of the biological activity of indole glyoxamide derivatives as inhibitors of the interaction between HIV glycoprotein gp120 and host cell CD4 receptors. Forty different indole glyoxamide derivatives were selected as a sample set and geometrically optimized using Gaussian 98W. Different combinations of multiple linear regression (MLR), genetic algorithms (GA) and artificial neural networks (ANN) were then utilized to construct the QSAR models. These models were also utilized to select the most efficient subsets of descriptors in a cross-validation procedure for non-linear log (1/EC50) prediction. The results that were obtained using GA-ANN were compared with MLR-MLR and MLR-ANN models. A high predictive ability was observed for the MLR, MLR-ANN and GA-ANN models, with root mean sum square errors (RMSE) of 0.99, 0.91 and 0.67, respectively (N = 40). In summary, machine learning methods were highly effective in designing QSAR models when compared to statistical method.

  12. Model Predictive Control for Linear Complementarity and Extended Linear Complementarity Systems

    Directory of Open Access Journals (Sweden)

    Bambang Riyanto

    2005-11-01

    Full Text Available In this paper, we propose model predictive control method for linear complementarity and extended linear complementarity systems by formulating optimization along prediction horizon as mixed integer quadratic program. Such systems contain interaction between continuous dynamics and discrete event systems, and therefore, can be categorized as hybrid systems. As linear complementarity and extended linear complementarity systems finds applications in different research areas, such as impact mechanical systems, traffic control and process control, this work will contribute to the development of control design method for those areas as well, as shown by three given examples.

  13. Introduction to Bayesian statistics

    CERN Document Server

    Bolstad, William M

    2017-01-01

    There is a strong upsurge in the use of Bayesian methods in applied statistical analysis, yet most introductory statistics texts only present frequentist methods. Bayesian statistics has many important advantages that students should learn about if they are going into fields where statistics will be used. In this Third Edition, four newly-added chapters address topics that reflect the rapid advances in the field of Bayesian staistics. The author continues to provide a Bayesian treatment of introductory statistical topics, such as scientific data gathering, discrete random variables, robust Bayesian methods, and Bayesian approaches to inferenfe cfor discrete random variables, bionomial proprotion, Poisson, normal mean, and simple linear regression. In addition, newly-developing topics in the field are presented in four new chapters: Bayesian inference with unknown mean and variance; Bayesian inference for Multivariate Normal mean vector; Bayesian inference for Multiple Linear RegressionModel; and Computati...

  14. Quantifying the astronomical contribution to Pleistocene climate change: A non-linear, statistical approach

    Science.gov (United States)

    Crucifix, Michel; Wilkinson, Richard; Carson, Jake; Preston, Simon; Alemeida, Carlos; Rougier, Jonathan

    2013-04-01

    The existence of an action of astronomical forcing on the Pleistocene climate is almost undisputed. However, quantifying this action is not straightforward. In particular, the phenomenon of deglaciation is generally interpreted as a manifestation of instability, which is typical of non-linear systems. As a consequence, explaining the Pleistocene climate record as the addition of an astronomical contribution and noise-as often done using harmonic analysis tools-is potentially deceptive. Rather, we advocate a methodology in which non-linear stochastic dynamical systems are calibrated on the Pleistocene climate record. The exercise, though, requires careful statistical reasoning and state-of-the-art techniques. In fact, the problem has been judged to be mathematically 'intractable and unsolved' and some pragmatism is justified. In order to illustrate the methodology we consider one dynamical system that potentially captures four dynamical features of the Pleistocene climate : the existence of a saddle-node bifurcation in at least one of its slow components, a time-scale separation between a slow and a fast component, the action of astronomical forcing, and the existence a stochastic contribution to the system dynamics. This model is obviously not the only possible representation of Pleistocene dynamics, but it encapsulates well enough both our theoretical and empirical knowledge into a very simple form to constitute a valid starting point. The purpose of this poster is to outline the practical challenges in calibrating such a model on paleoclimate observations. Just as in time series analysis, there is no one single and universal test or criteria that would demonstrate the validity of an approach. Several methods exist to calibrate the model and judgement develops by the confrontation of the results of the different methods. In particular, we consider here the Kalman filter variants, the Particle Monte-Carlo Markov Chain, and two other variants of Sequential Monte

  15. Effect of correlation on covariate selection in linear and nonlinear mixed effect models.

    Science.gov (United States)

    Bonate, Peter L

    2017-01-01

    The effect of correlation among covariates on covariate selection was examined with linear and nonlinear mixed effect models. Demographic covariates were extracted from the National Health and Nutrition Examination Survey III database. Concentration-time profiles were Monte Carlo simulated where only one covariate affected apparent oral clearance (CL/F). A series of univariate covariate population pharmacokinetic models was fit to the data and compared with the reduced model without covariate. The "best" covariate was identified using either the likelihood ratio test statistic or AIC. Weight and body surface area (calculated using Gehan and George equation, 1970) were highly correlated (r = 0.98). Body surface area was often selected as a better covariate than weight, sometimes as high as 1 in 5 times, when weight was the covariate used in the data generating mechanism. In a second simulation, parent drug concentration and three metabolites were simulated from a thorough QT study and used as covariates in a series of univariate linear mixed effects models of ddQTc interval prolongation. The covariate with the largest significant LRT statistic was deemed the "best" predictor. When the metabolite was formation-rate limited and only parent concentrations affected ddQTc intervals the metabolite was chosen as a better predictor as often as 1 in 5 times depending on the slope of the relationship between parent concentrations and ddQTc intervals. A correlated covariate can be chosen as being a better predictor than another covariate in a linear or nonlinear population analysis by sheer correlation These results explain why for the same drug different covariates may be identified in different analyses. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.

  16. Linear models of income patterns in consumer demand for foods and evaluation of its elasticity

    Directory of Open Access Journals (Sweden)

    Pavel Syrovátka

    2005-01-01

    Full Text Available The paper is focused on the use of the linear constructions for developing of Engel’s demand models in the field of the food-consumer demand. In the theoretical part of the paper, the linear approximations of this demand models are analysed on the bases of the linear interpolation. In the same part of this text, the hyperbolic elasticity function was defined for the linear Engel model. The behaviour of the hyperbolic elasticity function and its properties were consequently investigated too. The behaviour of the determined elasticity function was investigated according to the values of the intercept point and the direction parameter in the original linear Engel model. The obtained theoretical findings were tested using the real data of Czech Statistical Office. The developed linear Engel model was explicitly dynamised, because the achieved database was formed into the time series. With respect to the two variables definitions of the hyperbolic function in the theoretical part of the text, the determined dynamic model of the Engel demand for food was transformed into the form with parametric intercept point:ret* = At + 0.0946 · rmt*,where the values of absolute member are defined as:At = 1773.0973 + 9.3064 · t – 0.3023 · t2; (t = 1, 2, ... 32.The value of At in the parametric linear model of Engel consumer demand for food was during the observed period (1995–2002 always positive. Thus, the hyperbolic elasticity function achieved the elasticity coefficients from the interval:ηt ∈〈+0; +1.Within quantitative analysis of Engel demand for food in the Czech Republic during the given time period, it was founded, that income elasticity of food expenditures of the average Czech household was moved between +0.4080 and +0.4511. The Czech-household demand for food is thus income inelastic with the normal income reactions.

  17. Modelling binary data

    CERN Document Server

    Collett, David

    2002-01-01

    INTRODUCTION Some Examples The Scope of this Book Use of Statistical Software STATISTICAL INFERENCE FOR BINARY DATA The Binomial Distribution Inference about the Success Probability Comparison of Two Proportions Comparison of Two or More Proportions MODELS FOR BINARY AND BINOMIAL DATA Statistical Modelling Linear Models Methods of Estimation Fitting Linear Models to Binomial Data Models for Binomial Response Data The Linear Logistic Model Fitting the Linear Logistic Model to Binomial Data Goodness of Fit of a Linear Logistic Model Comparing Linear Logistic Models Linear Trend in Proportions Comparing Stimulus-Response Relationships Non-Convergence and Overfitting Some other Goodness of Fit Statistics Strategy for Model Selection Predicting a Binary Response Probability BIOASSAY AND SOME OTHER APPLICATIONS The Tolerance Distribution Estimating an Effective Dose Relative Potency Natural Response Non-Linear Logistic Regression Models Applications of the Complementary Log-Log Model MODEL CHECKING Definition of Re...

  18. Linear approximation model network and its formation via ...

    Indian Academy of Sciences (India)

    To overcome the deficiency of `local model network' (LMN) techniques, an alternative `linear approximation model' (LAM) network approach is proposed. Such a network models a nonlinear or practical system with multiple linear models fitted along operating trajectories, where individual models are simply networked ...

  19. Improving UWB-Based Localization in IoT Scenarios with Statistical Models of Distance Error.

    Science.gov (United States)

    Monica, Stefania; Ferrari, Gianluigi

    2018-05-17

    Interest in the Internet of Things (IoT) is rapidly increasing, as the number of connected devices is exponentially growing. One of the application scenarios envisaged for IoT technologies involves indoor localization and context awareness. In this paper, we focus on a localization approach that relies on a particular type of communication technology, namely Ultra Wide Band (UWB). UWB technology is an attractive choice for indoor localization, owing to its high accuracy. Since localization algorithms typically rely on estimated inter-node distances, the goal of this paper is to evaluate the improvement brought by a simple (linear) statistical model of the distance error. On the basis of an extensive experimental measurement campaign, we propose a general analytical framework, based on a Least Square (LS) method, to derive a novel statistical model for the range estimation error between a pair of UWB nodes. The proposed statistical model is then applied to improve the performance of a few illustrative localization algorithms in various realistic scenarios. The obtained experimental results show that the use of the proposed statistical model improves the accuracy of the considered localization algorithms with a reduction of the localization error up to 66%.

  20. A comparison of linear interpolation models for iterative CT reconstruction.

    Science.gov (United States)

    Hahn, Katharina; Schöndube, Harald; Stierstorfer, Karl; Hornegger, Joachim; Noo, Frédéric

    2016-12-01

    Recent reports indicate that model-based iterative reconstruction methods may improve image quality in computed tomography (CT). One difficulty with these methods is the number of options available to implement them, including the selection of the forward projection model and the penalty term. Currently, the literature is fairly scarce in terms of guidance regarding this selection step, whereas these options impact image quality. Here, the authors investigate the merits of three forward projection models that rely on linear interpolation: the distance-driven method, Joseph's method, and the bilinear method. The authors' selection is motivated by three factors: (1) in CT, linear interpolation is often seen as a suitable trade-off between discretization errors and computational cost, (2) the first two methods are popular with manufacturers, and (3) the third method enables assessing the importance of a key assumption in the other methods. One approach to evaluate forward projection models is to inspect their effect on discretized images, as well as the effect of their transpose on data sets, but significance of such studies is unclear since the matrix and its transpose are always jointly used in iterative reconstruction. Another approach is to investigate the models in the context they are used, i.e., together with statistical weights and a penalty term. Unfortunately, this approach requires the selection of a preferred objective function and does not provide clear information on features that are intrinsic to the model. The authors adopted the following two-stage methodology. First, the authors analyze images that progressively include components of the singular value decomposition of the model in a reconstructed image without statistical weights and penalty term. Next, the authors examine the impact of weights and penalty on observed differences. Image quality metrics were investigated for 16 different fan-beam imaging scenarios that enabled probing various aspects

  1. Direction of Effects in Multiple Linear Regression Models.

    Science.gov (United States)

    Wiedermann, Wolfgang; von Eye, Alexander

    2015-01-01

    Previous studies analyzed asymmetric properties of the Pearson correlation coefficient using higher than second order moments. These asymmetric properties can be used to determine the direction of dependence in a linear regression setting (i.e., establish which of two variables is more likely to be on the outcome side) within the framework of cross-sectional observational data. Extant approaches are restricted to the bivariate regression case. The present contribution extends the direction of dependence methodology to a multiple linear regression setting by analyzing distributional properties of residuals of competing multiple regression models. It is shown that, under certain conditions, the third central moments of estimated regression residuals can be used to decide upon direction of effects. In addition, three different approaches for statistical inference are discussed: a combined D'Agostino normality test, a skewness difference test, and a bootstrap difference test. Type I error and power of the procedures are assessed using Monte Carlo simulations, and an empirical example is provided for illustrative purposes. In the discussion, issues concerning the quality of psychological data, possible extensions of the proposed methods to the fourth central moment of regression residuals, and potential applications are addressed.

  2. Microscopic model for the non-linear fluctuating hydrodynamic of 4 He superfluid helium deduced by maximum entropy method

    International Nuclear Information System (INIS)

    Alvarez R, J.T.

    1998-01-01

    This thesis presents a microscopic model for the non-linear fluctuating hydrodynamic of superfluid helium ( 4 He), model developed by means of the Maximum Entropy Method (Maxent). In the chapter 1, it is demonstrated the necessity to developing a microscopic model for the fluctuating hydrodynamic of the superfluid helium, starting from to show a brief overview of the theories and experiments developed in order to explain the behavior of the superfluid helium. On the other hand, it is presented the Morozov heuristic method for the construction of the non-linear hydrodynamic fluctuating of simple fluid. Method that will be generalized for the construction of the non-linear fluctuating hydrodynamic of the superfluid helium. Besides, it is presented a brief summary of the content of the thesis. In the chapter 2, it is reproduced the construction of a Generalized Fokker-Planck equation, (GFP), for a distribution function associated with the coarse grained variables. Function defined with aid of a nonequilibrium statistical operator ρhut FP that is evaluated as Wigneris function through ρ CG obtained by Maxent. Later this equation of GFP is reduced to a non-linear local FP equation from considering a slow and Markov process in the coarse grained variables. In this equation appears a matrix D mn defined with a nonequilibrium coarse grained statistical operator ρhut CG , matrix elements are used in the construction of the non-linear fluctuating hydrodynamics equations of the superfluid helium. In the chapter 3, the Lagrange multipliers are evaluated for to determine ρhut CG by means of the local equilibrium statistical operator ρhut l -tilde with the hypothesis that the system presents small fluctuations. Also are determined the currents associated with the coarse grained variables and furthermore are evaluated the matrix elements D mn but with aid of a quasi equilibrium statistical operator ρhut qe instead of the local equilibrium operator ρhut l -tilde. Matrix

  3. Statistical modeling for degradation data

    CERN Document Server

    Lio, Yuhlong; Ng, Hon; Tsai, Tzong-Ru

    2017-01-01

    This book focuses on the statistical aspects of the analysis of degradation data. In recent years, degradation data analysis has come to play an increasingly important role in different disciplines such as reliability, public health sciences, and finance. For example, information on products’ reliability can be obtained by analyzing degradation data. In addition, statistical modeling and inference techniques have been developed on the basis of different degradation measures. The book brings together experts engaged in statistical modeling and inference, presenting and discussing important recent advances in degradation data analysis and related applications. The topics covered are timely and have considerable potential to impact both statistics and reliability engineering.

  4. Exclusion statistics and integrable models

    International Nuclear Information System (INIS)

    Mashkevich, S.

    1998-01-01

    The definition of exclusion statistics, as given by Haldane, allows for a statistical interaction between distinguishable particles (multi-species statistics). The thermodynamic quantities for such statistics ca be evaluated exactly. The explicit expressions for the cluster coefficients are presented. Furthermore, single-species exclusion statistics is realized in one-dimensional integrable models. The interesting questions of generalizing this correspondence onto the higher-dimensional and the multi-species cases remain essentially open

  5. Estimating the impact of mineral aerosols on crop yields in food insecure regions using statistical crop models

    Science.gov (United States)

    Hoffman, A.; Forest, C. E.; Kemanian, A.

    2016-12-01

    A significant number of food-insecure nations exist in regions of the world where dust plays a large role in the climate system. While the impacts of common climate variables (e.g. temperature, precipitation, ozone, and carbon dioxide) on crop yields are relatively well understood, the impact of mineral aerosols on yields have not yet been thoroughly investigated. This research aims to develop the data and tools to progress our understanding of mineral aerosol impacts on crop yields. Suspended dust affects crop yields by altering the amount and type of radiation reaching the plant, modifying local temperature and precipitation. While dust events (i.e. dust storms) affect crop yields by depleting the soil of nutrients or by defoliation via particle abrasion. The impact of dust on yields is modeled statistically because we are uncertain which impacts will dominate the response on national and regional scales considered in this study. Multiple linear regression is used in a number of large-scale statistical crop modeling studies to estimate yield responses to various climate variables. In alignment with previous work, we develop linear crop models, but build upon this simple method of regression with machine-learning techniques (e.g. random forests) to identify important statistical predictors and isolate how dust affects yields on the scales of interest. To perform this analysis, we develop a crop-climate dataset for maize, soybean, groundnut, sorghum, rice, and wheat for the regions of West Africa, East Africa, South Africa, and the Sahel. Random forest regression models consistently model historic crop yields better than the linear models. In several instances, the random forest models accurately capture the temperature and precipitation threshold behavior in crops. Additionally, improving agricultural technology has caused a well-documented positive trend that dominates time series of global and regional yields. This trend is often removed before regression with

  6. Stochastic or statistic? Comparing flow duration curve models in ungauged basins and changing climates

    Science.gov (United States)

    Müller, M. F.; Thompson, S. E.

    2015-09-01

    The prediction of flow duration curves (FDCs) in ungauged basins remains an important task for hydrologists given the practical relevance of FDCs for water management and infrastructure design. Predicting FDCs in ungauged basins typically requires spatial interpolation of statistical or model parameters. This task is complicated if climate becomes non-stationary, as the prediction challenge now also requires extrapolation through time. In this context, process-based models for FDCs that mechanistically link the streamflow distribution to climate and landscape factors may have an advantage over purely statistical methods to predict FDCs. This study compares a stochastic (process-based) and statistical method for FDC prediction in both stationary and non-stationary contexts, using Nepal as a case study. Under contemporary conditions, both models perform well in predicting FDCs, with Nash-Sutcliffe coefficients above 0.80 in 75 % of the tested catchments. The main drives of uncertainty differ between the models: parameter interpolation was the main source of error for the statistical model, while violations of the assumptions of the process-based model represented the main source of its error. The process-based approach performed better than the statistical approach in numerical simulations with non-stationary climate drivers. The predictions of the statistical method under non-stationary rainfall conditions were poor if (i) local runoff coefficients were not accurately determined from the gauge network, or (ii) streamflow variability was strongly affected by changes in rainfall. A Monte Carlo analysis shows that the streamflow regimes in catchments characterized by a strong wet-season runoff and a rapid, strongly non-linear hydrologic response are particularly sensitive to changes in rainfall statistics. In these cases, process-based prediction approaches are strongly favored over statistical models.

  7. Connection between perturbation theory, projection-operator techniques, and statistical linearization for nonlinear systems

    International Nuclear Information System (INIS)

    Budgor, A.B.; West, B.J.

    1978-01-01

    We employ the equivalence between Zwanzig's projection-operator formalism and perturbation theory to demonstrate that the approximate-solution technique of statistical linearization for nonlinear stochastic differential equations corresponds to the lowest-order β truncation in both the consolidated perturbation expansions and in the ''mass operator'' of a renormalized Green's function equation. Other consolidated equations can be obtained by selectively modifying this mass operator. We particularize the results of this paper to the Duffing anharmonic oscillator equation

  8. Some computer simulations based on the linear relative risk model

    International Nuclear Information System (INIS)

    Gilbert, E.S.

    1991-10-01

    This report presents the results of computer simulations designed to evaluate and compare the performance of the likelihood ratio statistic and the score statistic for making inferences about the linear relative risk mode. The work was motivated by data on workers exposed to low doses of radiation, and the report includes illustration of several procedures for obtaining confidence limits for the excess relative risk coefficient based on data from three studies of nuclear workers. The computer simulations indicate that with small sample sizes and highly skewed dose distributions, asymptotic approximations to the score statistic or to the likelihood ratio statistic may not be adequate. For testing the null hypothesis that the excess relative risk is equal to zero, the asymptotic approximation to the likelihood ratio statistic was adequate, but use of the asymptotic approximation to the score statistic rejected the null hypothesis too often. Frequently the likelihood was maximized at the lower constraint, and when this occurred, the asymptotic approximations for the likelihood ratio and score statistics did not perform well in obtaining upper confidence limits. The score statistic and likelihood ratio statistics were found to perform comparably in terms of power and width of the confidence limits. It is recommended that with modest sample sizes, confidence limits be obtained using computer simulations based on the score statistic. Although nuclear worker studies are emphasized in this report, its results are relevant for any study investigating linear dose-response functions with highly skewed exposure distributions. 22 refs., 14 tabs

  9. Nonlinear wave chaos: statistics of second harmonic fields.

    Science.gov (United States)

    Zhou, Min; Ott, Edward; Antonsen, Thomas M; Anlage, Steven M

    2017-10-01

    Concepts from the field of wave chaos have been shown to successfully predict the statistical properties of linear electromagnetic fields in electrically large enclosures. The Random Coupling Model (RCM) describes these properties by incorporating both universal features described by Random Matrix Theory and the system-specific features of particular system realizations. In an effort to extend this approach to the nonlinear domain, we add an active nonlinear frequency-doubling circuit to an otherwise linear wave chaotic system, and we measure the statistical properties of the resulting second harmonic fields. We develop an RCM-based model of this system as two linear chaotic cavities coupled by means of a nonlinear transfer function. The harmonic field strengths are predicted to be the product of two statistical quantities and the nonlinearity characteristics. Statistical results from measurement-based calculation, RCM-based simulation, and direct experimental measurements are compared and show good agreement over many decades of power.

  10. Heterotic sigma models and non-linear strings

    International Nuclear Information System (INIS)

    Hull, C.M.

    1986-01-01

    The two-dimensional supersymmetric non-linear sigma models are examined with respect to the heterotic string. The paper was presented at the workshop on :Supersymmetry and its applications', Cambridge, United Kingdom, 1985. The non-linear sigma model with Wess-Zumino-type term, the coupling of the fermionic superfields to the sigma model, super-conformal invariance, and the supersymmetric string, are all discussed. (U.K.)

  11. Comparison of linear and non-linear models for the adsorption of fluoride onto geo-material: limonite.

    Science.gov (United States)

    Sahin, Rubina; Tapadia, Kavita

    2015-01-01

    The three widely used isotherms Langmuir, Freundlich and Temkin were examined in an experiment using fluoride (F⁻) ion adsorption on a geo-material (limonite) at four different temperatures by linear and non-linear models. Comparison of linear and non-linear regression models were given in selecting the optimum isotherm for the experimental results. The coefficient of determination, r², was used to select the best theoretical isotherm. The four Langmuir linear equations (1, 2, 3, and 4) are discussed. Langmuir isotherm parameters obtained from the four Langmuir linear equations using the linear model differed but they were the same when using the nonlinear model. Langmuir-2 isotherm is one of the linear forms, and it had the highest coefficient of determination (r² = 0.99) compared to the other Langmuir linear equations (1, 3 and 4) in linear form, whereas, for non-linear, Langmuir-4 fitted best among all the isotherms because it had the highest coefficient of determination (r² = 0.99). The results showed that the non-linear model may be a better way to obtain the parameters. In the present work, the thermodynamic parameters show that the absorption of fluoride onto limonite is both spontaneous (ΔG 0). Scanning electron microscope and X-ray diffraction images also confirm the adsorption of F⁻ ion onto limonite. The isotherm and kinetic study reveals that limonite can be used as an adsorbent for fluoride removal. In future we can develop new technology for fluoride removal in large scale by using limonite which is cost-effective, eco-friendly and is easily available in the study area.

  12. Role of statistical linearization in the solution of nonlinear stochastic equations

    International Nuclear Information System (INIS)

    Budgor, A.B.

    1977-01-01

    The solution of a generalized Langevin equation is referred to as a stochastic process. If the external forcing function is Gaussian white noise, the forward Kolmogarov equation yields the transition probability density function. Nonlinear problems must be handled by approximation procedures e.g., perturbation theories, eigenfunction expansions, and nonlinear optimization procedures. After some comments on the first two of these, attention is directed to the third, and the method of statistical linearization is used to demonstrate a relation to the former two. Nonlinear stochastic systems exhibiting sustained or forced oscillations and the centered nonlinear Schroedinger equation in the presence of Gaussian white noise excitation are considered as examples. 5 figures, 2 tables

  13. The Impact of a Flipped Classroom Model of Learning on a Large Undergraduate Statistics Class

    Science.gov (United States)

    Nielson, Perpetua Lynne; Bean, Nathan William Bean; Larsen, Ross Allen Andrew

    2018-01-01

    We examine the impact of a flipped classroom model of learning on student performance and satisfaction in a large undergraduate introductory statistics class. Two professors each taught a lecture-section and a flipped-class section. Using MANCOVA, a linear combination of final exam scores, average quiz scores, and course ratings was compared for…

  14. Non linear viscoelastic models

    DEFF Research Database (Denmark)

    Agerkvist, Finn T.

    2011-01-01

    Viscoelastic eects are often present in loudspeaker suspensions, this can be seen in the displacement transfer function which often shows a frequency dependent value below the resonance frequency. In this paper nonlinear versions of the standard linear solid model (SLS) are investigated....... The simulations show that the nonlinear version of the Maxwell SLS model can result in a time dependent small signal stiness while the Kelvin Voight version does not....

  15. Genetic parameters for racing records in trotters using linear and generalized linear models.

    Science.gov (United States)

    Suontama, M; van der Werf, J H J; Juga, J; Ojala, M

    2012-09-01

    Heritability and repeatability and genetic and phenotypic correlations were estimated for trotting race records with linear and generalized linear models using 510,519 records on 17,792 Finnhorses and 513,161 records on 25,536 Standardbred trotters. Heritability and repeatability were estimated for single racing time and earnings traits with linear models, and logarithmic scale was used for racing time and fourth-root scale for earnings to correct for nonnormality. Generalized linear models with a gamma distribution were applied for single racing time and with a multinomial distribution for single earnings traits. In addition, genetic parameters for annual earnings were estimated with linear models on the observed and fourth-root scales. Racing success traits of single placings, winnings, breaking stride, and disqualifications were analyzed using generalized linear models with a binomial distribution. Estimates of heritability were greatest for racing time, which ranged from 0.32 to 0.34. Estimates of heritability were low for single earnings with all distributions, ranging from 0.01 to 0.09. Annual earnings were closer to normal distribution than single earnings. Heritability estimates were moderate for annual earnings on the fourth-root scale, 0.19 for Finnhorses and 0.27 for Standardbred trotters. Heritability estimates for binomial racing success variables ranged from 0.04 to 0.12, being greatest for winnings and least for breaking stride. Genetic correlations among racing traits were high, whereas phenotypic correlations were mainly low to moderate, except correlations between racing time and earnings were high. On the basis of a moderate heritability and moderate to high repeatability for racing time and annual earnings, selection of horses for these traits is effective when based on a few repeated records. Because of high genetic correlations, direct selection for racing time and annual earnings would also result in good genetic response in racing success.

  16. Generalised linear models for correlated pseudo-observations, with applications to multi-state models

    DEFF Research Database (Denmark)

    Andersen, Per Kragh; Klein, John P.; Rosthøj, Susanne

    2003-01-01

    Generalised estimating equation; Generalised linear model; Jackknife pseudo-value; Logistic regression; Markov Model; Multi-state model......Generalised estimating equation; Generalised linear model; Jackknife pseudo-value; Logistic regression; Markov Model; Multi-state model...

  17. Linear causal modeling with structural equations

    CERN Document Server

    Mulaik, Stanley A

    2009-01-01

    Emphasizing causation as a functional relationship between variables that describe objects, Linear Causal Modeling with Structural Equations integrates a general philosophical theory of causation with structural equation modeling (SEM) that concerns the special case of linear causal relations. In addition to describing how the functional relation concept may be generalized to treat probabilistic causation, the book reviews historical treatments of causation and explores recent developments in experimental psychology on studies of the perception of causation. It looks at how to perceive causal

  18. TENSOR DECOMPOSITIONS AND SPARSE LOG-LINEAR MODELS

    Science.gov (United States)

    Johndrow, James E.; Bhattacharya, Anirban; Dunson, David B.

    2017-01-01

    Contingency table analysis routinely relies on log-linear models, with latent structure analysis providing a common alternative. Latent structure models lead to a reduced rank tensor factorization of the probability mass function for multivariate categorical data, while log-linear models achieve dimensionality reduction through sparsity. Little is known about the relationship between these notions of dimensionality reduction in the two paradigms. We derive several results relating the support of a log-linear model to nonnegative ranks of the associated probability tensor. Motivated by these findings, we propose a new collapsed Tucker class of tensor decompositions, which bridge existing PARAFAC and Tucker decompositions, providing a more flexible framework for parsimoniously characterizing multivariate categorical data. Taking a Bayesian approach to inference, we illustrate empirical advantages of the new decompositions. PMID:29332971

  19. Statistical mechanics of competitive resource allocation using agent-based models

    Science.gov (United States)

    Chakraborti, Anirban; Challet, Damien; Chatterjee, Arnab; Marsili, Matteo; Zhang, Yi-Cheng; Chakrabarti, Bikas K.

    2015-01-01

    Demand outstrips available resources in most situations, which gives rise to competition, interaction and learning. In this article, we review a broad spectrum of multi-agent models of competition (El Farol Bar problem, Minority Game, Kolkata Paise Restaurant problem, Stable marriage problem, Parking space problem and others) and the methods used to understand them analytically. We emphasize the power of concepts and tools from statistical mechanics to understand and explain fully collective phenomena such as phase transitions and long memory, and the mapping between agent heterogeneity and physical disorder. As these methods can be applied to any large-scale model of competitive resource allocation made up of heterogeneous adaptive agent with non-linear interaction, they provide a prospective unifying paradigm for many scientific disciplines.

  20. Non-linear signal response functions and their effects on the statistical and noise cancellation properties of isotope ratio measurements by multi-collector plasma mass spectrometry

    International Nuclear Information System (INIS)

    Doherty, W.

    2013-01-01

    A nebulizer-centric response function model of the analytical inductively coupled argon plasma ion source was used to investigate the statistical frequency distributions and noise reduction factors of simultaneously measured flicker noise limited isotope ion signals and their ratios. The response function model was extended by assuming i) a single gaussian distributed random noise source (nebulizer gas pressure fluctuations) and ii) the isotope ion signal response is a parabolic function of the nebulizer gas pressure. Model calculations of ion signal and signal ratio histograms were obtained by applying the statistical method of translation to the non-linear response function model of the plasma. Histograms of Ni, Cu, Pr, Tl and Pb isotope ion signals measured using a multi-collector plasma mass spectrometer were, without exception, negative skew. Histograms of the corresponding isotope ratios of Ni, Cu, Tl and Pb were either positive or negative skew. There was a complete agreement between the measured and model calculated histogram skew properties. The nebulizer-centric response function model was also used to investigate the effect of non-linear response functions on the effectiveness of noise cancellation by signal division. An alternative noise correction procedure suitable for parabolic signal response functions was derived and applied to measurements of isotope ratios of Cu, Ni, Pb and Tl. The largest noise reduction factors were always obtained when the non-linearity of the response functions was taken into account by the isotope ratio calculation. Possible applications of the nebulizer-centric response function model to other types of analytical instrumentation, large amplitude signal noise sources (e.g., lasers, pumped nebulizers) and analytical error in isotope ratio measurements by multi-collector plasma mass spectrometry are discussed. - Highlights: ► Isotope ion signal noise is modelled as a parabolic transform of a gaussian variable. ► Flicker

  1. On D-branes from gauged linear sigma models

    International Nuclear Information System (INIS)

    Govindarajan, S.; Jayaraman, T.; Sarkar, T.

    2001-01-01

    We study both A-type and B-type D-branes in the gauged linear sigma model by considering worldsheets with boundary. The boundary conditions on the matter and vector multiplet fields are first considered in the large-volume phase/non-linear sigma model limit of the corresponding Calabi-Yau manifold, where we find that we need to add a contact term on the boundary. These considerations enable to us to derive the boundary conditions in the full gauged linear sigma model, including the addition of the appropriate boundary contact terms, such that these boundary conditions have the correct non-linear sigma model limit. Most of the analysis is for the case of Calabi-Yau manifolds with one Kaehler modulus (including those corresponding to hypersurfaces in weighted projective space), though we comment on possible generalisations

  2. New applications of statistical tools in plant pathology.

    Science.gov (United States)

    Garrett, K A; Madden, L V; Hughes, G; Pfender, W F

    2004-09-01

    ABSTRACT The series of papers introduced by this one address a range of statistical applications in plant pathology, including survival analysis, nonparametric analysis of disease associations, multivariate analyses, neural networks, meta-analysis, and Bayesian statistics. Here we present an overview of additional applications of statistics in plant pathology. An analysis of variance based on the assumption of normally distributed responses with equal variances has been a standard approach in biology for decades. Advances in statistical theory and computation now make it convenient to appropriately deal with discrete responses using generalized linear models, with adjustments for overdispersion as needed. New nonparametric approaches are available for analysis of ordinal data such as disease ratings. Many experiments require the use of models with fixed and random effects for data analysis. New or expanded computing packages, such as SAS PROC MIXED, coupled with extensive advances in statistical theory, allow for appropriate analyses of normally distributed data using linear mixed models, and discrete data with generalized linear mixed models. Decision theory offers a framework in plant pathology for contexts such as the decision about whether to apply or withhold a treatment. Model selection can be performed using Akaike's information criterion. Plant pathologists studying pathogens at the population level have traditionally been the main consumers of statistical approaches in plant pathology, but new technologies such as microarrays supply estimates of gene expression for thousands of genes simultaneously and present challenges for statistical analysis. Applications to the study of the landscape of the field and of the genome share the risk of pseudoreplication, the problem of determining the appropriate scale of the experimental unit and of obtaining sufficient replication at that scale.

  3. Markov and semi-Markov switching linear mixed models used to identify forest tree growth components.

    Science.gov (United States)

    Chaubert-Pereira, Florence; Guédon, Yann; Lavergne, Christian; Trottier, Catherine

    2010-09-01

    Tree growth is assumed to be mainly the result of three components: (i) an endogenous component assumed to be structured as a succession of roughly stationary phases separated by marked change points that are asynchronous among individuals, (ii) a time-varying environmental component assumed to take the form of synchronous fluctuations among individuals, and (iii) an individual component corresponding mainly to the local environment of each tree. To identify and characterize these three components, we propose to use semi-Markov switching linear mixed models, i.e., models that combine linear mixed models in a semi-Markovian manner. The underlying semi-Markov chain represents the succession of growth phases and their lengths (endogenous component) whereas the linear mixed models attached to each state of the underlying semi-Markov chain represent-in the corresponding growth phase-both the influence of time-varying climatic covariates (environmental component) as fixed effects, and interindividual heterogeneity (individual component) as random effects. In this article, we address the estimation of Markov and semi-Markov switching linear mixed models in a general framework. We propose a Monte Carlo expectation-maximization like algorithm whose iterations decompose into three steps: (i) sampling of state sequences given random effects, (ii) prediction of random effects given state sequences, and (iii) maximization. The proposed statistical modeling approach is illustrated by the analysis of successive annual shoots along Corsican pine trunks influenced by climatic covariates. © 2009, The International Biometric Society.

  4. Is There a Critical Distance for Fickian Transport? - a Statistical Approach to Sub-Fickian Transport Modelling in Porous Media

    Science.gov (United States)

    Most, S.; Nowak, W.; Bijeljic, B.

    2014-12-01

    Transport processes in porous media are frequently simulated as particle movement. This process can be formulated as a stochastic process of particle position increments. At the pore scale, the geometry and micro-heterogeneities prohibit the commonly made assumption of independent and normally distributed increments to represent dispersion. Many recent particle methods seek to loosen this assumption. Recent experimental data suggest that we have not yet reached the end of the need to generalize, because particle increments show statistical dependency beyond linear correlation and over many time steps. The goal of this work is to better understand the validity regions of commonly made assumptions. We are investigating after what transport distances can we observe: A statistical dependence between increments, that can be modelled as an order-k Markov process, boils down to order 1. This would be the Markovian distance for the process, where the validity of yet-unexplored non-Gaussian-but-Markovian random walks would start. A bivariate statistical dependence that simplifies to a multi-Gaussian dependence based on simple linear correlation (validity of correlated PTRW). Complete absence of statistical dependence (validity of classical PTRW/CTRW). The approach is to derive a statistical model for pore-scale transport from a powerful experimental data set via copula analysis. The model is formulated as a non-Gaussian, mutually dependent Markov process of higher order, which allows us to investigate the validity ranges of simpler models.

  5. The Accuracy and Reproducibility of Linear Measurements Made on CBCT-derived Digital Models.

    Science.gov (United States)

    Maroua, Ahmad L; Ajaj, Mowaffak; Hajeer, Mohammad Y

    2016-04-01

    To evaluate the accuracy and reproducibility of linear measurements made on cone-beam computed tomography (CBCT)-derived digital models. A total of 25 patients (44% female, 18.7 ± 4 years) who had CBCT images for diagnostic purposes were included. Plaster models were obtained and digital models were extracted from CBCT scans. Seven linear measurements from predetermined landmarks were measured and analyzed on plaster models and the corresponding digital models. The measurements included arch length and width at different sites. Paired t test and Bland-Altman analysis were used to evaluate the accuracy of measurements on digital models compared to the plaster models. Also, intraclass correlation coefficients (ICCs) were used to evaluate the reproducibility of the measurements in order to assess the intraobserver reliability. The statistical analysis showed significant differences on 5 out of 14 variables, and the mean differences ranged from -0.48 to 0.51 mm. The Bland-Altman analysis revealed that the mean difference between variables was (0.14 ± 0.56) and (0.05 ± 0.96) mm and limits of agreement between the two methods ranged from -1.2 to 0.96 and from -1.8 to 1.9 mm in the maxilla and the mandible, respectively. The intraobserver reliability values were determined for all 14 variables of two types of models separately. The mean ICC value for the plaster models was 0.984 (0.924-0.999), while it was 0.946 for the CBCT models (range from 0.850 to 0.985). Linear measurements obtained from the CBCT-derived models appeared to have a high level of accuracy and reproducibility.

  6. Statistics for Finance

    DEFF Research Database (Denmark)

    Lindström, Erik; Madsen, Henrik; Nielsen, Jan Nygaard

    Statistics for Finance develops students’ professional skills in statistics with applications in finance. Developed from the authors’ courses at the Technical University of Denmark and Lund University, the text bridges the gap between classical, rigorous treatments of financial mathematics...... that rarely connect concepts to data and books on econometrics and time series analysis that do not cover specific problems related to option valuation. The book discusses applications of financial derivatives pertaining to risk assessment and elimination. The authors cover various statistical...... and mathematical techniques, including linear and nonlinear time series analysis, stochastic calculus models, stochastic differential equations, Itō’s formula, the Black–Scholes model, the generalized method-of-moments, and the Kalman filter. They explain how these tools are used to price financial derivatives...

  7. Modeling Linguistic Variables With Regression Models: Addressing Non-Gaussian Distributions, Non-independent Observations, and Non-linear Predictors With Random Effects and Generalized Additive Models for Location, Scale, and Shape

    Directory of Open Access Journals (Sweden)

    Christophe Coupé

    2018-04-01

    Full Text Available As statistical approaches are getting increasingly used in linguistics, attention must be paid to the choice of methods and algorithms used. This is especially true since they require assumptions to be satisfied to provide valid results, and because scientific articles still often fall short of reporting whether such assumptions are met. Progress is being, however, made in various directions, one of them being the introduction of techniques able to model data that cannot be properly analyzed with simpler linear regression models. We report recent advances in statistical modeling in linguistics. We first describe linear mixed-effects regression models (LMM, which address grouping of observations, and generalized linear mixed-effects models (GLMM, which offer a family of distributions for the dependent variable. Generalized additive models (GAM are then introduced, which allow modeling non-linear parametric or non-parametric relationships between the dependent variable and the predictors. We then highlight the possibilities offered by generalized additive models for location, scale, and shape (GAMLSS. We explain how they make it possible to go beyond common distributions, such as Gaussian or Poisson, and offer the appropriate inferential framework to account for ‘difficult’ variables such as count data with strong overdispersion. We also demonstrate how they offer interesting perspectives on data when not only the mean of the dependent variable is modeled, but also its variance, skewness, and kurtosis. As an illustration, the case of phonemic inventory size is analyzed throughout the article. For over 1,500 languages, we consider as predictors the number of speakers, the distance from Africa, an estimation of the intensity of language contact, and linguistic relationships. We discuss the use of random effects to account for genealogical relationships, the choice of appropriate distributions to model count data, and non-linear relationships

  8. Modeling Linguistic Variables With Regression Models: Addressing Non-Gaussian Distributions, Non-independent Observations, and Non-linear Predictors With Random Effects and Generalized Additive Models for Location, Scale, and Shape.

    Science.gov (United States)

    Coupé, Christophe

    2018-01-01

    As statistical approaches are getting increasingly used in linguistics, attention must be paid to the choice of methods and algorithms used. This is especially true since they require assumptions to be satisfied to provide valid results, and because scientific articles still often fall short of reporting whether such assumptions are met. Progress is being, however, made in various directions, one of them being the introduction of techniques able to model data that cannot be properly analyzed with simpler linear regression models. We report recent advances in statistical modeling in linguistics. We first describe linear mixed-effects regression models (LMM), which address grouping of observations, and generalized linear mixed-effects models (GLMM), which offer a family of distributions for the dependent variable. Generalized additive models (GAM) are then introduced, which allow modeling non-linear parametric or non-parametric relationships between the dependent variable and the predictors. We then highlight the possibilities offered by generalized additive models for location, scale, and shape (GAMLSS). We explain how they make it possible to go beyond common distributions, such as Gaussian or Poisson, and offer the appropriate inferential framework to account for 'difficult' variables such as count data with strong overdispersion. We also demonstrate how they offer interesting perspectives on data when not only the mean of the dependent variable is modeled, but also its variance, skewness, and kurtosis. As an illustration, the case of phonemic inventory size is analyzed throughout the article. For over 1,500 languages, we consider as predictors the number of speakers, the distance from Africa, an estimation of the intensity of language contact, and linguistic relationships. We discuss the use of random effects to account for genealogical relationships, the choice of appropriate distributions to model count data, and non-linear relationships. Relying on GAMLSS, we

  9. Estimation of the limit of detection in semiconductor gas sensors through linearized calibration models.

    Science.gov (United States)

    Burgués, Javier; Jiménez-Soto, Juan Manuel; Marco, Santiago

    2018-07-12

    The limit of detection (LOD) is a key figure of merit in chemical sensing. However, the estimation of this figure of merit is hindered by the non-linear calibration curve characteristic of semiconductor gas sensor technologies such as, metal oxide (MOX), gasFETs or thermoelectric sensors. Additionally, chemical sensors suffer from cross-sensitivities and temporal stability problems. The application of the International Union of Pure and Applied Chemistry (IUPAC) recommendations for univariate LOD estimation in non-linear semiconductor gas sensors is not straightforward due to the strong statistical requirements of the IUPAC methodology (linearity, homoscedasticity, normality). Here, we propose a methodological approach to LOD estimation through linearized calibration models. As an example, the methodology is applied to the detection of low concentrations of carbon monoxide using MOX gas sensors in a scenario where the main source of error is the presence of uncontrolled levels of humidity. Copyright © 2018 Elsevier B.V. All rights reserved.

  10. Generalized functional linear models for gene-based case-control association studies.

    Science.gov (United States)

    Fan, Ruzong; Wang, Yifan; Mills, James L; Carter, Tonia C; Lobach, Iryna; Wilson, Alexander F; Bailey-Wilson, Joan E; Weeks, Daniel E; Xiong, Momiao

    2014-11-01

    By using functional data analysis techniques, we developed generalized functional linear models for testing association between a dichotomous trait and multiple genetic variants in a genetic region while adjusting for covariates. Both fixed and mixed effect models are developed and compared. Extensive simulations show that Rao's efficient score tests of the fixed effect models are very conservative since they generate lower type I errors than nominal levels, and global tests of the mixed effect models generate accurate type I errors. Furthermore, we found that the Rao's efficient score test statistics of the fixed effect models have higher power than the sequence kernel association test (SKAT) and its optimal unified version (SKAT-O) in most cases when the causal variants are both rare and common. When the causal variants are all rare (i.e., minor allele frequencies less than 0.03), the Rao's efficient score test statistics and the global tests have similar or slightly lower power than SKAT and SKAT-O. In practice, it is not known whether rare variants or common variants in a gene region are disease related. All we can assume is that a combination of rare and common variants influences disease susceptibility. Thus, the improved performance of our models when the causal variants are both rare and common shows that the proposed models can be very useful in dissecting complex traits. We compare the performance of our methods with SKAT and SKAT-O on real neural tube defects and Hirschsprung's disease datasets. The Rao's efficient score test statistics and the global tests are more sensitive than SKAT and SKAT-O in the real data analysis. Our methods can be used in either gene-disease genome-wide/exome-wide association studies or candidate gene analyses. © 2014 WILEY PERIODICALS, INC.

  11. Decomposable log-linear models

    DEFF Research Database (Denmark)

    Eriksen, Poul Svante

    can be characterized by a structured set of conditional independencies between some variables given some other variables. We term the new model class decomposable log-linear models, which is illustrated to be a much richer class than decomposable graphical models.It covers a wide range of non...... The present paper considers discrete probability models with exact computational properties. In relation to contingency tables this means closed form expressions of the maksimum likelihood estimate and its distribution. The model class includes what is known as decomposable graphicalmodels, which......-hierarchical models, models with structural zeroes, models described by quasi independence and models for level merging. Also, they have a very natural interpretation as they may be formulated by a structured set of conditional independencies between two events given some other event. In relation to contingency...

  12. Modeling digital switching circuits with linear algebra

    CERN Document Server

    Thornton, Mitchell A

    2014-01-01

    Modeling Digital Switching Circuits with Linear Algebra describes an approach for modeling digital information and circuitry that is an alternative to Boolean algebra. While the Boolean algebraic model has been wildly successful and is responsible for many advances in modern information technology, the approach described in this book offers new insight and different ways of solving problems. Modeling the bit as a vector instead of a scalar value in the set {0, 1} allows digital circuits to be characterized with transfer functions in the form of a linear transformation matrix. The use of transf

  13. Classical model of intermediate statistics

    International Nuclear Information System (INIS)

    Kaniadakis, G.

    1994-01-01

    In this work we present a classical kinetic model of intermediate statistics. In the case of Brownian particles we show that the Fermi-Dirac (FD) and Bose-Einstein (BE) distributions can be obtained, just as the Maxwell-Boltzmann (MD) distribution, as steady states of a classical kinetic equation that intrinsically takes into account an exclusion-inclusion principle. In our model the intermediate statistics are obtained as steady states of a system of coupled nonlinear kinetic equations, where the coupling constants are the transmutational potentials η κκ' . We show that, besides the FD-BE intermediate statistics extensively studied from the quantum point of view, we can also study the MB-FD and MB-BE ones. Moreover, our model allows us to treat the three-state mixing FD-MB-BE intermediate statistics. For boson and fermion mixing in a D-dimensional space, we obtain a family of FD-BE intermediate statistics by varying the transmutational potential η BF . This family contains, as a particular case when η BF =0, the quantum statistics recently proposed by L. Wu, Z. Wu, and J. Sun [Phys. Lett. A 170, 280 (1992)]. When we consider the two-dimensional FD-BE statistics, we derive an analytic expression of the fraction of fermions. When the temperature T→∞, the system is composed by an equal number of bosons and fermions, regardless of the value of η BF . On the contrary, when T=0, η BF becomes important and, according to its value, the system can be completely bosonic or fermionic, or composed both by bosons and fermions

  14. Noise limitations in optical linear algebra processors.

    Science.gov (United States)

    Batsell, S G; Jong, T L; Walkup, J F; Krile, T F

    1990-05-10

    A general statistical noise model is presented for optical linear algebra processors. A statistical analysis which includes device noise, the multiplication process, and the addition operation is undertaken. We focus on those processes which are architecturally independent. Finally, experimental results which verify the analytical predictions are also presented.

  15. Non-linear Growth Models in Mplus and SAS

    Science.gov (United States)

    Grimm, Kevin J.; Ram, Nilam

    2013-01-01

    Non-linear growth curves or growth curves that follow a specified non-linear function in time enable researchers to model complex developmental patterns with parameters that are easily interpretable. In this paper we describe how a variety of sigmoid curves can be fit using the Mplus structural modeling program and the non-linear mixed-effects modeling procedure NLMIXED in SAS. Using longitudinal achievement data collected as part of a study examining the effects of preschool instruction on academic gain we illustrate the procedures for fitting growth models of logistic, Gompertz, and Richards functions. Brief notes regarding the practical benefits, limitations, and choices faced in the fitting and estimation of such models are included. PMID:23882134

  16. Variance Function Partially Linear Single-Index Models1.

    Science.gov (United States)

    Lian, Heng; Liang, Hua; Carroll, Raymond J

    2015-01-01

    We consider heteroscedastic regression models where the mean function is a partially linear single index model and the variance function depends upon a generalized partially linear single index model. We do not insist that the variance function depend only upon the mean function, as happens in the classical generalized partially linear single index model. We develop efficient and practical estimation methods for the variance function and for the mean function. Asymptotic theory for the parametric and nonparametric parts of the model is developed. Simulations illustrate the results. An empirical example involving ozone levels is used to further illustrate the results, and is shown to be a case where the variance function does not depend upon the mean function.

  17. Predicting oropharyngeal tumor volume throughout the course of radiation therapy from pretreatment computed tomography data using general linear models.

    Science.gov (United States)

    Yock, Adam D; Rao, Arvind; Dong, Lei; Beadle, Beth M; Garden, Adam S; Kudchadker, Rajat J; Court, Laurence E

    2014-05-01

    The purpose of this work was to develop and evaluate the accuracy of several predictive models of variation in tumor volume throughout the course of radiation therapy. Nineteen patients with oropharyngeal cancers were imaged daily with CT-on-rails for image-guided alignment per an institutional protocol. The daily volumes of 35 tumors in these 19 patients were determined and used to generate (1) a linear model in which tumor volume changed at a constant rate, (2) a general linear model that utilized the power fit relationship between the daily and initial tumor volumes, and (3) a functional general linear model that identified and exploited the primary modes of variation between time series describing the changing tumor volumes. Primary and nodal tumor volumes were examined separately. The accuracy of these models in predicting daily tumor volumes were compared with those of static and linear reference models using leave-one-out cross-validation. In predicting the daily volume of primary tumors, the general linear model and the functional general linear model were more accurate than the static reference model by 9.9% (range: -11.6%-23.8%) and 14.6% (range: -7.3%-27.5%), respectively, and were more accurate than the linear reference model by 14.2% (range: -6.8%-40.3%) and 13.1% (range: -1.5%-52.5%), respectively. In predicting the daily volume of nodal tumors, only the 14.4% (range: -11.1%-20.5%) improvement in accuracy of the functional general linear model compared to the static reference model was statistically significant. A general linear model and a functional general linear model trained on data from a small population of patients can predict the primary tumor volume throughout the course of radiation therapy with greater accuracy than standard reference models. These more accurate models may increase the prognostic value of information about the tumor garnered from pretreatment computed tomography images and facilitate improved treatment management.

  18. Predicting oropharyngeal tumor volume throughout the course of radiation therapy from pretreatment computed tomography data using general linear models

    International Nuclear Information System (INIS)

    Yock, Adam D.; Kudchadker, Rajat J.; Rao, Arvind; Dong, Lei; Beadle, Beth M.; Garden, Adam S.; Court, Laurence E.

    2014-01-01

    Purpose: The purpose of this work was to develop and evaluate the accuracy of several predictive models of variation in tumor volume throughout the course of radiation therapy. Methods: Nineteen patients with oropharyngeal cancers were imaged daily with CT-on-rails for image-guided alignment per an institutional protocol. The daily volumes of 35 tumors in these 19 patients were determined and used to generate (1) a linear model in which tumor volume changed at a constant rate, (2) a general linear model that utilized the power fit relationship between the daily and initial tumor volumes, and (3) a functional general linear model that identified and exploited the primary modes of variation between time series describing the changing tumor volumes. Primary and nodal tumor volumes were examined separately. The accuracy of these models in predicting daily tumor volumes were compared with those of static and linear reference models using leave-one-out cross-validation. Results: In predicting the daily volume of primary tumors, the general linear model and the functional general linear model were more accurate than the static reference model by 9.9% (range: −11.6%–23.8%) and 14.6% (range: −7.3%–27.5%), respectively, and were more accurate than the linear reference model by 14.2% (range: −6.8%–40.3%) and 13.1% (range: −1.5%–52.5%), respectively. In predicting the daily volume of nodal tumors, only the 14.4% (range: −11.1%–20.5%) improvement in accuracy of the functional general linear model compared to the static reference model was statistically significant. Conclusions: A general linear model and a functional general linear model trained on data from a small population of patients can predict the primary tumor volume throughout the course of radiation therapy with greater accuracy than standard reference models. These more accurate models may increase the prognostic value of information about the tumor garnered from pretreatment computed tomography

  19. Probing NWP model deficiencies by statistical postprocessing

    DEFF Research Database (Denmark)

    Rosgaard, Martin Haubjerg; Nielsen, Henrik Aalborg; Nielsen, Torben S.

    2016-01-01

    The objective in this article is twofold. On one hand, a Model Output Statistics (MOS) framework for improved wind speed forecast accuracy is described and evaluated. On the other hand, the approach explored identifies unintuitive explanatory value from a diagnostic variable in an operational....... Based on the statistical model candidates inferred from the data, the lifted index NWP model diagnostic is consistently found among the NWP model predictors of the best performing statistical models across sites....

  20. Comments on a time-dependent version of the linear-quadratic model

    International Nuclear Information System (INIS)

    Tucker, S.L.; Travis, E.L.

    1990-01-01

    The accuracy and interpretation of the 'LQ + time' model are discussed. Evidence is presented, based on data in the literature, that this model does not accurately describe the changes in isoeffect dose occurring with protraction of the overall treatment time during fractionated irradiation of the lung. This lack of fit of the model explains, in part, the surprisingly large values of γ/α that have been derived from experimental lung data. The large apparent time factors for lung suggested by the model are also partly explained by the fact that γT/α, despite having units of dose, actually measures the influence of treatment time on the effect scale, not the dose scale, and is shown to consistently overestimate the change in total dose. The unusually high values of α/β that have been derived for lung using the model are shown to be influenced by the method by which the model was fitted to data. Reanalyses of the data using a more statistically valid regression procedure produce estimates of α/β more typical of those usually cited for lung. Most importantly, published isoeffect data from lung indicate that the true deviation from the linear-quadratic (LQ) model is nonlinear in time, instead of linear, and also depends on other factors such as the effect level and the size of dose per fraction. Thus, the authors do not advocate the use of the 'LQ + time' expression as a general isoeffect model. (author). 32 refs.; 3 figs.; 1 tab

  1. Linear latent variable models: the lava-package

    DEFF Research Database (Denmark)

    Holst, Klaus Kähler; Budtz-Jørgensen, Esben

    2013-01-01

    are implemented including robust standard errors for clustered correlated data, multigroup analyses, non-linear parameter constraints, inference with incomplete data, maximum likelihood estimation with censored and binary observations, and instrumental variable estimators. In addition an extensive simulation......An R package for specifying and estimating linear latent variable models is presented. The philosophy of the implementation is to separate the model specification from the actual data, which leads to a dynamic and easy way of modeling complex hierarchical structures. Several advanced features...

  2. Statistical Model of Extreme Shear

    DEFF Research Database (Denmark)

    Hansen, Kurt Schaldemose; Larsen, Gunner Chr.

    2005-01-01

    In order to continue cost-optimisation of modern large wind turbines, it is important to continuously increase the knowledge of wind field parameters relevant to design loads. This paper presents a general statistical model that offers site-specific prediction of the probability density function...... by a model that, on a statistically consistent basis, describes the most likely spatial shape of an extreme wind shear event. Predictions from the model have been compared with results from an extreme value data analysis, based on a large number of full-scale measurements recorded with a high sampling rate...

  3. Aspects of statistical model for multifragmentation

    International Nuclear Information System (INIS)

    Bhattacharyya, P.; Das Gupta, S.; Mekjian, A. Z.

    1999-01-01

    We deal with two different aspects of an exactly soluble statistical model of fragmentation. First we show, using zero range force and finite temperature Thomas-Fermi theory, that a common link can be found between finite temperature mean field theory and the statistical fragmentation model. We show the latter naturally arises in the spinodal region. Next we show that although the exact statistical model is a canonical model and uses temperature, microcanonical results which use constant energy rather than constant temperature can also be obtained from the canonical model using saddle-point approximation. The methodology is extremely simple to implement and at least in all the examples studied in this work is very accurate. (c) 1999 The American Physical Society

  4. Statistical Compression for Climate Model Output

    Science.gov (United States)

    Hammerling, D.; Guinness, J.; Soh, Y. J.

    2017-12-01

    Numerical climate model simulations run at high spatial and temporal resolutions generate massive quantities of data. As our computing capabilities continue to increase, storing all of the data is not sustainable, and thus is it important to develop methods for representing the full datasets by smaller compressed versions. We propose a statistical compression and decompression algorithm based on storing a set of summary statistics as well as a statistical model describing the conditional distribution of the full dataset given the summary statistics. We decompress the data by computing conditional expectations and conditional simulations from the model given the summary statistics. Conditional expectations represent our best estimate of the original data but are subject to oversmoothing in space and time. Conditional simulations introduce realistic small-scale noise so that the decompressed fields are neither too smooth nor too rough compared with the original data. Considerable attention is paid to accurately modeling the original dataset-one year of daily mean temperature data-particularly with regard to the inherent spatial nonstationarity in global fields, and to determining the statistics to be stored, so that the variation in the original data can be closely captured, while allowing for fast decompression and conditional emulation on modest computers.

  5. Statistical monitoring of linear antenna arrays

    KAUST Repository

    Harrou, Fouzi; Sun, Ying

    2016-01-01

    The paper concerns the problem of monitoring linear antenna arrays using the generalized likelihood ratio (GLR) test. When an abnormal event (fault) affects an array of antenna elements, the radiation pattern changes and significant deviation from

  6. Probabilistic model of ligaments and tendons: Quasistatic linear stretching

    Science.gov (United States)

    Bontempi, M.

    2009-03-01

    Ligaments and tendons have a significant role in the musculoskeletal system and are frequently subjected to injury. This study presents a model of collagen fibers, based on the study of a statistical distribution of fibers when they are subjected to quasistatic linear stretching. With respect to other methodologies, this model is able to describe the behavior of the bundle using less ad hoc hypotheses and is able to describe all the quasistatic stretch-load responses of the bundle, including the yield and failure regions described in the literature. It has two other important results: the first is that it is able to correlate the mechanical behavior of the bundle with its internal structure, and it suggests a methodology to deduce the fibers population distribution directly from the tensile-test data. The second is that it can follow fibers’ structure evolution during the stretching and it is possible to study the internal adaptation of fibers in physiological and pathological conditions.

  7. Automated statistical modeling of analytical measurement systems

    International Nuclear Information System (INIS)

    Jacobson, J.J.

    1992-01-01

    The statistical modeling of analytical measurement systems at the Idaho Chemical Processing Plant (ICPP) has been completely automated through computer software. The statistical modeling of analytical measurement systems is one part of a complete quality control program used by the Remote Analytical Laboratory (RAL) at the ICPP. The quality control program is an integration of automated data input, measurement system calibration, database management, and statistical process control. The quality control program and statistical modeling program meet the guidelines set forth by the American Society for Testing Materials and American National Standards Institute. A statistical model is a set of mathematical equations describing any systematic bias inherent in a measurement system and the precision of a measurement system. A statistical model is developed from data generated from the analysis of control standards. Control standards are samples which are made up at precise known levels by an independent laboratory and submitted to the RAL. The RAL analysts who process control standards do not know the values of those control standards. The object behind statistical modeling is to describe real process samples in terms of their bias and precision and, to verify that a measurement system is operating satisfactorily. The processing of control standards gives us this ability

  8. Comparing statistical and process-based flow duration curve models in ungauged basins and changing rain regimes

    Science.gov (United States)

    Müller, M. F.; Thompson, S. E.

    2016-02-01

    The prediction of flow duration curves (FDCs) in ungauged basins remains an important task for hydrologists given the practical relevance of FDCs for water management and infrastructure design. Predicting FDCs in ungauged basins typically requires spatial interpolation of statistical or model parameters. This task is complicated if climate becomes non-stationary, as the prediction challenge now also requires extrapolation through time. In this context, process-based models for FDCs that mechanistically link the streamflow distribution to climate and landscape factors may have an advantage over purely statistical methods to predict FDCs. This study compares a stochastic (process-based) and statistical method for FDC prediction in both stationary and non-stationary contexts, using Nepal as a case study. Under contemporary conditions, both models perform well in predicting FDCs, with Nash-Sutcliffe coefficients above 0.80 in 75 % of the tested catchments. The main drivers of uncertainty differ between the models: parameter interpolation was the main source of error for the statistical model, while violations of the assumptions of the process-based model represented the main source of its error. The process-based approach performed better than the statistical approach in numerical simulations with non-stationary climate drivers. The predictions of the statistical method under non-stationary rainfall conditions were poor if (i) local runoff coefficients were not accurately determined from the gauge network, or (ii) streamflow variability was strongly affected by changes in rainfall. A Monte Carlo analysis shows that the streamflow regimes in catchments characterized by frequent wet-season runoff and a rapid, strongly non-linear hydrologic response are particularly sensitive to changes in rainfall statistics. In these cases, process-based prediction approaches are favored over statistical models.

  9. Design and implementation of new design of numerical experiments for non linear models

    International Nuclear Information System (INIS)

    Gazut, St.

    2007-03-01

    This thesis addresses the problem of the construction of surrogate models in numerical simulation. Whenever numerical experiments are costly, the simulation model is complex and difficult to use. It is important then to select the numerical experiments as efficiently as possible in order to minimize their number. In statistics, the selection of experiments is known as optimal experimental design. In the context of numerical simulation where no measurement uncertainty is present, we describe an alternative approach based on statistical learning theory and re-sampling techniques. The surrogate models are constructed using neural networks and the generalization error is estimated by leave-one-out, cross-validation and bootstrap. It is shown that the bootstrap can control the over-fitting and extend the concept of leverage for non linear in their parameters surrogate models. The thesis describes an iterative method called LDR for Learner Disagreement from experiment Re-sampling, based on active learning using several surrogate models constructed on bootstrap samples. The method consists in adding new experiments where the predictors constructed from bootstrap samples disagree most. We compare the LDR method with other methods of experimental design such as D-optimal selection. (author)

  10. Statistical models for thermal ageing of steel materials in nuclear power plants

    International Nuclear Information System (INIS)

    Persoz, M.

    1996-01-01

    Some category of steel materials in nuclear power plants may be subjected to thermal ageing, whose extent depends on the steel chemical composition and the ageing parameters, i.e. temperature and duration. This ageing affects the 'impact strength' of the materials, which is a mechanical property. In order to assess the residual lifetime of these components, a probabilistic study has been launched, which takes into account the scatter over the input parameters of the mechanical model. Predictive formulae for estimating the impact strength of aged materials are important input data of the model. A data base has been created with impact strength results obtained from an ageing program in laboratory and statistical treatments have been undertaken. Two kinds of model have been developed, with non linear regression methods (PROC NLIN, available in SAS/STAT). The first one, using a hyperbolic tangent function, is partly based on physical considerations, and the second one, of an exponential type, is purely statistically built. The difficulties consist in selecting the significant parameters and attributing initial values to the coefficients, which is a requirement of the NLIN procedure. This global statistical analysis has led to general models that are unction of the chemical variables and the ageing parameters. These models are as precise (if not more) as local models that had been developed earlier for some specific values of ageing temperature and ageing duration. This paper describes the data and the methodology used to build the models and analyses the results given by the SAS system. (author)

  11. Comparison between Linear and Nonlinear Regression in a Laboratory Heat Transfer Experiment

    Science.gov (United States)

    Gonçalves, Carine Messias; Schwaab, Marcio; Pinto, José Carlos

    2013-01-01

    In order to interpret laboratory experimental data, undergraduate students are used to perform linear regression through linearized versions of nonlinear models. However, the use of linearized models can lead to statistically biased parameter estimates. Even so, it is not an easy task to introduce nonlinear regression and show for the students…

  12. Statistical modelling for ship propulsion efficiency

    DEFF Research Database (Denmark)

    Petersen, Jóan Petur; Jacobsen, Daniel J.; Winther, Ole

    2012-01-01

    This paper presents a state-of-the-art systems approach to statistical modelling of fuel efficiency in ship propulsion, and also a novel and publicly available data set of high quality sensory data. Two statistical model approaches are investigated and compared: artificial neural networks...

  13. Downlink Linear Precoders Based on Statistical CSI for Multicell MIMO-OFDM

    Directory of Open Access Journals (Sweden)

    Ebrahim Baktash

    2017-01-01

    Full Text Available With 5G communication systems on the horizon, efficient interference management in heterogeneous multicell networks is more vital than ever. This paper investigates the linear precoder design for downlink multicell multiple-input multiple-output orthogonal frequency-division multiplexing (MIMO-OFDM systems, where base stations (BSs coordinate to reduce the interference across space and frequency. In order to minimize the overall feedback overhead in next-generation systems, we consider precoding schemes that require statistical channel state information (CSI only. We apply the random matrix theory to approximate the ergodic weighted sum rate of the system with a closed form expression. After formulating the approximation for general channels, we reduce the results to a more compact form using the Kronecker channel model for which several multicarrier concepts such as frequency selectivity, channel tap correlations, and intercarrier interference (ICI are rigorously represented. We find the local optimal solution for the maximization of the approximate rate using a gradient method that requires only the covariance structure of the MIMO-OFDM channels. Within this covariance structure are the channel tap correlations and ICI information, both of which are taken into consideration in the precoder design. Simulation results show that the rate approximation is very accurate even for very small MIMO-OFDM systems and the proposed method converges rapidly to a near-optimal solution that competes with networked MIMO and precoders based on instantaneous full CSI.

  14. Sensometrics: Thurstonian and Statistical Models

    DEFF Research Database (Denmark)

    Christensen, Rune Haubo Bojesen

    . sensR is a package for sensory discrimination testing with Thurstonian models and ordinal supports analysis of ordinal data with cumulative link (mixed) models. While sensR is closely connected to the sensometrics field, the ordinal package has developed into a generic statistical package applicable......This thesis is concerned with the development and bridging of Thurstonian and statistical models for sensory discrimination testing as applied in the scientific discipline of sensometrics. In sensory discrimination testing sensory differences between products are detected and quantified by the use...... and sensory discrimination testing in particular in a series of papers by advancing Thurstonian models for a range of sensory discrimination protocols in addition to facilitating their application by providing software for fitting these models. The main focus is on identifying Thurstonian models...

  15. Statistical Modelling of Temperature and Moisture Uptake of Biochars Exposed to Selected Relative Humidity of Air

    Directory of Open Access Journals (Sweden)

    Luciane Bastistella

    2018-02-01

    Full Text Available New experimental techniques, as well as modern variants on known methods, have recently been employed to investigate the fundamental reactions underlying the oxidation of biochar. The purpose of this paper was to experimentally and statistically study how the relative humidity of air, mass, and particle size of four biochars influenced the adsorption of water and the increase in temperature. A random factorial design was employed using the intuitive statistical software Xlstat. A simple linear regression model and an analysis of variance with a pairwise comparison were performed. The experimental study was carried out on the wood of Quercus pubescens, Cyclobalanopsis glauca, Trigonostemon huangmosun, and Bambusa vulgaris, and involved five relative humidity conditions (22, 43, 75, 84, and 90%, two mass samples (0.1 and 1 g, and two particle sizes (powder and piece. Two response variables including water adsorption and temperature increase were analyzed and discussed. The temperature did not increase linearly with the adsorption of water. Temperature was modeled by nine explanatory variables, while water adsorption was modeled by eight. Five variables, including factors and their interactions, were found to be common to the two models. Sample mass and relative humidity influenced the two qualitative variables, while particle size and biochar type only influenced the temperature.

  16. Statistical Downscaling of Temperature with the Random Forest Model

    Directory of Open Access Journals (Sweden)

    Bo Pang

    2017-01-01

    Full Text Available The issues with downscaling the outputs of a global climate model (GCM to a regional scale that are appropriate to hydrological impact studies are investigated using the random forest (RF model, which has been shown to be superior for large dataset analysis and variable importance evaluation. The RF is proposed for downscaling daily mean temperature in the Pearl River basin in southern China. Four downscaling models were developed and validated by using the observed temperature series from 61 national stations and large-scale predictor variables derived from the National Center for Environmental Prediction–National Center for Atmospheric Research reanalysis dataset. The proposed RF downscaling model was compared to multiple linear regression, artificial neural network, and support vector machine models. Principal component analysis (PCA and partial correlation analysis (PAR were used in the predictor selection for the other models for a comprehensive study. It was shown that the model efficiency of the RF model was higher than that of the other models according to five selected criteria. By evaluating the predictor importance, the RF could choose the best predictor combination without using PCA and PAR. The results indicate that the RF is a feasible tool for the statistical downscaling of temperature.

  17. SOCR Analyses - an Instructional Java Web-based Statistical Analysis Toolkit.

    Science.gov (United States)

    Chu, Annie; Cui, Jenny; Dinov, Ivo D

    2009-03-01

    The Statistical Online Computational Resource (SOCR) designs web-based tools for educational use in a variety of undergraduate courses (Dinov 2006). Several studies have demonstrated that these resources significantly improve students' motivation and learning experiences (Dinov et al. 2008). SOCR Analyses is a new component that concentrates on data modeling and analysis using parametric and non-parametric techniques supported with graphical model diagnostics. Currently implemented analyses include commonly used models in undergraduate statistics courses like linear models (Simple Linear Regression, Multiple Linear Regression, One-Way and Two-Way ANOVA). In addition, we implemented tests for sample comparisons, such as t-test in the parametric category; and Wilcoxon rank sum test, Kruskal-Wallis test, Friedman's test, in the non-parametric category. SOCR Analyses also include several hypothesis test models, such as Contingency tables, Friedman's test and Fisher's exact test.The code itself is open source (http://socr.googlecode.com/), hoping to contribute to the efforts of the statistical computing community. The code includes functionality for each specific analysis model and it has general utilities that can be applied in various statistical computing tasks. For example, concrete methods with API (Application Programming Interface) have been implemented in statistical summary, least square solutions of general linear models, rank calculations, etc. HTML interfaces, tutorials, source code, activities, and data are freely available via the web (www.SOCR.ucla.edu). Code examples for developers and demos for educators are provided on the SOCR Wiki website.In this article, the pedagogical utilization of the SOCR Analyses is discussed, as well as the underlying design framework. As the SOCR project is on-going and more functions and tools are being added to it, these resources are constantly improved. The reader is strongly encouraged to check the SOCR site for most

  18. Filtering a statistically exactly solvable test model for turbulent tracers from partial observations

    International Nuclear Information System (INIS)

    Gershgorin, B.; Majda, A.J.

    2011-01-01

    A statistically exactly solvable model for passive tracers is introduced as a test model for the authors' Nonlinear Extended Kalman Filter (NEKF) as well as other filtering algorithms. The model involves a Gaussian velocity field and a passive tracer governed by the advection-diffusion equation with an imposed mean gradient. The model has direct relevance to engineering problems such as the spread of pollutants in the air or contaminants in the water as well as climate change problems concerning the transport of greenhouse gases such as carbon dioxide with strongly intermittent probability distributions consistent with the actual observations of the atmosphere. One of the attractive properties of the model is the existence of the exact statistical solution. In particular, this unique feature of the model provides an opportunity to design and test fast and efficient algorithms for real-time data assimilation based on rigorous mathematical theory for a turbulence model problem with many active spatiotemporal scales. Here, we extensively study the performance of the NEKF which uses the exact first and second order nonlinear statistics without any approximations due to linearization. The role of partial and sparse observations, the frequency of observations and the observation noise strength in recovering the true signal, its spectrum, and fat tail probability distribution are the central issues discussed here. The results of our study provide useful guidelines for filtering realistic turbulent systems with passive tracers through partial observations.

  19. Statistical modelling for social researchers principles and practice

    CERN Document Server

    Tarling, Roger

    2008-01-01

    This book explains the principles and theory of statistical modelling in an intelligible way for the non-mathematical social scientist looking to apply statistical modelling techniques in research. The book also serves as an introduction for those wishing to develop more detailed knowledge and skills in statistical modelling. Rather than present a limited number of statistical models in great depth, the aim is to provide a comprehensive overview of the statistical models currently adopted in social research, in order that the researcher can make appropriate choices and select the most suitable model for the research question to be addressed. To facilitate application, the book also offers practical guidance and instruction in fitting models using SPSS and Stata, the most popular statistical computer software which is available to most social researchers. Instruction in using MLwiN is also given. Models covered in the book include; multiple regression, binary, multinomial and ordered logistic regression, log-l...

  20. Interpretation of commonly used statistical regression models.

    Science.gov (United States)

    Kasza, Jessica; Wolfe, Rory

    2014-01-01

    A review of some regression models commonly used in respiratory health applications is provided in this article. Simple linear regression, multiple linear regression, logistic regression and ordinal logistic regression are considered. The focus of this article is on the interpretation of the regression coefficients of each model, which are illustrated through the application of these models to a respiratory health research study. © 2013 The Authors. Respirology © 2013 Asian Pacific Society of Respirology.

  1. Topology for statistical modeling of petascale data.

    Energy Technology Data Exchange (ETDEWEB)

    Pascucci, Valerio (University of Utah, Salt Lake City, UT); Mascarenhas, Ajith Arthur; Rusek, Korben (Texas A& M University, College Station, TX); Bennett, Janine Camille; Levine, Joshua (University of Utah, Salt Lake City, UT); Pebay, Philippe Pierre; Gyulassy, Attila (University of Utah, Salt Lake City, UT); Thompson, David C.; Rojas, Joseph Maurice (Texas A& M University, College Station, TX)

    2011-07-01

    This document presents current technical progress and dissemination of results for the Mathematics for Analysis of Petascale Data (MAPD) project titled 'Topology for Statistical Modeling of Petascale Data', funded by the Office of Science Advanced Scientific Computing Research (ASCR) Applied Math program. Many commonly used algorithms for mathematical analysis do not scale well enough to accommodate the size or complexity of petascale data produced by computational simulations. The primary goal of this project is thus to develop new mathematical tools that address both the petascale size and uncertain nature of current data. At a high level, our approach is based on the complementary techniques of combinatorial topology and statistical modeling. In particular, we use combinatorial topology to filter out spurious data that would otherwise skew statistical modeling techniques, and we employ advanced algorithms from algebraic statistics to efficiently find globally optimal fits to statistical models. This document summarizes the technical advances we have made to date that were made possible in whole or in part by MAPD funding. These technical contributions can be divided loosely into three categories: (1) advances in the field of combinatorial topology, (2) advances in statistical modeling, and (3) new integrated topological and statistical methods.

  2. Bayesian models: A statistical primer for ecologists

    Science.gov (United States)

    Hobbs, N. Thompson; Hooten, Mevin B.

    2015-01-01

    Bayesian modeling has become an indispensable tool for ecological research because it is uniquely suited to deal with complexity in a statistically coherent way. This textbook provides a comprehensive and accessible introduction to the latest Bayesian methods—in language ecologists can understand. Unlike other books on the subject, this one emphasizes the principles behind the computations, giving ecologists a big-picture understanding of how to implement this powerful statistical approach.Bayesian Models is an essential primer for non-statisticians. It begins with a definition of probability and develops a step-by-step sequence of connected ideas, including basic distribution theory, network diagrams, hierarchical models, Markov chain Monte Carlo, and inference from single and multiple models. This unique book places less emphasis on computer coding, favoring instead a concise presentation of the mathematical statistics needed to understand how and why Bayesian analysis works. It also explains how to write out properly formulated hierarchical Bayesian models and use them in computing, research papers, and proposals.This primer enables ecologists to understand the statistical principles behind Bayesian modeling and apply them to research, teaching, policy, and management.Presents the mathematical and statistical foundations of Bayesian modeling in language accessible to non-statisticiansCovers basic distribution theory, network diagrams, hierarchical models, Markov chain Monte Carlo, and moreDeemphasizes computer coding in favor of basic principlesExplains how to write out properly factored statistical expressions representing Bayesian models

  3. Statistical mechanics of directed models of polymers in the square lattice

    International Nuclear Information System (INIS)

    Rensburg, E J Janse van

    2003-01-01

    Directed square lattice models of polymers and vesicles have received considerable attention in the recent mathematical and physical sciences literature. These are idealized geometric directed lattice models introduced to study phase behaviour in polymers, and include Dyck paths, partially directed paths, directed trees and directed vesicles models. Directed models are closely related to models studied in the combinatorics literature (and are often exactly solvable). They are also simplified versions of a number of statistical mechanics models, including the self-avoiding walk, lattice animals and lattice vesicles. The exchange of approaches and ideas between statistical mechanics and combinatorics have considerably advanced the description and understanding of directed lattice models, and this will be explored in this review. The combinatorial nature of directed lattice path models makes a study using generating function approaches most natural. In contrast, the statistical mechanics approach would introduce partition functions and free energies, and then investigate these using the general framework of critical phenomena. Generating function and statistical mechanics approaches are closely related. For example, questions regarding the limiting free energy may be approached by considering the radius of convergence of a generating function, and the scaling properties of thermodynamic quantities are related to the asymptotic properties of the generating function. In this review the methods for obtaining generating functions and determining free energies in directed lattice path models of linear polymers is presented. These methods include decomposition methods leading to functional recursions, as well as the Temperley method (that is implemented by creating a combinatorial object, one slice at a time). A constant term formulation of the generating function will also be reviewed. The thermodynamic features and critical behaviour in models of directed paths may be

  4. Climate change estimates of South American riverflow through statistical downscaling

    CSIR Research Space (South Africa)

    Landman, W

    2014-03-01

    Full Text Available projections are assimilated into a linear statistical model in order to produce an ensemble of downscaled riverflows in the La Plata Basin and in southern-central Chile. The statistical model uses atmospheric circulation fields (geopotential heights at the 200...

  5. Isobio software: biological dose distribution and biological dose volume histogram from physical dose conversion using linear-quadratic-linear model.

    Science.gov (United States)

    Jaikuna, Tanwiwat; Khadsiri, Phatchareewan; Chawapun, Nisa; Saekho, Suwit; Tharavichitkul, Ekkasit

    2017-02-01

    To develop an in-house software program that is able to calculate and generate the biological dose distribution and biological dose volume histogram by physical dose conversion using the linear-quadratic-linear (LQL) model. The Isobio software was developed using MATLAB version 2014b to calculate and generate the biological dose distribution and biological dose volume histograms. The physical dose from each voxel in treatment planning was extracted through Computational Environment for Radiotherapy Research (CERR), and the accuracy was verified by the differentiation between the dose volume histogram from CERR and the treatment planning system. An equivalent dose in 2 Gy fraction (EQD 2 ) was calculated using biological effective dose (BED) based on the LQL model. The software calculation and the manual calculation were compared for EQD 2 verification with pair t -test statistical analysis using IBM SPSS Statistics version 22 (64-bit). Two and three-dimensional biological dose distribution and biological dose volume histogram were displayed correctly by the Isobio software. Different physical doses were found between CERR and treatment planning system (TPS) in Oncentra, with 3.33% in high-risk clinical target volume (HR-CTV) determined by D 90% , 0.56% in the bladder, 1.74% in the rectum when determined by D 2cc , and less than 1% in Pinnacle. The difference in the EQD 2 between the software calculation and the manual calculation was not significantly different with 0.00% at p -values 0.820, 0.095, and 0.593 for external beam radiation therapy (EBRT) and 0.240, 0.320, and 0.849 for brachytherapy (BT) in HR-CTV, bladder, and rectum, respectively. The Isobio software is a feasible tool to generate the biological dose distribution and biological dose volume histogram for treatment plan evaluation in both EBRT and BT.

  6. Linear factor copula models and their properties

    KAUST Repository

    Krupskii, Pavel; Genton, Marc G.

    2018-01-01

    We consider a special case of factor copula models with additive common factors and independent components. These models are flexible and parsimonious with O(d) parameters where d is the dimension. The linear structure allows one to obtain closed form expressions for some copulas and their extreme‐value limits. These copulas can be used to model data with strong tail dependencies, such as extreme data. We study the dependence properties of these linear factor copula models and derive the corresponding limiting extreme‐value copulas with a factor structure. We show how parameter estimates can be obtained for these copulas and apply one of these copulas to analyse a financial data set.

  7. Linear factor copula models and their properties

    KAUST Repository

    Krupskii, Pavel

    2018-04-25

    We consider a special case of factor copula models with additive common factors and independent components. These models are flexible and parsimonious with O(d) parameters where d is the dimension. The linear structure allows one to obtain closed form expressions for some copulas and their extreme‐value limits. These copulas can be used to model data with strong tail dependencies, such as extreme data. We study the dependence properties of these linear factor copula models and derive the corresponding limiting extreme‐value copulas with a factor structure. We show how parameter estimates can be obtained for these copulas and apply one of these copulas to analyse a financial data set.

  8. Spatial generalized linear mixed models of electric power outages due to hurricanes and ice storms

    International Nuclear Information System (INIS)

    Liu Haibin; Davidson, Rachel A.; Apanasovich, Tatiyana V.

    2008-01-01

    This paper presents new statistical models that predict the number of hurricane- and ice storm-related electric power outages likely to occur in each 3 kmx3 km grid cell in a region. The models are based on a large database of recent outages experienced by three major East Coast power companies in six hurricanes and eight ice storms. A spatial generalized linear mixed modeling (GLMM) approach was used in which spatial correlation is incorporated through random effects. Models were fitted using a composite likelihood approach and the covariance matrix was estimated empirically. A simulation study was conducted to test the model estimation procedure, and model training, validation, and testing were done to select the best models and assess their predictive power. The final hurricane model includes number of protective devices, maximum gust wind speed, hurricane indicator, and company indicator covariates. The final ice storm model includes number of protective devices, ice thickness, and ice storm indicator covariates. The models should be useful for power companies as they plan for future storms. The statistical modeling approach offers a new way to assess the reliability of electric power and other infrastructure systems in extreme events

  9. Modelling Loudspeaker Non-Linearities

    DEFF Research Database (Denmark)

    Agerkvist, Finn T.

    2007-01-01

    This paper investigates different techniques for modelling the non-linear parameters of the electrodynamic loudspeaker. The methods are tested not only for their accuracy within the range of original data, but also for the ability to work reasonable outside that range, and it is demonstrated...... that polynomial expansions are rather poor at this, whereas an inverse polynomial expansion or localized fitting functions such as the gaussian are better suited for modelling the Bl-factor and compliance. For the inductance the sigmoid function is shown to give very good results. Finally the time varying...

  10. Bearing Fault Diagnosis Based on Statistical Locally Linear Embedding.

    Science.gov (United States)

    Wang, Xiang; Zheng, Yuan; Zhao, Zhenzhou; Wang, Jinping

    2015-07-06

    Fault diagnosis is essentially a kind of pattern recognition. The measured signal samples usually distribute on nonlinear low-dimensional manifolds embedded in the high-dimensional signal space, so how to implement feature extraction, dimensionality reduction and improve recognition performance is a crucial task. In this paper a novel machinery fault diagnosis approach based on a statistical locally linear embedding (S-LLE) algorithm which is an extension of LLE by exploiting the fault class label information is proposed. The fault diagnosis approach first extracts the intrinsic manifold features from the high-dimensional feature vectors which are obtained from vibration signals that feature extraction by time-domain, frequency-domain and empirical mode decomposition (EMD), and then translates the complex mode space into a salient low-dimensional feature space by the manifold learning algorithm S-LLE, which outperforms other feature reduction methods such as PCA, LDA and LLE. Finally in the feature reduction space pattern classification and fault diagnosis by classifier are carried out easily and rapidly. Rolling bearing fault signals are used to validate the proposed fault diagnosis approach. The results indicate that the proposed approach obviously improves the classification performance of fault pattern recognition and outperforms the other traditional approaches.

  11. Subcellular localization for Gram positive and Gram negative bacterial proteins using linear interpolation smoothing model.

    Science.gov (United States)

    Saini, Harsh; Raicar, Gaurav; Dehzangi, Abdollah; Lal, Sunil; Sharma, Alok

    2015-12-07

    Protein subcellular localization is an important topic in proteomics since it is related to a protein׳s overall function, helps in the understanding of metabolic pathways, and in drug design and discovery. In this paper, a basic approximation technique from natural language processing called the linear interpolation smoothing model is applied for predicting protein subcellular localizations. The proposed approach extracts features from syntactical information in protein sequences to build probabilistic profiles using dependency models, which are used in linear interpolation to determine how likely is a sequence to belong to a particular subcellular location. This technique builds a statistical model based on maximum likelihood. It is able to deal effectively with high dimensionality that hinders other traditional classifiers such as Support Vector Machines or k-Nearest Neighbours without sacrificing performance. This approach has been evaluated by predicting subcellular localizations of Gram positive and Gram negative bacterial proteins. Copyright © 2015 Elsevier Ltd. All rights reserved.

  12. Simple statistical model for branched aggregates

    DEFF Research Database (Denmark)

    Lemarchand, Claire; Hansen, Jesper Schmidt

    2015-01-01

    , given that it already has bonds with others. The model is applied here to asphaltene nanoaggregates observed in molecular dynamics simulations of Cooee bitumen. The variation with temperature of the probabilities deduced from this model is discussed in terms of statistical mechanics arguments....... The relevance of the statistical model in the case of asphaltene nanoaggregates is checked by comparing the predicted value of the probability for one molecule to have exactly i bonds with the same probability directly measured in the molecular dynamics simulations. The agreement is satisfactory......We propose a statistical model that can reproduce the size distribution of any branched aggregate, including amylopectin, dendrimers, molecular clusters of monoalcohols, and asphaltene nanoaggregates. It is based on the conditional probability for one molecule to form a new bond with a molecule...

  13. Linearized models for a new magnetic control in MAST

    International Nuclear Information System (INIS)

    Artaserse, G.; Maviglia, F.; Albanese, R.; McArdle, G.J.; Pangione, L.

    2013-01-01

    Highlights: ► We applied linearized models for a new magnetic control on MAST tokamak. ► A suite of procedures, conceived to be machine independent, have been used. ► We carried out model-based simulations, taking into account eddy currents effects. ► Comparison with the EFIT flux maps and the experimental magnetic signals are shown. ► A current driven model for the dynamic simulations of the experimental data have been performed. -- Abstract: The aim of this work is to provide reliable linearized models for the design and assessment of a new magnetic control system for MAST (Mega Ampère Spherical Tokamak) using rtEFIT, which can easily be exported to MAST Upgrade. Linearized models for magnetic control have been obtained using the 2D axisymmetric finite element code CREATE L. MAST linearized models include equivalent 2D axisymmetric schematization of poloidal field (PF) coils, vacuum vessel, and other conducting structures. A plasmaless and a double null configuration have been chosen as benchmark cases for the comparison with experimental data and EFIT reconstructions. Good agreement has been found with the EFIT flux map and the experimental signals coming from magnetic probes with only few mismatches probably due to broken sensors. A suite of procedures (equipped with a user friendly interface to be run even remotely) to provide linearized models for magnetic control is now available on the MAST linux machines. A new current driven model has been used to obtain a state space model having the PF coil currents as inputs. Dynamic simulations of experimental data have been carried out using linearized models, including modelling of the effects of the passive structures, showing a fair agreement. The modelling activity has been useful also to reproduce accurately the interaction between plasma current and radial position control loops

  14. Linearized models for a new magnetic control in MAST

    Energy Technology Data Exchange (ETDEWEB)

    Artaserse, G., E-mail: giovanni.artaserse@enea.it [Associazione Euratom-ENEA sulla Fusione, Via Enrico Fermi 45, I-00044 Frascati (RM) (Italy); Maviglia, F.; Albanese, R. [Associazione Euratom-ENEA-CREATE sulla Fusione, Via Claudio 21, I-80125 Napoli (Italy); McArdle, G.J.; Pangione, L. [EURATOM/CCFE Fusion Association, Culham Science Centre, Abingdon, Oxon, OX14 3DB (United Kingdom)

    2013-10-15

    Highlights: ► We applied linearized models for a new magnetic control on MAST tokamak. ► A suite of procedures, conceived to be machine independent, have been used. ► We carried out model-based simulations, taking into account eddy currents effects. ► Comparison with the EFIT flux maps and the experimental magnetic signals are shown. ► A current driven model for the dynamic simulations of the experimental data have been performed. -- Abstract: The aim of this work is to provide reliable linearized models for the design and assessment of a new magnetic control system for MAST (Mega Ampère Spherical Tokamak) using rtEFIT, which can easily be exported to MAST Upgrade. Linearized models for magnetic control have been obtained using the 2D axisymmetric finite element code CREATE L. MAST linearized models include equivalent 2D axisymmetric schematization of poloidal field (PF) coils, vacuum vessel, and other conducting structures. A plasmaless and a double null configuration have been chosen as benchmark cases for the comparison with experimental data and EFIT reconstructions. Good agreement has been found with the EFIT flux map and the experimental signals coming from magnetic probes with only few mismatches probably due to broken sensors. A suite of procedures (equipped with a user friendly interface to be run even remotely) to provide linearized models for magnetic control is now available on the MAST linux machines. A new current driven model has been used to obtain a state space model having the PF coil currents as inputs. Dynamic simulations of experimental data have been carried out using linearized models, including modelling of the effects of the passive structures, showing a fair agreement. The modelling activity has been useful also to reproduce accurately the interaction between plasma current and radial position control loops.

  15. Statistical Model Checking of Rich Models and Properties

    DEFF Research Database (Denmark)

    Poulsen, Danny Bøgsted

    in undecidability issues for the traditional model checking approaches. Statistical model checking has proven itself a valuable supplement to model checking and this thesis is concerned with extending this software validation technique to stochastic hybrid systems. The thesis consists of two parts: the first part...... motivates why existing model checking technology should be supplemented by new techniques. It also contains a brief introduction to probability theory and concepts covered by the six papers making up the second part. The first two papers are concerned with developing online monitoring techniques...... systems. The fifth paper shows how stochastic hybrid automata are useful for modelling biological systems and the final paper is concerned with showing how statistical model checking is efficiently distributed. In parallel with developing the theory contained in the papers, a substantial part of this work...

  16. Statistical Modelling of Wind Proles - Data Analysis and Modelling

    DEFF Research Database (Denmark)

    Jónsson, Tryggvi; Pinson, Pierre

    The aim of the analysis presented in this document is to investigate whether statistical models can be used to make very short-term predictions of wind profiles.......The aim of the analysis presented in this document is to investigate whether statistical models can be used to make very short-term predictions of wind profiles....

  17. Statistical and Biophysical Models for Predicting Total and Outdoor Water Use in Los Angeles

    Science.gov (United States)

    Mini, C.; Hogue, T. S.; Pincetl, S.

    2012-04-01

    Modeling water demand is a complex exercise in the choice of the functional form, techniques and variables to integrate in the model. The goal of the current research is to identify the determinants that control total and outdoor residential water use in semi-arid cities and to utilize that information in the development of statistical and biophysical models that can forecast spatial and temporal urban water use. The City of Los Angeles is unique in its highly diverse socio-demographic, economic and cultural characteristics across neighborhoods, which introduces significant challenges in modeling water use. Increasing climate variability also contributes to uncertainties in water use predictions in urban areas. Monthly individual water use records were acquired from the Los Angeles Department of Water and Power (LADWP) for the 2000 to 2010 period. Study predictors of residential water use include socio-demographic, economic, climate and landscaping variables at the zip code level collected from US Census database. Climate variables are estimated from ground-based observations and calculated at the centroid of each zip code by inverse-distance weighting method. Remotely-sensed products of vegetation biomass and landscape land cover are also utilized. Two linear regression models were developed based on the panel data and variables described: a pooled-OLS regression model and a linear mixed effects model. Both models show income per capita and the percentage of landscape areas in each zip code as being statistically significant predictors. The pooled-OLS model tends to over-estimate higher water use zip codes and both models provide similar RMSE values.Outdoor water use was estimated at the census tract level as the residual between total water use and indoor use. This residual is being compared with the output from a biophysical model including tree and grass cover areas, climate variables and estimates of evapotranspiration at very high spatial resolution. A

  18. Landslide susceptibility mapping using GIS-based statistical models and Remote sensing data in tropical environment.

    Science.gov (United States)

    Shahabi, Himan; Hashim, Mazlan

    2015-04-22

    This research presents the results of the GIS-based statistical models for generation of landslide susceptibility mapping using geographic information system (GIS) and remote-sensing data for Cameron Highlands area in Malaysia. Ten factors including slope, aspect, soil, lithology, NDVI, land cover, distance to drainage, precipitation, distance to fault, and distance to road were extracted from SAR data, SPOT 5 and WorldView-1 images. The relationships between the detected landslide locations and these ten related factors were identified by using GIS-based statistical models including analytical hierarchy process (AHP), weighted linear combination (WLC) and spatial multi-criteria evaluation (SMCE) models. The landslide inventory map which has a total of 92 landslide locations was created based on numerous resources such as digital aerial photographs, AIRSAR data, WorldView-1 images, and field surveys. Then, 80% of the landslide inventory was used for training the statistical models and the remaining 20% was used for validation purpose. The validation results using the Relative landslide density index (R-index) and Receiver operating characteristic (ROC) demonstrated that the SMCE model (accuracy is 96%) is better in prediction than AHP (accuracy is 91%) and WLC (accuracy is 89%) models. These landslide susceptibility maps would be useful for hazard mitigation purpose and regional planning.

  19. Modeling Non-Linear Material Properties in Composite Materials

    Science.gov (United States)

    2016-06-28

    Technical Report ARWSB-TR-16013 MODELING NON-LINEAR MATERIAL PROPERTIES IN COMPOSITE MATERIALS Michael F. Macri Andrew G...REPORT TYPE Technical 3. DATES COVERED (From - To) 4. TITLE AND SUBTITLE MODELING NON-LINEAR MATERIAL PROPERTIES IN COMPOSITE MATERIALS ...systems are increasingly incorporating composite materials into their design. Many of these systems subject the composites to environmental conditions

  20. An ensemble Kalman filter for statistical estimation of physics constrained nonlinear regression models

    International Nuclear Information System (INIS)

    Harlim, John; Mahdi, Adam; Majda, Andrew J.

    2014-01-01

    A central issue in contemporary science is the development of nonlinear data driven statistical–dynamical models for time series of noisy partial observations from nature or a complex model. It has been established recently that ad-hoc quadratic multi-level regression models can have finite-time blow-up of statistical solutions and/or pathological behavior of their invariant measure. Recently, a new class of physics constrained nonlinear regression models were developed to ameliorate this pathological behavior. Here a new finite ensemble Kalman filtering algorithm is developed for estimating the state, the linear and nonlinear model coefficients, the model and the observation noise covariances from available partial noisy observations of the state. Several stringent tests and applications of the method are developed here. In the most complex application, the perfect model has 57 degrees of freedom involving a zonal (east–west) jet, two topographic Rossby waves, and 54 nonlinearly interacting Rossby waves; the perfect model has significant non-Gaussian statistics in the zonal jet with blocked and unblocked regimes and a non-Gaussian skewed distribution due to interaction with the other 56 modes. We only observe the zonal jet contaminated by noise and apply the ensemble filter algorithm for estimation. Numerically, we find that a three dimensional nonlinear stochastic model with one level of memory mimics the statistical effect of the other 56 modes on the zonal jet in an accurate fashion, including the skew non-Gaussian distribution and autocorrelation decay. On the other hand, a similar stochastic model with zero memory levels fails to capture the crucial non-Gaussian behavior of the zonal jet from the perfect 57-mode model

  1. Applied multivariate statistics with R

    CERN Document Server

    Zelterman, Daniel

    2015-01-01

    This book brings the power of multivariate statistics to graduate-level practitioners, making these analytical methods accessible without lengthy mathematical derivations. Using the open source, shareware program R, Professor Zelterman demonstrates the process and outcomes for a wide array of multivariate statistical applications. Chapters cover graphical displays, linear algebra, univariate, bivariate and multivariate normal distributions, factor methods, linear regression, discrimination and classification, clustering, time series models, and additional methods. Zelterman uses practical examples from diverse disciplines to welcome readers from a variety of academic specialties. Those with backgrounds in statistics will learn new methods while they review more familiar topics. Chapters include exercises, real data sets, and R implementations. The data are interesting, real-world topics, particularly from health and biology-related contexts. As an example of the approach, the text examines a sample from the B...

  2. Statistical physics of pairwise probability models

    DEFF Research Database (Denmark)

    Roudi, Yasser; Aurell, Erik; Hertz, John

    2009-01-01

    (dansk abstrakt findes ikke) Statistical models for describing the probability distribution over the states of biological systems are commonly used for dimensional reduction. Among these models, pairwise models are very attractive in part because they can be fit using a reasonable amount of  data......: knowledge of the means and correlations between pairs of elements in the system is sufficient. Not surprisingly, then, using pairwise models for studying neural data has been the focus of many studies in recent years. In this paper, we describe how tools from statistical physics can be employed for studying...

  3. Statistical aspects of fish stock assessment

    DEFF Research Database (Denmark)

    Berg, Casper Willestofte

    for stock assessment by application of state-of-the-art statistical methodology. The main contributions are presented in the form of six research papers. The major part of the thesis deals with age-structured assessment models, which is the most common approach. Conversion from length to age distributions...... statistical aspects of fish stocks assessment, which includes topics such as time series analysis, generalized additive models (GAMs), and non-linear state-space/mixed models capable of handling missing data and a high number of latent states and parameters. The aim is to improve the existing methods...

  4. Non-linear Loudspeaker Unit Modelling

    DEFF Research Database (Denmark)

    Pedersen, Bo Rohde; Agerkvist, Finn T.

    2008-01-01

    Simulations of a 6½-inch loudspeaker unit are performed and compared with a displacement measurement. The non-linear loudspeaker model is based on the major nonlinear functions and expanded with time-varying suspension behaviour and flux modulation. The results are presented with FFT plots of thr...... frequencies and different displacement levels. The model errors are discussed and analysed including a test with loudspeaker unit where the diaphragm is removed....

  5. A statistical model for investigating binding probabilities of DNA nucleotide sequences using microarrays.

    Science.gov (United States)

    Lee, Mei-Ling Ting; Bulyk, Martha L; Whitmore, G A; Church, George M

    2002-12-01

    There is considerable scientific interest in knowing the probability that a site-specific transcription factor will bind to a given DNA sequence. Microarray methods provide an effective means for assessing the binding affinities of a large number of DNA sequences as demonstrated by Bulyk et al. (2001, Proceedings of the National Academy of Sciences, USA 98, 7158-7163) in their study of the DNA-binding specificities of Zif268 zinc fingers using microarray technology. In a follow-up investigation, Bulyk, Johnson, and Church (2002, Nucleic Acid Research 30, 1255-1261) studied the interdependence of nucleotides on the binding affinities of transcription proteins. Our article is motivated by this pair of studies. We present a general statistical methodology for analyzing microarray intensity measurements reflecting DNA-protein interactions. The log probability of a protein binding to a DNA sequence on an array is modeled using a linear ANOVA model. This model is convenient because it employs familiar statistical concepts and procedures and also because it is effective for investigating the probability structure of the binding mechanism.

  6. Method for determining appropriate statistical models of the random cyclic stress amplitudes of a stainless pipe weld metal

    International Nuclear Information System (INIS)

    Wang Jinnuo; Zhao Yongxiang; Wang Shaohua

    2001-01-01

    It is revealed by a strain-controlled fatigue test that there is a significant scatter of the cyclic stress-strain responses for a nuclear engineering material, 1Cr18Ni9Ti stainless steel pipe-weld metal. This implies that the existent deterministic analysis might be non-conservative. Taking into account the scatter, a method for determining the appropriate statistical models of material cyclic stress amplitudes is presented by considering the total fit, consistency with fatigue physics, and safety of design of seven commonly used distribution fitting into the test data. The seven distribution are Weibull (two-and three-parameter), normal, lognormal, extreme minimum value, extreme maximum value, and exponential. In the method, statistical parameters of the distributions are evaluated by a linear regression technique. Statistical tests are made by a transformation from t-distribution function to Pearson statistical parameter. Statistical tests are made by a transformation from t-distribution function to Pearson statistical parameter, i.e. the linear relationship coefficient. The total fit is assessed by a parameter so-called fitted relationship coefficient of the empirical and theoretical failure probabilities. The consistency with fatigue physics is analyzed by the hazard rate curves of distributions. The safety of design is measured by examining the change of predicted errors in the tail regions of distributions

  7. A Dynamic Linear Modeling Approach to Public Policy Change

    DEFF Research Database (Denmark)

    Loftis, Matthew; Mortensen, Peter Bjerre

    2017-01-01

    Theories of public policy change, despite their differences, converge on one point of strong agreement. The relationship between policy and its causes can and does change over time. This consensus yields numerous empirical implications, but our standard analytical tools are inadequate for testing...... them. As a result, the dynamic and transformative relationships predicted by policy theories have been left largely unexplored in time-series analysis of public policy. This paper introduces dynamic linear modeling (DLM) as a useful statistical tool for exploring time-varying relationships in public...... policy. The paper offers a detailed exposition of the DLM approach and illustrates its usefulness with a time series analysis of U.S. defense policy from 1957-2010. The results point the way for a new attention to dynamics in the policy process and the paper concludes with a discussion of how...

  8. Comparison between linear quadratic and early time dose models

    International Nuclear Information System (INIS)

    Chougule, A.A.; Supe, S.J.

    1993-01-01

    During the 70s, much interest was focused on fractionation in radiotherapy with the aim of improving tumor control rate without producing unacceptable normal tissue damage. To compare the radiobiological effectiveness of various fractionation schedules, empirical formulae such as Nominal Standard Dose, Time Dose Factor, Cumulative Radiation Effect and Tumour Significant Dose, were introduced and were used despite many shortcomings. It has been claimed that a recent linear quadratic model is able to predict the radiobiological responses of tumours as well as normal tissues more accurately. We compared Time Dose Factor and Tumour Significant Dose models with the linear quadratic model for tumour regression in patients with carcinomas of the cervix. It was observed that the prediction of tumour regression estimated by the Tumour Significant Dose and Time Dose factor concepts varied by 1.6% from that of the linear quadratic model prediction. In view of the lack of knowledge of the precise values of the parameters of the linear quadratic model, it should be applied with caution. One can continue to use the Time Dose Factor concept which has been in use for more than a decade as its results are within ±2% as compared to that predicted by the linear quadratic model. (author). 11 refs., 3 figs., 4 tabs

  9. A Note on the Identifiability of Generalized Linear Mixed Models

    DEFF Research Database (Denmark)

    Labouriau, Rodrigo

    2014-01-01

    I present here a simple proof that, under general regularity conditions, the standard parametrization of generalized linear mixed model is identifiable. The proof is based on the assumptions of generalized linear mixed models on the first and second order moments and some general mild regularity...... conditions, and, therefore, is extensible to quasi-likelihood based generalized linear models. In particular, binomial and Poisson mixed models with dispersion parameter are identifiable when equipped with the standard parametrization...

  10. A non-linear state space approach to model groundwater fluctuations

    NARCIS (Netherlands)

    Berendrecht, W.L.; Heemink, A.W.; Geer, F.C. van; Gehrels, J.C.

    2006-01-01

    A non-linear state space model is developed for describing groundwater fluctuations. Non-linearity is introduced by modeling the (unobserved) degree of water saturation of the root zone. The non-linear relations are based on physical concepts describing the dependence of both the actual

  11. Linear Equating for the NEAT Design: Parameter Substitution Models and Chained Linear Relationship Models

    Science.gov (United States)

    Kane, Michael T.; Mroch, Andrew A.; Suh, Youngsuk; Ripkey, Douglas R.

    2009-01-01

    This paper analyzes five linear equating models for the "nonequivalent groups with anchor test" (NEAT) design with internal anchors (i.e., the anchor test is part of the full test). The analysis employs a two-dimensional framework. The first dimension contrasts two general approaches to developing the equating relationship. Under a "parameter…

  12. Estimation of aboveground biomass in Mediterranean forests by statistical modelling of ASTER fraction images

    Science.gov (United States)

    Fernández-Manso, O.; Fernández-Manso, A.; Quintano, C.

    2014-09-01

    Aboveground biomass (AGB) estimation from optical satellite data is usually based on regression models of original or synthetic bands. To overcome the poor relation between AGB and spectral bands due to mixed-pixels when a medium spatial resolution sensor is considered, we propose to base the AGB estimation on fraction images from Linear Spectral Mixture Analysis (LSMA). Our study area is a managed Mediterranean pine woodland (Pinus pinaster Ait.) in central Spain. A total of 1033 circular field plots were used to estimate AGB from Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) optical data. We applied Pearson correlation statistics and stepwise multiple regression to identify suitable predictors from the set of variables of original bands, fraction imagery, Normalized Difference Vegetation Index and Tasselled Cap components. Four linear models and one nonlinear model were tested. A linear combination of ASTER band 2 (red, 0.630-0.690 μm), band 8 (short wave infrared 5, 2.295-2.365 μm) and green vegetation fraction (from LSMA) was the best AGB predictor (Radj2=0.632, the root-mean-squared error of estimated AGB was 13.3 Mg ha-1 (or 37.7%), resulting from cross-validation), rather than other combinations of the above cited independent variables. Results indicated that using ASTER fraction images in regression models improves the AGB estimation in Mediterranean pine forests. The spatial distribution of the estimated AGB, based on a multiple linear regression model, may be used as baseline information for forest managers in future studies, such as quantifying the regional carbon budget, fuel accumulation or monitoring of management practices.

  13. Recent Updates to the GEOS-5 Linear Model

    Science.gov (United States)

    Holdaway, Dan; Kim, Jong G.; Errico, Ron; Gelaro, Ronald; Mahajan, Rahul

    2014-01-01

    Global Modeling and Assimilation Office (GMAO) is close to having a working 4DVAR system and has developed a linearized version of GEOS-5.This talk outlines a series of improvements made to the linearized dynamics, physics and trajectory.Of particular interest is the development of linearized cloud microphysics, which provides the framework for 'all-sky' data assimilation.

  14. Using the classical linear regression model in analysis of the dependences of conveyor belt life

    Directory of Open Access Journals (Sweden)

    Miriam Andrejiová

    2013-12-01

    Full Text Available The paper deals with the classical linear regression model of the dependence of conveyor belt life on some selected parameters: thickness of paint layer, width and length of the belt, conveyor speed and quantity of transported material. The first part of the article is about regression model design, point and interval estimation of parameters, verification of statistical significance of the model, and about the parameters of the proposed regression model. The second part of the article deals with identification of influential and extreme values that can have an impact on estimation of regression model parameters. The third part focuses on assumptions of the classical regression model, i.e. on verification of independence assumptions, normality and homoscedasticity of residuals.

  15. Recursive and non-linear logistic regression: moving on from the original EuroSCORE and EuroSCORE II methodologies.

    Science.gov (United States)

    Poullis, Michael

    2014-11-01

    EuroSCORE II, despite improving on the original EuroSCORE system, has not solved all the calibration and predictability issues. Recursive, non-linear and mixed recursive and non-linear regression analysis were assessed with regard to sensitivity, specificity and predictability of the original EuroSCORE and EuroSCORE II systems. The original logistic EuroSCORE, EuroSCORE II and recursive, non-linear and mixed recursive and non-linear regression analyses of these risk models were assessed via receiver operator characteristic curves (ROC) and Hosmer-Lemeshow statistic analysis with regard to the accuracy of predicting in-hospital mortality. Analysis was performed for isolated coronary artery bypass grafts (CABGs) (n = 2913), aortic valve replacement (AVR) (n = 814), mitral valve surgery (n = 340), combined AVR and CABG (n = 517), aortic (n = 350), miscellaneous cases (n = 642), and combinations of the above cases (n = 5576). The original EuroSCORE had an ROC below 0.7 for isolated AVR and combined AVR and CABG. None of the methods described increased the ROC above 0.7. The EuroSCORE II risk model had an ROC below 0.7 for isolated AVR only. Recursive regression, non-linear regression, and mixed recursive and non-linear regression all increased the ROC above 0.7 for isolated AVR. The original EuroSCORE had a Hosmer-Lemeshow statistic that was above 0.05 for all patients and the subgroups analysed. All of the techniques markedly increased the Hosmer-Lemeshow statistic. The EuroSCORE II risk model had a Hosmer-Lemeshow statistic that was significant for all patients (P linear regression failed to improve on the original Hosmer-Lemeshow statistic. The mixed recursive and non-linear regression using the EuroSCORE II risk model was the only model that produced an ROC of 0.7 or above for all patients and procedures and had a Hosmer-Lemeshow statistic that was highly non-significant. The original EuroSCORE and the EuroSCORE II risk models do not have adequate ROC and Hosmer

  16. Statistical models for the analysis of water distribution system pipe break data

    International Nuclear Information System (INIS)

    Yamijala, Shridhar; Guikema, Seth D.; Brumbelow, Kelly

    2009-01-01

    The deterioration of pipes leading to pipe breaks and leaks in urban water distribution systems is of concern to water utilities throughout the world. Pipe breaks and leaks may result in reduction in the water-carrying capacity of the pipes and contamination of water in the distribution systems. Water utilities incur large expenses in the replacement and rehabilitation of water mains, making it critical to evaluate the current and future condition of the system for maintenance decision-making. This paper compares different statistical regression models proposed in the literature for estimating the reliability of pipes in a water distribution system on the basis of short time histories. The goals of these models are to estimate the likelihood of pipe breaks in the future and determine the parameters that most affect the likelihood of pipe breaks. The data set used for the analysis comes from a major US city, and these data include approximately 85,000 pipe segments with nearly 2500 breaks from 2000 through 2005. The results show that the set of statistical models previously proposed for this problem do not provide good estimates with the test data set. However, logistic generalized linear models do provide good estimates of pipe reliability and can be useful for water utilities in planning pipe inspection and maintenance

  17. Assessing biomass of diverse coastal marsh ecosystems using statistical and machine learning models

    Science.gov (United States)

    Mo, Yu; Kearney, Michael S.; Riter, J. C. Alexis; Zhao, Feng; Tilley, David R.

    2018-06-01

    The importance and vulnerability of coastal marshes necessitate effective ways to closely monitor them. Optical remote sensing is a powerful tool for this task, yet its application to diverse coastal marsh ecosystems consisting of different marsh types is limited. This study samples spectral and biophysical data from freshwater, intermediate, brackish, and saline marshes in Louisiana, and develops statistical and machine learning models to assess the marshes' biomass with combined ground, airborne, and spaceborne remote sensing data. It is found that linear models derived from NDVI and EVI are most favorable for assessing Leaf Area Index (LAI) using multispectral data (R2 = 0.7 and 0.67, respectively), and the random forest models are most useful in retrieving LAI and Aboveground Green Biomass (AGB) using hyperspectral data (R2 = 0.91 and 0.84, respectively). It is also found that marsh type and plant species significantly impact the linear model development (P biomass of Louisiana's coastal marshes using various optical remote sensing techniques, and highlights the impacts of the marshes' species composition on the model development and the sensors' spatial resolution on biomass mapping, thereby providing useful tools for monitoring the biomass of coastal marshes in Louisiana and diverse coastal marsh ecosystems elsewhere.

  18. Non-linear calibration models for near infrared spectroscopy

    DEFF Research Database (Denmark)

    Ni, Wangdong; Nørgaard, Lars; Mørup, Morten

    2014-01-01

    by ridge regression (RR). The performance of the different methods is demonstrated by their practical applications using three real-life near infrared (NIR) data sets. Different aspects of the various approaches including computational time, model interpretability, potential over-fitting using the non-linear...... models on linear problems, robustness to small or medium sample sets, and robustness to pre-processing, are discussed. The results suggest that GPR and BANN are powerful and promising methods for handling linear as well as nonlinear systems, even when the data sets are moderately small. The LS......-SVM), relevance vector machines (RVM), Gaussian process regression (GPR), artificial neural network (ANN), and Bayesian ANN (BANN). In this comparison, partial least squares (PLS) regression is used as a linear benchmark, while the relationship of the methods is considered in terms of traditional calibration...

  19. Modern applied statistics with s-plus

    CERN Document Server

    Venables, W N

    1997-01-01

    S-PLUS is a powerful environment for the statistical and graphical analysis of data. It provides the tools to implement many statistical ideas which have been made possible by the widespread availability of workstations having good graphics and computational capabilities. This book is a guide to using S-PLUS to perform statistical analyses and provides both an introduction to the use of S-PLUS and a course in modern statistical methods. S-PLUS is available for both Windows and UNIX workstations, and both versions are covered in depth. The aim of the book is to show how to use S-PLUS as a powerful and graphical system. Readers are assumed to have a basic grounding in statistics, and so the book is intended for would-be users of S-PLUS, and both students and researchers using statistics. Throughout, the emphasis is on presenting practical problems and full analyses of real data sets. Many of the methods discussed are state-of-the-art approaches to topics such as linear and non-linear regression models, robust a...

  20. A d-statistic for single-case designs that is equivalent to the usual between-groups d-statistic.

    Science.gov (United States)

    Shadish, William R; Hedges, Larry V; Pustejovsky, James E; Boyajian, Jonathan G; Sullivan, Kristynn J; Andrade, Alma; Barrientos, Jeannette L

    2014-01-01

    We describe a standardised mean difference statistic (d) for single-case designs that is equivalent to the usual d in between-groups experiments. We show how it can be used to summarise treatment effects over cases within a study, to do power analyses in planning new studies and grant proposals, and to meta-analyse effects across studies of the same question. We discuss limitations of this d-statistic, and possible remedies to them. Even so, this d-statistic is better founded statistically than other effect size measures for single-case design, and unlike many general linear model approaches such as multilevel modelling or generalised additive models, it produces a standardised effect size that can be integrated over studies with different outcome measures. SPSS macros for both effect size computation and power analysis are available.

  1. [From clinical judgment to linear regression model.

    Science.gov (United States)

    Palacios-Cruz, Lino; Pérez, Marcela; Rivas-Ruiz, Rodolfo; Talavera, Juan O

    2013-01-01

    When we think about mathematical models, such as linear regression model, we think that these terms are only used by those engaged in research, a notion that is far from the truth. Legendre described the first mathematical model in 1805, and Galton introduced the formal term in 1886. Linear regression is one of the most commonly used regression models in clinical practice. It is useful to predict or show the relationship between two or more variables as long as the dependent variable is quantitative and has normal distribution. Stated in another way, the regression is used to predict a measure based on the knowledge of at least one other variable. Linear regression has as it's first objective to determine the slope or inclination of the regression line: Y = a + bx, where "a" is the intercept or regression constant and it is equivalent to "Y" value when "X" equals 0 and "b" (also called slope) indicates the increase or decrease that occurs when the variable "x" increases or decreases in one unit. In the regression line, "b" is called regression coefficient. The coefficient of determination (R 2 ) indicates the importance of independent variables in the outcome.

  2. Noise and non-linearities in high-throughput data

    International Nuclear Information System (INIS)

    Nguyen, Viet-Anh; Lió, Pietro; Koukolíková-Nicola, Zdena; Bagnoli, Franco

    2009-01-01

    High-throughput data analyses are becoming common in biology, communications, economics and sociology. The vast amounts of data are usually represented in the form of matrices and can be considered as knowledge networks. Spectra-based approaches have proved useful in extracting hidden information within such networks and for estimating missing data, but these methods are based essentially on linear assumptions. The physical models of matching, when applicable, often suggest non-linear mechanisms, that may sometimes be identified as noise. The use of non-linear models in data analysis, however, may require the introduction of many parameters, which lowers the statistical weight of the model. According to the quality of data, a simpler linear analysis may be more convenient than more complex approaches. In this paper, we show how a simple non-parametric Bayesian model may be used to explore the role of non-linearities and noise in synthetic and experimental data sets

  3. Uncertainty the soul of modeling, probability & statistics

    CERN Document Server

    Briggs, William

    2016-01-01

    This book presents a philosophical approach to probability and probabilistic thinking, considering the underpinnings of probabilistic reasoning and modeling, which effectively underlie everything in data science. The ultimate goal is to call into question many standard tenets and lay the philosophical and probabilistic groundwork and infrastructure for statistical modeling. It is the first book devoted to the philosophy of data aimed at working scientists and calls for a new consideration in the practice of probability and statistics to eliminate what has been referred to as the "Cult of Statistical Significance". The book explains the philosophy of these ideas and not the mathematics, though there are a handful of mathematical examples. The topics are logically laid out, starting with basic philosophy as related to probability, statistics, and science, and stepping through the key probabilistic ideas and concepts, and ending with statistical models. Its jargon-free approach asserts that standard methods, suc...

  4. Statistical Mechanics of the Human Placenta: A Stationary State of a Near-Equilibrium System in a Linear Regime.

    Science.gov (United States)

    Lecarpentier, Yves; Claes, Victor; Hébert, Jean-Louis; Krokidis, Xénophon; Blanc, François-Xavier; Michel, Francine; Timbely, Oumar

    2015-01-01

    All near-equilibrium systems under linear regime evolve to stationary states in which there is constant entropy production rate. In an open chemical system that exchanges matter and energy with the exterior, we can identify both the energy and entropy flows associated with the exchange of matter and energy. This can be achieved by applying statistical mechanics (SM), which links the microscopic properties of a system to its bulk properties. In the case of contractile tissues such as human placenta, Huxley's equations offer a phenomenological formalism for applying SM. SM was investigated in human placental stem villi (PSV) (n = 40). PSV were stimulated by means of KCl exposure (n = 20) and tetanic electrical stimulation (n = 20). This made it possible to determine statistical entropy (S), internal energy (E), affinity (A), thermodynamic force (A / T) (T: temperature), thermodynamic flow (v) and entropy production rate (A / T x v). We found that PSV operated near equilibrium, i.e., A ≺≺ 2500 J/mol and in a stationary linear regime, i.e., (A / T) varied linearly with v. As v was dramatically low, entropy production rate which quantified irreversibility of chemical processes appeared to be the lowest ever observed in any contractile system.

  5. Defining a Family of Cognitive Diagnosis Models Using Log-Linear Models with Latent Variables

    Science.gov (United States)

    Henson, Robert A.; Templin, Jonathan L.; Willse, John T.

    2009-01-01

    This paper uses log-linear models with latent variables (Hagenaars, in "Loglinear Models with Latent Variables," 1993) to define a family of cognitive diagnosis models. In doing so, the relationship between many common models is explicitly defined and discussed. In addition, because the log-linear model with latent variables is a general model for…

  6. Augmenting Data with Published Results in Bayesian Linear Regression

    Science.gov (United States)

    de Leeuw, Christiaan; Klugkist, Irene

    2012-01-01

    In most research, linear regression analyses are performed without taking into account published results (i.e., reported summary statistics) of similar previous studies. Although the prior density in Bayesian linear regression could accommodate such prior knowledge, formal models for doing so are absent from the literature. The goal of this…

  7. Phylogenetic mixtures and linear invariants for equal input models.

    Science.gov (United States)

    Casanellas, Marta; Steel, Mike

    2017-04-01

    The reconstruction of phylogenetic trees from molecular sequence data relies on modelling site substitutions by a Markov process, or a mixture of such processes. In general, allowing mixed processes can result in different tree topologies becoming indistinguishable from the data, even for infinitely long sequences. However, when the underlying Markov process supports linear phylogenetic invariants, then provided these are sufficiently informative, the identifiability of the tree topology can be restored. In this paper, we investigate a class of processes that support linear invariants once the stationary distribution is fixed, the 'equal input model'. This model generalizes the 'Felsenstein 1981' model (and thereby the Jukes-Cantor model) from four states to an arbitrary number of states (finite or infinite), and it can also be described by a 'random cluster' process. We describe the structure and dimension of the vector spaces of phylogenetic mixtures and of linear invariants for any fixed phylogenetic tree (and for all trees-the so called 'model invariants'), on any number n of leaves. We also provide a precise description of the space of mixtures and linear invariants for the special case of [Formula: see text] leaves. By combining techniques from discrete random processes and (multi-) linear algebra, our results build on a classic result that was first established by James Lake (Mol Biol Evol 4:167-191, 1987).

  8. A statistical model describing combined irreversible electroporation and electroporation-induced blood-brain barrier disruption.

    Science.gov (United States)

    Sharabi, Shirley; Kos, Bor; Last, David; Guez, David; Daniels, Dianne; Harnof, Sagi; Mardor, Yael; Miklavcic, Damijan

    2016-03-01

    Electroporation-based therapies such as electrochemotherapy (ECT) and irreversible electroporation (IRE) are emerging as promising tools for treatment of tumors. When applied to the brain, electroporation can also induce transient blood-brain-barrier (BBB) disruption in volumes extending beyond IRE, thus enabling efficient drug penetration. The main objective of this study was to develop a statistical model predicting cell death and BBB disruption induced by electroporation. This model can be used for individual treatment planning. Cell death and BBB disruption models were developed based on the Peleg-Fermi model in combination with numerical models of the electric field. The model calculates the electric field thresholds for cell kill and BBB disruption and describes the dependence on the number of treatment pulses. The model was validated using in vivo experimental data consisting of rats brains MRIs post electroporation treatments. Linear regression analysis confirmed that the model described the IRE and BBB disruption volumes as a function of treatment pulses number (r(2) = 0.79; p disruption, the ratio increased with the number of pulses. BBB disruption radii were on average 67% ± 11% larger than IRE volumes. The statistical model can be used to describe the dependence of treatment-effects on the number of pulses independent of the experimental setup.

  9. Statistical Models for Social Networks

    NARCIS (Netherlands)

    Snijders, Tom A. B.; Cook, KS; Massey, DS

    2011-01-01

    Statistical models for social networks as dependent variables must represent the typical network dependencies between tie variables such as reciprocity, homophily, transitivity, etc. This review first treats models for single (cross-sectionally observed) networks and then for network dynamics. For

  10. Comparison of Linear and Non-linear Regression Analysis to Determine Pulmonary Pressure in Hyperthyroidism.

    Science.gov (United States)

    Scarneciu, Camelia C; Sangeorzan, Livia; Rus, Horatiu; Scarneciu, Vlad D; Varciu, Mihai S; Andreescu, Oana; Scarneciu, Ioan

    2017-01-01

    This study aimed at assessing the incidence of pulmonary hypertension (PH) at newly diagnosed hyperthyroid patients and at finding a simple model showing the complex functional relation between pulmonary hypertension in hyperthyroidism and the factors causing it. The 53 hyperthyroid patients (H-group) were evaluated mainly by using an echocardiographical method and compared with 35 euthyroid (E-group) and 25 healthy people (C-group). In order to identify the factors causing pulmonary hypertension the statistical method of comparing the values of arithmetical means is used. The functional relation between the two random variables (PAPs and each of the factors determining it within our research study) can be expressed by linear or non-linear function. By applying the linear regression method described by a first-degree equation the line of regression (linear model) has been determined; by applying the non-linear regression method described by a second degree equation, a parabola-type curve of regression (non-linear or polynomial model) has been determined. We made the comparison and the validation of these two models by calculating the determination coefficient (criterion 1), the comparison of residuals (criterion 2), application of AIC criterion (criterion 3) and use of F-test (criterion 4). From the H-group, 47% have pulmonary hypertension completely reversible when obtaining euthyroidism. The factors causing pulmonary hypertension were identified: previously known- level of free thyroxin, pulmonary vascular resistance, cardiac output; new factors identified in this study- pretreatment period, age, systolic blood pressure. According to the four criteria and to the clinical judgment, we consider that the polynomial model (graphically parabola- type) is better than the linear one. The better model showing the functional relation between the pulmonary hypertension in hyperthyroidism and the factors identified in this study is given by a polynomial equation of second

  11. Forecasting the EMU inflation rate: Linear econometric vs. non-linear computational models using genetic neural fuzzy systems

    DEFF Research Database (Denmark)

    Kooths, Stefan; Mitze, Timo Friedel; Ringhut, Eric

    2004-01-01

    This paper compares the predictive power of linear econometric and non-linear computational models for forecasting the inflation rate in the European Monetary Union (EMU). Various models of both types are developed using different monetary and real activity indicators. They are compared according...

  12. A BEHAVIORAL-APPROACH TO LINEAR EXACT MODELING

    NARCIS (Netherlands)

    ANTOULAS, AC; WILLEMS, JC

    1993-01-01

    The behavioral approach to system theory provides a parameter-free framework for the study of the general problem of linear exact modeling and recursive modeling. The main contribution of this paper is the solution of the (continuous-time) polynomial-exponential time series modeling problem. Both

  13. Functional summary statistics for the Johnson-Mehl model

    DEFF Research Database (Denmark)

    Møller, Jesper; Ghorbani, Mohammad

    The Johnson-Mehl germination-growth model is a spatio-temporal point process model which among other things have been used for the description of neurotransmitters datasets. However, for such datasets parametric Johnson-Mehl models fitted by maximum likelihood have yet not been evaluated by means...... of functional summary statistics. This paper therefore invents four functional summary statistics adapted to the Johnson-Mehl model, with two of them based on the second-order properties and the other two on the nuclei-boundary distances for the associated Johnson-Mehl tessellation. The functional summary...... statistics theoretical properties are investigated, non-parametric estimators are suggested, and their usefulness for model checking is examined in a simulation study. The functional summary statistics are also used for checking fitted parametric Johnson-Mehl models for a neurotransmitters dataset....

  14. Statistical limitations in functional neuroimaging. I. Non-inferential methods and statistical models.

    Science.gov (United States)

    Petersson, K M; Nichols, T E; Poline, J B; Holmes, A P

    1999-01-01

    Functional neuroimaging (FNI) provides experimental access to the intact living brain making it possible to study higher cognitive functions in humans. In this review and in a companion paper in this issue, we discuss some common methods used to analyse FNI data. The emphasis in both papers is on assumptions and limitations of the methods reviewed. There are several methods available to analyse FNI data indicating that none is optimal for all purposes. In order to make optimal use of the methods available it is important to know the limits of applicability. For the interpretation of FNI results it is also important to take into account the assumptions, approximations and inherent limitations of the methods used. This paper gives a brief overview over some non-inferential descriptive methods and common statistical models used in FNI. Issues relating to the complex problem of model selection are discussed. In general, proper model selection is a necessary prerequisite for the validity of the subsequent statistical inference. The non-inferential section describes methods that, combined with inspection of parameter estimates and other simple measures, can aid in the process of model selection and verification of assumptions. The section on statistical models covers approaches to global normalization and some aspects of univariate, multivariate, and Bayesian models. Finally, approaches to functional connectivity and effective connectivity are discussed. In the companion paper we review issues related to signal detection and statistical inference. PMID:10466149

  15. Statistical mechanics of nonequilibrium liquids

    CERN Document Server

    Evans, Denis J; Craig, D P; McWeeny, R

    1990-01-01

    Statistical Mechanics of Nonequilibrium Liquids deals with theoretical rheology. The book discusses nonlinear response of systems and outlines the statistical mechanical theory. In discussing the framework of nonequilibrium statistical mechanics, the book explains the derivation of a nonequilibrium analogue of the Gibbsian basis for equilibrium statistical mechanics. The book reviews the linear irreversible thermodynamics, the Liouville equation, and the Irving-Kirkwood procedure. The text then explains the Green-Kubo relations used in linear transport coefficients, the linear response theory,

  16. Preisach hysteresis model for non-linear 2D heat diffusion

    International Nuclear Information System (INIS)

    Jancskar, Ildiko; Ivanyi, Amalia

    2006-01-01

    This paper analyzes a non-linear heat diffusion process when the thermal diffusivity behaviour is a hysteretic function of the temperature. Modelling this temperature dependence, the discrete Preisach algorithm as general hysteresis model has been integrated into a non-linear multigrid solver. The hysteretic diffusion shows a heating-cooling asymmetry in character. The presented type of hysteresis speeds up the thermal processes in the modelled systems by a very interesting non-linear way

  17. Statistical post-processing of seasonal multi-model forecasts: Why is it so hard to beat the multi-model mean?

    Science.gov (United States)

    Siegert, Stefan

    2017-04-01

    Initialised climate forecasts on seasonal time scales, run several months or even years ahead, are now an integral part of the battery of products offered by climate services world-wide. The availability of seasonal climate forecasts from various modeling centres gives rise to multi-model ensemble forecasts. Post-processing such seasonal-to-decadal multi-model forecasts is challenging 1) because the cross-correlation structure between multiple models and observations can be complicated, 2) because the amount of training data to fit the post-processing parameters is very limited, and 3) because the forecast skill of numerical models tends to be low on seasonal time scales. In this talk I will review new statistical post-processing frameworks for multi-model ensembles. I will focus particularly on Bayesian hierarchical modelling approaches, which are flexible enough to capture commonly made assumptions about collective and model-specific biases of multi-model ensembles. Despite the advances in statistical methodology, it turns out to be very difficult to out-perform the simplest post-processing method, which just recalibrates the multi-model ensemble mean by linear regression. I will discuss reasons for this, which are closely linked to the specific characteristics of seasonal multi-model forecasts. I explore possible directions for improvements, for example using informative priors on the post-processing parameters, and jointly modelling forecasts and observations.

  18. Using Artificial Neural Networks in Educational Research: Some Comparisons with Linear Statistical Models.

    Science.gov (United States)

    Everson, Howard T.; And Others

    This paper explores the feasibility of neural computing methods such as artificial neural networks (ANNs) and abductory induction mechanisms (AIM) for use in educational measurement. ANNs and AIMS methods are contrasted with more traditional statistical techniques, such as multiple regression and discriminant function analyses, for making…

  19. Tip-tilt disturbance model identification based on non-linear least squares fitting for Linear Quadratic Gaussian control

    Science.gov (United States)

    Yang, Kangjian; Yang, Ping; Wang, Shuai; Dong, Lizhi; Xu, Bing

    2018-05-01

    We propose a method to identify tip-tilt disturbance model for Linear Quadratic Gaussian control. This identification method based on Levenberg-Marquardt method conducts with a little prior information and no auxiliary system and it is convenient to identify the tip-tilt disturbance model on-line for real-time control. This identification method makes it easy that Linear Quadratic Gaussian control runs efficiently in different adaptive optics systems for vibration mitigation. The validity of the Linear Quadratic Gaussian control associated with this tip-tilt disturbance model identification method is verified by experimental data, which is conducted in replay mode by simulation.

  20. Logistic and linear regression model documentation for statistical relations between continuous real-time and discrete water-quality constituents in the Kansas River, Kansas, July 2012 through June 2015

    Science.gov (United States)

    Foster, Guy M.; Graham, Jennifer L.

    2016-04-06

    The Kansas River is a primary source of drinking water for about 800,000 people in northeastern Kansas. Source-water supplies are treated by a combination of chemical and physical processes to remove contaminants before distribution. Advanced notification of changing water-quality conditions and cyanobacteria and associated toxin and taste-and-odor compounds provides drinking-water treatment facilities time to develop and implement adequate treatment strategies. The U.S. Geological Survey (USGS), in cooperation with the Kansas Water Office (funded in part through the Kansas State Water Plan Fund), and the City of Lawrence, the City of Topeka, the City of Olathe, and Johnson County Water One, began a study in July 2012 to develop statistical models at two Kansas River sites located upstream from drinking-water intakes. Continuous water-quality monitors have been operated and discrete-water quality samples have been collected on the Kansas River at Wamego (USGS site number 06887500) and De Soto (USGS site number 06892350) since July 2012. Continuous and discrete water-quality data collected during July 2012 through June 2015 were used to develop statistical models for constituents of interest at the Wamego and De Soto sites. Logistic models to continuously estimate the probability of occurrence above selected thresholds were developed for cyanobacteria, microcystin, and geosmin. Linear regression models to continuously estimate constituent concentrations were developed for major ions, dissolved solids, alkalinity, nutrients (nitrogen and phosphorus species), suspended sediment, indicator bacteria (Escherichia coli, fecal coliform, and enterococci), and actinomycetes bacteria. These models will be used to provide real-time estimates of the probability that cyanobacteria and associated compounds exceed thresholds and of the concentrations of other water-quality constituents in the Kansas River. The models documented in this report are useful for characterizing changes

  1. Distributions with given marginals and statistical modelling

    CERN Document Server

    Fortiana, Josep; Rodriguez-Lallena, José

    2002-01-01

    This book contains a selection of the papers presented at the meeting `Distributions with given marginals and statistical modelling', held in Barcelona (Spain), July 17-20, 2000. In 24 chapters, this book covers topics such as the theory of copulas and quasi-copulas, the theory and compatibility of distributions, models for survival distributions and other well-known distributions, time series, categorical models, definition and estimation of measures of dependence, monotonicity and stochastic ordering, shape and separability of distributions, hidden truncation models, diagonal families, orthogonal expansions, tests of independence, and goodness of fit assessment. These topics share the use and properties of distributions with given marginals, this being the fourth specialised text on this theme. The innovative aspect of the book is the inclusion of statistical aspects such as modelling, Bayesian statistics, estimation, and tests.

  2. On the interpretation of weight vectors of linear models in multivariate neuroimaging.

    Science.gov (United States)

    Haufe, Stefan; Meinecke, Frank; Görgen, Kai; Dähne, Sven; Haynes, John-Dylan; Blankertz, Benjamin; Bießmann, Felix

    2014-02-15

    The increase in spatiotemporal resolution of neuroimaging devices is accompanied by a trend towards more powerful multivariate analysis methods. Often it is desired to interpret the outcome of these methods with respect to the cognitive processes under study. Here we discuss which methods allow for such interpretations, and provide guidelines for choosing an appropriate analysis for a given experimental goal: For a surgeon who needs to decide where to remove brain tissue it is most important to determine the origin of cognitive functions and associated neural processes. In contrast, when communicating with paralyzed or comatose patients via brain-computer interfaces, it is most important to accurately extract the neural processes specific to a certain mental state. These equally important but complementary objectives require different analysis methods. Determining the origin of neural processes in time or space from the parameters of a data-driven model requires what we call a forward model of the data; such a model explains how the measured data was generated from the neural sources. Examples are general linear models (GLMs). Methods for the extraction of neural information from data can be considered as backward models, as they attempt to reverse the data generating process. Examples are multivariate classifiers. Here we demonstrate that the parameters of forward models are neurophysiologically interpretable in the sense that significant nonzero weights are only observed at channels the activity of which is related to the brain process under study. In contrast, the interpretation of backward model parameters can lead to wrong conclusions regarding the spatial or temporal origin of the neural signals of interest, since significant nonzero weights may also be observed at channels the activity of which is statistically independent of the brain process under study. As a remedy for the linear case, we propose a procedure for transforming backward models into forward

  3. On-line validation of linear process models using generalized likelihood ratios

    International Nuclear Information System (INIS)

    Tylee, J.L.

    1981-12-01

    A real-time method for testing the validity of linear models of nonlinear processes is described and evaluated. Using generalized likelihood ratios, the model dynamics are continually monitored to see if the process has moved far enough away from the nominal linear model operating point to justify generation of a new linear model. The method is demonstrated using a seventh-order model of a natural circulation steam generator

  4. Strangeness content and structure function of the nucleon in a statistical quark model

    CERN Document Server

    Trevisan, L A; Tomio, L

    1999-01-01

    The strangeness content of the nucleon is determined from a statistical model using confined quark levels, and is shown to have a good agreement with the corresponding values extracted from experimental data. The quark levels are generated in a Dirac equation that uses a linear confining potential (scalar plus vector). With the requirement that the result for the Gottfried sum rule violation, given by the new muon collaboration (NMC), is well reproduced, we also obtain the difference between the structure functions of the proton and neutron, and the corresponding sea quark contributions. (27 refs).

  5. A statistical theory of cell killing by radiation of varying linear energy transfer

    International Nuclear Information System (INIS)

    Hawkins, R.B.

    1994-01-01

    A theory is presented that provides an explanation for the observed features of the survival of cultured cells after exposure to densely ionizing high-linear energy transfer (LET) radiation. It starts from a phenomenological postulate based on the linear-quadratic form of cell survival observed for low-LET radiation and uses principles of statistics and fluctuation theory to demonstrate that the effect of varying LET on cell survival can be attributed to random variation of dose to small volumes contained within the nucleus. A simple relation is presented for surviving fraction of cells after exposure to radiation of varying LET that depends on the α and β parameters for the same cells in the limit of low-LET radiation. This relation implies that the value of β is independent of LET. Agreement of the theory with selected observations of cell survival from the literature is demonstrated. A relation is presented that gives relative biological effectiveness (RBE) as a function of the α and β parameters for low-LET radiation. Measurements from microdosimetry are used to estimate the size of the subnuclear volume to which the fluctuation pertains. 11 refs., 4 figs., 2 tabs

  6. Second-order kinetic model for the sorption of cadmium onto tree fern: a comparison of linear and non-linear methods.

    Science.gov (United States)

    Ho, Yuh-Shan

    2006-01-01

    A comparison was made of the linear least-squares method and a trial-and-error non-linear method of the widely used pseudo-second-order kinetic model for the sorption of cadmium onto ground-up tree fern. Four pseudo-second-order kinetic linear equations are discussed. Kinetic parameters obtained from the four kinetic linear equations using the linear method differed but they were the same when using the non-linear method. A type 1 pseudo-second-order linear kinetic model has the highest coefficient of determination. Results show that the non-linear method may be a better way to obtain the desired parameters.

  7. Identification of an Equivalent Linear Model for a Non-Linear Time-Variant RC-Structure

    DEFF Research Database (Denmark)

    Kirkegaard, Poul Henning; Andersen, P.; Brincker, Rune

    are investigated and compared with ARMAX models used on a running window. The techniques are evaluated using simulated data generated by the non-linear finite element program SARCOF modeling a 10-storey 3-bay concrete structure subjected to amplitude modulated Gaussian white noise filtered through a Kanai......This paper considers estimation of the maximum softening for a RC-structure subjected to earthquake excitation. The so-called Maximum Softening damage indicator relates the global damage state of the RC-structure to the relative decrease of the fundamental eigenfrequency in an equivalent linear...

  8. On-line control models for the Stanford Linear Collider

    International Nuclear Information System (INIS)

    Sheppard, J.C.; Helm, R.H.; Lee, M.J.; Woodley, M.D.

    1983-03-01

    Models for computer control of the SLAC three-kilometer linear accelerator and damping rings have been developed as part of the control system for the Stanford Linear Collider. Some of these models have been tested experimentally and implemented in the control program for routine linac operations. This paper will describe the development and implementation of these models, as well as some of the operational results

  9. Vortices, semi-local vortices in gauged linear sigma model

    International Nuclear Information System (INIS)

    Kim, Namkwon

    1998-11-01

    We consider the static (2+1)D gauged linear sigma model. By analyzing the governing system of partial differential equations, we investigate various aspects of the model. We show the existence of energy finite vortices under a partially broken symmetry on R 2 with the necessary condition suggested by Y. Yang. We also introduce generalized semi-local vortices and show the existence of energy finite semi-local vortices under a certain condition. The vacuum manifold for the semi-local vortices turns out to be graded. Besides, with a special choice of a representation, we show that the O(3) sigma model of which target space is nonlinear is a singular limit of the gauged linear sigma model of which target space is linear. (author)

  10. Robust design in generelaised linear models for improving the quality of polyurethane soles

    Directory of Open Access Journals (Sweden)

    Castro, Armando Mares

    2015-11-01

    Full Text Available In a process that manufactures polyurethane soles by casting, a number of problems lead to different types of defects on the sole, causing significant economic losses for the company. In order to improve the product quality and decrease the number of defects, this study conducts an experimental design in the context of robust design. Since the response variable is binary, the statistical analysis was performed using generalised linear models. The operational methodology reduced the percentage of defects, while combining the experimental technique and control systems to achieve superior quality and a consequent reduction in costs.

  11. Inverse estimation of multiple muscle activations based on linear logistic regression.

    Science.gov (United States)

    Sekiya, Masashi; Tsuji, Toshiaki

    2017-07-01

    This study deals with a technology to estimate the muscle activity from the movement data using a statistical model. A linear regression (LR) model and artificial neural networks (ANN) have been known as statistical models for such use. Although ANN has a high estimation capability, it is often in the clinical application that the lack of data amount leads to performance deterioration. On the other hand, the LR model has a limitation in generalization performance. We therefore propose a muscle activity estimation method to improve the generalization performance through the use of linear logistic regression model. The proposed method was compared with the LR model and ANN in the verification experiment with 7 participants. As a result, the proposed method showed better generalization performance than the conventional methods in various tasks.

  12. Improved Statistical Fault Detection Technique and Application to Biological Phenomena Modeled by S-Systems.

    Science.gov (United States)

    Mansouri, Majdi; Nounou, Mohamed N; Nounou, Hazem N

    2017-09-01

    In our previous work, we have demonstrated the effectiveness of the linear multiscale principal component analysis (PCA)-based moving window (MW)-generalized likelihood ratio test (GLRT) technique over the classical PCA and multiscale principal component analysis (MSPCA)-based GLRT methods. The developed fault detection algorithm provided optimal properties by maximizing the detection probability for a particular false alarm rate (FAR) with different values of windows, and however, most real systems are nonlinear, which make the linear PCA method not able to tackle the issue of non-linearity to a great extent. Thus, in this paper, first, we apply a nonlinear PCA to obtain an accurate principal component of a set of data and handle a wide range of nonlinearities using the kernel principal component analysis (KPCA) model. The KPCA is among the most popular nonlinear statistical methods. Second, we extend the MW-GLRT technique to one that utilizes exponential weights to residuals in the moving window (instead of equal weightage) as it might be able to further improve fault detection performance by reducing the FAR using exponentially weighed moving average (EWMA). The developed detection method, which is called EWMA-GLRT, provides improved properties, such as smaller missed detection and FARs and smaller average run length. The idea behind the developed EWMA-GLRT is to compute a new GLRT statistic that integrates current and previous data information in a decreasing exponential fashion giving more weight to the more recent data. This provides a more accurate estimation of the GLRT statistic and provides a stronger memory that will enable better decision making with respect to fault detection. Therefore, in this paper, a KPCA-based EWMA-GLRT method is developed and utilized in practice to improve fault detection in biological phenomena modeled by S-systems and to enhance monitoring process mean. The idea behind a KPCA-based EWMA-GLRT fault detection algorithm is to

  13. Structured statistical models of inductive reasoning.

    Science.gov (United States)

    Kemp, Charles; Tenenbaum, Joshua B

    2009-01-01

    Everyday inductive inferences are often guided by rich background knowledge. Formal models of induction should aim to incorporate this knowledge and should explain how different kinds of knowledge lead to the distinctive patterns of reasoning found in different inductive contexts. This article presents a Bayesian framework that attempts to meet both goals and describes [corrected] 4 applications of the framework: a taxonomic model, a spatial model, a threshold model, and a causal model. Each model makes probabilistic inferences about the extensions of novel properties, but the priors for the 4 models are defined over different kinds of structures that capture different relationships between the categories in a domain. The framework therefore shows how statistical inference can operate over structured background knowledge, and the authors argue that this interaction between structure and statistics is critical for explaining the power and flexibility of human reasoning.

  14. Applied matrix algebra in the statistical sciences

    CERN Document Server

    Basilevsky, Alexander

    2005-01-01

    This comprehensive text offers teachings relevant to both applied and theoretical branches of matrix algebra and provides a bridge between linear algebra and statistical models. Appropriate for advanced undergraduate and graduate students. 1983 edition.

  15. Model Selection with the Linear Mixed Model for Longitudinal Data

    Science.gov (United States)

    Ryoo, Ji Hoon

    2011-01-01

    Model building or model selection with linear mixed models (LMMs) is complicated by the presence of both fixed effects and random effects. The fixed effects structure and random effects structure are codependent, so selection of one influences the other. Most presentations of LMM in psychology and education are based on a multilevel or…

  16. Application of linearized model to the stability analysis of the pressurized water reactor

    International Nuclear Information System (INIS)

    Li Haipeng; Huang Xiaojin; Zhang Liangju

    2008-01-01

    A Linear Time-Invariant model of the Pressurized Water Reactor is formulated through the linearization of the nonlinear model. The model simulation results show that the linearized model agrees well with the nonlinear model under small perturbation. Based upon the Lyapunov's First Method, the linearized model is applied to the stability analysis of the Pressurized Water Reactor. The calculation results show that the methodology of linearization to stability analysis is conveniently feasible. (authors)

  17. A Non-linear Stochastic Model for an Office Building with Air Infiltration

    DEFF Research Database (Denmark)

    Thavlov, Anders; Madsen, Henrik

    2015-01-01

    This paper presents a non-linear heat dynamic model for a multi-room office building with air infiltration. Several linear and non-linear models, with and without air infiltration, are investigated and compared. The models are formulated using stochastic differential equations and the model...

  18. Stochastic linear programming models, theory, and computation

    CERN Document Server

    Kall, Peter

    2011-01-01

    This new edition of Stochastic Linear Programming: Models, Theory and Computation has been brought completely up to date, either dealing with or at least referring to new material on models and methods, including DEA with stochastic outputs modeled via constraints on special risk functions (generalizing chance constraints, ICC’s and CVaR constraints), material on Sharpe-ratio, and Asset Liability Management models involving CVaR in a multi-stage setup. To facilitate use as a text, exercises are included throughout the book, and web access is provided to a student version of the authors’ SLP-IOR software. Additionally, the authors have updated the Guide to Available Software, and they have included newer algorithms and modeling systems for SLP. The book is thus suitable as a text for advanced courses in stochastic optimization, and as a reference to the field. From Reviews of the First Edition: "The book presents a comprehensive study of stochastic linear optimization problems and their applications. … T...

  19. Multivariate covariance generalized linear models

    DEFF Research Database (Denmark)

    Bonat, W. H.; Jørgensen, Bent

    2016-01-01

    are fitted by using an efficient Newton scoring algorithm based on quasi-likelihood and Pearson estimating functions, using only second-moment assumptions. This provides a unified approach to a wide variety of types of response variables and covariance structures, including multivariate extensions......We propose a general framework for non-normal multivariate data analysis called multivariate covariance generalized linear models, designed to handle multivariate response variables, along with a wide range of temporal and spatial correlation structures defined in terms of a covariance link...... function combined with a matrix linear predictor involving known matrices. The method is motivated by three data examples that are not easily handled by existing methods. The first example concerns multivariate count data, the second involves response variables of mixed types, combined with repeated...

  20. Modelling a linear PM motor including magnetic saturation

    NARCIS (Netherlands)

    Polinder, H.; Slootweg, J.G.; Compter, J.C.; Hoeijmakers, M.J.

    2002-01-01

    The use of linear permanent-magnet (PM) actuators increases in a wide variety of applications because of the high force density, robustness and accuracy. The paper describes the modelling of a linear PM motor applied in, for example, wafer steppers, including magnetic saturation. This is important

  1. Stationarity and periodicities of linear speed of coronal mass ejection: a statistical signal processing approach

    Science.gov (United States)

    Chattopadhyay, Anirban; Khondekar, Mofazzal Hossain; Bhattacharjee, Anup Kumar

    2017-09-01

    In this paper initiative has been taken to search the periodicities of linear speed of Coronal Mass Ejection in solar cycle 23. Double exponential smoothing and Discrete Wavelet Transform are being used for detrending and filtering of the CME linear speed time series. To choose the appropriate statistical methodology for the said purpose, Smoothed Pseudo Wigner-Ville distribution (SPWVD) has been used beforehand to confirm the non-stationarity of the time series. The Time-Frequency representation tool like Hilbert Huang Transform and Empirical Mode decomposition has been implemented to unearth the underneath periodicities in the non-stationary time series of the linear speed of CME. Of all the periodicities having more than 95% Confidence Level, the relevant periodicities have been segregated out using Integral peak detection algorithm. The periodicities observed are of low scale ranging from 2-159 days with some relevant periods like 4 days, 10 days, 11 days, 12 days, 13.7 days, 14.5 and 21.6 days. These short range periodicities indicate the probable origin of the CME is the active longitude and the magnetic flux network of the sun. The results also insinuate about the probable mutual influence and causality with other solar activities (like solar radio emission, Ap index, solar wind speed, etc.) owing to the similitude between their periods and CME linear speed periods. The periodicities of 4 days and 10 days indicate the possible existence of the Rossby-type waves or planetary waves in Sun.

  2. Parameter Identification of Piecewise Linear Plasticity Metal Models Used in Numerical Modeling of Structures Under Plastic Deformation and Failure

    Directory of Open Access Journals (Sweden)

    A. V. Shmeliov

    2016-01-01

    Full Text Available The article describes the models of metallic materials used in the calculation of deformation and destruction of engineering structures. The reliability of material models can adequately assess the strength characteristics of the designs of new technology in its designing and certification.The article deals with contingencies and true mechanical properties of materials and presents equations of their relationship. It notes that in the software systems mechanical characteristics of materials are given in the true sense.The paper considers the linear and exponential models of materials, their characteristics, and methods to implement them. It considers the models of Johnson-Cook Steinberg-Guinan, Zerilli-Armstrong, Cowper-Symonds, Gurson-Tvergaard that take into account the strain rate and temperature of the material. Describes their applications, advantages and disadvantages. Considers single- and multi-parameter criteria of materials fracture, the prospects for their use. Gives a rational justification for using a piecewise linear plasticity material model *MAT_PIECEWISE_LINEAR_PLASTICITY (024, LS-DYNA software package for the engineering industry, and presents its main parameters.A technique to identify parameters of piecewise linear plasticity metal material models has been developed. The technique consists of the stages, based on the equations of transition from the conventional stress and strain values to the true ones. Taking into consideration the stressstrain state in the neck of the sample is a distinctive feature of the technique.Tensile tests of the round material samples have been conducted. To test the developed technique in the software package ANSYS LS-DYNA PC have been made tensile sample modeling and results comparison to show high convergence.Further improvement of the technique can be achieved through the development of a statistical approach to the analysis of the results of a series of tests. This will allow a kind of

  3. Genomic prediction based on data from three layer lines using non-linear regression models.

    Science.gov (United States)

    Huang, Heyun; Windig, Jack J; Vereijken, Addie; Calus, Mario P L

    2014-11-06

    Most studies on genomic prediction with reference populations that include multiple lines or breeds have used linear models. Data heterogeneity due to using multiple populations may conflict with model assumptions used in linear regression methods. In an attempt to alleviate potential discrepancies between assumptions of linear models and multi-population data, two types of alternative models were used: (1) a multi-trait genomic best linear unbiased prediction (GBLUP) model that modelled trait by line combinations as separate but correlated traits and (2) non-linear models based on kernel learning. These models were compared to conventional linear models for genomic prediction for two lines of brown layer hens (B1 and B2) and one line of white hens (W1). The three lines each had 1004 to 1023 training and 238 to 240 validation animals. Prediction accuracy was evaluated by estimating the correlation between observed phenotypes and predicted breeding values. When the training dataset included only data from the evaluated line, non-linear models yielded at best a similar accuracy as linear models. In some cases, when adding a distantly related line, the linear models showed a slight decrease in performance, while non-linear models generally showed no change in accuracy. When only information from a closely related line was used for training, linear models and non-linear radial basis function (RBF) kernel models performed similarly. The multi-trait GBLUP model took advantage of the estimated genetic correlations between the lines. Combining linear and non-linear models improved the accuracy of multi-line genomic prediction. Linear models and non-linear RBF models performed very similarly for genomic prediction, despite the expectation that non-linear models could deal better with the heterogeneous multi-population data. This heterogeneity of the data can be overcome by modelling trait by line combinations as separate but correlated traits, which avoids the occasional

  4. Statistical network analysis for analyzing policy networks

    DEFF Research Database (Denmark)

    Robins, Garry; Lewis, Jenny; Wang, Peng

    2012-01-01

    and policy network methodology is the development of statistical modeling approaches that can accommodate such dependent data. In this article, we review three network statistical methods commonly used in the current literature: quadratic assignment procedures, exponential random graph models (ERGMs......To analyze social network data using standard statistical approaches is to risk incorrect inference. The dependencies among observations implied in a network conceptualization undermine standard assumptions of the usual general linear models. One of the most quickly expanding areas of social......), and stochastic actor-oriented models. We focus most attention on ERGMs by providing an illustrative example of a model for a strategic information network within a local government. We draw inferences about the structural role played by individuals recognized as key innovators and conclude that such an approach...

  5. Determining Predictor Importance in Hierarchical Linear Models Using Dominance Analysis

    Science.gov (United States)

    Luo, Wen; Azen, Razia

    2013-01-01

    Dominance analysis (DA) is a method used to evaluate the relative importance of predictors that was originally proposed for linear regression models. This article proposes an extension of DA that allows researchers to determine the relative importance of predictors in hierarchical linear models (HLM). Commonly used measures of model adequacy in…

  6. General mirror pairs for gauged linear sigma models

    Energy Technology Data Exchange (ETDEWEB)

    Aspinwall, Paul S.; Plesser, M. Ronen [Departments of Mathematics and Physics, Duke University,Box 90320, Durham, NC 27708-0320 (United States)

    2015-11-05

    We carefully analyze the conditions for an abelian gauged linear σ-model to exhibit nontrivial IR behavior described by a nonsingular superconformal field theory determining a superstring vacuum. This is done without reference to a geometric phase, by associating singular behavior to a noncompact space of (semi-)classical vacua. We find that models determined by reflexive combinatorial data are nonsingular for generic values of their parameters. This condition has the pleasant feature that the mirror of a nonsingular gauged linear σ-model is another such model, but it is clearly too strong and we provide an example of a non-reflexive mirror pair. We discuss a weaker condition inspired by considering extremal transitions, which is also mirror symmetric and which we conjecture to be sufficient. We apply these ideas to extremal transitions and to understanding the way in which both Berglund-Hübsch mirror symmetry and the Vafa-Witten mirror orbifold with discrete torsion can be seen as special cases of the general combinatorial duality of gauged linear σ-models. In the former case we encounter an example showing that our weaker condition is still not necessary.

  7. General mirror pairs for gauged linear sigma models

    International Nuclear Information System (INIS)

    Aspinwall, Paul S.; Plesser, M. Ronen

    2015-01-01

    We carefully analyze the conditions for an abelian gauged linear σ-model to exhibit nontrivial IR behavior described by a nonsingular superconformal field theory determining a superstring vacuum. This is done without reference to a geometric phase, by associating singular behavior to a noncompact space of (semi-)classical vacua. We find that models determined by reflexive combinatorial data are nonsingular for generic values of their parameters. This condition has the pleasant feature that the mirror of a nonsingular gauged linear σ-model is another such model, but it is clearly too strong and we provide an example of a non-reflexive mirror pair. We discuss a weaker condition inspired by considering extremal transitions, which is also mirror symmetric and which we conjecture to be sufficient. We apply these ideas to extremal transitions and to understanding the way in which both Berglund-Hübsch mirror symmetry and the Vafa-Witten mirror orbifold with discrete torsion can be seen as special cases of the general combinatorial duality of gauged linear σ-models. In the former case we encounter an example showing that our weaker condition is still not necessary.

  8. Half-trek criterion for generic identifiability of linear structural equation models

    NARCIS (Netherlands)

    Foygel, R.; Draisma, J.; Drton, M.

    2012-01-01

    A linear structural equation model relates random variables of interest and corresponding Gaussian noise terms via a linear equation system. Each such model can be represented by a mixed graph in which directed edges encode the linear equations, and bidirected edges indicate possible correlations

  9. Half-trek criterion for generic identifiability of linear structural equation models

    NARCIS (Netherlands)

    Foygel, R.; Draisma, J.; Drton, M.

    2011-01-01

    A linear structural equation model relates random variables of interest and corresponding Gaussian noise terms via a linear equation system. Each such model can be represented by a mixed graph in which directed edges encode the linear equations, and bidirected edges indicate possible correlations

  10. Spectral-Lagrangian methods for collisional models of non-equilibrium statistical states

    International Nuclear Information System (INIS)

    Gamba, Irene M.; Tharkabhushanam, Sri Harsha

    2009-01-01

    We propose a new spectral Lagrangian based deterministic solver for the non-linear Boltzmann transport equation (BTE) in d-dimensions for variable hard sphere (VHS) collision kernels with conservative or non-conservative binary interactions. The method is based on symmetries of the Fourier transform of the collision integral, where the complexity in its computation is reduced to a separate integral over the unit sphere S d-1 . The conservation of moments is enforced by Lagrangian constraints. The resulting scheme, implemented in free space, is very versatile and adjusts in a very simple manner to several cases that involve energy dissipation due to local micro-reversibility (inelastic interactions) or elastic models of slowing down process. Our simulations are benchmarked with available exact self-similar solutions, exact moment equations and analytical estimates for the homogeneous Boltzmann equation, both for elastic and inelastic VHS interactions. Benchmarking of the simulations involves the selection of a time self-similar rescaling of the numerical distribution function which is performed using the continuous spectrum of the equation for Maxwell molecules as studied first in Bobylev et al. [A.V. Bobylev, C. Cercignani, G. Toscani, Proof of an asymptotic property of self-similar solutions of the Boltzmann equation for granular materials, Journal of Statistical Physics 111 (2003) 403-417] and generalized to a wide range of related models in Bobylev et al. [A.V. Bobylev, C. Cercignani, I.M. Gamba, On the self-similar asymptotics for generalized non-linear kinetic Maxwell models, Communication in Mathematical Physics, in press. URL: ( )]. The method also produces accurate results in the case of inelastic diffusive Boltzmann equations for hard spheres (inelastic collisions under thermal bath), where overpopulated non-Gaussian exponential tails have been conjectured in computations by stochastic methods [T.V. Noije, M. Ernst, Velocity distributions in homogeneously

  11. Distributing Correlation Coefficients of Linear Structure-Activity/Property Models

    Directory of Open Access Journals (Sweden)

    Sorana D. BOLBOACA

    2011-12-01

    Full Text Available Quantitative structure-activity/property relationships are mathematical relationships linking chemical structure and activity/property in a quantitative manner. These in silico approaches are frequently used to reduce animal testing and risk-assessment, as well as to increase time- and cost-effectiveness in characterization and identification of active compounds. The aim of our study was to investigate the pattern of correlation coefficients distribution associated to simple linear relationships linking the compounds structure with their activities. A set of the most common ordnance compounds found at naval facilities with a limited data set with a range of toxicities on aquatic ecosystem and a set of seven properties was studied. Statistically significant models were selected and investigated. The probability density function of the correlation coefficients was investigated using a series of possible continuous distribution laws. Almost 48% of the correlation coefficients proved fit Beta distribution, 40% fit Generalized Pareto distribution, and 12% fit Pert distribution.

  12. Generalized Linear Models with Applications in Engineering and the Sciences

    CERN Document Server

    Myers, Raymond H; Vining, G Geoffrey; Robinson, Timothy J

    2012-01-01

    Praise for the First Edition "The obvious enthusiasm of Myers, Montgomery, and Vining and their reliance on their many examples as a major focus of their pedagogy make Generalized Linear Models a joy to read. Every statistician working in any area of applied science should buy it and experience the excitement of these new approaches to familiar activities."-Technometrics Generalized Linear Models: With Applications in Engineering and the Sciences, Second Edition continues to provide a clear introduction to the theoretical foundations and key applications of generalized linear models (GLMs). Ma

  13. Comparison of the Predictive Performance and Interpretability of Random Forest and Linear Models on Benchmark Data Sets.

    Science.gov (United States)

    Marchese Robinson, Richard L; Palczewska, Anna; Palczewski, Jan; Kidley, Nathan

    2017-08-28

    The ability to interpret the predictions made by quantitative structure-activity relationships (QSARs) offers a number of advantages. While QSARs built using nonlinear modeling approaches, such as the popular Random Forest algorithm, might sometimes be more predictive than those built using linear modeling approaches, their predictions have been perceived as difficult to interpret. However, a growing number of approaches have been proposed for interpreting nonlinear QSAR models in general and Random Forest in particular. In the current work, we compare the performance of Random Forest to those of two widely used linear modeling approaches: linear Support Vector Machines (SVMs) (or Support Vector Regression (SVR)) and partial least-squares (PLS). We compare their performance in terms of their predictivity as well as the chemical interpretability of the predictions using novel scoring schemes for assessing heat map images of substructural contributions. We critically assess different approaches for interpreting Random Forest models as well as for obtaining predictions from the forest. We assess the models on a large number of widely employed public-domain benchmark data sets corresponding to regression and binary classification problems of relevance to hit identification and toxicology. We conclude that Random Forest typically yields comparable or possibly better predictive performance than the linear modeling approaches and that its predictions may also be interpreted in a chemically and biologically meaningful way. In contrast to earlier work looking at interpretation of nonlinear QSAR models, we directly compare two methodologically distinct approaches for interpreting Random Forest models. The approaches for interpreting Random Forest assessed in our article were implemented using open-source programs that we have made available to the community. These programs are the rfFC package ( https://r-forge.r-project.org/R/?group_id=1725 ) for the R statistical

  14. Jackknife Variance Estimator for Two Sample Linear Rank Statistics

    Science.gov (United States)

    1988-11-01

    Accesion For - - ,NTIS GPA&I "TIC TAB Unann c, nc .. [d Keywords: strong consistency; linear rank test’ influence function . i , at L By S- )Distribut...reverse if necessary and identify by block number) FIELD IGROUP SUB-GROUP Strong consistency; linear rank test; influence function . 19. ABSTRACT

  15. A penalized framework for distributed lag non-linear models.

    Science.gov (United States)

    Gasparrini, Antonio; Scheipl, Fabian; Armstrong, Ben; Kenward, Michael G

    2017-09-01

    Distributed lag non-linear models (DLNMs) are a modelling tool for describing potentially non-linear and delayed dependencies. Here, we illustrate an extension of the DLNM framework through the use of penalized splines within generalized additive models (GAM). This extension offers built-in model selection procedures and the possibility of accommodating assumptions on the shape of the lag structure through specific penalties. In addition, this framework includes, as special cases, simpler models previously proposed for linear relationships (DLMs). Alternative versions of penalized DLNMs are compared with each other and with the standard unpenalized version in a simulation study. Results show that this penalized extension to the DLNM class provides greater flexibility and improved inferential properties. The framework exploits recent theoretical developments of GAMs and is implemented using efficient routines within freely available software. Real-data applications are illustrated through two reproducible examples in time series and survival analysis. © 2017 The Authors Biometrics published by Wiley Periodicals, Inc. on behalf of International Biometric Society.

  16. Prediction of Mind-Wandering with Electroencephalogram and Non-linear Regression Modeling.

    Science.gov (United States)

    Kawashima, Issaku; Kumano, Hiroaki

    2017-01-01

    Mind-wandering (MW), task-unrelated thought, has been examined by researchers in an increasing number of articles using models to predict whether subjects are in MW, using numerous physiological variables. However, these models are not applicable in general situations. Moreover, they output only binary classification. The current study suggests that the combination of electroencephalogram (EEG) variables and non-linear regression modeling can be a good indicator of MW intensity. We recorded EEGs of 50 subjects during the performance of a Sustained Attention to Response Task, including a thought sampling probe that inquired the focus of attention. We calculated the power and coherence value and prepared 35 patterns of variable combinations and applied Support Vector machine Regression (SVR) to them. Finally, we chose four SVR models: two of them non-linear models and the others linear models; two of the four models are composed of a limited number of electrodes to satisfy model usefulness. Examination using the held-out data indicated that all models had robust predictive precision and provided significantly better estimations than a linear regression model using single electrode EEG variables. Furthermore, in limited electrode condition, non-linear SVR model showed significantly better precision than linear SVR model. The method proposed in this study helps investigations into MW in various little-examined situations. Further, by measuring MW with a high temporal resolution EEG, unclear aspects of MW, such as time series variation, are expected to be revealed. Furthermore, our suggestion that a few electrodes can also predict MW contributes to the development of neuro-feedback studies.

  17. Prediction of Mind-Wandering with Electroencephalogram and Non-linear Regression Modeling

    Directory of Open Access Journals (Sweden)

    Issaku Kawashima

    2017-07-01

    Full Text Available Mind-wandering (MW, task-unrelated thought, has been examined by researchers in an increasing number of articles using models to predict whether subjects are in MW, using numerous physiological variables. However, these models are not applicable in general situations. Moreover, they output only binary classification. The current study suggests that the combination of electroencephalogram (EEG variables and non-linear regression modeling can be a good indicator of MW intensity. We recorded EEGs of 50 subjects during the performance of a Sustained Attention to Response Task, including a thought sampling probe that inquired the focus of attention. We calculated the power and coherence value and prepared 35 patterns of variable combinations and applied Support Vector machine Regression (SVR to them. Finally, we chose four SVR models: two of them non-linear models and the others linear models; two of the four models are composed of a limited number of electrodes to satisfy model usefulness. Examination using the held-out data indicated that all models had robust predictive precision and provided significantly better estimations than a linear regression model using single electrode EEG variables. Furthermore, in limited electrode condition, non-linear SVR model showed significantly better precision than linear SVR model. The method proposed in this study helps investigations into MW in various little-examined situations. Further, by measuring MW with a high temporal resolution EEG, unclear aspects of MW, such as time series variation, are expected to be revealed. Furthermore, our suggestion that a few electrodes can also predict MW contributes to the development of neuro-feedback studies.

  18. Game Theory and its Relationship with Linear Programming Models ...

    African Journals Online (AJOL)

    Game Theory and its Relationship with Linear Programming Models. ... This paper shows that game theory and linear programming problem are closely related subjects since any computing method devised for ... AJOL African Journals Online.

  19. Linear and evolutionary polynomial regression models to forecast coastal dynamics: Comparison and reliability assessment

    Science.gov (United States)

    Bruno, Delia Evelina; Barca, Emanuele; Goncalves, Rodrigo Mikosz; de Araujo Queiroz, Heithor Alexandre; Berardi, Luigi; Passarella, Giuseppe

    2018-01-01

    In this paper, the Evolutionary Polynomial Regression data modelling strategy has been applied to study small scale, short-term coastal morphodynamics, given its capability for treating a wide database of known information, non-linearly. Simple linear and multilinear regression models were also applied to achieve a balance between the computational load and reliability of estimations of the three models. In fact, even though it is easy to imagine that the more complex the model, the more the prediction improves, sometimes a "slight" worsening of estimations can be accepted in exchange for the time saved in data organization and computational load. The models' outcomes were validated through a detailed statistical, error analysis, which revealed a slightly better estimation of the polynomial model with respect to the multilinear model, as expected. On the other hand, even though the data organization was identical for the two models, the multilinear one required a simpler simulation setting and a faster run time. Finally, the most reliable evolutionary polynomial regression model was used in order to make some conjecture about the uncertainty increase with the extension of extrapolation time of the estimation. The overlapping rate between the confidence band of the mean of the known coast position and the prediction band of the estimated position can be a good index of the weakness in producing reliable estimations when the extrapolation time increases too much. The proposed models and tests have been applied to a coastal sector located nearby Torre Colimena in the Apulia region, south Italy.

  20. Double generalized linear compound poisson models to insurance claims data

    DEFF Research Database (Denmark)

    Andersen, Daniel Arnfeldt; Bonat, Wagner Hugo

    2017-01-01

    This paper describes the specification, estimation and comparison of double generalized linear compound Poisson models based on the likelihood paradigm. The models are motivated by insurance applications, where the distribution of the response variable is composed by a degenerate distribution...... implementation and illustrate the application of double generalized linear compound Poisson models using a data set about car insurances....

  1. A Stochastic Fractional Dynamics Model of Rainfall Statistics

    Science.gov (United States)

    Kundu, Prasun; Travis, James

    2013-04-01

    Rainfall varies in space and time in a highly irregular manner and is described naturally in terms of a stochastic process. A characteristic feature of rainfall statistics is that they depend strongly on the space-time scales over which rain data are averaged. A spectral model of precipitation has been developed based on a stochastic differential equation of fractional order for the point rain rate, that allows a concise description of the second moment statistics of rain at any prescribed space-time averaging scale. The model is designed to faithfully reflect the scale dependence and is thus capable of providing a unified description of the statistics of both radar and rain gauge data. The underlying dynamical equation can be expressed in terms of space-time derivatives of fractional orders that are adjusted together with other model parameters to fit the data. The form of the resulting spectrum gives the model adequate flexibility to capture the subtle interplay between the spatial and temporal scales of variability of rain but strongly constrains the predicted statistical behavior as a function of the averaging length and times scales. The main restriction is the assumption that the statistics of the precipitation field is spatially homogeneous and isotropic and stationary in time. We test the model with radar and gauge data collected contemporaneously at the NASA TRMM ground validation sites located near Melbourne, Florida and in Kwajalein Atoll, Marshall Islands in the tropical Pacific. We estimate the parameters by tuning them to the second moment statistics of the radar data. The model predictions are then found to fit the second moment statistics of the gauge data reasonably well without any further adjustment. Some data sets containing periods of non-stationary behavior that involves occasional anomalously correlated rain events, present a challenge for the model.

  2. Reliability modelling and simulation of switched linear system ...

    African Journals Online (AJOL)

    Reliability modelling and simulation of switched linear system control using temporal databases. ... design of fault-tolerant real-time switching systems control and modelling embedded micro-schedulers for complex systems maintenance.

  3. HERITABILITY AND BREEDING VALUE OF SHEEP FERTILITY ESTIMATED BY MEANS OF THE GIBBS SAMPLING METHOD USING THE LINEAR AND THRESHOLD MODELS

    Directory of Open Access Journals (Sweden)

    DARIUSZ Piwczynski

    2013-03-01

    Full Text Available The research was carried out on 4,030 Polish Merino ewes born in the years 1991- 2001, kept in 15 flocks from the Pomorze and Kujawy region. Fertility of ewes in subsequent reproduction seasons was analysed with the use of multiple logistic regression. The research showed that there is a statistical influence of the flock, year of birth, age of dam, flock year interaction of birth on the ewes fertility. In order to estimate the genetic parameters, the Gibbs sampling method was applied, using the univariate animal models, both linear as well as threshold. Estimates of fertility depending on the model equalled 0.067 to 0.104, whereas the estimates of repeatability equalled respectively: 0.076 and 0.139. The obtained genetic parameters were then used to estimate the breeding values of the animals in terms of controlled trait (Best Linear Unbiased Prediction method using linear and threshold models. The obtained animal breeding values rankings in respect of the same trait with the use of linear and threshold models were strongly correlated with each other (rs = 0.972. Negative genetic trends of fertility (0.01-0.08% per year were found.

  4. Statistical Models and Methods for Lifetime Data

    CERN Document Server

    Lawless, Jerald F

    2011-01-01

    Praise for the First Edition"An indispensable addition to any serious collection on lifetime data analysis and . . . a valuable contribution to the statistical literature. Highly recommended . . ."-Choice"This is an important book, which will appeal to statisticians working on survival analysis problems."-Biometrics"A thorough, unified treatment of statistical models and methods used in the analysis of lifetime data . . . this is a highly competent and agreeable statistical textbook."-Statistics in MedicineThe statistical analysis of lifetime or response time data is a key tool in engineering,

  5. Non-linear modelling of monthly mean vorticity time changes: an application to the western Mediterranean

    Directory of Open Access Journals (Sweden)

    M. Finizio

    Full Text Available Starting from a number of observables in the form of time-series of meteorological elements in various areas of the northern hemisphere, a model capable of fitting past records and predicting monthly vorticity time changes in the western Mediterranean is implemented. A new powerful statistical methodology is introduced (MARS in order to capture the non-linear dynamics of time-series representing the available 40-year history of the hemispheric circulation. The developed model is tested on a suitable independent data set. An ensemble forecast exercise is also carried out to check model stability in reference to the uncertainty of input quantities.

    Key words. Meteorology and atmospheric dynamics · General circulation ocean-atmosphere interactions · Synoptic-scale meteorology

  6. A fuzzy Bi-linear management model in reverse logistic chains

    Directory of Open Access Journals (Sweden)

    Tadić Danijela

    2016-01-01

    Full Text Available The management of the electrical and electronic waste (WEEE problem in the uncertain environment has a critical effect on the economy and environmental protection of each region. The considered problem can be stated as a fuzzy non-convex optimization problem with linear objective function and a set of linear and non-linear constraints. The original problem is reformulated by using linear relaxation into a fuzzy linear programming problem. The fuzzy rating of collecting point capacities and fix costs of recycling centers are modeled by triangular fuzzy numbers. The optimal solution of the reformulation model is found by using optimality concept. The proposed model is verified through an illustrative example with real-life data. The obtained results represent an input for future research which should include a good benchmark base for tested reverse logistic chains and their continuous improvement. [Projekat Ministarstva nauke Republike Srbije, br. 035033: Sustainable development technology and equipment for the recycling of motor vehicles

  7. Topology for Statistical Modeling of Petascale Data

    Energy Technology Data Exchange (ETDEWEB)

    Pascucci, Valerio [Univ. of Utah, Salt Lake City, UT (United States); Levine, Joshua [Univ. of Utah, Salt Lake City, UT (United States); Gyulassy, Attila [Univ. of Utah, Salt Lake City, UT (United States); Bremer, P. -T. [Univ. of Utah, Salt Lake City, UT (United States)

    2013-10-31

    Many commonly used algorithms for mathematical analysis do not scale well enough to accommodate the size or complexity of petascale data produced by computational simulations. The primary goal of this project is to develop new mathematical tools that address both the petascale size and uncertain nature of current data. At a high level, the approach of the entire team involving all three institutions is based on the complementary techniques of combinatorial topology and statistical modelling. In particular, we use combinatorial topology to filter out spurious data that would otherwise skew statistical modelling techniques, and we employ advanced algorithms from algebraic statistics to efficiently find globally optimal fits to statistical models. The overall technical contributions can be divided loosely into three categories: (1) advances in the field of combinatorial topology, (2) advances in statistical modelling, and (3) new integrated topological and statistical methods. Roughly speaking, the division of labor between our 3 groups (Sandia Labs in Livermore, Texas A&M in College Station, and U Utah in Salt Lake City) is as follows: the Sandia group focuses on statistical methods and their formulation in algebraic terms, and finds the application problems (and data sets) most relevant to this project, the Texas A&M Group develops new algebraic geometry algorithms, in particular with fewnomial theory, and the Utah group develops new algorithms in computational topology via Discrete Morse Theory. However, we hasten to point out that our three groups stay in tight contact via videconference every 2 weeks, so there is much synergy of ideas between the groups. The following of this document is focused on the contributions that had grater direct involvement from the team at the University of Utah in Salt Lake City.

  8. Statistical models and methods for reliability and survival analysis

    CERN Document Server

    Couallier, Vincent; Huber-Carol, Catherine; Mesbah, Mounir; Huber -Carol, Catherine; Limnios, Nikolaos; Gerville-Reache, Leo

    2013-01-01

    Statistical Models and Methods for Reliability and Survival Analysis brings together contributions by specialists in statistical theory as they discuss their applications providing up-to-date developments in methods used in survival analysis, statistical goodness of fit, stochastic processes for system reliability, amongst others. Many of these are related to the work of Professor M. Nikulin in statistics over the past 30 years. The authors gather together various contributions with a broad array of techniques and results, divided into three parts - Statistical Models and Methods, Statistical

  9. Linear regression crash prediction models : issues and proposed solutions.

    Science.gov (United States)

    2010-05-01

    The paper develops a linear regression model approach that can be applied to : crash data to predict vehicle crashes. The proposed approach involves novice data aggregation : to satisfy linear regression assumptions; namely error structure normality ...

  10. Comparing a single case to a control group - Applying linear mixed effects models to repeated measures data.

    Science.gov (United States)

    Huber, Stefan; Klein, Elise; Moeller, Korbinian; Willmes, Klaus

    2015-10-01

    In neuropsychological research, single-cases are often compared with a small control sample. Crawford and colleagues developed inferential methods (i.e., the modified t-test) for such a research design. In the present article, we suggest an extension of the methods of Crawford and colleagues employing linear mixed models (LMM). We first show that a t-test for the significance of a dummy coded predictor variable in a linear regression is equivalent to the modified t-test of Crawford and colleagues. As an extension to this idea, we then generalized the modified t-test to repeated measures data by using LMMs to compare the performance difference in two conditions observed in a single participant to that of a small control group. The performance of LMMs regarding Type I error rates and statistical power were tested based on Monte-Carlo simulations. We found that starting with about 15-20 participants in the control sample Type I error rates were close to the nominal Type I error rate using the Satterthwaite approximation for the degrees of freedom. Moreover, statistical power was acceptable. Therefore, we conclude that LMMs can be applied successfully to statistically evaluate performance differences between a single-case and a control sample. Copyright © 2015 Elsevier Ltd. All rights reserved.

  11. Matrix model and time-like linear dila ton matter

    International Nuclear Information System (INIS)

    Takayanagi, Tadashi

    2004-01-01

    We consider a matrix model description of the 2d string theory whose matter part is given by a time-like linear dilaton CFT. This is equivalent to the c=1 matrix model with a deformed, but very simple Fermi surface. Indeed, after a Lorentz transformation, the corresponding 2d spacetime is a conventional linear dila ton background with a time-dependent tachyon field. We show that the tree level scattering amplitudes in the matrix model perfectly agree with those computed in the world-sheet theory. The classical trajectories of fermions correspond to the decaying D-boranes in the time-like linear dilaton CFT. We also discuss the ground ring structure. Furthermore, we study the properties of the time-like Liouville theory by applying this matrix model description. We find that its ground ring structure is very similar to that of the minimal string. (author)

  12. Non-linear mixed effects modeling - from methodology and software development to driving implementation in drug development science.

    Science.gov (United States)

    Pillai, Goonaseelan Colin; Mentré, France; Steimer, Jean-Louis

    2005-04-01

    Few scientific contributions have made significant impact unless there was a champion who had the vision to see the potential for its use in seemingly disparate areas-and who then drove active implementation. In this paper, we present a historical summary of the development of non-linear mixed effects (NLME) modeling up to the more recent extensions of this statistical methodology. The paper places strong emphasis on the pivotal role played by Lewis B. Sheiner (1940-2004), who used this statistical methodology to elucidate solutions to real problems identified in clinical practice and in medical research and on how he drove implementation of the proposed solutions. A succinct overview of the evolution of the NLME modeling methodology is presented as well as ideas on how its expansion helped to provide guidance for a more scientific view of (model-based) drug development that reduces empiricism in favor of critical quantitative thinking and decision making.

  13. A Statistical Approach For Modeling Tropical Cyclones. Synthetic Hurricanes Generator Model

    Energy Technology Data Exchange (ETDEWEB)

    Pasqualini, Donatella [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2016-05-11

    This manuscript brie y describes a statistical ap- proach to generate synthetic tropical cyclone tracks to be used in risk evaluations. The Synthetic Hur- ricane Generator (SynHurG) model allows model- ing hurricane risk in the United States supporting decision makers and implementations of adaptation strategies to extreme weather. In the literature there are mainly two approaches to model hurricane hazard for risk prediction: deterministic-statistical approaches, where the storm key physical parameters are calculated using physi- cal complex climate models and the tracks are usually determined statistically from historical data; and sta- tistical approaches, where both variables and tracks are estimated stochastically using historical records. SynHurG falls in the second category adopting a pure stochastic approach.

  14. Fault diagnosis and comparing risk for the steel coil manufacturing process using statistical models for binary data

    International Nuclear Information System (INIS)

    Debón, A.; Carlos Garcia-Díaz, J.

    2012-01-01

    Advanced statistical models can help industry to design more economical and rational investment plans. Fault detection and diagnosis is an important problem in continuous hot dip galvanizing. Increasingly stringent quality requirements in the automotive industry also require ongoing efforts in process control to make processes more robust. Robust methods for estimating the quality of galvanized steel coils are an important tool for the comprehensive monitoring of the performance of the manufacturing process. This study applies different statistical regression models: generalized linear models, generalized additive models and classification trees to estimate the quality of galvanized steel coils on the basis of short time histories. The data, consisting of 48 galvanized steel coils, was divided into sets of conforming and nonconforming coils. Five variables were selected for monitoring the process: steel strip velocity and four bath temperatures. The present paper reports a comparative evaluation of statistical models for binary data using Receiver Operating Characteristic (ROC) curves. A ROC curve is a graph or a technique for visualizing, organizing and selecting classifiers based on their performance. The purpose of this paper is to examine their use in research to obtain the best model to predict defective steel coil probability. In relation to the work of other authors who only propose goodness of fit statistics, we should highlight one distinctive feature of the methodology presented here, which is the possibility of comparing the different models with ROC graphs which are based on model classification performance. Finally, the results are validated by bootstrap procedures.

  15. Linear Power-Flow Models in Multiphase Distribution Networks: Preprint

    Energy Technology Data Exchange (ETDEWEB)

    Bernstein, Andrey; Dall' Anese, Emiliano

    2017-05-26

    This paper considers multiphase unbalanced distribution systems and develops approximate power-flow models where bus-voltages, line-currents, and powers at the point of common coupling are linearly related to the nodal net power injections. The linearization approach is grounded on a fixed-point interpretation of the AC power-flow equations, and it is applicable to distribution systems featuring (i) wye connections; (ii) ungrounded delta connections; (iii) a combination of wye-connected and delta-connected sources/loads; and, (iv) a combination of line-to-line and line-to-grounded-neutral devices at the secondary of distribution transformers. The proposed linear models can facilitate the development of computationally-affordable optimization and control applications -- from advanced distribution management systems settings to online and distributed optimization routines. Performance of the proposed models is evaluated on different test feeders.

  16. Model-generated air quality statistics for application in vegetation response models in Alberta

    International Nuclear Information System (INIS)

    McVehil, G.E.; Nosal, M.

    1990-01-01

    To test and apply vegetation response models in Alberta, air pollution statistics representative of various parts of the Province are required. At this time, air quality monitoring data of the requisite accuracy and time resolution are not available for most parts of Alberta. Therefore, there exists a need to develop appropriate air quality statistics. The objectives of the work reported here were to determine the applicability of model generated air quality statistics and to develop by modelling, realistic and representative time series of hourly SO 2 concentrations that could be used to generate the statistics demanded by vegetation response models

  17. Performance modeling, loss networks, and statistical multiplexing

    CERN Document Server

    Mazumdar, Ravi

    2009-01-01

    This monograph presents a concise mathematical approach for modeling and analyzing the performance of communication networks with the aim of understanding the phenomenon of statistical multiplexing. The novelty of the monograph is the fresh approach and insights provided by a sample-path methodology for queueing models that highlights the important ideas of Palm distributions associated with traffic models and their role in performance measures. Also presented are recent ideas of large buffer, and many sources asymptotics that play an important role in understanding statistical multiplexing. I

  18. Hyperparameterization of soil moisture statistical models for North America with Ensemble Learning Models (Elm)

    Science.gov (United States)

    Steinberg, P. D.; Brener, G.; Duffy, D.; Nearing, G. S.; Pelissier, C.

    2017-12-01

    Hyperparameterization, of statistical models, i.e. automated model scoring and selection, such as evolutionary algorithms, grid searches, and randomized searches, can improve forecast model skill by reducing errors associated with model parameterization, model structure, and statistical properties of training data. Ensemble Learning Models (Elm), and the related Earthio package, provide a flexible interface for automating the selection of parameters and model structure for machine learning models common in climate science and land cover classification, offering convenient tools for loading NetCDF, HDF, Grib, or GeoTiff files, decomposition methods like PCA and manifold learning, and parallel training and prediction with unsupervised and supervised classification, clustering, and regression estimators. Continuum Analytics is using Elm to experiment with statistical soil moisture forecasting based on meteorological forcing data from NASA's North American Land Data Assimilation System (NLDAS). There Elm is using the NSGA-2 multiobjective optimization algorithm for optimizing statistical preprocessing of forcing data to improve goodness-of-fit for statistical models (i.e. feature engineering). This presentation will discuss Elm and its components, including dask (distributed task scheduling), xarray (data structures for n-dimensional arrays), and scikit-learn (statistical preprocessing, clustering, classification, regression), and it will show how NSGA-2 is being used for automate selection of soil moisture forecast statistical models for North America.

  19. Optimal designs for linear mixture models

    NARCIS (Netherlands)

    Mendieta, E.J.; Linssen, H.N.; Doornbos, R.

    1975-01-01

    In a recent paper Snee and Marquardt (1974) considered designs for linear mixture models, where the components are subject to individual lower and/or upper bounds. When the number of components is large their algorithm XVERT yields designs far too extensive for practical purposes. The purpose of

  20. Functional linear models for association analysis of quantitative traits.

    Science.gov (United States)

    Fan, Ruzong; Wang, Yifan; Mills, James L; Wilson, Alexander F; Bailey-Wilson, Joan E; Xiong, Momiao

    2013-11-01

    Functional linear models are developed in this paper for testing associations between quantitative traits and genetic variants, which can be rare variants or common variants or the combination of the two. By treating multiple genetic variants of an individual in a human population as a realization of a stochastic process, the genome of an individual in a chromosome region is a continuum of sequence data rather than discrete observations. The genome of an individual is viewed as a stochastic function that contains both linkage and linkage disequilibrium (LD) information of the genetic markers. By using techniques of functional data analysis, both fixed and mixed effect functional linear models are built to test the association between quantitative traits and genetic variants adjusting for covariates. After extensive simulation analysis, it is shown that the F-distributed tests of the proposed fixed effect functional linear models have higher power than that of sequence kernel association test (SKAT) and its optimal unified test (SKAT-O) for three scenarios in most cases: (1) the causal variants are all rare, (2) the causal variants are both rare and common, and (3) the causal variants are common. The superior performance of the fixed effect functional linear models is most likely due to its optimal utilization of both genetic linkage and LD information of multiple genetic variants in a genome and similarity among different individuals, while SKAT and SKAT-O only model the similarities and pairwise LD but do not model linkage and higher order LD information sufficiently. In addition, the proposed fixed effect models generate accurate type I error rates in simulation studies. We also show that the functional kernel score tests of the proposed mixed effect functional linear models are preferable in candidate gene analysis and small sample problems. The methods are applied to analyze three biochemical traits in data from the Trinity Students Study. © 2013 WILEY

  1. The non-equilibrium statistical mechanics of a simple geophysical fluid dynamics model

    Science.gov (United States)

    Verkley, Wim; Severijns, Camiel

    2014-05-01

    Lorenz [1] has devised a dynamical system that has proved to be very useful as a benchmark system in geophysical fluid dynamics. The system in its simplest form consists of a periodic array of variables that can be associated with an atmospheric field on a latitude circle. The system is driven by a constant forcing, is damped by linear friction and has a simple advection term that causes the model to behave chaotically if the forcing is large enough. Our aim is to predict the statistics of Lorenz' model on the basis of a given average value of its total energy - obtained from a numerical integration - and the assumption of statistical stationarity. Our method is the principle of maximum entropy [2] which in this case reads: the information entropy of the system's probability density function shall be maximal under the constraints of normalization, a given value of the average total energy and statistical stationarity. Statistical stationarity is incorporated approximately by using `stationarity constraints', i.e., by requiring that the average first and possibly higher-order time-derivatives of the energy are zero in the maximization of entropy. The analysis [3] reveals that, if the first stationarity constraint is used, the resulting probability density function rather accurately reproduces the statistics of the individual variables. If the second stationarity constraint is used as well, the correlations between the variables are also reproduced quite adequately. The method can be generalized straightforwardly and holds the promise of a viable non-equilibrium statistical mechanics of the forced-dissipative systems of geophysical fluid dynamics. [1] E.N. Lorenz, 1996: Predictability - A problem partly solved, in Proc. Seminar on Predictability (ECMWF, Reading, Berkshire, UK), Vol. 1, pp. 1-18. [2] E.T. Jaynes, 2003: Probability Theory - The Logic of Science (Cambridge University Press, Cambridge). [3] W.T.M. Verkley and C.A. Severijns, 2014: The maximum entropy

  2. Statistical Models of Adaptive Immune populations

    Science.gov (United States)

    Sethna, Zachary; Callan, Curtis; Walczak, Aleksandra; Mora, Thierry

    The availability of large (104-106 sequences) datasets of B or T cell populations from a single individual allows reliable fitting of complex statistical models for naïve generation, somatic selection, and hypermutation. It is crucial to utilize a probabilistic/informational approach when modeling these populations. The inferred probability distributions allow for population characterization, calculation of probability distributions of various hidden variables (e.g. number of insertions), as well as statistical properties of the distribution itself (e.g. entropy). In particular, the differences between the T cell populations of embryonic and mature mice will be examined as a case study. Comparing these populations, as well as proposed mixed populations, provides a concrete exercise in model creation, comparison, choice, and validation.

  3. Linear mixed-effects models for within-participant psychology experiments: an introductory tutorial and free, graphical user interface (LMMgui).

    Science.gov (United States)

    Magezi, David A

    2015-01-01

    Linear mixed-effects models (LMMs) are increasingly being used for data analysis in cognitive neuroscience and experimental psychology, where within-participant designs are common. The current article provides an introductory review of the use of LMMs for within-participant data analysis and describes a free, simple, graphical user interface (LMMgui). LMMgui uses the package lme4 (Bates et al., 2014a,b) in the statistical environment R (R Core Team).

  4. Tropical geometry of statistical models.

    Science.gov (United States)

    Pachter, Lior; Sturmfels, Bernd

    2004-11-16

    This article presents a unified mathematical framework for inference in graphical models, building on the observation that graphical models are algebraic varieties. From this geometric viewpoint, observations generated from a model are coordinates of a point in the variety, and the sum-product algorithm is an efficient tool for evaluating specific coordinates. Here, we address the question of how the solutions to various inference problems depend on the model parameters. The proposed answer is expressed in terms of tropical algebraic geometry. The Newton polytope of a statistical model plays a key role. Our results are applied to the hidden Markov model and the general Markov model on a binary tree.

  5. Renormalization a la BRS of the non-linear σ-model

    International Nuclear Information System (INIS)

    Blasi, A.; Collina, R.

    1987-01-01

    We characterize the non-linear O(N+1) σ-model in an arbitrary parametrization with a nihilpotent BRS operator obtained from the symmetry transformation by the use of anticommuting parameters. The identity can be made compatible with the presence of a mass term in the model, so we can analyze its stability and prove that the model is anomaly free. This procedure avoids many problems encountered in the conventional analysis; in particular the introduction of an infinite number of sources coupled to the successive variations of the field is not necessary and the linear O(N) symmetry is respected as a consequence of the identity. The approach may provide useful in discussing the renormalizability of a wider class of models with non-linear symmetries. (orig.)

  6. Robust Comparison of the Linear Model Structures in Self-tuning Adaptive Control

    DEFF Research Database (Denmark)

    Zhou, Jianjun; Conrad, Finn

    1989-01-01

    The Generalized Predictive Controller (GPC) is extended to the systems with a generalized linear model structure which contains a number of choices of linear model structures. The Recursive Prediction Error Method (RPEM) is used to estimate the unknown parameters of the linear model structures...... to constitute a GPC self-tuner. Different linear model structures commonly used are compared and evaluated by applying them to the extended GPC self-tuner as well as to the special cases of the GPC, the GMV and MV self-tuners. The simulation results show how the choice of model structure affects the input......-output behaviour of self-tuning controllers....

  7. Multicollinearity in hierarchical linear models.

    Science.gov (United States)

    Yu, Han; Jiang, Shanhe; Land, Kenneth C

    2015-09-01

    This study investigates an ill-posed problem (multicollinearity) in Hierarchical Linear Models from both the data and the model perspectives. We propose an intuitive, effective approach to diagnosing the presence of multicollinearity and its remedies in this class of models. A simulation study demonstrates the impacts of multicollinearity on coefficient estimates, associated standard errors, and variance components at various levels of multicollinearity for finite sample sizes typical in social science studies. We further investigate the role multicollinearity plays at each level for estimation of coefficient parameters in terms of shrinkage. Based on these analyses, we recommend a top-down method for assessing multicollinearity in HLMs that first examines the contextual predictors (Level-2 in a two-level model) and then the individual predictors (Level-1) and uses the results for data collection, research problem redefinition, model re-specification, variable selection and estimation of a final model. Copyright © 2015 Elsevier Inc. All rights reserved.

  8. 12th Workshop on Stochastic Models, Statistics and Their Applications

    CERN Document Server

    Rafajłowicz, Ewaryst; Szajowski, Krzysztof

    2015-01-01

    This volume presents the latest advances and trends in stochastic models and related statistical procedures. Selected peer-reviewed contributions focus on statistical inference, quality control, change-point analysis and detection, empirical processes, time series analysis, survival analysis and reliability, statistics for stochastic processes, big data in technology and the sciences, statistical genetics, experiment design, and stochastic models in engineering. Stochastic models and related statistical procedures play an important part in furthering our understanding of the challenging problems currently arising in areas of application such as the natural sciences, information technology, engineering, image analysis, genetics, energy and finance, to name but a few. This collection arises from the 12th Workshop on Stochastic Models, Statistics and Their Applications, Wroclaw, Poland.

  9. Optimal designs for linear mixture models

    NARCIS (Netherlands)

    Mendieta, E.J.; Linssen, H.N.; Doornbos, R.

    1975-01-01

    In a recent paper Snee and Marquardt [8] considered designs for linear mixture models, where the components are subject to individual lower and/or upper bounds. When the number of components is large their algorithm XVERT yields designs far too extensive for practical purposes. The purpose of this

  10. Statistics and data analysis for financial engineering with R examples

    CERN Document Server

    Ruppert, David

    2015-01-01

    The new edition of this influential textbook, geared towards graduate or advanced undergraduate students, teaches the statistics necessary for financial engineering. In doing so, it illustrates concepts using financial markets and economic data, R Labs with real-data exercises, and graphical and analytic methods for modeling and diagnosing modeling errors. Financial engineers now have access to enormous quantities of data. To make use of these data, the powerful methods in this book, particularly about volatility and risks, are essential. Strengths of this fully-revised edition include major additions to the R code and the advanced topics covered. Individual chapters cover, among other topics, multivariate distributions, copulas, Bayesian computations, risk management, multivariate volatility and cointegration. Suggested prerequisites are basic knowledge of statistics and probability, matrices and linear algebra, and calculus. There is an appendix on probability, statistics and linear algebra. Practicing fina...

  11. Low-energy limit of the extended Linear Sigma Model

    Energy Technology Data Exchange (ETDEWEB)

    Divotgey, Florian [Johann Wolfgang Goethe-Universitaet, Institut fuer Theoretische Physik, Frankfurt am Main (Germany); Kovacs, Peter [Wigner Research Center for Physics, Hungarian Academy of Sciences, Institute for Particle and Nuclear Physics, Budapest (Hungary); GSI Helmholtzzentrum fuer Schwerionenforschung, ExtreMe Matter Institute, Darmstadt (Germany); Giacosa, Francesco [Johann Wolfgang Goethe-Universitaet, Institut fuer Theoretische Physik, Frankfurt am Main (Germany); Jan-Kochanowski University, Institute of Physics, Kielce (Poland); Rischke, Dirk H. [Johann Wolfgang Goethe-Universitaet, Institut fuer Theoretische Physik, Frankfurt am Main (Germany); University of Science and Technology of China, Interdisciplinary Center for Theoretical Study and Department of Modern Physics, Hefei, Anhui (China)

    2018-01-15

    The extended Linear Sigma Model is an effective hadronic model based on the linear realization of chiral symmetry SU(N{sub f}){sub L} x SU(N{sub f}){sub R}, with (pseudo)scalar and (axial-)vector mesons as degrees of freedom. In this paper, we study the low-energy limit of the extended Linear Sigma Model (eLSM) for N{sub f} = flavors by integrating out all fields except for the pions, the (pseudo-)Nambu-Goldstone bosons of chiral symmetry breaking. The resulting low-energy effective action is identical to Chiral Perturbation Theory (ChPT) after choosing a representative for the coset space generated by chiral symmetry breaking and expanding it in powers of (derivatives of) the pion fields. The tree-level values of the coupling constants of the effective low-energy action agree remarkably well with those of ChPT. (orig.)

  12. A national-scale model of linear features improves predictions of farmland biodiversity.

    Science.gov (United States)

    Sullivan, Martin J P; Pearce-Higgins, James W; Newson, Stuart E; Scholefield, Paul; Brereton, Tom; Oliver, Tom H

    2017-12-01

    Modelling species distribution and abundance is important for many conservation applications, but it is typically performed using relatively coarse-scale environmental variables such as the area of broad land-cover types. Fine-scale environmental data capturing the most biologically relevant variables have the potential to improve these models. For example, field studies have demonstrated the importance of linear features, such as hedgerows, for multiple taxa, but the absence of large-scale datasets of their extent prevents their inclusion in large-scale modelling studies.We assessed whether a novel spatial dataset mapping linear and woody-linear features across the UK improves the performance of abundance models of 18 bird and 24 butterfly species across 3723 and 1547 UK monitoring sites, respectively.Although improvements in explanatory power were small, the inclusion of linear features data significantly improved model predictive performance for many species. For some species, the importance of linear features depended on landscape context, with greater importance in agricultural areas. Synthesis and applications . This study demonstrates that a national-scale model of the extent and distribution of linear features improves predictions of farmland biodiversity. The ability to model spatial variability in the role of linear features such as hedgerows will be important in targeting agri-environment schemes to maximally deliver biodiversity benefits. Although this study focuses on farmland, data on the extent of different linear features are likely to improve species distribution and abundance models in a wide range of systems and also can potentially be used to assess habitat connectivity.

  13. Modeling containment of large wildfires using generalized linear mixed-model analysis

    Science.gov (United States)

    Mark Finney; Isaac C. Grenfell; Charles W. McHugh

    2009-01-01

    Billions of dollars are spent annually in the United States to contain large wildland fires, but the factors contributing to suppression success remain poorly understood. We used a regression model (generalized linear mixed-model) to model containment probability of individual fires, assuming that containment was a repeated-measures problem (fixed effect) and...

  14. An Emulator Toolbox to Approximate Radiative Transfer Models with Statistical Learning

    Directory of Open Access Journals (Sweden)

    Juan Pablo Rivera

    2015-07-01

    Full Text Available Physically-based radiative transfer models (RTMs help in understanding the processes occurring on the Earth’s surface and their interactions with vegetation and atmosphere. When it comes to studying vegetation properties, RTMs allows us to study light interception by plant canopies and are used in the retrieval of biophysical variables through model inversion. However, advanced RTMs can take a long computational time, which makes them unfeasible in many real applications. To overcome this problem, it has been proposed to substitute RTMs through so-called emulators. Emulators are statistical models that approximate the functioning of RTMs. Emulators are advantageous in real practice because of the computational efficiency and excellent accuracy and flexibility for extrapolation. We hereby present an “Emulator toolbox” that enables analysing multi-output machine learning regression algorithms (MO-MLRAs on their ability to approximate an RTM. The toolbox is included in the free-access ARTMO’s MATLAB suite for parameter retrieval and model inversion and currently contains both linear and non-linear MO-MLRAs, namely partial least squares regression (PLSR, kernel ridge regression (KRR and neural networks (NN. These MO-MLRAs have been evaluated on their precision and speed to approximate the soil vegetation atmosphere transfer model SCOPE (Soil Canopy Observation, Photochemistry and Energy balance. SCOPE generates, amongst others, sun-induced chlorophyll fluorescence as the output signal. KRR and NN were evaluated as capable of reconstructing fluorescence spectra with great precision. Relative errors fell below 0.5% when trained with 500 or more samples using cross-validation and principal component analysis to alleviate the underdetermination problem. Moreover, NN reconstructed fluorescence spectra about 50-times faster and KRR about 800-times faster than SCOPE. The Emulator toolbox is foreseen to open new opportunities in the use of advanced

  15. A test for the parameters of multiple linear regression models ...

    African Journals Online (AJOL)

    A test for the parameters of multiple linear regression models is developed for conducting tests simultaneously on all the parameters of multiple linear regression models. The test is robust relative to the assumptions of homogeneity of variances and absence of serial correlation of the classical F-test. Under certain null and ...

  16. Nonlinear Modeling by Assembling Piecewise Linear Models

    Science.gov (United States)

    Yao, Weigang; Liou, Meng-Sing

    2013-01-01

    To preserve nonlinearity of a full order system over a parameters range of interest, we propose a simple modeling approach by assembling a set of piecewise local solutions, including the first-order Taylor series terms expanded about some sampling states. The work by Rewienski and White inspired our use of piecewise linear local solutions. The assembly of these local approximations is accomplished by assigning nonlinear weights, through radial basis functions in this study. The efficacy of the proposed procedure is validated for a two-dimensional airfoil moving at different Mach numbers and pitching motions, under which the flow exhibits prominent nonlinear behaviors. All results confirm that our nonlinear model is accurate and stable for predicting not only aerodynamic forces but also detailed flowfields. Moreover, the model is robustness-accurate for inputs considerably different from the base trajectory in form and magnitude. This modeling preserves nonlinearity of the problems considered in a rather simple and accurate manner.

  17. Comparison of height-diameter models based on geographically weighted regressions and linear mixed modelling applied to large scale forest inventory data

    Energy Technology Data Exchange (ETDEWEB)

    Quirós Segovia, M.; Condés Ruiz, S.; Drápela, K.

    2016-07-01

    Aim of the study: The main objective of this study was to test Geographically Weighted Regression (GWR) for developing height-diameter curves for forests on a large scale and to compare it with Linear Mixed Models (LMM). Area of study: Monospecific stands of Pinus halepensis Mill. located in the region of Murcia (Southeast Spain). Materials and Methods: The dataset consisted of 230 sample plots (2582 trees) from the Third Spanish National Forest Inventory (SNFI) randomly split into training data (152 plots) and validation data (78 plots). Two different methodologies were used for modelling local (Petterson) and generalized height-diameter relationships (Cañadas I): GWR, with different bandwidths, and linear mixed models. Finally, the quality of the estimated models was compared throughout statistical analysis. Main results: In general, both LMM and GWR provide better prediction capability when applied to a generalized height-diameter function than when applied to a local one, with R2 values increasing from around 0.6 to 0.7 in the model validation. Bias and RMSE were also lower for the generalized function. However, error analysis showed that there were no large differences between these two methodologies, evidencing that GWR provides results which are as good as the more frequently used LMM methodology, at least when no additional measurements are available for calibrating. Research highlights: GWR is a type of spatial analysis for exploring spatially heterogeneous processes. GWR can model spatial variation in tree height-diameter relationship and its regression quality is comparable to LMM. The advantage of GWR over LMM is the possibility to determine the spatial location of every parameter without additional measurements. Abbreviations: GWR (Geographically Weighted Regression); LMM (Linear Mixed Model); SNFI (Spanish National Forest Inventory). (Author)

  18. Multiple commodities in statistical microeconomics: Model and market

    Science.gov (United States)

    Baaquie, Belal E.; Yu, Miao; Du, Xin

    2016-11-01

    A statistical generalization of microeconomics has been made in Baaquie (2013). In Baaquie et al. (2015), the market behavior of single commodities was analyzed and it was shown that market data provides strong support for the statistical microeconomic description of commodity prices. The case of multiple commodities is studied and a parsimonious generalization of the single commodity model is made for the multiple commodities case. Market data shows that the generalization can accurately model the simultaneous correlation functions of up to four commodities. To accurately model five or more commodities, further terms have to be included in the model. This study shows that the statistical microeconomics approach is a comprehensive and complete formulation of microeconomics, and which is independent to the mainstream formulation of microeconomics.

  19. Portfolio optimization by using linear programing models based on genetic algorithm

    Science.gov (United States)

    Sukono; Hidayat, Y.; Lesmana, E.; Putra, A. S.; Napitupulu, H.; Supian, S.

    2018-01-01

    In this paper, we discussed the investment portfolio optimization using linear programming model based on genetic algorithms. It is assumed that the portfolio risk is measured by absolute standard deviation, and each investor has a risk tolerance on the investment portfolio. To complete the investment portfolio optimization problem, the issue is arranged into a linear programming model. Furthermore, determination of the optimum solution for linear programming is done by using a genetic algorithm. As a numerical illustration, we analyze some of the stocks traded on the capital market in Indonesia. Based on the analysis, it is shown that the portfolio optimization performed by genetic algorithm approach produces more optimal efficient portfolio, compared to the portfolio optimization performed by a linear programming algorithm approach. Therefore, genetic algorithms can be considered as an alternative on determining the investment portfolio optimization, particularly using linear programming models.

  20. Statistical models for optimizing mineral exploration

    International Nuclear Information System (INIS)

    Wignall, T.K.; DeGeoffroy, J.

    1987-01-01

    The primary purpose of mineral exploration is to discover ore deposits. The emphasis of this volume is on the mathematical and computational aspects of optimizing mineral exploration. The seven chapters that make up the main body of the book are devoted to the description and application of various types of computerized geomathematical models. These chapters include: (1) the optimal selection of ore deposit types and regions of search, as well as prospecting selected areas, (2) designing airborne and ground field programs for the optimal coverage of prospecting areas, and (3) delineating and evaluating exploration targets within prospecting areas by means of statistical modeling. Many of these statistical programs are innovative and are designed to be useful for mineral exploration modeling. Examples of geomathematical models are applied to exploring for six main types of base and precious metal deposits, as well as other mineral resources (such as bauxite and uranium)

  1. Comparison of safflower oil extraction kinetics under two characteristic moisture conditions: statistical analysis of non-linear model parameters

    Directory of Open Access Journals (Sweden)

    E. Baümler

    2014-06-01

    Full Text Available In this study the kinetics of oil extraction from partially dehulled safflower seeds under two moisture conditions (7 and 9% dry basis was investigated. The extraction assays were performed using a stirred batch system, thermostated at 50 ºC, using n-hexane as solvent. The data obtained were fitted to a modified diffusion model in order to represent the extraction kinetics. The model took into account a washing and a diffusive step. Fitting parameters were compared statistically for both moisture conditions. The oil yield increased with the extraction time in both cases, although the oil was released at different rates. A comparison of the parameters showed that both the portion extracted in the washing phase and the effective diffusion coefficient were moisture-dependent. The effective diffusivities were 2.81 10-12 and 8.06 10-13 m²s-1 for moisture contents of 7% and 9%, respectively.

  2. Statistical modeling of copper losses in the silicate slag of the sulfide concentrate smelting process

    Directory of Open Access Journals (Sweden)

    Savic Marija V.

    2015-09-01

    Full Text Available This article presents the results of the statistical modeling of copper losses in the silicate slag of the sulfide concentrates smelting process. The aim of this study was to define the correlation dependence of the degree of copper losses in the silicate slag on the following parameters of technological processes: SiO2, FeO, Fe3O4, CaO and Al2O3 content in the slag and copper content in the matte. Multiple linear regression analysis (MLRA, artificial neural networks (ANNs and adaptive network based fuzzy inference system (ANFIS were used as tools for mathematical analysis of the indicated problem. The best correlation coefficient (R2 = 0.719 of the final model was obtained using the ANFIS modeling approach.

  3. Plane answers to complex questions the theory of linear models

    CERN Document Server

    Christensen, Ronald

    1987-01-01

    This book was written to rigorously illustrate the practical application of the projective approach to linear models. To some, this may seem contradictory. I contend that it is possible to be both rigorous and illustrative and that it is possible to use the projective approach in practical applications. Therefore, unlike many other books on linear models, the use of projections and sub­ spaces does not stop after the general theory. They are used wherever I could figure out how to do it. Solving normal equations and using calculus (outside of maximum likelihood theory) are anathema to me. This is because I do not believe that they contribute to the understanding of linear models. I have similar feelings about the use of side conditions. Such topics are mentioned when appropriate and thenceforward avoided like the plague. On the other side of the coin, I just as strenuously reject teaching linear models with a coordinate free approach. Although Joe Eaton assures me that the issues in complicated problems freq...

  4. A land use regression model for ambient ultrafine particles in Montreal, Canada: A comparison of linear regression and a machine learning approach.

    Science.gov (United States)

    Weichenthal, Scott; Ryswyk, Keith Van; Goldstein, Alon; Bagg, Scott; Shekkarizfard, Maryam; Hatzopoulou, Marianne

    2016-04-01

    Existing evidence suggests that ambient ultrafine particles (UFPs) (regression model for UFPs in Montreal, Canada using mobile monitoring data collected from 414 road segments during the summer and winter months between 2011 and 2012. Two different approaches were examined for model development including standard multivariable linear regression and a machine learning approach (kernel-based regularized least squares (KRLS)) that learns the functional form of covariate impacts on ambient UFP concentrations from the data. The final models included parameters for population density, ambient temperature and wind speed, land use parameters (park space and open space), length of local roads and rail, and estimated annual average NOx emissions from traffic. The final multivariable linear regression model explained 62% of the spatial variation in ambient UFP concentrations whereas the KRLS model explained 79% of the variance. The KRLS model performed slightly better than the linear regression model when evaluated using an external dataset (R(2)=0.58 vs. 0.55) or a cross-validation procedure (R(2)=0.67 vs. 0.60). In general, our findings suggest that the KRLS approach may offer modest improvements in predictive performance compared to standard multivariable linear regression models used to estimate spatial variations in ambient UFPs. However, differences in predictive performance were not statistically significant when evaluated using the cross-validation procedure. Crown Copyright © 2015. Published by Elsevier Inc. All rights reserved.

  5. Statistical physics of pairwise probability models

    Directory of Open Access Journals (Sweden)

    Yasser Roudi

    2009-11-01

    Full Text Available Statistical models for describing the probability distribution over the states of biological systems are commonly used for dimensional reduction. Among these models, pairwise models are very attractive in part because they can be fit using a reasonable amount of data: knowledge of the means and correlations between pairs of elements in the system is sufficient. Not surprisingly, then, using pairwise models for studying neural data has been the focus of many studies in recent years. In this paper, we describe how tools from statistical physics can be employed for studying and using pairwise models. We build on our previous work on the subject and study the relation between different methods for fitting these models and evaluating their quality. In particular, using data from simulated cortical networks we study how the quality of various approximate methods for inferring the parameters in a pairwise model depends on the time bin chosen for binning the data. We also study the effect of the size of the time bin on the model quality itself, again using simulated data. We show that using finer time bins increases the quality of the pairwise model. We offer new ways of deriving the expressions reported in our previous work for assessing the quality of pairwise models.

  6. Approximating chiral quark models with linear σ-models

    International Nuclear Information System (INIS)

    Broniowski, Wojciech; Golli, Bojan

    2003-01-01

    We study the approximation of chiral quark models with simpler models, obtained via gradient expansion. The resulting Lagrangian of the type of the linear σ-model contains, at the lowest level of the gradient-expanded meson action, an additional term of the form ((1)/(2))A(σ∂ μ σ+π∂ μ π) 2 . We investigate the dynamical consequences of this term and its relevance to the phenomenology of the soliton models of the nucleon. It is found that the inclusion of the new term allows for a more efficient approximation of the underlying quark theory, especially in those cases where dynamics allows for a large deviation of the chiral fields from the chiral circle, such as in quark models with non-local regulators. This is of practical importance, since the σ-models with valence quarks only are technically much easier to treat and simpler to solve than the quark models with the full-fledged Dirac sea

  7. Finiteness of Ricci flat supersymmetric non-linear sigma-models

    International Nuclear Information System (INIS)

    Alvarez-Gaume, L.; Ginsparg, P.

    1985-01-01

    Combining the constraints of Kaehler differential geometry with the universality of the normal coordinate expansion in the background field method, we study the ultraviolet behavior of 2-dimensional supersymmetric non-linear sigma-models with target space an arbitrary riemannian manifold M. We show that the constraint of N=2 supersymmetry requires that all counterterms to the metric beyond one-loop order are cohomologically trivial. It follows that such supersymmetric non-linear sigma-models defined on locally symmetric spaces are super-renormalizable and that N=4 models are on-shell ultraviolet finite to all orders of perturbation theory. (orig.)

  8. Comparing artificial neural networks, general linear models and support vector machines in building predictive models for small interfering RNAs.

    Directory of Open Access Journals (Sweden)

    Kyle A McQuisten

    2009-10-01

    Full Text Available Exogenous short interfering RNAs (siRNAs induce a gene knockdown effect in cells by interacting with naturally occurring RNA processing machinery. However not all siRNAs induce this effect equally. Several heterogeneous kinds of machine learning techniques and feature sets have been applied to modeling siRNAs and their abilities to induce knockdown. There is some growing agreement to which techniques produce maximally predictive models and yet there is little consensus for methods to compare among predictive models. Also, there are few comparative studies that address what the effect of choosing learning technique, feature set or cross validation approach has on finding and discriminating among predictive models.Three learning techniques were used to develop predictive models for effective siRNA sequences including Artificial Neural Networks (ANNs, General Linear Models (GLMs and Support Vector Machines (SVMs. Five feature mapping methods were also used to generate models of siRNA activities. The 2 factors of learning technique and feature mapping were evaluated by complete 3x5 factorial ANOVA. Overall, both learning techniques and feature mapping contributed significantly to the observed variance in predictive models, but to differing degrees for precision and accuracy as well as across different kinds and levels of model cross-validation.The methods presented here provide a robust statistical framework to compare among models developed under distinct learning techniques and feature sets for siRNAs. Further comparisons among current or future modeling approaches should apply these or other suitable statistically equivalent methods to critically evaluate the performance of proposed models. ANN and GLM techniques tend to be more sensitive to the inclusion of noisy features, but the SVM technique is more robust under large numbers of features for measures of model precision and accuracy. Features found to result in maximally predictive models are

  9. Practical likelihood analysis for spatial generalized linear mixed models

    DEFF Research Database (Denmark)

    Bonat, W. H.; Ribeiro, Paulo Justiniano

    2016-01-01

    We investigate an algorithm for maximum likelihood estimation of spatial generalized linear mixed models based on the Laplace approximation. We compare our algorithm with a set of alternative approaches for two datasets from the literature. The Rhizoctonia root rot and the Rongelap are......, respectively, examples of binomial and count datasets modeled by spatial generalized linear mixed models. Our results show that the Laplace approximation provides similar estimates to Markov Chain Monte Carlo likelihood, Monte Carlo expectation maximization, and modified Laplace approximation. Some advantages...... of Laplace approximation include the computation of the maximized log-likelihood value, which can be used for model selection and tests, and the possibility to obtain realistic confidence intervals for model parameters based on profile likelihoods. The Laplace approximation also avoids the tuning...

  10. Statistical Mechanics of Turbulent Flows

    International Nuclear Information System (INIS)

    Cambon, C

    2004-01-01

    results from both non-linearity and non-locality of Navier-Stokes equations, the latter coming from, e.g., the nonlocal relationship of velocity and pressure in the quasi-incompressible case. These two aspects are often intricately linked. It is well known that non-linearity alone is not responsible for the 'problem', as evidenced by 1D turbulence without pressure ('Burgulence' from the Burgers equation) and probably 3D (cosmological gas). A local description in terms of pdf for the velocity can resolve the 'non-linear' problem, which instead yields an infinite hierarchy of equations in terms of moments. On the other hand, non-locality yields a hierarchy of unclosed equations, with the single-point pdf equation for velocity derived from NS incompressible equations involving a two-point pdf, and so on. The general relationship was given by Lundgren (1967, Phys. Fluids 10 (5), 969-975), with the equation for pdf at n points involving the pdf at n+1 points. The nonlocal problem appears in various statistical models which are not discussed in the book. The simplest example is full RST or ASM models, in which the closure of pressure-strain correlations is pivotal (their counterpart ought to be identified and discussed in equations (5-21) and the following ones). The book does not address more sophisticated non-local approaches, such as two-point (or spectral) non-linear closure theories and models, 'rapid distortion theory' for linear regimes, not to mention scaling and intermittency based on two-point structure functions, etc. The book sometimes mixes theoretical modelling and pure empirical relationships, the empirical character coming from the lack of a nonlocal (two-point) approach.) In short, the book is orientated more towards applications than towards turbulence theory; it is written clearly and concisely and should be useful to a large community, interested either in the underlying stochastic formalism or in CFD applications. (book review)

  11. A Generalization of Pythagoras's Theorem and Application to Explanations of Variance Contributions in Linear Models. Research Report. ETS RR-14-18

    Science.gov (United States)

    Carlson, James E.

    2014-01-01

    Many aspects of the geometry of linear statistical models and least squares estimation are well known. Discussions of the geometry may be found in many sources. Some aspects of the geometry relating to the partitioning of variation that can be explained using a little-known theorem of Pappus and have not been discussed previously are the topic of…

  12. A linear model of ductile plastic damage

    International Nuclear Information System (INIS)

    Lemaitre, J.

    1983-01-01

    A three-dimensional model of isotropic ductile plastic damage based on a continuum damage variable on the effective stress concept and on thermodynamics is derived. As shown by experiments on several metals and alloys, the model, integrated in the case of proportional loading, is linear with respect to the accumulated plastic strain and shows a large influence of stress triaxiality [fr

  13. State analysis of BOP using statistical and heuristic methods

    International Nuclear Information System (INIS)

    Heo, Gyun Young; Chang, Soon Heung

    2003-01-01

    Under the deregulation environment, the performance enhancement of BOP in nuclear power plants is being highlighted. To analyze performance level of BOP, we use the performance test procedures provided from an authorized institution such as ASME. However, through plant investigation, it was proved that the requirements of the performance test procedures about the reliability and quantity of sensors was difficult to be satisfied. As a solution of this, state analysis method that are the expanded concept of signal validation, was proposed on the basis of the statistical and heuristic approaches. Authors recommended the statistical linear regression model by analyzing correlation among BOP parameters as a reference state analysis method. Its advantage is that its derivation is not heuristic, it is possible to calculate model uncertainty, and it is easy to apply to an actual plant. The error of the statistical linear regression model is below 3% under normal as well as abnormal system states. Additionally a neural network model was recommended since the statistical model is impossible to apply to the validation of all of the sensors and is sensitive to the outlier that is the signal located out of a statistical distribution. Because there are a lot of sensors need to be validated in BOP, wavelet analysis (WA) were applied as a pre-processor for the reduction of input dimension and for the enhancement of training accuracy. The outlier localization capability of WA enhanced the robustness of the neural network. The trained neural network restored the degraded signals to the values within ±3% of the true signals

  14. Sphaleron in a non-linear sigma model

    International Nuclear Information System (INIS)

    Sogo, Kiyoshi; Fujimoto, Yasushi.

    1989-08-01

    We present an exact classical saddle point solution in a non-linear sigma model. It has a topological charge 1/2 and mediates the vacuum transition. The quantum fluctuations and the transition rate are also examined. (author)

  15. Mathematical statistics

    CERN Document Server

    Pestman, Wiebe R

    2009-01-01

    This textbook provides a broad and solid introduction to mathematical statistics, including the classical subjects hypothesis testing, normal regression analysis, and normal analysis of variance. In addition, non-parametric statistics and vectorial statistics are considered, as well as applications of stochastic analysis in modern statistics, e.g., Kolmogorov-Smirnov testing, smoothing techniques, robustness and density estimation. For students with some elementary mathematical background. With many exercises. Prerequisites from measure theory and linear algebra are presented.

  16. Statistical transmutation in doped quantum dimer models.

    Science.gov (United States)

    Lamas, C A; Ralko, A; Cabra, D C; Poilblanc, D; Pujol, P

    2012-07-06

    We prove a "statistical transmutation" symmetry of doped quantum dimer models on the square, triangular, and kagome lattices: the energy spectrum is invariant under a simultaneous change of statistics (i.e., bosonic into fermionic or vice versa) of the holes and of the signs of all the dimer resonance loops. This exact transformation enables us to define the duality equivalence between doped quantum dimer Hamiltonians and provides the analytic framework to analyze dynamical statistical transmutations. We investigate numerically the doping of the triangular quantum dimer model with special focus on the topological Z(2) dimer liquid. Doping leads to four (instead of two for the square lattice) inequivalent families of Hamiltonians. Competition between phase separation, superfluidity, supersolidity, and fermionic phases is investigated in the four families.

  17. Fast and local non-linear evolution of steep wave-groups on deep water: A comparison of approximate models to fully non-linear simulations

    International Nuclear Information System (INIS)

    Adcock, T. A. A.; Taylor, P. H.

    2016-01-01

    The non-linear Schrödinger equation and its higher order extensions are routinely used for analysis of extreme ocean waves. This paper compares the evolution of individual wave-packets modelled using non-linear Schrödinger type equations with packets modelled using fully non-linear potential flow models. The modified non-linear Schrödinger Equation accurately models the relatively large scale non-linear changes to the shape of wave-groups, with a dramatic contraction of the group along the mean propagation direction and a corresponding extension of the width of the wave-crests. In addition, as extreme wave form, there is a local non-linear contraction of the wave-group around the crest which leads to a localised broadening of the wave spectrum which the bandwidth limited non-linear Schrödinger Equations struggle to capture. This limitation occurs for waves of moderate steepness and a narrow underlying spectrum

  18. Robust Combining of Disparate Classifiers Through Order Statistics

    Science.gov (United States)

    Tumer, Kagan; Ghosh, Joydeep

    2001-01-01

    Integrating the outputs of multiple classifiers via combiners or meta-learners has led to substantial improvements in several difficult pattern recognition problems. In this article we investigate a family of combiners based on order statistics, for robust handling of situations where there are large discrepancies in performance of individual classifiers. Based on a mathematical modeling of how the decision boundaries are affected by order statistic combiners, we derive expressions for the reductions in error expected when simple output combination methods based on the the median, the maximum and in general, the ith order statistic, are used. Furthermore, we analyze the trim and spread combiners, both based on linear combinations of the ordered classifier outputs, and show that in the presence of uneven classifier performance, they often provide substantial gains over both linear and simple order statistics combiners. Experimental results on both real world data and standard public domain data sets corroborate these findings.

  19. Ground Motion Models for Future Linear Colliders

    International Nuclear Information System (INIS)

    Seryi, Andrei

    2000-01-01

    Optimization of the parameters of a future linear collider requires comprehensive models of ground motion. Both general models of ground motion and specific models of the particular site and local conditions are essential. Existing models are not completely adequate, either because they are too general, or because they omit important peculiarities of ground motion. The model considered in this paper is based on recent ground motion measurements performed at SLAC and at other accelerator laboratories, as well as on historical data. The issues to be studied for the models to become more predictive are also discussed

  20. Mapping extreme rainfall in the Northwest Portugal region: statistical analysis and spatial modelling

    Science.gov (United States)

    Santos, Monica; Fragoso, Marcelo

    2010-05-01

    Extreme precipitation events are one of the causes of natural hazards, such as floods and landslides, making its investigation so important, and this research aims to contribute to the study of the extreme rainfall patterns in a Portuguese mountainous area. The study area is centred on the Arcos de Valdevez county, located in the northwest region of Portugal, the rainiest of the country, with more than 3000 mm of annual rainfall at the Peneda-Gerês mountain system. This work focus on two main subjects related with the precipitation variability on the study area. First, a statistical analysis of several precipitation parameters is carried out, using daily data from 17 rain-gauges with a complete record for the 1960-1995 period. This approach aims to evaluate the main spatial contrasts regarding different aspects of the rainfall regime, described by ten parameters and indices of precipitation extremes (e.g. mean annual precipitation, the annual frequency of precipitation days, wet spells durations, maximum daily precipitation, maximum of precipitation in 30 days, number of days with rainfall exceeding 100 mm and estimated maximum daily rainfall for a return period of 100 years). The results show that the highest precipitation amounts (from annual to daily scales) and the higher frequency of very abundant rainfall events occur in the Serra da Peneda and Gerês mountains, opposing to the valleys of the Lima, Minho and Vez rivers, with lower precipitation amounts and less frequent heavy storms. The second purpose of this work is to find a method of mapping extreme rainfall in this mountainous region, investigating the complex influence of the relief (e.g. elevation, topography) on the precipitation patterns, as well others geographical variables (e.g. distance from coast, latitude), applying tested geo-statistical techniques (Goovaerts, 2000; Diodato, 2005). Models of linear regression were applied to evaluate the influence of different geographical variables (altitude

  1. Modelling of a linear accelerator VARIAN 600 C/D for dosimetric study using the Monte Carlo Method

    International Nuclear Information System (INIS)

    Cancino, Jorge Luis Batista

    2016-01-01

    Based on the high availability of low energy linear accelerators in Brazil and with the goal of developing a reliable tool for dose distribution calculations in radiotherapy; this research aims to validate a linear accelerator head model using MCNP Monte Carlo code. The Varian 600 C/D linear accelerator installed at the Hospital São João de is taken as reference. The main components of the linear accelerator head were simulated based on detailed information of the manufacturer. In order to calculate dose distribution, a water phantom with dimensions of 30 x 30 x 30 cm 3 was simulated and placed at 100 cm of source-surface distance. A monoenergetic electron beam of 6,3 MeV was considered as a source. The number of primary particles used in the simulation was 10 8 . A Phase-Space Surface was used to scoring the photon spectrum below the tungsten target. Others two were placed in the model in order to reduce computational time and improve statistical accuracy. In order to validate the developed model, the X-ray spectrum generated by Bremsstrahlung was calculated and analyzed. Furthermore, the results of percentage depth doses and beam profiles calculations were compared with available measurements. The MCNP calculations results were compared to measurement showing good agreement between them. The comparison between MCNP calculations and measurement of PDD showed reasonable coherence at build-up region. The results were in an acceptable interval of confidence at the flat region of beam profiles comparison for three different field sizes. In this work, we compared MCNP calculations to experimental data in order to validate the developed LINAC head model. The results showed a good agreement according to the recommended criteria. The developed model was validated as an accurate tool for LINAC quality control procedures. (author)

  2. Predicting musically induced emotions from physiological inputs: linear and neural network models.

    Science.gov (United States)

    Russo, Frank A; Vempala, Naresh N; Sandstrom, Gillian M

    2013-01-01

    Listening to music often leads to physiological responses. Do these physiological responses contain sufficient information to infer emotion induced in the listener? The current study explores this question by attempting to predict judgments of "felt" emotion from physiological responses alone using linear and neural network models. We measured five channels of peripheral physiology from 20 participants-heart rate (HR), respiration, galvanic skin response, and activity in corrugator supercilii and zygomaticus major facial muscles. Using valence and arousal (VA) dimensions, participants rated their felt emotion after listening to each of 12 classical music excerpts. After extracting features from the five channels, we examined their correlation with VA ratings, and then performed multiple linear regression to see if a linear relationship between the physiological responses could account for the ratings. Although linear models predicted a significant amount of variance in arousal ratings, they were unable to do so with valence ratings. We then used a neural network to provide a non-linear account of the ratings. The network was trained on the mean ratings of eight of the 12 excerpts and tested on the remainder. Performance of the neural network confirms that physiological responses alone can be used to predict musically induced emotion. The non-linear model derived from the neural network was more accurate than linear models derived from multiple linear regression, particularly along the valence dimension. A secondary analysis allowed us to quantify the relative contributions of inputs to the non-linear model. The study represents a novel approach to understanding the complex relationship between physiological responses and musically induced emotion.

  3. Textual information access statistical models

    CERN Document Server

    Gaussier, Eric

    2013-01-01

    This book presents statistical models that have recently been developed within several research communities to access information contained in text collections. The problems considered are linked to applications aiming at facilitating information access:- information extraction and retrieval;- text classification and clustering;- opinion mining;- comprehension aids (automatic summarization, machine translation, visualization).In order to give the reader as complete a description as possible, the focus is placed on the probability models used in the applications

  4. Modelling point patterns with linear structures

    DEFF Research Database (Denmark)

    Møller, Jesper; Rasmussen, Jakob Gulddahl

    2009-01-01

    processes whose realizations contain such linear structures. Such a point process is constructed sequentially by placing one point at a time. The points are placed in such a way that new points are often placed close to previously placed points, and the points form roughly line shaped structures. We...... consider simulations of this model and compare with real data....

  5. Modelling point patterns with linear structures

    DEFF Research Database (Denmark)

    Møller, Jesper; Rasmussen, Jakob Gulddahl

    processes whose realizations contain such linear structures. Such a point process is constructed sequentially by placing one point at a time. The points are placed in such a way that new points are often placed close to previously placed points, and the points form roughly line shaped structures. We...... consider simulations of this model and compare with real data....

  6. Reduction of interferences in graphite furnace atomic absorption spectrometry by multiple linear regression modelling

    Science.gov (United States)

    Grotti, Marco; Abelmoschi, Maria Luisa; Soggia, Francesco; Tiberiade, Christian; Frache, Roberto

    2000-12-01

    The multivariate effects of Na, K, Mg and Ca as nitrates on the electrothermal atomisation of manganese, cadmium and iron were studied by multiple linear regression modelling. Since the models proved to efficiently predict the effects of the considered matrix elements in a wide range of concentrations, they were applied to correct the interferences occurring in the determination of trace elements in seawater after pre-concentration of the analytes. In order to obtain a statistically significant number of samples, a large volume of the certified seawater reference materials CASS-3 and NASS-3 was treated with Chelex-100 resin; then, the chelating resin was separated from the solution, divided into several sub-samples, each of them was eluted with nitric acid and analysed by electrothermal atomic absorption spectrometry (for trace element determinations) and inductively coupled plasma optical emission spectrometry (for matrix element determinations). To minimise any other systematic error besides that due to matrix effects, accuracy of the pre-concentration step and contamination levels of the procedure were checked by inductively coupled plasma mass spectrometric measurements. Analytical results obtained by applying the multiple linear regression models were compared with those obtained with other calibration methods, such as external calibration using acid-based standards, external calibration using matrix-matched standards and the analyte addition technique. Empirical models proved to efficiently reduce interferences occurring in the analysis of real samples, allowing an improvement of accuracy better than for other calibration methods.

  7. Model for neural signaling leap statistics

    International Nuclear Information System (INIS)

    Chevrollier, Martine; Oria, Marcos

    2011-01-01

    We present a simple model for neural signaling leaps in the brain considering only the thermodynamic (Nernst) potential in neuron cells and brain temperature. We numerically simulated connections between arbitrarily localized neurons and analyzed the frequency distribution of the distances reached. We observed qualitative change between Normal statistics (with T 37.5 0 C, awaken regime) and Levy statistics (T = 35.5 0 C, sleeping period), characterized by rare events of long range connections.

  8. Model for neural signaling leap statistics

    Science.gov (United States)

    Chevrollier, Martine; Oriá, Marcos

    2011-03-01

    We present a simple model for neural signaling leaps in the brain considering only the thermodynamic (Nernst) potential in neuron cells and brain temperature. We numerically simulated connections between arbitrarily localized neurons and analyzed the frequency distribution of the distances reached. We observed qualitative change between Normal statistics (with T = 37.5°C, awaken regime) and Lévy statistics (T = 35.5°C, sleeping period), characterized by rare events of long range connections.

  9. A statistical-dynamical modeling approach for the simulation of local paleo proxy records using GCM output

    Energy Technology Data Exchange (ETDEWEB)

    Reichert, B.K.; Bengtsson, L. [Max-Planck-Institut fuer Meteorologie, Hamburg (Germany); Aakesson, O. [Sveriges Meteorologiska och Hydrologiska Inst., Norrkoeping (Sweden)

    1998-08-01

    Recent proxy data obtained from ice core measurements, dendrochronology and valley glaciers provide important information on the evolution of the regional or local climate. General circulation models integrated over a long period of time could help to understand the (external and internal) forcing mechanisms of natural climate variability. For a systematic interpretation of in situ paleo proxy records, a combined method of dynamical and statistical modeling is proposed. Local 'paleo records' can be simulated from GCM output by first undertaking a model-consistent statistical downscaling and then using a process-based forward modeling approach to obtain the behavior of valley glaciers and the growth of trees under specific conditions. The simulated records can be compared to actual proxy records in order to investigate whether e.g. the response of glaciers to climatic change can be reproduced by models and to what extent climate variability obtained from proxy records (with the main focus on the last millennium) can be represented. For statistical downscaling to local weather conditions, a multiple linear forward regression model is used. Daily sets of observed weather station data and various large-scale predictors at 7 pressure levels obtained from ECMWF reanalyses are used for development of the model. Daily data give the closest and most robust relationships due to the strong dependence on individual synoptic-scale patterns. For some local variables, the performance of the model can be further increased by developing seasonal specific statistical relationships. The model is validated using both independent and restricted predictor data sets. The model is applied to a long integration of a mixed layer GCM experiment simulating pre-industrial climate variability. The dynamical-statistical local GCM output within a region around Nigardsbreen glacier, Norway is compared to nearby observed station data for the period 1868-1993. Patterns of observed

  10. WE-A-201-02: Modern Statistical Modeling

    Energy Technology Data Exchange (ETDEWEB)

    Niemierko, A.

    2016-06-15

    Chris Marshall: Memorial Introduction Donald Edmonds Herbert Jr., or Don to his colleagues and friends, exemplified the “big tent” vision of medical physics, specializing in Applied Statistics and Dynamical Systems theory. He saw, more clearly than most, that “Making models is the difference between doing science and just fooling around [ref Woodworth, 2004]”. Don developed an interest in chemistry at school by “reading a book” - a recurring theme in his story. He was awarded a Westinghouse Science scholarship and attended the Carnegie Institute of Technology (later Carnegie Mellon University) where his interest turned to physics and led to a BS in Physics after transfer to Northwestern University. After (voluntary) service in the Navy he earned his MS in Physics from the University of Oklahoma, which led him to Johns Hopkins University in Baltimore to pursue a PhD. The early death of his wife led him to take a salaried position in the Physics Department of Colorado College in Colorado Springs so as to better care for their young daughter. There, a chance invitation from Dr. Juan del Regato to teach physics to residents at the Penrose Cancer Hospital introduced him to Medical Physics, and he decided to enter the field. He received his PhD from the University of London (UK) under Prof. Joseph Rotblat, where I first met him, and where he taught himself statistics. He returned to Penrose as a clinical medical physicist, also largely self-taught. In 1975 he formalized an evolving interest in statistical analysis as Professor of Radiology and Head of the Division of Physics and Statistics at the College of Medicine of the University of South Alabama in Mobile, AL where he remained for the rest of his career. He also served as the first Director of their Bio-Statistics and Epidemiology Core Unit working in part on a sickle-cell disease. After retirement he remained active as Professor Emeritus. Don served for several years as a consultant to the Nuclear

  11. WE-A-201-02: Modern Statistical Modeling

    International Nuclear Information System (INIS)

    Niemierko, A.

    2016-01-01

    Chris Marshall: Memorial Introduction Donald Edmonds Herbert Jr., or Don to his colleagues and friends, exemplified the “big tent” vision of medical physics, specializing in Applied Statistics and Dynamical Systems theory. He saw, more clearly than most, that “Making models is the difference between doing science and just fooling around [ref Woodworth, 2004]”. Don developed an interest in chemistry at school by “reading a book” - a recurring theme in his story. He was awarded a Westinghouse Science scholarship and attended the Carnegie Institute of Technology (later Carnegie Mellon University) where his interest turned to physics and led to a BS in Physics after transfer to Northwestern University. After (voluntary) service in the Navy he earned his MS in Physics from the University of Oklahoma, which led him to Johns Hopkins University in Baltimore to pursue a PhD. The early death of his wife led him to take a salaried position in the Physics Department of Colorado College in Colorado Springs so as to better care for their young daughter. There, a chance invitation from Dr. Juan del Regato to teach physics to residents at the Penrose Cancer Hospital introduced him to Medical Physics, and he decided to enter the field. He received his PhD from the University of London (UK) under Prof. Joseph Rotblat, where I first met him, and where he taught himself statistics. He returned to Penrose as a clinical medical physicist, also largely self-taught. In 1975 he formalized an evolving interest in statistical analysis as Professor of Radiology and Head of the Division of Physics and Statistics at the College of Medicine of the University of South Alabama in Mobile, AL where he remained for the rest of his career. He also served as the first Director of their Bio-Statistics and Epidemiology Core Unit working in part on a sickle-cell disease. After retirement he remained active as Professor Emeritus. Don served for several years as a consultant to the Nuclear

  12. Bayesian models a statistical primer for ecologists

    CERN Document Server

    Hobbs, N Thompson

    2015-01-01

    Bayesian modeling has become an indispensable tool for ecological research because it is uniquely suited to deal with complexity in a statistically coherent way. This textbook provides a comprehensive and accessible introduction to the latest Bayesian methods-in language ecologists can understand. Unlike other books on the subject, this one emphasizes the principles behind the computations, giving ecologists a big-picture understanding of how to implement this powerful statistical approach. Bayesian Models is an essential primer for non-statisticians. It begins with a definition of probabili

  13. Biochemical methane potential prediction of plant biomasses: Comparing chemical composition versus near infrared methods and linear versus non-linear models.

    Science.gov (United States)

    Godin, Bruno; Mayer, Frédéric; Agneessens, Richard; Gerin, Patrick; Dardenne, Pierre; Delfosse, Philippe; Delcarte, Jérôme

    2015-01-01

    The reliability of different models to predict the biochemical methane potential (BMP) of various plant biomasses using a multispecies dataset was compared. The most reliable prediction models of the BMP were those based on the near infrared (NIR) spectrum compared to those based on the chemical composition. The NIR predictions of local (specific regression and non-linear) models were able to estimate quantitatively, rapidly, cheaply and easily the BMP. Such a model could be further used for biomethanation plant management and optimization. The predictions of non-linear models were more reliable compared to those of linear models. The presentation form (green-dried, silage-dried and silage-wet form) of biomasses to the NIR spectrometer did not influence the performances of the NIR prediction models. The accuracy of the BMP method should be improved to enhance further the BMP prediction models. Copyright © 2014 Elsevier Ltd. All rights reserved.

  14. Methodology in robust and nonparametric statistics

    CERN Document Server

    Jurecková, Jana; Picek, Jan

    2012-01-01

    Introduction and SynopsisIntroductionSynopsisPreliminariesIntroductionInference in Linear ModelsRobustness ConceptsRobust and Minimax Estimation of LocationClippings from Probability and Asymptotic TheoryProblemsRobust Estimation of Location and RegressionIntroductionM-EstimatorsL-EstimatorsR-EstimatorsMinimum Distance and Pitman EstimatorsDifferentiable Statistical FunctionsProblemsAsymptotic Representations for L-Estimators

  15. Construction of multiple linear regression models using blood biomarkers for selecting against abdominal fat traits in broilers.

    Science.gov (United States)

    Dong, J Q; Zhang, X Y; Wang, S Z; Jiang, X F; Zhang, K; Ma, G W; Wu, M Q; Li, H; Zhang, H

    2018-01-01

    Plasma very low-density lipoprotein (VLDL) can be used to select for low body fat or abdominal fat (AF) in broilers, but its correlation with AF is limited. We investigated whether any other biochemical indicator can be used in combination with VLDL for a better selective effect. Nineteen plasma biochemical indicators were measured in male chickens from the Northeast Agricultural University broiler lines divergently selected for AF content (NEAUHLF) in the fed state at 46 and 48 d of age. The average concentration of every parameter for the 2 d was used for statistical analysis. Levels of these 19 plasma biochemical parameters were compared between the lean and fat lines. The phenotypic correlations between these plasma biochemical indicators and AF traits were analyzed. Then, multiple linear regression models were constructed to select the best model used for selecting against AF content. and the heritabilities of plasma indicators contained in the best models were estimated. The results showed that 11 plasma biochemical indicators (triglycerides, total bile acid, total protein, globulin, albumin/globulin, aspartate transaminase, alanine transaminase, gamma-glutamyl transpeptidase, uric acid, creatinine, and VLDL) differed significantly between the lean and fat lines (P linear regression models based on albumin/globulin, VLDL, triglycerides, globulin, total bile acid, and uric acid, had higher R2 (0.73) than the model based only on VLDL (0.21). The plasma parameters included in the best models had moderate heritability estimates (0.21 ≤ h2 ≤ 0.43). These results indicate that these multiple linear regression models can be used to select for lean broiler chickens. © 2017 Poultry Science Association Inc.

  16. The effect of disinfection of alginate impressions with 35% beetle juice spray on stone model linear dimensional changes

    Directory of Open Access Journals (Sweden)

    Anggra Yudha Ramadianto

    2007-07-01

    Full Text Available Dimensional stability of alginate impression is very important for treatment in dentistry. This study was to find the effect of the beetle juice spray procedure on alginate impression on gypsum model linear dimensional changes. This experimental study used 25 samples, divided into 5 groups. The first group, as control, were the alginate impressions filled with dental stone immediately after forming. The other four groups were the alginate impressions gel spray each 1,2,3, and 4 times with 35% beetle juice and then filled with dental stone. Dimensional changes were measured in the lower part of the plaster model from buccal-lingual and mesial-distal direction and also measured in the outer distance between the upper part of the stone model by using Mitutoyo digital micrometre and profile projector scaled 0,001 mm. The results of mesial-distal diameter average of the control group and group 2,3,4, and 5 were 9.909 mm, 9.852 mm, 9.845 mm, 9.824 mm, and 9.754 mm. Meanwhile, the results of buccal-lingual diameter average were 9.847 mm, 9.841 mm, 9.826 mm, 9.776 mm, and 9.729 mm. The results of the outer distance between the upper part of the stone model were 31.739 mm, 31.689 mm, 31.682 mm, 31.670 mm, and 31.670 mm. The data of this study was evaluated statistically based on the variant analysis. The conclusion of this study was statistically, there was no significant effect on gypsum model linear dimensional changes obtained from alginate impressions sprayed with 35% beetle juice.

  17. Characterizing the performance of the Conway-Maxwell Poisson generalized linear model.

    Science.gov (United States)

    Francis, Royce A; Geedipally, Srinivas Reddy; Guikema, Seth D; Dhavala, Soma Sekhar; Lord, Dominique; LaRocca, Sarah

    2012-01-01

    Count data are pervasive in many areas of risk analysis; deaths, adverse health outcomes, infrastructure system failures, and traffic accidents are all recorded as count events, for example. Risk analysts often wish to estimate the probability distribution for the number of discrete events as part of doing a risk assessment. Traditional count data regression models of the type often used in risk assessment for this problem suffer from limitations due to the assumed variance structure. A more flexible model based on the Conway-Maxwell Poisson (COM-Poisson) distribution was recently proposed, a model that has the potential to overcome the limitations of the traditional model. However, the statistical performance of this new model has not yet been fully characterized. This article assesses the performance of a maximum likelihood estimation method for fitting the COM-Poisson generalized linear model (GLM). The objectives of this article are to (1) characterize the parameter estimation accuracy of the MLE implementation of the COM-Poisson GLM, and (2) estimate the prediction accuracy of the COM-Poisson GLM using simulated data sets. The results of the study indicate that the COM-Poisson GLM is flexible enough to model under-, equi-, and overdispersed data sets with different sample mean values. The results also show that the COM-Poisson GLM yields accurate parameter estimates. The COM-Poisson GLM provides a promising and flexible approach for performing count data regression. © 2011 Society for Risk Analysis.

  18. Equilibrium statistical mechanics of lattice models

    CERN Document Server

    Lavis, David A

    2015-01-01

    Most interesting and difficult problems in equilibrium statistical mechanics concern models which exhibit phase transitions. For graduate students and more experienced researchers this book provides an invaluable reference source of approximate and exact solutions for a comprehensive range of such models. Part I contains background material on classical thermodynamics and statistical mechanics, together with a classification and survey of lattice models. The geometry of phase transitions is described and scaling theory is used to introduce critical exponents and scaling laws. An introduction is given to finite-size scaling, conformal invariance and Schramm—Loewner evolution. Part II contains accounts of classical mean-field methods. The parallels between Landau expansions and catastrophe theory are discussed and Ginzburg—Landau theory is introduced. The extension of mean-field theory to higher-orders is explored using the Kikuchi—Hijmans—De Boer hierarchy of approximations. In Part III the use of alge...

  19. Estimation of group means when adjusting for covariates in generalized linear models.

    Science.gov (United States)

    Qu, Yongming; Luo, Junxiang

    2015-01-01

    Generalized linear models are commonly used to analyze categorical data such as binary, count, and ordinal outcomes. Adjusting for important prognostic factors or baseline covariates in generalized linear models may improve the estimation efficiency. The model-based mean for a treatment group produced by most software packages estimates the response at the mean covariate, not the mean response for this treatment group for the studied population. Although this is not an issue for linear models, the model-based group mean estimates in generalized linear models could be seriously biased for the true group means. We propose a new method to estimate the group mean consistently with the corresponding variance estimation. Simulation showed the proposed method produces an unbiased estimator for the group means and provided the correct coverage probability. The proposed method was applied to analyze hypoglycemia data from clinical trials in diabetes. Copyright © 2014 John Wiley & Sons, Ltd.

  20. Computational and Statistical Models: A Comparison for Policy Modeling of Childhood Obesity

    Science.gov (United States)

    Mabry, Patricia L.; Hammond, Ross; Ip, Edward Hak-Sing; Huang, Terry T.-K.

    As systems science methodologies have begun to emerge as a set of innovative approaches to address complex problems in behavioral, social science, and public health research, some apparent conflicts with traditional statistical methodologies for public health have arisen. Computational modeling is an approach set in context that integrates diverse sources of data to test the plausibility of working hypotheses and to elicit novel ones. Statistical models are reductionist approaches geared towards proving the null hypothesis. While these two approaches may seem contrary to each other, we propose that they are in fact complementary and can be used jointly to advance solutions to complex problems. Outputs from statistical models can be fed into computational models, and outputs from computational models can lead to further empirical data collection and statistical models. Together, this presents an iterative process that refines the models and contributes to a greater understanding of the problem and its potential solutions. The purpose of this panel is to foster communication and understanding between statistical and computational modelers. Our goal is to shed light on the differences between the approaches and convey what kinds of research inquiries each one is best for addressing and how they can serve complementary (and synergistic) roles in the research process, to mutual benefit. For each approach the panel will cover the relevant "assumptions" and how the differences in what is assumed can foster misunderstandings. The interpretations of the results from each approach will be compared and contrasted and the limitations for each approach will be delineated. We will use illustrative examples from CompMod, the Comparative Modeling Network for Childhood Obesity Policy. The panel will also incorporate interactive discussions with the audience on the issues raised here.

  1. Identifiability Results for Several Classes of Linear Compartment Models.

    Science.gov (United States)

    Meshkat, Nicolette; Sullivant, Seth; Eisenberg, Marisa

    2015-08-01

    Identifiability concerns finding which unknown parameters of a model can be estimated, uniquely or otherwise, from given input-output data. If some subset of the parameters of a model cannot be determined given input-output data, then we say the model is unidentifiable. In this work, we study linear compartment models, which are a class of biological models commonly used in pharmacokinetics, physiology, and ecology. In past work, we used commutative algebra and graph theory to identify a class of linear compartment models that we call identifiable cycle models, which are unidentifiable but have the simplest possible identifiable functions (so-called monomial cycles). Here we show how to modify identifiable cycle models by adding inputs, adding outputs, or removing leaks, in such a way that we obtain an identifiable model. We also prove a constructive result on how to combine identifiable models, each corresponding to strongly connected graphs, into a larger identifiable model. We apply these theoretical results to several real-world biological models from physiology, cell biology, and ecology.

  2. A non-linear model of economic production processes

    Science.gov (United States)

    Ponzi, A.; Yasutomi, A.; Kaneko, K.

    2003-06-01

    We present a new two phase model of economic production processes which is a non-linear dynamical version of von Neumann's neoclassical model of production, including a market price-setting phase as well as a production phase. The rate of an economic production process is observed, for the first time, to depend on the minimum of its input supplies. This creates highly non-linear supply and demand dynamics. By numerical simulation, production networks are shown to become unstable when the ratio of different products to total processes increases. This provides some insight into observed stability of competitive capitalist economies in comparison to monopolistic economies. Capitalist economies are also shown to have low unemployment.

  3. Linear minimax estimation for random vectors with parametric uncertainty

    KAUST Repository

    Bitar, E

    2010-06-01

    In this paper, we take a minimax approach to the problem of computing a worst-case linear mean squared error (MSE) estimate of X given Y , where X and Y are jointly distributed random vectors with parametric uncertainty in their distribution. We consider two uncertainty models, PA and PB. Model PA represents X and Y as jointly Gaussian whose covariance matrix Λ belongs to the convex hull of a set of m known covariance matrices. Model PB characterizes X and Y as jointly distributed according to a Gaussian mixture model with m known zero-mean components, but unknown component weights. We show: (a) the linear minimax estimator computed under model PA is identical to that computed under model PB when the vertices of the uncertain covariance set in PA are the same as the component covariances in model PB, and (b) the problem of computing the linear minimax estimator under either model reduces to a semidefinite program (SDP). We also consider the dynamic situation where x(t) and y(t) evolve according to a discrete-time LTI state space model driven by white noise, the statistics of which is modeled by PA and PB as before. We derive a recursive linear minimax filter for x(t) given y(t).

  4. Spherical Process Models for Global Spatial Statistics

    KAUST Repository

    Jeong, Jaehong; Jun, Mikyoung; Genton, Marc G.

    2017-01-01

    Statistical models used in geophysical, environmental, and climate science applications must reflect the curvature of the spatial domain in global data. Over the past few decades, statisticians have developed covariance models that capture

  5. Model for neural signaling leap statistics

    Energy Technology Data Exchange (ETDEWEB)

    Chevrollier, Martine; Oria, Marcos, E-mail: oria@otica.ufpb.br [Laboratorio de Fisica Atomica e Lasers Departamento de Fisica, Universidade Federal da ParaIba Caixa Postal 5086 58051-900 Joao Pessoa, Paraiba (Brazil)

    2011-03-01

    We present a simple model for neural signaling leaps in the brain considering only the thermodynamic (Nernst) potential in neuron cells and brain temperature. We numerically simulated connections between arbitrarily localized neurons and analyzed the frequency distribution of the distances reached. We observed qualitative change between Normal statistics (with T 37.5{sup 0}C, awaken regime) and Levy statistics (T = 35.5{sup 0}C, sleeping period), characterized by rare events of long range connections.

  6. The linear nonthreshold (LNT) model as used in radiation protection: an NCRP update.

    Science.gov (United States)

    Boice, John D

    2017-10-01

    The linear nonthreshold (LNT) model has been used in radiation protection for over 40 years and has been hotly debated. It relies heavily on human epidemiology, with support from radiobiology. The scientific underpinnings include NCRP Report No. 136 ('Evaluation of the Linear-Nonthreshold Dose-Response Model for Ionizing Radiation'), UNSCEAR 2000, ICRP Publication 99 (2004) and the National Academies BEIR VII Report (2006). NCRP Scientific Committee 1-25 is reviewing recent epidemiologic studies focusing on dose-response models, including threshold, and the relevance to radiation protection. Recent studies after the BEIR VII Report are being critically reviewed and include atomic-bomb survivors, Mayak workers, atomic veterans, populations on the Techa River, U.S. radiological technologists, the U.S. Million Person Study, international workers (INWORKS), Chernobyl cleanup workers, children given computerized tomography scans, and tuberculosis-fluoroscopy patients. Methodologic limitations, dose uncertainties and statistical approaches (and modeling assumptions) are being systematically evaluated. The review of studies continues and will be published as an NCRP commentary in 2017. Most studies reviewed to date are consistent with a straight-line dose response but there are a few exceptions. In the past, the scientific consensus process has worked in providing practical and prudent guidance. So pragmatic judgment is anticipated. The evaluations are ongoing and the extensive NCRP review process has just begun, so no decisions or recommendations are in stone. The march of science requires a constant assessment of emerging evidence to provide an optimum, though not necessarily perfect, approach to radiation protection. Alternatives to the LNT model may be forthcoming, e.g. an approach that couples the best epidemiology with biologically-based models of carcinogenesis, focusing on chronic (not acute) exposure circumstances. Currently for the practical purposes of

  7. Effect Displays in R for Generalised Linear Models

    Directory of Open Access Journals (Sweden)

    John Fox

    2003-07-01

    Full Text Available This paper describes the implementation in R of a method for tabular or graphical display of terms in a complex generalised linear model. By complex, I mean a model that contains terms related by marginality or hierarchy, such as polynomial terms, or main effects and interactions. I call these tables or graphs effect displays. Effect displays are constructed by identifying high-order terms in a generalised linear model. Fitted values under the model are computed for each such term. The lower-order "relatives" of a high-order term (e.g., main effects marginal to an interaction are absorbed into the term, allowing the predictors appearing in the high-order term to range over their values. The values of other predictors are fixed at typical values: for example, a covariate could be fixed at its mean or median, a factor at its proportional distribution in the data, or to equal proportions in its several levels. Variations of effect displays are also described, including representation of terms higher-order to any appearing in the model.

  8. Analysis and Evaluation of Statistical Models for Integrated Circuits Design

    Directory of Open Access Journals (Sweden)

    Sáenz-Noval J.J.

    2011-10-01

    Full Text Available Statistical models for integrated circuits (IC allow us to estimate the percentage of acceptable devices in the batch before fabrication. Actually, Pelgrom is the statistical model most accepted in the industry; however it was derived from a micrometer technology, which does not guarantee reliability in nanometric manufacturing processes. This work considers three of the most relevant statistical models in the industry and evaluates their limitations and advantages in analog design, so that the designer has a better criterion to make a choice. Moreover, it shows how several statistical models can be used for each one of the stages and design purposes.

  9. H∞ /H2 model reduction through dilated linear matrix inequalities

    DEFF Research Database (Denmark)

    Adegas, Fabiano Daher; Stoustrup, Jakob

    2012-01-01

    This paper presents sufficient dilated linear matrix inequalities (LMI) conditions to the $H_{infty}$ and $H_{2}$ model reduction problem. A special structure of the auxiliary (slack) variables allows the original model of order $n$ to be reduced to an order $r=n/s$ where $n,r,s in field{N}$. Arb......This paper presents sufficient dilated linear matrix inequalities (LMI) conditions to the $H_{infty}$ and $H_{2}$ model reduction problem. A special structure of the auxiliary (slack) variables allows the original model of order $n$ to be reduced to an order $r=n/s$ where $n,r,s in field...

  10. The issue of statistical power for overall model fit in evaluating structural equation models

    Directory of Open Access Journals (Sweden)

    Richard HERMIDA

    2015-06-01

    Full Text Available Statistical power is an important concept for psychological research. However, examining the power of a structural equation model (SEM is rare in practice. This article provides an accessible review of the concept of statistical power for the Root Mean Square Error of Approximation (RMSEA index of overall model fit in structural equation modeling. By way of example, we examine the current state of power in the literature by reviewing studies in top Industrial-Organizational (I/O Psychology journals using SEMs. Results indicate that in many studies, power is very low, which implies acceptance of invalid models. Additionally, we examined methodological situations which may have an influence on statistical power of SEMs. Results showed that power varies significantly as a function of model type and whether or not the model is the main model for the study. Finally, results indicated that power is significantly related to model fit statistics used in evaluating SEMs. The results from this quantitative review imply that researchers should be more vigilant with respect to power in structural equation modeling. We therefore conclude by offering methodological best practices to increase confidence in the interpretation of structural equation modeling results with respect to statistical power issues.

  11. Optimization Research of Generation Investment Based on Linear Programming Model

    Science.gov (United States)

    Wu, Juan; Ge, Xueqian

    Linear programming is an important branch of operational research and it is a mathematical method to assist the people to carry out scientific management. GAMS is an advanced simulation and optimization modeling language and it will combine a large number of complex mathematical programming, such as linear programming LP, nonlinear programming NLP, MIP and other mixed-integer programming with the system simulation. In this paper, based on the linear programming model, the optimized investment decision-making of generation is simulated and analyzed. At last, the optimal installed capacity of power plants and the final total cost are got, which provides the rational decision-making basis for optimized investments.

  12. Improved model for statistical alignment

    Energy Technology Data Exchange (ETDEWEB)

    Miklos, I.; Toroczkai, Z. (Zoltan)

    2001-01-01

    The statistical approach to molecular sequence evolution involves the stochastic modeling of the substitution, insertion and deletion processes. Substitution has been modeled in a reliable way for more than three decades by using finite Markov-processes. Insertion and deletion, however, seem to be more difficult to model, and thc recent approaches cannot acceptably deal with multiple insertions and deletions. A new method based on a generating function approach is introduced to describe the multiple insertion process. The presented algorithm computes the approximate joint probability of two sequences in 0(13) running time where 1 is the geometric mean of the sequence lengths.

  13. Modeling of non-linear CHP efficiency curves in distributed energy systems

    DEFF Research Database (Denmark)

    Milan, Christian; Stadler, Michael; Cardoso, Gonçalo

    2015-01-01

    Distributed energy resources gain an increased importance in commercial and industrial building design. Combined heat and power (CHP) units are considered as one of the key technologies for cost and emission reduction in buildings. In order to make optimal decisions on investment and operation...... for these technologies, detailed system models are needed. These models are often formulated as linear programming problems to keep computational costs and complexity in a reasonable range. However, CHP systems involve variations of the efficiency for large nameplate capacity ranges and in case of part load operation......, which can be even of non-linear nature. Since considering these characteristics would turn the models into non-linear problems, in most cases only constant efficiencies are assumed. This paper proposes possible solutions to address this issue. For a mixed integer linear programming problem two...

  14. Whole vertebral bone segmentation method with a statistical intensity-shape model based approach

    Science.gov (United States)

    Hanaoka, Shouhei; Fritscher, Karl; Schuler, Benedikt; Masutani, Yoshitaka; Hayashi, Naoto; Ohtomo, Kuni; Schubert, Rainer

    2011-03-01

    An automatic segmentation algorithm for the vertebrae in human body CT images is presented. Especially we focused on constructing and utilizing 4 different statistical intensity-shape combined models for the cervical, upper / lower thoracic and lumbar vertebrae, respectively. For this purpose, two previously reported methods were combined: a deformable model-based initial segmentation method and a statistical shape-intensity model-based precise segmentation method. The former is used as a pre-processing to detect the position and orientation of each vertebra, which determines the initial condition for the latter precise segmentation method. The precise segmentation method needs prior knowledge on both the intensities and the shapes of the objects. After PCA analysis of such shape-intensity expressions obtained from training image sets, vertebrae were parametrically modeled as a linear combination of the principal component vectors. The segmentation of each target vertebra was performed as fitting of this parametric model to the target image by maximum a posteriori estimation, combined with the geodesic active contour method. In the experimental result by using 10 cases, the initial segmentation was successful in 6 cases and only partially failed in 4 cases (2 in the cervical area and 2 in the lumbo-sacral). In the precise segmentation, the mean error distances were 2.078, 1.416, 0.777, 0.939 mm for cervical, upper and lower thoracic, lumbar spines, respectively. In conclusion, our automatic segmentation algorithm for the vertebrae in human body CT images showed a fair performance for cervical, thoracic and lumbar vertebrae.

  15. Non-linear characterisation of the physical model of an ancient masonry bridge

    International Nuclear Information System (INIS)

    Fragonara, L Zanotti; Ceravolo, R; Matta, E; Quattrone, A; De Stefano, A; Pecorelli, M

    2012-01-01

    This paper presents the non-linear investigations carried out on a scaled model of a two-span masonry arch bridge. The model has been built in order to study the effect of the central pile settlement due to riverbank erosion. Progressive damage was induced in several steps by applying increasing settlements at the central pier. For each settlement step, harmonic shaker tests were conducted under different excitation levels, this allowing for the non-linear identification of the progressively damaged system. The shaker tests have been performed at resonance with the modal frequency of the structure, which were determined from a previous linear identification. Estimated non-linearity parameters, which result from the systematic application of restoring force based identification algorithms, can corroborate models to be used in the reassessment of existing structures. The method used for non-linear identification allows monitoring the evolution of non-linear parameters or indicators which can be used in damage and safety assessment.

  16. Daily precipitation statistics in regional climate models

    DEFF Research Database (Denmark)

    Frei, Christoph; Christensen, Jens Hesselbjerg; Déqué, Michel

    2003-01-01

    An evaluation is undertaken of the statistics of daily precipitation as simulated by five regional climate models using comprehensive observations in the region of the European Alps. Four limited area models and one variable-resolution global model are considered, all with a grid spacing of 50 km...

  17. Infinite Random Graphs as Statistical Mechanical Models

    DEFF Research Database (Denmark)

    Durhuus, Bergfinnur Jøgvan; Napolitano, George Maria

    2011-01-01

    We discuss two examples of infinite random graphs obtained as limits of finite statistical mechanical systems: a model of two-dimensional dis-cretized quantum gravity defined in terms of causal triangulated surfaces, and the Ising model on generic random trees. For the former model we describe a ...

  18. Study of the critical behavior of the O(N) linear and nonlinear sigma models

    International Nuclear Information System (INIS)

    Graziani, F.R.

    1983-01-01

    A study of the large N behavior of both the O(N) linear and nonlinear sigma models is presented. The purpose is to investigate the relationship between the disordered (ordered) phase of the linear and nonlinear sigma models. Utilizing operator product expansions and stability analyses, it is shown that for 2 - (lambda/sub R/(M) is the dimensionless renormalized quartic coupling and lambda* is the IR fixed point) limit of the linear sigma model which yields the nonlinear sigma model. It is also shown that stable large N linear sigma models with lambda 0) and nonlinear models are trivial. This result (i.e., triviality) is well known but only for one and two component models. Interestingly enough, the lambda< d = 4 linear sigma model remains nontrivial and tachyonic free

  19. Linear versus non-linear supersymmetry, in general

    Energy Technology Data Exchange (ETDEWEB)

    Ferrara, Sergio [Theoretical Physics Department, CERN,CH-1211 Geneva 23 (Switzerland); INFN - Laboratori Nazionali di Frascati,Via Enrico Fermi 40, I-00044 Frascati (Italy); Department of Physics and Astronomy, UniversityC.L.A.,Los Angeles, CA 90095-1547 (United States); Kallosh, Renata [SITP and Department of Physics, Stanford University,Stanford, California 94305 (United States); Proeyen, Antoine Van [Institute for Theoretical Physics, Katholieke Universiteit Leuven,Celestijnenlaan 200D, B-3001 Leuven (Belgium); Wrase, Timm [Institute for Theoretical Physics, Technische Universität Wien,Wiedner Hauptstr. 8-10, A-1040 Vienna (Austria)

    2016-04-12

    We study superconformal and supergravity models with constrained superfields. The underlying version of such models with all unconstrained superfields and linearly realized supersymmetry is presented here, in addition to the physical multiplets there are Lagrange multiplier (LM) superfields. Once the equations of motion for the LM superfields are solved, some of the physical superfields become constrained. The linear supersymmetry of the original models becomes non-linearly realized, its exact form can be deduced from the original linear supersymmetry. Known examples of constrained superfields are shown to require the following LM’s: chiral superfields, linear superfields, general complex superfields, some of them are multiplets with a spin.

  20. Linear versus non-linear supersymmetry, in general

    International Nuclear Information System (INIS)

    Ferrara, Sergio; Kallosh, Renata; Proeyen, Antoine Van; Wrase, Timm

    2016-01-01

    We study superconformal and supergravity models with constrained superfields. The underlying version of such models with all unconstrained superfields and linearly realized supersymmetry is presented here, in addition to the physical multiplets there are Lagrange multiplier (LM) superfields. Once the equations of motion for the LM superfields are solved, some of the physical superfields become constrained. The linear supersymmetry of the original models becomes non-linearly realized, its exact form can be deduced from the original linear supersymmetry. Known examples of constrained superfields are shown to require the following LM’s: chiral superfields, linear superfields, general complex superfields, some of them are multiplets with a spin.

  1. Breaking of ensembles of linear and nonlinear oscillators

    International Nuclear Information System (INIS)

    Buts, V.A.

    2016-01-01

    Some results concerning the study of the dynamics of ensembles of linear and nonlinear oscillators are stated. It is shown that, in general, a stable ensemble of linear oscillator has a limited number of oscillators. This number has been defined for some simple models. It is shown that the features of the dynamics of linear oscillators can be used for conversion of the low-frequency energy oscillations into high frequency oscillations. The dynamics of coupled nonlinear oscillators in most cases is chaotic. For such a case, it is shown that the statistical characteristics (moments) of chaotic motion can significantly reduce potential barriers that keep the particles in the capture region

  2. Chaos as an intermittently forced linear system.

    Science.gov (United States)

    Brunton, Steven L; Brunton, Bingni W; Proctor, Joshua L; Kaiser, Eurika; Kutz, J Nathan

    2017-05-30

    Understanding the interplay of order and disorder in chaos is a central challenge in modern quantitative science. Approximate linear representations of nonlinear dynamics have long been sought, driving considerable interest in Koopman theory. We present a universal, data-driven decomposition of chaos as an intermittently forced linear system. This work combines delay embedding and Koopman theory to decompose chaotic dynamics into a linear model in the leading delay coordinates with forcing by low-energy delay coordinates; this is called the Hankel alternative view of Koopman (HAVOK) analysis. This analysis is applied to the Lorenz system and real-world examples including Earth's magnetic field reversal and measles outbreaks. In each case, forcing statistics are non-Gaussian, with long tails corresponding to rare intermittent forcing that precedes switching and bursting phenomena. The forcing activity demarcates coherent phase space regions where the dynamics are approximately linear from those that are strongly nonlinear.The huge amount of data generated in fields like neuroscience or finance calls for effective strategies that mine data to reveal underlying dynamics. Here Brunton et al.develop a data-driven technique to analyze chaotic systems and predict their dynamics in terms of a forced linear model.

  3. Inverse Modelling Problems in Linear Algebra Undergraduate Courses

    Science.gov (United States)

    Martinez-Luaces, Victor E.

    2013-01-01

    This paper will offer an analysis from a theoretical point of view of mathematical modelling, applications and inverse problems of both causation and specification types. Inverse modelling problems give the opportunity to establish connections between theory and practice and to show this fact, a simple linear algebra example in two different…

  4. ANALISIS MODEL REGRESI NONPARAMETRIK SIRKULAR-LINEAR BERGANDA

    Directory of Open Access Journals (Sweden)

    KOMANG CANDRA IVAN

    2016-05-01

    Full Text Available Circular data are data which the value in form of vector is circular data. Statistic analysis that is used in analyzing circular data is circular statistics analysis. In regression analysis, if any of predictor or response variables or both are circular then the regression analysis used is called circular regression analysis. Observation data in circular statistic which use direction and time units usually don’t satisfy all of the parametric assumptions, thus making nonparametric regression as a good solution. Nonparametric regression function estimation is using epanechnikov kernel estimator for the linier variables and von Mises kernel estimator for the circular variable. This study showed that the result of circular analysis by using circular descriptive statistic is better than common statistic. Multiple circular-linier nonparametric regressions with Epanechnikov and von Mises kernel estimator didn’t create estimation model explicitly as parametric regression does, but create estimation from its observation knots instead.

  5. Available pressure amplitude of linear compressor based on phasor triangle model

    Science.gov (United States)

    Duan, C. X.; Jiang, X.; Zhi, X. Q.; You, X. K.; Qiu, L. M.

    2017-12-01

    The linear compressor for cryocoolers possess the advantages of long-life operation, high efficiency, low vibration and compact structure. It is significant to study the match mechanisms between the compressor and the cold finger, which determines the working efficiency of the cryocooler. However, the output characteristics of linear compressor are complicated since it is affected by many interacting parameters. The existing matching methods are simplified and mainly focus on the compressor efficiency and output acoustic power, while neglecting the important output parameter of pressure amplitude. In this study, a phasor triangle model basing on analyzing the forces of the piston is proposed. It can be used to predict not only the output acoustic power, the efficiency, but also the pressure amplitude of the linear compressor. Calculated results agree well with the measurement results of the experiment. By this phasor triangle model, the theoretical maximum output pressure amplitude of the linear compressor can be calculated simply based on a known charging pressure and operating frequency. Compared with the mechanical and electrical model of the linear compressor, the new model can provide an intuitionistic understanding on the match mechanism with faster computational process. The model can also explain the experimental phenomenon of the proportional relationship between the output pressure amplitude and the piston displacement in experiments. By further model analysis, such phenomenon is confirmed as an expression of the unmatched design of the compressor. The phasor triangle model may provide an alternative method for the compressor design and matching with the cold finger.

  6. The Overgeneralization of Linear Models among University Students' Mathematical Productions: A Long-Term Study

    Science.gov (United States)

    Esteley, Cristina B.; Villarreal, Monica E.; Alagia, Humberto R.

    2010-01-01

    Over the past several years, we have been exploring and researching a phenomenon that occurs among undergraduate students that we called extension of linear models to non-linear contexts or overgeneralization of linear models. This phenomenon appears when some students use linear representations in situations that are non-linear. In a first phase,…

  7. Stochastic linear hybrid systems: Modeling, estimation, and application

    Science.gov (United States)

    Seah, Chze Eng

    Hybrid systems are dynamical systems which have interacting continuous state and discrete state (or mode). Accurate modeling and state estimation of hybrid systems are important in many applications. We propose a hybrid system model, known as the Stochastic Linear Hybrid System (SLHS), to describe hybrid systems with stochastic linear system dynamics in each mode and stochastic continuous-state-dependent mode transitions. We then develop a hybrid estimation algorithm, called the State-Dependent-Transition Hybrid Estimation (SDTHE) algorithm, to estimate the continuous state and discrete state of the SLHS from noisy measurements. It is shown that the SDTHE algorithm is more accurate or more computationally efficient than existing hybrid estimation algorithms. Next, we develop a performance analysis algorithm to evaluate the performance of the SDTHE algorithm in a given operating scenario. We also investigate sufficient conditions for the stability of the SDTHE algorithm. The proposed SLHS model and SDTHE algorithm are illustrated to be useful in several applications. In Air Traffic Control (ATC), to facilitate implementations of new efficient operational concepts, accurate modeling and estimation of aircraft trajectories are needed. In ATC, an aircraft's trajectory can be divided into a number of flight modes. Furthermore, as the aircraft is required to follow a given flight plan or clearance, its flight mode transitions are dependent of its continuous state. However, the flight mode transitions are also stochastic due to navigation uncertainties or unknown pilot intents. Thus, we develop an aircraft dynamics model in ATC based on the SLHS. The SDTHE algorithm is then used in aircraft tracking applications to estimate the positions/velocities of aircraft and their flight modes accurately. Next, we develop an aircraft conformance monitoring algorithm to detect any deviations of aircraft trajectories in ATC that might compromise safety. In this application, the SLHS

  8. Free-piston engine linear generator for hybrid vehicles modeling study

    Science.gov (United States)

    Callahan, T. J.; Ingram, S. K.

    1995-05-01

    Development of a free piston engine linear generator was investigated for use as an auxiliary power unit for a hybrid electric vehicle. The main focus of the program was to develop an efficient linear generator concept to convert the piston motion directly into electrical power. Computer modeling techniques were used to evaluate five different designs for linear generators. These designs included permanent magnet generators, reluctance generators, linear DC generators, and two and three-coil induction generators. The efficiency of the linear generator was highly dependent on the design concept. The two-coil induction generator was determined to be the best design, with an efficiency of approximately 90 percent.

  9. Mixed deterministic statistical modelling of regional ozone air pollution

    KAUST Repository

    Kalenderski, Stoitchko

    2011-03-17

    We develop a physically motivated statistical model for regional ozone air pollution by separating the ground-level pollutant concentration field into three components, namely: transport, local production and large-scale mean trend mostly dominated by emission rates. The model is novel in the field of environmental spatial statistics in that it is a combined deterministic-statistical model, which gives a new perspective to the modelling of air pollution. The model is presented in a Bayesian hierarchical formalism, and explicitly accounts for advection of pollutants, using the advection equation. We apply the model to a specific case of regional ozone pollution-the Lower Fraser valley of British Columbia, Canada. As a predictive tool, we demonstrate that the model vastly outperforms existing, simpler modelling approaches. Our study highlights the importance of simultaneously considering different aspects of an air pollution problem as well as taking into account the physical bases that govern the processes of interest. © 2011 John Wiley & Sons, Ltd..

  10. Adaptive Maneuvering Frequency Method of Current Statistical Model

    Institute of Scientific and Technical Information of China (English)

    Wei Sun; Yongjian Yang

    2017-01-01

    Current statistical model(CSM) has a good performance in maneuvering target tracking. However, the fixed maneuvering frequency will deteriorate the tracking results, such as a serious dynamic delay, a slowly converging speedy and a limited precision when using Kalman filter(KF) algorithm. In this study, a new current statistical model and a new Kalman filter are proposed to improve the performance of maneuvering target tracking. The new model which employs innovation dominated subjection function to adaptively adjust maneuvering frequency has a better performance in step maneuvering target tracking, while a fluctuant phenomenon appears. As far as this problem is concerned, a new adaptive fading Kalman filter is proposed as well. In the new Kalman filter, the prediction values are amended in time by setting judgment and amendment rules,so that tracking precision and fluctuant phenomenon of the new current statistical model are improved. The results of simulation indicate the effectiveness of the new algorithm and the practical guiding significance.

  11. Speech emotion recognition based on statistical pitch model

    Institute of Scientific and Technical Information of China (English)

    WANG Zhiping; ZHAO Li; ZOU Cairong

    2006-01-01

    A modified Parzen-window method, which keep high resolution in low frequencies and keep smoothness in high frequencies, is proposed to obtain statistical model. Then, a gender classification method utilizing the statistical model is proposed, which have a 98% accuracy of gender classification while long sentence is dealt with. By separation the male voice and female voice, the mean and standard deviation of speech training samples with different emotion are used to create the corresponding emotion models. Then the Bhattacharyya distance between the test sample and statistical models of pitch, are utilized for emotion recognition in speech.The normalization of pitch for the male voice and female voice are also considered, in order to illustrate them into a uniform space. Finally, the speech emotion recognition experiment based on K Nearest Neighbor shows that, the correct rate of 81% is achieved, where it is only 73.85%if the traditional parameters are utilized.

  12. Neutron stars in non-linear coupling models

    International Nuclear Information System (INIS)

    Taurines, Andre R.; Vasconcellos, Cesar A.Z.; Malheiro, Manuel; Chiapparini, Marcelo

    2001-01-01

    We present a class of relativistic models for nuclear matter and neutron stars which exhibits a parameterization, through mathematical constants, of the non-linear meson-baryon couplings. For appropriate choices of the parameters, it recovers current QHD models found in the literature: Walecka, ZM and ZM3 models. We have found that the ZM3 model predicts a very small maximum neutron star mass, ∼ 0.72M s un. A strong similarity between the results of ZM-like models and those with exponential couplings is noted. Finally, we discuss the very intense scalar condensates found in the interior of neutron stars which may lead to negative effective masses. (author)

  13. Neutron stars in non-linear coupling models

    Energy Technology Data Exchange (ETDEWEB)

    Taurines, Andre R.; Vasconcellos, Cesar A.Z. [Rio Grande do Sul Univ., Porto Alegre, RS (Brazil); Malheiro, Manuel [Universidade Federal Fluminense, Niteroi, RJ (Brazil); Chiapparini, Marcelo [Universidade do Estado, Rio de Janeiro, RJ (Brazil)

    2001-07-01

    We present a class of relativistic models for nuclear matter and neutron stars which exhibits a parameterization, through mathematical constants, of the non-linear meson-baryon couplings. For appropriate choices of the parameters, it recovers current QHD models found in the literature: Walecka, ZM and ZM3 models. We have found that the ZM3 model predicts a very small maximum neutron star mass, {approx} 0.72M{sub s}un. A strong similarity between the results of ZM-like models and those with exponential couplings is noted. Finally, we discuss the very intense scalar condensates found in the interior of neutron stars which may lead to negative effective masses. (author)

  14. Statistical modelling of citation exchange between statistics journals.

    Science.gov (United States)

    Varin, Cristiano; Cattelan, Manuela; Firth, David

    2016-01-01

    Rankings of scholarly journals based on citation data are often met with scepticism by the scientific community. Part of the scepticism is due to disparity between the common perception of journals' prestige and their ranking based on citation counts. A more serious concern is the inappropriate use of journal rankings to evaluate the scientific influence of researchers. The paper focuses on analysis of the table of cross-citations among a selection of statistics journals. Data are collected from the Web of Science database published by Thomson Reuters. Our results suggest that modelling the exchange of citations between journals is useful to highlight the most prestigious journals, but also that journal citation data are characterized by considerable heterogeneity, which needs to be properly summarized. Inferential conclusions require care to avoid potential overinterpretation of insignificant differences between journal ratings. Comparison with published ratings of institutions from the UK's research assessment exercise shows strong correlation at aggregate level between assessed research quality and journal citation 'export scores' within the discipline of statistics.

  15. Statistical modeling and extrapolation of carcinogenesis data

    International Nuclear Information System (INIS)

    Krewski, D.; Murdoch, D.; Dewanji, A.

    1986-01-01

    Mathematical models of carcinogenesis are reviewed, including pharmacokinetic models for metabolic activation of carcinogenic substances. Maximum likelihood procedures for fitting these models to epidemiological data are discussed, including situations where the time to tumor occurrence is unobservable. The plausibility of different possible shapes of the dose response curve at low doses is examined, and a robust method for linear extrapolation to low doses is proposed and applied to epidemiological data on radiation carcinogenesis

  16. Statistics Based Models for the Dynamics of Chernivtsi Children Disease

    Directory of Open Access Journals (Sweden)

    Igor G. Nesteruk

    2017-10-01

    Full Text Available Background. Simple mathematical models of contamination and SIR-model of spreading an infection were used to simulate the time dynamics of the unknown before children disease, which occurred in Chernivtsi (Ukraine. The cause of many cases of alopecia, which began in this city in August 1988 is still not fully clarified. According to the official report of the governmental commission, the last new cases occurred in the middle of November 1988, and the reason of the illness was reported as chemical exogenous intoxication. Later this illness became the name “Chernivtsi chemical disease”. Nevertheless, the significantly increased number of new cases of the local alopecia was registered almost three years and is still not clarified. Objective. The comparison of two different versions of the disease: chemical exogenous intoxication and infection. Identification of the parameters of mathematical models and prediction of the disease development. Methods. Analytical solutions of the contamination models and SIR-model for an epidemic are obtained. The optimal values of parameters with the use of linear regression were found. Results. The optimal values of the models parameters with the use of statistical approach were identified. The calculations showed that the infectious version of the disease is more reliable in comparison with the popular contamination one. The possible date of the epidemic beginning was estimated. Conclusions. The optimal parameters of SIR-model allow calculating the realistic number of victims and other characteristics of possible epidemic. They also show that increased number of cases of local alopecia could be a part of the same epidemic as “Chernivtsi chemical disease”.

  17. A comparison of linear tyre models for analysing shimmy

    NARCIS (Netherlands)

    Besselink, I.J.M.; Maas, J.W.L.H.; Nijmeijer, H.

    2011-01-01

    A comparison is made between three linear, dynamic tyre models using low speed step responses and yaw oscillation tests. The match with the measurements improves with increasing complexity of the tyre model. Application of the different tyre models to a two degree of freedom trailing arm suspension

  18. S-AMP for non-linear observation models

    DEFF Research Database (Denmark)

    Cakmak, Burak; Winther, Ole; Fleury, Bernard H.

    2015-01-01

    Recently we presented the S-AMP approach, an extension of approximate message passing (AMP), to be able to handle general invariant matrix ensembles. In this contribution we extend S-AMP to non-linear observation models. We obtain generalized AMP (GAMP) as the special case when the measurement...

  19. Diagnostics for Linear Models With Functional Responses

    OpenAIRE

    Xu, Hongquan; Shen, Qing

    2005-01-01

    Linear models where the response is a function and the predictors are vectors are useful in analyzing data from designed experiments and other situations with functional observations. Residual analysis and diagnostics are considered for such models. Studentized residuals are defined and their properties are studied. Chi-square quantile-quantile plots are proposed to check the assumption of Gaussian error process and outliers. Jackknife residuals and an associated test are proposed to det...

  20. Statistical validation of normal tissue complication probability models.

    Science.gov (United States)

    Xu, Cheng-Jian; van der Schaaf, Arjen; Van't Veld, Aart A; Langendijk, Johannes A; Schilstra, Cornelis

    2012-09-01

    To investigate the applicability and value of double cross-validation and permutation tests as established statistical approaches in the validation of normal tissue complication probability (NTCP) models. A penalized regression method, LASSO (least absolute shrinkage and selection operator), was used to build NTCP models for xerostomia after radiation therapy treatment of head-and-neck cancer. Model assessment was based on the likelihood function and the area under the receiver operating characteristic curve. Repeated double cross-validation showed the uncertainty and instability of the NTCP models and indicated that the statistical significance of model performance can be obtained by permutation testing. Repeated double cross-validation and permutation tests are recommended to validate NTCP models before clinical use. Copyright © 2012 Elsevier Inc. All rights reserved.

  1. Statistical Validation of Normal Tissue Complication Probability Models

    Energy Technology Data Exchange (ETDEWEB)

    Xu Chengjian, E-mail: c.j.xu@umcg.nl [Department of Radiation Oncology, University of Groningen, University Medical Center Groningen, Groningen (Netherlands); Schaaf, Arjen van der; Veld, Aart A. van' t; Langendijk, Johannes A. [Department of Radiation Oncology, University of Groningen, University Medical Center Groningen, Groningen (Netherlands); Schilstra, Cornelis [Department of Radiation Oncology, University of Groningen, University Medical Center Groningen, Groningen (Netherlands); Radiotherapy Institute Friesland, Leeuwarden (Netherlands)

    2012-09-01

    Purpose: To investigate the applicability and value of double cross-validation and permutation tests as established statistical approaches in the validation of normal tissue complication probability (NTCP) models. Methods and Materials: A penalized regression method, LASSO (least absolute shrinkage and selection operator), was used to build NTCP models for xerostomia after radiation therapy treatment of head-and-neck cancer. Model assessment was based on the likelihood function and the area under the receiver operating characteristic curve. Results: Repeated double cross-validation showed the uncertainty and instability of the NTCP models and indicated that the statistical significance of model performance can be obtained by permutation testing. Conclusion: Repeated double cross-validation and permutation tests are recommended to validate NTCP models before clinical use.

  2. Shell model in large spaces and statistical spectroscopy

    International Nuclear Information System (INIS)

    Kota, V.K.B.

    1996-01-01

    For many nuclear structure problems of current interest it is essential to deal with shell model in large spaces. For this, three different approaches are now in use and two of them are: (i) the conventional shell model diagonalization approach but taking into account new advances in computer technology; (ii) the shell model Monte Carlo method. A brief overview of these two methods is given. Large space shell model studies raise fundamental questions regarding the information content of the shell model spectrum of complex nuclei. This led to the third approach- the statistical spectroscopy methods. The principles of statistical spectroscopy have their basis in nuclear quantum chaos and they are described (which are substantiated by large scale shell model calculations) in some detail. (author)

  3. Advances in statistical models for data analysis

    CERN Document Server

    Minerva, Tommaso; Vichi, Maurizio

    2015-01-01

    This edited volume focuses on recent research results in classification, multivariate statistics and machine learning and highlights advances in statistical models for data analysis. The volume provides both methodological developments and contributions to a wide range of application areas such as economics, marketing, education, social sciences and environment. The papers in this volume were first presented at the 9th biannual meeting of the Classification and Data Analysis Group (CLADAG) of the Italian Statistical Society, held in September 2013 at the University of Modena and Reggio Emilia, Italy.

  4. Linear models to perform treaty verification tasks for enhanced information security

    International Nuclear Information System (INIS)

    MacGahan, Christopher J.; Kupinski, Matthew A.; Brubaker, Erik M.; Hilton, Nathan R.; Marleau, Peter A.

    2017-01-01

    Linear mathematical models were applied to binary-discrimination tasks relevant to arms control verification measurements in which a host party wishes to convince a monitoring party that an item is or is not treaty accountable. These models process data in list-mode format and can compensate for the presence of variability in the source, such as uncertain object orientation and location. The Hotelling observer applies an optimal set of weights to binned detector data, yielding a test statistic that is thresholded to make a decision. The channelized Hotelling observer applies a channelizing matrix to the vectorized data, resulting in a lower dimensional vector available to the monitor to make decisions. We demonstrate how incorporating additional terms in this channelizing-matrix optimization offers benefits for treaty verification. We present two methods to increase shared information and trust between the host and monitor. The first method penalizes individual channel performance in order to maximize the information available to the monitor while maintaining optimal performance. Second, we present a method that penalizes predefined sensitive information while maintaining the capability to discriminate between binary choices. Data used in this study was generated using Monte Carlo simulations for fission neutrons, accomplished with the GEANT4 toolkit. Custom models for plutonium inspection objects were measured in simulation by a radiation imaging system. Model performance was evaluated and presented using the area under the receiver operating characteristic curve.

  5. Linear models to perform treaty verification tasks for enhanced information security

    Energy Technology Data Exchange (ETDEWEB)

    MacGahan, Christopher J., E-mail: cmacgahan@optics.arizona.edu [College of Optical Sciences, The University of Arizona, 1630 E. University Blvd, Tucson, AZ 85721 (United States); Sandia National Laboratories, Livermore, CA 94551 (United States); Kupinski, Matthew A. [College of Optical Sciences, The University of Arizona, 1630 E. University Blvd, Tucson, AZ 85721 (United States); Brubaker, Erik M.; Hilton, Nathan R.; Marleau, Peter A. [Sandia National Laboratories, Livermore, CA 94551 (United States)

    2017-02-01

    Linear mathematical models were applied to binary-discrimination tasks relevant to arms control verification measurements in which a host party wishes to convince a monitoring party that an item is or is not treaty accountable. These models process data in list-mode format and can compensate for the presence of variability in the source, such as uncertain object orientation and location. The Hotelling observer applies an optimal set of weights to binned detector data, yielding a test statistic that is thresholded to make a decision. The channelized Hotelling observer applies a channelizing matrix to the vectorized data, resulting in a lower dimensional vector available to the monitor to make decisions. We demonstrate how incorporating additional terms in this channelizing-matrix optimization offers benefits for treaty verification. We present two methods to increase shared information and trust between the host and monitor. The first method penalizes individual channel performance in order to maximize the information available to the monitor while maintaining optimal performance. Second, we present a method that penalizes predefined sensitive information while maintaining the capability to discriminate between binary choices. Data used in this study was generated using Monte Carlo simulations for fission neutrons, accomplished with the GEANT4 toolkit. Custom models for plutonium inspection objects were measured in simulation by a radiation imaging system. Model performance was evaluated and presented using the area under the receiver operating characteristic curve.

  6. New classical r-matrices from integrable non-linear sigma-models

    International Nuclear Information System (INIS)

    Laartz, J.; Bordemann, M.; Forger, M.; Schaper, U.

    1993-01-01

    Non-linear sigma models on Riemannian symmetric spaces constitute the most general class of classical non-linear sigma models which are known to be integrable. Using the current algebra structure of these models their canonical structure is analyzed and it is shown that their non-ultralocal fundamental Poisson bracket relation is governed by a field dependent non antisymmetric r-matrix obeying a dynamical Yang Baxter equation. The fundamental Poisson bracket relations and the r-matrix are derived explicitly and a new kind of algebra is found that is supposed to replace the classical Yang Baxter algebra governing the canonical structure of ultralocal models. (Author) 9 refs

  7. Non-linear mixed-effects pharmacokinetic/pharmacodynamic modelling in NLME using differential equations

    DEFF Research Database (Denmark)

    Tornøe, Christoffer Wenzel; Agersø, Henrik; Madsen, Henrik

    2004-01-01

    The standard software for non-linear mixed-effect analysis of pharmacokinetic/phar-macodynamic (PK/PD) data is NONMEM while the non-linear mixed-effects package NLME is an alternative as tong as the models are fairly simple. We present the nlmeODE package which combines the ordinary differential...... equation (ODE) solver package odesolve and the non-Linear mixed effects package NLME thereby enabling the analysis of complicated systems of ODEs by non-linear mixed-effects modelling. The pharmacokinetics of the anti-asthmatic drug theophylline is used to illustrate the applicability of the nlme...

  8. Using hierarchical linear growth models to evaluate protective mechanisms that mediate science achievement

    Science.gov (United States)

    von Secker, Clare Elaine

    The study of students at risk is a major topic of science education policy and discussion. Much research has focused on describing conditions and problems associated with the statistical risk of low science achievement among individuals who are members of groups characterized by problems such as poverty and social disadvantage. But outcomes attributed to these factors do not explain the nature and extent of mechanisms that account for differences in performance among individuals at risk. There is ample theoretical and empirical evidence that demographic differences should be conceptualized as social contexts, or collections of variables, that alter the psychological significance and social demands of life events, and affect subsequent relationships between risk and resilience. The hierarchical linear growth models used in this dissertation provide greater specification of the role of social context and the protective effects of attitude, expectations, parenting practices, peer influences, and learning opportunities on science achievement. While the individual influences of these protective factors on science achievement were small, their cumulative effect was substantial. Meta-analysis conducted on the effects associated with psychological and environmental processes that mediate risk mechanisms in sixteen social contexts revealed twenty-two significant differences between groups of students. Positive attitudes, high expectations, and more intense science course-taking had positive effects on achievement of all students, although these factors were not equally protective in all social contexts. In general, effects associated with authoritative parenting and peer influences were negative, regardless of social context. An evaluation comparing the performance and stability of hierarchical linear growth models with traditional repeated measures models is included as well.

  9. A gauge model describing N relativistic particles bound by linear forces

    International Nuclear Information System (INIS)

    Filippov, A.T.

    1988-01-01

    A relativistic model of N particles bound by linear forces is obtained by applying the gauging procedure to the linear canonical symmteries of a simple (rudimentary) nonrelativistic N-particle Lagrangian extended to relativistic phase space. The new (gauged) Lagrangian is formally Poincare invariant, the Hamiltonian is a linear combination of first-class constraints which are closed with respect to Pisson brackets and generate the localized canonical symmteries. The gauge potentials appear as the Lagrange multipliers of the constraints. Gauge fixing and quantization of the model are also briefly discussed. 11 refs

  10. Computationally efficient statistical differential equation modeling using homogenization

    Science.gov (United States)

    Hooten, Mevin B.; Garlick, Martha J.; Powell, James A.

    2013-01-01

    Statistical models using partial differential equations (PDEs) to describe dynamically evolving natural systems are appearing in the scientific literature with some regularity in recent years. Often such studies seek to characterize the dynamics of temporal or spatio-temporal phenomena such as invasive species, consumer-resource interactions, community evolution, and resource selection. Specifically, in the spatial setting, data are often available at varying spatial and temporal scales. Additionally, the necessary numerical integration of a PDE may be computationally infeasible over the spatial support of interest. We present an approach to impose computationally advantageous changes of support in statistical implementations of PDE models and demonstrate its utility through simulation using a form of PDE known as “ecological diffusion.” We also apply a statistical ecological diffusion model to a data set involving the spread of mountain pine beetle (Dendroctonus ponderosae) in Idaho, USA.

  11. Genomic similarity and kernel methods I: advancements by building on mathematical and statistical foundations.

    Science.gov (United States)

    Schaid, Daniel J

    2010-01-01

    Measures of genomic similarity are the basis of many statistical analytic methods. We review the mathematical and statistical basis of similarity methods, particularly based on kernel methods. A kernel function converts information for a pair of subjects to a quantitative value representing either similarity (larger values meaning more similar) or distance (smaller values meaning more similar), with the requirement that it must create a positive semidefinite matrix when applied to all pairs of subjects. This review emphasizes the wide range of statistical methods and software that can be used when similarity is based on kernel methods, such as nonparametric regression, linear mixed models and generalized linear mixed models, hierarchical models, score statistics, and support vector machines. The mathematical rigor for these methods is summarized, as is the mathematical framework for making kernels. This review provides a framework to move from intuitive and heuristic approaches to define genomic similarities to more rigorous methods that can take advantage of powerful statistical modeling and existing software. A companion paper reviews novel approaches to creating kernels that might be useful for genomic analyses, providing insights with examples [1]. Copyright © 2010 S. Karger AG, Basel.

  12. Mathematical modelling in engineering: A proposal to introduce linear algebra concepts

    Directory of Open Access Journals (Sweden)

    Andrea Dorila Cárcamo

    2016-03-01

    Full Text Available The modern dynamic world requires that basic science courses for engineering, including linear algebra, emphasize the development of mathematical abilities primarily associated with modelling and interpreting, which aren´t limited only to calculus abilities. Considering this, an instructional design was elaborated based on mathematic modelling and emerging heuristic models for the construction of specific linear algebra concepts:  span and spanning set. This was applied to first year engineering students. Results suggest that this type of instructional design contributes to the construction of these mathematical concepts and can also favour first year engineering students understanding of key linear algebra concepts and potentiate the development of higher order skills.

  13. Non-linear models for the detection of impaired cerebral blood flow autoregulation.

    Science.gov (United States)

    Chacón, Max; Jara, José Luis; Miranda, Rodrigo; Katsogridakis, Emmanuel; Panerai, Ronney B

    2018-01-01

    The ability to discriminate between normal and impaired dynamic cerebral autoregulation (CA), based on measurements of spontaneous fluctuations in arterial blood pressure (BP) and cerebral blood flow (CBF), has considerable clinical relevance. We studied 45 normal subjects at rest and under hypercapnia induced by breathing a mixture of carbon dioxide and air. Non-linear models with BP as input and CBF velocity (CBFV) as output, were implemented with support vector machines (SVM) using separate recordings for learning and validation. Dynamic SVM implementations used either moving average or autoregressive structures. The efficiency of dynamic CA was estimated from the model's derived CBFV response to a step change in BP as an autoregulation index for both linear and non-linear models. Non-linear models with recurrences (autoregressive) showed the best results, with CA indexes of 5.9 ± 1.5 in normocapnia, and 2.5 ± 1.2 for hypercapnia with an area under the receiver-operator curve of 0.955. The high performance achieved by non-linear SVM models to detect deterioration of dynamic CA should encourage further assessment of its applicability to clinical conditions where CA might be impaired.

  14. Linear models in matrix form a hands-on approach for the behavioral sciences

    CERN Document Server

    Brown, Jonathon D

    2014-01-01

    This textbook is an approachable introduction to statistical analysis using matrix algebra. Prior knowledge of matrix algebra is not necessary. Advanced topics are easy to follow through analyses that were performed on an open-source spreadsheet using a few built-in functions. These topics include ordinary linear regression, as well as maximum likelihood estimation, matrix decompositions, nonparametric smoothers and penalized cubic splines. Each data set (1) contains a limited number of observations to encourage readers to do the calculations themselves, and (2) tells a coherent story based on statistical significance and confidence intervals. In this way, students will learn how the numbers were generated and how they can be used to make cogent arguments about everyday matters. This textbook is designed for use in upper level undergraduate courses or first year graduate courses. The first chapter introduces students to linear equations, then covers matrix algebra, focusing on three essential operations: sum ...

  15. Statistical Analysis of Clinical Data on a Pocket Calculator, Part 2 Statistics on a Pocket Calculator, Part 2

    CERN Document Server

    Cleophas, Ton J

    2012-01-01

    The first part of this title contained all statistical tests relevant to starting clinical investigations, and included tests for continuous and binary data, power, sample size, multiple testing, variability, confounding, interaction, and reliability. The current part 2 of this title reviews methods for handling missing data, manipulated data, multiple confounders, predictions beyond observation, uncertainty of diagnostic tests, and the problems of outliers. Also robust tests, non-linear modeling , goodness of fit testing, Bhatacharya models, item response modeling, superiority testing, variab

  16. Analysis of baseline, average, and longitudinally measured blood pressure data using linear mixed models.

    Science.gov (United States)

    Hossain, Ahmed; Beyene, Joseph

    2014-01-01

    This article compares baseline, average, and longitudinal data analysis methods for identifying genetic variants in genome-wide association study using the Genetic Analysis Workshop 18 data. We apply methods that include (a) linear mixed models with baseline measures, (b) random intercept linear mixed models with mean measures outcome, and (c) random intercept linear mixed models with longitudinal measurements. In the linear mixed models, covariates are included as fixed effects, whereas relatedness among individuals is incorporated as the variance-covariance structure of the random effect for the individuals. The overall strategy of applying linear mixed models decorrelate the data is based on Aulchenko et al.'s GRAMMAR. By analyzing systolic and diastolic blood pressure, which are used separately as outcomes, we compare the 3 methods in identifying a known genetic variant that is associated with blood pressure from chromosome 3 and simulated phenotype data. We also analyze the real phenotype data to illustrate the methods. We conclude that the linear mixed model with longitudinal measurements of diastolic blood pressure is the most accurate at identifying the known single-nucleotide polymorphism among the methods, but linear mixed models with baseline measures perform best with systolic blood pressure as the outcome.

  17. Correlation and simple linear regression.

    Science.gov (United States)

    Zou, Kelly H; Tuncali, Kemal; Silverman, Stuart G

    2003-06-01

    In this tutorial article, the concepts of correlation and regression are reviewed and demonstrated. The authors review and compare two correlation coefficients, the Pearson correlation coefficient and the Spearman rho, for measuring linear and nonlinear relationships between two continuous variables. In the case of measuring the linear relationship between a predictor and an outcome variable, simple linear regression analysis is conducted. These statistical concepts are illustrated by using a data set from published literature to assess a computed tomography-guided interventional technique. These statistical methods are important for exploring the relationships between variables and can be applied to many radiologic studies.

  18. A SOCIOLOGICAL ANALYSIS OF THE CHILDBEARING COEFFICIENT IN THE ALTAI REGION BASED ON METHOD OF FUZZY LINEAR REGRESSION

    Directory of Open Access Journals (Sweden)

    Sergei Vladimirovich Varaksin

    2017-06-01

    Full Text Available Purpose. Construction of a mathematical model of the dynamics of childbearing change in the Altai region in 2000–2016, analysis of the dynamics of changes in birth rates for multiple age categories of women of childbearing age. Methodology. A auxiliary analysis element is the construction of linear mathematical models of the dynamics of childbearing by using fuzzy linear regression method based on fuzzy numbers. Fuzzy linear regression is considered as an alternative to standard statistical linear regression for short time series and unknown distribution law. The parameters of fuzzy linear and standard statistical regressions for childbearing time series were defined with using the built in language MatLab algorithm. Method of fuzzy linear regression is not used in sociological researches yet. Results. There are made the conclusions about the socio-demographic changes in society, the high efficiency of the demographic policy of the leadership of the region and the country, and the applicability of the method of fuzzy linear regression for sociological analysis.

  19. An approximate inversion method of geoelectrical sounding data using linear and bayesian statistical approaches. Examples of Tritrivakely volcanic lake and Mahitsy area (central part of Madagascar)

    International Nuclear Information System (INIS)

    Ranaivo Nomenjanahary, F.; Rakoto, H.; Ratsimbazafy, J.B.

    1994-08-01

    This paper is concerned with resistivity sounding measurements performed from single site (vertical sounding) or from several sites (profiles) within a bounded area. The objective is to present an accurate information about the study area and to estimate the likelihood of the produced quantitative models. The achievement of this objective obviously requires quite relevant data and processing methods. It also requires interpretation methods which should take into account the probable effect of an heterogeneous structure. In front of such difficulties, the interpretation of resistivity sounding data inevitably involves the use of inversion methods. We suggest starting the interpretation in simple situation (1-D approximation), and using the rough but correct model obtained as an a-priori model for any more refined interpretation. Related to this point of view, special attention should be paid for the inverse problem applied to the resistivity sounding data. This inverse problem is nonlinear, while linearity inherent in the functional response used to describe the physical experiment. Two different approaches are used to build an approximate but higher dimensional inversion of geoelectrical data: the linear approach and the bayesian statistical approach. Some illustrations of their application in resistivity sounding data acquired at Tritrivakely volcanic lake (single site) and at Mahitsy area (several sites) will be given. (author). 28 refs, 7 figs

  20. Fluctuations and correlations in statistical models of hadron production

    International Nuclear Information System (INIS)

    Gorenstein, M. I.

    2012-01-01

    An extension of the standard concept of the statistical ensembles is suggested. Namely, the statistical ensembles with extensive quantities fluctuating according to an externally given distribution are introduced. Applications in the statistical models of multiple hadron production in high energy physics are discussed.